
Molecular and Computational Biology Research School

COURSE

Advanced biocomputing

Over the last decade, biological data sets have been rapidly growing in size and complexity. This course focuses on how to use computers to streamline the analysis of biological data, with an emphasis on “working smart” rather than hard. Developing a solid background in how computers can facilitate biological research will not only help with your thesis projects, but will also make you more “marketable” for post-docs and faculty positions in the future.


Schematic representation of DNA strands. Photo: Colourbox

INSTRUCTOR: Dr. Scott R. Santos, Department of Biological Sciences, Auburn University (844-7410, santos@auburn.edu)

LECTURES: Short lectures followed by hands-on activities will be given over a two-week period. There will be 10 sessions in total, each lasting ~3 hours.

REQUIRED TEXTS (these texts will also serve as highly useful references following the conclusion of the course):

  • Barrett, D.J. - “Linux Pocket Guide” (O'Reilly) ($7.99 from Amazon; $9.95 from Books-A-Million)
  • Haddock, S.H.D. and Dunn, C.W. - “Practical Computing for Biologists” (Paperback) ($60.00 from Amazon)

STUDENTS ARE REQUIRED TO HAVE THEIR OWN WINDOWS OR APPLE LAPTOP FOR USE IN CLASS. ALL FILES ON THE LAPTOP SHOULD BE BACKED UP PRIOR TO THE FIRST DAY OF CLASS AND THERE SHOULD BE (AT LEAST) 25GB OF FREE HARD DRIVE SPACE ON THE MACHINE. STUDENTS WILL BE WORKING IN A LINUX ENVIRONMENT WITHIN A VIRTUAL MACHINE ON THEIR LAPTOPS OVER THE COURSE DURATION USING VIRTUALBOX (https://www.virtualbox.org/wiki/Downloads), WHICH IS A FREE DOWNLOAD. THIS LINUX ENVIRONMENT WILL BE SET UP ON THE FIRST DAY OF CLASS.

PREREQUISITES: A background in molecular genetics is required. It is expected that everyone has already taken courses in molecular biology, etc., and the lecturer will assume that everyone is comfortable with these subjects. If you are not, consider dropping the course and taking it at a later date. A background in statistics is an advantage.

ADDITIONAL STUDY AIDS FOR THIS COURSE:
Additional reading materials, in the form of PDFs and HTML documents, will be used to supplement readings from the above books. These will be distributed either as email attachments or as links posted on a server. These materials will be made available on the Friday following the class in which they are assigned.

IMPORTANT INFORMATION OF SPECIAL NOTE:
Read the assigned materials prior to coming to lecture. The lectures are meant to clarify and discuss concepts, not to serve as your first exposure to them. Each lecture is presented with the assumption that you have read the material and are at least vaguely familiar with it.

GENERAL POLICY and PROCEDURES: You should retain this schedule of lecture topics and relevant instructions for reference throughout the semester. You are responsible for learning the material that will be covered, for preparing for lectures by reading assignments beforehand, and for being present at all lectures without further notice or additional reminders. Missing classes should be avoided if at all possible. 

Special Request: Cell phones and pagers should be turned off for the duration of the lecture. Students will be asked to leave the classroom for the remainder of the lecture in the event one of these devices is activated during the lecture.

GRADING: Passed or not passed. “Pre” and “post” tests will be used to assess the knowledge that students have gained by taking the course. In addition, “mini” projects will be given so that students can practice the skills they have been exposed to in class.


LECTURE SCHEDULE BY SUBJECT MATTER (TENTATIVE AND SUBJECT TO CHANGE)

  • Installation of Linux (PCs) and Developer Tools (Apple)
  • The “shell” and the basic UNIX command line
  • The structure of the UNIX file system
  • Simple programs and how to automate commands via scripts (shell, Perl, etc.)
  • Compiling programs from scratch (using EMBOSS as an example)
  • Analysis of molecular data
  • Statistics with R
  • Graphics with R
  • Text manipulation using grep, sed and awk
  • Geographic mapping using the Generic Mapping Tools (GMT)
  • Image formats: how they differ and how to manipulate them using ImageMagick
  • Creation of graphs and figures for publications using ploticus, GIMP and Inkscape
  • Web servers and databases