Read Mapping Exercises

Exercise 1: Read Mapping Exercise


Download data: mapping.tgz

Getting Started

BWA is a widely used tool for mapping individual reads to a reference genome, and SAMTools is a widely used program for scanning the resulting read alignments to find and report variations or to measure coverage. Download and install BWA (http://bio-bwa.sourceforge.net) and SAMTools (http://samtools.sourceforge.net/), then learn how to run the tools on a small genome with simulated reads. Both run on any Unix or Mac system (and possibly on Windows under Cygwin).

The basic steps are:
  1. 'bwa index' to index the reference genome (only needs to be done once per genome).
  2. 'bwa aln' to align the reads (run once per read file).
  3. 'bwa sampe' to combine the paired-end alignments and report them in SAM format.
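
The three BWA steps above can be sketched as a small shell script. This is only an illustration under stated assumptions: the file names (ref.fasta, reads_1.fastq, reads_2.fastq) are placeholders and may not match the files inside mapping.tgz, and the reads are assumed to be paired-end because the steps use 'bwa sampe'. The 'run' helper is a hypothetical convenience that prints each command and executes it only when DRY_RUN=0, so the script can be read and checked before anything is run.

```shell
#!/bin/sh
# Sketch of the BWA steps for paired-end reads.
# ASSUMPTION: these file names are placeholders; substitute the reference
# and read files unpacked from mapping.tgz.
REF=${REF:-ref.fasta}
READS1=${READS1:-reads_1.fastq}
READS2=${READS2:-reads_2.fastq}

# Hypothetical helper: print a command, and execute it only if DRY_RUN=0.
# The first argument is the file that receives stdout ("-" = no redirection).
run() {
    out=$1; shift
    if [ "$out" = "-" ]; then
        echo "+ $*"
    else
        echo "+ $* > $out"
    fi
    if [ "${DRY_RUN:-1}" = "0" ]; then
        if [ "$out" = "-" ]; then "$@"; else "$@" > "$out"; fi
    fi
}

bwa_map() {
    # 1. Index the reference genome (needed only once per genome).
    run - bwa index "$REF"
    # 2. Align each read file separately; each run writes a .sai file.
    run reads_1.sai bwa aln "$REF" "$READS1"
    run reads_2.sai bwa aln "$REF" "$READS2"
    # 3. Pair the two alignments and report them in SAM format.
    run aln.sam bwa sampe "$REF" reads_1.sai reads_2.sai "$READS1" "$READS2"
}

bwa_map
```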

After aligning, run SAMTools to find variations. The basic steps are:
  1. 'samtools faidx' to index the reference genome.
  2. 'samtools view' to convert the SAM alignments to BAM format.
  3. 'samtools sort' to sort the BAM file by coordinate (required before indexing).
  4. 'samtools index' to index the sorted BAM file.
  5. 'samtools mpileup' to find variations.
In addition, 'samtools view' with a region argument is a useful way to inspect how the reads align to the genome at a given position.
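
The SAMTools steps can be sketched in the same way. This is a rough outline rather than a definitive recipe. It assumes SAMtools 1.x option syntax (older 0.1.x releases use a different 'samtools sort' invocation), placeholder file names, and a hypothetical region string (chr1:1-100) that must be replaced with a sequence name from your own reference. A coordinate sort is included because 'samtools index' requires a sorted BAM file. As written, 'samtools mpileup' produces a text pileup; turning that into variant calls depends on the installed version, so consult the mpileup documentation for that step. The 'run' helper is again a hypothetical convenience that only prints commands unless DRY_RUN=0.

```shell
#!/bin/sh
# Sketch of the SAMTools steps, assuming SAMtools 1.x option syntax.
# ASSUMPTION: file names and REGION are placeholders, not taken from
# mapping.tgz; REGION must name a sequence present in your reference.
REF=${REF:-ref.fasta}
SAM=${SAM:-aln.sam}
REGION=${REGION:-chr1:1-100}

# Hypothetical helper: print a command, and execute it only if DRY_RUN=0.
# The first argument is the file that receives stdout ("-" = no redirection).
run() {
    out=$1; shift
    if [ "$out" = "-" ]; then
        echo "+ $*"
    else
        echo "+ $* > $out"
    fi
    if [ "${DRY_RUN:-1}" = "0" ]; then
        if [ "$out" = "-" ]; then "$@"; else "$@" > "$out"; fi
    fi
}

samtools_steps() {
    # Index the reference so SAMTools can fetch sequence by position.
    run - samtools faidx "$REF"
    # Convert the SAM alignments to binary BAM.
    run aln.bam samtools view -b "$SAM"
    # Sort by coordinate, then index the sorted BAM file.
    run - samtools sort -o aln.sorted.bam aln.bam
    run - samtools index aln.sorted.bam
    # Write the per-position pileup against the reference.
    run aln.pileup samtools mpileup -f "$REF" aln.sorted.bam
    # Inspect the reads overlapping a given region.
    run - samtools view aln.sorted.bam "$REGION"
}

samtools_steps
```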

A more detailed description of how to use mpileup is available at: http://samtools.sourceforge.net/mpileup.shtml