Quantitative Biology - 2012
Lecture 1: Exact Matching
Brute-force, Binary search, quicksort, binary trees, hashing, BWT
Lecture 2: Dynamic Programming and Applications
Dynamic programming, Fibonacci, Longest-increasing subsequencing, Edit Distance, Smith-Waterman, Dynamic Time Warp, BLAST
Lecture 3: Graphs and Genomes
Graph representations, network motifs, modularity, breath-first-search, depth-first-search, eulerian path, traveling salesman, genome assembly, assembly validation.
Lecture 4: Gene Finding and HMMs
Prokaryotic and Eukaryotic gene finding, Glimmer, GlimmerHMM, Forward algorithm, Viterbi, GHMM
Code Sample: BWT Construction and Searching
Construct the BWT for a genome, and compute exact matches against it.
Handouts on the BWT
Details on BWT Construction and searching
Quantitative Biology Bootcamp - 2012
Lecture 1: Molecular Biology, Computers, and Unix
Milestones in Molecular Biology and the rise of sequencing.
Overview of computer systems, introduction to unix and scripting.
Lecture 2: Sequence Alignment and Computational Thinking
Introduction to Alignment and Algorithms, Suffix Arrays, Binary Search, QuickSort,
Burrows-Wheeler Transform, Bowtie.
Lecture 3: Genomic Resources
NCBI, UCSC, CSHL Meetings and Courses, Galaxy.
Lecture 4: Dicovering Origins of Replication
Background on tracking DNA replication with next-gen sequencing,
Walk-through of analysis steps, Visualization of discovered replication sites.
Lecture 5: Matlab Tutorial
Introduction to Matlab, working with variables and matrices, ploting, working with
strings, working with structures, working with CEL arrays.
SBU Graduate Genetics - 2012
Lecture 1: Sequence Alignment and Computational Thinking
Introduction to Alignment and Algorithms, Suffix Arrays, Binary Search, QuickSort,
Burrows-Wheeler Transform, Bowtie.
Lecture 2: Whole Genome Assembly and Alignment
Theory and Practice of whole genome assembly and alignment highlighting the Celera Assembler.
CSHL Advanced Sequencing Course - 2012
Course Wiki
Schedule of presentations and slides.
Whole Genome Assembly and Alignment
[ppt]
Theory and Practice of whole genome assembly and alignment highlighting Allpaths, SOAPdenovo, and the Celera Assembler.
Assembly Tutorial
[ppt]
Tutorial on using Allpaths for whole genome assembly
Challenge Problem Data
[ref genome]
Data for challenge problem in the tutorial
Undergraduate Research Program in Bioinformatics - 2012
Lecture 1: Sequence Alignment and Computational Thinking
In this class we explored the problem of finding exact occurrences of a query
sequence in a large genome or database of sequences. Under this theme, we
started by analyzing the brute force approach introducing the concepts of
algorithm, complexity analysis, and E-values. Next we discussed suffix arrays
as an index for accelerating the search, including analyzing the performance of
binary search. We also considered two traditional algorithms for sorting
(Selection Sort versus QuickSort) and their relative performance. In the second
half of the class we discussed finding approximate occurrences of a short query
sequence in a large genome or database of sequences. We first defined the
problem by considering various metrics of an approximate occurrence such as
hamming distance, or edit distance. We then considered different methods for
computing inexact alignments including brute force global & local
alignments, and seed-and-extend algorithms. Finally we discussed Bowtie as a
Burrows-Wheeler transform based short read mapping algorithm for discovering
alignments to reference genome.
Lecture 2: Sequencing Pitfalls
In this session we reviewed the currently available sequencing technologies and best practices,
focusing on the widely used Illumina sequencing platform, the up and coming PacBio
sequencing platform, and the recently announced Oxford Nanopore instruments. Special
attention was placed on the complexities and biases with Illumina.
Lecture 3: Graphs and Genomes
The theme of this class was graphs and methods for graph analysis. The emphasis
was on genome assembly but included a discussion of other biological networks
including PPI networks, regulation networks, neuron interaction networks, and
cell cycle graphs. In the class, we considered fundamental properties of graphs,
such as nodes, edges, degrees, and shortest paths. We then examined in detail
algorithms for searching graphs with a with breadth-first-search, and then
approaches for finding minimum cost paths through weighed graphs (traveling
salesman problem), including exhaustive search, greedy algorithms, and
branch-and-bound. This lead to a discussion of the intractable nature of
NP-complete problems, and reviewed several important examples (vertex cover,
clique finding, knapsack problem, Hamiltonian cycle).
|