CSHL Quantitative Biology
IFX0a. High level intro to genome assembly
Introduction to de Bruijn graphs for genome assembly
IFX0b. CSHL Inhouse
De novo assembly with long PacBio reads, detection of indels with Scalpel
IFX1. Computational Thinking: Sorting, Searching, and Indexing
Introduction to binary search, suffix arrays, hashing, and the BWT
IFX1b. Notes on the BWT
Scribed notes on computing and searching with the BWT
IFX2. Dynamic Programming: LIS and Sequence Alignment
Application of dynamic programming for sequence alignment: longest increasing subsequence, edit distance, sequence similarity, BLAST, dynamic time warping
IFX2b. Notes on Dynamic Programming
Scribed notes on dynamic programming
IFX3. Graphs and Genomes
Algorithms for graph searching, detailed look at genome assembly.
IFX4. Gene Finding and HMMs
Approaches to prokaryotic and eukaryotic gene finding, hidden Markov models, forward algorithm, Viterbi algorithm.
CSHL Quantitative Biology Bootcamp
Lecture 1: Biology, Computers, and Python
Milestones in Molecular Biology and the rise of sequencing.
Overview of computer systems, introduction to Python and scripting.
Lecture 2: Sequence Alignment and Computational Thinking
Introduction to Alignment and Algorithms, Suffix Arrays, Binary Search
Lecture 3: Genomic Resources
NCBI, UCSC, CSHL Meetings and Courses, Galaxy.
Lecture 4: Unix Scripting
Introduction to Unix, searching the human genome annotation
Lecture 5: Discovering Origins of Replication
Problem by Justin Kinney.
Background on tracking DNA replication with next-gen sequencing,
walk-through of analysis steps, visualization of discovered replication sites.
Lecture 6: Advanced Origins of Replication Analysis [iPython Notebook]
Problem by Justin Kinney.
Plotting, smoothing, and analyzing data
Lecture 7: Transcription Factor Binding Sites [iPython Notebook]
Problem by Justin Kinney.
Parsing and discovering transcription factor binding sites
Python Exercises
Exercises on working with Python
Python Exercise Solutions
Solutions to exercises
CSHL Advanced Sequencing Course
Course Wiki
Schedule and archive of presentations.
Whole Genome Assembly and Alignment
De novo assembly theory and practice; whole genome alignment with MUMmer
Assembly Tutorial
Assembly tutorial to detect a secret message embedded into a microbial genome
AdvSeq.asm.tgz
Data for assembly tutorial
CSHL Programming for Biology
Whole Genome Assembly and Alignment
De novo assembly theory and practice; whole genome alignment with MUMmer
Assembly Tutorial
Assembly tutorial to detect a secret message embedded into a microbial genome
P4B.asm.challenge.tgz
Data for assembly tutorial
CSHL Undergraduate Research Program in Bioinformatics
Lecture 1: Sequence Alignment and Computational Thinking
In this class we explored the problem of finding exact occurrences of a query
sequence in a large genome or database of sequences. Under this theme, we
started by analyzing the brute-force approach, introducing the concepts of
algorithm, complexity analysis, and E-values. Next we discussed suffix arrays
as an index for accelerating the search, including analyzing the performance of
binary search. We also considered two traditional algorithms for sorting
(Selection Sort versus QuickSort) and their relative performance. In the second
half of the class we discussed finding approximate occurrences of a short query
sequence in a large genome or database of sequences. We first defined the
problem by considering various metrics of an approximate occurrence such as
Hamming distance or edit distance. We then considered different methods for
computing inexact alignments, including brute-force global and local
alignments and seed-and-extend algorithms. Finally, we discussed Bowtie as a
Burrows-Wheeler-transform-based short-read mapping algorithm for discovering
alignments to a reference genome.
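As an illustration of the suffix-array idea summarized in Lecture 1, here is a minimal Python sketch, not taken from the lecture materials. The function names `build_suffix_array` and `find_occurrences` are hypothetical, and the construction is the naive sort-all-suffixes approach, which is fine for small examples but far too slow for a real genome.

```python
def build_suffix_array(text):
    """Return the start positions of all suffixes of text, in sorted order.

    Naive construction (an assumption for illustration): sorting by the
    suffix strings themselves, which is simple but slow for long texts.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, suffix_array, query):
    """Return the sorted start positions of exact matches of query in text."""
    m = len(query)

    # Binary search for the first suffix whose m-character prefix is >= query.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + m] < query:
            lo = mid + 1
        else:
            hi = mid
    first = lo

    # Binary search for the first suffix whose m-character prefix is > query.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + m] <= query:
            lo = mid + 1
        else:
            hi = mid

    # Every suffix in [first, lo) starts with query.
    return sorted(suffix_array[first:lo])


genome = "GATTACAGATTACA"
sa = build_suffix_array(genome)
print(find_occurrences(genome, sa, "ATTA"))
```

Each lookup needs a number of string comparisons that grows with the logarithm of the text length, instead of scanning every position as the brute-force approach does.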
Lecture 2: Sequencing Pitfalls
In this session we reviewed the currently available sequencing technologies and best practices,
focusing on the widely used Illumina sequencing platform, the up-and-coming PacBio
sequencing platform, and the recently announced Oxford Nanopore instruments. Special
attention was paid to the complexities and biases of Illumina sequencing.
Lecture 3: Graphs and Genomes
The theme of this class was graphs and methods for graph analysis. The emphasis
was on genome assembly but included a discussion of other biological networks
including PPI networks, regulation networks, neuron interaction networks, and
cell cycle graphs. In the class, we considered fundamental properties of graphs,
such as nodes, edges, degrees, and shortest paths. We then examined in detail
algorithms for searching graphs with breadth-first search, and then
approaches for finding minimum-cost paths through weighted graphs (traveling
salesman problem), including exhaustive search, greedy algorithms, and
branch-and-bound. This led to a discussion of the intractable nature of
NP-complete problems and a review of several important examples (vertex cover,
clique finding, knapsack problem, Hamiltonian cycle).
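The breadth-first search mentioned in Lecture 3 can be sketched as follows. This is an illustrative example rather than the code used in class: the function name `bfs_shortest_path` and the adjacency-list dictionary format are assumptions, and the graph is treated as unweighted, so "shortest" means fewest edges.

```python
from collections import deque


def bfs_shortest_path(graph, source, target):
    """Return a fewest-edges path from source to target, or None if unreachable.

    graph is assumed to be a dict mapping each node to a list of neighbors.
    """
    if source == target:
        return [source]

    parent = {source: None}  # also serves as the set of visited nodes
    queue = deque([source])

    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor in parent:
                continue  # already discovered via a path at least as short
            parent[neighbor] = node
            if neighbor == target:
                # Walk the parent links back to the source.
                path = [target]
                while parent[path[-1]] is not None:
                    path.append(parent[path[-1]])
                return path[::-1]
            queue.append(neighbor)

    return None


toy_graph = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D"],
    "D": ["E"],
    "E": [],
}
print(bfs_shortest_path(toy_graph, "A", "E"))
```

Because nodes are visited in order of their distance from the source, the first time the target is reached the path found uses the minimum number of edges. This guarantee does not carry over to weighted graphs, which is where the minimum-cost path methods discussed in the lecture come in.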
SBU Graduate Genetics
Next-gen sequence analysis
Rise of sequencing; brute-force matching, binary search, genetics of autism.
Whole genome assembly
Review of *-seq assays, assembly theory, ALLPATHS-LG, Celera Assembler, PacBio.
SBU Intro to Computational Biology
Next-gen sequence analysis
Rise of sequencing; alignment with the BWT; genetics of autism
Lecture notes on the BWT
BWT construction, unwinding, exact match
SBU Introduction to Physical and Quantitative Biology
Sequence Alignment and Computational Thinking
Milestones in Molecular Biology and the rise of sequencing. Algorithms for searching and aligning sequences