Research

My lab develops scalable algorithms and systems for analyzing biological sequence data, concentrating on the alignment, assembly, and analysis of high-throughput DNA sequencing reads. This work is critical to unlocking the potential of large-scale sequencing as a tool for genetics research. To meet the scale of this challenge, my work draws on the latest advances in distributed and parallel computing to advance the state of the art in bioinformatics and genomics.

Research Interests

  • Genome Assembly & Validation
  • Sequence Alignment
  • High Performance and Multi-Core Computing
  • Human Genetics
  • Environmental Sampling & Metagenomics
  • Scientific Visualization
  • Cloud Computing

Selected Software Packages

AMOS: A fast and flexible API for genome assembly and manipulations
Assemblytics: Interactive analysis of variants within an assembly
Celera Assembler: The program used to assemble the human genome at Celera Genomics in 2001
CloudBurst: Highly Sensitive Short Read Mapping with MapReduce
Contrail: Assembly of Large Genomes using Cloud Computing
Crossbow: Whole Genome Resequencing Analysis in the Clouds
CrossStitch: Hybrid Phasing and Personal Genome Construction
FALCON: Phased Diploid Genome Assembly with PacBio sequencing reads
Genome-indexing: Rapid Burrows-Wheeler Transform Construction with MapReduce
GenomeScope: Estimate genomic properties from unassembled sequencing reads
Ginkgo: Interactive analysis and assessment of single-cell copy-number variations
Hawkeye: Genome Assembly Viewer and Analysis Tool
LRSim: Linked Read Simulator
MUMmer: A modular system for the rapid whole genome alignment of finished or draft sequence
MUMmerGPU: High-throughput Sequence Alignment on the GPU using the CUDA GPGPU API from NVIDIA
NanoCorr: Hybrid Error Correction of Oxford Nanopore Sequencing Reads
NGM-LR: Fast and Accurate Mapping of Long Reads
PhyloTrac: Visualization and Analysis tool for the PhyloChip
Quake: Quality guided correction and filtration of errors in short reads
Ribbon: Visualization of Long Read Alignments
Scalpel: Indel variant analysis of short-read sequencing data
SplitMem: Pan Genome Analysis with suffix skips
Sniffles: Detection of Structural Variations from Long Reads
SURVIVOR: Detection and Analysis of Structural Variations
Teaser: Fast personalized benchmarks and optimization for NGS read mapping