Crossbow
Crossbow is a Hadoop-based pipeline that combines Bowtie for short-read alignment with SOAPsnp for single nucleotide polymorphism (SNP) calling, running both in parallel on high-throughput DNA sequencing data.
Key Features:
- Integration of Bowtie and SOAPsnp: Uses Bowtie for short-read alignment and SOAPsnp for SNP calling.
- Hadoop-based parallel processing: Distributes computation across cloud-based clusters using the Hadoop framework.
- High-throughput sequencing support: Processes short-read data from high-throughput DNA sequencing at scale.
- Combined alignment and genotyping workflow: Runs alignment and SNP calling as a single pipeline for genotyping analyses.
- Performance demonstration: Analyzed a human genome sequenced at 38-fold coverage in approximately 3 hours on a 320-CPU cloud cluster.
- Cost example: The 38-fold coverage human genome analysis cost approximately $85.
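As a rough sanity check on the performance and cost figures above, the following sketch derives CPU-hours and implied unit costs from the reported numbers. The runtime and cost are the approximate values given above; the genome size of about 3.1 billion bases is an assumption added here for illustration and does not come from the source, so the derived quantities are back-of-the-envelope estimates only.

```python
# Reported figures (approximate, from the performance example above).
CPUS = 320
WALL_HOURS = 3.0
COST_USD = 85.0
COVERAGE = 38

# Assumption for illustration only: a human genome of roughly 3.1 Gbp.
GENOME_BASES = 3.1e9


def summarize(cpus, wall_hours, cost_usd, coverage, genome_bases):
    """Derive aggregate compute and implied average unit costs."""
    cpu_hours = cpus * wall_hours
    gigabases = coverage * genome_bases / 1e9
    return {
        "cpu_hours": cpu_hours,
        # Treats the whole bill as compute time, which is a simplification.
        "usd_per_cpu_hour": cost_usd / cpu_hours,
        "gigabases": gigabases,
        "usd_per_gigabase": cost_usd / gigabases,
    }


summary = summarize(CPUS, WALL_HOURS, COST_USD, COVERAGE, GENOME_BASES)
for key, value in summary.items():
    print(f"{key}: {value:.3f}")
```

Under these assumptions the run amounts to 960 CPU-hours, or a little under $0.09 per CPU-hour on average.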
Scientific Applications:
- Large-scale genomic studies: Enables rapid processing of whole-genome sequencing datasets, such as human genomes.
- Genotyping and SNP discovery: Performs genome-wide single nucleotide polymorphism detection and genotyping from short-read data.
- High-coverage variant analysis: Supports variant detection and characterization in high-coverage sequencing datasets.
Methodology:
Reads are aligned with Bowtie and SNPs are called with SOAPsnp; the Hadoop framework distributes both steps across cloud-based clusters. In the reported example, 320 CPUs analyzed a 38-fold coverage human genome in about 3 hours for about $85.
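The division of labor described above can be illustrated with a toy, single-process map/shuffle/reduce sketch. This is not Crossbow's code and it does not call Bowtie, SOAPsnp, or Hadoop: a brute-force mismatch count stands in for alignment, a simple majority vote stands in for SNP calling, and every name (`map_align`, `shuffle`, `reduce_call`, `BIN_SIZE`) and every sequence is hypothetical. It only shows how aligned bases can be grouped by genomic region so that each region can be genotyped independently.

```python
from collections import Counter, defaultdict

REFERENCE = "ACGTACGTTAGCCGATAGCTAGGCTTAACG"  # toy reference, not real data
BIN_SIZE = 10  # hypothetical partition width; each bin is one reduce task


def map_align(read):
    """Stand-in for alignment: pick the offset with the fewest mismatches."""
    best_pos, best_mm = None, len(read) + 1
    for pos in range(len(REFERENCE) - len(read) + 1):
        window = REFERENCE[pos:pos + len(read)]
        mm = sum(a != b for a, b in zip(read, window))
        if mm < best_mm:
            best_pos, best_mm = pos, mm
    # Emit one (bin, (position, base)) record per aligned base.
    return [((best_pos + i) // BIN_SIZE, (best_pos + i, base))
            for i, base in enumerate(read)]


def shuffle(records):
    """Group records by bin key, as a MapReduce shuffle step would."""
    groups = defaultdict(list)
    for key, value in records:
        groups[key].append(value)
    return groups


def reduce_call(pileup, min_depth=2):
    """Stand-in for SNP calling: majority vote per position vs. reference."""
    by_pos = defaultdict(Counter)
    for pos, base in pileup:
        by_pos[pos][base] += 1
    calls = []
    for pos in sorted(by_pos):
        base, count = by_pos[pos].most_common(1)[0]
        depth = sum(by_pos[pos].values())
        if depth >= min_depth and base != REFERENCE[pos] and count > depth / 2:
            calls.append((pos, REFERENCE[pos], base))
    return calls


# Toy reads that all carry a T at reference position 12 (reference has C).
reads = ["TAGCTGAT", "GCTGATAG", "TTAGCTGA"]

records = [rec for read in reads for rec in map_align(read)]
snps = [call
        for _, pileup in sorted(shuffle(records).items())
        for call in reduce_call(pileup)]
print(snps)
```

Because each bin's pileup is reduced without reference to any other bin, a framework such as Hadoop could in principle schedule the map and reduce calls on different nodes; here they simply run in sequence.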
Details
- Maturity: Mature
- Tool Type: command-line tool
- Programming Languages: Perl
- Added: 1/13/2017
- Last Updated: 11/25/2024
Publications
Langmead B, Schatz MC, Lin J, Pop M, Salzberg SL. Searching for SNPs with cloud computing. Genome Biology. 2009;10(11):R134. doi:10.1186/gb-2009-10-11-r134. PMID:19930550. PMCID:PMC3091327.