Rainbow
Rainbow clusters and assembles restriction-site associated DNA sequencing (RAD-seq) paired-end reads to produce de novo RAD locus contigs and haplotypes for population-genomic analyses.
Key Features:
- Paired-end read clustering: Clusters paired-end short reads into groups based on unique tags.
- Spaced seed method: Uses a spaced seed method to organize reads while tolerating sequencing errors.
- Error, heterozygosity, and repeat handling: Accommodates sequencing errors, varying levels of heterozygosity, and repetitive sequences during clustering.
- Haplotype subdivision: Divides candidate clusters into haplotypes using a top-down strategy similar to heterozygote calling.
- Guided tree and merging: Constructs a guided tree and merges sibling leaves in a bottom-up manner when they exhibit sufficient similarity.
- Similarity assessment: Assesses similarity by comparing the second reads of RAD segments to collapse heterozygous sequences and distinguish repeats.
- Local assembly: Uses a greedy algorithm to locally assemble merged reads into contigs.
- Multiple assembly outputs: Produces both optimal and suboptimal assembly results.
- Performance: Implemented in C; the authors report it to be fast and memory-efficient on RAD-seq data.
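The spaced seed idea in the feature list can be illustrated with a minimal Python sketch. It is not Rainbow's C implementation: the seed mask, the function names, and the choice to key each pair on the start of its first read are all assumptions made for illustration. A spaced seed compares only the positions marked `1` in a mask, so two reads that differ at an ignored position (for example, through a sequencing error) still land in the same group.

```python
from collections import defaultdict

# Hypothetical seed mask: '1' = position must match, '0' = position is ignored.
# Rainbow's real seed design is not described here; this mask is an assumption.
SEED_MASK = "1101101101"


def spaced_key(read, mask=SEED_MASK):
    """Return the bases of `read` at the '1' positions of `mask`."""
    if len(read) < len(mask):
        raise ValueError("read is shorter than the seed mask")
    return "".join(base for base, m in zip(read, mask) if m == "1")


def cluster_by_spaced_seed(reads, mask=SEED_MASK):
    """Group read names by the spaced-seed key of their sequence.

    `reads` is an iterable of (name, sequence) pairs, where the sequence
    stands in for the tag (first read) of a paired-end fragment.
    """
    clusters = defaultdict(list)
    for name, seq in reads:
        clusters[spaced_key(seq, mask)].append(name)
    return dict(clusters)


if __name__ == "__main__":
    example = [
        ("r1", "ACGTACGTACGG"),
        ("r2", "ACCTACGTACGG"),  # differs from r1 at an ignored position
        ("r3", "TTGTACGTACGG"),  # differs from r1 at required positions
    ]
    for key, names in cluster_by_spaced_seed(example).items():
        print(key, names)
```

A single mask only tolerates mismatches at its ignored positions, so a practical tool would typically combine several seeds or follow up with a finer comparison.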
Scientific Applications:
- RAD-seq de novo assembly: Assembly of RAD loci from next-generation sequencing RAD-seq datasets.
- Haplotype reconstruction: Reconstruction of haplotypes and resolution of heterozygous alleles within RAD loci.
- Repeat discrimination: Distinguishing repetitive sequences from allelic variation in RAD segments.
- Population genomics: Analyses of genetic diversity and structure; the tool was evaluated in simulation studies and on real guppy RAD-seq datasets.
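The distinction between allelic variation and repeats can be sketched as a similarity test on second reads. The Python example below is a simplified illustration, not Rainbow's actual criterion: the identity measure, the 0.9 threshold, and the function names are assumptions. The idea is that two sibling groups whose second reads are nearly identical probably represent alleles of one locus and can be collapsed, while groups whose second reads diverge strongly probably come from different (repetitive) loci.

```python
def identity(a, b):
    """Fraction of matching positions over the shared length of two sequences."""
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / n


def classify_siblings(second_a, second_b, min_identity=0.9):
    """Decide whether two sibling groups should be merged.

    `second_a` and `second_b` stand in for representative second reads of
    two groups. The threshold `min_identity` is a hypothetical parameter.
    Returns "merge" (likely alleles of one locus) or "keep_separate"
    (likely distinct or repetitive loci).
    """
    if identity(second_a, second_b) >= min_identity:
        return "merge"
    return "keep_separate"


if __name__ == "__main__":
    allele_1 = "ACGTACGTACGTACGTACGT"
    allele_2 = "ACGTACGTACGTACGTACGA"   # one mismatch in 20 bases
    paralog = "ACTTAGGTACCTACGAACGT"    # several mismatches
    print(classify_siblings(allele_1, allele_2))
    print(classify_siblings(allele_1, paralog))
```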
Methodology:
Rainbow clusters paired-end reads using a spaced seed method based on unique tags. It then applies a heterozygote-calling-like top-down subdivision into haplotypes, constructs a guided tree, and merges sibling leaves bottom-up by comparing the second reads of RAD segments. Merged reads are assembled into contigs with a greedy algorithm, and both optimal and suboptimal assembly results are reported. The tool is implemented in C.
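The final greedy assembly step can be illustrated with a generic overlap-merge sketch in Python. This is a textbook greedy assembler under stated assumptions (exact suffix-prefix overlaps, a hypothetical `min_len` cutoff, invented function names), not a description of Rainbow's internal algorithm. At each round it merges the pair of sequences with the longest overlap until no pair overlaps by at least `min_len` bases.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 when the longest such overlap is shorter than `min_len`.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the longest exact overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i == j:
                    continue
                n = overlap(a, b, min_len)
                if n > best_len:
                    best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return contigs  # no remaining pair overlaps enough
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


if __name__ == "__main__":
    print(greedy_assemble(["ACGTAC", "TACGGA", "GGATTC"]))
```

Greedy merging is fast but can make locally optimal choices that are globally wrong, which is one reason an assembler might also report suboptimal alternatives.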
Details
- Tool Type: command-line tool
- Operating Systems: Linux
- Programming Languages: C
- Added: 12/18/2017
- Last Updated: 11/25/2024
Publications
Chong Z, Ruan J, Wu CI. Rainbow: an integrated tool for efficient clustering and assembling RAD-seq reads. Bioinformatics. 2012;28(21):2732-2737. doi:10.1093/bioinformatics/bts482. PMID:22942077.