Rainbow

Rainbow clusters and assembles restriction-site associated DNA sequencing (RAD-seq) paired-end reads to produce de novo RAD locus contigs and haplotypes for population-genomic analyses.


Key Features:

  • Paired-end read clustering: Clusters paired-end short reads into groups based on unique tags.
  • Spaced seed method: Uses a spaced seed method to organize reads while tolerating sequencing errors.
  • Error, heterozygosity, and repeat handling: Accommodates sequencing errors, varying levels of heterozygosity, and repetitive sequences during clustering.
  • Haplotype subdivision: Divides candidate groups into haplotypes top-down, using a strategy similar to heterozygote calling.
  • Guided tree and merging: Constructs a guided tree and merges sibling leaves in a bottom-up manner when they exhibit sufficient similarity.
  • Similarity assessment: Assesses similarity by comparing the second reads of RAD segments to collapse heterozygous sequences and distinguish repeats.
  • Local assembly: Uses a greedy algorithm to locally assemble merged reads into contigs.
  • Multiple assembly outputs: Produces both optimal and suboptimal assembly results.
  • Performance: Implemented in C; the authors report it to be fast and memory-efficient on RAD-seq data.
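
The spaced seed idea in the list above can be illustrated with a minimal Python sketch. It is an illustration only, not Rainbow's C implementation: the mask `SEED_MASK` and the helper names `seed_key` and `cluster_by_seed` are assumptions made for this example. Reads are grouped by the bases at the '1' positions of the mask, so a mismatch at a '0' position does not separate two reads.

```python
from collections import defaultdict

# Hypothetical seed mask (an assumption for this sketch, not Rainbow's seed):
# '1' positions must match, '0' positions are ignored.
SEED_MASK = "1101101101"


def seed_key(read, mask=SEED_MASK):
    """Return the bases of `read` at the '1' positions of `mask`.

    Returns None when the read is shorter than the mask.
    """
    if len(read) < len(mask):
        return None
    return "".join(base for base, keep in zip(read, mask) if keep == "1")


def cluster_by_seed(reads, mask=SEED_MASK):
    """Group reads that share the same spaced-seed key."""
    clusters = defaultdict(list)
    for read in reads:
        key = seed_key(read, mask)
        if key is not None:  # skip reads too short to carry the seed
            clusters[key].append(read)
    return dict(clusters)


reads = [
    "ACGTACGTAC",  # reference-like read
    "ACTTACGTAC",  # differs at index 2, a '0' position in the mask
    "TTGTACGTAC",  # differs at indices 0 and 1, both '1' positions
    "ACG",         # too short, ignored
]
clusters = cluster_by_seed(reads)
```

With a single mask, a mismatch that falls on a '1' position still splits a cluster; combining several masks is one common way to reduce that effect, though whether and how Rainbow does so is not stated here.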

Scientific Applications:

  • RAD-seq de novo assembly: Assembly of RAD loci from next-generation sequencing RAD-seq datasets.
  • Haplotype reconstruction: Reconstruction of haplotypes and resolution of heterozygous alleles within RAD loci.
  • Repeat discrimination: Distinguishing repetitive sequences from allelic variation in RAD segments.
  • Population genomics: Analyses of genetic diversity and structure; the tool was evaluated on simulated data and on real guppy RAD-seq datasets.

Methodology:

Rainbow first clusters paired-end reads using a spaced seed method based on unique tags. It then subdivides the clusters into haplotypes with a top-down, heterozygote-calling-like strategy, constructs a guided tree, and merges sibling leaves bottom-up by comparing the second reads of RAD segments. Finally, it assembles the merged reads into contigs with a greedy algorithm and reports both optimal and suboptimal assembly results. The tool is implemented in C.
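
The final greedy assembly step can be pictured with a generic suffix-prefix overlap merger. The Python sketch below is a textbook greedy strategy under simplifying assumptions (exact overlaps only, no handling of reads contained inside other reads); it is not Rainbow's actual assembler, and the names `overlap`, `greedy_assemble`, and `min_len` are hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 when no overlap of at least `min_len` bases exists.
    """
    max_len = min(len(a), len(b))
    for n in range(max_len, min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the longest exact overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i == j:
                    continue
                n = overlap(a, b, min_len)
                if n > best_len:
                    best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


assembled = greedy_assemble(["ACGTAC", "TACGGA", "GGATTT"])
separate = greedy_assemble(["AAAA", "CCCC"])
```

Because the merge choice is purely local, a greedy pass can commit to an overlap that a global method would avoid, which is one reason a tool might also report alternative assemblies.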

Details

Tool Type: command-line tool
Operating Systems: Linux
Programming Languages: C
Added: 12/18/2017
Last Updated: 11/25/2024

Publications

Chong Z, Ruan J, Wu C. Rainbow: an integrated tool for efficient clustering and assembling RAD-seq reads. Bioinformatics. 2012;28(21):2732-2737. doi:10.1093/bioinformatics/bts482. PMID:22942077.
