RecoverY

RecoverY: k-mer-based Y Chromosome Read Classification and Assembly Optimization

RecoverY improves assembly of the haploid mammalian Y chromosome, which is typically under-represented in sequencing data because of its high repeat content and low sequencing depth. It applies k-mer-based read classification to identify and select Y-specific reads before assembly.

Key Features:

Automated Parameter Selection: Automatically determines the k-mer abundance threshold defining Y-specific k-mers, eliminating manual parameter tuning.
Integration of Prior Knowledge: Incorporates Y chromosome information from related species or known Y transcript sequences to improve Y-specific read identification accuracy.
Robust Performance Across Datasets: Validated on simulated and real human and gorilla genome data, demonstrating stability across parameter variations.
Improved Assembly Metrics: Achieves a 33% increase in assembly size and a 20% improvement in NG50 compared to alternative read- or contig-filtering strategies, as reported in the publication listed below.
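
For readers unfamiliar with the NG50 metric named above, the following is a minimal sketch of how it is commonly computed. It is not taken from RecoverY itself; the function name ng50 and the genome_size argument (an estimated size of the target sequence, supplied by the user) are assumptions made for illustration.

```python
def ng50(contig_lengths, genome_size):
    """Return the NG50 of an assembly.

    NG50 is the length L of the contig at which the running total of
    contig lengths, taken from longest to shortest, first reaches half
    of the (estimated) genome size. Returns 0 if the assembly never
    covers half of genome_size.
    """
    half = genome_size / 2
    covered = 0
    for length in sorted(contig_lengths, reverse=True):
        covered += length
        if covered >= half:
            return length
    return 0
```

Unlike N50, which uses the total assembly length as its reference, NG50 uses a fixed genome size, so two assemblies of different total size can be compared on the same scale.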

Scientific Applications:

Y-Chromosome Genomics: Supports accurate assembly for studies of genetic diversity, evolutionary biology, and sex chromosome-associated disease.

Methodology:

RecoverY classifies sequencing reads using k-mer abundance profiles to identify Y-specific sequences. Automatic threshold selection, guided by prior Y chromosome knowledge, refines discrimination of Y-derived reads within complex genomic datasets prior to assembly.
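
The idea described above can be illustrated with a small, self-contained sketch. This is not RecoverY's actual code or command-line interface: the function names, the quantile-based threshold rule, and the min_fraction read-selection rule are all hypothetical choices made for illustration, and a real pipeline would use a dedicated k-mer counter rather than in-memory Python counting.

```python
from collections import Counter


def kmers(seq, k):
    """Return all overlapping k-mers of a sequence, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def count_kmers(reads, k):
    """Count k-mer abundances across all reads."""
    counts = Counter()
    for read in reads:
        counts.update(kmers(read, k))
    return counts


def choose_threshold(counts, known_y_seqs, k, quantile=0.05):
    """Pick an abundance threshold from prior Y knowledge.

    Hypothetical rule: look up the abundances of k-mers that occur in
    known Y sequences (for example Y transcripts) and take a low
    quantile of those abundances as the cut-off.
    """
    abundances = sorted(
        counts[km]
        for seq in known_y_seqs
        for km in kmers(seq, k)
        if counts[km] > 0
    )
    if not abundances:
        raise ValueError("no k-mers from the known Y sequences were observed")
    return abundances[int(quantile * (len(abundances) - 1))]


def select_y_reads(reads, counts, k, threshold, min_fraction=0.5):
    """Keep reads in which enough k-mers reach the abundance threshold.

    min_fraction is an assumed parameter: the minimum share of a
    read's k-mers that must have abundance >= threshold.
    """
    selected = []
    for read in reads:
        read_kmers = kmers(read, k)
        if not read_kmers:
            continue  # read shorter than k
        hits = sum(1 for km in read_kmers if counts[km] >= threshold)
        if hits / len(read_kmers) >= min_fraction:
            selected.append(read)
    return selected
```

In use, one would call count_kmers on the reads, pass the result and the known Y sequences to choose_threshold, and then hand the selected reads from select_y_reads to an assembler. The sketch relies on Y-derived k-mers being more abundant than the rest of the dataset.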

Topics

Sequencing, Sequence assembly, Sequence analysis

Details

Tool Type: command-line tool
Operating Systems: Linux, Mac
Programming Languages: Python
Added: 6/24/2018
Last Updated: 11/25/2024

Publications

Rangavittal S, Harris RS, Cechova M, Tomaszkiewicz M, Chikhi R, Makova KD, Medvedev P. RecoverY: k-mer-based read classification for Y-chromosome-specific sequencing and assembly. Bioinformatics. 2017;34(7):1125-1131. doi:10.1093/bioinformatics/btx771. PMID:29194476. PMCID:PMC6030959.

Funding: NSF DBI-1356529, IIS-1453527, IIS-1421908, CCF-1439057, and DBI-ABI 0965596

Documentation

General

https://github.com/makovalab-psu/RecoverY
