misFinder
misFinder identifies and corrects mis-assemblies in genome assemblies by integrating reference-genome comparison and paired-end read alignment to distinguish assembly errors from structural variations.
Key Features:
- Unbiased Error Identification: Integrates reference-genome comparison and aligned paired-end reads to distinguish mis-assemblies from true structural variations.
- High Accuracy in Error Detection: Leverages a reference genome (or closely related references) together with coverage data and insert-distance consistency features derived from paired-end reads to make high-confidence error calls.
- Reduction of False Positives/Negatives: Cross-checks each candidate error against both the reference and the read evidence, so that fewer true mis-assemblies are missed and fewer correct regions are flagged, which improves the reliability of downstream genomic analyses.
- Benchmark Performance: Reported to detect more true mis-assemblies with fewer false calls than QUAST and REAPR on simulated and real paired-end read datasets.
Scientific Applications:
- Variant Detection: Improves accuracy of variant calling by correcting mis-assemblies that could confound variant detection.
- Gene Annotation: Supports reliable gene annotation by helping keep gene loci contiguous and correctly assembled.
- Comparative Genomics: Enables accurate comparative genomics by distinguishing structural variation from assembly artifacts.
Methodology:
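misFinder combines reference-genome comparison with paired-end read alignment. Differences between the assembly and the reference are treated as candidate errors, and each candidate is then assessed with read-derived features (coverage consistency and insert distance). Candidates that the reads contradict are reported as mis-assembled positions and corrected, while candidates that the reads support are treated as structural variations.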
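The read-evidence step described above can be illustrated with a minimal Python sketch. It is not misFinder's implementation (the tool itself is written in C), and every name, threshold, and the simple z-score test below is a hypothetical choice made for illustration. The sketch assumes a candidate breakpoint has already been found by comparing a contig with the reference, and asks whether paired-end reads aligned to that contig support the contig at that position.

```python
from dataclasses import dataclass
from math import sqrt
from statistics import mean


@dataclass
class ReadPair:
    """One paired-end read pair aligned to a contig (hypothetical model)."""
    left_start: int  # leftmost aligned position of the pair
    right_end: int   # rightmost aligned position of the pair

    @property
    def insert_size(self) -> int:
        return self.right_end - self.left_start


def spanning_pairs(pairs, pos):
    """Return the pairs whose fragment covers the candidate position."""
    return [p for p in pairs if p.left_start < pos < p.right_end]


def classify_candidate(pairs, pos, lib_mean, lib_sd, expected_cov,
                       min_cov_ratio=0.5, max_z=3.0):
    """Classify a contig/reference disagreement at `pos`.

    Assumed thresholds (not taken from misFinder):
      min_cov_ratio: minimum spanning coverage relative to expected_cov
      max_z: maximum z-score of the mean insert size of spanning pairs
    """
    span = spanning_pairs(pairs, pos)

    # Coverage consistency: too few fragments span the position.
    if len(span) / expected_cov < min_cov_ratio:
        return "likely_misassembly"

    # Insert-distance consistency: the mean insert size of spanning
    # pairs should match the library mean within sampling error.
    standard_error = lib_sd / sqrt(len(span))
    z = abs(mean(p.insert_size for p in span) - lib_mean) / standard_error
    if z > max_z:
        return "likely_misassembly"

    # Reads support the contig, so the difference from the reference
    # is more plausibly a real structural variation.
    return "likely_structural_variation"


# Usage: 20 pairs with a 300 bp insert, all spanning position 350.
supported = [ReadPair(100 + 10 * i, 400 + 10 * i) for i in range(20)]
print(classify_candidate(supported, pos=350, lib_mean=300, lib_sd=30,
                         expected_cov=20))
```

A real pipeline would read these positions from a BAM or SAM file and would likely use more robust statistics than a single z-score. The point of the sketch is only the decision order: reference disagreement first, read support second.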
Topics
Details
- Tool Type: command-line tool
- Operating Systems: Linux
- Programming Languages: C
- Added: 8/3/2017
- Last Updated: 11/25/2024
Operations
Data Inputs & Outputs
Sequence assembly
Inputs
Outputs
Publications
Zhu X, Leung HCM, Wang R, Chin FYL, Yiu SM, Quan G, Li Y, Zhang R, Jiang Q, Liu B, Dong Y, Zhou G, Wang Y. misFinder: identify mis-assemblies in an unbiased manner using reference and paired-end reads. BMC Bioinformatics. 2015;16(1). doi:10.1186/s12859-015-0818-3. PMID:26573684. PMCID:PMC4647709.