FusorSV

FusorSV integrates the outputs of multiple structural variation (SV) callers into a single fused callset, improving detection of deletions, insertions, duplications, inversions, and translocations from next-generation sequencing data.


Key Features:

  • Fusion model: Built from an analysis of 27 deep-coverage human genomes from the 1000 Genomes Project to make SV detection more reliable.
  • Ensemble integration: Takes the outputs of an ensemble of existing SV-calling algorithms and merges their callsets into one cohesive dataset.
  • Data fusion method: Employs a novel data fusion method to combine SV calls across callers.
  • Data mining strategy: Uses a data mining strategy to evaluate caller performance prior to merging callsets.
  • SV types detected: Targets deletions, insertions, duplications, inversions, and translocations from next-generation sequencing data.
  • Novel call discovery: Identified 843 novel SV calls not reported in the original 1000 Genomes Project data for the analyzed samples.
  • Experimental validation: A subset of newly identified SVs achieved an experimental validation rate of 86.7%.
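
To make the idea of merging callsets from an ensemble of callers concrete, here is a minimal sketch in Python. It is not FusorSV's fusion method: it only unions overlapping calls of the same SV type and records which callers supported each merged call. The function name `merge_calls`, the caller names, and the tuple layout `(chrom, start, end, sv_type)` with half-open coordinates are assumptions made for illustration.

```python
from collections import defaultdict


def merge_calls(callsets):
    """Merge overlapping SV calls of the same type across callers.

    callsets: dict mapping a (hypothetical) caller name to a list of
    (chrom, start, end, sv_type) tuples with half-open coordinates.
    Returns a list of (chrom, start, end, sv_type, supporting_callers).
    """
    # Group calls by chromosome and SV type so only comparable calls merge.
    groups = defaultdict(list)
    for caller, calls in callsets.items():
        for chrom, start, end, sv_type in calls:
            groups[(chrom, sv_type)].append((start, end, caller))

    merged = []
    for (chrom, sv_type), items in sorted(groups.items()):
        items.sort()  # sort by start, then end
        cur_start, cur_end, callers = items[0][0], items[0][1], {items[0][2]}
        for start, end, caller in items[1:]:
            if start < cur_end:
                # Overlaps the current merged interval: extend it.
                cur_end = max(cur_end, end)
                callers.add(caller)
            else:
                merged.append((chrom, cur_start, cur_end, sv_type, sorted(callers)))
                cur_start, cur_end, callers = start, end, {caller}
        merged.append((chrom, cur_start, cur_end, sv_type, sorted(callers)))
    return merged


example = {
    "caller_a": [("chr1", 100, 200, "DEL")],
    "caller_b": [("chr1", 150, 250, "DEL"), ("chr1", 500, 600, "DUP")],
}
for call in merge_calls(example):
    print(call)
```

A plain union like this treats every caller as equally trustworthy. The entry above describes evaluating caller performance before merging, which a real fusion step would take into account.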

Scientific Applications:

  • Comprehensive SV analysis: Facilitates comprehensive analysis of structural variation across large-scale genomic datasets.
  • Novel SV discovery: Enables discovery of novel structural variants absent from prior callsets.
  • Genetic variation studies: Supports studies of genetic variation relevant to complex disease and evolutionary biology.
  • Genomic research and clinical use: Improves SV detection accuracy for genomic research and potential clinical applications.

Methodology:

FusorSV takes the outputs of an ensemble of SV-calling algorithms, uses a data mining strategy to evaluate each caller's performance, and then merges the callsets with a novel data fusion method. The underlying fusion model was developed from an analysis of 27 deep-coverage human genomes from the 1000 Genomes Project.
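
The "evaluate caller performance" step can be pictured with a small Python sketch. It scores each caller by the Jaccard similarity between the bases its calls cover and the bases covered by a reference callset, then ranks the callers. The metric, the function names (`covered_positions`, `jaccard`, `rank_callers`), and the toy coordinates are assumptions chosen for illustration and are not taken from the FusorSV code base.

```python
def covered_positions(calls):
    """Return the set of base positions covered by half-open (start, end) calls."""
    positions = set()
    for start, end in calls:
        positions.update(range(start, end))
    return positions


def jaccard(calls, truth):
    """Base-pair Jaccard similarity between a callset and a reference callset."""
    a, b = covered_positions(calls), covered_positions(truth)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_callers(callsets, truth):
    """Score each (hypothetical) caller against the reference, best first."""
    scores = {name: jaccard(calls, truth) for name, calls in callsets.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


truth = [(100, 200)]
callsets = {
    "caller_a": [(100, 200)],  # matches the reference exactly
    "caller_b": [(150, 250)],  # overlaps half of the reference
}
for name, score in rank_callers(callsets, truth):
    print(f"{name}: {score:.3f}")
```

Expanding intervals into sets of positions keeps the sketch short but only suits toy inputs. Genome-scale callsets would need interval arithmetic instead.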

Details

Tool Type: command-line tool
Operating Systems: Linux, Mac
Programming Languages: Python
Added: 7/14/2018
Last Updated: 12/16/2018

Publications

Becker T, Lee W, Leone J, Zhu Q, Zhang C, Liu S, Sargent J, Shanker K, Mil-homens A, Cerveira E, Ryan M, Cha J, Navarro FCP, Galeev T, Gerstein M, Mills RE, Shin D, Lee C, Malhotra A. FusorSV: an algorithm for optimally combining data from multiple structural variation detection methods. Genome Biology. 2018;19(1). doi:10.1186/s13059-018-1404-6. PMID:29559002. PMCID:PMC5859555.

Funding:

  • National Human Genome Research Institute: U41HG007497
  • National Cancer Institute: P30CA034196
  • Ewha Womans University: Ewha Womans University Research grant of 2015-6
