BIGpre
BIGpre evaluates and preprocesses next-generation sequencing (NGS) data to assess read-level quality and prepare datasets for downstream analyses such as genome assembly, variant calling, and transcriptome profiling.
Key Features:
- Platform Compatibility: Supports Illumina and 454 sequencing platforms.
- Read-level Quality Metrics: Evaluates the correlation between forward and reverse reads, analyzes the read GC-content distribution, and assesses ambiguous base calls (Ns).
- Duplicate Read Management: Detects and removes duplicate reads while accounting for sequencing errors.
- Quality Trimming: Trims low-quality reads from raw data.
- Efficient Processing: Processes hundreds of millions of reads within minutes to provide rapid diagnostic information.
- Graphical and Tabular Summaries: Generates tabular and graphical summaries using the R statistics package.
- Implementation: Written primarily in Perl.
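As an illustration of the read-level metrics listed above, the following minimal Python sketch computes a per-read GC fraction, a binned GC-content distribution, and the fraction of N bases. It is not BIGpre's code (BIGpre is written in Perl); the function names, the bin width, and the choice to exclude Ns from the GC denominator are assumptions made for this example.

```python
from collections import Counter

def gc_fraction(seq):
    """Fraction of G/C among unambiguous bases (Ns are excluded; an assumption)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0

def n_fraction(seq):
    """Fraction of ambiguous base calls (N) in a read."""
    return seq.upper().count("N") / len(seq) if seq else 0.0

def gc_histogram(reads, bin_width=10):
    """Count reads per GC-percentage bin (keys are bin lower bounds)."""
    hist = Counter()
    for seq in reads:
        pct = int(round(gc_fraction(seq) * 100))
        hist[(pct // bin_width) * bin_width] += 1
    return dict(sorted(hist.items()))

if __name__ == "__main__":
    reads = ["ACGTACGT", "GGGCCCNN", "ATATATAT"]  # toy data, hypothetical
    print(gc_histogram(reads))
    print([n_fraction(r) for r in reads])
```

A GC histogram that departs strongly from the expected distribution for the organism, or a high N fraction, is the kind of signal such a summary is meant to surface.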
Scientific Applications:
- NGS Quality Control: Provides immediate and comprehensive assessment of read-level quality metrics for NGS datasets.
- Genome Assembly: Improves input data quality for genome assembly by removing duplicates and trimming low-quality reads.
- Variant Calling: Supports variant calling by assessing base quality and removing duplicate reads that could bias calls.
- Transcriptome Profiling: Enhances transcriptome analyses by evaluating read quality metrics and trimming low-quality reads.
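Duplicate removal that tolerates sequencing errors, as mentioned for assembly and variant calling above, can be sketched as follows. This entry does not describe BIGpre's actual algorithm, so the approach here (greedy comparison by Hamming distance against already kept reads) and the names `hamming`, `remove_near_duplicates`, and `max_mismatches` are assumptions chosen for clarity rather than speed.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def remove_near_duplicates(reads, max_mismatches=1):
    """Keep the first read of each group of near-identical reads.

    Two reads count as duplicates when they have the same length and
    differ at no more than max_mismatches positions (assumed criterion).
    """
    kept = []
    for seq in reads:
        is_dup = any(
            len(seq) == len(rep) and hamming(seq, rep) <= max_mismatches
            for rep in kept
        )
        if not is_dup:
            kept.append(seq)
    return kept

if __name__ == "__main__":
    reads = ["ACGTACGT", "ACGTACGA", "TTTTCCCC", "ACGTACGT"]  # toy data
    print(remove_near_duplicates(reads))
```

The pairwise comparison is quadratic in the number of distinct reads, so a tool handling hundreds of millions of reads would need an indexed or hashed strategy instead; the sketch only illustrates the tolerance criterion.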
Methodology:
The computational steps include evaluating forward/reverse read correlation, computing read GC-content distributions and base N quality, detecting and removing duplicate reads with error-aware filtering, performing quality trimming, and producing tabular and graphical summaries via R. The software is implemented primarily in Perl and processes hundreds of millions of reads within minutes.
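The quality-trimming step can be illustrated with a short Python sketch that removes low-quality bases from the 3' end of a read and discards reads that become too short. The thresholds `min_q` and `min_len`, the Phred+33 quality encoding, and the function names are assumptions for this example and are not taken from BIGpre; older Illumina pipelines (1.3 to 1.7) used a Phred+64 offset, which the `offset` parameter accommodates.

```python
def phred_scores(qual, offset=33):
    """Decode an ASCII quality string into integer Phred scores."""
    return [ord(c) - offset for c in qual]

def trim_3prime(seq, qual, min_q=20, min_len=4, offset=33):
    """Trim trailing bases with quality below min_q.

    Returns (trimmed_seq, trimmed_qual), or None when the trimmed read
    is shorter than min_len (hypothetical discard rule).
    """
    scores = phred_scores(qual, offset)
    end = len(seq)
    while end > 0 and scores[end - 1] < min_q:
        end -= 1
    if end < min_len:
        return None
    return seq[:end], qual[:end]

if __name__ == "__main__":
    # 'I' encodes Q40 and '#' encodes Q2 under Phred+33.
    print(trim_3prime("ACGTACGT", "IIIII###"))
    print(trim_3prime("ACGT", "####"))
```

End trimming is only one possible strategy; sliding-window or whole-read filtering would follow the same decode-then-threshold pattern.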
Details
- Tool Type: command-line tool
- Operating Systems: Linux
- Programming Languages: Perl
- Added: 8/3/2017
- Last Updated: 11/25/2024
Publications
Zhang T, Luo Y, Liu K, Pan L, Zhang B, Yu J, Hu S. BIGpre: A Quality Assessment Package for Next-Generation Sequencing Data. Genomics, Proteomics & Bioinformatics. 2011;9(6):238-244. doi:10.1016/s1672-0229(11)60027-2. PMID:22289480. PMCID:PMC5054156.