gdsfmt
'gdsfmt' is an R package that provides the interface to Genomic Data Structure (GDS) files and serves as the storage component underlying the 'SeqArray' package. It addresses challenges in the analysis of whole-genome sequencing (WGS) data, particularly the large file sizes and slow data retrieval associated with the Variant Call Format (VCF).
Used within the SeqArray package, 'gdsfmt' contributes to the implementation of a new WGS variant data format. The format is array-oriented and offers capabilities similar to VCF, with more compression options and efficient data access through high-performance parallel computing. Benchmarks on 1000 Genomes Phase 3 data show smaller file sizes and faster genotype reading than VCF and binary VCF (BCF). The SeqArray package, including 'gdsfmt', offers a flexible, feature-rich, and high-performance programming environment for analyzing WGS variant data within the R/Bioconductor framework.
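As a rough illustration of the array-oriented storage described above, the following minimal sketch writes a small genotype-like matrix to a compressed GDS node and then reads back only a slice of it. The node name "genotype", the toy values, and the choice of 2-bit storage with ZIP compression are assumptions made for this example. They do not reproduce the SeqArray file layout or the benchmark setup.

```r
library(gdsfmt)

# Temporary file used only for this illustration
fn <- tempfile(fileext = ".gds")

# Toy matrix of genotype-like codes (0, 1, 2): 3 rows by 4 columns
geno <- matrix(c(0L, 1L, 2L,
                 1L, 0L, 2L,
                 2L, 1L, 0L,
                 0L, 1L, 1L), nrow = 3)

# Create the file and store the matrix as a compressed 2-bit array
f <- createfn.gds(fn)
add.gdsn(f, "genotype", val = geno, storage = "bit2",
         compress = "ZIP", closezip = TRUE)
closefn.gds(f)

# Reopen the file and read only columns 2 and 3 (count = -1 reads a whole dimension)
f <- openfn.gds(fn)
node <- index.gdsn(f, "genotype")
sub <- read.gdsn(node, start = c(1, 2), count = c(-1, 2))
closefn.gds(f)
```

Reading with `start` and `count` retrieves part of the array without loading the whole node into memory, which is the access pattern the description refers to as efficient data retrieval.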
Topic
Data management
Detail
Operation: Data handling
Software interface: Command-line user interface, Library
Language: R
License: The GNU General Public License v3.0
Cost: Free
Version name: 1.38.0
Credit: NIH.
Input: Nucleic acid features [Sequence variation annotation format]
Output: Nucleic acid features [Textual format] [Sequence variation annotation format]
Contact: Xiuwen Zheng zhengx@u.washington.edu
Collection: -
Maturity: Stable
Publications
- SeqArray-a storage-efficient high-performance data format for WGS variant calls.
- Zheng X, et al. SeqArray-a storage-efficient high-performance data format for WGS variant calls. Bioinformatics. 2017; 33:2251-2257. doi: 10.1093/bioinformatics/btx145
- https://doi.org/10.1093/bioinformatics/btx145
- PMID: 28334390
- PMC: PMC5860110
Download and documentation
Source: http://bioconductor.org/packages/release/bioc/src/contrib/gdsfmt_1.38.0.tar.gz
Documentation: http://bioconductor.org/packages/release/bioc/manuals/gdsfmt/man/gdsfmt.pdf
Home page: http://bioconductor.org/packages/release/bioc/html/gdsfmt.html
Links: http://bioconductor.org/packages/release/bioc/vignettes/gdsfmt/inst/doc/gdsfmt.html
Links: http://bioconductor.org/packages/release/bioc/vignettes/gdsfmt/inst/doc/gdsfmt.R