GWASTools

GWASTools is an R/Bioconductor package for genome-wide association study (GWAS) data. It provides functions for quality control, NetCDF-based storage, and association analysis, with the aim of identifying associations between genetic variants and traits.


Key Features:

  • NetCDF storage: Stores GWAS datasets in NetCDF format to enable efficient handling of datasets that exceed R memory limits.
  • R/Bioconductor integration: Operates within R/Bioconductor and leverages R statistical libraries for analysis.
  • Data format conversion: Imports and converts data from multiple formats, including formats that carry variants called from sequencing data.
  • Genotype–intensity linkage: Links genotypes and intensity data with corresponding sample and SNP annotations.
  • Quality control: Implements quality control checks for sample quality and genotype quality.
  • Statistical association analysis: Performs statistical analyses to detect associations between genetic variants, including SNPs, and phenotypic traits.
  • Data cleaning and downstream analysis: Provides tools for data cleaning and subsequent analysis to support reproducibility of GWAS findings.
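As an illustration of the storage and quality-control features above, the following is a minimal sketch of a missing-call-rate check that walks the genotype matrix in blocks of SNPs instead of all at once. It is written in plain Python and does not use the GWASTools API; the function names (`snp_blocks`, `missing_call_rates`), the block size, and the toy genotype matrix are all hypothetical, and an in-memory list stands in for slices that would be read from an on-disk file.

```python
from typing import Iterator, List, Optional, Tuple

# Genotype coding assumed for this sketch: 0, 1, or 2 copies of an allele; None = missing call.
Genotype = Optional[int]
Matrix = List[List[Genotype]]  # rows are SNPs, columns are samples


def snp_blocks(matrix: Matrix, block_size: int) -> Iterator[Tuple[int, Matrix]]:
    """Yield (start_index, rows) blocks of SNP rows.

    Hypothetical stand-in for reading one slice at a time from on-disk storage.
    """
    for start in range(0, len(matrix), block_size):
        yield start, matrix[start:start + block_size]


def missing_call_rates(matrix: Matrix, block_size: int = 2) -> Tuple[List[float], List[float]]:
    """Return (per-SNP, per-sample) missing call rates, computed block by block."""
    n_snps = len(matrix)
    n_samples = len(matrix[0]) if matrix else 0
    snp_missing = [0.0] * n_snps
    sample_missing_counts = [0] * n_samples

    for start, block in snp_blocks(matrix, block_size):
        for offset, row in enumerate(block):
            missing = [g is None for g in row]
            # Fraction of samples with no call at this SNP.
            snp_missing[start + offset] = sum(missing) / n_samples
            # Running count of missing calls for each sample.
            for j, is_missing in enumerate(missing):
                sample_missing_counts[j] += is_missing

    sample_missing = [count / n_snps for count in sample_missing_counts]
    return snp_missing, sample_missing


# Toy data: 4 SNPs (rows) by 4 samples (columns).
genotypes: Matrix = [
    [0, 1, 2, None],
    [1, 1, None, None],
    [2, 0, 0, 1],
    [0, None, 1, 2],
]
snp_rates, sample_rates = missing_call_rates(genotypes)
```

Only one block of SNP rows is inspected at a time, which is the property that lets this kind of check run on data larger than available memory. SNPs or samples whose rate exceeds a study-chosen threshold would then be flagged for review.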

Scientific Applications:

  • GWAS quality control: Conducts sample- and genotype-level quality control to ensure integrity of GWAS datasets.
  • Association discovery: Identifies associations between SNPs/genetic variants and traits or diseases.
  • Large-scale genotype and intensity analysis: Enables analysis of genotype and intensity datasets that are too large to hold in memory.
  • Integration of sequencing-derived variants: Supports use of data originating from sequencing variant calls in GWAS analyses.

Methodology:

GWASTools stores datasets in NetCDF files so that data too large for R memory can still be handled efficiently. It imports and converts multiple input formats, including sequencing-variant-derived formats; links genotype and intensity data with sample and SNP annotations; runs sample- and genotype-level quality-control checks; and performs statistical analyses to identify associations between genetic variants and traits.
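To make the final step concrete, here is a minimal sketch of a single-SNP case-control association test. It uses a basic allelic chi-square test on a 2x2 table of allele counts, chosen only because it is short and self-contained; it is not claimed to be the test GWASTools performs. The function names and the toy genotypes are hypothetical, and the code is plain Python rather than the package's R interface.

```python
import math
from typing import List, Optional, Tuple

# Genotype coding assumed for this sketch: copies of the alternate allele (0, 1, 2); None = missing.
Genotype = Optional[int]


def allele_counts(genotypes: List[Genotype]) -> Tuple[int, int]:
    """Return (alternate, reference) allele counts, skipping missing calls."""
    called = [g for g in genotypes if g is not None]
    alt = sum(called)
    ref = 2 * len(called) - alt
    return alt, ref


def allelic_chi_square(cases: List[Genotype], controls: List[Genotype]) -> Tuple[float, float]:
    """Return (chi-square statistic, p-value) for a 2x2 allele-count table (1 degree of freedom)."""
    a, b = allele_counts(cases)      # case alt, case ref
    c, d = allele_counts(controls)   # control alt, control ref
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A zero margin (e.g. a monomorphic SNP) carries no evidence of association.
        return 0.0, 1.0
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Toy data for one SNP.
cases = [2, 2, 1, 2, 1, None]
controls = [0, 0, 1, 0, 1, 0]
chi2, p_value = allelic_chi_square(cases, controls)
```

In a genome-wide scan this test would be repeated for every SNP, so p-values need a multiple-testing correction before any association is reported. Regression-based tests that adjust for covariates are the more common choice in practice.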

Details

License: Artistic-2.0
Tool Type: command-line tool, library
Operating Systems: Windows, Mac
Programming Languages: R
Added: 1/17/2017
Last Updated: 1/9/2019

Publications

Gogarten SM, Bhangale T, Conomos MP, Laurie CA, McHugh CP, Painter I, Zheng X, Crosslin DR, Levine D, Lumley T, Nelson SC, Rice K, Shen J, Swarnkar R, Weir BS, Laurie CC. GWASTools: an R/Bioconductor package for quality control and analysis of genome-wide association studies. Bioinformatics. 2012;28(24):3329-3331. doi:10.1093/bioinformatics/bts610. PMID:23052040. PMCID:PMC3519456.
