cn.farms

cn.farms implements a probabilistic latent variable approach to detect and refine copy number variation (CNV) calls from oligonucleotide genotyping arrays such as Affymetrix SNP 6.0, with the goal of reducing the false discovery rate (FDR) and improving CNV detection accuracy.


Key Features:

  • Probabilistic model (cn.FARMS): Uses a probabilistic latent variable model named 'cn.FARMS' for CNV inference.
  • Bayesian maximum a posteriori optimization: Estimates model parameters using a Bayesian maximum a posteriori approach.
  • FDR control via information gain: Controls false discovery rate by leveraging information gain from the posterior distribution relative to the prior.
  • Null prior of copy number 2: Adopts a prior representing the null hypothesis that all samples have copy number 2, requiring strong and consistent signals to call deviations.
  • Target data types: Optimized for CNV analysis from oligonucleotide genotyping arrays, specifically Affymetrix SNP 6.0 arrays.
  • Reduced CNV overestimation: Addresses overestimation of CNV number and size to lower false positives.
  • Benchmarking on HapMap: Demonstrated higher sensitivity and lower FDR compared to two prevalent CNV detection methods using HapMap data.
  • R implementation for large datasets: Implemented as an R package, with parallel processing via the 'snow' package and disk-backed storage of large datasets via the 'ff' package.

Scientific Applications:

  • Array-based CNV discovery: Detection and refinement of CNVs from oligonucleotide genotyping arrays such as Affymetrix SNP 6.0.
  • Clinical association studies: Reduction of false-positive CNV calls to decrease incorrect associations between CNVs and disease.
  • Genetic study power improvement: Improved CNV call accuracy to increase discovery power in genetic association studies.
  • Method benchmarking and validation: Comparative performance evaluation and validation using HapMap datasets.

Methodology:

Implements the 'cn.FARMS' probabilistic latent variable model, fitted by Bayesian maximum a posteriori estimation. FDR is controlled via the information gain of the posterior over the prior, where the prior encodes the null hypothesis that all samples have copy number 2. Parallel processing uses the 'snow' R package, and the 'ff' R package provides disk-backed storage for large datasets.
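The idea of information gain of a posterior over a prior can be illustrated with a minimal sketch. This is not the cn.farms package code and not necessarily the exact measure used by cn.FARMS. It assumes, purely for illustration, a one-dimensional Gaussian posterior over a latent variable and a standard normal prior standing in for the null hypothesis, and it scores each region by the Kullback-Leibler divergence between the two. The function name information_gain and the example region values are hypothetical.

```python
import math


def information_gain(post_mean, post_var):
    """KL divergence (in nats) of a Gaussian posterior N(post_mean, post_var)
    from a standard normal prior N(0, 1).

    Illustrative assumption: the prior plays the role of the null hypothesis,
    so a posterior that stays close to the prior yields a gain near zero.
    """
    if post_var <= 0:
        raise ValueError("posterior variance must be positive")
    return 0.5 * (post_var + post_mean ** 2 - 1.0 - math.log(post_var))


# Hypothetical posterior summaries (mean, variance) for two regions.
regions = {
    "region_a": (0.0, 1.0),  # posterior equals the prior: no evidence for a CNV
    "region_b": (1.5, 0.2),  # shifted, sharply peaked posterior: strong evidence
}

# Compute the gain per region and list regions from most to least informative.
gains = {name: information_gain(m, v) for name, (m, v) in regions.items()}
ranked = sorted(gains, key=gains.get, reverse=True)
```

Under these assumptions, a region is a candidate CNV only when the data move the posterior well away from the null prior, which is the intuition behind requiring strong and consistent signals before calling a deviation from copy number 2.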

Details

License: GPL-2.0
Tool Type: command-line tool, library
Operating Systems: Linux, Windows, Mac
Programming Languages: R
Added: 1/17/2017
Last Updated: 1/17/2019

Publications

Clevert D, Mitterecker A, Mayr A, Klambauer G, Tuefferd M, De Bondt A, Talloen W, Gohlmann H, Hochreiter S. cn.FARMS: a latent variable model to detect copy number variations in microarray data with a low false discovery rate. Nucleic Acids Research. 2011;39(12):e79. doi:10.1093/nar/gkr197. PMID:21486749. PMCID:PMC3130288.
