FSQN

FSQN (Feature Specific Quantile Normalization) is a computational method designed to address platform-based bias in the analysis of gene expression data. This bias arises when comparing or integrating datasets derived from different gene expression profiling platforms, such as RNA sequencing (RNA-seq) and DNA microarrays. Each platform has its own technical biases, which can hinder direct comparison of data and the extraction of meaningful biological insights, especially when investigating complex diseases like cancer and autoimmune disorders that exhibit significant molecular heterogeneity.

The motivation behind FSQN is rooted in the growing recognition of the importance of molecular subtypes in understanding disease pathogenesis, heterogeneity, and response to therapy. These subtypes are typically defined through transcriptomic profiling. However, the technical biases inherent to different gene expression profiling platforms pose a unique challenge, particularly when analyzing data generated from disparate studies.

FSQN addresses this challenge by normalizing RNA-seq data so that it can be analyzed alongside the rich repository of existing gene expression data, specifically DNA microarray data. Machine learning classifiers are trained on DNA microarray data and then applied to RNA-seq data normalized using FSQN. The normalization removes platform-based bias, enabling accurate comparison and integration of datasets across platforms.
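The idea of normalizing each feature separately can be illustrated with a minimal sketch. It assumes that "feature specific" means each gene's RNA-seq values are mapped, by rank, onto the distribution of the same gene in the microarray data. The function names, the list-of-rows layout (samples x features), and the linear interpolation between target quantiles are assumptions made for illustration; this is not the API of the FSQN R library and may differ from its implementation in detail.

```python
def _interp_quantile(sorted_vals, p):
    """Linearly interpolated quantile of an ascending list at probability p in [0, 1]."""
    pos = p * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] + frac * (sorted_vals[hi] - sorted_vals[lo])


def quantile_normalize_by_feature(test, target):
    """Map each feature (column) of `test` onto the distribution of the same
    feature in `target`. Both arguments are lists of rows (samples x features).
    Hypothetical helper written for this sketch."""
    n_features = len(target[0])
    if any(len(row) != n_features for row in test):
        raise ValueError("test and target must share the same features")
    n = len(test)
    out = [[0.0] * n_features for _ in range(n)]
    for j in range(n_features):
        target_sorted = sorted(row[j] for row in target)
        # Rank the test samples for this feature (ties broken by sample order).
        order = sorted(range(n), key=lambda i: test[i][j])
        for rank, i in enumerate(order):
            # Convert the rank to a probability, then read off the target quantile.
            p = rank / (n - 1) if n > 1 else 0.5
            out[i][j] = _interp_quantile(target_sorted, p)
    return out


# Toy data: 3 samples x 2 genes on each platform (made-up values).
microarray = [[2.0, 10.0], [4.0, 20.0], [6.0, 30.0]]
rnaseq = [[100.0, 5.0], [300.0, 1.0], [200.0, 3.0]]
normalized = quantile_normalize_by_feature(rnaseq, microarray)
print(normalized)  # [[2.0, 30.0], [6.0, 10.0], [4.0, 20.0]]
```

Within each gene the ordering of the RNA-seq samples is preserved, while the values now lie on the microarray scale, which is what would allow a classifier trained on microarray data to be applied to them.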

An important finding from the study is that FSQN performs best when the RNA-seq dataset being normalized contains at least 25 samples, so the method is best suited to datasets of that size or larger. By facilitating the comparison of RNA-seq data to existing DNA microarray datasets, FSQN makes it possible to reuse historical gene expression data in contemporary analyses despite the differences in profiling platforms.

Topic

Gene expression;RNA-Seq;Microarray experiment;Oncology;Probes and primers;Rare diseases

Detail

  • Operation: Standardisation and normalisation;Expression analysis;Essential dynamics

  • Software interface: Library

  • Language: R

  • License: GNU General Public License, version 2

  • Cost: Free with restrictions

  • Version name: 0.0.1

  • Credit: The Burroughs-Wellcome Big Data in the Life Sciences Training Program, the Dr. Ralph and Marian Falk Medical Research Trust Catalyst and Transformational Awards, the National Institutes of Health, the Scleroderma Research Foundation.

  • Input: -

  • Output: -

  • Contact: Michael L Whitfield michael.l.whitfield@dartmouth.edu

  • Collection: -

  • Maturity: -

Publications

  • Franks JM, et al. Feature specific quantile normalization enables cross-platform classification of molecular subtypes using gene expression data. Bioinformatics. 2018; 34:1868-1874. doi: 10.1093/bioinformatics/bty026
  • https://doi.org/10.1093/BIOINFORMATICS/BTY026
  • PMID: 29360996
  • PMC: PMC5972664

