missRows
The missRows software tool implements a multiple imputation (MI) approach in a multivariate framework to deal with missing rows, that is, individuals absent from one or more data tables of an incomplete dataset. Missing rows are challenging because most statistical methods cannot be applied directly to incomplete datasets. The tool focuses on multiple factor analysis (MFA), which compares and integrates multiple layers of information. The method fills the missing rows with plausible values, resulting in M completed datasets. MFA is then applied to each completed dataset to produce M different configurations (the matrices of coordinates of individuals). Finally, the M configurations are combined to yield a single consensus solution.
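The three steps just described (impute M times, compute one configuration per completed dataset, combine) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the package's implementation: missing rows are filled by copying a randomly drawn observed row from the same table, `toy_configuration` is a hypothetical placeholder standing where a real MFA would go, and the consensus is a plain element-wise mean that assumes the configurations are already aligned. All function names are made up for this example.

```python
import random

def impute_missing_rows(table, rng):
    """Fill each missing row (None) with a copy of a randomly drawn observed row.
    Assumption: donor-based filling; the table must have at least one observed row."""
    donors = [row for row in table if row is not None]
    return [list(row) if row is not None else list(rng.choice(donors))
            for row in table]

def multiple_imputation(tables, M, seed=0):
    """Return M completed datasets; each one is a list of completed tables."""
    rng = random.Random(seed)
    return [[impute_missing_rows(t, rng) for t in tables] for _ in range(M)]

def toy_configuration(completed_tables):
    """Hypothetical stand-in for MFA (NOT MFA itself): one coordinate per table,
    equal to the individual's row mean in that table, centred over individuals."""
    n = len(completed_tables[0])
    columns = []
    for table in completed_tables:
        scores = [sum(row) / len(row) for row in table]
        centre = sum(scores) / n
        columns.append([s - centre for s in scores])
    return [[col[i] for col in columns] for i in range(n)]

def consensus(configurations):
    """Element-wise mean of M configurations (assumes they are already aligned)."""
    M = len(configurations)
    n, k = len(configurations[0]), len(configurations[0][0])
    return [[sum(c[i][j] for c in configurations) / M for j in range(k)]
            for i in range(n)]

# Two tables describing the same three individuals; None marks a missing row.
example_tables = [
    [[1.0, 2.0], [3.0, 4.0], None],
    [[0.5], None, [1.5]],
]
completed_datasets = multiple_imputation(example_tables, M=5, seed=1)
configs = [toy_configuration(ds) for ds in completed_datasets]
print(consensus(configs))
```

In a real analysis the configurations would come from MFA and would need to be brought into a common frame before being combined; the simple mean here only shows where that combination step sits in the workflow.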
The authors assessed the approach (MI-MFA) on two real omics datasets with different patterns of missingness. They compared it with two other approaches: regularized iterative MFA (RI-MFA) and mean variable imputation (MVI-MFA). Each configuration produced by these three strategies was evaluated against the true MFA configuration obtained from the original, complete data. A comprehensive graphical comparison shows how the MI-, RI- and MVI-MFA configurations diverge from the true configuration. Additionally, two approaches, namely confidence ellipses and convex hulls, are described to visualize and assess the uncertainty due to missing values.
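Comparing an estimated configuration with the true one needs a similarity measure that ignores rotation and overall scale. One common choice is the RV coefficient, which equals 1 for configurations that differ only by rotation and scaling and decreases as they diverge. The text above does not say which measure the authors used, so the sketch below is only one plausible way to quantify such a comparison; the function names are illustrative.

```python
import math

def center_columns(X):
    """Subtract each column mean so the configuration is centred at the origin."""
    n, k = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(k)]
    return [[row[j] - means[j] for j in range(k)] for row in X]

def gram(X):
    """Matrix of inner products between individuals (X times its transpose)."""
    return [[sum(a * b for a, b in zip(ri, rj)) for rj in X] for ri in X]

def rv_coefficient(X, Y):
    """RV coefficient between two configurations of the same individuals."""
    Sx, Sy = gram(center_columns(X)), gram(center_columns(Y))
    n = len(Sx)
    # For symmetric matrices, trace(Sx Sy) is the sum of element-wise products.
    num = sum(Sx[i][j] * Sy[i][j] for i in range(n) for j in range(n))
    den = math.sqrt(sum(v * v for row in Sx for v in row)
                    * sum(v * v for row in Sy for v in row))
    return num / den

true_config = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
# Same shape, rotated by 90 degrees and doubled in size.
rotated_scaled = [[0.0, 2.0], [-2.0, 0.0], [2.0, -2.0]]
print(rv_coefficient(true_config, rotated_scaled))
```

Computing this value for each strategy's configuration against the true configuration gives a single number per strategy that can sit alongside the graphical comparison.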
The tool provides a method for estimating the coordinates of individuals on the first MFA components despite missing rows. It accounts for the uncertainty that the missing rows induce in the MI-MFA configurations, thereby allowing the reliability of the results to be evaluated. In the authors' assessment, the MI-MFA configurations were close to the true configuration even when many individuals were missing in several data tables.
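The uncertainty induced by missing rows can be made concrete for a single individual: each of the M imputed datasets places that individual at a slightly different position on the first two components, and the convex hull of those M positions outlines the spread. The sketch below computes such a hull and its area in plain Python using the monotone chain algorithm. It illustrates the general idea only and is not the package's plotting code; the sample coordinates are invented.

```python
def convex_hull(points):
    """Convex hull of 2-D points (monotone chain), vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # Positive when o -> a -> b turns counter-clockwise.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]

def polygon_area(vertices):
    """Area of a simple polygon via the shoelace formula."""
    n = len(vertices)
    twice_area = sum(vertices[i][0] * vertices[(i + 1) % n][1]
                     - vertices[(i + 1) % n][0] * vertices[i][1]
                     for i in range(n))
    return abs(twice_area) / 2.0

# Invented positions of one individual across M = 5 imputed configurations.
positions = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0), (1.0, 1.0)]
hull = convex_hull(positions)
print(hull, polygon_area(hull))
```

A small hull area suggests the individual's position is stable across imputations, while a large one flags a position that depends heavily on how the missing rows were filled.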
Topic
Computational biology;Statistics and probability
Detail
Operation: Visualisation
Software interface: Library
Language: R
License: The Artistic License 2.0
Cost: Free
Version name: 1.20.0
Credit: The INRA GA (Génétique Animale), the INRA PHASE (Physiologie Animale et Systèmes d’Élevage) and the région Languedoc-Roussillon Midi-Pyrénées.
Input: -
Output: -
Contact: Gonzalez Ignacio ignacio.gonzalez@bbox.fr
Collection: -
Maturity: Stable
Publications
- Handling missing rows in multi-omics data integration: multiple imputation in multiple factor analysis framework.
- Voillet V, et al. Handling missing rows in multi-omics data integration: multiple imputation in multiple factor analysis framework. BMC Bioinformatics. 2016; 17:402. doi: 10.1186/s12859-016-1273-5
- https://doi.org/10.1186/s12859-016-1273-5
- PMID: 27716030
- PMC: PMC5048483
Download and documentation
Source: http://bioconductor.org/packages/release/bioc/src/contrib/missRows_1.20.0.tar.gz
Documentation: http://bioconductor.org/packages/release/bioc/manuals/missRows/man/missRows.pdf
Home page: http://bioconductor.org/packages/release/bioc/html/missRows.html
Links: http://bioconductor.org/packages/release/bioc/vignettes/missRows/inst/doc/missRows.pdf