SPARSim
SPARSim simulates single-cell RNA sequencing (scRNA-seq) count data that reproduce the count intensity, variability, and sparsity of real datasets, supporting the development and validation of bioinformatics methods.
Key Features:
- Gamma-Multivariate Hypergeometric Model: Uses a Gamma-Multivariate Hypergeometric distribution model to generate count data that mimic real scRNA-seq characteristics.
- Realistic Count Intensity and Variability: Produces simulated datasets matching empirical count intensity and variability observed in scRNA-seq experiments.
- Sparsity and Zero Distribution: Captures the distribution of zeros across varying expression intensities to reflect scRNA-seq sparsity.
- Benchmarking against Splat: Has been compared with the Splat simulator and reported to perform comparably or better in replicating real-data characteristics, particularly zero distribution across expression levels.
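As an illustration of what "zero distribution across expression levels" means in practice, the following minimal Python sketch bins genes by mean count and reports the average fraction of zeros in each bin. It is a generic diagnostic written for this entry, not code from SPARSim (an R package) or the evaluation procedure of its publication; the function name `zero_fraction_by_intensity` and the `n_bins` parameter are hypothetical.

```python
import numpy as np

def zero_fraction_by_intensity(counts, n_bins=5):
    """Summarise sparsity as a function of expression intensity.

    counts: genes x cells matrix of non-negative integer counts.
    Returns one (mean intensity, mean zero fraction) pair per bin,
    ordered from the lowest-expressed genes to the highest.
    """
    counts = np.asarray(counts)
    mean_expr = counts.mean(axis=1)         # per-gene count intensity
    zero_frac = (counts == 0).mean(axis=1)  # per-gene fraction of zero entries
    order = np.argsort(mean_expr)           # genes sorted by intensity
    bins = np.array_split(order, n_bins)    # roughly equal-sized gene bins
    return [
        (float(mean_expr[b].mean()), float(zero_frac[b].mean()))
        for b in bins
        if b.size
    ]
```

Computing the same summary on a real count matrix and on a simulated one gives a simple way to compare their sparsity profiles bin by bin.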
Scientific Applications:
- Method Development: Provides simulated scRNA-seq datasets for testing and refining analytical techniques.
- Validation Studies: Serves as a benchmark for validating performance of bioinformatics tools and algorithms on controlled count data.
- Educational Use: Supplies realistic simulated data for training and instructional purposes in single-cell transcriptomics.
Methodology:
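Counts are simulated from a Gamma-Multivariate Hypergeometric model: a Gamma distribution captures biological variability in gene expression, and Multivariate Hypergeometric sampling captures the technical variability introduced by sequencing. Simulated count matrices were benchmarked against the Splat simulator, notably on the distribution of zeros across expression intensities.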
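The Gamma-Multivariate Hypergeometric idea can be sketched in a few lines of Python with NumPy. This is a toy illustration under stated assumptions, not SPARSim's R implementation or API: the function name `simulate_counts`, its arguments, and the `pool_size` constant are all hypothetical, and the parameterisation (Gamma mean equal to the gene intensity, squared coefficient of variation equal to the gene variability) is one simple choice among several.

```python
import numpy as np

def simulate_counts(intensity, variability, library_sizes,
                    seed=0, pool_size=1_000_000):
    """Toy Gamma + multivariate hypergeometric count simulator.

    intensity:     per-gene mean expression level (positive values)
    variability:   per-gene dispersion (variance / mean^2) of the Gamma step
    library_sizes: number of reads to draw for each cell
    pool_size:     hypothetical size of the molecule pool sampled per cell
    """
    rng = np.random.default_rng(seed)
    intensity = np.asarray(intensity, dtype=float)
    variability = np.asarray(variability, dtype=float)
    counts = np.zeros((intensity.size, len(library_sizes)), dtype=np.int64)

    for j, depth in enumerate(library_sizes):
        # Biological variability: Gamma draw with mean = intensity
        # and squared coefficient of variation = variability.
        expr = rng.gamma(1.0 / variability, intensity * variability)
        # Convert relative expression into an integer pool of molecules.
        pool = np.floor(expr / expr.sum() * pool_size).astype(np.int64)
        # Technical variability: sample `depth` reads without replacement.
        counts[:, j] = rng.multivariate_hypergeometric(pool, int(depth))
    return counts
```

For example, `simulate_counts([100, 10, 1, 0.1], [0.2] * 4, [500, 800, 300])` returns a 4 x 3 matrix whose column sums equal the requested library sizes. Because reads are drawn without replacement from a finite pool, low-intensity genes receive zeros more often than high-intensity ones, without a separate zero-inflation step in this sketch.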
Details
- Programming Languages: R
- Added: 1/9/2020
- Last Updated: 1/16/2021
Publications
Baruzzo G, Patuzzi I, Di Camillo B. SPARSim single cell: a count data simulator for scRNA-seq data. Bioinformatics. 2020;36(5):1468-1475. doi:10.1093/bioinformatics/btz752. PMID:31598633.
Links
- Repository: https://gitlab.com/sysbiobig/sparsim