FEELnc
FEELnc annotates long non-coding RNAs (lncRNAs) among transcripts assembled from RNA-seq data, using an alignment-free Random Forest classifier to distinguish lncRNAs from mRNAs.
Key Features:
- Alignment-Free Annotation: Performs classification without sequence alignment using features such as multi k-mer frequencies and relaxed open reading frames (ORFs).
- Machine Learning Classifier: Implements a Random Forest model trained on general transcript features to separate lncRNAs and mRNAs.
- Benchmarking Performance: Shows similar or superior classification performance compared to five state-of-the-art tools on datasets including GENCODE and NONCODE.
- Customizable Modules: Provides modules to fine-tune classification accuracy and to formalize the annotation of lncRNA classes.
- Training Set Independence: Can identify lncRNAs when no prior annotated non-coding training set is available.
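The multi k-mer frequency feature named in the list above can be illustrated with a minimal Python sketch. It is not FEELnc's own code (the tool is written in Perl, R, and Shell), and the function names `kmer_frequencies` and `multi_kmer_features` as well as the choice of k values are assumptions made for illustration only.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k):
    """Return the relative frequency of every possible DNA k-mer in seq."""
    seq = seq.upper()
    # Count overlapping k-mers, skipping windows with ambiguous bases such as N.
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= set("ACGT")
    )
    total = sum(counts.values())
    # Enumerate all 4**k k-mers so every transcript yields a vector of equal length.
    return {
        "".join(p): (counts["".join(p)] / total if total else 0.0)
        for p in product("ACGT", repeat=k)
    }


def multi_kmer_features(seq, ks=(1, 2, 3)):
    """Merge k-mer frequency vectors for several k (the default ks are an assumption)."""
    features = {}
    for k in ks:
        features.update(kmer_frequencies(seq, k))
    return features


feats = multi_kmer_features("ATGGCGTAA")
print(len(feats))  # 4 + 16 + 64 = 84 features
```

Because the feature vector depends only on the transcript sequence itself, no alignment against a reference database is needed to compute it.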
Scientific Applications:
- Canine genome annotation (LUPA consortium): Applied to 20 canine RNA-seq samples, identifying 10,374 novel lncRNAs and 58,640 mRNA transcripts to expand canine genome annotations.
Methodology:
Random Forest models trained on multi k-mer frequencies and relaxed ORFs classify assembled transcripts in an alignment-free manner.
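As a rough illustration of what a relaxed ORF feature might look like, the Python sketch below computes the longest open reading frame of a transcript under a strict rule (start and stop codon both required) and under one possible relaxation (the ORF may run to the end of the transcript without a stop codon). This is an assumption about how the relaxation could be defined, not FEELnc's actual implementation; the names `longest_orf` and `orf_coverage` are hypothetical, and only the forward strand is scanned.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq, require_stop=True):
    """Length (nt) of the longest ATG-initiated ORF on the forward strand.

    With require_stop=False, an ORF that reaches the transcript end
    without a stop codon is also accepted (the "relaxed" case here).
    """
    seq = seq.upper()
    best = 0
    for frame in range(3):
        start = None  # index of the first ATG of the currently open ORF
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)  # include the stop codon
                start = None
        if start is not None and not require_stop:
            # Open ORF truncated by the transcript end: count whole codons only.
            best = max(best, (len(seq) - start) // 3 * 3)
    return best


def orf_coverage(seq, require_stop=True):
    """Fraction of the transcript covered by its longest ORF."""
    return longest_orf(seq, require_stop) / len(seq) if seq else 0.0


partial = "CCATGAAACCC"  # has a start codon but no in-frame stop codon
print(longest_orf(partial), longest_orf(partial, require_stop=False))  # 0 9
```

Relaxing the ORF definition matters for assembled transcripts, which are often incomplete at their ends; a strict definition would report no ORF for a truncated mRNA. Values such as the ORF coverage could then be passed, together with k-mer features, to a Random Forest classifier.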
Details
- License: GPL-3.0
- Tool Type: command-line tool
- Operating Systems: Linux, Mac
- Programming Languages: R, Shell, Perl
- Added: 5/11/2018
- Last Updated: 12/10/2018
Publications
Wucher V, Legeai F, Hédan B, Rizk G, Lagoutte L, Leeb T, Jagannathan V, Cadieu E, David A, Lohi H, Cirera S, Fredholm M, Botherel N, Leegwater PA, Le Béguec C, Fieten H, Johnson J, Alföldi J, André C, Lindblad-Toh K, Hitte C, Derrien T. FEELnc: a tool for long non-coding RNA annotation and its application to the dog transcriptome. Nucleic Acids Research. 2017. doi:10.1093/nar/gkw1306. PMID:28053114. PMCID:PMC5416892.
Documentation
Downloads
- Command-line specification: https://github.com/tderrien/FEELnc