CONSORT-TM
CONSORT-TM provides corpus-based text mining of randomized controlled trial (RCT) reports to identify and evaluate CONSORT checklist items for appraisal of RCT reporting quality.
Key Features:
- Annotated Corpus: Contains 50 RCT articles annotated at the sentence level with 37 fine-grained CONSORT checklist items. Of the corpus's 10,709 sentences, 4,845 (45%) are annotated, carrying 5,246 labels in total.
- Inter-Annotator Agreement: 31 articles underwent double annotation and adjudication, and 19 were reconciled by a single annotator. MASI (Measuring Agreement on Set-valued Items) scores averaged 0.60 at the article level and 0.64 at the section level, and Krippendorff's α for individual checklist items ranged from 0.06 to 0.96.
- Methodology Coverage: Targets recognition of 17 methodology-related CONSORT items specifically within RCT Methods sections.
- Computational Approaches: Implements rule-based methods (phrase-based and section header-based) and supervised learning classifiers, including a support vector machine and a BioBERT-based neural network.
- Performance Metrics: The BioBERT-based model achieved micro-precision 0.82, micro-recall 0.63, and micro-F1 0.71. Combining models via majority vote and label aggregation further improved precision and recall, respectively.
- Challenges and Improvements: The low frequency of certain CONSORT items limits training effectiveness. Suggested remedies include making minor modifications to the annotation scheme and expanding the corpus.
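The model combination and micro-averaged metrics described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that majority vote keeps a label predicted by more than half of the models and that label aggregation takes the union of all predicted labels. The function names, the three models, and the toy sentences (labelled with CONSORT item numbers such as 3a, 4a, and 8a) are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Keep a label only if more than half of the models predicted it.

    `predictions` holds one label set per model, all for the same sentence.
    """
    counts = Counter(label for labels in predictions for label in labels)
    return {label for label, n in counts.items() if n > len(predictions) / 2}


def label_aggregation(predictions):
    """Keep every label predicted by at least one model (set union)."""
    return set().union(*predictions)


def micro_prf(gold, predicted):
    """Micro-averaged precision, recall and F1 over multi-label sentences."""
    tp = sum(len(g & p) for g, p in zip(gold, predicted))
    fp = sum(len(p - g) for g, p in zip(gold, predicted))
    fn = sum(len(g - p) for g, p in zip(gold, predicted))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical gold labels and model outputs for three sentences.
gold = [{"3a"}, {"4a", "4b"}, {"8a"}]
model_a = [{"3a"}, {"4a"}, {"8a"}]
model_b = [{"3a"}, {"4a", "4b"}, {"9"}]
model_c = [{"3a", "5"}, {"4a"}, {"8a"}]

per_sentence = list(zip(model_a, model_b, model_c))
voted = [majority_vote(p) for p in per_sentence]
merged = [label_aggregation(p) for p in per_sentence]

print("majority vote    :", micro_prf(gold, voted))
print("label aggregation:", micro_prf(gold, merged))
```

On this toy data, voting removes the labels that only one model proposed, while the union recovers a label that only one model found, which mirrors the precision and recall trade-off between the two strategies. The reported micro-F1 is also consistent with the reported precision and recall: 2 × 0.82 × 0.63 / (0.82 + 0.63) ≈ 0.71.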
Scientific Applications:
- Reporting Quality Assessment: Enables automated evaluation of CONSORT adherence to assess RCT transparency, rigor, and reliability.
- Text Mining Development: Serves as a testbed for developing and refining text mining methods that evaluate RCT reporting standards.
- Gap Identification: Facilitates identification of missing or incomplete trial documentation in RCT reports.
- Automated Peer Review and Authoring Support: Supports development of automated tools for peer review and manuscript preparation focused on reporting quality.
Methodology:
Combines rule-based methods (phrase-based and section header-based) with supervised classifiers (a support vector machine and a BioBERT-based neural network), merges model outputs via majority vote and label aggregation, and assesses inter-annotator agreement with MASI and Krippendorff's α.
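Because a sentence can carry several CONSORT items, agreement has to be measured between label sets rather than single labels, which is what MASI is designed for. The sketch below follows the standard definition of MASI (Jaccard overlap multiplied by a monotonicity weight of 1, 2/3, 1/3, or 0). It is an illustration only: the function name, the toy annotations, the treatment of two empty sets as full agreement, and the simple averaging are assumptions, and this entry does not state how the authors aggregated scores at the article and section levels.

```python
def masi(a, b):
    """MASI similarity between two label sets (1.0 = identical, 0.0 = disjoint)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # assumption: two empty annotations count as full agreement
    intersection = a & b
    jaccard = len(intersection) / len(a | b)
    if a == b:
        monotonicity = 1.0
    elif a <= b or b <= a:
        monotonicity = 2 / 3  # one set is contained in the other
    elif intersection:
        monotonicity = 1 / 3  # sets overlap, but neither contains the other
    else:
        monotonicity = 0.0    # no shared label
    return jaccard * monotonicity


# Hypothetical label sets from two annotators for four units of text.
annotator_1 = [{"3a"}, {"4a", "4b"}, {"8a", "9"}, {"3a"}]
annotator_2 = [{"3a"}, {"4a"}, {"8a", "10"}, {"5"}]

scores = [masi(x, y) for x, y in zip(annotator_1, annotator_2)]
mean_masi = sum(scores) / len(scores)
print(scores, mean_masi)
```

The weight penalizes partial overlap more heavily than plain Jaccard similarity would: a subset relation keeps two thirds of the Jaccard score, while overlapping sets that each contain an extra label keep only one third.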
Details
- Tool Type: database
- Programming Languages: Python
- Added: 3/19/2021
- Last Updated: 11/24/2024
Publications
Kilicoglu H, Rosemblat G, Hoang L, Wadhwa S, Peng Z, Malički M, Schneider J, ter Riet G. Toward assessing clinical trial publications for reporting transparency. medRxiv (preprint). 2021. doi:10.1101/2021.01.12.21249695.
Kilicoglu H, Rosemblat G, Hoang L, Wadhwa S, Peng Z, Malički M, Schneider J, ter Riet G. Toward assessing clinical trial publications for reporting transparency. Journal of Biomedical Informatics. 2021;116:103717. doi:10.1016/j.jbi.2021.103717. PMID:33647518. PMCID:PMC8112250.