CONSORT-TM

CONSORT-TM provides corpus-based text mining of randomized controlled trial (RCT) reports to identify and evaluate CONSORT checklist items for appraisal of RCT reporting quality.

Key Features:

Annotated Corpus: Contains 50 RCT articles annotated at the sentence level with 37 fine-grained CONSORT checklist items. Of the 10,709 sentences, 4,845 (45%) carry at least one annotation, for 5,246 labels in total.
Inter-Annotator Agreement: A subset of 31 articles underwent double annotation and adjudication, while 19 were reconciled by a single annotator. MASI scores averaged 0.60 at the article level and 0.64 at the section level, and Krippendorff's α for individual checklist items ranged from 0.06 to 0.96.
Methodology Coverage: Targets recognition of 17 methodology-related CONSORT items specifically within RCT Methods sections.
Computational Approaches: Implements rule-based methods (phrase-based and section header-based) and supervised learning classifiers including support vector machine and a BioBERT-based neural network.
Performance Metrics: The BioBERT-based model achieved micro-precision 0.82, micro-recall 0.63, and micro-F1 0.71. Combining models via majority vote and label aggregation further improved precision and recall, respectively.
Challenges and Improvements: Low frequency of certain CONSORT items limits training effectiveness, with suggested remedies including minor annotation-scheme modifications and expanding corpus size.
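
The micro-averaged scores quoted above pool true positives, false positives, and false negatives over all sentences and labels before computing precision and recall. The following is a minimal sketch of that calculation for multi-label sentence annotations, not the authors' evaluation code. The function names and the toy gold/predicted label sets are illustrative assumptions; the labels only mimic CONSORT item codes.

```python
from typing import List, Set, Tuple


def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def micro_prf(gold: List[Set[str]], pred: List[Set[str]]) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall and F1 over per-sentence label sets."""
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        tp += len(g & p)   # labels predicted and present in gold
        fp += len(p - g)   # labels predicted but absent from gold
        fn += len(g - p)   # gold labels that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f1_from(precision, recall)


# Toy data (hypothetical): one label set per sentence.
gold = [{"3a"}, {"4a", "4b"}, set(), {"8a"}]
pred = [{"3a"}, {"4a"}, {"5"}, set()]
p, r, f = micro_prf(gold, pred)
print(f"P={p:.2f} R={r:.2f} F1={f:.2f}")

# Consistency check on the reported figures: P=0.82 and R=0.63 give F1 of about 0.71.
print(round(f1_from(0.82, 0.63), 2))
```

The last line confirms that the reported micro-F1 is the harmonic mean of the reported micro-precision and micro-recall.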

Scientific Applications:

Reporting Quality Assessment: Enables automated evaluation of CONSORT adherence to assess RCT transparency, rigor, and reliability.
Text Mining Development: Serves as a testbed for developing and refining text mining methods that evaluate RCT reporting standards.
Gap Identification: Facilitates identification of missing or incomplete trial documentation in RCT reports.
Automated Peer Review and Authoring Support: Supports development of automated tools for peer review and manuscript preparation focused on reporting quality.

Methodology:

Uses phrase-based and section header-based rule methods, support vector machine and BioBERT-based neural network classifiers, model combination via majority vote and label aggregation, and inter-annotator agreement assessment with MASI and Krippendorff's α.
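
The combination and agreement steps can be sketched as follows. This is a hedged illustration rather than the CONSORT-TM implementation: "label aggregation" is read here as the union of the labels predicted by the individual models, which is an assumption, and the function names and example label sets are hypothetical. The MASI function follows the standard definition (Jaccard overlap multiplied by a monotonicity weight of 1, 2/3, 1/3, or 0); treating two empty label sets as perfect agreement is also an assumption.

```python
from collections import Counter
from typing import List, Set


def majority_vote(predictions: List[Set[str]]) -> Set[str]:
    """Keep a label only if more than half of the models predicted it."""
    counts = Counter(label for labels in predictions for label in labels)
    return {label for label, n in counts.items() if n > len(predictions) / 2}


def label_aggregation(predictions: List[Set[str]]) -> Set[str]:
    """Assumed reading: keep every label predicted by at least one model."""
    return set().union(*predictions)


def masi(a: Set[str], b: Set[str]) -> float:
    """MASI similarity between two label sets assigned to the same sentence."""
    if not a and not b:
        return 1.0  # assumption: two empty sets agree fully
    inter = a & b
    if not inter:
        return 0.0  # disjoint sets
    jaccard = len(inter) / len(a | b)
    if a == b:
        monotonicity = 1.0
    elif a <= b or b <= a:
        monotonicity = 2 / 3  # one set contains the other
    else:
        monotonicity = 1 / 3  # overlap, but neither contains the other
    return jaccard * monotonicity


# Hypothetical predictions from three models for one sentence.
model_outputs = [{"8a", "9"}, {"8a"}, {"8a", "10"}]
print(sorted(majority_vote(model_outputs)))      # stricter, favours precision
print(sorted(label_aggregation(model_outputs)))  # more inclusive, favours recall

# Hypothetical label sets from two annotators for one sentence.
print(masi({"4a"}, {"4a", "4b"}))
```

Majority voting discards labels that only one model proposes, while the union keeps them, which matches the intuition that the first favours precision and the second favours recall.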

Topics

Preclinical and clinical studies, Natural language processing, Machine learning

Details

Tool Type: database
Programming Languages: Python
Added: 3/19/2021
Last Updated: 11/24/2024

Publications

Kilicoglu H, Rosemblat G, Hoang L, Wadhwa S, Peng Z, Malički M, Schneider J, ter Riet G. Toward assessing clinical trial publications for reporting transparency. medRxiv (preprint). 2021. doi:10.1101/2021.01.12.21249695.

Kilicoglu H, Rosemblat G, Hoang L, Wadhwa S, Peng Z, Malički M, Schneider J, ter Riet G. Toward assessing clinical trial publications for reporting transparency. Journal of Biomedical Informatics. 2021;116:103717. doi:10.1016/j.jbi.2021.103717. PMID:33647518. PMCID:PMC8112250.
