A Sample Size Extractor for RCT Reports

Author:	Fengyang Lin, Hao Liu, Paul Moon, Chunhua Weng
Year of publication:	2022
Subject:	Machine Learning; Sample Size; COVID-19; Humans; Natural Language Processing; Randomized Controlled Trials as Topic
Source:	Studies in health technology and informatics. 290
ISSN:	1879-8365
Description:	Sample size is an important indicator of the power of randomized controlled trials (RCTs). In this paper, we designed a total sample size extractor that combines syntactic and machine learning methods, and evaluated it on 300 COVID-19 abstracts (Covid-Set) and 100 generic RCT abstracts (General-Set). To improve performance, we applied transfer learning from a large public corpus of annotated abstracts. Using exact matches, we achieved an average F1 score of 0.73 on the Covid-Set testing set and 0.60 on the General-Set. The F1 scores for loose matches on both datasets were over 0.74. Compared with the state-of-the-art tool, our extractor reports total sample sizes directly and improved F1 scores by at least 4% without transfer learning. We demonstrated that transfer learning improved sample size extraction accuracy and minimized the human labor required for annotation.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=pmid________::47cd73460a9a2adcaa0de85eab55ce8b https://pubmed.ncbi.nlm.nih.gov/35673090
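
Note:	The description mentions a syntactic extraction step and F1 scoring on exact matches. The sketch below illustrates both ideas in a minimal form. It is not the authors' implementation: the regular expression, the list of participant nouns, and the function names `extract_candidates` and `exact_match_f1` are assumptions made for illustration, and the record does not define how loose matches are scored, so only exact matching is shown.

```python
import re

# Hypothetical cue pattern (an assumption, not the paper's rule set):
# an integer, optionally with thousands separators, followed within a few
# words by a participant noun. The lookbehind skips numbers that are part
# of a longer token, such as the "19" in "COVID-19".
CANDIDATE = re.compile(
    r"(?<![\w-])(\d{1,3}(?:,\d{3})+|\d+)\s+(?:\w+\s+){0,3}?"
    r"(?:patients|participants|subjects|adults|children)\b",
    re.IGNORECASE,
)


def extract_candidates(text):
    """Return integers that appear shortly before a participant noun."""
    return [int(m.group(1).replace(",", "")) for m in CANDIDATE.finditer(text)]


def exact_match_f1(predictions, gold):
    """F1 over abstracts, counting a prediction only if it equals the gold value.

    predictions: list of int or None (None means nothing was extracted)
    gold: list of int or None (None means no total sample size is reported)
    """
    tp = sum(1 for p, g in zip(predictions, gold) if p is not None and p == g)
    n_pred = sum(1 for p in predictions if p is not None)
    n_gold = sum(1 for g in gold if g is not None)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    sentence = "A total of 1,250 patients with COVID-19 were randomized."
    print(extract_candidates(sentence))
    print(exact_match_f1([100, 50, None, 30], [100, 60, 20, 30]))
```

A pattern like this only proposes candidate numbers; deciding which candidate is the total sample size (rather than an arm size or a subgroup count) is where a learned classifier, as described in the abstract, would presumably come in.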