Analysis of eligibility criteria representation in industry-standard clinical trial protocols
Author: | Sanmitra Bhattacharya, Michael N. Cantor |
---|---|
Year of publication: | 2013 |
Subject: | Male; Standardization; Computer science; Eligibility Determination; Health Informatics; computer.software_genre; Set (abstract data type); Software portability; Clinical trials; Clinical Protocols; Controlled vocabulary; Information retrieval; Humans; Clinical Trials as Topic; Class (computer programming); business.industry; Natural language processing; Computer Science Applications; Clinical trial; Inclusion and exclusion criteria; Female; Artificial intelligence; Data mining; business; Delivery of Health Care; computer |
Source: | Journal of Biomedical Informatics. 46:805-813 |
ISSN: | 1532-0464 |
Description: | Highlights: We compare the textual complexity of full-text and ClinicalTrials.gov (CT) protocols. We use cosine-similarity measures to identify clusters for standardization. We find that CT protocols are very condensed and convey less information. Developing a template set is feasible and could lead to efficient criteria design. Previous research on the standardization of eligibility criteria and its feasibility has traditionally been conducted on clinical trial protocols from ClinicalTrials.gov (CT). The portability and use of such standardization for full-text industry-standard protocols has not been studied in depth. To this end, we first compare the representation characteristics and textual complexity of a set of Pfizer's internal full-text protocols to their corresponding entries in CT. Next, we identify clusters of similar criteria sentences from both full-text and CT protocols and outline methods for standardized representation of eligibility criteria. We also study the distribution of eligibility criteria in full-text and CT protocols with respect to pre-defined semantic classes used for eligibility criteria classification. We find that, in comparison to full-text protocols, CT protocols are not only more condensed but also convey less information. We also find no correlation between the variations in word counts of the CT and full-text protocols. While we identify 65 and 103 clusters of inclusion and exclusion criteria, respectively, from full-text protocols, we identify only 36 and 63 corresponding clusters from CT protocols. For both the full-text and CT protocols we are able to identify 'templates' for standardized representations, with full-text standardization being the more challenging of the two. 
In our exploration of the semantic class distributions, we find that the majority of the inclusion criteria from both full-text and CT protocols belong to the semantic class "Diagnostic and Lab Results", while "Disease, Sign or Symptom" accounts for the majority of exclusion criteria. Overall, we show that developing a template set of eligibility criteria for clinical trials, specifically in their full-text form, is feasible and could lead to more efficient clinical trial protocol design. |
Database: | OpenAIRE |
External link: |