Automatic Outlier Detection in Laboratory Result Distributions Within a Real World Data Network.

Authors: Muñoz Monjas A (Biomedical Informatics Group, Universidad Politécnica de Madrid, Spain); Rubio Ruiz D (Biomedical Informatics Group, Universidad Politécnica de Madrid, Spain; TriNetX, LLC, Cambridge, MA, USA); Pérez-Rey D (Biomedical Informatics Group, Universidad Politécnica de Madrid, Spain); Palchuk M (TriNetX, LLC, Cambridge, MA, USA)
Language: English
Source: Studies in health technology and informatics [Stud Health Technol Inform] 2023 May 18; Vol. 302, pp. 88-92.
DOI: 10.3233/SHTI230070
Abstract: Laboratory data must be interoperable so that the results of a lab test can be compared accurately between healthcare organizations. To achieve this, terminologies like LOINC (Logical Observation Identifiers, Names and Codes) provide unique identification codes for laboratory tests. Once standardized, the numeric results of laboratory tests can be aggregated and represented in histograms. Due to the characteristics of Real World Data (RWD), outliers and abnormal values are common, but these cases should be treated as exceptions and excluded from analysis. The proposed work analyses two methods capable of automating the selection of histogram limits to sanitize the generated lab test result distributions: Tukey's box-plot method and a "Distance to Density" approach, both evaluated within the TriNetX Real World Data Network. The limits generated from clinical RWD are generally wider for Tukey's method and narrower for the "Distance to Density" approach, and both depend strongly on the values chosen for the algorithms' parameters.
Database: MEDLINE
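
Note: Tukey's box-plot method, named in the abstract, places the histogram limits at Q1 - k*IQR and Q3 + k*IQR, where Q1 and Q3 are the first and third quartiles and IQR = Q3 - Q1. The following is a minimal sketch of that rule, not the authors' implementation. The function names, the sample values, the quartile interpolation method, and the conventional default k = 1.5 are all assumptions; the abstract does not state which parameter values were used.

```python
from statistics import quantiles


def tukey_limits(values, k=1.5):
    """Return (lower, upper) Tukey fences: Q1 - k*IQR and Q3 + k*IQR.

    k = 1.5 is the conventional default, assumed here for illustration.
    """
    # Quartiles via linear interpolation over the observed data
    # (the "inclusive" method is an assumption; other definitions exist).
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def within_limits(values, k=1.5):
    """Keep only the values that fall inside the Tukey fences."""
    lower, upper = tukey_limits(values, k)
    return [v for v in values if lower <= v <= upper]


# Hypothetical lab results with one implausible entry (e.g. a unit error).
results = [4.1, 4.3, 4.4, 4.5, 4.6, 4.7, 4.8, 5.0, 5.2, 140.0]

lower, upper = tukey_limits(results)
cleaned = within_limits(results)
print(f"limits: [{lower:.3f}, {upper:.3f}], kept {len(cleaned)} of {len(results)}")
```

Raising k widens the limits and lowering it narrows them, which illustrates the parameter dependence the abstract reports.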