Avoiding Doubles in Distributed Nominative Medical Databases: Optimization of the Needleman and Wunsch Algorithm.

Authors: Le Mignot, Loïc; Mugnier, Claude; Saïd, Mohamed Ben; Jais, Jean-Philippe; Richard, Jean-Baptiste; Le Bihan-Benjamin, Christine; Taupin, Pierre; Landais, Paul
Source: Studies in Health Technology & Informatics; Aug 2005, Vol. 116, p83-88, 6p, 1 Diagram, 1 Chart, 4 Graphs
Abstract: Difficulties in reconstituting patients' trajectories in public health information systems are raised by errors in patient identification processes. A crucial issue is avoiding doubles in distributed web databases. We explored the Needleman and Wunsch (N&W) algorithm in order to optimize the properties of string matching. Five variants of the N&W algorithm were developed. The algorithms were implemented for a web Multi-Source Information System. This system was dedicated to tracking patients with End-Stage Renal Disease at both regional and national levels. A simulated study database of 73,210 records was created. An insertion or suppression of each character of the original string was simulated. The rate of double entries was 2% given an acceptable distance set to 5 modifications. The search was sensitive and specific with an acceptable detection time. It detected up to 10% of modifications, which is above the estimated error rate. A variant of the N&W algorithm designed as “cut-off heuristic” proved to be efficient for the search of double entries occurring in nominative distributed databases. [ABSTRACT FROM AUTHOR]
Database: Complementary Index
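
Note: The abstract mentions a distance threshold of 5 modifications and a "cut-off heuristic" but does not describe the five N&W variants. The following is only a minimal sketch of one common way such a cut-off can work, assuming unit costs for insertion, deletion, and substitution: the alignment table is filled row by row and the comparison is abandoned as soon as no cell in the current row is within the threshold. The function name `bounded_distance` and the parameter `max_dist` are hypothetical and do not come from the paper.

```python
from typing import Optional


def bounded_distance(a: str, b: str, max_dist: int = 5) -> Optional[int]:
    """Return the edit distance between a and b, or None if it exceeds max_dist.

    Illustrative only: unit-cost global alignment with early termination,
    not the authors' published variant.
    """
    # A length gap larger than the threshold already rules the pair out.
    if abs(len(a) - len(b)) > max_dist:
        return None

    # prev[j] is the cost of aligning the processed prefix of a with b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete a character of a
                curr[j - 1] + 1,           # insert a character of b
                prev[j - 1] + (ca != cb),  # match (0) or substitution (1)
            ))
        # Cut-off: row minima never decrease, so once every cell exceeds
        # the threshold the final distance must exceed it as well.
        if min(curr) > max_dist:
            return None
        prev = curr

    return prev[-1] if prev[-1] <= max_dist else None


if __name__ == "__main__":
    print(bounded_distance("DUPONT JEAN", "DUPOND JEAN"))  # 1
    print(bounded_distance("aaaaaaaa", "bbbbbbbb"))        # None
```

In a duplicate search, a new identity string would be compared against existing records and only pairs for which the function returns a value would be kept as candidate doubles; the early exit is what keeps clearly different records cheap to reject.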