Near duplicate detection in relational databases [Ilişkisel veri tabanlarinda mükerrer kayitlarin makine ögrenmesiyle tespiti]
Author: | Bayrak A.T., Yilmaz A.I., Yilmaz K.B., Duzagac R., Bilimi V., Bolumu A., Yildiz O.T. |
---|---|
Contributors: | No department, Bayrak, A.T., ETSTUR, Istanbul, Turkey -- Yilmaz, A.I., ETSTUR, Istanbul, Turkey -- Yilmaz, K.B., ETSTUR, Istanbul, Turkey -- Duzagac, R., ETSTUR, Istanbul, Turkey -- Bilimi, V., ETSTUR, Istanbul, Turkey -- Bolumu, A., ETSTUR, Istanbul, Turkey -- Yildiz, O.T., Işik Üniversitesi Bilgisayar Mühendisligi Bölümü, Turkey |
Language: | Turkish |
Year of publication: | 2018 |
Subject: | |
Description: | Aselsan;et al.;Huawei;IEEE Signal Processing Society;IEEE Turkey Section;Netas 26th IEEE Signal Processing and Communications Applications Conference, SIU 2018 -- 2 May 2018 through 5 May 2018 -- -- 137780 As the volume of data grows, the number of duplicate records in relational databases gradually increases. Duplicate records can cause inconsistencies in reports and analyses. To reduce the effects of this problem, we aim to detect duplicate records using machine learning algorithms with features derived from the similarity between records. We detected 28412 duplicate records among 9301467 records. The detected duplicate rows were removed from the data source, making the data more consistent. © 2018 IEEE. |
Database: | OpenAIRE |
External link: |
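
The abstract describes classifying record pairs as duplicates from similarity-based features. The following is a minimal sketch of that general idea, not the authors' implementation: the field names, the sample records, the use of `difflib.SequenceMatcher` as the similarity measure, and the single learned cutoff (a stand-in for the unspecified machine learning algorithms) are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical columns; the abstract does not describe the actual schema.
FIELDS = ("name", "email", "phone")


def similarity_features(a, b):
    """Return one similarity score in [0, 1] per field for a pair of records."""
    return [
        SequenceMatcher(None, a[f].lower(), b[f].lower()).ratio()
        for f in FIELDS
    ]


def mean_similarity(a, b):
    """Collapse the feature vector into a single score."""
    feats = similarity_features(a, b)
    return sum(feats) / len(feats)


def fit_threshold(pairs, labels):
    """Pick the cutoff on mean similarity that classifies the most labeled pairs correctly.

    This one-parameter model is only a placeholder for a real classifier.
    """
    scores = [mean_similarity(a, b) for a, b in pairs]
    best_cut, best_correct = 0.0, -1
    for cut in sorted(set(scores)):
        correct = sum((s >= cut) == y for s, y in zip(scores, labels))
        if correct > best_correct:
            best_cut, best_correct = cut, correct
    return best_cut


def is_duplicate(a, b, cut):
    """Flag a pair as a duplicate when its mean similarity reaches the cutoff."""
    return mean_similarity(a, b) >= cut


# Made-up records for illustration only.
r1 = {"name": "Ayse Yilmaz", "email": "ayse.yilmaz@example.com", "phone": "5551234567"}
r2 = {"name": "Ayşe Yılmaz", "email": "ayse.yilmaz@example.com", "phone": "555 123 4567"}
r3 = {"name": "Mehmet Demir", "email": "mdemir@example.org", "phone": "5559876543"}

pairs = [(r1, r2), (r1, r3), (r2, r3)]
labels = [True, False, False]  # hand-assigned duplicate labels

cut = fit_threshold(pairs, labels)
print(is_duplicate(r1, r2, cut), is_duplicate(r1, r3, cut))
```

Comparing every pair of rows is quadratic in the number of records, so at the scale the abstract reports (9301467 records) a practical pipeline would presumably restrict comparisons to candidate pairs first; how the authors handled this is not stated in the record.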