Authors: |
Yasmin Mohd Yacob, Khalid Jamal Jadaa, Amiza Amir, Mohammed Ragheb Hakawati, Jabiry M. Mohammed |
Year of publication: |
2020 |
Subject: |
|
Source: |
European Journal of Electrical Engineering and Computer Science. 4 |
ISSN: |
2736-5751 |
Description: |
Extensible Markup Language (XML) is emerging as the primary standard for representing and exchanging data; accounting for more than 60% of the total, XML is considered the most dominant document type on the web. Nevertheless, the quality of XML data is not as expected. XML integrity constraints, especially XML functional dependencies (XFDs), play an important role in keeping an XML dataset as consistent as possible, but their ability to solve data quality issues remains unclear. The main reason is that traditional data dependencies were introduced to maintain the consistency of the schema rather than that of the data. The purpose of this study is to introduce a method for discovering pattern tableaus for XML conditional dependencies, to be used for enhancing XML document consistency as part of the data quality improvement phases. The notations of the conditional dependencies, as new rules, are designed mainly for improving data instances, and they extend traditional XML dependencies by enforcing pattern tableaus of semantically related constants. Subsequently, a set of minimal approximate conditional dependencies (XCFD, XCIND) is discovered and learned from the XML tree using a set of mining algorithms. The discovered patterns can be used as master data to detect inconsistencies that do not respect the majority of the dataset. |
Database: |
OpenAIRE |
External link: |
|
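
The idea in the description, a functional dependency that applies only to records matching a pattern of constants and is checked against the majority of the data, can be illustrated with a small sketch. This is not the mining algorithm from the paper. The XML layout, the element names (`customer`, `country`, `areaCode`, `city`) and the function `find_violations` are all hypothetical, chosen only to show how one conditional dependency with a single pattern row could flag a record that disagrees with the majority.

```python
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict

# Hypothetical sample data; element names are assumptions for illustration.
XML_DOC = """
<customers>
  <customer id="1"><country>UK</country><areaCode>131</areaCode><city>Edinburgh</city></customer>
  <customer id="2"><country>UK</country><areaCode>131</areaCode><city>Edinburgh</city></customer>
  <customer id="3"><country>UK</country><areaCode>131</areaCode><city>Glasgow</city></customer>
  <customer id="4"><country>US</country><areaCode>131</areaCode><city>Springfield</city></customer>
</customers>
"""


def find_violations(root, record_path, condition, lhs, rhs):
    """Check one conditional dependency: for records matching `condition`
    (a pattern of constants), the `lhs` elements should determine `rhs`.

    Returns (record id, observed value, majority value) for every record
    whose `rhs` value differs from the majority value in its `lhs` group.
    """
    groups = defaultdict(list)
    for rec in root.findall(record_path):
        values = {child.tag: (child.text or "").strip() for child in rec}
        # Skip records that do not match the constant pattern.
        if any(values.get(tag) != const for tag, const in condition.items()):
            continue
        key = tuple(values.get(tag) for tag in lhs)
        groups[key].append((rec.get("id"), values.get(rhs)))

    violations = []
    for members in groups.values():
        # The most frequent rhs value in the group is treated as the reference.
        majority, _ = Counter(value for _, value in members).most_common(1)[0]
        violations.extend(
            (rec_id, value, majority)
            for rec_id, value in members
            if value != majority
        )
    return violations


root = ET.fromstring(XML_DOC.strip())
# Pattern row: when country = "UK", areaCode should determine city.
violations = find_violations(root, "customer", {"country": "UK"}, ["areaCode"], "city")
print(violations)
```

In this sketch the US record is ignored because it does not match the pattern, and the record with `id="3"` is reported because its city differs from the majority value for area code 131. Using the majority value as the reference is a simplifying assumption that mirrors the "approximate" dependencies mentioned in the description.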