Comparison of hierarchical clustering methods for binary data from molecular markers
Author: | Christos Dordas, Alexios N. Polidoros, Ilias G. Eleftherohorinos, S. Ntoanidou, Emmanouil D. Pratsinakis, Panagiotis Madesis, George Menexes |
---|---|
Publication year: | 2020 |
Subject: | 0106 biological sciences; 0301 basic medicine; Linkage (software); Information Systems and Management; business.industry; Squared Euclidean distance; Applied Mathematics; Dendrogram; UPGMA; Pattern recognition; 01 natural sciences; Correspondence analysis; Hierarchical clustering; 03 medical and health sciences; 030104 developmental biology; Binary data; Artificial intelligence; business; Cluster analysis; 010606 plant biology & botany; Mathematics; Information Systems |
Source: | International Journal of Data Analysis Techniques and Strategies. 12:190 |
ISSN: | 1755-8069 1755-8050 |
Description: | Data from molecular markers, which are used to construct dendrograms based on genetic distances between different plant species, are encoded as binary data. For dendrogram construction, the most commonly used linkage method is UPGMA in combination with the squared Euclidean distance. In this scientific field, this combination appears to be regarded as the 'gold standard' clustering method. This study presents a review of clustering methods used with binary data. It also attempts an evaluation of the linkage methods and their corresponding appropriate distances (a comparison of 163 clustering methods), using binary data resulting from molecular markers applied to five populations of the wild mustard species Sinapis arvensis. The various cluster solutions were validated using external criteria. The results showed that the 'gold standard' is not a 'panacea' for dendrogram construction based on binary data derived from molecular markers. Thirty-seven other hierarchical clustering methods could be used. |
Database: | OpenAIRE |
External link: |
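
The description above names UPGMA with the squared Euclidean distance as the usual choice for binary marker data, and mentions validation against external criteria. The following is a minimal sketch of that one combination, not the authors' procedure and not a reproduction of their 163-method comparison. The marker profiles, the population labels, and the helper names (`sq_euclidean`, `upgma`, `rand_index`) are all made up for illustration, and the Rand index is only one possible external criterion; the record does not say which criteria the study used. For 0/1 vectors the squared Euclidean distance is simply the number of markers on which two samples differ.

```python
from itertools import combinations


def sq_euclidean(a, b):
    """Squared Euclidean distance; for 0/1 vectors this counts mismatching markers."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def upgma(profiles, n_clusters):
    """Average-linkage (UPGMA) agglomeration until n_clusters clusters remain.

    Returns a sorted list of clusters, each a sorted list of row indices.
    """
    n = len(profiles)
    # Pairwise distances between the original samples, keyed by (i, j) with i < j.
    d = {(i, j): sq_euclidean(profiles[i], profiles[j])
         for i, j in combinations(range(n), 2)}

    def avg(c1, c2):
        # UPGMA distance: mean of all sample-to-sample distances across two clusters.
        total = sum(d[min(i, j), max(i, j)] for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    clusters = [[i] for i in range(n)]
    while len(clusters) > n_clusters:
        # Merge the pair of clusters with the smallest average distance.
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda p: avg(clusters[p[0]], clusters[p[1]]))
        merged = sorted(clusters[a] + clusters[b])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return sorted(clusters)


def rand_index(clusters, true_labels):
    """External criterion: share of sample pairs on which the clustering
    and the known grouping agree (same group vs. different group)."""
    assigned = {i: k for k, c in enumerate(clusters) for i in c}
    pairs = list(combinations(range(len(true_labels)), 2))
    agree = sum((assigned[i] == assigned[j]) == (true_labels[i] == true_labels[j])
                for i, j in pairs)
    return agree / len(pairs)


# Hypothetical band presence/absence profiles for six plants from two populations.
profiles = [
    [1, 1, 0, 0, 1, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 1, 0],
    [0, 0, 1, 1, 0, 1],
    [0, 0, 1, 1, 1, 1],
    [0, 1, 1, 1, 0, 1],
]
populations = ["A", "A", "A", "B", "B", "B"]

clusters = upgma(profiles, n_clusters=2)
score = rand_index(clusters, populations)
print(clusters, score)
```

In practice the same linkage is available as `scipy.cluster.hierarchy.linkage` with `method="average"` and `metric="sqeuclidean"`; swapping the distance function or the linkage rule in a sketch like this is how alternative method combinations could be compared against a known grouping.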