A comparison of heuristic and model-based clustering methods for dietary pattern analysis
Author: | Greve, Benjamin; Pigeot, Iris; Huybrechts, Inge; Pala, Valeria; Börnhorst, Claudia; The IDEFICS consortium |
---|---|
Year of publication: | 2015 |
Subject: |
0301 basic medicine; Medicine (miscellaneous); Disease cluster; Diet Surveys; 03 medical and health sciences; Statistics; Cluster Analysis; Heuristics; Humans; Child; Mathematics; 030109 nutrition & dietetics; Nutrition and Dietetics; Orientation (computer vision); Heuristic; Public Health, Environmental and Occupational Health; k-means clustering; Infant; Feeding Behavior; Dietary pattern; Mixture model; Research Papers; Diet; Identification (information); Research Design; IDEFICS study; k-means; Multidimensional data; Ward’s minimum variance method; Gaussian mixture model |
Source: | Public Health Nutrition (Public Health Nutr), 19(Supplement 2):255-264 |
ISSN: | 1475-2727 1368-9800 |
Description: | Objective: Cluster analysis is widely applied to identify dietary patterns. A new method based on Gaussian mixture models (GMM) seems to be more flexible than the commonly applied k-means and Ward’s method. In the present paper, these clustering approaches are compared to find the most appropriate one for clustering dietary data. Design: The clustering methods were applied to simulated data sets with different cluster structures to compare their performance given the known true cluster membership of observations. Furthermore, the three methods were applied to food frequency questionnaire (FFQ) data collected from 1791 children participating in the IDEFICS (Identification and Prevention of Dietary- and Lifestyle-Induced Health Effects in Children and Infants) Study to explore their performance in practice. Results: The GMM outperformed the other methods in the simulation study in 72 % to 100 % of cases, depending on the simulated cluster structure. Comparing the computationally less complex k-means and Ward’s methods, k-means performed better in 64–100 % of cases. Applied to real data, all methods identified three similar dietary patterns, which may be roughly characterized as a ‘non-processed’ cluster with a high consumption of fruits, vegetables and wholemeal bread, a ‘balanced’ cluster with only slight preferences for single foods, and a ‘junk food’ cluster. Conclusions: The simulation study suggests that clustering via GMM should be preferred due to its higher flexibility regarding cluster volume, shape and orientation. k-means seems to be a good alternative, being easier to use while giving similar results when applied to real data. |
Database: | OpenAIRE |
External link: |
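
The simulation design in the description relies on knowing the true cluster membership, so each method's output can be scored against the truth. The abstract does not name the agreement measure that was used. As a minimal sketch, assuming the adjusted Rand index as one plausible choice, the scoring step might look like the following (the function name and the toy label vectors are illustrative, not taken from the paper):

```python
from collections import Counter
from math import comb


def adjusted_rand_index(true_labels, predicted_labels):
    """Chance-corrected agreement between two partitions (1.0 = identical)."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    total_pairs = comb(len(true_labels), 2)
    if total_pairs == 0:
        # Fewer than two observations: the partitions trivially agree.
        return 1.0

    # Contingency counts: observations sharing each (true, predicted) label pair.
    pair_counts = Counter(zip(true_labels, predicted_labels))
    true_counts = Counter(true_labels)
    pred_counts = Counter(predicted_labels)

    # Number of observation pairs placed together under each view.
    sum_cells = sum(comb(c, 2) for c in pair_counts.values())
    sum_true = sum(comb(c, 2) for c in true_counts.values())
    sum_pred = sum(comb(c, 2) for c in pred_counts.values())

    expected = sum_true * sum_pred / total_pairs  # agreement expected by chance
    maximum = (sum_true + sum_pred) / 2           # agreement of a perfect match
    if maximum == expected:
        # Degenerate case, e.g. both partitions use a single cluster.
        return 1.0
    return (sum_cells - expected) / (maximum - expected)


# Hypothetical toy data: nine observations in three true clusters.
truth = [0, 0, 0, 1, 1, 1, 2, 2, 2]
relabelled = [1, 1, 1, 0, 0, 0, 2, 2, 2]  # same partition, different label names
noisy = [0, 0, 1, 1, 1, 2, 2, 2, 0]       # three observations misassigned

score_relabelled = adjusted_rand_index(truth, relabelled)
score_noisy = adjusted_rand_index(truth, noisy)
```

The index ignores how clusters are named, which matters here because k-means, Ward’s method and a GMM each number their clusters arbitrarily. Repeating such a score over many simulated data sets and counting how often one method beats another would give percentages of the kind reported in the results.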