Investigating the power of goodness-of-fit test for multinomial logistic regression using K-Means clustering technique

Authors: Yap Bee Wah, Nor Azrita Mohd Amin, Aniza Hassan, Hamzah Abdul Hamid
Publication year: 2018
Subject:
Source: AIP Conference Proceedings.
ISSN: 0094-243X
DOI: 10.1063/1.5054203
Description: Logistic regression is used to model the relationship between a categorical dependent variable and one or more independent variables. It can be divided into three types: binary, multinomial and ordinal. Binary logistic regression is used when the dependent variable has two categories, while multinomial logistic regression is used when the dependent variable has more than two nominal categories. When the dependent variable has more than two ordinal categories, ordinal logistic regression is more suitable. For any regression model, once the model is fitted, it should be examined to determine whether it fits the data. For multinomial logistic regression, several goodness-of-fit tests are available for examining the fitted model. Recently, a test based on a clustering partitioning strategy has been proposed. The proposed test uses Ward’s hierarchical clustering method to group the data. Thus, the performance of the test under other clustering methods remains unclear. This study investigates the performance of the goodness-of-fit test based on the clustering partitioning strategy using the K-Means clustering technique. The power of the test is evaluated in a simulation study conducted in R. The results show that the test using K-Means clustering controls the Type I error rate. It also has ample power to detect some types of lack of fit, except in the cases of a highly skewed covariate distribution and the omission of an interaction term. An application to a real data set confirmed the results of the simulation study.
Database: OpenAIRE
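
The partitioning idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (the study itself used R), and several details are assumptions: that observations are clustered on their covariate values, that the statistic is a Pearson-type sum over clusters and outcome categories, and that fitted category probabilities are already available from some multinomial logistic model. The function names `kmeans` and `clustered_pearson_statistic` are hypothetical, and the reference distribution and degrees of freedom are not given in the abstract, so they are left out.

```python
import random


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's K-Means on lists of floats; returns one cluster label per point."""
    rng = random.Random(seed)
    # Assumption: initial centers are k distinct observations drawn at random.
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: nearest center by squared Euclidean distance.
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def clustered_pearson_statistic(labels, y, probs, k):
    """Pearson-type statistic comparing observed and expected counts per cluster.

    labels: cluster index of each observation (0 .. k-1)
    y:      observed outcome category of each observation (0 .. J-1)
    probs:  fitted probabilities, one list of J values per observation
    """
    n_cat = len(probs[0])
    observed = [[0.0] * n_cat for _ in range(k)]
    expected = [[0.0] * n_cat for _ in range(k)]
    for g, yi, pi in zip(labels, y, probs):
        observed[g][yi] += 1.0
        for j in range(n_cat):
            expected[g][j] += pi[j]
    stat = 0.0
    for g in range(k):
        for j in range(n_cat):
            if expected[g][j] > 0:
                stat += (observed[g][j] - expected[g][j]) ** 2 / expected[g][j]
    return stat


# Toy usage with made-up covariates, outcomes and fitted probabilities.
X = [[0.0], [0.1], [10.0], [10.1]]
y = [0, 1, 0, 1]
probs = [[0.5, 0.5]] * 4
groups = kmeans(X, k=2)
print(clustered_pearson_statistic(groups, y, probs, k=2))
```

Large values of the statistic indicate that the observed category counts within clusters depart from what the fitted model predicts. Swapping `kmeans` for a hierarchical method such as Ward's would change only the grouping step.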