Computational Statistics with Dummy Variables

Autor: Rinaldo, Fernandes, Adji Achmad, Solimun, Nurjannah
Rok vydání: 2022
Zdroj: Computational Statistics and Applications
DOI: 10.5772/intechopen.101460
Popis: Cluster analysis is a technique commonly used to group objects and then further analysis is carried out to obtain a model, named cluster integration. This process can be continued with various analyzes, including path analyzes, discriminant analyzes, logistics, etc. In this chapter, the author discusses the reason to use dummy variables in this type of cluster analysis. Dummy variables are the main way that categorical variables are included as predictors in modeling. With statistical models such as linear regression, one of the dummy variables needs to be excluded, otherwise the predictor variables are perfectly correlated. Thus, usually if a categorical variable can take k values, we only need k-1 dummy variables, the k-th variable being redundant, it does not bring any new information. When more dummy variables than needed are used this is known as dummy variable trapping. The advantage to use dummy variables is that they are simple to use and the decision making process is easier to manage. The novelty in this chapter is the perspective of the dummy variable technique using cluster analysis in statistical modeling. The data used in this study is an assessment of the provision of credit risk at a bank in Indonesia. All analyzes were carried out using software R.
Databáze: OpenAIRE