Finding Influential Factors for Different Types of Cancer: A Data Mining Approach
Author: | Elham Akhond Zadeh Noughabi, Behrouz H. Far, Reda Alhajj, Munima Jahan |
---|---|
Year of publication: | 2018 |
Subject: |
medicine.medical_specialty; education.field_of_study; 020205 medical informatics; Association rule learning; Contrast set; Population; Cancer; 02 engineering and technology; Disease; medicine.disease; Data science; 03 medical and health sciences; 0302 clinical medicine; Epidemiology; 0202 electrical engineering, electronic engineering, information engineering; medicine; National Health Interview Survey; 030212 general & internal medicine; Cancer development; education; Psychology |
Source: | Applications of Data Management and Analysis, ISBN: 9783319958095 |
Description: | Cancer is one of the leading causes of death around the world. Finding the risk factors related to different types of cancer can help researchers understand the process of cancer development and find new ways of preventing the disease. Most research on cancer datasets focuses on only one type of cancer. This research aims to provide a new methodology for extracting significant influential factors affecting multiple cancer types by employing frequent pattern mining, association rule mining, and contrast set mining techniques. The datasets used cover the US general population and were collected from the National Health Interview Survey (NHIS) and the Surveillance, Epidemiology, and End Results (SEER) Program. The discovered rules contribute in two ways: some validate existing knowledge about cancer, and a few open further research directions that could enrich expert knowledge in the cancer domain. Experimental results show that high cholesterol and high blood pressure are prevalent among cancer patients. In terms of demographics, women and people aged 61 to 85 are more prone to cancer. In addition, patients recorded as "not Hispanic/Spanish origin" make up the majority of cancer patients. This research is one of the few works that applies across diverse cancer types, and it is unique in its methodology for finding dominant factors associated with cancer. |
Database: | OpenAIRE |
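Illustrative note: the description names association rule mining and contrast set mining without showing what such rules measure. The sketch below is not the authors' implementation. It uses a handful of made-up records (not NHIS or SEER data) and hypothetical helper names (`support`, `rule_metrics`, `support_difference`) to show how the support, confidence, and lift of a rule, and a simple between-group support difference of the kind contrast set mining looks for, might be computed.

```python
# Hypothetical toy records, invented for illustration only.
# Each record is a set of "attribute=value" items.
records = [
    {"cholesterol=high", "bp=high", "cancer=yes"},
    {"cholesterol=high", "bp=high", "cancer=yes"},
    {"cholesterol=high", "bp=normal", "cancer=yes"},
    {"cholesterol=normal", "bp=high", "cancer=no"},
    {"cholesterol=normal", "bp=normal", "cancer=no"},
    {"cholesterol=high", "bp=normal", "cancer=no"},
]


def support(itemset, rows):
    """Fraction of rows that contain every item in itemset."""
    if not rows:
        return 0.0
    itemset = set(itemset)
    return sum(itemset <= row for row in rows) / len(rows)


def rule_metrics(antecedent, consequent, rows):
    """Return (support, confidence, lift) for antecedent -> consequent."""
    s_both = support(set(antecedent) | set(consequent), rows)
    s_ante = support(antecedent, rows)
    s_cons = support(consequent, rows)
    confidence = s_both / s_ante if s_ante else 0.0
    lift = confidence / s_cons if s_cons else 0.0
    return s_both, confidence, lift


def support_difference(itemset, rows, group_item, other_item):
    """Support of itemset in one group minus its support in another group."""
    group = [r for r in rows if group_item in r]
    other = [r for r in rows if other_item in r]
    return support(itemset, group) - support(itemset, other)


rule = rule_metrics({"cholesterol=high"}, {"cancer=yes"}, records)
contrast = support_difference(
    {"cholesterol=high"}, records, "cancer=yes", "cancer=no"
)
print("support, confidence, lift:", rule)
print("support difference between groups:", contrast)
```

A lift above 1 indicates that the antecedent and consequent co-occur more often than independence would predict; a large support difference marks an itemset as a candidate contrast between the two groups. Real contrast set mining additionally applies statistical significance tests, which this sketch omits.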