Classification of Citation Sentence for Filtering Scientific References

Authors: Masayu Leylia Khodra, Ghoziyah Haitan Rachman, Dwi H. Widyantoro
Year of publication: 2019
Subject:
Source: 2019 4th International Conference on Information Technology, Information Systems and Electrical Engineering (ICITISEE).
DOI: 10.1109/icitisee48480.2019.9003736
Description: A citation sentence can inform readers about the relation between the citing and the cited scientific article by revealing the purpose of the citation within the research. Besides giving credit to other researchers and recommending related articles, citations help readers understand what knowledge they have gained from the cited articles they have read. In this research, we define citation categories for filtering scientific references, as an initial step toward guided summarization of scientific articles. Our goal is first to classify each citation sentence as ‘problem’, ‘other’, ‘useModel’, ‘useTool’, or ‘useData’. These categories make it easier to classify scientific articles into more specific topics. As features we use voice, tense, citation location, meta-discourse, and bag of words. We build the classification model with a linear SVM and handle the imbalanced dataset with a sampling technique, SMOTE. The best F-measure for our citation classification, 61.2%, is achieved when combining voice and tense, meta-discourse, and bag of words, and oversampling the feature data of the ‘useData’ category with SMOTE.
Database: OpenAIRE
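
Note: The abstract names SMOTE as the sampling technique applied to the under-represented category. The core idea of SMOTE is to create synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbors. The following is a minimal sketch of that idea only, not the authors' implementation: the function name `smote_oversample`, the parameters `k` and `seed`, and the toy feature vectors are all illustrative assumptions, and the real feature set (voice, tense, meta-discourse, bag of words) is not reproduced here.

```python
import math
import random


def smote_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic vectors from a list of minority-class vectors.

    Simplified SMOTE-style interpolation (hypothetical helper, not from the paper).
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [x for j, x in enumerate(minority) if j != i]
        others.sort(key=lambda x: math.dist(base, x))
        neighbor = rng.choice(others[:k])
        # Place the new point at a random position on the segment base -> neighbor.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


# Toy feature vectors standing in for the under-represented class (assumed data).
use_data = [[1.0, 0.0, 2.0], [2.0, 1.0, 2.0], [1.5, 0.5, 3.0]]
new_rows = smote_oversample(use_data, n_new=4)
```

Because each synthetic vector lies on a line segment between two existing minority vectors, every coordinate stays within the range already observed for that class. In practice one would more likely rely on an existing library implementation and feed the balanced data to a linear SVM, as the abstract describes.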