BanglaMusicStylo: A Stylometric Dataset of Bangla Music Lyrics

Authors: Rafayet Hossain, Ahmed Al Marouf
Year of publication: 2018
Subject:
Source: 2018 International Conference on Bangla Speech and Language Processing (ICBSLP).
Description: With the rapid growth of the Bangla music industry, a huge volume of Bangla songs is produced every day. An immense number of producers, lyricists, singers, and artists are involved in producing songs in different genres. Among the many genres of Bangla music, classical, folk, baul, modern music, Rabindra Sangeet, Nazrul Geeti, film music, rock music, and fusion music have gained the highest popularity. Lyricists try to express their feelings and views on a situation or subject through their writing. Therefore, each lyricist has their own dictionary of thoughts to put into music lyrics. In this paper, we present “BanglaMusicStylo”, the very first stylometric dataset of Bangla music lyrics. We have collected 2824 Bangla song lyrics by 211 lyricists in digital form. All the lyrics are stored in text format for further use. This dataset could be used for stylometric analyses such as authorship attribution, linguistic forensics, gender identification from textual data, Bangla music genre classification, vandalism detection, and emotion classification. Having identified the significant research opportunities in this area, we have formalized this dataset for use in stylometric analysis.
Database: OpenAIRE