Showing 1 - 10 of 7,278 for search: '"Vijayalakshmi, P."'
In ideal human-computer interaction (HCI), the colloquial form of a language would be preferred by most users, since it is the form used in their day-to-day conversations. However, there is also an undeniable necessity to preserve the formal literary
External link:
http://arxiv.org/abs/2409.14348
Published in:
S. J. Joysingh, P. Vijayalakshmi and T. Nagarajan, "Development of Large Annotated Music Datasets using HMM based Forced Viterbi Alignment," TENCON 2019 - 2019 IEEE Region 10 Conference (TENCON), Kochi, India, 2019, pp. 1298-1302
Datasets are essential for any machine learning task. Automatic Music Transcription (AMT) is one such task, where a considerable amount of data is required depending on the way the solution is achieved. Considering the fact that a music dataset, comple
External link:
http://arxiv.org/abs/2408.14890
Published in:
TENCON 2019 - 2019 IEEE Region 10 Conference (TENCON), Kochi, India, 2019, pp. 1303-1306
The evolution and diversity of a language are evident from its various dialects. If the various dialects are not addressed in technological advancements like automatic speech recognition and speech synthesis, there is a chance that these dialects may
External link:
http://arxiv.org/abs/2408.14887
Whisper-to-normal speech conversion is an active area of research. Various architectures based on generative adversarial networks have been proposed in the recent past. In particular, a recent study shows that MaskCycleGAN, which is a mask-guided, and cyc
External link:
http://arxiv.org/abs/2408.14797
Whispered speech as an acceptable form of human-computer interaction is gaining traction. Systems that address multiple modes of speech require a robust front-end speech classifier. Performance of whispered vs normal speech classification drops in th
External link:
http://arxiv.org/abs/2408.14777
Published in:
Joysingh, S.J., Vijayalakshmi, P. & Nagarajan, T. Quartered Spectral Envelope and 1D-CNN-Based Classification of Normally Phonated and Whispered Speech. Circuits Syst Signal Process 42, 3038-3053 (2023)
Whisper, as a form of speech, is not sufficiently addressed by mainstream speech applications. This is due to the fact that systems built for normal speech do not work as expected for whispered speech. A first step to building a speech application th
External link:
http://arxiv.org/abs/2408.13746
Published in:
Circuits Syst Signal Process 41, 4004-4027 (2022)
Culture and language evolve together. The old literary form of Tamil is used commonly for writing and the contemporary colloquial Tamil is used for speaking. Human-computer interaction applications require Colloquial Tamil (CT) to make it more access
External link:
http://arxiv.org/abs/2408.13739
The onset of a musical note is the earliest time at which a note can be reliably detected. Detection of these musical onsets poses challenges in the presence of ornamentation such as vibrato and bending, and when the attack of the note transient is slower.
External link:
http://arxiv.org/abs/2408.13734
Author:
Jena, Shuvam; Rajendran, Sushmetha Sumathi; Seemakurthy, Karthik; A, Sasithradevi; M, Vijayalakshmi; Poornachari, Prakash
In the field of object detection, domain generalisation (DG) aims to ensure robust performance across diverse and unseen target domains by learning the robust domain-invariant features corresponding to the objects of interest across multiple source d
External link:
http://arxiv.org/abs/2408.01746
Author:
Scarano, Stephen; Vasudevan, Vijayalakshmi; Bagchi, Chhandak; Samory, Mattia; Yang, JungHwan; Grabowicz, Przemyslaw A.
Polls posted on social media have emerged in recent years as an important tool for estimating public opinion, e.g., to gauge public support for business decisions and political candidates in national elections. Here, we examine nearly two thousand Tw
External link:
http://arxiv.org/abs/2406.03340