Search results

Report

ProSLM : A Prolog Synergized Language Model for explainable Domain Specific Knowledge Based Question Answering

Authors: Vakharia, Priyesh; Kufeldt, Abigail; Meyers, Max; Lane, Ian; Gilpin, Leilani

Published in: Springer, Lecture Notes on Computer Science (LNAI, volume 14980), 2024

Neurosymbolic approaches can add robustness to opaque neural systems by incorporating explainable symbolic representations. However, previous approaches have not used formal logic to contextualize queries to and validate outputs of large language mod

External link: http://arxiv.org/abs/2409.11589

Report

Online Continual Learning of End-to-End Speech Recognition Models

Authors: Yang, Muqiao; Lane, Ian; Watanabe, Shinji

Continual Learning, also known as Lifelong Learning, aims to continually learn from new data as it becomes available. While prior research on continual learning in automatic speech recognition has focused on the adaptation of models across multiple d

External link: http://arxiv.org/abs/2207.05071

Report

Branchformer: Parallel MLP-Attention Architectures to Capture Local and Global Context for Speech Recognition and Understanding

Authors: Peng, Yifan; Dalmia, Siddharth; Lane, Ian; Watanabe, Shinji

Conformer has proven to be effective in many speech processing tasks. It combines the benefits of extracting local dependencies using convolutions and global dependencies using self-attention. Inspired by this, we propose a more flexible, interpretab

External link: http://arxiv.org/abs/2207.02971

Report

Identifying Actions for Sound Event Classification

Authors: Elizalde, Benjamin; Revutchi, Radu; Das, Samarjit; Raj, Bhiksha; Lane, Ian; Heller, Laurie M.

In Psychology, actions are paramount for humans to identify sound events. In Machine Learning (ML), action recognition achieves high accuracy; however, it has not been asked whether identifying actions can benefit Sound Event Classification (SEC), as

External link: http://arxiv.org/abs/2104.12693

Report

Learning Question-Guided Video Representation for Multi-Turn Video Question Answering

Authors: Chao, Guan-Lin; Rastogi, Abhinav; Yavuz, Semih; Hakkani-Tür, Dilek; Chen, Jindong; Lane, Ian

Understanding and conversing about dynamic scenes is one of the key capabilities of AI agents that navigate the environment and convey useful information to humans. Video question answering is a specific scenario of such AI-human interaction where an

External link: http://arxiv.org/abs/1907.13280

Report

BERT-DST: Scalable End-to-End Dialogue State Tracking with Bidirectional Encoder Representations from Transformer

Authors: Chao, Guan-Lin; Lane, Ian

An important yet rarely tackled problem in dialogue state tracking (DST) is scalability for dynamic ontology (e.g., movie, restaurant) and unseen slot values. We focus on a specific condition, where the ontology is unknown to the state tracker, but t

External link: http://arxiv.org/abs/1907.03040

Report

Speaker-Targeted Audio-Visual Models for Speech Recognition in Cocktail-Party Environments

Authors: Chao, Guan-Lin; Chan, William; Lane, Ian

Speech recognition in cocktail-party environments remains a significant challenge for state-of-the-art speech recognition systems, as it is extremely difficult to extract an acoustic signal of an individual speaker from a background of overlapping sp

External link: http://arxiv.org/abs/1906.05962

Report

Speaker Diarization With Lexical Information

Authors: Park, Tae Jin; Han, Kyu; Lane, Ian; Georgiou, Panayiotis

This work presents a novel approach to leverage lexical information for speaker diarization. We introduce a speaker diarization system that can directly integrate lexical as well as acoustic information into a speaker clustering process. Thus, we pro

External link: http://arxiv.org/abs/1811.10761

Report

Understanding and Improving Recurrent Networks for Human Activity Recognition by Continuous Attention

Authors: Zeng, Ming; Gao, Haoxiang; Yu, Tong; Mengshoel, Ole J.; Langseth, Helge; Lane, Ian; Liu, Xiaobing

Published in: The International Symposium on Wearable Computers (ISWC) 2018

Deep neural networks, including recurrent networks, have been successfully applied to human activity recognition. Unfortunately, the final representation learned by recurrent networks might encode some noise (irrelevant signal components, unimportant

External link: http://arxiv.org/abs/1810.04038
