Výsledky vyhledávání - "Ananthakrishnan, Shankar"

Report

Design Considerations For Hypothesis Rejection Modules In Spoken Language Understanding Systems

Autor: Alok, Aman, Gupta, Rahul, Ananthakrishnan, Shankar

Spoken Language Understanding (SLU) systems typically consist of a set of machine learning models that operate in conjunction to produce an SLU hypothesis. The generated hypothesis is then sent to downstream components for further action. However, it

Externí odkaz: http://arxiv.org/abs/2211.09711

Zobrazit plný text záznamu

Report

AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model

Autor: Soltan, Saleh, Ananthakrishnan, Shankar, FitzGerald, Jack, Gupta, Rahul, Hamza, Wael, Khan, Haidar, Peris, Charith, Rawls, Stephen, Rosenbaum, Andy, Rumshisky, Anna, Prakash, Chandana Satya, Sridhar, Mukund, Triefenbach, Fabian, Verma, Apurv, Tur, Gokhan, Natarajan, Prem

In this work, we demonstrate that multilingual large-scale sequence-to-sequence (seq2seq) models, pre-trained on a mixture of denoising and Causal Language Modeling (CLM) tasks, are more efficient few-shot learners than decoder-only models on various

Externí odkaz: http://arxiv.org/abs/2208.01448

Zobrazit plný text záznamu

Report

Alexa Teacher Model: Pretraining and Distilling Multi-Billion-Parameter Encoders for Natural Language Understanding Systems

Publikováno v: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD '22), August 14-18, 2022, Washington, DC, USA

We present results from a large-scale experiment on pretraining encoders with non-embedding parameter counts ranging from 700M to 9.3B, their subsequent distillation into smaller models ranging from 17M-170M parameters, and their application to the N

Externí odkaz: http://arxiv.org/abs/2206.07808

Zobrazit plný text záznamu

Report

One-vs-All Models for Asynchronous Training: An Empirical Analysis

Autor: Gupta, Rahul, Alok, Aman, Ananthakrishnan, Shankar

Any given classification problem can be modeled using multi-class or One-vs-All (OVA) architecture. An OVA system consists of as many OVA models as the number of classes, providing the advantage of asynchrony, where each OVA model can be re-trained i

Externí odkaz: http://arxiv.org/abs/1906.08858

Zobrazit plný text záznamu

Report

A Re-ranker Scheme for Integrating Large Scale NLU models

Autor: Su, Chengwei, Gupta, Rahul, Ananthakrishnan, Shankar, Matsoukas, Spyros

Large scale Natural Language Understanding (NLU) systems are typically trained on large quantities of data, requiring a fast and scalable training strategy. A typical design for NLU systems consists of domain-level NLU modules (domain classifier, int

Externí odkaz: http://arxiv.org/abs/1809.09605

Zobrazit plný text záznamu

Report

Play Duration based User-Entity Affinity Modeling in Spoken Dialog System

Autor: Xiao, Bo, Monath, Nicholas, Ananthakrishnan, Shankar, Ravi, Abishek

Multimedia streaming services over spoken dialog systems have become ubiquitous. User-entity affinity modeling is critical for the system to understand and disambiguate user intents and personalize user experiences. However, fully voice-based interac

Externí odkaz: http://arxiv.org/abs/1806.11479

Zobrazit plný text záznamu