Search results - "Shiralkar, Prashant"

Report

Effective Proxy for Human Labeling: Ensemble Disagreement Scores in Large Language Models for Industrial NLP

Authors: Du, Wei; Advani, Laksh; Gambhir, Yashmeet; Perry, Daniel J; Shiralkar, Prashant; Xing, Zhengzheng; Colak, Aaron

Large language models (LLMs) have demonstrated significant capability to generalize across a large number of NLP tasks. For industry applications, it is imperative to assess the performance of the LLM on unlabeled production data from time to time to

External link: http://arxiv.org/abs/2309.05619

Report

Extracting Shopping Interest-Related Product Types from the Web

Authors: Li, Yinghao; Lockard, Colin; Shiralkar, Prashant; Zhang, Chao

Recommending a diversity of product types (PTs) is important for a good shopping experience when customers are looking for products around their high-level shopping interests (SIs) such as hiking. However, the SI-PT connection is typically absent in

External link: http://arxiv.org/abs/2305.14549

Report

Label-Efficient Self-Training for Attribute Extraction from Semi-Structured Web Documents

Authors: Sarkhel, Ritesh; Huang, Binxuan; Lockard, Colin; Shiralkar, Prashant

Extracting structured information from HTML documents is a long-studied problem with a broad range of applications, including knowledge base construction, faceted search, and personalized recommendation. Prior works rely on a few human-labeled web pa

External link: http://arxiv.org/abs/2208.13086

Report

DOM-LM: Learning Generalizable Representations for HTML Documents

Authors: Deng, Xiang; Shiralkar, Prashant; Lockard, Colin; Huang, Binxuan; Sun, Huan

HTML documents are an important medium for disseminating information on the Web for human consumption. An HTML document presents information in multiple text formats including unstructured text, structured key-value pairs, and tables. Effective repre

External link: http://arxiv.org/abs/2201.10608

Report

TCN: Table Convolutional Network for Web Table Interpretation

Authors: Wang, Daheng; Shiralkar, Prashant; Lockard, Colin; Huang, Binxuan; Dong, Xin Luna; Jiang, Meng

Information extraction from semi-structured webpages provides valuable long-tailed facts for augmenting knowledge graph. Relational Web tables are a critical component containing additional entities and attributes of rich and diverse knowledge. Howev

External link: http://arxiv.org/abs/2102.09460

Report

ZeroShotCeres: Zero-Shot Relation Extraction from Semi-Structured Webpages

Authors: Lockard, Colin; Shiralkar, Prashant; Dong, Xin Luna; Hajishirzi, Hannaneh

In many documents, such as semi-structured webpages, textual semantics are augmented with additional information conveyed using visual elements including layout, font size, and color. Prior work on information extraction from semi-structured websites

External link: http://arxiv.org/abs/2005.07105

Report

TriggerNER: Learning with Entity Triggers as Explanations for Named Entity Recognition

Authors: Lin, Bill Yuchen; Lee, Dong-Ho; Shen, Ming; Moreno, Ryan; Huang, Xiao; Shiralkar, Prashant; Ren, Xiang

Published in: Proc. of ACL 2020, pages 8503-8511

Training neural models for named entity recognition (NER) in a new domain often requires additional human annotations (e.g., tens of thousands of labeled instances) that are usually expensive and time-consuming to collect. Thus, a crucial research qu

External link: http://arxiv.org/abs/2004.07493

Report

CERES: Distantly Supervised Relation Extraction from the Semi-Structured Web

Authors: Lockard, Colin; Dong, Xin Luna; Einolghozati, Arash; Shiralkar, Prashant

The web contains countless semi-structured websites, which can be a rich source of information for populating knowledge bases. Existing methods for extracting relations from the DOM trees of semi-structured webpages can achieve high precision and rec

External link: http://arxiv.org/abs/1804.04635

Report

RelSifter: Scoring Triples from Type-like Relations - The Samphire Triple Scorer at WSDM Cup 2017

Authors: Shiralkar, Prashant; Avram, Mihai; Ciampaglia, Giovanni Luca; Menczer, Filippo; Flammini, Alessandro

We present RelSifter, a supervised learning approach to the problem of assigning relevance scores to triples expressing type-like relations such as 'profession' and 'nationality.' To provide additional contextual information about individuals and rel

External link: http://arxiv.org/abs/1712.08674

Report

Finding Streams in Knowledge Graphs to Support Fact Checking

Authors: Shiralkar, Prashant; Flammini, Alessandro; Menczer, Filippo; Ciampaglia, Giovanni Luca

The volume and velocity of information that gets generated online limits current journalistic practices to fact-check claims at the same rate. Computational approaches for fact checking may be the key to help mitigate the risks of massive misinformat

External link: http://arxiv.org/abs/1708.07239
