Hybrid Feature Factored System for Scoring Extracted Passage Relevance in Regulatory Filings
Author: | Julien Perez, Ágnes Sándor, Denys Proux, Claude Roux |
---|---|
Year of publication: | 2017 |
Subject: |
060201 languages & linguistics; Document Structure Description; Information retrieval; Computer science; business.industry; 06 humanities and the arts; 02 engineering and technology; Ranking (information retrieval); Data set; Text mining; Ranking; Content analysis; 0602 languages and literature; 0202 electrical engineering, electronic engineering, information engineering; Feature (machine learning); 020201 artificial intelligence & image processing; Relevance (information retrieval); business |
Source: | DSMM@SIGMOD |
Description: | This paper reports our contribution to the FEIII 2017 challenge, which addresses relevance ranking of passages extracted from 10-K and 10-Q regulatory filings. We leveraged our previous work on document structure and content analysis for regulatory filings to train hybrid text-analytics and decision-making models. We designed and trained several layers of classifiers fed with linguistic and semantic features to improve relevance prediction. We discuss our experiments and results on the competition data set. |
Database: | OpenAIRE |