Authors: |
Ivan José Reis Filho, Ricardo Marcondes Marcacini, Solange Oliveira Rezende |
Language: |
English |
Publication year: |
2022 |
Subject: |
|
Source: |
MethodsX, Vol 9, Iss , Pp 101758- (2022) |
Document type: |
article |
ISSN: |
2215-0161 |
DOI: |
10.1016/j.mex.2022.101758 |
Description: |
Forecasting models in the financial market generally use quantitative time-series data. However, external factors such as weather events, economic crises, and the foreign exchange market can influence the data in a time series. This information is not explicit in the time series, yet it can affect the prediction of the variable values. Textual data can be a source of knowledge about external factors and is potentially helpful for time-series forecasting models. Some studies have presented text mining techniques to combine textual and time-series data. However, the existing representations have limitations, such as the curse of dimensionality and sparse data. This work investigates the use of a finite set of domain-specific terms to address these problems by representing textual data in a low-dimensional space. We consider thirty-three keywords that are potentially important in the domain to enrich time series using text mining techniques. Four regression models were applied to the proposed representation to predict the future daily price of corn and soybeans. The experimental setup considers a real market scenario, in which a daily sliding window strategy and step-forward forecasting were used. The proposed representation has better accuracy in some forecasting scenarios. The results indicate that text data are a promising alternative for enriching time-series representations and reducing uncertainty in forecasting models. • We show an approach to enriching time series using domain-specific terms; • The proposed representation combines quantitative data with qualitative market factors; • Regression models learn a forecasting function from the enriched time series. |
Database: |
Directory of Open Access Journals |
External link: |
|