Ensembles of Text and Time-Series Models for Automatic Generation of Financial Trading Signals from Social Media Content

Authors: Omar A. Bari, Arvin Agah
Language: English
Year of publication: 2018
Subject:
Source: Journal of Intelligent Systems, Vol 29, Iss 1, Pp 753-772 (2018)
ISSN: 0334-1860
Description: Event studies in finance have focused on traditional news headlines to assess the impact an event has on a traded company. The increased proliferation of news and information produced by social media content has disrupted this trend. Although researchers have begun to identify trading opportunities from social media platforms, such as Twitter, almost all techniques use a general sentiment from large collections of tweets. Though useful, general sentiment does not provide an opportunity to indicate specific events worthy of affecting stock prices. This work presents an event clustering algorithm, utilizing natural language processing techniques to generate newsworthy events from Twitter, which have the potential to influence stock prices in the same manner as traditional news headlines. The event clustering method addresses the effects of pre-news and lagged news, two peculiarities that appear when connecting trading and news, regardless of the medium. Pre-news signifies a finding where stock prices move in advance of a news release. Lagged news refers to follow-up or late-arriving news, adding redundancy in making trading decisions. For events generated by the proposed clustering algorithm, we incorporate event studies and machine learning to produce an actionable system that can guide trading decisions. The recommended prediction algorithms provide investing strategies with profitable risk-adjusted returns. The suggested language models present annualized Sharpe ratios (risk-adjusted returns) in the 5–11 range, while time-series models produce in the 2–3 range (without transaction costs). The distribution of returns confirms the encouraging Sharpe ratios by identifying most outliers as positive gains. Additionally, machine learning metrics of precision, recall, and accuracy are discussed alongside financial metrics in hopes of bridging the gap between academia and industry in the field of computational finance.
Database: OpenAIRE
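
Note on the annualized Sharpe ratio mentioned in the description: the record does not say how the authors computed it, so the following is only a minimal sketch of one common convention. It assumes per-period (for example daily) strategy returns, 252 trading periods per year, a risk-free rate of zero, and the sample standard deviation. The function name `annualized_sharpe` and its parameters are hypothetical and are not taken from the paper.

```python
import math
import statistics


def annualized_sharpe(returns, periods_per_year=252, risk_free_per_period=0.0):
    """Annualized Sharpe ratio of a series of per-period returns.

    Assumptions (not from the paper): returns are simple per-period
    returns, the risk-free rate is constant per period, and annualization
    multiplies the per-period ratio by sqrt(periods_per_year).
    """
    if len(returns) < 2:
        raise ValueError("need at least two returns to estimate volatility")

    # Excess return over the (assumed constant) risk-free rate.
    excess = [r - risk_free_per_period for r in returns]

    mean_excess = statistics.fmean(excess)
    volatility = statistics.stdev(excess)  # sample standard deviation
    if volatility == 0:
        raise ValueError("returns have zero volatility; ratio is undefined")

    # Per-period ratio scaled to a yearly horizon.
    return mean_excess / volatility * math.sqrt(periods_per_year)


if __name__ == "__main__":
    # Hypothetical daily returns, for illustration only.
    sample_returns = [0.01, -0.01, 0.02, 0.0]
    print(round(annualized_sharpe(sample_returns), 2))
```

Like the figures quoted in the description, this sketch ignores transaction costs; subtracting an estimated cost from each return before calling the function would lower the ratio.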