Gálvez, R.H.; Gravano, A. "Assessing the usefulness of online message board mining in automatic stock prediction systems" (2017) Journal of Computational Science. 19:43-56


We provide evidence of the usefulness of exploiting online text data in stock prediction systems. We do this by mining a popular Argentinian stock message board and empirically answering two questions. First, is there information in the online stock message board that is useful for predicting stock returns? Second, if useful information is found, is it novel, or is it simply a different way of expressing information already available in the past behavior of stock prices? To address these questions, we build and validate a series of predictive models using state-of-the-art machine learning and topic discovery techniques. We run experiments in which the models are trained with different combinations of features, either extracted from the past behavior of stock prices or mined from the online message board. The evidence suggests that it is possible to extract predictive information from stock message boards. Furthermore, we find that adding this information improves the performance of classification systems trained solely on technical indicators. Our results suggest that information from online text data is complementary to that available in the past evolution of stock prices. Additionally, we find that highly predictive features derived from the message board data seem to have important and relevant semantic content. © 2017 Elsevier B.V.


Document: Article
Title: Assessing the usefulness of online message board mining in automatic stock prediction systems
Author: Gálvez, R.H.; Gravano, A.
Affiliation: Departamento de Computación, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Argentina
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Argentina
Keywords: Latent semantic analysis; Random forest; Ridge regression; Stock market; Text mining; Classification (of information); Costs; Data mining; Decision trees; Electronic trading; Financial markets; Forecasting; Investments; Learning systems; Regression analysis; Semantics; Classification system; Predictive information; Predictive models; Random forests; Technical indicator; Online systems
First page: 43
Last page: 56
Journal title: Journal of Computational Science
Abbreviated journal title: J. Comput. Sci.




---------- APA ----------
Gálvez, R.H. & Gravano, A. (2017). Assessing the usefulness of online message board mining in automatic stock prediction systems. Journal of Computational Science, 19, 43-56.
---------- CHICAGO ----------
Gálvez, R.H., Gravano, A. "Assessing the usefulness of online message board mining in automatic stock prediction systems". Journal of Computational Science 19 (2017): 43-56.
---------- MLA ----------
Gálvez, R.H., Gravano, A. "Assessing the usefulness of online message board mining in automatic stock prediction systems". Journal of Computational Science, vol. 19, 2017, pp. 43-56.
---------- VANCOUVER ----------
Gálvez, R.H., Gravano, A. Assessing the usefulness of online message board mining in automatic stock prediction systems. J. Comput. Sci. 2017;19:43-56.