Identifying investment signals via social media analytics


US-headquartered quantitative investment management firm


The client was looking to upgrade its in-house quantitative models to incorporate investment signals originating from multiple social media sources.


  • Established a significant correlation between the average sentiment score over the past five days and the average closing price over the next five days
  • Expanded the approach to cover other sectors
  • The client integrated our approach across its in-house quant models to enhance its ability to deliver alpha


  • SGA conducted a detailed background analysis to identify sectors and corresponding social media sources that would generate the most meaningful signals for the client’s quant models
  • Based on the background analysis, SGA identified pharmaceuticals as the preferred sector for the analysis. The decision was based on the relatively low volume of spam and irrelevant content across social media and the ability to leverage pharma-specific content from social media sources such as StockTwits
  • In addition to StockTwits, SGA also leveraged other social media sources, including Twitter


  • After data cleaning, SGA generated a sentiment score for each tweet / StockTwits post. SGA leveraged a detailed sector-specific lexicon to enhance the accuracy of the model
  • SGA aggregated the scores to the day level and normalized the values
  • Thereafter, we combined the sentiment score table with the stock price data. As part of the backtesting process, SGA determined the correlation between the past five days' sentiment scores for a particular pharma stock and that stock's price over the next five days
  • We repeated this analysis across multiple time periods and across different pharma stocks
  • We created new variables for attributes such as lagged sentiment scores and lead closing prices, normalized all variables, and ran Pearson's correlation between the dependent and independent variables, along with regression analysis
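
The steps above can be illustrated with a minimal Python sketch. It is not SGA's implementation: the lexicon entries, the function names, and the choice of z-score normalization are assumptions made for illustration, and only the overall flow (lexicon scoring, daily aggregation, normalization, five-day lag and lead averages, Pearson's correlation) follows the description. The regression step is omitted.

```python
from math import sqrt

# Hypothetical pharma lexicon; a real sector-specific lexicon would be far larger.
LEXICON = {"approval": 1.0, "breakthrough": 1.0, "recall": -1.0, "failed": -1.0}

def score_post(text, lexicon=LEXICON):
    """Sum the lexicon weights of the words in one post."""
    words = (w.strip(".,!?$#") for w in text.lower().split())
    return sum(lexicon.get(w, 0.0) for w in words)

def daily_mean(scored_posts):
    """Aggregate (day, score) pairs to one mean score per day, ordered by day."""
    by_day = {}
    for day, score in scored_posts:
        by_day.setdefault(day, []).append(score)
    return [sum(v) / len(v) for _, v in sorted(by_day.items())]

def zscore(values):
    """Normalize to zero mean and unit variance (one possible normalization)."""
    mean = sum(values) / len(values)
    sd = sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / sd if sd else 0.0 for v in values]

def pearson(x, y):
    """Pearson's correlation coefficient; assumes neither series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def lag_lead_correlation(sentiment, close, window=5):
    """Correlate the mean sentiment of the previous `window` days with the
    mean closing price of the next `window` days, for every eligible day."""
    lag, lead = [], []
    for t in range(window, len(close) - window + 1):
        lag.append(sum(sentiment[t - window:t]) / window)   # days t-window .. t-1
        lead.append(sum(close[t:t + window]) / window)      # days t .. t+window-1
    # Pearson's r is unchanged by z-scoring; it is kept here to mirror the text.
    return pearson(zscore(lag), zscore(lead))
```

To use the sketch, score each cleaned post with `score_post`, pass the resulting `(day, score)` pairs to `daily_mean`, and call `lag_lead_correlation` with the daily sentiment series and a closing-price series aligned to the same trading days. Repeating the call across stocks and time periods corresponds to the backtesting described above.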