15: Effect of Twitter Activity on Financial Markets
Introduction
Predicting stock market and crypto market performance is notoriously difficult. Many factors influence prices, both physical and psychological, including investor behaviour and sentiment. Together they make share prices volatile and very hard to predict with a high degree of accuracy.
In this project, we use sentiment analysis and machine learning to examine the correlation between “public sentiment” and “market sentiment”. We use data from Twitter to estimate public mood, then use the estimated mood to forecast stock market and crypto market movements. We access historical tweets using snscrape and combine them with daily historical stock data when training the model to make it more robust.
Problem Identified
Researchers have long worked on predicting volatile markets. The Efficient Market Hypothesis (EMH) states that stock market prices are largely driven by new information and follow a random walk pattern. Although this hypothesis is widely accepted by the research community as a central paradigm governing markets in general, several people have attempted to extract patterns in the way stock markets behave and respond to external stimuli.
Financial markets offer many investment opportunities. Many people in India and abroad trade on the Indian stock exchanges (NSE and BSE). In recent years, investment in cryptocurrencies has spiked with the rise of Bitcoin and Ethereum-based coins. Social media networks, particularly Twitter, allow people to discuss particular stocks or coins with each other. The price of any financial instrument generally reflects the sentiment of its investors, and any market is affected by the dominant sentiment. Hence this sentiment can be used to gauge the effect on the market for particular coins and stocks.
The remaining sections are divided into two parts, one for stocks and the other for cryptocurrencies.
Stocks
Evolving Process
The tweet data is used to categorise tweets as positive or negative. The volumes of negative and positive tweets determine a polarity index, through which we classify an event as negative, neutral or positive. To cover a broad section of the market, we take stocks from a range of sectors such as telecom and pharma. The polarity data and current stock prices are then fed into different machine learning models.
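The write-up does not give the formula for the polarity index, so the following is only a minimal sketch of one common choice. The normalised difference `(positive - negative) / (positive + negative)` and the 0.1 neutral band are assumptions for illustration, not the project's actual definition.

```python
def polarity_index(positive: int, negative: int) -> float:
    """Normalised difference of tweet counts, in [-1, 1].

    Assumed formula (not from the source): (pos - neg) / (pos + neg).
    Returns 0.0 when there are no classified tweets.
    """
    total = positive + negative
    if total == 0:
        return 0.0
    return (positive - negative) / total


def classify_event(index: float, threshold: float = 0.1) -> str:
    """Map a polarity index to a label.

    The width of the neutral band (threshold) is a hypothetical choice.
    """
    if index > threshold:
        return "positive"
    if index < -threshold:
        return "negative"
    return "neutral"


# Example: 120 positive and 80 negative tweets on one day.
print(classify_event(polarity_index(positive=120, negative=80)))
```

A wider neutral band makes the labels more conservative; a narrower one reacts to smaller imbalances in tweet volume.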
Data Collection
In addition to data collected using Twitter API v2, we used snscrape, which can scrape tweets older than one week. The present dataset includes tweet data and daily stock price data, both from Dec 7, 2021 to April 30, 2022.
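As a rough illustration of how such a date window might be expressed, the sketch below builds a search string using Twitter's `since:` and `until:` search operators. The helper name `build_search_query` and the search term are hypothetical, and treating `until:` as exclusive is an assumption. A string of this form is the kind of query typically passed to snscrape's `TwitterSearchScraper`; the scraping call itself is left out here.

```python
from datetime import date, timedelta


def build_search_query(term: str, start: date, end: date) -> str:
    """Build a search string covering start..end inclusive.

    Assumption: the `until:` operator excludes the given day,
    so one day is added to keep `end` inside the window.
    """
    until = end + timedelta(days=1)
    return f"{term} since:{start.isoformat()} until:{until.isoformat()}"


# "EXAMPLESTOCK" is a placeholder term, not one of the project's tickers.
query = build_search_query("EXAMPLESTOCK", date(2021, 12, 7), date(2022, 4, 30))
print(query)
```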
Methodology
Results
We used a random forest model. The dataset is split 4:1 into training and test sets; we also experimented with 7:3 and 9:1 train:test ratios, and 4:1 gave the best results. Tweet sentiments are analysed and processed in order to relate the sentiment score to the price of a particular stock.
The model is able to mimic the general pattern but is unable to fit the exact price level. This means it can help analyse the general sentiment to get an indication of whether the stock price is likely to rise or fall.
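The difference between following the pattern and fitting the price level can be made concrete with two separate metrics. The sketch below is illustrative only: it assumes a chronological 4:1 split (the source does not say whether the split was random or chronological), and the function names and sample numbers are invented for the example.

```python
from typing import Sequence, Tuple


def chronological_split(rows: Sequence, train_parts: int = 4,
                        test_parts: int = 1) -> Tuple[list, list]:
    """Split time-ordered rows so the earliest 4/5 train and the rest test."""
    cut = len(rows) * train_parts // (train_parts + test_parts)
    return list(rows[:cut]), list(rows[cut:])


def _sign(x: float) -> int:
    return (x > 0) - (x < 0)


def directional_accuracy(actual: Sequence[float],
                         predicted: Sequence[float]) -> float:
    """Fraction of day-to-day moves whose direction the prediction matches."""
    if len(actual) != len(predicted) or len(actual) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    hits = sum(
        _sign(actual[i] - actual[i - 1]) == _sign(predicted[i] - predicted[i - 1])
        for i in range(1, len(actual))
    )
    return hits / (len(actual) - 1)


def mean_absolute_error(actual: Sequence[float],
                        predicted: Sequence[float]) -> float:
    """Average absolute gap between predicted and actual price levels."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two equal-length, non-empty series")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up prices: the prediction tracks every rise and fall but sits too low.
actual = [100, 102, 101, 105, 107]
predicted = [90, 93, 92, 94, 99]
print(directional_accuracy(actual, predicted))  # direction of every move matches
print(mean_absolute_error(actual, predicted))   # yet the level is off by several units
```

A model with high directional accuracy and a large absolute error matches the behaviour described above: useful for rise/fall signals, not for exact price targets.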
Cryptocurrencies
Data Collection
We used tweepy with Twitter API v2 to collect tweet volumes, and the CoinGecko API to collect price values for our listed crypto-coins. The data includes daily tweet volume from March 31, 2021 to April 15, 2022.
Methodology
The tweet data has been used to classify tweets into positive and negative categories. The volumes of negative and positive tweets help us determine a polarity index, through which we classify an event as negative, neutral or positive. To cover a broad section of the market, we have taken stocks from a range of sectors such as telecom and pharma. We then feed the polarity data and current prices into different machine learning models.
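One way to turn daily polarity and price series into model inputs is a sliding window of recent days paired with a rise/fall label for the following day. The source does not describe its feature construction, so the window length, the label definition and the name `make_windows` below are all assumptions.

```python
from typing import List, Sequence, Tuple

Window = List[Tuple[float, float]]


def make_windows(polarity: Sequence[float], prices: Sequence[float],
                 lookback: int = 3) -> List[Tuple[Window, int]]:
    """Build (window, target) samples from two aligned daily series.

    window: the last `lookback` days of (polarity, price) pairs.
    target: 1 if the next day's price is above the last price in the
            window, else 0 (an assumed rise/fall label).
    """
    if len(polarity) != len(prices):
        raise ValueError("series must be aligned day by day")
    samples = []
    for end in range(lookback, len(prices)):
        window = [(polarity[i], prices[i]) for i in range(end - lookback, end)]
        target = 1 if prices[end] > prices[end - 1] else 0
        samples.append((window, target))
    return samples


# Made-up values for five consecutive days.
samples = make_windows([0.1, -0.2, 0.3, 0.0, 0.4],
                       [10.0, 11.0, 10.5, 12.0, 11.5])
for window, target in samples:
    print(window, target)
```

Each window uses only days before the target day, so the label never leaks into the features.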
Results
Conclusion
We used sentiment analysis on tweets to forecast stock and crypto market movements. Our model is able to predict the general movement pattern in stocks but fails to fit exact price levels. This suggests that general sentiment analysis of tweets can forecast rises and falls in Indian stock markets. In contrast, we could not make such forecasts for cryptocurrencies, which implies that cryptocurrency market movements are only weakly linked to tweet sentiment. However, using certain variations of the LSTM model, we were able to predict rise-fall movements in cryptocurrencies. This is a topic for future research.
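How strongly sentiment and price movements are linked can be checked with a lagged Pearson correlation between daily polarity and the following day's return. The sketch below is a generic illustration with made-up numbers, not the analysis behind the conclusions above; the function names and the one-day lag are assumptions.

```python
from math import sqrt
from typing import List, Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient; 0.0 if either series is constant."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def daily_returns(prices: Sequence[float]) -> List[float]:
    """Simple returns; element i is the return from day i to day i + 1."""
    return [(prices[i] - prices[i - 1]) / prices[i - 1]
            for i in range(1, len(prices))]


def lagged_sentiment_correlation(polarity: Sequence[float],
                                 prices: Sequence[float],
                                 lag: int = 1) -> float:
    """Correlate polarity on day t with the return ending on day t + lag."""
    if lag < 1:
        raise ValueError("lag must be at least 1 day")
    returns = daily_returns(prices)
    pairs = [(polarity[t], returns[t + lag - 1])
             for t in range(len(polarity))
             if t + lag - 1 < len(returns)]
    return pearson([p for p, _ in pairs], [r for _, r in pairs])


# Made-up series in which rising polarity precedes rising returns.
r = lagged_sentiment_correlation([0.1, 0.2, 0.3, 0.4, 0.5],
                                 [100.0, 101.0, 103.0, 106.0, 110.0])
print(round(r, 3))
```

A value near zero on real data would be consistent with a weak link; a value near 1 or -1 would indicate a strong one.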
Future Work
Finally, it is worth mentioning that our analysis leaves many aspects out. Firstly, our dataset only covers English-speaking Twitter users and hence doesn’t fully capture the real public sentiment. A higher correlation might be found if the actual public mood were studied. It may be argued that people’s mood influences their investment decisions, hence the correlation. But people who invest in stocks are not necessarily the same people who use Twitter often. There may still be an indirect link, because people’s investment decisions may be affected by the moods of those around them, i.e. the general public sentiment. All of these remain areas for future research.
Check out our video presentation here: https://youtu.be/NGkEJDbIHtY
Team Members
Aman
Avi Kothari
Kshitij Mishra
Kushal Jain
Pranav Kannan
Pratham Gupta
Veeral Agarwal
Vaibhav Kashera
Yash Amin