How Reddit Took Over Wall Street

Can’t Stop Won’t Stop GameStop 🚀🚀🚀

Nadim Kawwa
DataDrivenInvestor


Photo by John Baker on Unsplash

On January 28th, 2021, the price of GameStop Corp. ($GME) shares shot up to an all-time high of $483, roughly a 100-fold increase over its price a year earlier. As news outlets scrambled for an explanation, a subreddit called r/WallStreetBets (WSB) was singled out as the driving force behind the surge.

If you go on WSB, you will mostly see a whole lot of rocket emojis (🚀🚀🚀🚀🚀), diamond hands (💠 🙌), paper hands (🧻 🙌), “Apes together strong 🦍”, talk of “tendies” 🐔, and “Still holding” with screenshots of portfolios deep in the red. If the slang on WSB seems confusing, Reuters has got you covered. To understand how Reddit itself works, refer to CNBC’s explanation.

In this article, my objective is to dissect WSB using data science. In particular, I aim to show the trends among the swarm of loosely coordinated individual retail investors who short squeezed GME. First, I demonstrate how to collect the data. Next, I present plots that offer snapshots of WSB. Finally, I use NLP to measure emotional engagement on the subreddit.

Setting up Parameters

On July 27th, 2020, YouTuber Roaring Kitty (AKA u/DeepFuckingValue) posted his thesis on why he’s bullish on GME. This is a good starting point for collecting data from WSB, as this is probably when the community first became aware of GME.

There are two kinds of content on Reddit: submissions and comments. A submission is an original post created by a user, whereas a comment is a reply posted in the thread under a submission.

For each day from July 1st, 2020 until February 2021, I collected the top 100 submissions and top 100 comments in WSB in 8-hour intervals. This can be done via Pushshift or, more simply, via PMAW.
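The 8-hour collection windows can be generated with the standard library alone. This is a minimal sketch, not the code used for the article: the function name `eight_hour_windows` is hypothetical, the end date of February 1st, 2021 is an assumption (the text only says "until February 2021"), and times are assumed to be in UTC.

```python
from datetime import datetime, timedelta, timezone


def eight_hour_windows(start, end, hours=8):
    """Yield (after, before) pairs of epoch seconds covering [start, end)."""
    step = timedelta(hours=hours)
    cursor = start
    while cursor < end:
        upper = min(cursor + step, end)
        yield int(cursor.timestamp()), int(upper.timestamp())
        cursor = upper


# Assumed date range: July 1st, 2020 up to (not including) February 1st, 2021, UTC.
start = datetime(2020, 7, 1, tzinfo=timezone.utc)
end = datetime(2021, 2, 1, tzinfo=timezone.utc)
windows = list(eight_hour_windows(start, end))
```

Each pair can then be used as the lower and upper time bound of one request.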

Creating a URL to capture that data is made easy thanks to Pushshift. The Python function below creates a URL that tells Pushshift to collect the top 1000 posts in a time frame.
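A URL builder along these lines might look like the sketch below. The function name `make_url` and its defaults are hypothetical, and the endpoint and query parameters (`subreddit`, `after`, `before`, `size`, `sort_type`, `sort`) reflect the public Pushshift search API as it was commonly used at the time; its availability and limits may have changed since. The `size` argument is left as a parameter, since the service may cap how many results a single request returns.

```python
from urllib.parse import urlencode

# Assumption: the public Pushshift search endpoint as commonly used in early 2021.
PUSHSHIFT_BASE = "https://api.pushshift.io/reddit/search"


def make_url(kind, after, before, subreddit="wallstreetbets", size=100):
    """Build a search URL for the highest-scoring posts in a time window.

    kind   -- "submission" or "comment"
    after  -- lower time bound, epoch seconds
    before -- upper time bound, epoch seconds
    """
    if kind not in ("submission", "comment"):
        raise ValueError("kind must be 'submission' or 'comment'")
    params = {
        "subreddit": subreddit,
        "after": after,
        "before": before,
        "size": size,
        "sort_type": "score",  # rank by score...
        "sort": "desc",        # ...highest first
    }
    return f"{PUSHSHIFT_BASE}/{kind}/?{urlencode(params)}"


# Example: the first 8 hours of July 1st, 2020 (UTC).
url = make_url("submission", 1593561600, 1593590400)
```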

From that URL I can build a feature space:

  • Time of creation
  • Submission title
  • Text content
  • Awards given if any
  • If it contains video and/or pictures
  • Number of comments
  • Flair (Meme, YOLO, Loss)
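The features listed above can be pulled out of each returned submission record. The sketch below assumes the record is a dictionary using Reddit's usual submission field names (`created_utc`, `title`, `selftext`, `total_awards_received`, `is_video`, `post_hint`, `num_comments`, `link_flair_text`); the function name `extract_features` and the sample record are made up for illustration.

```python
from datetime import datetime, timezone


def extract_features(post):
    """Map one submission record (a dict) to the features of interest."""
    return {
        "created": datetime.fromtimestamp(post["created_utc"], tz=timezone.utc),
        "title": post.get("title", ""),
        "text": post.get("selftext", ""),
        "awards": post.get("total_awards_received", 0),
        "has_video": bool(post.get("is_video", False)),
        # Assumption: image posts are flagged with post_hint == "image".
        "has_image": post.get("post_hint") == "image",
        "num_comments": post.get("num_comments", 0),
        "flair": post.get("link_flair_text"),
    }


# Made-up sample record, for illustration only.
sample = {
    "created_utc": 1593561600,
    "title": "GME YOLO update",
    "selftext": "Still holding",
    "total_awards_received": 3,
    "is_video": False,
    "post_hint": "image",
    "num_comments": 42,
    "link_flair_text": "YOLO",
}
features = extract_features(sample)
```

Missing fields fall back to neutral defaults, so incomplete records do not raise errors.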

Visualizing a Subreddit

As WSB went mainstream, users flocked to join the subreddit. What does going viral look like? The plot below shows the growth of the subreddit from July 2020 to January 2021, which resembles hockey-stick growth. For reference, all plots include the high and low price of GME stock on each day.

Source: The Author

Emojis are used everywhere on the internet and WSB is no exception. The plot below shows the cumulative use of the top 10 emojis. Unsurprisingly, the rocket emoji shot up in early November and has gone up since, despite GME tanking later in January.
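A cumulative emoji count of this kind can be computed as in the sketch below. It tracks a single emoji rather than the top 10, the function names are hypothetical, and the input is assumed to be pairs of an ISO date string and the post text.

```python
from collections import Counter
from itertools import accumulate


def daily_emoji_counts(posts, emoji="🚀"):
    """Count occurrences of one emoji per day.

    posts -- iterable of (iso_date_string, text) pairs
    Returns a list of (date, count) sorted by date.
    """
    per_day = Counter()
    for day, text in posts:
        per_day[day] += text.count(emoji)
    return sorted(per_day.items())  # ISO date strings sort chronologically


def cumulative(counts):
    """Turn per-day counts into running totals."""
    days = [day for day, _ in counts]
    totals = accumulate(count for _, count in counts)
    return list(zip(days, totals))


# Made-up sample posts, for illustration only.
posts = [
    ("2021-01-26", "GME 🚀🚀"),
    ("2021-01-27", "hold 🚀"),
    ("2021-01-26", "no emoji here"),
]
running = cumulative(daily_emoji_counts(posts))
```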

Source: The Author

Moreover, in early January 2021, it seemed like GME was the one and only topic on WSB. The plot below shows how it came to dominate the conversation. The stocks of interest are based on a perusal of WSB forums and the stocks associated with them. For example, Tesla and Palantir are two other WSB favorites when it comes to making bullish bets.
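One way to measure how much a ticker dominates the conversation is to count the posts that mention it. The sketch below is an assumption about how this could be done, not the article's code: the ticker list is limited to GME, TSLA, and PLTR, matching is case-sensitive, and each ticker is counted at most once per post.

```python
import re
from collections import Counter

TICKERS = ("GME", "TSLA", "PLTR")

# Match a ticker with an optional leading "$", but not inside a longer word.
PATTERN = re.compile(r"(?<![A-Za-z])\$?(" + "|".join(TICKERS) + r")(?![A-Za-z])")


def ticker_share(texts):
    """Return each ticker's share of all ticker mentions across the texts."""
    counts = Counter()
    for text in texts:
        counts.update(set(PATTERN.findall(text)))  # once per post
    total = sum(counts.values())
    return {t: (counts[t] / total if total else 0.0) for t in TICKERS}


# Made-up sample posts, for illustration only.
texts = ["GME to the moon", "$GME and TSLA", "PLTR calls", "GAMES are fun"]
share = ticker_share(texts)
```

Grouping the texts by day before calling `ticker_share` gives a daily series that can be plotted.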

Source: The Author

In the future though, tasks like these might become harder because WSB users will try to throw you off with random tickers.

Online Sentiment & Stock Price

The prevailing assumption is that we live in an efficient market where assets are priced based on a logical assessment of business fundamentals. However, behavioral economists such as Daniel Kahneman have suggested that trades are driven by emotions such as overconfidence, fads, and arrogance.

In this section, I go through every top comment and measure the sentiment with the help of VADER (not the Sith Lord). Bear in mind that this sentiment analysis is limited by a positive/negative scale and might not capture things like humor and sarcasm, let alone understand “GME GO BRRRRRRR 🚀”.
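Scoring comments with VADER can be sketched as follows. This assumes the `vaderSentiment` package, whose `SentimentIntensityAnalyzer.polarity_scores` returns a dictionary that includes a `compound` score between -1 and 1. The helper names are hypothetical, and the ±0.05 cutoffs are the conventional thresholds suggested in VADER's documentation, not necessarily the ones used for this article.

```python
def load_vader():
    """Create a VADER analyzer (imported lazily; needs `pip install vaderSentiment`)."""
    from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
    return SentimentIntensityAnalyzer()


def compound_scores(comments, analyzer):
    """Return the compound sentiment score for each comment."""
    return [analyzer.polarity_scores(text)["compound"] for text in comments]


def label(compound, threshold=0.05):
    """Bucket a compound score into positive, negative, or neutral."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"
```

With the package installed, `compound_scores(comments, load_vader())` yields one score per comment, which `label` can then bucket.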

I wanted to capture the effect of a post that has attracted thousands of comments while preventing it from dominating the conversation. Moreover, a post's impact depends on the amount of feedback it receives from the community. These considerations yield the function below, which is applied as a multiplier on the sentiment score returned by VADER:

Source: The Author

My reasoning is that the upvote ratio is always between 0 and 1, and a post can only trend for up to 3–4 days before Reddit removes it from the trending page. However, the number of comments and awards received can range from none to tens of thousands. Therefore, the number of comments and the number of awards are better expressed on a logarithmic scale.
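To make the reasoning concrete, here is one possible multiplier built from those ingredients. It is purely illustrative and is not the function used in this article: the name `engagement_weight` and the exact form, the upvote ratio times log-damped comment and award counts, are assumptions.

```python
import math


def engagement_weight(upvote_ratio, num_comments, num_awards):
    """Hypothetical engagement multiplier for a sentiment score.

    The upvote ratio (0 to 1) enters linearly; comment and award counts
    enter through log10 so that a post with thousands of comments counts
    for more, but not thousands of times more.
    """
    comment_term = 1 + math.log10(1 + num_comments)
    award_term = 1 + math.log10(1 + num_awards)
    return upvote_ratio * comment_term * award_term


def weighted_sentiment(compound, upvote_ratio, num_comments, num_awards):
    """Scale a VADER compound score by the engagement weight."""
    return compound * engagement_weight(upvote_ratio, num_comments, num_awards)
```

Under this form, a post with no comments or awards keeps a weight equal to its upvote ratio, while going from 10 to 10,000 comments raises the weight by a factor of about 2.4 rather than 1,000.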

The plot below shows the 7-day rolling average of sentiment.
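A rolling average over daily values can be computed as in this minimal sketch. The function name `rolling_mean` is hypothetical, and the choice to average over however many days are available at the start of the series, instead of leaving those entries empty, is an assumption.

```python
from collections import deque


def rolling_mean(values, window=7):
    """Return the trailing mean of the last `window` values at each position.

    The first window - 1 entries average over the values seen so far.
    """
    buf = deque(maxlen=window)
    total = 0.0
    out = []
    for value in values:
        if len(buf) == window:
            total -= buf[0]  # the oldest value is about to be dropped
        buf.append(value)
        total += value
        out.append(total / len(buf))
    return out


# Small example with a window of 2.
smoothed = rolling_mean([1, 2, 3, 4], window=2)
```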

Source: The Author

Zooming in on January 2021, there appears to be no significant uptick in “positive” sentiment. Moreover, the stock price of GME does not seem to be correlated with sentiment. At this point, surprisingly, rocket emojis are your best indicator of market movement.

Even though VADER is specifically attuned to sentiments expressed in social media, it still falls short. It shows that when it comes to social media text, spending some time to read the entries is your best bet to understand overall sentiment.

Predicting The Future

It’s tempting to try and fit a regression model to predict some future astronomical rise of an asset. In this post, I abstain from doing this based on my personal conviction that stocks, in particular ‘stonks’ which only go up, are too complex to predict by some mathematical model.

What WSB Means for Investing

The GME phenomenon is a watershed moment in financial markets, as individual traders realized that they can organize and swarm a stock. In a way, retail investors on Reddit acted like a decentralized hedge fund and bet against established institutions.

WSB users made their case for GME and other stocks over the years. Some offered insightful deep dives, while others won us over with “I like the stock”. In the future, would you take your investing advice from Reddit?

Before You Go

To see how the project was coded, please refer to the GitHub repository. Have you also worked with Reddit’s API and got insights to share? Please let me know!

Disclaimer

PS: Nothing in this post should be taken as financial advice. I am not making any financial recommendations. You shouldn’t listen to what I have to say about stocks and/or stonks. I don’t own any shares in GME.
