What is up, everybody? We've got the next video in our stock trading series using data science techniques, and today I'm really excited to try out the hidden Markov model (HMM) for stock trading. We're going to be putting a thousand real-life dollars on the line today and seeing if the HMM can give us good returns compared to other strategies we might use.

First, let's explain how this is going to work. Probably the most important question you have is: what the heck are the hidden states going to be in our hidden Markov model when it comes to stock trading? One thing we know is that the stock market is highly driven by people's sentiment about a given stock. If I feel very positive about Tesla's stock today, and many other people do as well, then the stock price will probably go up. If many of us feel very negative about Tesla stock today, then the price will probably go down. If many people just feel kind of "meh", neutral, about Tesla stock today, then the stock price probably won't move.

Now, in reality there are many, many other factors affecting the stock price, but to keep our hidden Markov model simple today and not add too many moving parts, we're going to assume that sentiment is the only thing that affects the stock price and see how much mileage we can get out of that.

Now, there's only one problem with using sentiment to do stock return forecasting, and that is that sentiment is hidden. It's not like I can really go around polling everybody and asking, "What is your sentiment about Tesla? What is your sentiment about Microsoft stock today?" There might be
proxies for that online, maybe there are polls, but in general it's this kind of hidden variable, and that seems like the perfect place to use a hidden Markov model, with sentiment as a latent feature that affects the observed feature: the actual stock's returns.

So we're going to make our first assumption here, which is that sentiment is hidden but that it influences the return of the stock on any given day. How that's going to manifest in our model is this: we're going to assume that if the sentiment of a given stock is positive today, then the return distribution is going to be this green one here, which has a mean above zero. If the sentiment on a given day is negative, the return distribution is going to be this red one here, centered below zero. And if the sentiment is neutral, then we're going to have this gray distribution here, which is centered pretty much at zero. One thing I did try to capture here is that the standard deviations, or volatilities, of these can be very different, and that's going to be a factor that is baked into the model, as we're going to see.

So our first assumption is that, given the sentiment (and we're going to assume there are three classes today), there's going to be some normal distribution of returns, and the mean and standard deviation of that normal distribution are determined by that sentiment.

The other thing we're going to assume, and it's an assumption in any hidden Markov model, is that one day's hidden state gives us a probability distribution over the next day's hidden state. So we're going to assume that
today's sentiment affects tomorrow's, or more precisely, that knowing today's sentiment gives us exactly the probability distribution for tomorrow's sentiment. So if I know that people were happy about Tesla stock today, then that's going to pin down the probabilities that people are going to be happy, neutral, or sad about Tesla stock tomorrow.

In a little picture, we're basically looking at this. Each of these boxes represents a day. Let's say people were happy about Tesla stock today; then that's going to affect whether they're happy about Tesla stock tomorrow, which is going to affect whether they're sad about Tesla stock the next day. Also, for each day, given we know the sentiment, our first assumption says that that's going to pin down the mean and variance of the normal distribution from which we draw the return of Tesla stock that day.

In the language of hidden Markov models, these links between hidden states are called the transition probabilities, and these vertical arrows here are called the emission probabilities. I don't have enough room to write it out, so we'll write "emission probs".

One cool thing to do as well is think about how many parameters are going to be in this hidden Markov model. Well, we have three possibilities for the hidden state: happy, sad, and neutral. So there's going to be some probability of starting our Markov chain on the very first (zeroth) day in each of these, and because those three starting probabilities need to add up to one, that's just two free parameters: once we fix two of them, the third one has to be one minus the sum of those.
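To make the picture described so far concrete, here is a minimal Python sketch of the three ingredients: a starting distribution over sentiments, a transition table from one day's sentiment to the next, and a normal return distribution per sentiment. Every number in it is a made-up placeholder, and the names `START_PROBS`, `TRANSITION`, `EMISSION`, and `simulate` are hypothetical; nothing here is a fitted value from the video.

```python
import random

# The three hidden sentiment states described in the video.
STATES = ["positive", "neutral", "negative"]

# ASSUMPTION: all numbers below are illustrative placeholders, not fitted values.
# Probability of each sentiment on the very first (zeroth) day; sums to one.
START_PROBS = {"positive": 0.3, "neutral": 0.4, "negative": 0.3}

# Transition probabilities: outer key is today's sentiment, inner key is
# tomorrow's sentiment. Each row sums to one.
TRANSITION = {
    "positive": {"positive": 0.6, "neutral": 0.3, "negative": 0.1},
    "neutral": {"positive": 0.2, "neutral": 0.6, "negative": 0.2},
    "negative": {"positive": 0.1, "neutral": 0.3, "negative": 0.6},
}

# Emission distributions: (mean, standard deviation) of the daily return
# for each sentiment.
EMISSION = {
    "positive": (0.01, 0.02),
    "neutral": (0.0, 0.01),
    "negative": (-0.01, 0.03),
}


def simulate(n_days, seed=0):
    """Sample a hidden sentiment path and the daily return each day emits."""
    rng = random.Random(seed)
    path = []
    state = None
    for _ in range(n_days):
        # Day zero uses the start distribution; later days use the row of
        # the transition table for the previous day's sentiment.
        probs = START_PROBS if state is None else TRANSITION[state]
        state = rng.choices(STATES, weights=[probs[s] for s in STATES])[0]
        mean, std = EMISSION[state]
        path.append((state, rng.gauss(mean, std)))
    return path


for sentiment, daily_return in simulate(5):
    print(f"{sentiment:>8}  return = {daily_return:+.4f}")
```

In real data only the returns would be visible. The sentiment column is exactly the part that stays hidden and has to be inferred.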
So that's what this first one is telling us: there are going to be two parameters telling us how the Markov chain starts. Then what about transition probabilities, how many of them are there? There are three hidden states you could have for your last day and three hidden states you can have for your next day, so the transition matrix, as we're going to see in a second, is going to be 3×3. But again, it's the same thing: there are only going to be six free parameters, because for any given day, once we know the probability of going from happy to happy and happy to sad, we already know the probability of going from happy to neutral, because those need to add up to one. So that's going to give us 3² − 3 = 6 more parameters.

The last ones are what dictate our emission probabilities. For each of these hidden states (happy, sad, neutral) we have a normal distribution that we assume we draw that day's return from, and each normal distribution is pinned down by, of course, a mean and a standard deviation, so that's going to be 3 × 2 = 6 extra parameters. So in total we're going to have 2 + 6 + 6 = 14 parameters in our model.

The reason I bring up those 14 parameters is to start talking about the pros and cons of using a hidden Markov model for stock return prediction versus, you know, literally anything else we could use. One of the biggest pros, not to be overlooked, is interpretability. If this model does do really well, then not only does it give us good returns, it also gives us a framework by which
to understand how the stock market works. If this model does well, then it basically says that, hey, maybe today's sentiment does affect tomorrow's sentiment, and I have the exact transition matrix, as we'll see in a second, that lets me predict that. Also, it's a rather simple model: it only has 14 parameters, whereas something more complicated like a recurrent neural network, or even more complicated models, is going to have an order of magnitude more parameters. By having fewer parameters, this hidden Markov model is less prone to overfitting: it's less prone to learning the training data exactly, and therefore it can potentially generalize better to unseen data.

Now, the con is that it's a very rigid architecture. There are a lot of assumptions on the previous page that you're probably scratching your head at. Is yesterday's sentiment all that affects today's sentiment? Definitely not. Is today's sentiment the only thing that affects the return of the stock? Definitely not. So it's a very rigid architecture, but that's the price we have to pay if we want something interpretable and something that has a lower risk of overfitting.

So now let's get to the part that you probably came here for: looking at some actual results and seeing how the strategy does overall, especially stacked up against something like a recurrent neural network strategy, which does something similar but in a more complicated, more flexible way. Before getting to those overall results, I want to use the interpretability we get from this model to show the transition matrix for the Tesla stock. So here's the transition matrix, and it's really
cool, because on the rows we have the previous day's hidden states and on the columns we have the next day's hidden states. So this is saying that if people's sentiment about the stock was positive yesterday, then there's a 3% chance of it being positive today, a 28% chance of it being neutral today, and a 69% chance of it being negative today. So positive is most likely followed by negative. Negative is most likely followed by neutral, and neutral is most likely followed by positive. So we have this kind of cool repeating pattern if we look just at the maximum probabilities. That is to say (let me use purple for this): if it's positive, the most likely next state is negative; if it's negative, the most likely next state is neutral; and if it's neutral, the most likely next state is positive. So we have this positive goes to negative goes to neutral goes to positive goes to negative kind of repeating pattern going on here. That tells us all about the transition probabilities the model learned for
Tesla stock. By the way, this model was trained on data from March 2023 to March 2024.

Now we can look at the emission probabilities. They say that if the sentiment was positive for Tesla stock on any given day in that range, then the distribution of returns, as we expected, is centered at something slightly positive, but it has a pretty big standard deviation. That's what you would expect, because this is a stock return problem after all, so of course there's going to be a lot of volatility, probably because we didn't capture a lot of the dynamics and this model has a very rigid architecture. So that does make sense. If the sentiment was neutral, then as we see, we center pretty much on zero, again with a large standard deviation. If the sentiment was negative, we center on something less than zero, and we have a pretty big standard deviation here too.

I really love the hidden Markov model just because it is so interpretable. With just these two little visualizations, we have a framework (probably not a correct framework, but some framework) to
go off of for how the stock market worked for Tesla's stock over the last year.

Now we're going to get to the results, but really quickly: how do we actually use this model? Let's say we're deciding whether or not we want to buy Tesla stock. How do we predict the return of Tesla stock tomorrow using this model? Well, we look at what the sentiment was, according to this model, yesterday. Let's say the sentiment was neutral yesterday. Then we use the transition matrix to come up with the most likely sentiment tomorrow, which is going to be positive. Then, given we think the sentiment is going to be positive tomorrow, we look at the corresponding normal distribution and take a draw from it, which could totally be negative but is more likely to be positive. So let's say we take a draw right here; that's going to be our expected return for Tesla stock tomorrow. We do the same thing for all the other stocks in the S&P 500: we go through the exact same process, then we rank all those expected returns for tomorrow in descending order and pick the top 10. We take our $1,000, put $100 into each of these top 10 stocks, see what happens with those stocks in a day, and calculate the returns. I'm going to show you those now.

First, here's a baseline strategy, which is called the random strategy. This means I just pick 10 stocks at random from the S&P 500. If I did that, the return over the next day would have been 0.02%, so slightly positive, but as
we expected, nothing to write home about: pretty much zero. So with the hidden Markov model, the one where we actually invested our $1,000, what was the return? Here's the part we all care about (drumroll, please): the return is 0.2%. So it's 10 times higher than the random strategy, so we're doing something right, and overall we made a solid $2. So huzzah, we can buy a quarter of a coffee or something with that.

The other thing we want to look at is what we would have gotten with the recurrent neural network model, which is a more complex architecture. If we ran the same experiment using a recurrent neural network model instead, over the exact same time period, then we would have gotten a return of 0.4%, or double what we got with the hidden Markov model.

Of course, there's a big grain of salt, many grains of salt, to be taken with all this. These are just the returns from one day, and the stock market is notoriously difficult to predict, so who knows what these numbers would look like if I did this experiment one day earlier or one day later. So I'm going to put asterisks everywhere, just to say that this doesn't mean the recurrent neural network model is always going to do twice as well as the HMM. But on this given day, it probably would have been better to use the recurrent neural network model instead of the HMM. Still, it was really cool to see that we did get some kind of positive return that was better than random using
the hidden Markov model. So hopefully you enjoyed this video. If you have any questions, comments, or ideas for more stock trading strategies we can try, they're always welcome in the comment section below. If you liked this video, please like and subscribe for more videos just like this one, and I'll see all you wonderful people next time.
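As a recap, the parameter count and the prediction procedure described in this video can be sketched in a few lines of Python. This is a rough sketch under stated assumptions, not the code used for the experiment: the function names are hypothetical, only the "positive" row of the transition table (3%, 28%, 69%) comes from the video, and the other rows, the emission means and standard deviations, and the ticker names are placeholders chosen only to match the positive → negative → neutral → positive pattern described above.

```python
import random


def count_free_parameters(n_states):
    """Start probs (K - 1), transition rows K * (K - 1), mean and std per state (2K)."""
    return (n_states - 1) + n_states * (n_states - 1) + 2 * n_states


def predict_return(last_state, transition, emission, rng):
    """Take the most likely next sentiment, then draw one return from its normal."""
    row = transition[last_state]
    next_state = max(row, key=row.get)
    mean, std = emission[next_state]
    return next_state, rng.gauss(mean, std)


def allocate_top_k(predicted, budget=1000.0, k=10):
    """Rank tickers by predicted return (descending) and split the budget evenly."""
    ranked = sorted(predicted, key=predicted.get, reverse=True)[:k]
    if not ranked:
        return {}
    return {ticker: budget / len(ranked) for ticker in ranked}


# Rows are the previous day's sentiment, inner keys the next day's sentiment.
tesla_transition = {
    "positive": {"positive": 0.03, "neutral": 0.28, "negative": 0.69},  # from the video
    "neutral": {"positive": 0.60, "neutral": 0.25, "negative": 0.15},   # ASSUMPTION: placeholder
    "negative": {"positive": 0.20, "neutral": 0.55, "negative": 0.25},  # ASSUMPTION: placeholder
}

# ASSUMPTION: placeholder (mean, std) pairs; the video shows these only as plots.
tesla_emission = {
    "positive": (0.005, 0.03),
    "neutral": (0.0, 0.03),
    "negative": (-0.005, 0.03),
}

rng = random.Random(0)
print(count_free_parameters(3))  # 14

next_state, draw = predict_return("neutral", tesla_transition, tesla_emission, rng)
print(next_state)  # positive

# ASSUMPTION: 20 made-up tickers that all reuse the placeholder parameters above;
# in the video each S&P 500 stock would have its own fitted model.
predicted = {
    f"STOCK_{i:02d}": predict_return("neutral", tesla_transition, tesla_emission, rng)[1]
    for i in range(20)
}
allocation = allocate_top_k(predicted, budget=1000.0, k=10)
print(allocation)
```

Because the prediction is a single random draw rather than the mean of the distribution, rerunning with a different seed can reorder the ranking, which is one more reason to treat a one-day result with caution.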