What is Recurrent Neural Network in Deep Learning? | RNN


Learn With Jay

In this video, we will understand what a Recurrent Neural Network is in Deep Learning. Recurrent Neura...

Video Transcript:

"Hey Siri, tell me a joke." "If someone could just reverse the process of making wine, that would be great." Yeah, you must have used Siri, Google Assistant, or Alexa in your life. But have you ever wondered how these are built, and what deep learning model they actually use? Or you might have come across Grammarly suggestions or Google autocomplete. If you have, then this series is going to be a great one, because from this video I'm starting a new playlist on recurrent neural networks. Now, this is the model in deep learning that is used for natural

language processing, for creating applications like virtual assistants, Google autocomplete, or language translation, for example converting a Spanish sentence into an English sentence. Applications like these have been built using recurrent neural networks. In this series we will cover the mathematical details behind this model, and at the end, once we understand everything, we will implement it ourselves. Once you understand how this works and you're comfortable building an application yourself, you will feel like a door has opened in front of you, where you can create many of the interesting applications that you have used yourself. So

this series is going to be a very interesting one. If you are new to this channel, subscribe, hit the like button, and also the bell icon so that you get notified whenever I upload a new video. Let's not wait further, and let's first see what an RNN is and why we actually need it. Now, let us first see why we need a recurrent neural network and why we can't just use a simple artificial neural network for these tasks. Let's say we want to build a language translation model whose job is to convert

an English sentence into a Spanish sentence. Now, here we cannot use an artificial neural network because, if you remember, an artificial neural network has a fixed number of neurons in the input and the output, so we cannot feed variable-length input data to a model which accepts fixed-length input. Now, here you might say, "Hey Jay, I have a solution to this: why don't we convert all the sentences into fixed-length sentences by padding them with zeros?" And I might say that this might work, but there is another problem with this. Now, here the arrangement of the

words also matters, that is, the sequence in which the words appear matters. For example, if you see these two sentences, both have the same words, but they appear in a different order, and thus their translations are different. And for a simple neural network, the order in which we feed the input does not matter. For example, if we are building a house price prediction application and we are feeding in data like square-foot area, number of bedrooms, garage size, etc., then the order in which we feed this data does not matter: it will give the same

exact output even if we interchange the input order, provided the order is kept consistent between training and prediction. But here, in natural language processing tasks, the order does matter. Because of these two problems, we cannot use an artificial neural network for natural language processing tasks, so people have come up with the recurrent neural network. The model structure of a recurrent neural network looks something like this. Here the word "recurrent" means occurring repeatedly, which means that the neural network occurs repeatedly through time. So in this diagram, this part is a neural network. Now, we do not pass the entire sentence to this, but we pass one

word at a time. Now, this word will give us one output, and it will also produce an activation which will be passed to the next time step, as you can see in this diagram. Both these diagrams represent the same thing, but this one is unrolled through time. So for example, this is time equal to 1, this is time equal to 2, this is time equal to 3, and so on. This is the same neural network at every step. We will be able to understand this better with the help of an example. Let's say we are making a named

entity recognition model. Now, the job of this model is to identify the entities that occur in the given sentence. For example, this is a person, this is an organization, and San Francisco is a location. These kinds of applications are very useful. For example, let's say we are interacting with a chatbot and we pass this message: "Hey, show me the best places to visit in Mumbai during January 2022." The chatbot must know that Mumbai is a location and January 2022 is a date. So let's say we are making this application. Now,

instead of passing this entire sentence, we will only pass the first word to this RNN. Now, this RNN will produce one output, and it will also produce an output activation, and it will also take an input activation. Let's say in the beginning we are passing a matrix of zeros as the activation. Now, this is time step equal to 1, and at the next time step we will pass the word "worked" into this network. Now, this will take the word "worked" and the previous activation as input, and it will produce an activation, which will be slightly different from this activation,

and an output. Here 0 means that it is not an entity. Then at the next time step it will take the third word, and it will continue repeating the process until we reach the end of the sentence. So a few things to note here: this is the same block that is repeated through time, which means that this is time equal to 1, this is time equal to 2, time equal to 3, and so on. And for every word it gives an output saying whether it is an entity or not. This way we

can pass a sentence of any length, and here the order in which we pass the words does matter. For example, here the prediction "organization" is made not just considering this word "Google" but also considering all the words that have occurred in the sentence so far. Similarly, the prediction of the location is made based on all the words that have occurred so far, because we are passing the activations from every time step forward into the network. So if we use an architecture like this, we can build many natural language processing applications. Now, here we only discussed an

application which has the same output length as the input length, which means that the output size was the same as the input size. But what if we have a different output size? We will discuss what the model looks like in that case in the next video, where we will be studying the different types of recurrent neural networks. So I hope you liked this video. If so, then hit the like button, and let's jump right on to the next video. You can find the link to the next video by clicking somewhere on the left or right side of this

video, and I'll see you in the next one.
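Editor's sketch (not part of the video): the recurrence described in the transcript, where the same block is reused at every time step and passes its activation forward, might look roughly like the following in plain Python. The video does not give formulas, so several things here are assumptions: the `tanh` activation, the omission of bias terms, the weight names `W_xh`, `W_hh`, and `W_hy`, the helper functions, and the invented toy sentence. The weights are random and untrained, so the outputs are only raw scores, not real entity labels.

```python
import math
import random


def make_params(input_size, hidden_size, output_size, seed=0):
    """Create random, untrained weights for a toy RNN cell (names are hypothetical)."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {
        "W_xh": matrix(hidden_size, input_size),   # input word -> activation
        "W_hh": matrix(hidden_size, hidden_size),  # previous activation -> activation
        "W_hy": matrix(output_size, hidden_size),  # activation -> per-word output
    }


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def rnn_forward(params, inputs):
    """Apply the same cell to each input vector in turn, for any sequence length."""
    hidden = [0.0] * len(params["W_hh"])  # initial activation: all zeros
    outputs = []
    for x in inputs:  # one word per time step
        from_input = matvec(params["W_xh"], x)
        from_past = matvec(params["W_hh"], hidden)
        # Assumed activation function: tanh. Biases are left out for brevity.
        hidden = [math.tanh(a + b) for a, b in zip(from_input, from_past)]
        outputs.append(matvec(params["W_hy"], hidden))  # one output per word
    return outputs, hidden


# Invented toy vocabulary and sentence, encoded as one-hot vectors.
VOCAB = ["alice", "worked", "at", "google"]


def one_hot(word):
    return [1.0 if word == v else 0.0 for v in VOCAB]


params = make_params(input_size=len(VOCAB), hidden_size=3, output_size=2)
sentence = ["alice", "worked", "at", "google"]
shuffled = ["google", "worked", "at", "alice"]

outputs, final_hidden = rnn_forward(params, [one_hot(w) for w in sentence])
short_outputs, _ = rnn_forward(params, [one_hot(w) for w in sentence[:2]])
_, shuffled_hidden = rnn_forward(params, [one_hot(w) for w in shuffled])
```

Under these assumptions, the sketch shows the three points made in the video. The same function accepts a four-word and a two-word sentence. Each word gets its own output, and that output depends only on the words seen so far (`outputs[:2]` matches `short_outputs`). Feeding the same words in a different order leads to a different final activation.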