FORMATION DEEP LEARNING COMPLETE (2021)

Machine Learnia
This Deep Learning course will teach you how to develop artificial neural networks...
Video Transcript:
This is an artificial neural network. One of the most sophisticated artificial intelligence algorithms in the world. Originally inspired by the functioning of biological neurons, this algorithm is able to learn to perform any task.
Driving a car, playing chess, carrying on a conversation, or recognizing and classifying images such as the numbers you see on the screen right now. In this series of videos, I will show you how to create these kinds of algorithms yourself, by explaining to you in a simple and playful way all the equations and mathematical concepts behind artificial intelligence. I will teach you how to program neural networks entirely by hand with Numpy, but also with the TensorFlow and Keras libraries, so that at the end of these videos, you will be able to create for yourself a computer vision application like the one you see here.
My name is Guillaume Saint-Cirgue, I am a Data Scientist in England, and I welcome you to this free training on Deep Learning! Okay, in this first video, we're going to start off easy by seeing together what Deep Learning is, what its place is in the world of artificial intelligence, and how artificial neural networks work. But first, there may be some novices watching this video, so I propose to quickly review the basics of Machine Learning, just to put everyone on the same level.
Machine learning is a field of artificial intelligence which consists of programming a machine to learn how to perform tasks by studying examples of these tasks. From a mathematical point of view, these examples are represented by data that the machine uses to develop a model, for example a function of the type f(x) = ax + b. The goal of the game in machine learning is to find the parameters a and b that give the best possible model, that is, the model that best fits our data.
For this, we program into the machine an optimization algorithm that will test different values of a and b until we obtain the combination that minimizes the distance between the model and the points. And that's it! That's machine learning.
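To make this concrete, here is a minimal sketch of this idea in Numpy (a toy illustration of my own, with a few made-up points and a squared-error criterion; the real thing will be covered properly later in the training):

import numpy as np

# A few made-up points (x, y) that roughly follow a straight line
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 7.1, 8.8])

a, b = 0.0, 0.0    # parameters of the model f(x) = a*x + b
alpha = 0.01       # learning rate: the size of each correction

for _ in range(5000):
    error = (a * x + b) - y                # distance between the model and the points
    a -= alpha * 2 * np.mean(error * x)    # adjust a in the direction that reduces the error
    b -= alpha * 2 * np.mean(error)        # adjust b in the same way

print(a, b)   # ends up close to the slope and intercept that best fit the points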
It's about developing a model using an optimization algorithm to minimize the errors between the model and our data. That's it! And there are lots of models: like linear models, decision trees, or Support Vector Machines.
Each comes with its own optimization algorithm: gradient descent for linear models, the CART algorithm for decision trees, or maximum margin for Support Vector Machines. Now, what about deep learning? Well, Deep Learning is a domain of Machine Learning in which, instead of developing one of the models we just mentioned, we develop what we call artificial neural networks.
So the principle remains exactly the same: we provide the machine with data, and it uses an optimization algorithm to adjust the model to this data. But this time, our model is not a simple function of the type f(x) = ax + b, but rather a network of interconnected functions... a neural network.
We'll see in a moment how these networks are built and how they work, but what we need to know for now is that the deeper these networks are, the more functions they contain inside, and the more the machine is able to learn to do complex tasks, like recognizing objects, identifying a person in a picture, driving a car, all that kind of stuff...
So that's why we talk about Deep Learning, that is, "deep" learning, when we develop artificial neural networks. So, remember that Deep Learning is actually a field of Machine Learning, which is therefore based on the same foundations as Machine Learning, and that Machine Learning is itself a domain of artificial intelligence. That's it!
Now that we've got the basics down, let's start our journey into the world of artificial neural networks. To understand how artificial neural networks work, I would like to go back to their origins. I'd like to tell you how they were invented and how they evolved over time into the technology we know today.
This will allow you to better understand and remember how they work, but also to enrich your general culture with some interesting anecdotes about artificial intelligence. So the first neural networks were invented in 1943 by two mathematicians and neuroscientists named Warren McCulloch and Walter Pitts. In their scientific paper entitled "A Logical Calculus of the Ideas Immanent in Nervous Activity", they explain how they were able to program artificial neurons inspired by the functioning of biological neurons.
Remember, in biology, neurons are excitable cells connected to each other, and their role is to transmit information in our nervous system. Each neuron is composed of several dendrites, a cell body, and an axon. Dendrites are the gateways to a neuron.
It is at this point, at the synapse, that the neuron receives signals from the preceding neurons. These signals can be excitatory or inhibitory (a bit like having some signals worth +1 and others worth -1).
When the sum of these signals exceeds a certain threshold, the neuron activates and produces an electrical signal. This signal travels along the axon to the nerve endings and is sent to other neurons in our nervous system... neurons that will work in exactly the same way! This is roughly how biological neurons work.
What Warren McCulloch and Walter Pitts tried to do is to model this behaviour, by considering that a neuron could be represented by a transfer function, which takes several signals x as input and returns an output y. Inside this function, there are 2 main steps. The first one is an aggregation step.
We take the sum of all the inputs of the neuron, multiplying each input by a coefficient w. This coefficient represents the synaptic activity, i.e. whether the signal is excitatory, in which case w is +1, or inhibitory, in which case it is -1. In this aggregation phase, we obtain an expression of the form w1 x1 + w2 x2 + w3 x3 + ... Once this step has been completed, we move on to the activation phase.
We look at the result of the previous calculation, and if it exceeds a certain threshold, usually 0, then the neuron activates and returns an output y = 1. Otherwise, it remains at 0. This is how Warren McCulloch and Walter Pitts managed to develop the first artificial neuron, later renamed the "Threshold Logic Unit".
This name comes from the fact that their model was originally designed only to process logical inputs that are either 0 or 1. They were able to demonstrate that with this model, it was possible to reproduce certain logic functions, such as the AND gate and the OR gate. They also showed that by connecting several of these functions to each other, a little bit like the neurons in our brain, it would be possible to solve any Boolean logic problem.
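To give a rough idea, here is a minimal sketch of such a Threshold Logic Unit in Numpy (a toy illustration of my own, where the activation threshold is chosen by hand for each gate):

import numpy as np

def tlu(x, w, threshold):
    z = np.dot(w, x)                     # aggregation: w1*x1 + w2*x2 + ...
    return 1 if z >= threshold else 0    # activation: 1 if the sum reaches the threshold

w = np.array([1, 1])                     # two excitatory inputs (+1)

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    x = np.array([x1, x2])
    print(x1, x2,
          "AND:", tlu(x, w, threshold=2),   # fires only if both inputs are 1
          "OR:",  tlu(x, w, threshold=1))   # fires if at least one input is 1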
Inevitably, following this announcement, there was an excessive craze for artificial intelligence.
Some people even thought that in a few years we would be able to develop artificial intelligence capable of completely replacing human beings! Of course, this was not the case... because even though this model lays the foundations of what Deep Learning still is today, it contains a number of flaws... notably the fact that it does not have a learning algorithm, and that we have to find the values of the w parameters ourselves if we want to use it for real-world applications. Fortunately, about fifteen years later, in 1957, an American psychologist found a way to improve this model by proposing the first learning algorithm in the history of Deep Learning.
You may have already heard of this man... He is Frank Rosenblatt, the inventor of the Perceptron. The Perceptron model is in fact very similar to the one we have just studied: an artificial neuron, which is activated when the weighted sum of its inputs exceeds a certain threshold, usually 0.
But on top of that, the Perceptron comes with a learning algorithm allowing it to find the values of its parameters w in order to obtain the outputs y that we want. To develop this algorithm, Frank Rosenblatt was inspired by Hebb's theory. This theory suggests that when two biological neurons are jointly excited, they strengthen their synaptic links, that is, the connections between them.
In neuroscience, this is called synaptic plasticity, and this is what allows our brain to build memory, to learn new things, to make new associations. So from this idea, Frank Rosenblatt developed a learning algorithm, which consists of training an artificial neuron on reference data (X, y) so that it reinforces its parameters w each time an input X is activated at the same time as the output y present in these data. To do this, he devised the following formula: w = w + alpha (y_true - y) X, in which the parameters w are updated by calculating the difference between the reference output y_true and the output y produced by the neuron, and multiplying this difference by the value of each input X, as well as by a positive learning step alpha.
This way, if our neuron produces a different output than the one it is supposed to produce, for example if it outputs y = 0 while we would like to have y = 1, then our formula will give us w = w + alpha X. So, for the inputs x that are worth 1, the coefficient w will be increased by a small step alpha: it will be "reinforced" (to use the terms of Hebb's theory), which will cause an increase of the function w1 x1 + w2 x2...
which will bring our neuron closer to its activation threshold. As long as we are below this threshold, that is to say as long as the neuron produces a bad output, the coefficient w will continue to increase thanks to our formula, until y equals y_true... and at that moment our formula will give w = w + 0! This means that our parameters will stop evolving. And that's it!
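Here is a minimal sketch of this learning rule in Numpy (a toy illustration of my own, trained on the OR problem; it also includes a small bias term, a parameter we will meet again in a moment):

import numpy as np

# Toy dataset: the OR gate (a single Perceptron is enough for this problem)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y_true = np.array([0, 1, 1, 1])

w = np.zeros(2)   # the parameters to learn
b = 0.0           # bias: shifts the activation threshold
alpha = 0.1       # positive learning step

for epoch in range(20):
    for x, yt in zip(X, y_true):
        y = 1 if np.dot(w, x) + b > 0 else 0   # activation: does the weighted sum exceed 0?
        w = w + alpha * (yt - y) * x           # Rosenblatt's update: w = w + alpha (y_true - y) x
        b = b + alpha * (yt - y)

for x in X:
    print(x, 1 if np.dot(w, x) + b > 0 else 0)   # the trained neuron reproduces the OR gate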
This is how Frank Rosenblatt developed the first learning algorithm in the history of Deep Learning. Following this invention, there was again an unbridled craze for artificial intelligence. It was thought that with Perceptrons, it would be possible to build machines that could read, speak, walk, and even have a conscience!
What a crazy idea! But all this craze collapsed a few years later, when we realized that these promises could not be kept... partly because the Perceptron is a linear model (as we will see in a moment). This led to the first winter of artificial intelligence, from 1974 to 1980, a period during which there were almost no investors left to finance A.I. research. Artificial intelligence was about to die... Fortunately, everything changed in the 1980s when Geoffrey Hinton, one of the fathers of Deep Learning, developed the multi-layer Perceptron, the first true artificial neural network!
As I told you just now, the Perceptron is actually a linear model. Indeed, if we draw its aggregation function f(x1, x2) = w1 x1 + w2 x2 in the (x1, x2) plane, we obtain a separating line, whose inclination depends on the parameters w and whose position can be shifted with a small additional parameter called the bias. With this line you can do great things, like separating 2 classes of points, since thanks to our activation function everything above this line will give an output y = 1 and everything below will give y = 0.
The only problem is that a lot of the phenomena in our universe are not linear phenomena. And in that case, the Perceptron alone is not very useful. But remember the idea of McCulloch and Pitts: by connecting several neurons together, it is possible to solve more complex problems than with a single neuron.
So let's see what happens if we connect, for example, 3 Perceptrons together. The first two each receive the inputs x1 and x2. They do their little calculation, according to their parameters, and return an output, which they send in turn to the third Perceptron, which also makes its own calculation to produce a final output.
Well, if we plot the final output as a function of the inputs x1 and x2, we obtain this time a non-linear model, which is much more interesting. With this example, you have your first artificial neural network: 3 neurons, divided into 2 layers (an input layer and an output layer). This is what we call a Multilayer Perceptron.
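As a rough sketch (a toy illustration of my own, where the parameters are chosen by hand rather than learned), here is what these 3 connected neurons can compute in Numpy; with well-chosen weights they reproduce the XOR function, something a single Perceptron cannot do:

import numpy as np

def heaviside(z):
    return (z > 0).astype(int)      # activation: 1 above the threshold, 0 below

# First layer: two Perceptrons, each receiving x1 and x2
W1 = np.array([[ 1.0,  1.0],        # neuron 1 acts like an OR gate
               [-1.0, -1.0]])       # neuron 2 acts like a NAND gate
b1 = np.array([-0.5, 1.5])

# Second layer: one Perceptron receiving the outputs of the first two
W2 = np.array([1.0, 1.0])           # acts like an AND gate
b2 = -1.5

for x in np.array([[0, 0], [0, 1], [1, 0], [1, 1]]):
    h = heaviside(W1.dot(x) + b1)   # outputs of the first layer
    y = heaviside(W2.dot(h) + b2)   # final output of the network
    print(x, "->", y)               # reproduces XOR: a non-linear frontier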
And you can add as many layers and neurons as you want! You can add 2, 3, 4, 10, why not even 100! The more you add, the more complex and interesting the output will be. However, a question remains... How do you train such a neural network to do what you ask it to do?
That is, how do you find the values of all the parameters w and b so that you get a good model? Well, the solution is to use a technique called Back Propagation, which consists of determining how the output of the network varies according to the parameters present in each layer of the model. For this, we compute a chain of gradients, indicating how the output varies according to the last layer, then how the last layer varies according to the second-to-last, then how the second-to-last varies according to the one before it, and so on...
until we get to the very first layer of our network.
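Written as a chain of derivatives (a generic notation of my own, where a1, a2 and a3 are the outputs of three successive layers, C is the error of the network, and w1 is a parameter of the first layer), this gives: dC/dw1 = dC/da3 × da3/da2 × da2/da1 × da1/dw1.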
This is Back Propagation: a propagation backwards through the network! With this information, these gradients, we can then update the parameters of each layer so that they minimize the error between the model output and the expected response (the famous y_true value). And for that, we use a formula very close to Frank Rosenblatt's: the Gradient Descent formula, which we will talk about in more detail in the next videos. In summary, to develop and train artificial neural networks, we repeat the following four steps in a loop. The first step is Forward Propagation: we pass the data from the first layer to the last, in order to produce an output y.
The second step is to calculate the error between this output and the reference output y_true that we want to have. For this we use what is called a cost function. Then, the third step is the Back Propagation: we measure how this cost function varies with respect to each layer of our model, starting from the last one and going up to the very first one.
Finally, the fourth and last step is to correct each parameter of the model with the gradient descent algorithm, before looping back to the first step, Forward Propagation, to start a new training cycle. I know it can be a lot of information when you look at it like that, but don't worry about it...
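To fix the ideas, here is a very small sketch of this four-step loop in Numpy (a simplified toy example of my own: one hidden layer of 3 neurons, sigmoid activations and a squared-error cost, trained on the small XOR problem):

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y_true = np.array([[0], [1], [1], [0]], dtype=float)       # the XOR problem

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 3)), np.zeros((1, 3))          # hidden layer: 3 neurons
W2, b2 = rng.normal(size=(3, 1)), np.zeros((1, 1))          # output layer: 1 neuron
alpha = 0.5

for _ in range(10000):
    # 1) Forward Propagation: from the first layer to the last
    A1 = sigmoid(X.dot(W1) + b1)
    A2 = sigmoid(A1.dot(W2) + b2)
    # 2) Cost function: how far the output is from the reference y_true
    cost = np.mean((A2 - y_true) ** 2)
    # 3) Back Propagation: gradients, from the last layer back to the first
    dZ2 = (A2 - y_true) * A2 * (1 - A2)
    dW2, db2 = A1.T.dot(dZ2), dZ2.sum(axis=0, keepdims=True)
    dZ1 = dZ2.dot(W2.T) * A1 * (1 - A1)
    dW1, db1 = X.T.dot(dZ1), dZ1.sum(axis=0, keepdims=True)
    # 4) Gradient Descent: correct every parameter, then loop again
    W1, b1 = W1 - alpha * dW1, b1 - alpha * db1
    W2, b2 = W2 - alpha * dW2, b2 - alpha * db2

print(round(cost, 4))   # cost close to 0 if training went well
print(A2.round(2))      # outputs close to [0, 1, 1, 0] (if it gets stuck, re-run with another seed)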
We'll really see all this in detail in the next videos; this is just a little introduction video, and by the way, it's time to finish our story about artificial neural networks.
As time went on, the multilayer Perceptron model continued to evolve, especially with the appearance of new activation functions, such as the logistic function, the hyperbolic tangent function, or the ReLU function. These functions have now completely replaced the Heaviside function that we have seen so far, because they offer much better performance.
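For reference, here is a short sketch of how these activation functions are usually defined in Numpy:

import numpy as np

def heaviside(z):      # the historical activation: output 0 or 1
    return (z > 0).astype(float)

def sigmoid(z):        # logistic function: a smooth step between 0 and 1
    return 1 / (1 + np.exp(-z))

def tanh(z):           # hyperbolic tangent: a smooth step between -1 and 1
    return np.tanh(z)

def relu(z):           # Rectified Linear Unit: 0 for negative z, z otherwise
    return np.maximum(0, z)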
In the 90's, the first variants of the multilayer Perceptron started to be developed. The famous Yann LeCun invented the first Convolutional Neural Networks, networks that are able to recognize and process images, by introducing at the beginning of these networks mathematical filters called Convolution and Pooling. We will talk about them later in this training. It was also during these years that the first recurrent neural networks appeared, which are again a variant of the multilayer Perceptron and which make it possible to efficiently process time-series problems such as text reading or speech recognition.
So you might ask: "But if all of this existed back in the 90s, why did we wait so long for the technologies we have today to emerge?" Well, there are two big reasons for that. The first is that to work well, a neural network needs to be trained on a very large amount of data, sometimes exceeding millions or tens of millions of examples.
But in the 90's, we didn't have that much data... We didn't have millions or even tens of millions of pictures of dogs, cats, cars, and pedestrians all well classified and well indexed... No, we had to wait for the arrival of the internet, smartphones, and connected objects to start collecting large amounts of data, images, and sounds that can be used for Deep Learning. Now, the second reason why it took so long to actually use neural networks is that the power of computers in the 80's and 90's just didn't allow it. And yes, as we will see in this series of videos, training a neural network requires a lot of time and a lot of computing power.
And we had to wait until we had excellent CPUs and GPUs to finally get good results. In fact, Deep Learning didn't really take off until 2012, during a computer vision competition called ImageNet, where a team of researchers led by Geoffrey Hinton developed a neural network capable of recognizing images with better performance than any other algorithm at the time. Since that day, everyone has been talking about Machine Learning and Deep Learning, constantly praising the merits of this technology...
sometimes even going a bit too far! For example, when we say that "neural networks work like the human brain"... well no, that's completely wrong! Today, we know for sure that the human brain is much more complex and much more sophisticated than the Perceptron model or the Multilayer Perceptron. And as Yann LeCun says, comparing a neural network to a human brain is a bit like comparing a plane to a bird... Maybe we were inspired by what we saw in nature at the very beginning to make the first sketches, but that doesn't mean that airplanes fly by flapping their wings!
No, behind all this, it's mathematics, linear algebra, differential calculus...
a whole well-oiled machine that we're going to discover together in this training. And speaking of which, it's time to conclude this video by showing you the program we'll be following throughout this series! Well, I hope you enjoyed the story I told you about the origin and evolution of Deep Learning.
If so, please feel free to share this video and give it a little blue thumbs up :) This will help me to know if I did a good job! Now it's time to discover the program of this training, and you'll see it naturally follows the story I told you! To begin with, let's study together the model of the simple Perceptron.
How to write and understand its mathematical formulas, how to calculate its performance thanks to the cost function called log loss, and how to train this model with Gradient Descent. At this point, I'll show you how to program all of this using only matrices and Numpy. Next, we will see what happens when we add a second Perceptron next to the first one, and then a third one, the whole forming our first neural network.
We will then see how to program the Forward Propagation step, then the Back Propagation step, and very importantly, I will do by hand all the mathematical calculations and demonstrations that lead to the formulas of the different gradients. I assure you it's going to be ultra cool to do this, and I'm going to make sure that everyone can understand what I'm writing. Then, as the videos go on, we'll continue to add neurons and layers to our networks, each time building on the code we wrote in the previous videos.
You'll see that finally, once we've got the basics right, you can create any neural network without having to really modify your code, which is pretty cool. From here I'll also show you how to enhance your networks with other activation functions, like the hyperbolic tangent function, ReLU, or the softmax function. And once we've done all that, it will be time to discover the TensorFlow library.
I will explain how this framework works, what is inside and how to easily develop neural networks thanks to its Keras library. Then we'll start doing real projects, like computer vision projects, where I'll show you how to develop applications like the one you see on the screen. Well, here it is!
I hope you like this program! And believe me, with all this, you'll be good at Deep Learning! Of course, you'll still have to practice, but don't worry, I'll be there to help you and to motivate you!
Because at the end of each video, I will give you an exercise, with the answer in the following video. I assure you that it is very important for you to do these exercises, but also to find other projects on Google or on other websites. By the way, if you have any project ideas, any particular desires, or any goals, feel free to share them in the comments so I can be aware of them and adapt to them in the future with new videos made especially for you. As you know, I've been on this YouTube channel for two years now, and I answer absolutely every comment. It just makes me very happy to have that connection with people, so enjoy it :)
By the way, if you are new to this YouTube channel, feel free to join our Discord server to meet the rest of the community. Because I have to admit, there are some great people in this community! Really amazing!
Really, enjoy it! Go to the Discord, ask your questions, develop your network... And by the way, I just want to thank from the bottom of my heart all the people who support me; we really are surrounded by great people. Anyway, you can also visit my website, www.machinelearnia.com, where you can find a lot of additional content and free training. Also, on my Tipeee page, you can support me with a 5 euro donation. It helps a lot with the development of the channel, the projects, and the equipment, and as a bonus I give you exclusive content every time: tutorials, lessons, courses, and videos that are only available on Tipeee, so thank you to everyone who supports me. Well, here it is! I hope you enjoyed this video!
I hope you are as motivated as I am for this new adventure! If you are, please subscribe so you don't miss the upcoming videos. As for me, I'll see you very soon!