Integration and the fundamental theorem of calculus | Chapter 8, Essence of calculus

2.53M views · 3363 words

3Blue1Brown

Intuition for integrals, and why they are inverses of derivatives. Help fund future projects: https:...

Video Transcript:

This guy, Grothendieck, is somewhat of a mathematical idol to me, and I just love this quote, don't you? Too often in math, we dive into showing that a certain fact is true with a long series of formulas before stepping back and making sure it feels reasonable, and preferably obvious, at least at an intuitive level. In this video, I want to talk about integrals, and the thing that I want to become almost obvious is that they are an inverse of derivatives.

Here we're just going to focus on one example, which is a kind of dual to the example of a moving car that I talked about in chapter 2 of the series, introducing derivatives. Then in the next video we're going to see how this same idea generalizes to a couple of other contexts. Imagine you're sitting in a car, and you can't see out the window, all you see is the speedometer.

At some point the car starts moving, speeds up, and then slows back down to a stop, all over the course of 8 seconds. The question is, is there a nice way to figure out how far you've travelled during that time based only on your view of the speedometer? Or better yet, can you find a distance function, s of t, that tells you how far you've travelled after a given amount of time, t, somewhere between 0 and 8 seconds?

Let's say you take note of the velocity at every second, and make a plot over time that looks something like this. And maybe you find that a nice function to model that velocity over time in meters per second is v of t equals t times 8 minus t. You might remember, in chapter 2 of this series we were looking at the opposite situation, where you knew what a distance function was, s of t, and you wanted to figure out the velocity function from that.

There I showed how the derivative of a distance vs. time function gives you a velocity vs. time function.

So in our current situation, where all we know is velocity, it should make sense that finding a distance vs. time function is going to come down to asking what function has a derivative of t times 8 minus t. This is often described as finding the antiderivative of a function, and indeed, that's what we'll end up doing, and you could even pause right now and try that.

But first, I want to spend the bulk of this video showing how this question is related to finding the area bounded by the velocity graph, because that helps to build an intuition for a whole class of problems, things called integral problems in math and science. To start off, notice that this question would be a lot easier if the car was just moving at a constant velocity, right? In that case, you could just multiply the velocity in meters per second times the amount of time that has passed in seconds, and that would give you the number of meters traveled.

And notice, you can visualize that product, that distance, as an area. And if visualizing distance as area seems kind of weird, I'm right there with you. It's just that on this plot, where the horizontal direction has units of seconds, and the vertical direction has units of meters per second, units of area just very naturally correspond to meters.

But what makes our situation hard is that velocity is not constant, it's incessantly changing at every single instant. It would even be a lot easier if it only ever changed at a handful of points, maybe staying static for the first second, and then suddenly discontinuously jumping to a constant 7 meters per second for the next second, and so on, with discontinuous jumps to portions of constant velocity. That would make it uncomfortable for the driver, in fact it's actually physically impossible, but it would make your calculations a lot more straightforward.

You could just compute the distance traveled on each interval by multiplying the constant velocity on that interval by the change in time, and then just add all of those up. So what we're going to do is approximate the velocity function as if it was constant on a bunch of intervals, and then, as is common in calculus, we'll see how refining that approximation leads us to something more precise. Here, let's make this a little more concrete by throwing in some numbers.

Chop up the time axis between 0 and 8 seconds into many small intervals, each with some little width dt, something like 0.25 seconds. Consider one of those intervals, like the one between t equals 1 and 1.25.

In reality, the car speeds up from 7 m per second to about 8.4 m per second during that time, and you could find those numbers just by plugging in t equals 1 and t equals 1.25 to the equation for velocity. What we want to do is approximate the car's motion as if its velocity was constant on that interval. Again, the reason for doing that is we don't really know how to handle situations other than constant velocity ones.

You could choose this constant to be anything between 7 and 8.4. It actually doesn't matter.

All that matters is that our sequence of approximations, whatever they are, gets better and better as dt gets smaller and smaller, so that treating this car's journey as a bunch of discontinuous jumps in speed between portions of constant velocity becomes a less-wrong reflection of reality as we decrease the time between those jumps. So for convenience, on an interval like this, let's just approximate the speed with whatever the true car's velocity is at the start of that interval, the height of the graph above the left side, which in this case is 7.

In this example interval, according to our approximation, the car moves 7 m per second times 0.25 seconds. That's 1.75 meters, and it's nicely visualized as the area of this thin rectangle. In truth, that's a little under the real distance traveled, but not by much. The same goes for every other interval.
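As a rough sketch of that bookkeeping, here is one way to add up the left-edge rectangles for the velocity model v of t equals t times 8 minus t with dt set to 0.25. The helper name `left_riemann_sum` is made up for this illustration and is not something from the video.

```python
def v(t):
    """Velocity model from the video, in meters per second."""
    return t * (8 - t)


def left_riemann_sum(f, a, b, dt):
    """Add up f(t) * dt over intervals of width dt, sampling f at each left edge.

    Hypothetical helper for illustration; assumes (b - a) is a multiple of dt.
    """
    n = round((b - a) / dt)  # number of rectangles
    return sum(f(a + i * dt) * dt for i in range(n))


dt = 0.25
one_rectangle = v(1) * dt               # the interval from t = 1 to t = 1.25
total = left_riemann_sum(v, 0, 8, dt)   # all 32 rectangles between 0 and 8

print(one_rectangle)  # 1.75
print(total)          # 85.25
```

With this fairly coarse dt, the total is already close to the true distance worked out later in the video, and it comes in slightly under it.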

The approximated distance is v of t times dt, it's just that you'd be plugging in a different value for t at each one of these, giving a different height for each rectangle. I'm going to write out an expression for the sum of the areas of all those rectangles in kind of a funny way. Take this symbol here, which looks like a stretched s for sum, and put a 0 at its bottom and an 8 at its top, to indicate that we'll be ranging over time steps between 0 and 8 seconds.

And as I said, the amount we're adding up at each time step is v of t times dt. Two things are implicit in this notation. First of all, that value dt plays two separate roles.

Not only is it a factor in each quantity we're adding up, it also indicates the spacing between each sampled time step. So when you make dt smaller and smaller, even though it decreases the area of each rectangle, it increases the total number of rectangles whose areas we're adding up, because if they're thinner, it takes more of them to fill that space. And second, the reason we don't use the usual sigma notation to indicate a sum is that this expression is technically not any particular sum for any particular choice of dt.

It's meant to express whatever that sum approaches as dt approaches 0. And as you can see, what that approaches is the area bounded by this curve and the horizontal axis. Remember, smaller choices of dt indicate closer approximations for the original question, how far does the car actually go?

So this limiting value for the sum, the area under this curve, gives us the precise answer to the question in full unapproximated precision. Now tell me that's not surprising. We had this pretty complicated idea of approximations that can involve adding up a huge number of very tiny things.
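To see that limiting behavior numerically, the sketch below shrinks dt and watches the sum settle down. It borrows the exact value 256/3 (about 85.33) from later in the video purely as a reference point, and the function name is a hypothetical one chosen for this example.

```python
def v(t):
    """Velocity model from the video, in meters per second."""
    return t * (8 - t)


def left_riemann_sum(f, a, b, n):
    """Left-endpoint sum using n rectangles of equal width (illustrative helper)."""
    dt = (b - a) / n
    return sum(f(a + i * dt) for i in range(n)) * dt


exact = 256 / 3  # reference value, derived later in the video via an antiderivative

errors = []
for n in (8, 32, 128, 512, 2048):
    approx = left_riemann_sum(v, 0, 8, n)
    errors.append(exact - approx)
    print(f"dt = {8 / n:<10} sum = {approx:.5f}  error = {exact - approx:.5f}")
```

Each row uses more, thinner rectangles, and the gap to the reference value keeps shrinking, which is the sense in which the integral is "whatever that sum approaches."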

And yet, the value that those approximations approach can be described so simply, it's just the area underneath this curve. This expression is called an integral of v of t, since it brings all of its values together, it integrates them. Now at this point, you could say, how does this help?

You've just reframed one hard question, finding how far the car has traveled, into an equally hard problem, finding the area between this graph and the horizontal axis. And you'd be right. If the velocity-distance duo was the only thing we cared about, most of this video, with all the area under a curve nonsense, would be a waste of time.

We could just skip straight ahead to finding an antiderivative. But finding the area between a function's graph and the horizontal axis is somewhat of a common language for many disparate problems that can be broken down and approximated as the sum of a large number of small things. You'll see more in the next video, but for now I'll just say in the abstract that understanding how to interpret and how to compute the area under a graph is a very general problem-solving tool.

In fact, the first video of this series already covered the basics of how this works, but now that we have more of a background with derivatives, we can take this idea to its completion. For our velocity example, think of this right endpoint as a variable, capital T. So we're thinking of this integral of the velocity function between 0 and T, the area under this curve between those inputs, as a function where the upper bound is the variable.

That area represents the distance the car has travelled after T seconds, right? So in reality, this is a distance vs. time function, s of t.

Now ask yourself, what is the derivative of that function? On the one hand, a tiny change in distance over a tiny change in time is velocity, that is what velocity means. But there's another way to see this, purely in terms of this graph and this area, which generalizes a lot better to other integral problems.

A slight nudge of dt to the input causes that area to increase, some little ds represented by the area of this sliver. The height of that sliver is the height of the graph at that point, v of t, and its width is dt. And for small enough dt, we can basically consider that sliver to be a rectangle, so this little bit of added area, ds, is approximately equal to v of t times dt.

And because that's an approximation that gets better and better for smaller dt, the derivative of that area function, ds/dt, at this point equals v of t, the value of the velocity function at whatever time we started on. And that right there is a super general argument. The derivative of any function giving the area under a graph like this is equal to the function for the graph itself.
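The sliver argument can be checked numerically without knowing any antiderivative. In this sketch, the area function is approximated with a midpoint sum (a choice made here for accuracy, not something the video specifies), and T equals 3 and the nudge size are arbitrary picks for illustration.

```python
def v(t):
    """Velocity model from the video, in meters per second."""
    return t * (8 - t)


def area(T, n=10_000):
    """Approximate the area under v between 0 and T with a midpoint sum.

    Illustrative helper; the midpoint rule and n are assumptions of this sketch.
    """
    dt = T / n
    return sum(v((i + 0.5) * dt) for i in range(n)) * dt


T = 3.0     # arbitrary right endpoint
dT = 1e-3   # a slight nudge to the input

sliver = area(T + dT) - area(T)  # the little bit of added area, ds
slope = sliver / dT              # ds/dt, estimated from the sliver

print(slope)  # close to v(T)
print(v(T))   # 15.0
```

The two printed numbers agree up to a small discrepancy that shrinks as the nudge shrinks, which is the claim that the derivative of the area function is the height of the graph.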

So, if our velocity function is t times 8-t, what should s be? What function of t has a derivative of t times 8-t? It's easier to see if we expand this out, writing it as 8t minus t squared, and then we can just take each part one at a time.

What function has a derivative of 8t? We know that the derivative of t squared is 2t, so if we just scale that up by a factor of 4, we can see that the derivative of 4t squared is 8t. And for that second part, what kind of function do you think might have negative t squared as a derivative?

Using the power rule again, we know that the derivative of a cubic term, t cubed, gives us a square term, 3t squared. So if we just scale that down by a third, the derivative of 1 third t cubed is exactly t squared. And then making that negative, we'd see that negative 1 third t cubed has a derivative of negative t squared.

Therefore, the antiderivative of our function, 8t minus t squared, is 4t squared minus 1 third t cubed. But there's a slight issue here. We could add any constant we want to this function, and its derivative is still 8t minus t squared.

The derivative of a constant is always zero. And if you were to graph s of t, you could think of this in the sense that moving a graph of a distance function up and down does nothing to affect its slope at every input. So in reality, there's actually infinitely many different possible antiderivative functions, and every one of them looks like 4t squared minus 1 third t cubed plus c, for some constant c.
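Written out symbolically, the check that every member of this family works is a single line of the power rule, with C standing for the arbitrary constant:

```latex
\frac{d}{dt}\left(4t^2 - \tfrac{1}{3}t^3 + C\right)
  = 8t - t^2 + 0
  = t(8 - t)
```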

But there is one piece of information we haven't used yet that will let us zero in on which antiderivative to use, the lower bound of the integral. This integral has to be zero when we drag that right endpoint all the way to the left endpoint, right? The distance travelled by the car between 0 seconds and 0 seconds is… well, zero.

So as we found, the area as a function of capital T is an antiderivative for the stuff inside. And to choose what constant to add to this expression, you subtract off the value of that antiderivative function at the lower bound. If you think about it for a moment, that ensures that the integral from the lower bound to itself will indeed be zero.

As it so happens, when you evaluate the function we have here at t equals zero, you get zero. So in this specific case, you don't need to subtract anything off. For example, the total distance travelled during the full 8 seconds is this expression evaluated at t equals 8, which is 85.33 minus 0. So the answer as a whole is 85.33.
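For anyone who wants the unrounded value, here is a small sketch that evaluates the antiderivative with exact fractions. The function name `s` mirrors the video's s of t, and picking the constant to be 0 is the choice described above.

```python
from fractions import Fraction


def s(T):
    """Antiderivative from the video, 4T^2 - (1/3)T^3, with the constant set to 0."""
    T = Fraction(T)
    return 4 * T**2 - Fraction(1, 3) * T**3


total = s(8) - s(0)   # value at the upper bound minus value at the lower bound

print(total)          # 256/3
print(float(total))   # about 85.33
```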

But a more typical example would be something like the integral between 1 and 7. That's the area pictured here, and it represents the distance travelled between 1 second and 7 seconds. What you do is evaluate the antiderivative we found at the top bound, 7, and subtract off its value at the bottom bound, 1.

Notice, by the way, it doesn't matter which antiderivative we chose here. If for some reason it had a constant added to it, like 5, that constant would cancel out.

More generally, any time you want to integrate some function, and remember, you think of that as adding up values f of x times dx for inputs in a certain range, and then asking what that sum approaches as dx approaches 0, the first step to evaluating that integral is to find an antiderivative, some other function, capital F, whose derivative is the thing inside the integral. Then the integral equals this antiderivative evaluated at the top bound minus its value at the bottom bound. And this fact right here that you're staring at is the fundamental theorem of calculus.
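Here is a minimal sketch of that recipe for the integral between 1 and 7, compared against a brute-force sum of thin rectangles. All function names are hypothetical ones chosen for this example, and the midpoint sum is just one reasonable way to do the brute-force comparison.

```python
def v(t):
    """Velocity model from the video, in meters per second."""
    return t * (8 - t)


def F(t):
    """One antiderivative of v: 4t^2 - (1/3)t^3."""
    return 4 * t**2 - t**3 / 3


def F_shifted(t):
    """Another antiderivative of v, differing from F by the constant 5."""
    return F(t) + 5


def integrate_via_antiderivative(G, a, b):
    """Evaluate an antiderivative G at the top bound minus the bottom bound."""
    return G(b) - G(a)


def midpoint_sum(f, a, b, n=10_000):
    """Brute-force comparison: add up n thin rectangles sampled at their midpoints."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx


via_F = integrate_via_antiderivative(F, 1, 7)
via_shifted = integrate_via_antiderivative(F_shifted, 1, 7)
via_sum = midpoint_sum(v, 1, 7)

print(via_F)        # about 78
print(via_shifted)  # about 78, the added 5 cancels
print(via_sum)      # about 78, from ten thousand rectangles
```

The first two lines look at only two inputs each, while the last one samples ten thousand, and they land on essentially the same number.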

And I want you to appreciate something kind of crazy about this fact. The integral, the limiting value for the sum of all these thin rectangles, takes into account every single input on the continuum, from the lower bound to the upper bound. That's why we use the word integrate, it brings them all together.

And yet, to actually compute it using an antiderivative, you only look at two inputs, the top bound and the bottom bound. It almost feels like cheating. Finding the antiderivative implicitly accounts for all the information needed to add up the values between those two bounds.

That's just crazy to me. This idea is deep, and there's a lot packed into this whole concept, so let's recap everything that just happened, shall we? We wanted to figure out how far a car goes just by looking at the speedometer.

And what makes that hard is that velocity is always changing. If you approximate velocity to be constant on multiple different intervals, you could figure out how far the car goes on each interval with multiplication, and then add all of those up. Better and better approximations for the original problem correspond to collections of rectangles whose aggregate area is closer and closer to being the area under this curve between the start time and the end time.

So that area under the curve is actually the precise distance traveled for the true nowhere constant velocity function. If you think of that area as a function itself, with a variable right endpoint, you can deduce that the derivative of that area function must equal the height of the graph at every point. And that's really the key right there.

It means that to find a function giving this area, you ask, what function has v of t as a derivative? There are actually infinitely many antiderivatives of a given function, since you can always just add some constant without affecting the derivative, so you account for that by subtracting off the value of whatever antiderivative function you choose at the bottom bound. By the way, one important thing to bring up before we leave is the idea of negative area.

What if the velocity function was negative at some point, meaning the car goes backwards? It's still true that a tiny distance traveled ds on a little time interval is about equal to the velocity at that time multiplied by the tiny change in time. It's just that the number you'd plug in for velocity would be negative, so the tiny change in distance is negative.

In terms of our thin rectangles, if a rectangle goes below the horizontal axis, like this, its area represents a bit of distance traveled backwards, so if what you want in the end is to find a distance between the car's start point and its end point, this is something you'll want to subtract. And that's generally true of integrals. Whenever a graph dips below the horizontal axis, the area between that portion of the graph and the horizontal axis is counted as negative.
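As a hypothetical illustration, suppose the same formula t times 8 minus t kept describing the velocity past 8 seconds, out to 12 seconds. That extension is an assumption made only for this sketch, since the video's car stops at 8 seconds. The velocity is then negative after t equals 8, and the signed sum differs from the total ground covered.

```python
def v(t):
    """Velocity model from the video; negative for t > 8 (hypothetical extension)."""
    return t * (8 - t)


def midpoint_sum(f, a, b, n=12_000):
    """Add up n thin rectangles sampled at their midpoints (illustrative helper)."""
    dt = (b - a) / n
    return sum(f(a + (i + 0.5) * dt) for i in range(n)) * dt


# Signed area: rectangles below the axis count as negative.
net_displacement = midpoint_sum(v, 0, 12)

# Unsigned area: every rectangle counts as positive.
total_ground_covered = midpoint_sum(lambda t: abs(v(t)), 0, 12)

print(net_displacement)       # about 0: the car ends up back where it started
print(total_ground_covered)   # about 170.67, twice the 85.33 covered going forward
```

The integral itself corresponds to the first number, the distance between the start point and the end point, with the backwards stretch subtracted.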

What you'll commonly hear is that integrals don't measure area per se, they measure the signed area between the graph and the horizontal axis. Next up, I'm going to bring up more contexts where this idea of an integral and area under curves comes up, along with some other intuitions for this fundamental theorem of calculus. Maybe you remember, chapter 2 of this series introducing the derivative was sponsored by The Art of Problem Solving, so I think there's something elegant to the fact that this video, which is kind of a dual to that one, was also supported in part by The Art of Problem Solving.

I really can't imagine a better sponsor for this channel, because it's a company whose books and courses I recommend to people anyway. They were highly influential to me when I was a student developing a love for creative math, so if you're a parent looking to foster your own child's love for the subject, or if you're a student who wants to see what math has to offer beyond rote schoolwork, I cannot recommend The Art of Problem Solving enough.

Whether that's their newest development to build the right intuitions in elementary school kids, called Beast Academy, or their courses in higher-level topics and contest preparation, going to aops.com/3blue1brown, or clicking on the link in the description, lets them know you came from this channel, which may encourage them to support future projects like this one. I consider these videos a success not when they teach people a particular bit of math, which can only ever be a drop in the ocean, but when they encourage people to go and explore that expanse for themselves, and The Art of Problem Solving is among the few great places to actually do that exploration.