Introduction to Derivatives
Learning objectives
- Understand that a derivative is the instantaneous rate of change of a function
- Understand how to calculate a derivative
- Understand how to express, mathematically, taking a derivative at a given point and evaluating a function at a given point
Introduction
In the lesson discussing step sizes of our gradient descent algorithm, we filled in some more information on how to find the "best fit" regression line using gradient descent. Namely, we learned how to more carefully change the y-intercept of the regression line to minimize the residual sum of squares. We did this by calibrating the size and direction of our change in the y-intercept to the slope of the cost curve.
With our gradient descent algorithm, the larger the absolute value of the slope, the larger our change in the y-intercept.
While we know our gradient descent technique depends on changing our values according to the slope of our cost curve, we do not know how to find that slope at a given point. In this lesson, we'll learn how to calculate this slope.
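As a reminder of why the slope matters, the relationship between slope and step size can be sketched in a few lines of Python. This is a minimal sketch rather than the lesson's actual code: the function name `intercept_step`, the `learning_rate` parameter, and the example slopes are all assumptions made for illustration.

```python
def intercept_step(slope, learning_rate=0.01):
    """Return how far to move the y-intercept for a given cost-curve slope.

    `learning_rate` is a hypothetical scaling factor, not a value from the lesson.
    """
    # A steeper slope (larger absolute value) produces a larger step.
    # The minus sign moves the intercept in the direction that lowers the cost.
    return -learning_rate * slope


# Hypothetical slopes: one steep, one gentle.
steep_step = intercept_step(-200)
gentle_step = intercept_step(-20)
print(steep_step, gentle_step)
```

The steep slope yields the larger step, which is exactly why we need a way to find the slope at a given point.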
Talking about derivatives
The slope of the line tangent to a function at a given point is called the derivative at that point. Equivalently, the derivative is the instantaneous rate of change of a function. This makes sense: the faster our function is changing, the larger the magnitude of its slope at that point.
For these purposes, magnitude describes the absolute value of a number. We use it because it's not accurate to say that -100 is larger than -99. After all, -100 is more negative and thus smaller than -99. But it is correct to say that the magnitude of -100 is larger than the magnitude of -99, as the absolute value of -100 is 100 and the absolute value of -99 is 99.
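In Python, the built-in `abs` function returns this magnitude. A short illustration using the same two numbers:

```python
# Comparing the numbers directly: -100 is smaller than -99.
print(-100 < -99)            # True

# Comparing magnitudes (absolute values): 100 is larger than 99.
print(abs(-100) > abs(-99))  # True
print(abs(-100), abs(-99))   # 100 99
```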
So a derivative answers questions about change. If you look at our blue curve above, the various slopes indicate how much our output (in this case, our RSS) changes as we increase our input (here, our value of the y-intercept).
Ok, so the derivative of a function is the rate of change of a function. But how do we calculate the rate of change of a function?
Calculating the derivative
Remember from our previous lesson that
Ok, so now how do we calculate the instantaneous rate of change of our function,
Ok to measure
The way we can calculate the rate of change at
To translate this to math we can express this as:
Now plugging in values we have
If you think about it, it makes sense that the rate of change of a function $f(x) = 3x$ is 3. For every unit of increase in $x$, the output increases by 3.
So that's how we calculate the derivative. The derivative is simply the rate of change: we see how much the output changes per change in the input. Expressed mathematically, our formula for calculating the derivative looks like this:

$f'(x) = \frac{\Delta y}{\Delta x} = \frac{f(x + \Delta x) - f(x)}{\Delta x}$
Take some time to take in this formula. It's not going away. This formula encapsulates our earlier approach. In our above approach, we let
Another word for
The derivative of a function
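The rate-of-change calculation in this section can also be sketched in Python. This is an illustrative sketch, not code from the lesson: the helper name `rate_of_change` and the sample inputs are assumptions. It divides the change in output by the change in input for $f(x) = 3x$.

```python
def f(x):
    """The linear function discussed above, f(x) = 3x."""
    return 3 * x


def rate_of_change(func, x, delta_x):
    """Change in output divided by change in input, starting at x."""
    return (func(x + delta_x) - func(x)) / delta_x


# For a straight line, the answer is 3 regardless of the starting point
# or the size of the step (sample inputs chosen arbitrarily).
print(rate_of_change(f, 2, 1))    # 3.0
print(rate_of_change(f, 10, 4))   # 3.0
```

Because the function is a straight line, the size of the step does not affect the result. That stops being true once the function curves.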
Derivatives of non-linear functions
But things quickly become trickier when working with more complicated functions. And we will run into these functions. For example, let's consider how to take the derivative of something that resembles our cost curve. After all, figuring out the slope at a given point of a cost curve is what led us here.
This is the graph of the function
Ok, now let's calculate
Ok, sweet! Now let's show a line with this slope where
Take a close look at the straight line in the graph above. That straight line is supposed to have the same slope as the blue curve at the point
The slope of the straight line should be pointing more downwards. Where did we go wrong? Let's take another look at our calculation of the derivative.
The problem is that if we calculate change in our output divided by the change in our input, and we set
But what we want to do is calculate the rate of change at just that point
So what we need to do is decrease our value of
So how do we calculate the rate of change of our function across no change in input? We use our imagination. Really. We calculate the rate of change with a $\Delta x$ of 0.1, then calculate it again with a $\Delta x$ of 0.01, then again with a $\Delta x$ of 0.001. Our calculation should converge on a single number as $\Delta x$ approaches zero, and that number is our derivative.
**That is, the derivative of a function is the change in the function's output per unit change in its input as $h$ (that is, $\Delta x$) approaches zero.**
[Figures: the straight line drawn against the curve for three successively smaller values of $\Delta x$]
Notice that our lines approach being tangent to the curve as we decrease $\Delta x$.
$ \Delta x $ | $ \Delta y/\Delta x $ |
---|---|
.1 | -171,000 |
.01 | -179,100 |
.001 | -179,910 |
.0001 | -179,991 |
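The numbers in this table can be reproduced with a short loop. The function the lesson graphs is not shown in this text, so the sketch below assumes a stand-in, $f(x) = 90000x^2$ evaluated at $x = -1$, chosen only because it yields the same values. The names `f` and `rate_of_change` are likewise illustrative.

```python
def f(x):
    """Assumed stand-in function; not necessarily the one graphed in the lesson."""
    return 90000 * x ** 2


def rate_of_change(func, x, delta_x):
    """Change in output divided by change in input, starting at x."""
    return (func(x + delta_x) - func(x)) / delta_x


x = -1  # assumed point of interest
for delta_x in [0.1, 0.01, 0.001, 0.0001]:
    # Rounding hides tiny floating-point errors in the subtraction.
    print(delta_x, round(rate_of_change(f, x, delta_x)))
```

Each smaller `delta_x` moves the result closer to -180,000 without ever requiring a step of exactly zero, which would mean dividing by zero.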
As you can see, as $\Delta x$ approaches zero, $\Delta y/\Delta x$ approaches $-180,000$. This convergence on one number as we change another number is called the **limit**. So to describe the above, we would say, at the point
When
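Or, better yet, we can update and correct our definition of the derivative to equal:

$f'(x) = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}$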
That is our real definition of a derivative, and you best not forget it.
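As a quick check of this definition, it can be applied to the linear function $f(x) = 3x$ from earlier in the lesson. The steps below are a worked sketch, writing the change in input as $\Delta x$:

$$
\begin{aligned}
f'(x) &= \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x} \\
      &= \lim_{\Delta x \to 0} \frac{3(x + \Delta x) - 3x}{\Delta x} \\
      &= \lim_{\Delta x \to 0} \frac{3\,\Delta x}{\Delta x} = 3
\end{aligned}
$$

The $\Delta x$ cancels, so the limit is 3 for every $x$, matching the rate of change we found for this function earlier.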
Summary
In this section, we learned about derivatives. A derivative is the instantaneous rate of change of a function. To calculate the instantaneous rate of change of a function, we see the value that $\frac{\Delta y}{\Delta x} $ approaches as $\Delta x $ approaches zero. This way, we are not calculating the rate of change of a function across a given distance, but rather are finding the rate of change instantaneously.