pretiosus.io

Gradient

Intro

The gradient describes the rate of change of a function. The function can be anything, such as a cost function, the temperature in a room, or the pressure in an air tank. The gradient describes how the value of the function changes with respect to its parameters, and it is computed by taking the derivative of the function with respect to those parameters.

Gradient of a 1D field

The gradient of a 1D field is simply the derivative of the function, since the function depends on only one variable. Take for example the function $f(x) = \frac{1}{2}x^2$, which defines a basic parabola, shown in Figure 1. Taking the derivative of $f(x)$ gives $f'(x) = x$.

Figure 1 shows that the gradient of the parabola is $f'(x) = x$. The gradient is therefore not constant but a function of $x$.

At $x = 0$ the function $f(x)$ has zero slope, and the gradient is zero there as well. Moving to the right, so that $x > 0$, the slope of $f(x) = \frac{1}{2}x^2$ increases steadily, and the gradient $f'(x) = x$ increases with it. As $x$ becomes very large, the slope keeps growing without bound: the curve becomes steeper and steeper, although it never becomes a vertical line, since $f'(x) = x$ is finite for every finite $x$. A large $x$ therefore gives a large gradient, which indicates a large rate of change in $f(x)$. To the left of the origin, where $x < 0$, the gradient is negative, so $f(x)$ decreases as $x$ increases.
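The relation between the parabola and its gradient can be checked numerically. The snippet below is a minimal sketch, not part of the original article: the helper names `analytic_gradient` and `numerical_gradient` and the step size `h` are assumptions chosen for illustration. It compares the derivative $f'(x) = x$ with a central finite-difference estimate at a few points.

```python
def f(x):
    """The parabola from the text: f(x) = 1/2 * x^2."""
    return 0.5 * x ** 2


def analytic_gradient(x):
    """Derivative obtained by hand: f'(x) = x."""
    return x


def numerical_gradient(func, x, h=1e-5):
    """Central finite-difference estimate of the derivative (h is an assumed step size)."""
    return (func(x + h) - func(x - h)) / (2.0 * h)


if __name__ == "__main__":
    # Negative x gives a negative gradient, x = 0 gives zero,
    # and larger x gives a proportionally larger gradient.
    for x in (-2.0, 0.0, 1.0, 5.0):
        print(x, analytic_gradient(x), numerical_gradient(f, x))
```

Both columns should agree up to floating-point rounding, which illustrates that the gradient at a point is the local slope of $f(x)$.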

Now that it is clear that the gradient is simply the derivative of the function, the next section computes the gradient for a 2D case.

Gradient of a 2D field

Computing the gradient of a 2D field is practically the same as in the 1D case, except that two variables now need to be taken into account. The gradient is computed by taking the partial derivative of the function with respect to each parameter. This operation is denoted by the gradient operator $\nabla$, which is nothing more than shorthand for taking these partial derivatives.

Let's consider the function $f(x_1, x_2) = \frac{1}{2}x_1^2 + \frac{1}{4}x_2^4$. The gradient of this function is a vector with one component per variable, each component being the corresponding partial derivative:

\begin{equation} \nabla f(x_1,x_2) = \begin{bmatrix} \frac{\partial f}{\partial x_1} \\ \frac{\partial f}{\partial x_2} \end{bmatrix} = \begin{bmatrix} x_1 \\ x_2^3 \end{bmatrix} \end{equation}

Figure 2 plots the function together with its gradient, and again shows that the gradient gives the rate of change of the original function $f(x_1, x_2)$. There is no rate of change at the origin, where both partial derivatives are zero, and the rate of change grows as $x_1$ or $x_2$ moves away from zero.
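The 2D case can be checked in the same way as the 1D case. The following is a rough sketch under stated assumptions: the function names and the step size `h` are hypothetical and do not come from the article. It evaluates the partial derivatives $\partial f / \partial x_1 = x_1$ and $\partial f / \partial x_2 = x_2^3$ and compares them with finite-difference estimates, perturbing one variable at a time.

```python
def f(x1, x2):
    """The 2D function from the text: f = 1/2 * x1^2 + 1/4 * x2^4."""
    return 0.5 * x1 ** 2 + 0.25 * x2 ** 4


def analytic_gradient(x1, x2):
    """Partial derivatives obtained by hand: (x1, x2^3)."""
    return (x1, x2 ** 3)


def numerical_gradient(func, x1, x2, h=1e-5):
    """Central differences, varying one variable while holding the other fixed."""
    df_dx1 = (func(x1 + h, x2) - func(x1 - h, x2)) / (2.0 * h)
    df_dx2 = (func(x1, x2 + h) - func(x1, x2 - h)) / (2.0 * h)
    return (df_dx1, df_dx2)


if __name__ == "__main__":
    for point in ((0.0, 0.0), (1.0, 2.0), (-3.0, 0.5)):
        print(point, analytic_gradient(*point), numerical_gradient(f, *point))
```

At the origin both components are zero, and each component grows in magnitude as its own variable moves away from zero.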

This article has shown how the gradient is computed and what it means with respect to the original function. The properties of the gradient can be exploited further. In gradient descent, for example, a minimum of a function is found by repeatedly stepping in the direction opposite to the gradient.
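To give an impression of how gradient descent uses the gradient, here is a minimal sketch applied to the 2D function $f(x_1, x_2) = \frac{1}{2}x_1^2 + \frac{1}{4}x_2^4$. The fixed step size, the iteration count, and the starting point are assumptions picked for illustration, not recommended settings, and practical implementations usually add a stopping criterion.

```python
def gradient(x1, x2):
    """Gradient of f = 1/2 * x1^2 + 1/4 * x2^4, i.e. (x1, x2^3)."""
    return (x1, x2 ** 3)


def gradient_descent(start, step_size=0.1, iterations=2000):
    """Plain fixed-step gradient descent (step_size and iterations are assumed values)."""
    x1, x2 = start
    for _ in range(iterations):
        g1, g2 = gradient(x1, x2)
        # Move against the gradient, i.e. in the direction of steepest decrease.
        x1 -= step_size * g1
        x2 -= step_size * g2
    return (x1, x2)


if __name__ == "__main__":
    # Hypothetical starting point; the minimum of f lies at (0, 0).
    print(gradient_descent((2.0, 1.5)))
```

The iterates move toward the minimum at the origin. Progress along $x_2$ is noticeably slower than along $x_1$, because the gradient component $x_2^3$ becomes very small near zero.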
