Mathwords

Gradient — Definition, Formula & Examples

The gradient is a vector that collects all the partial derivatives of a multivariable function, pointing in the direction where the function increases most rapidly. Its magnitude tells you how steep that increase is.

For a scalar-valued function $f: \mathbb{R}^n \to \mathbb{R}$ that is differentiable at a point $\mathbf{a}$, the gradient of $f$ at $\mathbf{a}$, denoted $\nabla f(\mathbf{a})$, is the vector in $\mathbb{R}^n$ whose $i$-th component is the partial derivative $\frac{\partial f}{\partial x_i}$ evaluated at $\mathbf{a}$. Geometrically, $\nabla f(\mathbf{a})$ is orthogonal to the level set of $f$ passing through $\mathbf{a}$ and points in the direction of greatest rate of increase of $f$.

Key Formula

$$\nabla f = \left\langle \frac{\partial f}{\partial x_1},\; \frac{\partial f}{\partial x_2},\; \dots,\; \frac{\partial f}{\partial x_n} \right\rangle$$
Where:
  • $f$ = A differentiable scalar-valued function of $n$ variables
  • $x_1, x_2, \dots, x_n$ = The independent variables of the function
  • $\nabla f$ = The gradient vector (read "del f" or "nabla f")

How It Works

To compute the gradient, take the partial derivative of the function with respect to each variable separately, then assemble those derivatives into a vector. At any given point, the gradient vector points "uphill": the direction you would walk to increase the function value as quickly as possible. Moving in the opposite direction, $-\nabla f$, gives the steepest descent. The directional derivative of $f$ in any unit direction $\mathbf{u}$ equals $\nabla f \cdot \mathbf{u}$, so the gradient encodes the rate of change in every direction at once. If the gradient is the zero vector at a point, that point is a critical point where no single direction produces a first-order change.
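The "differentiate in each variable, then assemble" recipe can be checked numerically. The sketch below (the helper names `grad` and `directional_derivative` are made up for this illustration, not from the article) approximates each partial derivative with a central difference and uses the dot-product formula for the directional derivative.

```python
def grad(f, p, h=1e-6):
    """Approximate each partial derivative of f at point p by a central difference."""
    g = []
    for i in range(len(p)):
        up = list(p); up[i] += h
        dn = list(p); dn[i] -= h
        g.append((f(up) - f(dn)) / (2 * h))
    return g

def directional_derivative(f, p, u):
    """Rate of change of f at p in the unit direction u: grad(f) . u."""
    return sum(gi * ui for gi, ui in zip(grad(f, p), u))

# f(x, y) = x^2 + 3xy has gradient <2x + 3y, 3x>, so grad at (1, 2) is <8, 3>
f = lambda v: v[0] ** 2 + 3 * v[0] * v[1]
print(grad(f, [1.0, 2.0]))                                 # approximately [8.0, 3.0]
print(directional_derivative(f, [1.0, 2.0], [1.0, 0.0]))   # partial wrt x: ~8.0
```

Taking `u = [1.0, 0.0]` recovers the partial derivative with respect to $x$, as expected from $\nabla f \cdot \mathbf{u}$.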

Worked Example

Problem: Find the gradient of $f(x, y, z) = 3x^2 + 2yz - z^3$ and evaluate it at the point $(1, -1, 2)$.
Step 1: Partial derivative with respect to x: Differentiate $f$ with respect to $x$, treating $y$ and $z$ as constants.
$$\frac{\partial f}{\partial x} = 6x$$
Step 2: Partial derivative with respect to y: Differentiate $f$ with respect to $y$, treating $x$ and $z$ as constants.
$$\frac{\partial f}{\partial y} = 2z$$
Step 3: Partial derivative with respect to z: Differentiate $f$ with respect to $z$, treating $x$ and $y$ as constants.
$$\frac{\partial f}{\partial z} = 2y - 3z^2$$
Step 4: Assemble and evaluate: Write the gradient vector and substitute $(1, -1, 2)$.
$$\nabla f = \langle 6x,\; 2z,\; 2y - 3z^2 \rangle \;\Rightarrow\; \nabla f(1,-1,2) = \langle 6,\; 4,\; -14 \rangle$$
Answer: $\nabla f(1, -1, 2) = \langle 6, 4, -14 \rangle$. This vector points in the direction of steepest increase of $f$ at that point, and its magnitude $\sqrt{36+16+196} = \sqrt{248} = 2\sqrt{62}$ gives the maximum rate of increase.
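The worked example can be verified numerically with central differences (a quick sanity-check sketch, not part of the original solution):

```python
# Numerical check of the worked example: f(x, y, z) = 3x^2 + 2yz - z^3 at (1, -1, 2).
f = lambda x, y, z: 3 * x**2 + 2 * y * z - z**3
x, y, z = 1.0, -1.0, 2.0
h = 1e-6

fx = (f(x + h, y, z) - f(x - h, y, z)) / (2 * h)   # analytic value: 6x = 6
fy = (f(x, y + h, z) - f(x, y - h, z)) / (2 * h)   # analytic value: 2z = 4
fz = (f(x, y, z + h) - f(x, y, z - h)) / (2 * h)   # analytic value: 2y - 3z^2 = -14

mag = (fx**2 + fy**2 + fz**2) ** 0.5               # should match sqrt(248) = 2*sqrt(62)
print(round(fx, 3), round(fy, 3), round(fz, 3), round(mag, 3))
```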

Another Example

Problem: Given $g(x,y) = x^2 + y^2$, find the direction of steepest ascent at the point $(3, 4)$ and the rate of increase in that direction.
Step 1: Compute the gradient: Take partial derivatives and form the vector.
$$\nabla g = \langle 2x,\; 2y \rangle \;\Rightarrow\; \nabla g(3,4) = \langle 6,\; 8 \rangle$$
Step 2: Find the direction (unit vector): Normalize the gradient to get a unit vector.
$$\|\nabla g\| = \sqrt{36 + 64} = 10, \quad \hat{\mathbf{u}} = \left\langle \tfrac{6}{10},\; \tfrac{8}{10} \right\rangle = \langle 0.6,\; 0.8 \rangle$$
Step 3: State the maximum rate of increase: The magnitude of the gradient equals the maximum directional derivative.
$$\text{Max rate of increase} = \|\nabla g(3,4)\| = 10$$
Answer: At $(3,4)$, the function increases fastest in the direction $\langle 0.6, 0.8 \rangle$ at a rate of 10 units per unit distance.
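The normalize-the-gradient step above can be sketched in a few lines (the variable names here are illustrative, not from the article):

```python
import math

# Steepest ascent for g(x, y) = x^2 + y^2 at (3, 4): the gradient is <2x, 2y>.
gx, gy = 2 * 3.0, 2 * 4.0     # analytic partials at (3, 4): <6, 8>
mag = math.hypot(gx, gy)      # magnitude = max rate of increase: 10
u = (gx / mag, gy / mag)      # unit direction of steepest ascent: (0.6, 0.8)
print(u, mag)
```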

Why It Matters

The gradient is central to Multivariable Calculus (Calc III) and appears in nearly every topic that follows — directional derivatives, tangent planes, Lagrange multipliers, and the divergence theorem. In machine learning, gradient descent uses $-\nabla f$ to iteratively minimize a loss function, making it the engine behind training neural networks. Physics and engineering rely on the gradient to describe electric fields ($\mathbf{E} = -\nabla V$), heat flow, and fluid pressure changes.
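To illustrate how stepping along $-\nabla f$ minimizes a function, here is a toy gradient-descent loop on the paraboloid $g(x,y) = x^2 + y^2$ from the example above (the step size and iteration count are arbitrary choices for this sketch, not a recipe from any particular library):

```python
# Toy gradient descent on g(x, y) = x^2 + y^2, whose gradient is <2x, 2y>
# and whose minimum is at the origin.
x, y = 3.0, 4.0        # starting point
lr = 0.1               # step size (arbitrary choice for this sketch)
for _ in range(100):
    x -= lr * 2 * x    # step against the gradient: x <- x - lr * dg/dx
    y -= lr * 2 * y    # y <- y - lr * dg/dy
print(round(x, 6), round(y, 6))   # both end up very close to 0
```

Each iteration multiplies the coordinates by $1 - 2 \cdot 0.1 = 0.8$, so the point contracts geometrically toward the minimum.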

Common Mistakes

Mistake: Writing the gradient as a scalar instead of a vector.
Correction: The gradient is always a vector. Each component is a partial derivative, and the result has the same number of components as the function has input variables.
Mistake: Confusing the gradient direction with the direction of the level curve.
Correction: The gradient is perpendicular to level curves (2D) or level surfaces (3D). It points away from the level set toward higher values, not along it.