
Hessian Matrix — Definition, Formula & Examples

The Hessian matrix is a square matrix of all second-order partial derivatives of a multivariable function. It tells you about the curvature of a function at a point and is the key tool for determining whether a critical point is a local minimum, local maximum, or saddle point.

For a scalar-valued function f: \mathbb{R}^n \to \mathbb{R} that is twice continuously differentiable, the Hessian matrix \mathbf{H}(f) is the n \times n symmetric matrix whose (i, j)-entry is \dfrac{\partial^2 f}{\partial x_i \, \partial x_j}. At a critical point where \nabla f = \mathbf{0}, the eigenvalues of \mathbf{H} determine the local behavior: all positive eigenvalues indicate a local minimum, all negative indicate a local maximum, and mixed signs indicate a saddle point.
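In the two-variable case the eigenvalue test above can be sketched directly, since a symmetric 2×2 matrix has closed-form eigenvalues. A minimal Python sketch (the function names are illustrative, not standard library calls):

```python
import math

def eigenvalues_2x2_symmetric(a, b, c):
    """Closed-form eigenvalues of the symmetric matrix [[a, b], [b, c]]."""
    mean = (a + c) / 2
    radius = math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    return mean - radius, mean + radius

def classify(a, b, c):
    """Classify a critical point from its Hessian [[a, b], [b, c]]."""
    lo, hi = eigenvalues_2x2_symmetric(a, b, c)
    if lo > 0:
        return "local minimum"   # all eigenvalues positive
    if hi < 0:
        return "local maximum"   # all eigenvalues negative
    if lo < 0 < hi:
        return "saddle point"    # mixed signs
    return "inconclusive"        # a zero eigenvalue

# Hessian of f(x, y) = x^2 + y^2 at any point is [[2, 0], [0, 2]]
print(classify(2, 0, 2))  # -> local minimum
```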

Key Formula

\mathbf{H}(f) = \begin{bmatrix} \dfrac{\partial^2 f}{\partial x_1^2} & \dfrac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_1 \partial x_n} \\ \dfrac{\partial^2 f}{\partial x_2 \partial x_1} & \dfrac{\partial^2 f}{\partial x_2^2} & \cdots & \dfrac{\partial^2 f}{\partial x_2 \partial x_n} \\ \vdots & \vdots & \ddots & \vdots \\ \dfrac{\partial^2 f}{\partial x_n \partial x_1} & \dfrac{\partial^2 f}{\partial x_n \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_n^2} \end{bmatrix}
Where:
  • f = A twice continuously differentiable scalar function of n variables
  • x_i = The i-th independent variable
  • n = The number of independent variables
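When the second partials are tedious to derive by hand, the matrix above can be approximated numerically. A minimal sketch using central finite differences (the helper name `hessian` and the step size `h` are illustrative choices, not a standard API):

```python
def hessian(f, x, h=1e-4):
    """Approximate the Hessian of f at point x (a list of floats)
    using central finite differences for each pair of variables."""
    n = len(x)

    def fp(i, di, j, dj):
        # Evaluate f at x shifted by di along axis i and dj along axis j.
        p = list(x)
        p[i] += di
        p[j] += dj
        return f(p)

    return [[(fp(i, h, j, h) - fp(i, h, j, -h)
              - fp(i, -h, j, h) + fp(i, -h, j, -h)) / (4 * h * h)
             for j in range(n)] for i in range(n)]

# f(x, y) = x^2 + x*y + y^2 - 3*x, whose exact Hessian is [[2, 1], [1, 2]]
f = lambda p: p[0] ** 2 + p[0] * p[1] + p[1] ** 2 - 3 * p[0]
print(hessian(f, [2.0, -1.0]))  # entries close to [[2, 1], [1, 2]]
```

Because the mixed-difference stencil reduces to an ordinary central difference when i = j, one formula covers both diagonal and off-diagonal entries.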

How It Works

First, find all critical points by setting the gradient \nabla f = \mathbf{0}. Then compute every second-order partial derivative and arrange them into the Hessian matrix. For a function of two variables, the second derivative test uses the determinant D = f_{xx} f_{yy} - (f_{xy})^2. If D > 0 and f_{xx} > 0, the point is a local minimum; if D > 0 and f_{xx} < 0, it is a local maximum; if D < 0, it is a saddle point. When D = 0, the test is inconclusive.
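The two-variable test just described translates directly into code. A small sketch (`second_derivative_test` is an illustrative name):

```python
def second_derivative_test(fxx, fyy, fxy):
    """Classify a critical point of a two-variable function from
    its second partials, using D = fxx*fyy - fxy**2."""
    D = fxx * fyy - fxy ** 2
    if D > 0:
        return "local minimum" if fxx > 0 else "local maximum"
    if D < 0:
        return "saddle point"
    return "inconclusive"  # D == 0: the test gives no information

# f(x, y) = x^2 - y^2 has a saddle at the origin: fxx = 2, fyy = -2, fxy = 0
print(second_derivative_test(2, -2, 0))  # -> saddle point
```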

Worked Example

Problem: Find the critical point of f(x, y) = x² + xy + y² − 3x and classify it.
Find critical point: Compute the partial derivatives and set them to zero.
f_x = 2x + y - 3 = 0, \quad f_y = x + 2y = 0
Solve the system: From the second equation, x = −2y. Substituting into the first: 2(−2y) + y − 3 = 0, so −3y = 3, giving y = −1 and x = 2.
(x, y) = (2, -1)
Build the Hessian: Compute the second-order partial derivatives: f_xx = 2, f_yy = 2, f_xy = 1.
\mathbf{H} = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}
Apply the second derivative test: Compute D = f_xx · f_yy − (f_xy)² = 4 − 1 = 3. Since D > 0 and f_xx = 2 > 0, the critical point is a local minimum.
D = (2)(2) - (1)^2 = 3 > 0
Answer: The point (2, −1) is a local minimum of f.
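The arithmetic in this example can be checked in a few lines of Python, solving the gradient system by Cramer's rule (an equivalent route to the substitution above):

```python
# Gradient equations: 2x + y = 3 and x + 2y = 0, a 2x2 linear system.
# Solve by Cramer's rule; the coefficient determinant is 2*2 - 1*1 = 3.
det = 2 * 2 - 1 * 1
x = (3 * 2 - 1 * 0) / det   # = 2.0
y = (2 * 0 - 1 * 3) / det   # = -1.0

# Second derivative test at the critical point:
fxx, fyy, fxy = 2, 2, 1
D = fxx * fyy - fxy ** 2    # = 3 > 0, and fxx > 0, so a local minimum
print(x, y, D)  # -> 2.0 -1.0 3
```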

Why It Matters

The Hessian matrix is central to optimization in machine learning, economics, and engineering. Newton's method for multivariable optimization uses the inverse of the Hessian to iteratively find minima. In statistics, inverting the negative Hessian of a log-likelihood function at its maximum (the observed information matrix) yields standard errors for parameter estimates.
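For intuition, a single Newton step x − H⁻¹∇f(x) can be sketched in the two-variable case (the helper below is illustrative; it inverts the 2×2 Hessian analytically). Because the Hessian of a quadratic is constant, one step lands exactly on the minimizer:

```python
def newton_step(grad, H, x):
    """One Newton step x - H^{-1} grad for a two-variable function,
    inverting the 2x2 Hessian [[a, b], [c, d]] analytically."""
    (a, b), (c, d) = H
    det = a * d - b * c
    gx, gy = grad
    # H^{-1} = (1/det) * [[d, -b], [-c, a]]
    return (x[0] - (d * gx - b * gy) / det,
            x[1] - (-c * gx + a * gy) / det)

# f(x, y) = x^2 + x*y + y^2 - 3*x: gradient (2x + y - 3, x + 2y),
# constant Hessian [[2, 1], [1, 2]]. Start from the origin.
x = (0.0, 0.0)
g = (2 * x[0] + x[1] - 3, x[0] + 2 * x[1])
x = newton_step(g, [[2, 1], [1, 2]], x)
print(x)  # -> (2.0, -1.0), the minimizer found in the worked example
```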

Common Mistakes

Mistake: Forgetting to verify that the gradient is zero before applying the Hessian test.
Correction: The second derivative test only classifies critical points. You must first confirm that all first-order partial derivatives equal zero at the point in question.