Hessian Matrix — Definition, Formula & Examples
The Hessian matrix is a square matrix of all second-order partial derivatives of a multivariable function. It tells you about the curvature of a function at a point and is the key tool for determining whether a critical point is a local minimum, local maximum, or saddle point.
For a scalar-valued function f(x₁, …, xₙ) that is twice continuously differentiable, the Hessian matrix H(f) is the symmetric n × n matrix whose (i, j)-entry is ∂²f/∂xᵢ∂xⱼ. At a critical point where ∇f = 0, the eigenvalues of H determine the local behavior: all positive eigenvalues indicate a local minimum, all negative indicate a local maximum, and mixed signs indicate a saddle point.
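The eigenvalue classification above can be sketched numerically — a minimal example assuming NumPy, using the illustrative function f(x, y) = x² − y², whose Hessian at the critical point (0, 0) is constant:

```python
import numpy as np

# Hessian of f(x, y) = x^2 - y^2 at the origin, where the gradient vanishes.
H = np.array([[2.0, 0.0],
              [0.0, -2.0]])

# Eigenvalues of a symmetric matrix are real; eigvalsh exploits symmetry
# and returns them in ascending order.
eigenvalues = np.linalg.eigvalsh(H)

if np.all(eigenvalues > 0):
    kind = "local minimum"
elif np.all(eigenvalues < 0):
    kind = "local maximum"
elif np.any(eigenvalues > 0) and np.any(eigenvalues < 0):
    kind = "saddle point"
else:
    kind = "inconclusive (some eigenvalue is zero)"
```

Here the eigenvalues are −2 and 2, so the origin is a saddle point.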
Key Formula
H(f)ᵢⱼ = ∂²f/∂xᵢ∂xⱼ,  for i, j = 1, …, n
Where:
- f = A twice continuously differentiable scalar function of n variables
- xᵢ = The i-th independent variable
- n = The number of independent variables
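The entries ∂²f/∂xᵢ∂xⱼ can be computed symbolically — a short sketch assuming SymPy, with an illustrative two-variable function:

```python
import sympy as sp

x, y = sp.symbols("x y")
f = x**2 + x*y + y**2  # illustrative function of two variables

# sp.hessian builds the matrix of second-order partials d²f/dxi dxj
H = sp.hessian(f, (x, y))
```

For this quadratic, H is the constant matrix [[2, 1], [1, 2]]; for a non-quadratic f, the entries would be expressions in x and y.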
How It Works
First, find all critical points by setting the gradient ∇f = 0. Then compute every second-order partial derivative and arrange them into the Hessian matrix. For a function of two variables, the second derivative test uses the determinant D = f_xx · f_yy − (f_xy)². If D > 0 and f_xx > 0, the point is a local minimum; if D > 0 and f_xx < 0, it is a local maximum; if D < 0, it is a saddle point. When D = 0, the test is inconclusive.
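The two-variable test reduces to a few comparisons — a minimal sketch, with the function name chosen for illustration:

```python
def second_derivative_test(fxx, fyy, fxy):
    """Classify a critical point of a two-variable function from its
    second-order partials at that point (the gradient is assumed to
    already be zero there)."""
    D = fxx * fyy - fxy**2
    if D > 0 and fxx > 0:
        return "local minimum"
    if D > 0 and fxx < 0:
        return "local maximum"
    if D < 0:
        return "saddle point"
    return "inconclusive"  # D == 0
```

For example, second_derivative_test(2, 2, 1) gives "local minimum", matching the worked example below this point in the article.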
Worked Example
Problem: Classify the critical point of f(x, y) = x² + xy + y² − 3x.
Find the critical point: Compute the first-order partial derivatives and set them to zero: f_x = 2x + y − 3 = 0 and f_y = x + 2y = 0.
Solve the system: From the second equation, x = −2y. Substituting into the first: 2(−2y) + y − 3 = 0, so −3y = 3, giving y = −1 and x = 2.
Build the Hessian: Compute the second-order partial derivatives: f_xx = 2, f_yy = 2, f_xy = 1.
Apply the second derivative test: Compute D = f_xx · f_yy − (f_xy)² = 4 − 1 = 3. Since D > 0 and f_xx = 2 > 0, the critical point is a local minimum.
Answer: The point (2, −1) is a local minimum of f.
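The worked example can be checked symbolically — a sketch assuming SymPy:

```python
import sympy as sp

x, y = sp.symbols("x y")
f = x**2 + x*y + y**2 - 3*x

# Solve the first-order conditions for the critical point
grad = [sp.diff(f, v) for v in (x, y)]
critical = sp.solve(grad, (x, y))  # the linear system from the example

# Build the Hessian and its determinant D = fxx*fyy - (fxy)^2
H = sp.hessian(f, (x, y))
D = H.det()
```

This reproduces the values above: the critical point (2, −1), the Hessian [[2, 1], [1, 2]], and D = 3 > 0 with f_xx = 2 > 0.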
Why It Matters
The Hessian matrix is central to optimization in machine learning, economics, and engineering. Newton's method for multivariable optimization uses the inverse of the Hessian to iteratively find minima. In statistics, the Hessian of a log-likelihood function yields standard errors for parameter estimates.
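A Newton step can be sketched numerically, reusing the gradient and Hessian of the worked example's f; this is a minimal illustration assuming NumPy, and in practice one solves the linear system rather than forming the Hessian inverse explicitly:

```python
import numpy as np

def grad_f(p):
    # Gradient of f(x, y) = x^2 + x*y + y^2 - 3x from the worked example
    x, y = p
    return np.array([2*x + y - 3, x + 2*y])

def hess_f(p):
    # The Hessian of this quadratic f is constant
    return np.array([[2.0, 1.0],
                     [1.0, 2.0]])

def newton_step(p):
    # p_next = p - H^{-1} grad(p), computed via a linear solve
    return p - np.linalg.solve(hess_f(p), grad_f(p))

p = newton_step(np.array([0.0, 0.0]))
```

Because f is quadratic, a single Newton step from any starting point lands exactly on the minimum (2, −1); for general functions the step is iterated until the gradient is close to zero.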
Common Mistakes
Mistake: Forgetting to verify that the gradient is zero before applying the Hessian test.
Correction: The second derivative test only classifies critical points. You must first confirm that all first-order partial derivatives equal zero at the point in question.
