Cost Function
A cost function is a formula that takes in one or more variables and outputs a single number representing how far off a result is from what you want. The goal in optimization is usually to minimize this value.
A cost function is a mapping $J: \Theta \to \mathbb{R}$ from a set of parameters to the real numbers, where the output quantifies the error or expense associated with a particular choice of parameters. In optimization and machine learning, the objective is typically to find the parameter values that minimize $J(\theta)$. Cost functions are also called loss functions or objective functions depending on the context.
Key Formula

$$J(\theta) = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2$$

Where:
- $J(\theta)$ = the cost function (total error)
- $n$ = the number of data points
- $y_i$ = the actual (observed) value for data point $i$
- $\hat{y}_i$ = the predicted value for data point $i$, given parameters $\theta$
- $\theta$ = the parameter(s) being optimized
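The formula above translates directly into code. A minimal sketch, using plain Python and a hypothetical helper name `mse_cost`:

```python
def mse_cost(y_actual, y_predicted):
    """Mean squared error: the average of the squared differences
    between actual and predicted values."""
    if len(y_actual) != len(y_predicted):
        raise ValueError("inputs must have the same length")
    n = len(y_actual)
    return sum((y - y_hat) ** 2 for y, y_hat in zip(y_actual, y_predicted)) / n

# Perfect predictions give zero cost
print(mse_cost([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # 0.0
```

Because every squared term is non-negative, the cost is zero exactly when every prediction matches its observed value.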
Worked Example
Problem: You are fitting the line $\hat{y} = 1.5x$ to three data points: $(1, 2)$, $(2, 3)$, and $(3, 5)$. Calculate the mean squared error cost.
Step 1: Compute the predicted value for each data point using $\hat{y}_i = 1.5x_i$. The predictions are $1.5$, $3.0$, and $4.5$.
Step 2: Find the error (difference) between each actual value $y_i$ and the predicted value $\hat{y}_i$: $2 - 1.5 = 0.5$, $3 - 3.0 = 0$, and $5 - 4.5 = 0.5$.
Step 3: Square each error: $0.25$, $0$, and $0.25$.
Step 4: Sum the squared errors and divide by the number of data points $n = 3$: $J = (0.25 + 0 + 0.25)/3 = 0.5/3$.
Answer: The mean squared error cost is approximately $0.167$. A different slope or intercept might produce a lower cost, which is the whole point of optimization: adjusting parameters to make $J(\theta)$ as small as possible.
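The four steps of a worked MSE calculation can be traced in code. This sketch assumes the illustrative points $(1, 2)$, $(2, 3)$, $(3, 5)$ and the candidate line $\hat{y} = 1.5x$:

```python
# Illustrative data: three (x, y) points and a candidate line y_hat = 1.5 * x
points = [(1, 2), (2, 3), (3, 5)]
theta = 1.5  # the slope being evaluated

# Step 1: predicted value for each data point
preds = [theta * x for x, _ in points]                         # [1.5, 3.0, 4.5]
# Step 2: error between actual and predicted
errors = [y - y_hat for (_, y), y_hat in zip(points, preds)]   # [0.5, 0.0, 0.5]
# Step 3: square each error
squared = [e ** 2 for e in errors]                             # [0.25, 0.0, 0.25]
# Step 4: average the squared errors
cost = sum(squared) / len(points)
print(round(cost, 3))  # 0.167
```

Changing `theta` and re-running shows how the cost responds to different candidate slopes.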
Visualization
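For a single parameter, plotting $J(\theta)$ against $\theta$ produces a bowl-shaped curve whose lowest point is the best-fit value. A minimal sketch that traces that curve numerically, assuming the same illustrative data points as above:

```python
# Trace the cost curve J(theta) for the line y_hat = theta * x
# over illustrative data; the minimum sits at the best-fit slope.
points = [(1, 2), (2, 3), (3, 5)]

def cost(theta):
    return sum((y - theta * x) ** 2 for x, y in points) / len(points)

slopes = [i / 10 for i in range(0, 31)]   # candidate slopes 0.0 .. 3.0
costs = [cost(t) for t in slopes]
best = slopes[costs.index(min(costs))]
print(best)  # 1.6 -- the grid slope with the lowest cost
```

Plotting `slopes` against `costs` (for example with matplotlib) would show the characteristic bowl shape, with the cost rising as the slope moves away from the best fit in either direction.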
Why It Matters
Cost functions are central to machine learning and data science. When a computer "learns" from data — for example, training a model to recognize images or predict prices — it repeatedly evaluates a cost function and adjusts its parameters to reduce the error. Without a well-defined cost function, there would be no way to measure whether one solution is better than another.
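The "evaluate the cost, then adjust the parameters" loop described above is, in its simplest form, gradient descent. A minimal sketch for a one-parameter line $\hat{y} = \theta x$; the data, starting guess, and learning rate are all illustrative:

```python
points = [(1, 2), (2, 3), (3, 5)]
theta = 0.0   # initial guess for the slope
lr = 0.01     # learning rate (step size)

for _ in range(500):
    # Gradient of MSE with respect to theta: (2/n) * sum(-(y - theta*x) * x)
    n = len(points)
    grad = (2 / n) * sum(-(y - theta * x) * x for x, y in points)
    theta -= lr * grad  # step downhill to reduce the cost

print(round(theta, 2))  # 1.64 -- converges to the best-fit slope
```

Each iteration nudges `theta` in the direction that decreases the cost, which is exactly the sense in which the model "learns" from the data.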
Common Mistakes
Mistake: Forgetting to square the errors, so positive and negative differences cancel out.
Correction: In mean squared error, each difference is squared before summing. This ensures that an error of $-2$ counts just as much as an error of $+2$, and the total cost is always non-negative.
Mistake: Confusing the cost function with the model itself.
Correction: The model makes predictions; the cost function measures how bad those predictions are. They are two separate things — you adjust the model's parameters to reduce the cost.
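The separation between model and cost function shows up naturally in code as two distinct functions; the names and data here are hypothetical:

```python
# The model makes predictions from an input and the current parameters...
def model(x, theta):
    return theta * x

# ...while the cost function only scores those predictions against the data.
def cost(theta, data):
    return sum((y - model(x, theta)) ** 2 for x, y in data) / len(data)

data = [(1, 2), (2, 4)]
print(cost(2.0, data))  # 0.0 -- a perfect slope gives zero cost
print(cost(1.0, data))  # 2.5 -- a worse slope gives a higher cost
```

Optimization adjusts the argument passed into `model` (the parameters), while `cost` itself never changes: it is the fixed yardstick by which candidate parameters are compared.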
