The Math Behind It
Support Vector Machines (SVMs) are among the most powerful and popular algorithms in Machine Learning. Known for their ability to handle classification tasks with high accuracy, SVMs are widely used in text classification, bioinformatics, and image recognition.
But to truly understand SVMs, let’s break down the math step by step.
What is an SVM?
At its core, an SVM tries to find the best boundary (hyperplane) that separates data points of different classes.
- For 2D data, this hyperplane is just a line.
- For 3D data, it’s a plane.
- In higher dimensions, it’s called a hyperplane.
The best hyperplane is the one that maximizes the margin - the distance between the hyperplane and the nearest data points (called support vectors).
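To make this concrete, here is a minimal sketch (assuming scikit-learn is installed; the tiny 2D dataset is made up purely for illustration) that fits a linear SVM and prints the support vectors it ends up using:

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical toy dataset: two small 2D clusters, labeled -1 and +1
X = np.array([[1.0, 1.0], [1.5, 2.0], [2.0, 1.5],
              [4.0, 4.0], [4.5, 5.0], [5.0, 4.5]])
y = np.array([-1, -1, -1, 1, 1, 1])

# A linear SVM with a very large C behaves like the hard-margin formulation
clf = SVC(kernel="linear", C=1e6).fit(X, y)

print("support vectors:")
print(clf.support_vectors_)              # the points closest to the boundary
print("w =", clf.coef_[0], "b =", clf.intercept_[0])
```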
The Math of SVM
1. The Hyperplane Equation
A hyperplane in n dimensions can be written as:

\[ w \cdot x + b = 0 \]

where:
- w → weight vector (normal to the hyperplane)
- x → input vector
- b → bias term
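For intuition, here is a small sketch (the weight vector, bias, and point are made-up numbers) showing how a point is plugged into the hyperplane equation:

```python
import numpy as np

w = np.array([2.0, -1.0])   # hypothetical weight vector (normal to the hyperplane)
b = -3.0                    # hypothetical bias term
x = np.array([4.0, 1.0])    # hypothetical input vector

# w · x + b = 0 exactly on the hyperplane; the sign tells us which side x is on
value = np.dot(w, x) + b
print(value)                # 2*4 - 1*1 - 3 = 4.0, so x lies on the positive side
```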
2. The Classification Rule
For a data point x, compute the decision value:

\[ f(x) = w \cdot x + b \]

- If f(x) > 0 → class +1
- If f(x) < 0 → class -1
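In code, the rule is just the sign of the decision value. A minimal sketch, reusing the made-up w and b from above:

```python
import numpy as np

w = np.array([2.0, -1.0])   # hypothetical weights
b = -3.0                    # hypothetical bias

def classify(X):
    """Return +1 or -1 for each row of X, according to sign(w · x + b)."""
    return np.sign(X @ w + b).astype(int)

X_new = np.array([[4.0, 1.0],    # decision value  4.0 -> +1
                  [0.0, 1.0]])   # decision value -4.0 -> -1
print(classify(X_new))           # [ 1 -1]
```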
3. Margin Maximization
The distance from the hyperplane to the closest points on either side is \( 1 / \lVert w \rVert \), so the margin (the width of the gap between the two classes) is:

\[ \text{margin} = \frac{2}{\lVert w \rVert} \]

SVM tries to maximize this margin, which means minimizing \( \lVert w \rVert \).
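A one-line check of that formula, for a made-up weight vector:

```python
import numpy as np

w = np.array([2.0, -1.0])            # hypothetical weight vector
margin = 2.0 / np.linalg.norm(w)     # margin = 2 / ||w||
print(margin)                        # ≈ 0.894; shrinking ||w|| widens the margin
```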
4. The Optimization Problem
Formally, we solve:

\[ \min_{w,\,b} \ \frac{1}{2} \lVert w \rVert^2 \]

subject to:

\[ y_i \,(w \cdot x_i + b) \ge 1 \quad \text{for all } i \]
This is a convex optimization problem, which means there are no bad local optima: the solution we find is the global one.
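To make the optimization concrete, here is a sketch of the hard-margin problem written directly with CVXPY (assuming the cvxpy package and the same made-up toy data as above); this is purely for illustration, not how production SVM solvers work:

```python
import cvxpy as cp
import numpy as np

X = np.array([[1.0, 1.0], [1.5, 2.0], [2.0, 1.5],
              [4.0, 4.0], [4.5, 5.0], [5.0, 4.5]])
y = np.array([-1, -1, -1, 1, 1, 1])

w = cp.Variable(2)
b = cp.Variable()

# minimize (1/2)||w||^2  subject to  y_i (w · x_i + b) >= 1
objective = cp.Minimize(0.5 * cp.sum_squares(w))
constraints = [cp.multiply(y, X @ w + b) >= 1]
cp.Problem(objective, constraints).solve()

print("w =", w.value, "b =", b.value)
```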
5. Soft Margin SVM
Real-world data isn’t always perfectly separable. That’s where slack variables \( \xi_i \ge 0 \) come in, allowing some misclassifications:

\[ y_i \,(w \cdot x_i + b) \ge 1 - \xi_i \]

And the objective becomes:

\[ \min_{w,\,b,\,\xi} \ \frac{1}{2} \lVert w \rVert^2 + C \sum_i \xi_i \]

where C controls the tradeoff between maximizing the margin and minimizing classification error.
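In practice you rarely solve this by hand; libraries expose C directly. A small sketch (assuming scikit-learn, with a made-up dataset where one point overlaps the other class) comparing a lenient and a strict setting of C:

```python
import numpy as np
from sklearn.svm import SVC

X = np.array([[1.0, 1.0], [1.5, 2.0], [2.0, 1.5],
              [4.0, 4.0], [4.5, 5.0], [2.5, 2.5]])   # last point sits near the other class
y = np.array([-1, -1, -1, 1, 1, 1])

for C in (0.1, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    # Smaller C tolerates more slack (wider margin, more support vectors);
    # larger C punishes violations harder (narrower margin, fewer support vectors).
    print(f"C={C}: {len(clf.support_vectors_)} support vectors, w={clf.coef_[0]}")
```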