SVM and Kernels
In Part 1, we saw how SVM separates classes with a hyperplane. But what if the data isn’t linearly separable?
Example: Imagine classifying points arranged in concentric circles. No straight line (hyperplane) can separate them.
This is where Kernels become powerful.
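To see this concretely, here is a minimal sketch using scikit-learn (assumed installed): a linear SVM scores near chance on concentric circles, while an SVM with the RBF kernel introduced below separates them almost perfectly.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Two concentric circles: no straight line can separate the classes
X, y = make_circles(n_samples=500, factor=0.3, noise=0.05, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for kernel in ("linear", "rbf"):
    clf = SVC(kernel=kernel).fit(X_train, y_train)
    print(kernel, "accuracy:", clf.score(X_test, y_test))
# The linear SVM scores near chance (~0.5); the RBF SVM is near-perfect.
```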
What is a Kernel?
A kernel is a mathematical function that allows SVM to work in higher-dimensional spaces without explicitly computing the coordinates in that space.
This trick is called the Kernel Trick.
Instead of mapping data explicitly into a higher-dimensional space through some feature map φ(x), kernels compute the dot product in that space directly:

K(x, z) = φ(x) · φ(z)
This makes computation efficient and feasible even in very high dimensions.
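To make this concrete, consider the degree-2 polynomial kernel K(x, z) = (x · z)². For 2-D inputs its explicit feature map is φ(x) = (x₁², √2·x₁x₂, x₂²); the sketch below (plain NumPy) checks that the kernel returns the same value as the dot product in the mapped 3-D space, without ever constructing φ.

```python
import numpy as np

def phi(v):
    # Explicit degree-2 feature map for a 2-D vector
    return np.array([v[0]**2, np.sqrt(2) * v[0] * v[1], v[1]**2])

x = np.array([1.0, 2.0])
z = np.array([3.0, 4.0])

explicit = phi(x) @ phi(z)   # dot product in the mapped 3-D space
kernel   = (x @ z) ** 2      # kernel trick: never computes phi
print(explicit, kernel)      # both give 121 (up to floating-point rounding)
```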
Common Kernels
1. Linear Kernel
- Formula: K(x, z) = x · z (the plain dot product).
- Used when data is linearly separable.
- Fast and simple.
2. Polynomial Kernel
- Formula: K(x, z) = (x · z + c)^d.
- Allows curved decision boundaries.
- The degree d controls flexibility.
3. Radial Basis Function (RBF) / Gaussian Kernel
- Formula: K(x, z) = exp(−γ ‖x − z‖²).
- Most widely used kernel.
- Handles complex, non-linear boundaries.
- γ (gamma) controls how far the influence of a single training point reaches (see the sketch below).
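A quick way to see the effect of γ is to compare training accuracy at a few settings (a sketch using scikit-learn's make_moons toy dataset, chosen here for illustration): small values give smooth, far-reaching influence, while very large values let each point dominate only its immediate neighborhood and can overfit.

```python
from sklearn.datasets import make_moons
from sklearn.svm import SVC

X, y = make_moons(n_samples=300, noise=0.2, random_state=0)
for gamma in (0.01, 1.0, 100.0):
    clf = SVC(kernel="rbf", gamma=gamma).fit(X, y)
    # Training accuracy rises with gamma; very large gamma memorizes points
    print(gamma, clf.score(X, y))
```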
4. Sigmoid Kernel
- Formula: K(x, z) = tanh(γ x · z + c).
- Inspired by neural networks (acts like an activation function).
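In scikit-learn, all four kernels are selected through SVC's kernel parameter. As a rough sketch of the correspondence (not an exhaustive reference), degree, gamma, and coef0 map to d, γ, and c in the formulas above:

```python
from sklearn.svm import SVC

models = {
    "linear":  SVC(kernel="linear"),
    "poly":    SVC(kernel="poly", degree=3, coef0=1.0),          # (x.z + c)^d
    "rbf":     SVC(kernel="rbf", gamma="scale"),                 # exp(-gamma * ||x - z||^2)
    "sigmoid": SVC(kernel="sigmoid", gamma="scale", coef0=0.0),  # tanh(gamma * x.z + c)
}
```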
Choosing the Right Kernel
- Linear Kernel: when features are already linearly separable or the dataset is very large.
- Polynomial Kernel: when interactions between features are important.
- RBF Kernel: the default choice when unsure; works well in most cases.
- Sigmoid Kernel: rarely used, but works in some neural-network-like cases.
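If none of these rules of thumb clearly applies, cross-validation can pick the kernel empirically. Here is a minimal sketch with scikit-learn's GridSearchCV, assuming a feature matrix X and label vector y are already defined:

```python
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

param_grid = {
    "kernel": ["linear", "poly", "rbf", "sigmoid"],
    "C": [0.1, 1, 10],
}
search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X, y)  # X, y: your training data (assumed defined)
print(search.best_params_)
```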