Classification Modeling with Support Vector Machine (SVM)

Jeff Accomando
5 min read · Nov 15, 2020

“That model shows class!” (image by Yane Mode)

One of the many cool things about being a data scientist is modeling! No, not that kind of modeling. I mean the kind that uses algorithms to predict outcomes. Although fashion modeling is also very cool.

As you probably already know, a classification model tries to draw conclusions from observed values. A few examples: will a person default on a loan or pay it back? Will someone who contracts COVID-19 die or survive? There are countless other ways a classification model can help to predict, or “classify,” an outcome.

There are also many different methods available to data scientists to do this. Some of the more popular classification models are Logistic Regression, Decision Trees, and K-Nearest Neighbors.

I am going to focus on Support Vector Machine (SVM), which I think is a fun and slightly different kind of classification model! In simple terms, SVM finds a decision boundary known as a hyperplane, which separates, or classifies, the data. (With just two features, the hyperplane is simply a line.)

To demonstrate SVM visually, we will first import some basic libraries and tools for generating and plotting some data.
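The original snippet isn’t reproduced here, so below is a minimal sketch of the setup, assuming scikit-learn’s make_blobs for the synthetic data (in the spirit of the Python Data Science Handbook excerpt credited at the end of this post):

```python
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs

# Generate 100 points in two well-separated clusters.
# random_state and cluster_std are illustrative choices, not the post's exact values.
X, y = make_blobs(n_samples=100, centers=2, random_state=0, cluster_std=0.60)

# Scatter plot, one color per class
plt.scatter(X[:, 0], X[:, 1], c=y, s=50, cmap='winter')
plt.show()
```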

Here you can see a simple scatter plot of 100 data points. For illustrative purposes, the scatter plot shows two very distinct groupings of data. Our task is to fit a line that separates the data represented by the blue dots from the data represented by the green dots so that we can predict where future data will land.

However, looking at the plot above, we could probably fit many different lines, which makes it hard to know how any one of them would classify new data.
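To make that concrete, here is a rough sketch, continuing with X and y from the snippet above; the slopes and intercepts are hand-picked values that happen to separate these particular blobs:

```python
import numpy as np
import matplotlib.pyplot as plt

xfit = np.linspace(-1, 3.5)
plt.scatter(X[:, 0], X[:, 1], c=y, s=50, cmap='winter')

# Three hand-picked lines that all separate the two clusters
for m, b in [(1, 0.65), (0.5, 1.6), (-0.2, 2.9)]:
    plt.plot(xfit, m * xfit + b, '-k')

plt.xlim(-1, 3.5)
plt.show()
```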

SVM’s job is to find the hyperplane with the maximum margin, or the maximum distance between data points of both classes. Maximizing the margin distance enables more confident classification of future data points.

Possible Hyperplanes and Extreme Data Points

SVM looks at the extremes of the data set and draws the decision boundary based on those extreme points. The extreme data points are the support vectors, which give the algorithm its name. Can you guess which points are “extreme”?

Let’s create and visualize a Support Vector Model to identify those extreme support vectors!
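Since the post’s own snippet isn’t shown, here is a sketch of the idea: fit a linear-kernel SVC on the blobs from above, then draw the decision boundary, the margins, and the support vectors. The grid size and the large C (an essentially hard margin) are illustrative choices:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.svm import SVC

# Fit a linear support vector classifier; a large C means a "hard" margin
clf = SVC(kernel='linear', C=1e10)
clf.fit(X, y)

plt.scatter(X[:, 0], X[:, 1], c=y, s=50, cmap='winter')
ax = plt.gca()
xlim, ylim = ax.get_xlim(), ax.get_ylim()

# Evaluate the decision function on a 30x30 grid
xx, yy = np.meshgrid(np.linspace(*xlim, 30), np.linspace(*ylim, 30))
Z = clf.decision_function(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

# Decision boundary (level 0) and the two margins (levels -1 and +1)
ax.contour(xx, yy, Z, colors='k', levels=[-1, 0, 1],
           linestyles=['--', '-', '--'])

# Circle the support vectors
ax.scatter(clf.support_vectors_[:, 0], clf.support_vectors_[:, 1],
           s=200, facecolors='none', edgecolors='k')
plt.show()
```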

Optimal Hyperplane, Maximum Margin and Support Vectors

Here we see a visualization of the optimal hyperplane, the maximum margin, and the support vectors.

SVM finds the points closest to the line from both classes. These are the support vectors. Then it computes the distance between the line and the support vectors. This distance is called the margin, and the goal is to find the optimal hyperplane by maximizing that margin.
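For a linear kernel you can read this off the fitted model directly: with weight vector w, the margin width is 2 / ||w||. A quick check, using the clf fitted above:

```python
import numpy as np

# The support vectors the fit identified
print(clf.support_vectors_)

# For a linear SVM, the margin width equals 2 / ||w||
w = clf.coef_[0]
print("margin width:", 2 / np.linalg.norm(w))
```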

Simply put, hyperplanes are decision boundaries that help classify the data points: points falling on either side of the hyperplane can be attributed to different classes.

Metrics

Just like with any model, we want to know how accurate our model is at predicting outcomes. Scikit-learn’s metrics work on Support Vector Models the same way they do on most other models.
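The post’s metrics snippet isn’t shown, so here is a minimal sketch of the usual workflow: hold out a test set, predict, and score with scikit-learn’s standard tools:

```python
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, classification_report

# Hold out a quarter of the blob data for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

model = SVC(kernel='linear').fit(X_train, y_train)
y_pred = model.predict(X_test)

print("accuracy:", accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred))
```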

Note that the scores are unrealistically high because the toy data is so cleanly separated, but hopefully you get the point.

Pros and Cons of Support Vector Models

Pros:

  • SVM is quite effective in high dimensional spaces thanks to the kernel trick (see the sketch just after this list)
  • SVM works well when there is a clear margin of separation between classes, and maximizing the margin on both sides makes it less likely to overfit
  • SVM is effective where the number of dimensions is greater than the number of samples
  • SVM is memory efficient
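Since the kernel trick came up in the first bullet, here is a quick, self-contained sketch of it: concentric circles that no straight line can separate become easy once an RBF kernel implicitly lifts the data into a higher-dimensional space. The make_circles parameters are illustrative:

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Concentric circles: no straight line separates these two classes
X_c, y_c = make_circles(n_samples=100, factor=0.3, noise=0.05, random_state=0)

linear = SVC(kernel='linear').fit(X_c, y_c)
rbf = SVC(kernel='rbf').fit(X_c, y_c)

print("linear kernel accuracy:", linear.score(X_c, y_c))  # roughly chance
print("RBF kernel accuracy:", rbf.score(X_c, y_c))        # near perfect
```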

Cons:

  • SVM doesn’t work well with large data sets
  • SVM does not perform well with very noisy data
  • SVM can underperform when the number of features far exceeds the number of training samples, unless the kernel and regularization are chosen carefully
  • SVM does not provide probability estimates directly; they are computed via an internal cross-validation, which can be costly
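On that last point: SVC can still report probabilities if you ask, but scikit-learn fits them with Platt scaling, which runs an extra internal cross-validation during training. A sketch, reusing the train/test split from the metrics section:

```python
from sklearn.svm import SVC

# probability=True turns on Platt scaling, which adds an internal
# cross-validation to fit(), so training gets noticeably slower
model_p = SVC(kernel='linear', probability=True).fit(X_train, y_train)
print(model_p.predict_proba(X_test[:5]))  # class probabilities for five points
```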

Summary

In summary, Support Vector Machine is a simple and reliable algorithm for classifying binary data. It has a super cool feature that allows transforming the data into a higher dimensional space where the classes become linearly separable. As with so many other things in machine learning, there is much more to explore about Support Vector Machine, and I encourage you to do so. You can start by checking out its Scikit-learn site. I hope to use it on a project and write more about it in the future.

Some code snippets used in this blog were taken from an excerpt of the Python Data Science Handbook by Jake VanderPlas and the code is released via opensource.org under the MIT license.

Jeff Accomando

I am a Data Scientist with a background in fintech and account management. I am a graduate of Flatiron School's Data Science program.