In [6]:
# Run this cell only if you are using Google Colab
from google.colab import drive
drive.mount('/content/drive/', force_remount=True)

Mounted at /content/drive/


In [7]:
!pip install mglearn

Collecting mglearn
  Downloading mglearn-0.2.0-py2.py3-none-any.whl.metadata (628 bytes)
Downloading mglearn-0.2.0-py2.py3-none-any.whl (581 kB)
Installing collected packages: mglearn
Successfully installed mglearn-0.2.0


# 7: RBF SVM

## Imports

In [8]:
import os  # needed for os.chdir below
import sys

import IPython
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from IPython.display import Image, HTML

# Make the course helper modules importable and work from the course folder
sys.path.append("/content/drive/MyDrive/50603/code")
os.chdir('/content/drive/MyDrive/50603')

import ipywidgets as widgets
import mglearn
from IPython.display import display
from ipywidgets import interact, interactive
from plotting_functions import *
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import cross_validate, train_test_split
from utils import *

%matplotlib inline

pd.set_option("display.max_colwidth", 200)
import warnings

warnings.filterwarnings("ignore")

<br><br>

## Support Vector Machines (SVMs) with RBF kernel [[video](https://youtu.be/ic_zqOhi020)]

- Very high-level overview
- Our goals here are
    - Use `scikit-learn`'s SVM model.
    - Broadly explain the notion of support vectors.  
    - Explain how `C` and `gamma` hyperparameters control the fundamental tradeoff.
    
> (Optional) RBF stands for radial basis functions. We won't go into what it means here. Refer to [this video](https://www.youtube.com/watch?v=Qc5IyLW_hns) if you want to know more.

### Overview

- SVMs with an RBF kernel **only remember the key examples (*support vectors*)**
- The decision boundary is defined by **a set of positive and negative examples** and **their weights** together with **their similarity measure**
- Different kernel functions can be used, but a popular choice is the radial basis function (RBF) kernel
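
(Optional) The second bullet above can be made concrete with a small sketch. It assumes toy data from `make_blobs` rather than the cities dataset, and the names `X_demo`, `toy_svm`, and `manual_scores` are made up for illustration. For a binary `SVC`, the score for an example is a weighted sum of its RBF similarities to the support vectors plus an intercept, and the sketch checks this against `decision_function`.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC

# Toy two-class data (an assumption for illustration, not the cities data)
X_demo, y_demo = make_blobs(n_samples=20, centers=2, random_state=300)

gamma = 0.1
toy_svm = SVC(kernel="rbf", C=10, gamma=gamma).fit(X_demo, y_demo)

# Similarity of every example to every support vector:
# k(x, x') = exp(-gamma * ||x - x'||^2)
K = rbf_kernel(X_demo, toy_svm.support_vectors_, gamma=gamma)

# Weighted sum of similarities (weights are in dual_coef_) plus the intercept
manual_scores = (K @ toy_svm.dual_coef_.T).ravel() + toy_svm.intercept_

# Compare with scikit-learn's own scores, allowing for floating-point error
print(np.allclose(manual_scores, toy_svm.decision_function(X_demo), atol=1e-6))
```

The sign of each score determines the predicted class, so only the support vectors, their weights, and the kernel are needed at prediction time.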

### Let's explore SVMs with an RBF kernel

Let's try SVMs on the cities dataset.

In [None]:
cities_df = pd.read_csv("data/canada_usa_cities.csv")
X_cities = cities_df[["longitude", "latitude"]]
y_cities = cities_df["country"]

In [None]:
mglearn.discrete_scatter(X_cities.iloc[:, 0], X_cities.iloc[:, 1], y_cities)
plt.xlabel("longitude")
plt.ylabel("latitude")
plt.legend(loc=1);

In [None]:
X_train, X_test, y_train, y_test = train_test_split(
    X_cities, y_cities, test_size=0.2, random_state=123
)

In [None]:
from sklearn.svm import SVC

svm = SVC(gamma=0.01)  # Ignore gamma for now
scores = cross_validate(svm, X_train, y_train, return_train_score=True)
print("SVC Mean validation score %0.3f" % (np.mean(scores["test_score"])))
pd.DataFrame(scores)

### Decision boundary of SVMs
- We can think of an SVM with an RBF kernel as "smooth KNN".

In [None]:
fig, ax = plt.subplots(figsize=(8, 5))

svm.fit(X_train, y_train)  # Fitting the svm model with the training data
mglearn.plots.plot_2d_separator(
    svm, X_train.to_numpy(), fill=True, eps=0.5, ax=ax, alpha=0.4
)  # Plotting the decision boundary for the svm model
mglearn.discrete_scatter(X_train.iloc[:, 0], X_train.iloc[:, 1], y_train, ax=ax)  # Plotting the training data
ax.set_title(svm)  # Setting the title to the svm model
ax.set_xlabel("longitude")  # Setting the x-label to "longitude"
ax.set_ylabel("latitude")  # Setting the y-label to "latitude"

### Support vectors

- Each training example either is or isn't a "support vector".
  - This gets decided during `fit`.

- **Main insight: the decision boundary only depends on the support vectors.**

- Let's look at the support vectors.

In [None]:
from sklearn.datasets import make_blobs

n = 20
n_classes = 2
X_toy, y_toy = make_blobs(
    n_samples=n, centers=n_classes, random_state=300
)  # Let's generate some fake data

In [None]:
mglearn.discrete_scatter(X_toy[:, 0], X_toy[:, 1], y_toy)
plt.xlabel("Feature 0")
plt.ylabel("Feature 1")
svm = SVC(kernel="rbf", C=10, gamma=0.1).fit(X_toy, y_toy)
mglearn.plots.plot_2d_separator(svm, X_toy, fill=True, eps=0.5, alpha=0.4)

In [None]:
svm.support_

In [None]:
X_toy[svm.support_]

In [None]:
mglearn.plots.plot_2d_separator(svm, X_toy, fill=True, eps=0.5, alpha=0.4)
plot_support_vectors(svm, X_toy, y_toy)

The support vectors are the bigger points in the plot above.
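
One way to probe the main insight is to refit the same model on the support vectors alone and compare predictions. The sketch below is self-contained and regenerates the toy data under new, made-up names (`X_demo`, `svm_full`, `svm_sv_only`). The solver is approximate, so treat the comparison as a sanity check rather than a proof.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Same kind of toy data as above, under separate names
X_demo, y_demo = make_blobs(n_samples=20, centers=2, random_state=300)

# Model trained on all examples
svm_full = SVC(kernel="rbf", C=10, gamma=0.1).fit(X_demo, y_demo)
sv_idx = svm_full.support_  # indices of the support vectors

# Model trained only on the support vectors of the first model
svm_sv_only = SVC(kernel="rbf", C=10, gamma=0.1).fit(X_demo[sv_idx], y_demo[sv_idx])

# Fraction of examples on which the two models agree
agreement = np.mean(svm_full.predict(X_demo) == svm_sv_only.predict(X_demo))
print(f"{len(sv_idx)} of {len(X_demo)} examples are support vectors")
print(f"Prediction agreement: {agreement:.2f}")
```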

### Hyperparameters of SVM

- Key hyperparameters of `rbf` SVM are
    - `gamma`
    - `C`
    
- We are not equipped to understand the precise meaning of these hyperparameters at this point, but you are expected to be able to describe **their relation to the fundamental tradeoff**.

Optionally, see [`scikit-learn`'s explanation of RBF SVM parameters](https://scikit-learn.org/stable/auto_examples/svm/plot_rbf_parameters.html).

### Relation of `gamma` and the fundamental trade-off

- `gamma` controls the complexity (fundamental trade-off), just like other hyperparameters we've seen.
  - larger `gamma` $\rightarrow$ more complex
  - smaller `gamma` $\rightarrow$ less complex

In [None]:
gamma = [0.001, 0.01, 0.1, 1.0, 10.0]
plot_svc_gamma(
    gamma,
    X_train.to_numpy(),
    y_train.to_numpy(),
    x_label="longitude",
    y_label="latitude",
)
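
The effect of `gamma` can also be seen in numbers rather than plots. The sketch below is an illustration on synthetic data (`make_moons`, an assumption, since it does not need the cities CSV file). It records the mean training and cross-validation scores for a range of `gamma` values. Typically the training score keeps rising with `gamma`, while the validation score peaks somewhere in between.

```python
import pandas as pd
from sklearn.datasets import make_moons
from sklearn.model_selection import cross_validate
from sklearn.svm import SVC

# Synthetic two-class data (an assumption for illustration)
X_demo, y_demo = make_moons(n_samples=200, noise=0.3, random_state=123)

rows = []
for g in [0.001, 0.01, 0.1, 1.0, 10.0, 100.0]:
    cv = cross_validate(SVC(gamma=g), X_demo, y_demo, return_train_score=True)
    rows.append(
        {
            "gamma": g,
            "mean_train_score": cv["train_score"].mean(),
            "mean_cv_score": cv["test_score"].mean(),
        }
    )

gamma_results = pd.DataFrame(rows)
print(gamma_results)
```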

### Relation of `C` and the fundamental trade-off

- `C` _also_ affects the fundamental tradeoff
    - larger `C` $\rightarrow$ more complex
    - smaller `C` $\rightarrow$ less complex

In [None]:
C = [0.1, 1.0, 100.0, 1000.0, 100000.0]
plot_svc_C(
    C, X_train.to_numpy(), y_train.to_numpy(), x_label="longitude", y_label="latitude"
)
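
A similar check works for `C`. The sketch below again uses synthetic `make_moons` data (an assumption) and a fixed `gamma=1.0` chosen only for illustration. For each value of `C` it reports the training accuracy and the number of support vectors.

```python
from sklearn.datasets import make_moons
from sklearn.svm import SVC

# Synthetic two-class data (an assumption for illustration)
X_demo, y_demo = make_moons(n_samples=200, noise=0.3, random_state=123)

c_summary = []
for c in [0.01, 1.0, 100.0, 1000.0]:
    model = SVC(C=c, gamma=1.0).fit(X_demo, y_demo)
    train_acc = model.score(X_demo, y_demo)  # accuracy on the training data
    n_sv = int(model.n_support_.sum())  # support vectors across both classes
    c_summary.append((c, train_acc, n_sv))

for c, train_acc, n_sv in c_summary:
    print(f"C={c:<8} train accuracy={train_acc:.3f} support vectors={n_sv}")
```

Training accuracy alone says nothing about generalization, so the same comparison should be repeated with validation scores before picking a value.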

### Search over multiple hyperparameters

- So far you have seen how to carry out a search over a single hyperparameter.
- In the examples above, the best training error is achieved by the most complex model (large `gamma`, large `C`).
- Achieving the best validation error requires a hyperparameter search to balance the fundamental tradeoff.
  - In general, we can't tune hyperparameters one at a time, because the best value of one (e.g., `gamma`) depends on the values of the others (e.g., `C`).
  - You may look up the following:
    - [sklearn.model_selection.GridSearchCV](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GridSearchCV.html)
    - [sklearn.model_selection.RandomizedSearchCV](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.RandomizedSearchCV.html)
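
As a starting point, a joint search over `C` and `gamma` with `GridSearchCV` might look like the sketch below. The data (`make_moons`) and the grid values are assumptions chosen for illustration, not recommended settings.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# Synthetic two-class data (an assumption for illustration)
X_demo, y_demo = make_moons(n_samples=200, noise=0.3, random_state=123)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=123
)

# Every combination of C and gamma is evaluated with 5-fold cross-validation
param_grid = {"C": [0.1, 1.0, 10.0, 100.0], "gamma": [0.01, 0.1, 1.0, 10.0]}
grid_search = GridSearchCV(SVC(), param_grid, cv=5, return_train_score=True)
grid_search.fit(X_tr, y_tr)

print("Best hyperparameters:", grid_search.best_params_)
print("Best mean validation score: %0.3f" % grid_search.best_score_)
print("Test score of the refit model: %0.3f" % grid_search.score(X_te, y_te))
```

`RandomizedSearchCV` has a similar interface but samples a fixed number of combinations instead of trying all of them.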