Linear SVM inference

svm.SVC()

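svm.SVC() uses one-vs-one (OvO) for multiclass problems: one binary SVM is trained for every pair of classes, giving n*(n-1)/2 classifiers in total. To classify an unknown sample, every pairwise classifier classifies the point and casts a vote; the class with the most votes is the final prediction.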

Construct four classes (each row is [x, y, label]):

cls1 = np.array([[0, 1, 0],[1, 1, 0], [2, 1, 0], [3, 1, 0], [4, 1, 0]])
cls2 = np.array([[0, 3, 1],[1, 3, 1], [2, 3, 1], [3, 3, 1], [4, 3, 1]])
cls3 = np.array([[0, 5, 2],[1, 5, 2], [2, 5, 2], [3, 5, 2], [4, 5, 2]])
cls4 = np.array([[0, 7, 3],[1, 7, 3], [2, 7, 3], [3, 7, 3], [4, 7, 3]])

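Call these classes A, B, C and D. Since each class lies on a horizontal line (y=1, 3, 5, 7), the pairwise classifiers are roughly: AB gives y=2, AC gives y=3, AD gives y=4, BC gives y=4, BD gives y=5, CD gives y=6.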

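When a new point arrives, compute its signed distance to each of the six decision boundaries: a value greater than 0 means the positive class (here, the first class of the pair), otherwise the negative class. For example, the point (0,1) has a positive distance to all six boundaries, giving [True, True, True, True, True, True]. Tallying the votes: AB votes A (A=1); AC votes A (A=2); AD votes A (A=3); BC votes B (B=1); BD votes B (B=2); CD votes C (C=1). A has the most votes, so the point is assigned to class A.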
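The voting walk-through above can be sketched in a few lines of Python. This is only an illustration, not what libsvm actually computes: it assumes each pairwise boundary is the horizontal line midway between the two classes (y=2 for AB, y=3 for AC, and so on), and the names `class_y` and `ovo_vote` are made up for this sketch.

```python
from itertools import combinations

classes = ["A", "B", "C", "D"]
# y-coordinate of each class's training points, taken from the arrays above.
class_y = {"A": 1, "B": 3, "C": 5, "D": 7}


def ovo_vote(point):
    """Return (winner, votes) for a 2-D point under the assumed midpoint boundaries."""
    _, y = point
    votes = {c: 0 for c in classes}
    for first, second in combinations(classes, 2):
        # Assumed boundary: the horizontal line halfway between the two classes.
        threshold = (class_y[first] + class_y[second]) / 2
        # Positive decision value -> vote for the first class of the pair.
        decision = threshold - y
        votes[first if decision > 0 else second] += 1
    # max() returns the first class with the highest count.
    return max(votes, key=votes.get), votes


winner, votes = ovo_vote((0, 1))
print(winner, votes)
```

For the point (0,1) this gives three votes for A, two for B, one for C and none for D, matching the tally above.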

sklearn OvO source:

# multiclass.py
k = 0
for i in range(n_classes):
    for j in range(i + 1, n_classes):
        sum_of_confidences[:, i] -= confidences[:, k]
        sum_of_confidences[:, j] += confidences[:, k]
        votes[predictions[:, k] == 0, i] += 1
        votes[predictions[:, k] == 1, j] += 1
        k += 1
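To see what this loop accumulates, here is a small self-contained run of the same loop body on made-up inputs. The `predictions` and `confidences` arrays are hypothetical values for a single sample, chosen so that every pairwise classifier picks the first class of its pair; they are not output from a fitted model.

```python
import numpy as np

n_classes = 4
n_pairs = n_classes * (n_classes - 1) // 2  # 6 pairwise classifiers

# Hypothetical outputs for ONE sample: each pairwise classifier predicts 0
# (the first class of its pair) with a confidence of -1.0.
predictions = np.zeros((1, n_pairs), dtype=int)
confidences = np.full((1, n_pairs), -1.0)

votes = np.zeros((1, n_classes))
sum_of_confidences = np.zeros((1, n_classes))

k = 0
for i in range(n_classes):
    for j in range(i + 1, n_classes):
        # Same accumulation as the snippet above.
        sum_of_confidences[:, i] -= confidences[:, k]
        sum_of_confidences[:, j] += confidences[:, k]
        votes[predictions[:, k] == 0, i] += 1
        votes[predictions[:, k] == 1, j] += 1
        k += 1

print(votes)                # votes per class
print(sum_of_confidences)   # summed confidences per class
print(votes.argmax(axis=1)) # index of the winning class
```

With these inputs class 0 collects three votes, class 1 two, class 2 one and class 3 none, which is the same tally as the A/B/C/D example.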

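svm.SVC() has a parameter decision_function_shape='ovr', but it does not change the multiclass strategy: the model is still trained one-vs-one internally. The parameter only changes the shape of the decision_function output, which is aggregated from the OvO results into one column per class.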
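A quick way to check this is to fit the same data with both settings and compare shapes. The sketch below assumes scikit-learn and NumPy are installed and rebuilds the four classes from above; the variable names are chosen for this example only.

```python
import numpy as np
from sklearn.svm import SVC

# Same layout as the four classes above: x in 0..4, y in {1, 3, 5, 7}.
X = np.array([[x, y] for y in (1, 3, 5, 7) for x in range(5)], dtype=float)
labels = np.repeat(np.arange(4), 5)

ovo = SVC(kernel="linear", decision_function_shape="ovo").fit(X, labels)
ovr = SVC(kernel="linear", decision_function_shape="ovr").fit(X, labels)

ovo_shape = ovo.decision_function(X).shape  # one column per class pair
ovr_shape = ovr.decision_function(X).shape  # one column per class
coef_shapes = (ovo.coef_.shape, ovr.coef_.shape)  # pairwise hyperplanes in both
same_predictions = bool(np.array_equal(ovo.predict(X), ovr.predict(X)))

print(ovo_shape, ovr_shape, coef_shapes, same_predictions)
```

Both models hold six pairwise hyperplanes in coef_; only the decision_function output differs (six columns versus four).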

svm.LinearSVC()

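svm.LinearSVC uses one-vs-rest (OvR) for multiclass problems: for each class, one binary SVM is trained against all remaining classes. To classify an unknown point, compute its score w·x + b against every hyperplane (a distance that is not divided by the norm of w); the sample is assigned to the class with the largest score. Because the norm is not divided out, the comparison is between raw scores rather than geometric distances, so a hyperplane that is geometrically farther from the point can still win and a nearer one can lose. Here is an example using the same four classes as above.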

Trained parameters w and b:

[[ 0.07355282 -0.82129448]
 [-0.00218351 -0.10126131]
 [-0.01921047  0.08401843]
 [-0.09088882  0.39134251]]
[ 1.42011892 -0.08837931 -0.77819071 -2.07359502]

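Scores of test point (0,1) against the four hyperplanes:

[[ 0.59882445 -0.18964062 -0.69417228 -1.68225251]]

The largest value is at index 0, so the point is assigned to class 0. Scores of test point (0,3) against the four hyperplanes:

[[-1.04376451 -0.39216323 -0.52613543 -0.89956749]]

The largest value is now at index 1, so the point is assigned to class 1. The point looks 'closest' to the second hyperplane only because these values are not divided by the norm of w.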
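The effect of skipping the normalization can be checked directly from the printed w and b. The sketch below assumes NumPy and uses the points (0, 1) and (0, 3); (0, 3) is the input that reproduces the second score vector above under these parameters. The geometric distance is computed here only for comparison and is not something LinearSVC uses for prediction.

```python
import numpy as np

# Parameters copied from the printed output above.
w = np.array([[ 0.07355282, -0.82129448],
              [-0.00218351, -0.10126131],
              [-0.01921047,  0.08401843],
              [-0.09088882,  0.39134251]])
b = np.array([1.42011892, -0.08837931, -0.77819071, -2.07359502])

points = np.array([[0.0, 1.0], [0.0, 3.0]])

# Raw OvR scores: w·x + b, one column per class.
raw_scores = points @ w.T + b
# Signed geometric distances: each column divided by the norm of its w.
geometric = raw_scores / np.linalg.norm(w, axis=1)

raw_classes = raw_scores.argmax(axis=1)
geometric_classes = geometric.argmax(axis=1)

print(raw_scores)
print(geometric)
print(raw_classes, geometric_classes)
```

For (0, 3) the raw scores pick class 1, while the normalized distances would pick class 0, so dividing by the norm would change the prediction for this point.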

sklearn source:

# path/linear_model/base.py
scores = safe_sparse_dot(X, self.coef_.T,
                         dense_output=True) + self.intercept_
return scores.ravel() if scores.shape[1] == 1 else scores
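The last line of that snippet is easy to overlook: with a single hyperplane (the binary case) the scores are flattened to one value per sample. A NumPy-only sketch of the same logic for dense inputs, with a hypothetical helper name `decision_scores` and made-up coefficients, looks like this:

```python
import numpy as np


def decision_scores(X, coef, intercept):
    """Dense-array stand-in for the snippet above: X @ coef.T + intercept."""
    scores = X @ coef.T + intercept
    # One hyperplane (binary case): flatten to shape (n_samples,).
    return scores.ravel() if scores.shape[1] == 1 else scores


X = np.array([[0.0, 1.0], [0.0, 3.0]])

# Made-up coefficients, used only to show the output shapes.
multi = decision_scores(X, np.ones((4, 2)), np.zeros(4))
binary = decision_scores(X, np.ones((1, 2)), np.zeros(1))

print(multi.shape, binary.shape)
```

With four hyperplanes the result keeps one column per class; with one hyperplane it collapses to a 1-D array.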

Test example code:

import numpy as np
from matplotlib import pyplot as plt
from sklearn.svm import LinearSVC
from sklearn import svm

if __name__ == '__main__':
    cls1 = np.array([[0, 1, 0],[1, 1, 0], [2, 1, 0], [3, 1, 0], [4, 1, 0]])
    cls2 = np.array([[0, 3, 1],[1, 3, 1], [2, 3, 1], [3, 3, 1], [4, 3, 1]])
    cls3 = np.array([[0, 5, 2],[1, 5, 2], [2, 5, 2], [3, 5, 2], [4, 5, 2]])
    cls4 = np.array([[0, 7, 3],[1, 7, 3], [2, 7, 3], [3, 7, 3], [4, 7, 3]])
    data = np.vstack((cls1, cls2, cls3, cls4))
    X = data[:,0:2]
    Y = data[:, -1]
    clf = LinearSVC(max_iter=10000)
    #clf = svm.SVC(kernel='linear',decision_function_shape='ovr')
    clf.fit(X, Y)
    w = clf.coef_
    b = clf.intercept_
    print(w)
    print(b)
    ax = plt.subplot()
    # One scatter series per class, with a readable legend entry
    ax.scatter(cls1[:, 0], cls1[:, 1], label='class 0')
    ax.scatter(cls2[:, 0], cls2[:, 1], label='class 1')
    ax.scatter(cls3[:, 0], cls3[:, 1], label='class 2')
    ax.scatter(cls4[:, 0], cls4[:, 1], label='class 3')
    ax.legend()

    point1 = np.array([[0,1]])
    point2 = np.array([[0,3]])
    point3 = np.array([[0,5]])
    point4 = np.array([[0,7]])
    scores1 = clf.decision_function(point1)
    scores2 = clf.decision_function(point2)
    scores3 = clf.decision_function(point3)
    scores4 = clf.decision_function(point4)
    clf.predict(point1)
    print('scores 1:', scores1, "cls:", clf.predict(point1))
    print('scores 2:', scores2, "cls:", clf.predict(point2))
    print('scores 3:', scores3, "cls:", clf.predict(point3))
    print('scores 4:', scores4, "cls:", clf.predict(point4))

    # Hyperplanes: solve w0*x + w1*y + b = 0 for y
    x = np.linspace(0,5)
    #x = np.meshgrid(x)
    y1 = (w[0, 0]*x+b[0])/(-w[0, 1])
    y2 = (w[1, 0]*x+b[1])/(-w[1, 1])
    y3 = (w[2, 0]*x+b[2])/(-w[2, 1])
    y4 = (w[3, 0]*x+b[3])/(-w[3, 1])
    ax.plot(x, y1)
    ax.plot(x, y2)
    ax.plot(x, y3)
    ax.plot(x, y4)
    plt.xlim([0,30])
    plt.show()