Using SVM Packages in R
The ksvm() function in the kernlab package implements the SVM algorithm using the optimization methods of the bsvm and libsvm libraries, accessed through the .Call interface. For classification it offers the C-SVM and ν-SVM formulations, as well as a bound-constrained version of the C classifier. For regression it provides ε-SVM and ν-SVM. For multi-class classification it supports both the one-against-one approach and native multi-class formulations, described below. For example:
> library("kernlab") # load the package
> data("iris") # load the iris data set
> irismodel <- ksvm(Species ~ ., data = iris,
+ type = "C-bsvc", kernel = "rbfdot",
+ kpar = list(sigma = 0.1), C = 10,
+ prob.model = TRUE) # train the model
The type argument determines whether the model performs classification, regression, or novelty detection. If type is not specified, it is inferred from whether y is a factor, defaulting to C-svc or eps-svr. Possible values are:
• C-svc C classification
• nu-svc nu classification
• C-bsvc bound-constraint svm classification
• spoc-svc Crammer, Singer native multi-class
• kbb-svc Weston, Watkins native multi-class
• one-svc novelty detection
• eps-svr epsilon regression
• nu-svr nu regression
• eps-bsvr bound-constraint svm regression
The kernel argument sets the kernel function. Available kernels are:
• rbfdot Radial Basis kernel "Gaussian"
• polydot Polynomial kernel
• vanilladot Linear kernel
• tanhdot Hyperbolic tangent kernel
• laplacedot Laplacian kernel
• besseldot Bessel kernel
• anovadot ANOVA RBF kernel
• splinedot Spline kernel
• stringdot String kernel
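To illustrate how the kernel and kpar arguments combine, here is a minimal sketch using a polynomial kernel on the same iris data; the kernel parameter values below are chosen purely for illustration and are not from the original example.

```r
library(kernlab)
data(iris)

# Train a nu-SVM with a polynomial kernel; degree, scale and offset
# are the polydot kernel parameters, passed through kpar
# (values here are illustrative only).
polymodel <- ksvm(Species ~ ., data = iris,
                  type = "nu-svc", kernel = "polydot",
                  kpar = list(degree = 2, scale = 1, offset = 1),
                  nu = 0.2)
polymodel
```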
> irismodel
Support Vector Machine object of class "ksvm"
SV type: C-bsvc (classification)
parameter : cost C = 10
Gaussian Radial Basis kernel function.
Hyperparameter : sigma = 0.1
Number of Support Vectors : 32
Training error : 0.02
Probability model included.
> predict(irismodel, iris[c(3, 10, 56, 68, 107, 120), -5], type = "probabilities")
setosa versicolor virginica
[1,] 0.986432820 0.007359407 0.006207773
[2,] 0.983323813 0.010118992 0.006557195
[3,] 0.004852528 0.967555126 0.027592346
[4,] 0.009546823 0.988496724 0.001956452
[5,] 0.012767340 0.069496029 0.917736631
[6,] 0.011548176 0.150035384 0.838416441
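Omitting type = "probabilities" returns hard class labels instead of a probability matrix. A sketch comparing those labels against the true species for the same rows as above:

```r
library(kernlab)
data(iris)
irismodel <- ksvm(Species ~ ., data = iris,
                  type = "C-bsvc", kernel = "rbfdot",
                  kpar = list(sigma = 0.1), C = 10,
                  prob.model = TRUE)

# By default predict() returns the predicted factor levels
pred <- predict(irismodel, iris[c(3, 10, 56, 68, 107, 120), -5])

# Cross-tabulate predictions against the true labels
table(pred, iris$Species[c(3, 10, 56, 68, 107, 120)])
```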
ksvm() also supports user-defined kernel functions. For example:
> k <- function(x, y) { (sum(x * y) + 1) * exp(-0.001 * sum((x - y)^2)) }
> class(k) <- "kernel"
> data("promotergene")
> gene <- ksvm(Class ~ ., data = promotergene, kernel = k, C = 10, cross = 5) # train
> gene
Support Vector Machine object of class "ksvm"
SV type: C-svc (classification)
parameter : cost C = 10
Number of Support Vectors : 66
Training error : 0
Cross validation error : 0.141558
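The quantities printed in the summary can also be extracted with kernlab's accessor functions on the fitted object; a brief sketch using the same gene model as above:

```r
library(kernlab)
data(promotergene)

# Custom kernel as above
k <- function(x, y) { (sum(x * y) + 1) * exp(-0.001 * sum((x - y)^2)) }
class(k) <- "kernel"
gene <- ksvm(Class ~ ., data = promotergene, kernel = k, C = 10, cross = 5)

error(gene)  # training error, as printed in the summary
cross(gene)  # 5-fold cross-validation error
nSV(gene)    # number of support vectors
```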
For binary classification problems, plot() can visualize the result. For example:
> x <- rbind(matrix(rnorm(120), , 2), matrix(rnorm(120, mean = 3), , 2))
> y <- matrix(c(rep(1, 60), rep(-1, 60)))
> svp <- ksvm(x, y, type = "C-svc", kernel = "rbfdot", kpar = list(sigma = 2))
> plot(svp)
http://images.cnblogs.com/cnblogs_com/zgw21cn/031609_1027_SVM1.png
The package is available at http://cran.r-project.org/web/packages/kernlab/index.html
Using Support Vector Machines in R (3. the e1071 and klaR packages)
The e1071 package provides an interface to libsvm. The libsvm library includes the commonly used kernels: linear, polynomial, RBF, sigmoid, and so on. Multi-class classification is implemented via a one-against-one voting scheme. svm() trains a model, predict() makes predictions, plot() visualizes the data, the support vectors, and the decision boundary (if available), and tune() performs parameter tuning.
For example:
> library("e1071")
> model <- svm(Species ~ ., data = iris,
method = "C-classification", kernel = "radial",
cost = 10, gamma = 0.1)
> summary(model)
Call:
svm(formula = Species ~ ., data = iris, method = "C-classification", kernel = "radial", cost = 10,
gamma = 0.1)
Parameters:
SVM-Type: C-classification
SVM-Kernel: radial
cost: 10
gamma: 0.1
Number of Support Vectors: 32
( 3 16 13 )
Number of Classes: 3
Levels:
setosa versicolor virginica
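Prediction with the e1071 model works through the same predict() generic; a minimal sketch building a confusion matrix on the training data (using the type argument, which is the documented name for the classification option):

```r
library(e1071)
data(iris)
model <- svm(Species ~ ., data = iris,
             type = "C-classification", kernel = "radial",
             cost = 10, gamma = 0.1)

# Predict on the training features and cross-tabulate
pred <- predict(model, iris[, -5])
table(predicted = pred, true = iris$Species)  # confusion matrix
```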
To display a two-dimensional projection of the data together with the support vectors:
> plot(model, iris, Petal.Width ~
Petal.Length, slice = list(Sepal.Width = 3,
Sepal.Length = 4))
http://images.cnblogs.com/cnblogs_com/zgw21cn/041409_1122_R31.png
In the plot above, support vectors are marked with an "x", and the predicted class regions are highlighted with background colors.
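The tune() interface mentioned earlier performs a cross-validated grid search over hyperparameters; a sketch via the tune.svm() convenience wrapper, with candidate grid values chosen purely for illustration:

```r
library(e1071)
data(iris)

# Grid-search gamma and cost via cross-validation;
# the candidate values here are illustrative only.
tuned <- tune.svm(Species ~ ., data = iris,
                  gamma = 10^(-2:0), cost = 10^(0:2))
summary(tuned)
tuned$best.parameters  # best (gamma, cost) pair found
```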
The klaR package provides a thin wrapper around the SVMlight library, exposing the svmlight() function for classification and visualization. svmlight() supports C-SVM for classification and ε-SVM for regression; multi-class classification uses the one-against-all approach. SVMlight supports Gaussian, polynomial, linear, and sigmoid kernels. Options are passed to svmlight() as a character string, and predict() returns the label for each case.
For example:
> library("klaR")
> data("B3")
> Bmod <- svmlight(PHASEN ~ ., data = B3,
+ svm.options = "-c 10 -t 2 -g 0.1 -v 0")
> predict(Bmod, B3[c(4, 9, 30, 60, 80, 120),
+ -1])
$class
[1] 3 3 4 3 4 1
Levels: 1 2 3 4
$posterior
1 2 3 4
[1,] 0.09633177 0.09627103 0.71112031 0.09627689
[2,] 0.09628235 0.09632512 0.71119794 0.09619460
[3,] 0.09631525 0.09624314 0.09624798 0.71119362
[4,] 0.09632530 0.09629393 0.71115614 0.09622463
[5,] 0.09628295 0.09628679 0.09625447 0.71117579
[6,] 0.71123818 0.09627858 0.09620351 0.09627973
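The $class component corresponds to the column with the largest posterior in each row; a sketch recovering it from the $posterior matrix (this assumes the SVMlight binaries are installed and on the search path, which klaR requires):

```r
library(klaR)
data(B3)

# Requires the external SVMlight binaries to be installed
Bmod <- svmlight(PHASEN ~ ., data = B3,
                 svm.options = "-c 10 -t 2 -g 0.1 -v 0")
pred <- predict(Bmod, B3[c(4, 9, 30, 60, 80, 120), -1])

# The reported class is the column of the posterior matrix
# with the largest value in each row
apply(pred$posterior, 1, which.max)
```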