Hello, I'm using the svm function from the e1071 package. It works well and gives me nice results. I'm very curious to see the actual coefficients calculated for each input variable. (Other packages, like RapidMiner, show you this automatically.) I've tried looking at attributes for the model and do see a "coefficients" item, but printing it returns a NULL result. Can someone point me in the right direction? Thanks -- Noah
SVM coefficients
7 messages · Noah Silverman, Steve Lianoglou, Achim Zeileis +1 more
Hi,
On Sun, Aug 30, 2009 at 6:10 PM, Noah Silverman<noah at smartmediacorp.com> wrote:
Hello, I'm using the svm function from the e1071 package. It works well and gives me nice results. I'm very curious to see the actual coefficients calculated for each input variable. (Other packages, like RapidMiner, show you this automatically.) I've tried looking at attributes for the model and do see a "coefficients" item, but printing it returns a NULL result.
Hmm .. I don't see a "coefficients" attribute, but rather a "coefs"
attribute, which I guess is what you're looking for (?)
Run "example(svm)" to its end and type:
R> m$coefs
[,1]
[1,] 1.00884130
[2,] 1.27446460
[3,] 2.00000000
[4,] -1.00000000
[5,] -0.35480340
[6,] -0.74043692
[7,] -0.87635311
[8,] -0.04857869
[9,] -0.03721980
[10,] -0.64696793
[11,] -0.57894605
HTH,
-steve
Steve Lianoglou Graduate Student: Computational Systems Biology | Memorial Sloan-Kettering Cancer Center | Weill Medical College of Cornell University Contact Info: http://cbio.mskcc.org/~lianos/contact
Steve, That doesn't work. I just trained an SVM with 80 variables. svm_model$coefs gives me a list of 10,000 items. My training set is 30,000 examples of 80 variables, so I have no idea what the 10,000 items represent. There should be some attribute that lists the "weights" for each of the 80 variables. -- Noah
On Mon, 31 Aug 2009, Noah Silverman wrote:
Steve, That doesn't work. I just trained an SVM with 80 variables. svm_model$coefs gives me a list of 10,000 items. My training set is 30,000 examples of 80 variables, so I have no idea what the 10,000 items represent.
Presumably, the coefficients of the support vectors times the training
labels, see help("svm", package = "e1071"). See also
http://www.jstatsoft.org/v15/i09/
for some background information and the different formulations available.
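A quick sketch of that point (assuming e1071 is installed; the two-class iris subset here is illustrative, not from the thread):

```r
library(e1071)

# two-class toy problem
d <- droplevels(iris[1:100, ])
m <- svm(Species ~ ., data = d, kernel = "linear", scale = FALSE)

# m$coefs has one row per *support vector* (alpha_i * y_i),
# not one entry per input variable:
nrow(m$coefs)    # equals the number of support vectors
length(m$index)  # indices of those support vectors in d
```

So with 30,000 training rows, 10,000 entries simply means the fitted model kept 10,000 support vectors.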
There should be some attribute that lists the "weights" for each of the 80 variables.
Not sure what you are looking for. Maybe David, the author of svm() (and now Cc), can help. Z
Noah Silverman wrote:
Steve, That doesn't work. I just trained an SVM with 80 variables. svm_model$coefs gives me a list of 10,000 items. My training set is 30,000 examples of 80 variables, so I have no idea what the 10,000 items represent. There should be some attribute that lists the "weights" for each of the 80 variables.
Hi Noah,

does this help?

# make binary problem from iris
mydata <- iris[1:100, ]
mydata$Species <- mydata$Species[, drop = TRUE]
str(mydata)
# 'data.frame': 100 obs. of 5 variables:
#  $ Sepal.Length: num 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
#  $ Sepal.Width : num 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
#  $ Petal.Length: num 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
#  $ Petal.Width : num 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
#  $ Species     : Factor w/ 2 levels "setosa","versicolor": 1 1 1 1 1 1 1 1 1 1 ...

# inputs
X <- as.matrix(mydata[, -5])

# train svm with linear kernel;
# to make later stuff easier we don't scale
m <- svm(Species ~ ., data = mydata, kernel = "linear", scale = FALSE)
# ....
# Number of Support Vectors: 3

# we get 3 support vectors; these are weights for training cases,
# or in SVM theory speak: our dual variables alpha
m$coefs[, 1]
# [1]  0.67122500  0.07671148 -0.74793648

# these are the indices of the cases to which the alphas belong
m$index
# [1] 24 42 99

# let's calculate the primal variables from the dual ones;
# SVM theory says w = sum alpha_i x_i
w <- t(m$coefs) %*% X[m$index, ]
#      Sepal.Length Sepal.Width Petal.Length Petal.Width
# [1,]  -0.04602689   0.5216377    -1.003002  -0.4641042

# test whether the above was nonsense.....
# e1071 predict
p1 <- predict(m, newdata = mydata, decision.values = TRUE)
p1 <- attr(p1, "decision.values")

# do it manually with w; simple linear predictor with intercept -m$rho
p2 <- X %*% t(w) - m$rho

# puuuh, lucky....
max(abs(p1 - p2))
# [1] 6.439294e-15

Bernd
Hi,
On Aug 31, 2009, at 3:32 AM, Noah Silverman wrote:
Steve, That doesn't work.
Actually, it does :-)
I just trained an SVM with 80 variables. svm_model$coefs gives me a list of 10,000 items. My training set is 30,000 examples of 80 variables, so I have no idea what the 10,000 items represent. There should be some attribute that lists the "weights" for each of the 80 variables.
No, not really. The coefficients you're pulling out are the weights of the support vectors, not per-variable coefficients as in the "normal" linear model case. I guess you're using the RBF kernel, right? Your 80 variables are being mapped into some higher-dimensional space, so the 80 weights you expect to get back don't really exist in the way you're expecting. SVMs are great for accuracy, but notoriously hard to interpret.

To squeeze some interpretability out of your classifier in your feature space, you might look at the weights of your w vector:

http://www.nabble.com/How-to-get-w-and-b-in-SVR--%28package-e1071%29-td24790413.html#a24791423

-steve
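For the linear-kernel case, the recipe in that link boils down to a few lines (a sketch, assuming an e1071 model fitted with kernel = "linear" and scale = FALSE; with scaling on, m$SV holds the scaled support vectors, so the weights refer to the scaled inputs):

```r
library(e1071)

# toy two-class fit standing in for a real linear-kernel model
d <- droplevels(iris[1:100, ])
m <- svm(Species ~ ., data = d, kernel = "linear", scale = FALSE)

# per-variable weights: w = sum_i (alpha_i * y_i) x_i over the support vectors
w <- t(m$coefs) %*% m$SV
b <- -m$rho

# decision values computed by hand should match predict()
p_manual <- as.matrix(d[, -5]) %*% t(w) + b
p_e1071  <- attr(predict(m, d, decision.values = TRUE), "decision.values")
max(abs(p_manual - p_e1071))  # ~ 0, up to floating point
```

For an RBF kernel no such finite w exists, which is exactly the point above.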