svm

10 messages · Amy Hessen, Steve Lianoglou, Charles C. Berry +1 more

#
Hi,
On Tue, Jan 5, 2010 at 7:01 PM, Amy Hessen <amy_4_5_84 at hotmail.com> wrote:
This isn't exactly correct ... look at the examples in the ?svm
documentation a bit closer.
Using the first example in ?svm

attach(iris)
model <- svm(Species ~ ., data = iris)

The first argument in the function call is the formula. The "Species"
column is the class label.

`iris` is a data.frame ... you can see that it has the label *in it*;
look at the output of "head(iris)".
Just follow the example in ?svm some more, you'll see training a model
and then testing it on data. The example happens to be the same data
the model trained on. To use new data, you'll just need a data
matrix/data.frame with as many columns as your original data, and as
many rows as you have observations.

The first step separates the labels from the data (you can do the same
with your data -- you don't need differently formatted test and train
files -- just remove the labels from it in R):

attach(iris)
x <- subset(iris, select = -Species)
y <- Species
model <- svm(x, y)

# test with train data
pred <- predict(model, x)
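
To make the "new data" case concrete, here is a minimal sketch that holds
out some rows of iris and treats them as if they were new observations.
The seed and the 100/50 split are arbitrary choices for illustration and
are not part of the ?svm example:

```r
library(e1071)

set.seed(1)                              # arbitrary seed, only for reproducibility
train.idx <- sample(nrow(iris), 100)     # arbitrary 100/50 split (assumption)

# Training part: features and labels kept separately
train.x <- subset(iris[train.idx, ], select = -Species)
train.y <- iris$Species[train.idx]

# "New" data: same feature columns as train.x, labels removed
new.x <- subset(iris[-train.idx, ], select = -Species)

model <- svm(train.x, train.y)
pred  <- predict(model, new.x)           # one predicted label per row of new.x
```

The only requirement on new.x is that it has the same columns, in the
same order, as the data the model was trained on.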

Hope that helps,
-steve
#
Hi Amy,
On Wed, Jan 6, 2010 at 4:33 PM, Amy Hessen <amy_4_5_84 at hotmail.com> wrote:
Since you're not doing anything funky with the formula, a preference
of mine is to just skip this way of calling SVM and go "straight" to
the svm(x,y,...) method:

R> mydata <- as.matrix(read.delim("the_whole_dataset.txt"))
R> train.x <- mydata[,-1]
R> train.y <- mydata[,1]

R> mymodel <- svm(train.x, train.y, cross=3, type="C-classification")
## or
R> mymodel <- svm(train.x, train.y, cross=3, type="eps-regression")

As an aside, I also like to be explicit about the type="" parameter to
tell what I want my SVM to do (regression or classification). If it's
not specified, svm picks which one to do based on whether your y
vector is a factor (does classification) or not (does regression).
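
One reason to be explicit: a label column pulled out of a matrix is never
a factor, so leaving type unset would give regression in that case. A
small base-R sketch with made-up toy data (the column names label and f1
are hypothetical):

```r
# Toy stand-in for the file read above; values are invented for illustration
mydata  <- as.matrix(data.frame(label = c(1, 2, 1, 2),
                                f1    = c(0.1, 0.9, 0.2, 0.8)))
train.y <- mydata[, 1]

y.is.factor <- is.factor(train.y)   # FALSE: a matrix column is a plain numeric vector

# Converting to a factor is another way to ask for classification
y.fac <- factor(train.y)

# Caveat: if the label column holds text, as.matrix() turns the whole
# matrix into character, features included
mixed <- as.matrix(data.frame(label = c("a", "b"), f1 = c(1, 2)))
mixed.is.character <- is.character(mixed)
```

So with numeric class codes, either pass type="C-classification" as
above or convert y with factor() before calling svm.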
I guess you'll want to report accuracy/MSE/something for your
model on your testing set? Just load the data in the same way, then
use `predict` and calculate the metric you're after. You'll have to have
the labels for your data to do that, though, e.g.:

testdata <- as.matrix(read.delim('testdata.txt'))
test.x <- testdata[,-1]
test.y <- testdata[,1]
preds <- predict(mymodel, test.x)

Let's assume you're doing classification, so let's report the accuracy:

acc <- sum(preds == test.y) / length(test.y)
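
If you want more than a single number, a confusion table is a cheap
addition. A sketch using made-up stand-ins for preds and test.y (the
values below are invented, not output from a real model):

```r
# Hypothetical stand-ins for the output of predict() and the true labels
preds  <- factor(c("1", "2", "2", "1", "1"), levels = c("1", "2"))
test.y <- c(1, 2, 1, 1, 2)

# Compare as character so a factor and a numeric vector line up predictably;
# mean() of a logical vector is the same as sum(...) / length(...)
acc <- mean(as.character(preds) == as.character(test.y))

# Rows are predicted classes, columns are true classes
confusion <- table(predicted = preds, actual = test.y)
```

The diagonal of confusion counts the correct predictions per class.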

Does that help?
-steve
1 day later
#
Hi,
On Fri, Jan 8, 2010 at 11:57 AM, Amy Hessen <amy_4_5_84 at hotmail.com> wrote:
No problem.
No, that's not correct.

There are two svm functions, one that takes a "formula" object
(svm.formula), and one that takes an x matrix, and a y vector
(svm.default). The svm.formula function is called when the first
argument in your "svm(..)" call is a formula object. This function
simply parses the formula and manipulates your data object into an x
matrix and y vector, then calls the svm.default function with those
params ... I usually prefer to just skip the formula and provide the x
and y objects directly.

Load the e1071 library and look at the source code:

R> library(e1071)
R> e1071:::svm.formula

You'll see what I mean.
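
A quick way to check this without reading the source is to fit the same
model through both interfaces and compare. This is only a sketch on iris;
the object names m.formula and m.xy are made up for the example:

```r
library(e1071)

# Formula interface: dispatches to svm.formula
m.formula <- svm(Species ~ ., data = iris)

# x/y interface: dispatches to svm.default
x    <- subset(iris, select = -Species)
y    <- iris$Species
m.xy <- svm(x, y)

# The formula-built object carries an extra class that predict() can check for
class(m.formula)
class(m.xy)

# Fraction of training rows on which the two fits agree
agree <- mean(as.character(fitted(m.formula)) == as.character(fitted(m.xy)))
```

Both calls end up in svm.default, so on the same data they should give
the same fit.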
The author of the e1071 package did you a favor. The predict.svm
function checks to see if your svm object was built using the formula
interface ... if so, it looks for your label column in the data you are
trying to predict on and ignores it.

Look at the function's source code (eg, type e1071:::predict.svm at
the R prompt), and look for the call to the delete.response function
... you can also look at the help in ?delete.response.
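
To see what delete.response does on its own, outside of predict.svm,
here is a base-R sketch on iris. It only illustrates the stats function;
it is not a copy of what predict.svm does internally:

```r
# Terms object for the same formula used to fit the model
tt <- terms(Species ~ ., data = iris)

# Same terms with the response (the label) dropped
tt.x <- delete.response(tt)

resp.before <- attr(tt, "response")     # position of the response in the formula
resp.after  <- attr(tt.x, "response")   # 0 means "no response"

# A model frame built from the stripped terms has no Species column,
# even though iris itself still contains one
mf <- model.frame(tt.x, iris)
has.label <- "Species" %in% names(mf)
```

This is why handing predict a data.frame that still contains the label
column is harmless when the model was built with a formula.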

-steve

2 days later
11 days later
#
On Sun, 24 Jan 2010, Amy Hessen wrote:

            
I can!

By following the _posting guide_, I see in the 'Do Your Homework' section 
that I should try something like:

 	RSiteSearch("feature selection")

and

 	RSiteSearch("genetic algorithm")

And each seems to produce lots of good candidates!

HTH,

Chuck

p.s. Don't forget to check the Task Views on CRAN
Charles C. Berry
Dept of Family/Preventive Medicine, UC San Diego
La Jolla, San Diego 92093-0901
(858) 534-2098
cberry at tajo.ucsd.edu
http://famprevmed.ucsd.edu/faculty/cberry/