How to create a column in dependence of another column

3 messages · fxen3k, Rui Barradas, William Dunlap

#
Hi there,

Sorry for the vague subject line; I couldn't describe it any better.

In my dataset called "dataSet" I want to create a new column called
"deal_category" which depends on another column called "trans_value".
The column "trans_value" holds values in USD millions (USDm). I want to
assign each value one of the categories "low", "medium" or "high",
depending on its size:

"low", if the value x in "trans_value" is: x < 200 USDm
"medium", if the value x in "trans_value" is: 200 USDm <= x < 500 USDm
"high", if the value x in "trans_value" is: x >= 500 USDm

Having classified the deals as low, medium or high, I want to run lm() with
these categories as an independent variable.

deal_category2 <- factor(deal_category)
levels(deal_category2) <- c("low", "medium", "high")
reg_1 <- lm(dep_var1 ~ indep_1 + indep_2 + deal_category2)
summary(reg_1)

Is this correct? Does R recognize my categories as variables?

Thanks for all your support!

Felix



--
View this message in context: http://r.789695.n4.nabble.com/How-to-create-a-column-in-dependence-of-another-column-tp4645548.html
Sent from the R help mailing list archive at Nabble.com.
#
Hello,

To create the new variable, try:


dataSet <- within(dataSet,
    deal_category <- ifelse(trans_value < 200, "low",
        ifelse(trans_value < 500, "medium", "high")))


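The rest is mostly ok, with one caveat: factor() sorts its levels
alphabetically ("high", "low", "medium"), so assigning
levels(deal_category2) <- c("low", "medium", "high") afterwards renames the
levels instead of reordering them. Set the order when creating the factor,
with factor(deal_category, levels = c("low", "medium", "high")), and pass
data = dataSet to lm() if the variables live in the data frame. Run the code
and check the result.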
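
A minimal sketch of how the level order affects the factor code from the question. The four category values are made up for illustration, and the names "relabelled" and "fixed" are hypothetical. It compares assigning levels after factor() with passing levels = when the factor is created.

```r
# Made-up category values, for illustration only
deal_category <- c("low", "high", "medium", "low")

# Pattern from the question: factor() sorts the levels alphabetically
# ("high", "low", "medium"), so levels<- renames them position by position:
# "high" -> "low", "low" -> "medium", "medium" -> "high"
relabelled <- factor(deal_category)
levels(relabelled) <- c("low", "medium", "high")

# Setting the order at creation keeps every value attached to its own label
fixed <- factor(deal_category, levels = c("low", "medium", "high"))

as.character(relabelled)  # "medium" "low" "high" "medium"
as.character(fixed)       # "low" "high" "medium" "low"
```

With the explicit levels = argument, "low" is also the first level, so lm() uses it as the reference category.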

Hope this helps,

Rui Barradas
On 09-10-2012 10:25, fxen3k wrote:
#
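It might be simpler to use cut(). By default its intervals are closed on the
right, which would put 200 in "low"; right=FALSE closes them on the left, so
200 falls in "medium" and 500 in "high", as requested:
   > trans_value <- c(3000, 200, 400, 50, 2000)
   > cut(trans_value, breaks=c(-Inf, 200, 500, Inf), labels=c("low","medium","high"), right=FALSE)
   [1] high   medium medium low    high
   Levels: low medium high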
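
To tie this back to the original question, here is a rough sketch that adds the cut() result as a column of dataSet and uses it in lm(). The data are simulated purely for illustration. The column names follow the thread; the sample size, the seed and the generated values are assumptions.

```r
# Simulated data, for illustration only; column names follow the thread
set.seed(1)
n <- 60
dataSet <- data.frame(
  trans_value = runif(n, min = 10, max = 1000),
  indep_1 = rnorm(n),
  indep_2 = rnorm(n)
)
dataSet$dep_var1 <- 1 + 0.5 * dataSet$indep_1 + rnorm(n)

# right = FALSE gives the intervals [-Inf, 200), [200, 500), [500, Inf)
dataSet$deal_category <- cut(dataSet$trans_value,
                             breaks = c(-Inf, 200, 500, Inf),
                             labels = c("low", "medium", "high"),
                             right = FALSE)

# cut() already returns a factor whose first (reference) level is "low"
reg_1 <- lm(dep_var1 ~ indep_1 + indep_2 + deal_category, data = dataSet)
summary(reg_1)
```

Because cut() returns a factor with the levels in the order of the labels, no separate factor() or levels() step is needed before the regression.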

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com