
Reclassify string values

5 messages · Zev Ross, Peter Langfelder, David Winsemius

#
Hi All,

Is there a simple way to convert a character vector such as c("A", "B", "C", "D") 
to c("Group1", "Group1", "Group2", "Group2")? Naturally I 
could use the factor function as below, but I don't like seeing the 
warning message (and I don't want to turn off warning messages). Perhaps 
there is a function called "reclassify" or "recategorize"?

Zev

x<-LETTERS[1:4]
x2<-as.character(factor(x, levels=LETTERS[1:4], labels=rep(c("Group1", 
"Group2"), each=2)))

Warning message:
In `levels<-`(`*tmp*`, value = c("Group1", "Group1", "Group2", "Group2" :
   duplicated levels will not be allowed in factors anymore
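
For reference, the factor route can be kept without repeating labels in the call. A minimal sketch, assuming base R only: assigning a named list to levels() (the form documented in ?levels) merges several old levels into one new level. The variable names f and x2 are illustrative.

```r
x <- LETTERS[1:4]

# Build the factor with its natural levels first ("A", "B", "C", "D").
f <- factor(x)

# Each list name is a new level; each element lists the old levels it absorbs.
levels(f) <- list(Group1 = c("A", "B"), Group2 = c("C", "D"))

# Convert back to a plain character vector, as in the original example.
x2 <- as.character(f)
```

Any old level left out of the list would become NA, so the list should cover every value that can occur.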
#
On Thu, Nov 3, 2011 at 11:59 AM, Zev Ross <zev at zevross.com> wrote:
If you want to "translate", why not first build a translation table:

tt = cbind(LETTERS[1:4], c("group1", "group1", "group2", "group2"))

then apply it on an example:

xx = sample(LETTERS[1:4], 20, replace = TRUE)

translation = tt[ match(xx, tt[, 1]), 2]
translation
 [1] "group2" "group2" "group2" "group2" "group2" "group1" "group2" "group1"
 [9] "group2" "group1" "group1" "group2" "group2" "group2" "group1" "group2"
[17] "group2" "group1" "group1" "group2"

Or did I misunderstand your intent?

Peter
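
The same translation-table idea can also be written with a named character vector, indexing by name instead of calling match() on a matrix column. This is only a sketch of an equivalent base R idiom, not part of Peter's reply; the name lookup is illustrative.

```r
# Names are the input categories, values are the output groups.
lookup <- c(A = "group1", B = "group1", C = "group2", D = "group2")

xx <- sample(LETTERS[1:4], 20, replace = TRUE)

# Indexing by name pulls the group for each element of xx;
# unname() drops the carried-over names so the result is a plain vector.
translation <- unname(lookup[xx])
```

As with match(), a value of xx that has no entry in the table comes back as NA.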
#
Hi Peter,

Thanks for the response. What you've suggested works fine but I'm 
looking for something that is simpler than my solution and avoids the 
pesky warning message. Your response avoids the warning message but is just 
as complex (if not more so). I just assumed there would be a function along 
the lines of:

 > mydata <- c("A", "C", "A", "D", "B", "B")
 > reclassify(mydata, inCategories=c("A", "B" ,"C", "D"),  
outCategories=c("Group1", "Group1", "Group2", "Group2"))

[1] "Group1" "Group2" "Group1" "Group2" "Group1" "Group1"

Zev
On 11/3/2011 3:13 PM, Peter Langfelder wrote:

  
    
#
On Thu, Nov 3, 2011 at 1:31 PM, Zev Ross <zev at zevross.com> wrote:
But of course, except sometimes you have to write the function yourself.

reclassify = function(data, inCategories, outCategories)
{
   outCategories[ match(data, inCategories)]
}

Sorry I can't make it any simpler than a 1-line solution :)

Feel free to add some checking of input validity, if you need that.

Peter
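
As one possible reading of "checking of input validity", here is a sketch that wraps the same match() one-liner. The name reclassify_checked and the particular checks chosen are assumptions for illustration, not something from the thread.

```r
# Hypothetical variant of reclassify() with basic input validation.
reclassify_checked <- function(data, inCategories, outCategories) {
  # Each input category needs exactly one output category.
  if (length(inCategories) != length(outCategories)) {
    stop("inCategories and outCategories must have the same length")
  }
  # match() only uses the first hit, so a duplicated input category
  # would have its later mapping silently ignored.
  if (anyDuplicated(inCategories) > 0) {
    stop("inCategories must not contain duplicates")
  }
  idx <- match(data, inCategories)
  # Non-missing values without a mapping become NA; warn rather than fail.
  unmatched <- is.na(idx) & !is.na(data)
  if (any(unmatched)) {
    warning("values without a mapping: ",
            paste(unique(data[unmatched]), collapse = ", "))
  }
  outCategories[idx]
}

mydata <- c("A", "C", "A", "D", "B", "B")
result <- reclassify_checked(mydata,
                             inCategories  = c("A", "B", "C", "D"),
                             outCategories = c("Group1", "Group1", "Group2", "Group2"))
```

Whether an unmapped value should give a warning, an error, or pass through unchanged depends on the data at hand; the warning here is just one choice.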
#
On Nov 3, 2011, at 4:40 PM, Peter Langfelder wrote:

            
It will be difficult to beat a one-liner like that. If Zev is still 
holding out for a canned solution he might look in the 'car' package, 
where there is at least one function that does releveling and 
grouping. I forget its name at the moment, but it wouldn't hurt a new 
learneR to scroll through the entire 'car' suite of functions.