Changing entries of a column of type "factor"/Adding a new level to a factor
Perfectly sensible, and indeed what I originally wrote. But it only works for my trivial example, not the general situation where it might not be the first level that needs changing. -- Bert
On Mon, Aug 27, 2012 at 1:52 PM, David Winsemius <dwinsemius at comcast.net> wrote:
On Aug 27, 2012, at 12:18 PM, Bert Gunter wrote:
Well... see below.
-- Cheers, Bert
On Mon, Aug 27, 2012 at 9:19 AM, David Winsemius <dwinsemius at comcast.net> wrote:
On Aug 27, 2012, at 3:09 AM, Fridolin wrote:
What is a smart way to change an entry inside a column of a dataframe or matrix which is of type "factor"? Here is my script incl. input data:
#set working directory:
setwd("K:/R")
#read in data:
input<-read.table("Exampleinput.txt", sep="\t", header=TRUE)
#check data:
input
   Ind      M1      M2      M3
1    1   96/98 120/120     0/0
2    2 102/108 120/124 305/305
3    3  96/108 120/120     0/0
4    4     0/0 116/120 300/305
5    5  96/108 120/130 300/305
6    6   98/98 116/120 300/305
7    7  98/108 120/120 305/305
8    8  98/108 120/120 305/305
9    9  98/102 120/124 300/300
10  10 108/108 120/120 305/305
str(input)
'data.frame': 10 obs. of 4 variables:
 $ Ind: int 1 2 3 4 5 6 7 8 9 10
 $ M1 : Factor w/ 8 levels "0/0","102/108",..: 5 2 4 1 4 8 7 7 6 3
 $ M2 : Factor w/ 4 levels "116/120","120/120",..: 2 3 2 1 4 1 2 2 3 2
 $ M3 : Factor w/ 4 levels "0/0","300/300",..: 1 4 1 3 3 3 4 4 2 4
#replace 0/0 by 999/999:
for (r in 1:10)
  for (c in 2:4)
    if (input[r,c]=="0/0") input[r,c]<-"999/999"
Warning messages:
1: In `[<-.factor`(`*tmp*`, iseq, value = "999/999") :
  invalid factor level, NAs generated
2: In `[<-.factor`(`*tmp*`, iseq, value = "999/999") :
  invalid factor level, NAs generated
3: In `[<-.factor`(`*tmp*`, iseq, value = "999/999") :
  invalid factor level, NAs generated
input
   Ind      M1      M2      M3
1    1   96/98 120/120    <NA>
2    2 102/108 120/124 305/305
3    3  96/108 120/120    <NA>
4    4    <NA> 116/120 300/305
5    5  96/108 120/130 300/305
6    6   98/98 116/120 300/305
7    7  98/108 120/120 305/305
8    8  98/108 120/120 305/305
9    9  98/102 120/124 300/300
10  10 108/108 120/120 305/305

I want to replace all "0/0" by "999/999". My code should work for columns of type "character" and "integer". But to make it work for a "factor" column I would first need to add the new level "999/999", I guess. How do I add a new level?

?levels
levels(input$M1) <- c(levels(input$M1), "999/999")
This adds an additional level; then you have to replace the "0/0" level with this one; then you have to call levels again to remove the "0/0" level.
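The three steps just described can be sketched on one column as follows. This is only an illustration: the values of M3 are copied from the posted data, and the use of droplevels() for the last step is an assumption on my part (the reply above says to call levels again, which is a different route to the same result).

```r
# Rebuild the M3 column from the question as a factor (values copied from the post).
M3 <- factor(c("0/0", "305/305", "0/0", "300/305", "300/305",
               "300/305", "305/305", "305/305", "300/300", "305/305"))

# Step 1: append the new level so that assigning it is legal.
levels(M3) <- c(levels(M3), "999/999")

# Step 2: overwrite the matching entries (vectorised, no loop over rows).
M3[M3 == "0/0"] <- "999/999"

# Step 3: drop the now-unused "0/0" level.
# droplevels() is an assumed choice here, not the method named in the thread.
M3 <- droplevels(M3)

levels(M3)
```

Note that the new level ends up last, because it was appended in step 1.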
Then do it this way (different from what I thought was originally desired):
x <- factor(letters[1:3])
levels(x) <- c("d", levels(x)[2:3])
x
[1] d b c
Levels: d b c
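The same renaming idea can be made independent of the level's position by selecting the level by value rather than by index, which may address the objection that it need not be the first level that changes. The sketch below is an assumption-laden illustration: rename_level() is a hypothetical helper, not something from the thread, and the data frame is a small stand-in built from a few of the posted values.

```r
# Small stand-in for the questioner's data (a subset of the posted values).
input <- data.frame(
  Ind = 1:4,
  M1  = factor(c("96/98", "102/108", "96/108", "0/0")),
  M3  = factor(c("0/0", "305/305", "0/0", "300/305"))
)

# Hypothetical helper: rename one level wherever it sits in the levels vector.
# If 'old' is not a level of f, nothing changes.
rename_level <- function(f, old, new) {
  levels(f)[levels(f) == old] <- new
  f
}

# Apply to every factor column; non-factor columns (here Ind) are left alone.
is_fac <- vapply(input, is.factor, logical(1))
input[is_fac] <- lapply(input[is_fac], rename_level,
                        old = "0/0", new = "999/999")
input
```

Because only the label is changed, the renamed level keeps the position the old one had, and no NAs are generated.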
I think the following slight tweak may be preferred, as illustrated with a little example (opinions?):
x <- factor(letters[1:3])
x
[1] a b c
Levels: a b c
## create a new levels vector
newlvl <- levels(x)
newlvl[newlvl == "a"] <- "d"
## Create the new factor and replace the old with it
x <- factor(newlvl[x])
x
[1] d b c
Levels: b c d

Note, however, as Bill D. said, in either case your level ordering -- which will be used, e.g. in printing and displaying -- will be weird.
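If the re-sorted level order is a concern, one possible variation (an assumption on my part, not something proposed in the thread) is to pass the edited levels vector explicitly to factor(), so the renamed level stays where the old one was:

```r
x <- factor(letters[1:3])

# Edit a copy of the levels vector, as in the example above.
newlvl <- levels(x)
newlvl[newlvl == "a"] <- "d"

# Supplying levels = explicitly keeps the original positions instead of
# letting factor() sort the labels; unique() guards against the case where
# the new label already exists as a level.
y <- factor(newlvl[x], levels = unique(newlvl))
levels(y)
```

Here "d" remains the first level, which matches the result of assigning to levels(x) directly.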
So the above method might be what you expect. Several options are now available to the questioner. -- David.
-- David Winsemius, MD Heritage Laboratories West Hartford, CT
-- Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
David Winsemius, MD Alameda, CA, USA