How to create a new variable based on parts of another character variable.
4 messages · Philipp Fischer, jim holtman, Jim Lemon +1 more
On 10/24/2011 12:35 AM, Philipp Fischer wrote:
Hello, I am just starting with R and I am having a (most probably) stupid problem creating a new variable in a data.frame based on a part of another character variable. I have a data frame like this one:

A          B   C
AWI-test1  1   i
AWI-test5  2   r
AWI-tes75  56  z
UFT-2      5   I
UFT56      f   t
UFT356     9j  t
etc. etc. 89 t

I now want to look in the variable A to see if the string AWI is present and then create a variable D and put "Arctic" inside. However, if the string UFT occurs in the variable A, then the variable D shall be "Boreal" etc. etc. The resulting data.frame should look like

A          B   C  D
AWI-test1  1   i  Arctic
AWI-test5  2   r  Arctic
AWI-tes75  56  z  Arctic
UFT-2      5   I  Boreal
UFT56      f   t  Boreal
UFT356     9j  t  Boreal
etc. etc. 89 t
Hi Philipp,
Since you mentioned that you were just starting with R, it might be a
little optimistic to throw you into the regular expression cage and
expect you to emerge unscathed. You can do this by constructing a 2
column matrix or data frame of replacement values:
replacements<-matrix(c("AWI","UFT","Arctic","Boreal"),ncol=2)
replacements
     [,1]  [,2]
[1,] "AWI" "Arctic"
[2,] "UFT" "Boreal"
Then write a function using grep to replace the values:
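swapLabels<-function(x,y) {
 # try each pattern (column 1 of y) and return its label (column 2) on the first match
 for(swaprow in seq_len(nrow(y)))
  if(length(grep(y[swaprow,1],x))) return(y[swaprow,2])
 # no pattern matched
 return(NA)
}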
Finally, apply the function to the first column of the data frame:
pf.df$D<-unlist(lapply(pf.df[,1],swapLabels,replacements))
pf.df$D
[1] "Arctic" "Arctic" "Arctic" "Boreal" "Boreal" "Boreal"
Jim
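For readers who want to run the steps above from start to finish, here is a minimal sketch. It assumes the data frame is called pf.df, as in the reply, and rebuilds it by hand from the six complete rows shown in the question (the "etc." row is left out). Using grepl() with fixed = TRUE and vapply() is a small variation on the reply's grep()/lapply() calls, not the original code.

```r
# Example data rebuilt by hand from the rows shown in the question;
# stringsAsFactors = FALSE keeps A as plain character strings.
pf.df <- data.frame(
  A = c("AWI-test1", "AWI-test5", "AWI-tes75", "UFT-2", "UFT56", "UFT356"),
  B = c("1", "2", "56", "5", "f", "9j"),
  C = c("i", "r", "z", "I", "t", "t"),
  stringsAsFactors = FALSE
)

# Column 1: text to search for; column 2: label to assign
replacements <- matrix(c("AWI", "UFT", "Arctic", "Boreal"), ncol = 2)

# Return the label of the first pattern found in x, or NA if none matches
swapLabels <- function(x, y) {
  for (swaprow in seq_len(nrow(y))) {
    # fixed = TRUE searches for the literal text, not a regular expression
    if (grepl(y[swaprow, 1], x, fixed = TRUE)) return(y[swaprow, 2])
  }
  NA_character_
}

# vapply() checks that every call returns exactly one character value
pf.df$D <- vapply(pf.df$A, swapLabels, character(1), y = replacements,
                  USE.NAMES = FALSE)
pf.df$D
# [1] "Arctic" "Arctic" "Arctic" "Boreal" "Boreal" "Boreal"
```

Adding a further region only needs another row in replacements; the function itself stays unchanged.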
Hi

If you want to avoid regular expressions altogether and your A values start with AWI for Arctic and UFT for Boreal, you can use

DF$D <- ifelse(substr(DF$A, 1,1) == "A", "Arctic", "Boreal")

Regards
Petr
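Note that the one-character test labels every value that does not start with "A" as "Boreal". If more than two prefixes are expected, the same substr() idea might be extended with a lookup vector, as in the sketch below. The names DF and regions and the extra "XYZ-9" row are illustrative assumptions, not part of the original thread.

```r
# Small example frame; "XYZ-9" is a made-up row with an unknown prefix
DF <- data.frame(A = c("AWI-test1", "UFT-2", "UFT356", "XYZ-9"),
                 stringsAsFactors = FALSE)

# Hypothetical lookup table: three-letter prefix -> region label
regions <- c(AWI = "Arctic", UFT = "Boreal")

# Take the first three characters of A and look each one up by name;
# prefixes that are missing from the table come back as NA
DF$D <- unname(regions[substr(DF$A, 1, 3)])
DF$D
```

This only works while every code sits at the very start of A and has the same length; otherwise a pattern search such as grepl() is the safer choice.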
Hello, I am just starting with R and I am having a (most probably) stupid problem creating a new variable in a data.frame based on a part of another character variable. I have a data frame like this one:

A          B   C
AWI-test1  1   i
AWI-test5  2   r
AWI-tes75  56  z
UFT-2      5   I
UFT56      f   t
UFT356     9j  t
etc. etc. 89 t

I now want to look in the variable A to see if the string AWI is present and then create a variable D and put "Arctic" inside. However, if the string UFT occurs in the variable A, then the variable D shall be "Boreal" etc. etc. The resulting data.frame should look like

A          B   C  D
AWI-test1  1   i  Arctic
AWI-test5  2   r  Arctic
AWI-tes75  56  z  Arctic
UFT-2      5   I  Boreal
UFT56      f   t  Boreal
UFT356     9j  t  Boreal
etc. etc. 89 t

I know how to do this when I want to look for the entire string of A, that is, when there is "AWI-test1" and I then create the variable D with "Arctic", but not how to look only for a substring in A. It would be great if somebody could help.

Thanks
Philipp
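The difference between matching the entire string and matching only a part of it can be sketched as follows. This is a minimal illustration with a hand-made data frame DF (an assumption, not the poster's data), using grepl() with fixed = TRUE so that the search text is taken literally.

```r
DF <- data.frame(A = c("AWI-test1", "AWI-test5", "UFT-2", "UFT56"),
                 stringsAsFactors = FALSE)

# Whole-string comparison: TRUE only where A is exactly "AWI-test1"
DF$A == "AWI-test1"

# Substring test: TRUE wherever the text occurs anywhere in A
is_awi <- grepl("AWI", DF$A, fixed = TRUE)
is_uft <- grepl("UFT", DF$A, fixed = TRUE)

# Start with NA everywhere, then fill in the rows that matched
DF$D <- NA_character_
DF$D[is_awi] <- "Arctic"
DF$D[is_uft] <- "Boreal"
DF
```

Rows that contain neither text keep NA in D, which makes unclassified values easy to spot.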
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.