Hi,
I am using colsplit (package reshape2) to split all the strings
in a column according to the same pattern. Here
is an example:
library(reshape2)
df1 <- data.frame(x=c("str1_name2", "str3_name5"))
df2 <- data.frame(df1, colsplit(df1$x, pattern = "_", names=c("str","name")))
This is nearly what I want, but I would like to remove the words "str" and
"name" from the values, because the columns are already named with
those words. Is there a way to remove them using colsplit? Or any other
simple way?
/johannes
Colsplit, removing parts of a string
5 messages · Johannes Radinger, Ista Zahn, Rui Barradas +1 more
Hi Johannes, On Thu, Sep 27, 2012 at 7:25 AM, Johannes Radinger
<johannesradinger at gmail.com> wrote:
[...]
You can remove them afterwards, e.g.,
df2$str <- gsub("[^0-9]", "", df2$str)
df2$name <- gsub("[^0-9]", "", df2$name)
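Note that gsub returns character vectors, so the cleaned columns are still text. A minimal base-R sketch of the same clean-up with a numeric conversion added; it assumes every piece contains at least one digit, and the use of strsplit in place of colsplit and of as.integer is an illustration only, not part of the suggestion above:

```r
# Base-R sketch: split on "_" and keep only the digits of each piece.
x <- c("str1_name2", "str3_name5")

# strsplit returns a list; rbind the pieces into a 2-column character matrix.
parts <- do.call(rbind, strsplit(x, "_", fixed = TRUE))

# Strip everything that is not a digit, then convert to integer.
df2 <- data.frame(
  x    = x,
  str  = as.integer(gsub("[^0-9]", "", parts[, 1])),
  name = as.integer(gsub("[^0-9]", "", parts[, 2]))
)
df2
```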
Best,
Ista
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Hello,
By looking at the output of
pat <- "(str)|(_name)|( name)"
strsplit(c("str1_name2", "str3_name5"), pat)
[[1]]
[1] "" "1" "2"
[[2]]
[1] "" "3" "5"
I could see why colsplit puts NAs in the 'str' column: each string
starts with a match, so the first piece is empty.
So the hack is to pretend we want three columns and then set the first one
to NULL.
df2 <- data.frame(df1, colsplit(df1$x, pattern = pat, names=c("Null",
"str","name")))
df2$Null <- NULL
df2
I don't like it very much but it's simple and it works.
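The same idea (drop the empty leading piece) can be sketched in base R without colsplit. This is only an illustration under the assumption that every string starts with "str"; the simplified pattern and the names pieces and parts are made up for the example:

```r
x <- c("str1_name2", "str3_name5")
pat <- "(str)|(_name)"

# Every string begins with a match ("str"), so each result starts with "".
pieces <- strsplit(x, pat)

# Drop that empty first element and bind the rest into a matrix.
parts <- do.call(rbind, lapply(pieces, function(p) p[-1]))

df2 <- data.frame(
  x    = x,
  str  = as.integer(parts[, 1]),
  name = as.integer(parts[, 2])
)
df2
```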
Hope this helps,
Rui Barradas
On 27-09-2012 12:25, Johannes Radinger wrote:
[...]
Hi,
You can also try this:
df2 <- data.frame(df1, colsplit(df1$x, pattern = "_", names=c("str","name")))
df2list<-list(df2$str,df2$name)
df2[,2:3]<-sapply(df2list,function(x) gsub(".*(\\d)","\\1",x))
df2
#           x str name
#1 str1_name2   1    2
#2 str3_name5   3    5
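One caveat worth checking on your own data: because ".*" is greedy, the pattern ".*(\\d)" keeps only the last digit of each value. That is fine for the single-digit example, but it would truncate longer numbers. A small sketch with made-up multi-digit values (not from the original data), comparing it with removing only the leading non-digits:

```r
# Hypothetical values with more than one digit.
x <- c("str12", "name345")

# Greedy .* swallows everything up to the final digit.
last_digit <- gsub(".*(\\d)", "\\1", x)

# Removing the leading run of non-digits keeps the whole number.
all_digits <- sub("^\\D+", "", x)

last_digit
all_digits
```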
A.K.
----- Original Message -----
From: Ista Zahn <istazahn at gmail.com>
To: Johannes Radinger <johannesradinger at gmail.com>
Cc: r-help at r-project.org
Sent: Thursday, September 27, 2012 7:43 AM
Subject: Re: [R] Colsplit, removing parts of a string
[...]
Thank you, this works perfectly... best regards, Johannes
On Thu, Sep 27, 2012 at 1:43 PM, Ista Zahn <istazahn at gmail.com> wrote:
[...]