Colsplit, removing parts of a string

5 messages · Johannes Radinger, Ista Zahn, Rui Barradas +1 more

#
Hi,

I am using colsplit (package = reshape2) to split all strings
in a column according to the same pattern. Here is
an example:

library(reshape2)


df1 <- data.frame(x=c("str1_name2", "str3_name5"))
df2 <- data.frame(df1, colsplit(df1$x, pattern = "_", names=c("str","name")))

This is nearly what I want, but I would like to remove the words "str" and
"name" from the values, because the columns are already named with
those words. Is there a way to remove them using colsplit? Or any other
simple way?

/johannes
#
Hi Johannes,

On Thu, Sep 27, 2012 at 7:25 AM, Johannes Radinger
<johannesradinger at gmail.com> wrote:
You can remove them afterwards, e.g.,

df2$str <- gsub("[^0-9]", "", df2$str)
df2$name <- gsub("[^0-9]", "", df2$name)
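
Note that "[^0-9]" deletes every non-digit character and leaves the columns as character strings. A minimal sketch of a stricter variant follows, assuming the values always start with the literal prefixes "str" and "name". The data frame here is a hand-built stand-in for the colsplit() result so that the sketch runs without reshape2, and the as.integer() conversion is an optional extra that nobody in the thread asked for.

```r
# Hand-built stand-in for the colsplit() result (assumption: same values
# as in the example, stored as character).
df2 <- data.frame(x    = c("str1_name2", "str3_name5"),
                  str  = c("str1", "str3"),
                  name = c("name2", "name5"),
                  stringsAsFactors = FALSE)

# Strip only the leading prefix, then convert what is left to integer.
df2$str  <- as.integer(sub("^str",  "", df2$str))
df2$name <- as.integer(sub("^name", "", df2$name))

df2
```

Anchoring the pattern with "^" means a value that does not start with the expected prefix is left unchanged (and becomes NA on conversion, with a warning) instead of being silently reduced to its digits.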

Best,
Ista
#
Hello,

By looking at the output of

pat <- "(str)|(_name)|( name)"
strsplit(c("str1_name2", "str3_name5"), pat)
[[1]]
[1] ""  "1" "2"

[[2]]
[1] ""  "3" "5"

I could understand why colsplit puts NAs in the 'str' column: each string
starts with a match, so the first piece of the split is empty.
So the hack is to pretend we want three columns and then set the first one
to NULL.

df2 <- data.frame(df1, colsplit(df1$x, pattern = pat, names=c("Null", 
"str","name")))
df2$Null <- NULL
df2

I don't like it very much but it's simple and it works.
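
The same idea can probably be sketched in base R alone, dropping the empty leading piece directly instead of creating a throwaway column. This is only an illustration under the assumption that every string splits into exactly three pieces with this pattern; the names parts, m and res are made up for the example.

```r
x   <- c("str1_name2", "str3_name5")
pat <- "(str)|(_name)|( name)"

# Each string starts with a match, so the first piece is "".
parts <- strsplit(x, pat)

# Drop that empty first piece and stack the rest into a matrix.
m <- do.call(rbind, lapply(parts, function(p) p[-1]))

res <- data.frame(x    = x,
                  str  = as.integer(m[, 1]),
                  name = as.integer(m[, 2]),
                  stringsAsFactors = FALSE)
res
```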

Hope this helps,

Rui Barradas
On 27-09-2012 12:25, Johannes Radinger wrote:
#
Hi,
You can also try this:
df2 <- data.frame(df1, colsplit(df1$x, pattern = "_", names=c("str","name")))
df2list<-list(df2$str,df2$name)
df2[,2:3]<-sapply(df2list,function(x) gsub(".*(\\d)","\\1",x))
df2
#           x str name
#1 str1_name2   1    2
#2 str3_name5   3    5
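
One caveat, offered as a side note: because ".*" is greedy, the pattern ".*(\\d)" keeps only the last digit, which is fine for the example data but would truncate multi-digit values. A small sketch of the difference, using made-up input values:

```r
vals <- c("str1", "str12")   # hypothetical values; the second has two digits

# Greedy ".*" swallows everything up to the final digit.
last_digit <- sub(".*(\\d)", "\\1", vals)

# Removing only the leading non-digits keeps the whole number.
all_digits <- sub("^\\D+", "", vals)

last_digit   # "1"  "2"
all_digits   # "1"  "12"
```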
A.K.



#
Thank you,

this works perfectly...

best regards,
Johannes
On Thu, Sep 27, 2012 at 1:43 PM, Ista Zahn <istazahn at gmail.com> wrote: