splitting a character field in R
Here is one additional solution:
read.table(textConnection(sub("abc", " ", B)), fill = TRUE)
With gsub in place of sub it also works if there are more than
two fields. If there can be spaces in the lines, then the sub
should be modified to translate "abc" to some unique character
not appearing in the lines, and sep= should be added to the
read.table call. Also, as.is=TRUE can be added to the read.table
call if it is desired to return character rather than factor
columns, and col.names= can be added if it is desired to control
the naming of the returned columns.
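A minimal sketch of the variant described above for lines that may contain spaces. The tab separator, the column names B1/B2, and the space inside the first sample string are assumptions made for illustration; any character known to be absent from the data would do as the separator.

```r
# Sample data from the thread; the space in the first string is an
# assumption added to show why a plain " " separator is not enough.
B <- c("d gabcrt", "fgrtabc", "sabcuuu")

# Translate every literal "abc" into a tab (assumed not to occur in B),
# then let read.table split on that tab only.
con <- textConnection(gsub("abc", "\t", B, fixed = TRUE))
res <- read.table(con, sep = "\t", fill = TRUE, as.is = TRUE,
                  col.names = c("B1", "B2"),
                  quote = "", comment.char = "")
close(con)  # textConnection() hands back an open connection

res
```

Setting quote = "" and comment.char = "" keeps quotes and "#" in the data from being interpreted by read.table.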
On 10/28/05, Gabor Grothendieck <ggrothendieck at gmail.com> wrote:
You could use:
data.frame(First = sub("abc.*", "", B), Second = sub(".*abc", "", B))
or if you want to prevent conversion to factors:
data.frame(First = I(sub("abc.*", "", B)), Second = I(sub(".*abc", "", B)))
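As a sketch, here is the same idea applied to the sample data from the original question. The column names B1/B2 follow the desired output in that question; stringsAsFactors = FALSE is used here as an alternative to I() and yields plain character columns. It assumes "abc" occurs exactly once in each string.

```r
A <- c(1, 2, 3)
B <- c("dgabcrt", "fgrtabc", "sabcuuu")

# B1: drop "abc" and everything after it.
# B2: drop everything up to and including "abc".
df <- data.frame(A = A, B = B,
                 B1 = sub("abc.*", "", B),
                 B2 = sub(".*abc", "", B),
                 stringsAsFactors = FALSE)
df
```

If a string does not contain "abc" at all, neither pattern matches and both new columns simply repeat the original string, so that case may need separate handling.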
On 10/28/05, ManuelPerera-Chang at fmc-ag.com
<ManuelPerera-Chang at fmc-ag.com> wrote:
Hi Jim,
Thanks for your post. I was aware of strsplit, but really could not find
out how I could use it.
I tried it as in your example ...
A<-c(1,2,3)
B<-c("dgabcrt","fgrtabc","sabcuuu")
C<-strsplit(B,"abc")
C
[[1]]
[1] "dg" "rt"
[[2]]
[1] "fgrt"
[[3]]
[1] "s" "uuu"
Which looks promising, but here C is a list with three elements. How do
I create the two vectors I need from here, that is
("dg","fgrt","s") and ("rt","","uuu")
(or how do I get access to the substrings "rt" or "uuu")?
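One possible way to get from the strsplit list to the two vectors is to pick the i-th piece out of each list element. This is only a sketch: the helper name piece is made up for illustration, and a missing piece is turned into "" to match the desired output above.

```r
B <- c("dgabcrt", "fgrtabc", "sabcuuu")
C <- strsplit(B, "abc", fixed = TRUE)

# Hypothetical helper: take the i-th piece of every list element.
# Indexing past the end of a vector gives NA, which is replaced by "".
piece <- function(parts, i) {
  out <- sapply(parts, function(p) p[i])
  out[is.na(out)] <- ""
  out
}

B1 <- piece(C, 1)
B2 <- piece(C, 2)

# A single substring can be reached with double brackets, e.g. "rt":
C[[1]][2]
```

The results can then be attached to the data frame with something like cbind(df, B1, B2).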
Greetings
Manuel
On 28.10.2005 16:00, jim holtman <jholtman at gmail.com> wrote:
To: "ManuelPerera-Chang at fmc-ag.com" <ManuelPerera-Chang at fmc-ag.com>
cc: r-help at stat.math.ethz.ch
Subject: Re: [R] splitting a character field in R
x <- 'dfabcxy'
strsplit(x, 'abc')
[[1]]
[1] "df" "xy"
On 10/28/05, ManuelPerera-Chang at fmc-ag.com
<ManuelPerera-Chang at fmc-ag.com> wrote:
Dear R users,
I have a dataframe with one character field, and I would like to
create two new fields (columns) in my dataset by splitting the
existing character field into two at an existing substring.
... something that in SAS I could solve e.g. by combining substr
(which I am aware exists in R) and "index" for determining the
position of the pattern within the string.
e.g. if my dataframe is ...
A B
1 dgabcrt
2 fgrtabc
3 sabcuuu
Then by splitting by substring "abc" I would get ...
A B       B1   B2
1 dgabcrt dg   rt
2 fgrtabc fgrt
3 sabcuuu s    uuu
Do you know how to do this basic string (dataframe) manipulation in R?
Saludos,
Manuel
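For comparison, a rough R analogue of the SAS substr plus "index" approach mentioned above might look like this. It is a sketch that assumes "abc" is present in every string; regexpr returns -1 where there is no match, and that case is not handled here.

```r
B <- c("dgabcrt", "fgrtabc", "sabcuuu")
pat <- "abc"

# Position of the first literal "abc" in each string (-1 if absent).
pos <- as.integer(regexpr(pat, B, fixed = TRUE))

# Text before the match, and text after the end of the match.
B1 <- substr(B, 1, pos - 1)
B2 <- substr(B, pos + nchar(pat), nchar(B))
```

substr returns "" when the start position lies past the end of the string, which is what produces the empty second field for "fgrtabc".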
______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html
--
Jim Holtman
Cincinnati, OH
+1 513 247 0281
What is the problem you are trying to solve?