Converting strings with internal delimiters into lists

4 messages · Bert Gunter, Thomas Lumley, shelby berkowitz

#
Hi UserRs,

I know that there has to be an easy way to do this in
R (probably easy enough that once someone clues me in
I'll smack myself on the forehead for not figuring it
out myself), but my searches on my own have not
yielded any hints.

I have many fields in my dataset that participants
entered as "free lists" - i.e., the field constitutes
a varying number of names each separated by a
delimiter.  The resulting data frame might look
something like:

testtable <- as.data.frame(cbind(c("Joe,Mary,Jane","Mary"), c("Fred,Joe","Pete,Joe,Mary,Fred")))

In actuality the names are typically multi-word
organization names, but you get the idea...

What I need to do is to convert these text strings
into lists made up of the comma-separated elements,
so that I can work with these elements across
the dataset, manipulate them, etc.

Thanks in advance to any kind soul who can offer me a
tip to the appropriate functions or a line of code!

Best,

Shelby
#
???  I guess I don't get it.

Note that
as.data.frame(cbind(c("Joe,Mary,Jane","Mary"),c("Fred,Joe","Pete,Joe,Mary,Fred")))

is probably not what you want since cbind expects vectors of equal length.
By default, shorter vectors are recycled to the length of the longest one,
which I doubt is what you want.
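
A small illustration of the recycling rule described above may help. The vectors long_col and short_col are made up for the demonstration. Note that in the example as posted both vectors have length 2, so cbind() has nothing to recycle there.

```r
# Made-up vectors of unequal length (4 and 2)
long_col  <- c("a", "b", "c", "d")
short_col <- c("x", "y")

# cbind() recycles the shorter vector to the length of the longer one
recycled <- cbind(long_col, short_col)
recycled[, "short_col"]   # "x" "y" "x" "y"

# The two vectors from the original post both have length 2,
# so the result is a plain 2 x 2 character matrix
posted <- cbind(c("Joe,Mary,Jane", "Mary"), c("Fred,Joe", "Pete,Joe,Mary,Fred"))
dim(posted)               # 2 2
```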

Why isn't
mylist<-list(c("Joe,Mary,Jane","Mary"),c("Fred,Joe","Pete,Joe,Mary,Fred"),...)

suitable? S lists are specifically designed to handle different objects with
different lengths (or totally different objects). lapply(), sapply() and
friends or for() loops can then work over such a list to do what you want.
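
As a rough sketch of working over such a list, assuming the names have already been separated into one character vector per entry (name_list below is a hypothetical stand-in, not the poster's data):

```r
# Hypothetical list of name vectors with different lengths
name_list <- list(c("Joe", "Mary", "Jane"), "Mary", c("Fred", "Joe"))

# Number of names in each entry
n_names <- sapply(name_list, length)

# Which entries mention "Joe"?
has_joe <- sapply(name_list, function(x) "Joe" %in% x)

# Sort the names within each entry; the result is again a list
sorted_names <- lapply(name_list, sort)

# How often each name occurs across the whole list
name_counts <- table(unlist(name_list))
```

sapply() simplifies its result to a vector when it can, while lapply() always returns a list, which is the safer choice when the per-entry results differ in length.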

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box
#
On Thu, 11 Aug 2005, shelby berkowitz wrote:

            
I think you are looking for strsplit()
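
A minimal sketch of how strsplit() could be applied to the posted example. The column names V1 and V2 and the result names are assumptions for illustration. strsplit() needs character input, so if the columns were stored as factors they would have to go through as.character() first.

```r
# Rebuild the example data, keeping the columns as character vectors
testtable <- data.frame(
  V1 = c("Joe,Mary,Jane", "Mary"),
  V2 = c("Fred,Joe", "Pete,Joe,Mary,Fred"),
  stringsAsFactors = FALSE
)

# strsplit() returns a list with one character vector per input string;
# fixed = TRUE treats the delimiter literally rather than as a regex
v1_names <- strsplit(testtable$V1, ",", fixed = TRUE)
v2_names <- strsplit(testtable$V2, ",", fixed = TRUE)

# Split every column at once: a list of lists, one per column
all_names <- lapply(testtable, strsplit, split = ",", fixed = TRUE)

v1_names[[1]]   # "Joe" "Mary" "Jane"
```

Because the pieces are kept whole between delimiters, multi-word organization names survive intact as long as they contain no commas themselves.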

 	-thomas
Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle
#
yes, that does it.  thanks!
--- Thomas Lumley <tlumley at u.washington.edu> wrote:

            