Basic question: Reading in multiple choice question responses to a single column in data frame

7 messages · Damion Dooley, Frank E Harrell Jr, Magnus Torfason

#
You might look at the mChoice function in the Hmisc package for some 
indirect help.

Frank
Damion Dooley wrote:
#
Are you looking for something like this?

 > d      = data.frame(a=1:5,b=c("1","2,3","2","3,4","1"))
 > d
   a   b
1 1   1
2 2 2,3
3 3   2
4 4 3,4
5 5   1
 > multis = strsplit(d$b,",")
 > counts = sapply(strsplit(d$b,","),length )
 > d2 = data.frame( a=rep(d$a,counts), b=unlist(multis) )
 > d2
   a b
1 1 1
2 2 2
3 2 3
4 3 2
5 4 3
6 4 4
7 5 1

Best,
Magnus
On 8/19/2009 3:12 PM, Damion Dooley wrote:
#
Magnus,

That solution looks like it should work, and I like the flexibility of your
data output, but I get an error, "error in strsplit(d$b,","): non-character
argument", at:

	multis = strsplit(d$b,",")

It seems as though the c() function converts integer-looking items like "1"
into integers, and then strsplit fails on them?  I ran into this earlier
when attempting strsplit directly on column values.
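
For readers who hit the same error, a likely cause is different from the guess above. c() leaves the strings alone. In R versions before 4.0.0, data.frame() converts character vectors to factors by default (stringsAsFactors = TRUE), and strsplit() accepts only character input. The sketch below reproduces that situation; stringsAsFactors is set explicitly so it behaves the same on any R version, and the variable `res` is just an illustrative name.

```r
# Force the pre-4.0 default so the column is stored as a factor
d <- data.frame(a = 1:5, b = c("1", "2,3", "2", "3,4", "1"),
                stringsAsFactors = TRUE)
class(d$b)   # "factor"

# strsplit() rejects factors; capture the error message instead of stopping
res <- tryCatch(strsplit(d$b, ","), error = function(e) conditionMessage(e))
res          # "non-character argument"

# Converting to character first makes the split work
multis <- strsplit(as.character(d$b), ",")
lengths(multis)   # number of choices per row
```

Passing stringsAsFactors = FALSE to data.frame() (or to read.delim() when reading a file) should avoid the conversion in the first place.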



Damion 

#
Hi, Magnus,

I discovered that

	multis = strsplit(as.character(d$b),",")

works in the example you gave.  Thanks very much; it looks like that's the
way I'll go for now.  P.S. For others who may want it, my selected column
was plugged in as follows:
	
	myData=read.delim(myDataFile etc. etc....);
	myColumn = myData[[myQuestion]]; #myQuestion is name of column
	d = data.frame(a=1:length(myColumn),b=myColumn);
	multis = strsplit(as.character(d$b),",");
	etc. as per Magnus's code.
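
Put together, the steps above might look like the following sketch. The inline text `raw`, the column names `id` and `q1`, and the use of read.delim(text = ...) are assumptions standing in for the real data file, which is not shown in this thread.

```r
# Hypothetical tab-delimited survey data, standing in for the real file
raw <- "id\tq1\n1\t1\n2\t2,3\n3\t2\n4\t3,4\n5\t1\n"
myData <- read.delim(text = raw, stringsAsFactors = FALSE)
myQuestion <- "q1"                      # name of the multiple-choice column

myColumn <- as.character(myData[[myQuestion]])
multis <- strsplit(myColumn, ",")       # one character vector per respondent
counts <- lengths(multis)               # number of choices per respondent

# One row per (respondent, choice) pair, as in Magnus's d2
d2 <- data.frame(a = rep(seq_along(myColumn), counts),
                 b = unlist(multis),
                 stringsAsFactors = FALSE)
d2
```

seq_along(myColumn) is used in place of 1:length(myColumn) because it also behaves sensibly when the column is empty.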

And thank you Frank for pointing me to mChoice, which will require further
study on my part.

Regards,

Damion

Damion Dooley  .   LearningPoint.ca  Website Technology   .   604 877 0304


#
Slight addendum.  Working from your code, I found that one line of code does
the conversion:

	myColumn = unlist(strsplit(as.character(myData[[myQuestion]]),","));

But the dataframe you set up may prove more useful.
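
As a rough illustration of what the one-liner gives you, the sketch below applies it to a small made-up vector (`responses` is a hypothetical stand-in for myData[[myQuestion]]) and counts how often each choice was selected. Note that the flat result no longer records which respondent made each selection.

```r
# Hypothetical stand-in for myData[[myQuestion]]
responses <- c("1", "2,3", "2", "3,4", "1")

# The one-liner: a flat vector of every selected choice
allChoices <- unlist(strsplit(as.character(responses), ","))
allChoices

# Frequency of each choice across all respondents
tab <- table(allChoices)
tab
```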

Regards,

Damion

#
On 8/19/2009 11:06 PM, Damion Dooley wrote:
I'm glad my suggestion was useful. My more comprehensive example assumed 
that you needed to be able to match individual multi-choice selections 
with other questions through the observation ID after the processing.
If that is not needed, the one-liner should be adequate.
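
To illustrate the matching described above, here is a minimal sketch that joins the long-format selections to a second question through the observation ID. The data frame `other` and its `group` column are made up for the example.

```r
d <- data.frame(a = 1:5, b = c("1", "2,3", "2", "3,4", "1"),
                stringsAsFactors = FALSE)
multis <- strsplit(d$b, ",")
d2 <- data.frame(a = rep(d$a, lengths(multis)), b = unlist(multis),
                 stringsAsFactors = FALSE)

# Hypothetical second question, one answer per respondent
other <- data.frame(a = 1:5, group = c("x", "y", "x", "y", "x"),
                    stringsAsFactors = FALSE)

# Join on the observation ID, then cross-tabulate choice by group
joined <- merge(d2, other, by = "a")
crosstab <- table(joined$b, joined$group)
crosstab
```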

Best,
Magnus