complex splits - R-help | R Mailing Lists

Thu, Aug 15, 2002 4:25 PM

Hi everyone,

I'm having trouble figuring out how to split a dataframe more than once.

Let's say I have a dataframe d with a certain column called splitcol
composed of four possible ordinal values.  The same dataframe has
two other columns, col1 and col2, that have one of two possible values
each. I'd like to split d$splitcol based on col1 and col2 so I can report
frequencies of the subgroups.

So if col1 and col2 can each have two different values, after splitting
I should have four vectors based on d$splitcol.

Can I do this in one step, or do I need to do some intermediate
splitting?

-Tim

Tim Wilson      |   Visit Sibley online:   | Check out:
Henry Sibley HS |  http://www.isd197.org   | http://www.zope.com
W. St. Paul, MN |                          | http://slashdot.org
wilson at visi.com |  <dtml-var pithy_quote>  | http://linux.com
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

Pierre Kleiber

Thu, Aug 15, 2002 5:30 PM

split(d$splitcol,paste(d$col1,d$col2))

will give you vectors as elements of a list based on possible
combinations of col1 and col2.  Is that what you're after?
   Cheers, Pierre

Tim Wilson wrote:

-----------------------------------------------------------------
Pierre Kleiber             Email: pkleiber at honlab.nmfs.hawaii.edu
Fishery Biologist                     Tel: 808 983-5399/737-7544
NOAA FISHERIES - Honolulu Laboratory         Fax: 808 983-2902
2570 Dole St., Honolulu, HI 96822-2396
-----------------------------------------------------------------
  "God could have told Moses about galaxies and mitochondria and
   all.  But behold... It was good enough for government work."
-----------------------------------------------------------------



Tim Wilson

Thu, Aug 15, 2002 5:42 PM

On Thu, Aug 15, 2002 at 02:30:08PM -1000, Pierre Kleiber wrote:

No, when I try this I only get two groups based on d$col1. What I'm
looking for is a split on d$col1 (producing two groups) and a subsequent
split on d$col2 for a total of four groups. Then I'll do frequencies of
those four groups.

-Tim


Pierre Kleiber

Thu, Aug 15, 2002 5:55 PM

Strange that you only get two groups.  Note that col2 is
pasted to col1.   Here's what I get:


 > d <- data.frame(col1=sample(c("a","b"),20,replace=TRUE),
         col2=sample(c("c","d"),20,replace=TRUE),
         splitcol=sample(LETTERS[1:4],20,replace=TRUE))
 > d
    col1 col2 splitcol
1     a    c        A
2     b    d        B
3     a    c        B
4     b    c        C
5     a    d        A
6     a    d        C
7     b    c        D
8     a    d        D
9     b    c        B
10    a    c        D
11    b    c        C
12    b    d        C
13    b    d        B
14    a    d        C
15    a    c        B
16    a    d        A
17    b    d        C
18    b    d        A
19    b    c        A
20    b    d        C
 > split(d$splitcol,paste(d$col1,d$col2))
$"a c"
[1] A B D B
Levels:  A B C D

$"a d"
[1] A C D C A
Levels:  A B C D

$"b c"
[1] C D B C A
Levels:  A B C D

$"b d"
[1] B C B C A C
Levels:  A B C D
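
Since the stated goal is frequencies of the subgroups, the split list can be passed straight to table(). The following is only a sketch: the data frame uses small fixed values invented for illustration (not the random sample above) so the counts are reproducible, and splitcol is built with factor() explicitly because R 4.0 and later no longer convert strings to factors in data.frame() by default.

```r
# Hypothetical fixed data, chosen so the results are reproducible
d <- data.frame(
  col1     = c("a", "a", "b", "b", "a", "b", "a", "b"),
  col2     = c("c", "d", "c", "d", "c", "d", "d", "c"),
  splitcol = factor(c("A", "B", "C", "D", "A", "C", "B", "A"),
                    levels = LETTERS[1:4])
)

# One vector of splitcol values per (col1, col2) combination
groups <- split(d$splitcol, paste(d$col1, d$col2))

# Frequency table of the four levels within each subgroup
freqs <- lapply(groups, table)
freqs[["a c"]]          # A: 2, B: 0, C: 0, D: 0

# The same counts without splitting, as a three-way contingency table
counts <- table(d$col1, d$col2, d$splitcol)
counts["a", "c", "A"]   # 2
```

If only the counts are needed, the table() call alone may be enough; split() is more useful when the subgroup vectors themselves are wanted for further work.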

Tim Wilson wrote:


Tim Wilson

Thu, Aug 15, 2002 6:04 PM

On Thu, Aug 15, 2002 at 02:55:36PM -1000, Pierre Kleiber wrote:

Thanks Pierre! It works now. I must have typed something wrong before.

OK, everyone move along...nothing more to see here. :-)

-Tim


Thomas Lumley

Fri, Aug 16, 2002 7:49 AM

On Thu, 15 Aug 2002, Pierre Kleiber wrote:

or more simply

	split(d$splitcol,list(d$col1,d$col2))

	-thomas
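
For comparison with the paste() version, here is a small sketch of the list form. The data values are made up for illustration. When the grouping argument is a list, split() combines its elements with interaction(), so the group names are joined with "." and the first factor varies fastest; the drop argument controls whether combinations that never occur are kept as empty elements.

```r
# Hypothetical fixed data, chosen so the results are reproducible
d <- data.frame(
  col1     = c("a", "a", "b", "b", "a", "b", "a", "b"),
  col2     = c("c", "d", "c", "d", "c", "d", "d", "c"),
  splitcol = factor(c("A", "B", "C", "D", "A", "C", "B", "A"),
                    levels = LETTERS[1:4])
)

by_list  <- split(d$splitcol, list(d$col1, d$col2))
by_paste <- split(d$splitcol, paste(d$col1, d$col2))

names(by_list)    # "a.c" "b.c" "a.d" "b.d"
names(by_paste)   # "a c" "a d" "b c" "b d"

# Same subgroups either way; only the names and their order differ
as.character(by_list[["b.d"]])    # "D" "C"
as.character(by_paste[["b d"]])   # "D" "C"

# Remove every row with col1 == "b" and col2 == "d", so one combination is missing
d2 <- d[!(d$col1 == "b" & d$col2 == "d"), ]
length(split(d2$splitcol, list(d2$col1, d2$col2)))                # 4, "b.d" is empty
length(split(d2$splitcol, list(d2$col1, d2$col2), drop = TRUE))   # 3
```

The list form avoids building string keys by hand, which also sidesteps accidental collisions when the pasted values themselves contain spaces.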

