how to count number of occurrences
6 messages · Sl K, Sarah Goslee, William Dunlap +3 more
Hi,
On Wed, Nov 2, 2011 at 12:54 PM, Sl K <s.karmv at gmail.com> wrote:
Dear R users,

I have this data frame,

            y samp
8  0.03060419    X
18 0.06120838    Y
10 0.23588374    X
3  0.32809965    X
1  0.36007100    X
7  0.36730571    X
20 0.47176748    Y
13 0.65619929    Y
11 0.72014201    Y
17 0.73461142    Y
6  0.76221313    X
2  0.77005691    X
4  0.92477243    X
9  0.93837591    X
5  0.98883581    X
16 1.52442626    Y
12 1.54011381    Y
14 1.84954487    Y
19 1.87675183    Y
15 1.97767162    Y

and I am trying to find the number of X's that occur before ith Y occurs. For example, there is 1 X before the first Y, so I get 1. There are 4 X's before the second Y, so I get 4, there is no X between second and third Y, so I get 0 and so on. Any hint to at least help me to start this will be appreciated. Thanks a lot!
Using dput() to provide reproducible data would be nice, but failing that here's a simple example with sample data:
testdata <- c("x", "y", "x", "x", "x", "y", "x", "x", "x", "x", "x", "y", "y")
rle(testdata)
Run Length Encoding
  lengths: int [1:6] 1 1 3 1 5 2
  values : chr [1:6] "x" "y" "x" "y" "x" "y"

You can use the values component of the list returned by rle to subset the lengths component of the list to get only the x values if that's what you need to end up with.
rle(testdata)$lengths[rle(testdata)$values == "x"]
[1] 1 3 5
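
The same subsetting can be written so that rle() is only called once. This is a minimal sketch on the toy vector above; the names runs and x_runs are made up for illustration.

```r
# Same toy vector as in the example above.
testdata <- c("x", "y", "x", "x", "x", "y", "x", "x", "x", "x", "x", "y", "y")

# Compute the run-length encoding once and reuse it.
runs <- rle(testdata)

# Lengths of the runs whose value is "x".
x_runs <- runs$lengths[runs$values == "x"]
print(x_runs)
# [1] 1 3 5
```

Note that consecutive "y" values are merged into a single run, so this approach gives no 0 for a "y" that directly follows another "y".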
Sarah Goslee http://www.functionaldiversity.org
Is the following what you want? It should give the number of "X"s immediately preceding each "Y".
samp <- c("X", "Y", "X", "X", "X", "X", "Y", "Y", "Y", "Y", "X", "X",
"X", "X", "X", "Y", "Y", "Y", "Y", "Y")
diff((seq_along(samp) - cumsum(samp=="Y"))[samp=="Y"])
[1] 4 0 0 0 5 0 0 0 0

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
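
To see why the one-liner works, it may help to split it into steps. The sketch below uses the same samp vector; the intermediate names (is_y, x_so_far, x_at_y, between, per_y) are hypothetical, and the last step, prepending 0 so that the X's before the first Y are also counted, is an assumption that is not part of the original expression.

```r
samp <- c("X", "Y", "X", "X", "X", "X", "Y", "Y", "Y", "Y",
          "X", "X", "X", "X", "X", "Y", "Y", "Y", "Y", "Y")

is_y <- samp == "Y"

# Position minus the number of Y's seen so far = number of X's seen so far.
x_so_far <- seq_along(samp) - cumsum(is_y)

# Keep the running X count only at the positions where a Y occurs.
x_at_y <- x_so_far[is_y]

# Differences between successive Y's: the X's between each pair of Y's.
between <- diff(x_at_y)
print(between)
# [1] 4 0 0 0 5 0 0 0 0

# Assumption (not in the original expression): prepend 0 so the
# X's before the first Y are counted as well.
per_y <- diff(c(0, x_at_y))
print(per_y)
# [1] 1 4 0 0 0 5 0 0 0 0
```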
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Sl K
Sent: Wednesday, November 02, 2011 9:55 AM
To: r-help at r-project.org
Subject: [R] how to count number of occurrences
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
On Nov 2, 2011, at 12:54 PM, Sl K wrote:
Dear R users,
I have this data frame,
y samp
8 0.03060419 X
18 0.06120838 Y
10 0.23588374 X
3 0.32809965 X
1 0.36007100 X
7 0.36730571 X
20 0.47176748 Y
13 0.65619929 Y
11 0.72014201 Y
17 0.73461142 Y
6 0.76221313 X
2 0.77005691 X
4 0.92477243 X
9 0.93837591 X
5 0.98883581 X
16 1.52442626 Y
12 1.54011381 Y
14 1.84954487 Y
19 1.87675183 Y
15 1.97767162 Y
dat$nXs <- cumsum(dat$samp=="X")
dat$nYs <- cumsum(dat$samp=="Y")
dat
            y samp nXs nYs
8  0.03060419    X   1   0
18 0.06120838    Y   1   1
10 0.23588374    X   2   1
3  0.32809965    X   3   1
1  0.36007100    X   4   1
7  0.36730571    X   5   1
20 0.47176748    Y   5   2
13 0.65619929    Y   5   3
11 0.72014201    Y   5   4
17 0.73461142    Y   5   5
6  0.76221313    X   6   5
2  0.77005691    X   7   5
4  0.92477243    X   8   5
9  0.93837591    X   9   5
5  0.98883581    X  10   5
16 1.52442626    Y  10   6
12 1.54011381    Y  10   7
14 1.84954487    Y  10   8
19 1.87675183    Y  10   9
15 1.97767162    Y  10  10
I find that there are 5 X's before the second Y.
> nXbefore_mthY <- function(m) dat[which(dat$nYs==m & dat$samp=="Y"), "nXs"]
> nXbefore_mthY(2)
[1] 5
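
The cumulative counts can also be turned into per-Y counts. The following is only a sketch: dat is rebuilt here with just the samp column in the order shown above (the y values are left out), and y_rows, n_x_before_mth_y and x_between are hypothetical names. Restricting to the rows where samp is "Y" matters because rows with an X that follow the m-th Y carry the same nYs value.

```r
# Assumed reconstruction of dat: only the samp column, in the order shown.
dat <- data.frame(
  samp = c("X", "Y", "X", "X", "X", "X", "Y", "Y", "Y", "Y",
           "X", "X", "X", "X", "X", "Y", "Y", "Y", "Y", "Y"),
  stringsAsFactors = FALSE
)
dat$nXs <- cumsum(dat$samp == "X")
dat$nYs <- cumsum(dat$samp == "Y")

# Rows where a Y occurs; nXs there is the total number of X's before that Y.
y_rows <- dat[dat$samp == "Y", ]

# Hypothetical helper: total number of X's before the m-th Y.
n_x_before_mth_y <- function(m) y_rows$nXs[y_rows$nYs == m]
print(n_x_before_mth_y(2))
# [1] 5

# X's since the previous Y (or since the start, for the first Y).
x_between <- diff(c(0, y_rows$nXs))
print(x_between)
# [1] 1 4 0 0 0 5 0 0 0 0
```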
David Winsemius, MD Heritage Laboratories West Hartford, CT
1 day later
This was very helpful. Thank you very much. Just one question: I notice that it does not count the number of X's before the first Y. I want the result to be 1 4 0 0 0 5 0 0 0 0. I tried combining this output with the first value of the rle output, but realized that rle doesn't give me the 0s. So, if my first observation was Y, then I want it to show that there are 0 Xs before that. Thank you again.
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of uka
Sent: Thursday, November 03, 2011 4:10 PM
To: r-help at r-project.org
Subject: Re: [R] how to count number of occurrences
You should really provide the relevant context from previous posts so that potential helpers don't need to go looking for it. That being said, you could try something like
samp <- c("X", "Y", "X", "X", "X", "X", "Y", "Y", "Y", "Y", "X", "X", "X", "X", "X", "Y", "Y", "Y", "Y", "Y")
diff(which(c('Y', samp)=='Y'))-1
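
As a rough sketch, the expression can be wrapped in a small function and checked against the case where the first observation is a Y. The function name count_x_before_each_y is made up for illustration.

```r
# Hypothetical wrapper around the expression above.
count_x_before_each_y <- function(samp) {
  # The extra "Y" in front acts as a sentinel marking the start of the data.
  y_pos <- which(c("Y", samp) == "Y")
  # Gap between consecutive Y positions, minus one for the Y itself.
  diff(y_pos) - 1
}

samp <- c("X", "Y", "X", "X", "X", "X", "Y", "Y", "Y", "Y",
          "X", "X", "X", "X", "X", "Y", "Y", "Y", "Y", "Y")
print(count_x_before_each_y(samp))
# [1] 1 4 0 0 0 5 0 0 0 0

# First observation is a Y: the first count is 0.
print(count_x_before_each_y(c("Y", "X", "Y")))
# [1] 0 1
```

Any X's that come after the last Y are not counted, since no Y follows them.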
Hope this is helpful,
Dan
Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA 98504-5204