Hi R experts,

I have a dataset as sampled below. Values are only regarded as "confirmed" in an individual ("id") if they occur more than once at least 30 days apart.

id date       value
a  2000-01-01 x
a  2000-03-01 x
b  2000-11-11 w
c  2000-11-11 y
c  2000-10-01 y
c  2000-09-10 y
c  2000-12-12 z
c  2000-10-11 z
d  2000-11-11 w
d  2000-11-10 w

I wish to subset the data to retain rows where the value for the individual is confirmed more than 30 days apart. So, after deleting all rows with just one occurrence of id and value, the rest would be the earliest occurrence of each value in each case id, provided 31 or more days exist between the dates. If >1 value is present per id, each value level needs to be assessed independently. This example would then reduce to:

id date       value
a  2000-01-01 x
c  2000-09-10 y
c  2000-10-11 z

I can do this via some crude loops and subsetting, but I am looking for as much efficiency as possible as the dataset has around 50 million rows to assess. Any suggestions welcomed.

Thanks in advance

Scott Williams MD
Melbourne, Australia

This email (including any attachments or links) may contain confidential and/or legally privileged information and is intended only to be read or used by the addressee. If you are not the intended addressee, any use, distribution, disclosure or copying of this email is strictly prohibited. Confidentiality and legal privilege attached to this email (including any attachments) are not waived or lost by reason of its mistaken delivery to you. If you have received this email in error, please delete it and notify us immediately by telephone or email. Peter MacCallum Cancer Centre provides no guarantee that this transmission is free of virus or that it has not been intercepted or altered and will not be liable for any delay in its receipt.
how to subset based on other row values and multiplicity
9 messages · Williams Scott, jim holtman, John McKown +2 more
On Wed, Jul 16, 2014 at 8:07 AM, Williams Scott
<Scott.Williams at petermac.org> wrote:
Question: the c-y id-value pair occurs 3 times. In two cases (2000-11-11 vs. 2000-10-01 & 2000-11-11 vs. 2000-09-10) the difference is >30 days. Why isn't c 2000-10-01 y also part of the result? Is it because you only want a single id-value pair, the one with the minimal date? Or do you want the one in which the date difference is maximal? Or did you overlook that particular match? I can't figure it out from your description.
I can do this via some crude loops and subsetting, but I am looking for as much efficiency as possible as the dataset has around 50 million rows to assess. Any suggestions welcomed.
Hum, is the source of this data in a relational database such as Oracle, PostgreSQL, MySQL, MS-SQL, or SQLite (or "other")? I ask because some of this processing might be easier to do in the database using a "self join", instead of reading the entire relational table into a data.frame and doing it in R.
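[Editorially, that self-join can even be driven from R itself via the sqldf package (which runs SQLite on a data frame). This is only a sketch, not from the thread; it assumes the data frame is named `dat` with the columns shown in the original post and `date` stored as "YYYY-MM-DD" text, which SQLite's julianday() accepts:

```r
# Sketch: for each id/value, keep the earliest date that has a second
# occurrence at least 31 days later (assumes a data frame `dat`).
library(sqldf)
confirmed <- sqldf("
  SELECT a.id, MIN(a.date) AS date, a.value
  FROM   dat a
  JOIN   dat b
    ON   a.id = b.id
   AND   a.value = b.value
   AND   julianday(b.date) - julianday(a.date) >= 31
  GROUP  BY a.id, a.value
")
```

The inner join pairs each row with every later confirming row for the same id/value, and MIN(a.date) per group recovers the earliest confirmed occurrence.]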
Thanks in advance Scott Williams MD Melbourne, Australia
There is nothing more pleasant than traveling and meeting new people! Genghis Khan Maranatha! <>< John McKown
I can reproduce what you requested, but there was the question about what happens with the multiple 'c-y' values. ====================
require(data.table)
x <- read.table(text = 'id date value
a 2000-01-01 x
a 2000-03-01 x
b 2000-11-11 w
c 2000-11-11 y
c 2000-10-01 y
c 2000-09-10 y
c 2000-12-12 z
c 2000-10-11 z
d 2000-11-11 w
d 2000-11-10 w', as.is = TRUE, header = TRUE)
setDT(x)
x[, date := as.Date(date)]
setkey(x, id, value, date)
y <- x[
    , {
        if (.N == 1) val <- NULL  # only one occurrence -- delete
        else {
            dif <- difftime(tail(date, -1), head(date, -1), units = 'days')
            # keep the earliest date if any gap between successive dates is >= 31
            if (any(dif >= 31)) val <- list(date = date[1L])
            else val <- NULL
        }
        val
    }
    , keyby = 'id,value'
]
y
   id value       date
1:  a     x 2000-01-01
2:  c     y 2000-09-10
3:  c     z 2000-10-11

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
On Wed, Jul 16, 2014 at 8:51 AM, jim holtman <jholtman at gmail.com> wrote:
Wow, I picked up a couple of _nice_ techniques from that one post! Looks like "data.table" will let me do SQL-like things in R. I have a warped brain: I think in "result sets" and "matrix operations". Many thanks.
Thanks guys - amazingly prompt solutions from the R community as always.

Yes, the c-y value reverts to just the first date event. The spirit of this is that I am trying to identify and confirm a list of diagnoses that a patient has coded in government administrative data. Once a diagnosis is made and confirmed, I am not interested in whether it is listed again and again later on. I just need the date at which it first became apparent. So in the multiple c-y case, the min date is the correct one. Some cases will have the same diagnosis listed dozens of times, hence the very bloated dataset.

Time to churn through the data is not a big issue, so I will have a go with Jim's neat code he just sent on perhaps a few thousand rows and see how I get on.

S
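[Editorially, under that clarified rule - keep the earliest date for any id/value whose occurrences span at least 31 days - a compact data.table variant is possible. This is a sketch, not from the thread; note that comparing the earliest and latest dates per group (rather than only successive gaps) also confirms groups whose individual gaps are each short but whose overall span qualifies:

```r
library(data.table)
# x: a data.table with columns id, value, and date (class Date),
# as built in Jim's example above
# one row per id/value: earliest date, kept only if the group spans >= 31 days
y <- x[, .(date = min(date), span = as.numeric(max(date) - min(date))),
       by = .(id, value)][span >= 31, .(id, value, date)]
```

On the sample data this yields the same three rows (a/x, c/y, c/z) with their minimum dates.]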
Hi,

If `dat` is the dataset:

library(dplyr)
dat %>%
    group_by(id, value) %>%
    arrange(date = as.Date(date)) %>%
    filter(any(c(abs(diff(as.Date(date))), NA) > 31) & date == min(date))
#Source: local data frame [3 x 3]
#Groups: id, value
#
#  id       date value
#1  a 2000-01-01     x
#2  c 2000-09-10     y
#3  c 2000-10-11     z

A.K.
On Wednesday, July 16, 2014 9:10 AM, Williams Scott <Scott.Williams at petermac.org> wrote:
Using base R you can solve this by doing some sorting and comparing
the first and last dates in each id-value group. Computing the first
and last dates can be vectorized.
f1 <- function(data) {
    # sort by id, break ties with value, break remaining ties with date
    sortedData <- data[with(data, order(id, value, date)), ]
    i <- seq_len(NROW(sortedData) - 1)
    # a 'group' has the same id and value; entries in a group are sorted by date
    isBreakPoint <- with(sortedData, id[i] != id[i+1] | value[i] != value[i+1])
    isFirstInGroup <- c(TRUE, isBreakPoint)
    isLastInGroup <- c(isBreakPoint, TRUE)
    sortedData[isFirstInGroup, ][sortedData[isLastInGroup, "date"] -
        sortedData[isFirstInGroup, "date"] >= 31, ]
}
dat <- read.table(colClasses=c("character", "Date", "character"),
header=TRUE, text=
"id date value
a 2000-01-01 x
a 2000-03-01 x
b 2000-11-11 w
c 2000-11-11 y
c 2000-10-01 y
c 2000-09-10 y
c 2000-12-12 z
c 2000-10-11 z
d 2000-11-11 w
d 2000-11-10 w")
f1(dat)
  id       date value
1  a 2000-01-01     x
6  c 2000-09-10     y
8  c 2000-10-11     z

Bill Dunlap
TIBCO Software
wdunlap tibco.com
filter(any(c(abs(diff(as.Date(date))),NA)>31)& date == min(date))
Note that the 'date == min(date)' will cause superfluous output rows when there are several readings on the initial date for a given id/value pair. E.g.,
dat1 <- data.frame(stringsAsFactors=FALSE, id=rep("A", 4), value=rep("x", 4), date=as.Date("2000-10-1")+c(1,1,50,50))
f2(dat1) # want 1 output row: A, x, 2000-10-2
Source: local data frame [2 x 3]
Groups: id, value
  id value       date
1  A     x 2000-10-02
2  A     x 2000-10-02
where f2 is your code wrapped up in a function (to make testing and use easier)
f2 <- function (data)
{
    library(dplyr)
    data %>% group_by(id, value) %>% arrange(date = as.Date(date)) %>%
        filter(any(c(abs(diff(as.Date(date))), NA) > 31) & date == min(date))
}
Bill Dunlap
TIBCO Software
wdunlap tibco.com
On Wed, Jul 16, 2014 at 7:49 AM, arun <smartpink111 at yahoo.com> wrote:
Hi, If `dat` is the dataset library(dplyr) dat%>% group_by(id,value)%>% arrange(date=as.Date(date))%>% filter(any(c(abs(diff(as.Date(date))),NA)>31)& date == min(date)) #Source: local data frame [3 x 3] #Groups: id, value # # id date value #1 a 2000-01-01 x #2 c 2000-09-10 y #3 c 2000-10-11 z A.K. On Wednesday, July 16, 2014 9:10 AM, Williams Scott <Scott.Williams at petermac.org> wrote: Hi R experts, I have a dataset as sampled below. Values are only regarded as ?confirmed? in an individual (?id?) if they occur more than once at least 30 days apart. id date value a 2000-01-01 x a 2000-03-01 x b 2000-11-11 w c 2000-11-11 y c 2000-10-01 y c 2000-09-10 y c 2000-12-12 z c 2000-10-11 z d 2000-11-11 w d 2000-11-10 w I wish to subset the data to retain rows where the value for the individual is confirmed more than 30 days apart. So, after deleting all rows with just one occurrence of id and value, the rest would be the earliest occurrence of each value in each case id, provided 31 or more days exist between the dates. If >1 value is present per id, each value level needs to be assessed independently. This example would then reduce to: id date value a 2000-01-01 x c 2000-09-10 y c 2000-10-11 z I can do this via some crude loops and subsetting, but I am looking for as much efficiency as possible as the dataset has around 50 million rows to assess. Any suggestions welcomed. Thanks in advance Scott Williams MD Melbourne, Australia This email (including any attachments or links) may contain confidential and/or legally privileged information and is intended only to be read or used by the addressee. If you are not the intended addressee, any use, distribution, disclosure or copying of this email is strictly prohibited. Confidentiality and legal privilege attached to this email (including any attachments) are not waived or lost by reason of its mistaken delivery to you. If you have received this email in error, please delete it and notify us immediately by telephone or email. 
Peter MacCallum Cancer Centre provides no guarantee that this transmission is free of virus or that it has not been intercepted or altered and will not be liable for any delay in its receipt.
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Hi Bill,
Modifying `f2` seems to solve the problem.
f2 <- function (data)
{
    library(dplyr)
    data %>%
        group_by(id, value) %>%
        mutate(date = as.Date(date)) %>%
        arrange(date) %>%
        filter(any(c(abs(diff(date)), NA) > 31) & date == min(date)) %>%
        filter(row_number() == 1)
}
f2(dat)
Source: local data frame [3 x 3]
Groups: id, value

  id       date value
1  a 2000-01-01     x
2  c 2000-09-10     y
3  c 2000-10-11     z

f2(dat1)
Source: local data frame [1 x 3]
Groups: id, value

  id value       date
1  A     x 2000-10-02
A.K.
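[Editorially, an equivalent way to collapse each qualifying group to its earliest row is summarise(), which returns exactly one row per group and so avoids the duplicate-date problem by construction. A sketch, not from the thread; `f3` is a hypothetical name, and the span test compares earliest to latest date per group, as in Bill's base-R f1:

```r
f3 <- function(data) {
    library(dplyr)
    data %>%
        mutate(date = as.Date(date)) %>%
        group_by(id, value) %>%
        # keep groups whose first-to-last span is at least 31 days...
        filter(as.numeric(max(date) - min(date)) >= 31) %>%
        # ...and reduce each group to its earliest date, one row per group
        summarise(date = min(date))
}
```

On dat this gives the three expected rows, and on dat1 a single A/x/2000-10-02 row.]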