Dear list members,

In many experimental sciences, there is a lower detection limit (LDL) when a product is assayed, and some samples are then evaluated to be below this limit. I am looking for the best way to indicate in a data.frame that some values are below the LDL. Ideally, an equivalent of NA would be best. Until now I have managed by storing the whole column as character.

So the question is: is it possible to define a value that could be named LDL and that could be placed in vectors or data.frames, such as

v <- c(0.2, 0.28, LDL, 0.9)

in the same way that NA can be used, with of course a function is.ldl(v) that would return F F T F?

Thanks a lot for any direction to solve this.

Marc
Is it possible to define another kind of NA
10 messages · Chel Hee Lee, Dustin Tran, Bert Gunter +5 more
> LDL <- NA_real_
> is.LDL <- is.na
> v <- c(0.2, 0.28, LDL, 0.9)
> v
[1] 0.20 0.28   NA 0.90
> is.LDL(v)
[1] FALSE FALSE  TRUE FALSE

Hope this helps.
Chel Hee Lee
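One consequence of the aliasing approach is worth checking before relying on it: because LDL is only another name for NA_real_, a below-limit value and a genuinely missing measurement become indistinguishable once stored. The short sketch below illustrates this; the vector contents are made up for the illustration.

```r
# LDL is just another name for NA_real_, and is.LDL another name for is.na
LDL <- NA_real_
is.LDL <- is.na

# Element 2 is a genuinely missing measurement, element 3 is below the limit
v <- c(0.2, NA, LDL, 0.9)

is.LDL(v)               # flags positions 2 and 3 alike
identical(v[2], v[3])   # TRUE: the two cases cannot be told apart afterwards
```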
On 11/9/2014 12:07 PM, Marc Girondot wrote: [...]
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Hi Marc, This may be a helpful link: http://stackoverflow.com/questions/16074384/specify-different-types-of-missing-values-nas
On Nov 9, 2014, at 1:07 PM, Marc Girondot <marc_grt at yahoo.fr> wrote: [...]
Ouch! The values are **NOT** missing -- they are (left) censored, and need to be handled by appropriate censored data methods. I suggest you (all!) either read up on this or consult someone locally who has knowledge of such methods. -- Bert Bert Gunter Genentech Nonclinical Biostatistics (650) 467-7374 "Data is not information. Information is not knowledge. And knowledge is certainly not wisdom." Clifford Stoll
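To make the point about censoring concrete, here is a small simulation sketch. Everything in it is an assumption chosen for illustration (the lognormal distribution, the limit of 0.5, the sample size, the seed); it only shows that treating below-limit values as missing, or replacing them by the limit, shifts the sample mean upwards relative to the uncensored data.

```r
set.seed(1)                                   # arbitrary seed, for reproducibility only
ldl  <- 0.5                                   # hypothetical detection limit
true <- rlnorm(200, meanlog = 0, sdlog = 1)   # simulated "true" concentrations

below <- true < ldl                           # TRUE where the assay would report "< LDL"

mean_true    <- mean(true)                        # not available in practice
mean_dropped <- mean(true[!below])                # below-limit values treated as NA
mean_subst   <- mean(ifelse(below, ldl, true))    # below-limit values set to the limit

c(true = mean_true, dropped = mean_dropped, at_limit = mean_subst)
```

Both shortcuts overestimate the mean here, since every below-limit value is either discarded or replaced by something larger; censored-data methods avoid this by using the information that the value lies somewhere below the limit.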
On Sun, Nov 9, 2014 at 10:40 AM, Dustin Tran <dustinviettran at gmail.com> wrote: [...]
3 days later
Along the lines of what Bert Gunter said, the ideal way to represent <LDL results depends on the functions used later to analyze them. I deal with such data on a daily basis and have never found it necessary to incorporate that information in the same variable as the results. What would you do if data were censored at both ends, both low and high? Anyway, the functions I use mostly incorporate that information in a second variable, a "detection indicator" variable, and that's what I do. -Don
Don MacQueen Lawrence Livermore National Laboratory 7000 East Ave., L-627 Livermore, CA 94550 925-423-1062
On 11/9/14, 10:07 AM, "Marc Girondot" <marc_grt at yahoo.fr> wrote: [...]
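A minimal sketch of the "detection indicator" layout described above might look as follows. The column names, the data, and the convention of storing the detection limit itself in the value column for non-detects are assumptions made for this illustration, not details given in the thread; is.ldl() is the hypothetical helper name from the original question.

```r
# Hypothetical data: 'value' holds the measurement, or the detection
# limit itself when the result was reported as "< LDL"
dat <- data.frame(
  value    = c(0.20, 0.28, 0.10, 0.90, NA),
  detected = c(TRUE, TRUE, FALSE, TRUE, NA)   # NA marks a truly missing sample
)

# is.ldl() in the sense of the original question: TRUE only for
# non-detects, FALSE for detected values and for true NAs
is.ldl <- function(d) !is.na(d$detected) & !d$detected

is.ldl(dat)
# [1] FALSE FALSE  TRUE FALSE FALSE
```

With this layout the value column stays numeric, true NAs remain distinguishable from non-detects, and a second indicator (or a second pair of columns) could cover censoring at the upper end as well.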
On 13/11/2014 01:26, MacQueen, Don wrote: [...]
I agree that LDL is a special case of what could be named ODL (out of detection limit). To answer Bert Gunter: indeed, if LDL (or ODL) values are changed into NA, the results will be biased. That's why I would like to introduce another category; I don't plan to just transform them into NA.

But thinking again about this problem, an LDL must always be associated with one value (or two in the case of ODL) that indicates the detection limit. In a dataset, the values do not necessarily all have the same limit, depending on the experimental conditions. The best solution that I have found is to use attributes to indicate the limits. A NA attribute for a NA value will be treated as a "true" NA. For example:

> values <- c(NA, 29, 30, NA, 3)
> attributes(values) <- list(ODL=c(NA, "[10, 40]", "[0, 40]", "[0, 40]", "[0, 40]"))
> values
[1] NA 29 30 NA  3
attr(,"ODL")
[1] NA         "[10, 40]" "[0, 40]"  "[0, 40]"  "[0, 40]"
> values[3]
[1] 30
> attributes(values)$ODL[3]
[1] "[0, 40]"
> values[1]
[1] NA
> attributes(values)$ODL[1]
[1] NA

The attributes are retained in a data.frame, so it seems to be a good solution.

> essai <- data.frame(c1=values)
> essai
  c1
1 NA
2 29
3 30
4 NA
5  3
> essai$c1
[1] NA 29 30 NA  3
attr(,"ODL")
[1] NA         "[10, 40]" "[0, 40]"  "[0, 40]"  "[0, 40]"

Thanks to the list members,
Marc
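One caveat with the attribute approach can be checked directly: subsetting a plain vector with `[` keeps the numbers but drops custom attributes, which is also why values[3] above prints without the ODL line. The sketch below shows this, followed by one possible alternative that keeps the limits as ordinary numeric columns; the column names lower and upper are hypothetical.

```r
values <- c(NA, 29, 30, NA, 3)
attr(values, "ODL") <- c(NA, "[10, 40]", "[0, 40]", "[0, 40]", "[0, 40]")

sub <- values[2:3]
attributes(sub)      # NULL: the ODL attribute did not survive the subsetting

# Alternative sketch: store the limits as columns so they travel with the rows
essai <- data.frame(
  c1    = c(NA, 29, 30, NA, 3),
  lower = c(NA, 10, 0, 0, 0),
  upper = c(NA, 40, 40, 40, 40)
)
essai[2:3, ]         # each value keeps its own limits after subsetting
```

Numeric limit columns also avoid having to parse strings such as "[10, 40]" before the limits can be used in a computation.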
On 13/11/2014 11:08, Marc Girondot wrote: [...]
I strongly recommend you re-read and take action on Bert Gunter's comment, quoted here for truth!

-----------------------------
Ouch! The values are **NOT** missing -- they are (left) censored, and need to be handled by appropriate censored data methods. I suggest you (all!) either read up on this or consult someone locally who has knowledge of such methods.
-- Bert
------------------------

You are re-inventing the wheel and yours will probably end up square!

R already has facilities for handling censored data, e.g. Surv in the survival package (which despite its name is applicable to applications other than survival analysis).
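As a rough sketch of what using Surv for detection-limit data can look like, the example below builds a left-censored response and fits an intercept-only model with survreg. The data, the variable names, the convention of storing the detection limit for non-detects, and the choice of a lognormal distribution are all assumptions made for illustration; the appropriate model depends on the actual data.

```r
library(survival)   # a recommended package, shipped with standard R installations

# Hypothetical data: 'conc' is the measured value, or the detection limit
# when the result was below it; 'detected' is FALSE for those rows
conc     <- c(0.20, 0.28, 0.10, 0.90, 0.45, 0.10, 0.33, 0.61)
detected <- c(TRUE, TRUE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE)

# type = "left": event TRUE means observed exactly, FALSE means left-censored
y <- Surv(conc, detected, type = "left")

# Intercept-only lognormal model fitted by maximum likelihood
fit <- survreg(y ~ 1, dist = "lognormal")
coef(fit)    # estimated mean of log(conc)
fit$scale    # estimated standard deviation of log(conc)
```

The non-detects contribute to the likelihood through the probability of lying below their limit, rather than being dropped or replaced by a fixed value.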
As you may know already, that design can be considerably improved by making the wheel triangular, reducing the bump count per revolution by a full 25% -pd
On 13 Nov 2014, at 16:17 , Keith Jewell <Keith.Jewell at campdenbri.co.uk> wrote:
You are re-inventing the wheel and yours will probably end up square!
Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
On Thu, 13 Nov 2014, Keith Jewell wrote:
You are re-inventing the wheel and yours will probably end up square! R already has facilities for handling censored data, e.g. Surv in the survival package (which despite its name is applicable to applications other than survival analysis).
There is also the NADA package that provides several approaches for handling left-censored data. Rich
On Thu, 13 Nov 2014, peter dalgaard wrote:
As you may know already, that design can be considerably improved by making the wheel triangular, reducing the bump count per revolution by a full 25%
It has been written that those who go around in circles think of themselves as big wheels. Rich