
Is it possible to define another kind of NA

10 messages · Chel Hee Lee, Dustin Tran, Bert Gunter +5 more

#
Dear list members,

In many experimental sciences, an assay has a lower detection limit
(LDL), and some samples are found to be below this limit.
I am looking for the best way to indicate in a data.frame that some
values are below the LDL. Ideally, an equivalent of NA would be best.
Until now I have managed by storing the whole column as character.
So the question is: is it possible to define a value named LDL that
could be used in vectors or data.frames in the same way that NA can be
used, for example:
v <- c(0.2, 0.28, LDL, 0.9)
together with a function is.ldl(v) that would return F F T F?

Thanks a lot for any pointers on how to solve this.

Marc
#
 > LDL <- NA   # LDL must be defined first, otherwise c() fails with "object 'LDL' not found"
 > is.LDL <- is.na
 > v <- c(0.2, 0.28, LDL, 0.9)
 > v
[1] 0.20 0.28   NA 0.90
 > is.LDL(v)
[1] FALSE FALSE  TRUE FALSE

Hope this helps.

Chel Hee Lee
On 11/9/2014 12:07 PM, Marc Girondot wrote:
#
Hi Marc,

This may be a helpful link: http://stackoverflow.com/questions/16074384/specify-different-types-of-missing-values-nas
#
Ouch!

The values are **NOT** missing -- they are (left) censored, and need
to be handled by appropriate censored data methods. I suggest you
(all!) either read up on this or consult someone locally who has
knowledge of such methods.

-- Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom."
Clifford Stoll
On Sun, Nov 9, 2014 at 10:40 AM, Dustin Tran <dustinviettran at gmail.com> wrote:
3 days later
#
Along the lines of what Bert Gunter said, the ideal way to represent <LDL
results depends on the functions used later to analyze them. I deal with
such data on a daily basis and have never found it necessary to
incorporate that information in the same variable as the results. What
would you do if data were censored at both ends, both low and high?

Anyway, the functions I use mostly incorporate that information in a
second variable, a "detection indicator" variable, and that's what I do.

-Don
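
The second-variable layout Don describes could look something like the sketch below. It is only an illustration: the data values, the column names conc and detected, and the detection limit of 0.10 are all hypothetical and do not come from the thread.

```r
# Hypothetical data: 'conc' holds the measured value, or the detection
# limit itself (assumed here to be 0.10) when the sample was below it.
# 'detected' is the detection indicator.
d <- data.frame(
  conc     = c(0.20, 0.28, 0.10, 0.90),
  detected = c(TRUE, TRUE, FALSE, TRUE)
)

# A stand-in for the is.ldl() asked for in the original question,
# working from the indicator column rather than a special value.
is.ldl <- function(df) !df$detected

print(is.ldl(d))

# Quantified results only
print(d$conc[d$detected])
```

Keeping the limit in the value column and the status in a separate column also extends to data censored at both ends, since the indicator can then take more than two levels.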
#
On 13/11/2014 01:26, MacQueen, Don wrote:
I agree that LDL is a special case of what could be called ODL (out of
detection limit).
To answer Bert Gunter: indeed, if LDL (or ODL) values are changed into
NA, the results will be biased. That is why I would like to introduce
another category. I do not plan to simply convert them to NA.

But thinking again about this problem, an LDL must always be associated
with one value (or two in the case of ODL) that indicates the detection
limit. Within a dataset, not all values necessarily have the same limit;
it depends on the experimental conditions.
The best solution I have found is to use attributes to indicate the
limits. An NA attribute for an NA value will be treated as a "true" NA.
For example:

 > values <- c(NA, 29, 30, NA, 3)
 > attributes(values) <- list(ODL=c(NA, "[10, 40]", "[0, 40]", "[0, 40]", "[0, 40]"))
 > values
[1] NA 29 30 NA  3
attr(,"ODL")
[1] NA         "[10, 40]" "[0, 40]"  "[0, 40]"  "[0, 40]"
 > values[3]
[1] 30
 > attributes(values)$ODL[3]
[1] "[0, 40]"
 > values[1]
[1] NA
 > attributes(values)$ODL[1]
[1] NA

The attributes are retained when the vector is stored in a data.frame,
so this seems to be a good solution.

 > essai <- data.frame(c1=values)
 > essai
   c1
1 NA
2 29
3 30
4 NA
5  3
 > essai$c1
[1] NA 29 30 NA  3
attr(,"ODL")
[1] NA         "[10, 40]" "[0, 40]"  "[0, 40]"  "[0, 40]"

Thanks to the list members,

Marc
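
One thing to check before relying on this approach: base R subsetting with `[` drops attributes other than names, dim and dimnames, which is why values[3] above prints without the ODL attribute. The sketch below reuses the values from the message and only inspects what survives a subset; the object name sub is introduced here for illustration.

```r
values <- c(NA, 29, 30, NA, 3)
attr(values, "ODL") <- c(NA, "[10, 40]", "[0, 40]", "[0, 40]", "[0, 40]")

# Subsetting the bare vector: the ODL attribute is not carried along
sub <- values[2:3]
print(attr(sub, "ODL"))

# Stored whole in a data.frame, the column keeps the attribute ...
essai <- data.frame(c1 = values)
print(attr(essai$c1, "ODL"))

# ... but selecting rows drops it again
print(attr(essai[2:3, "c1"], "ODL"))
```

So the limits stay attached only as long as the column is never subsetted, filtered or reordered row by row.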
#
On 13/11/2014 11:08, Marc Girondot wrote:
I strongly recommend you re-read and take action on Bert Gunter's 
comment, quoted here for truth!
-----------------------------
Ouch!

The values are **NOT** missing -- they are (left) censored, and need
to be handled by appropriate censored data methods. I suggest you
(all!) either read up on this or consult someone locally who has
knowledge of such methods.

-- Bert
------------------------
You are re-inventing the wheel and yours will probably end up square!
R already has facilities for handling censored data, e.g. Surv in the 
survival package (which despite its name is applicable to applications 
other than survival analysis).
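
As a rough sketch of that suggestion, Surv() with type = "left" can encode values below a detection limit, and survreg() can then fit a model that accounts for the censoring. The data below are invented, the object names are hypothetical, and the lognormal distribution is an assumption made purely for illustration.

```r
library(survival)

# Hypothetical measurements: for non-detects, 'conc' holds the detection
# limit (assumed to be 0.10) and 'detected' is FALSE.
conc     <- c(0.20, 0.28, 0.10, 0.90, 0.10, 0.45)
detected <- c(TRUE, TRUE, FALSE, TRUE, FALSE, TRUE)

# Status 1 = value observed exactly, status 0 = left-censored at 'conc'
y <- Surv(conc, as.integer(detected), type = "left")
print(y)

# Intercept-only lognormal fit that uses the censoring information
fit <- survreg(y ~ 1, dist = "lognormal")

# Estimated median on the original scale under the lognormal assumption
print(exp(coef(fit)))
```

Whether a lognormal model is appropriate depends on the data, so the distribution should be checked rather than taken from this sketch.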
#
As you may know already, that design can be considerably improved by making the wheel triangular, reducing the bump count per revolution by a full 25% 

-pd
On 13 Nov 2014, at 16:17, Keith Jewell <Keith.Jewell at campdenbri.co.uk> wrote:
#
On Thu, 13 Nov 2014, Keith Jewell wrote:
There is also the NADA package that provides several approaches for
handling left-censored data.

Rich
#
On Thu, 13 Nov 2014, peter dalgaard wrote:
It has been written that those who go around in circles think of
themselves as big wheels.

Rich