
"CV" for log normal data

6 messages · Bert Gunter, Peter Langfelder, Peter Dalgaard +1 more

#
On Tue, Feb 21, 2012 at 1:44 PM, array chip <arrayprofile at yahoo.com> wrote:
You may want to ask this question on the Bioconductor list, since it
isn't really an R question.

Do you also have some sort of an expression p-value? If you only have
expression itself, you could simply look at variance and hope that
non-expressed genes have expression values determined chiefly by noise
which varies quite a bit, so they would have a higher variance than
genes with stable expression higher than the typical noise.
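A minimal sketch of the variance heuristic described above, using simulated data only. The matrix `expr`, its dimensions, and the noise levels are hypothetical assumptions for illustration, not values from the poster's arrays; genes are assumed to be in rows and arrays in columns, with values already on the log scale.

```r
set.seed(1)

# Hypothetical log-scale expression matrix: 6 genes (rows) x 8 arrays (columns).
# Genes 3-6 are simulated as stably expressed (small spread around 8).
expr <- matrix(rnorm(48, mean = 8, sd = 0.3), nrow = 6,
               dimnames = list(paste0("gene", 1:6), NULL))

# Genes 1-2 are simulated as noise-dominated (large spread around 0).
expr[1:2, ] <- matrix(rnorm(16, mean = 0, sd = 2), nrow = 2)

# Per-gene variance across arrays; high values flag candidate non-expressed genes.
gene_var <- apply(expr, 1, var)
sort(gene_var, decreasing = TRUE)
```

Where to put the cutoff between "noisy" and "stable" is a judgment call that this sketch does not address.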

HTH,

Peter
#
Inline below.

On Tue, Feb 21, 2012 at 2:07 PM, Peter Langfelder
<peter.langfelder at gmail.com> wrote:
Good advice. But perhaps ?mad or some other robust, plain old
measure of spread?
-- Bert
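For illustration, a small sketch comparing `sd()` with `mad()` from the stats package on made-up numbers. The vector `x` is hypothetical: seven values close to 8 plus one outlying value, chosen only to show how the two measures react.

```r
# Hypothetical log-expression values for one gene; the last value is an outlier.
x <- c(7.9, 8.1, 8.0, 8.2, 7.8, 8.0, 8.1, 12.0)

# Classical spread: strongly inflated by the single outlier.
sd_x <- sd(x)

# Median absolute deviation (scaled by the default constant 1.4826 so that it
# estimates the SD for normal data): barely affected by the outlier.
mad_x <- mad(x)

c(sd = sd_x, mad = mad_x)
```

On a genes-by-arrays matrix the same measure could be applied row-wise with `apply(m, 1, mad)`, where `m` is whatever matrix holds the data.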

#
The problem is not (lack of) robustness to outliers, the problem is to
find genes whose expression variation is small compared to (mean)
expression. Trouble is, Agilent throws the mean expression information
away, so you have to find heuristic workarounds. I have encountered
the same issue before and haven't really found a good solution.

Peter
#
On Feb 21, 2012, at 22:44, array chip wrote:

What's wrong with the SD of log(X)? That's pretty much equivalent to the CV, at least for CVs below 50%:
[1] 0.5252718
[1] 0.5037995
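The commands that produced the two values above are not preserved in this archive. The following is a separate sketch of the same kind of comparison, not a reconstruction of the original code. It assumes log-normal data, for which the CV is exactly sqrt(exp(sigma^2) - 1), where sigma is the SD of log(X); for small sigma this is close to sigma itself. The sample size, `meanlog`, and `sigma` below are arbitrary choices.

```r
set.seed(42)

sigma <- 0.5                                   # assumed SD on the log scale
x <- rlnorm(1e5, meanlog = 2, sdlog = sigma)   # simulated log-normal data

cv_sample    <- sd(x) / mean(x)                # CV on the original scale
sdlog_sample <- sd(log(x))                     # SD on the log scale
cv_theory    <- sqrt(exp(sigma^2) - 1)         # exact CV of a log-normal

c(cv_sample = cv_sample, sdlog_sample = sdlog_sample, cv_theory = cv_theory)
```

The two sample quantities should land near `cv_theory` and `sigma` respectively, and the gap between CV and SD of log(X) widens as `sigma` grows.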

Looking for a relative measure of precision _after_ taking logs strikes me as very odd. If you scale your original observations by a constant factor, the log of that factor will be _added_ to the log-transformed data, without affecting their variation at all.
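A quick check of the scaling argument, on made-up numbers. The vector `x`, the factor `k`, and the helper `cv_log` are all hypothetical names introduced for this sketch.

```r
x <- c(12, 15, 9, 20, 14)   # hypothetical positive measurements
k <- 1000                   # arbitrary constant scale factor

# log(k * x) = log(k) + log(x), so the SD on the log scale is unchanged.
sd_unscaled <- sd(log(x))
sd_scaled   <- sd(log(k * x))
isTRUE(all.equal(sd_unscaled, sd_scaled))

# A "CV" computed on the log scale is not scale-invariant, because the mean
# of the logs shifts by log(k) while the SD stays put.
cv_log <- function(v) sd(log(v)) / mean(log(v))
c(cv_log_unscaled = cv_log(x), cv_log_scaled = cv_log(k * x))
```

So the SD of the logs is already a relative measure, whereas dividing it by the mean of the logs makes the result depend on the measurement units.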