NADA Data Frame Format: Wide or Long?

Hi Rich,

So what you're faced with is that the cenros() function has no built-in
methods for grouping or subsetting -- unlike some other R methods,
especially those that work with the lattice package, or the many modeling
functions like lm() that have a subset argument or employ a conditioning
syntax for models [like  y ~ x | g ]. In effect, this means you have to
roll your own.

The wide format could help, but you would still probably end up writing
loops. Each parameter would then presumably be represented by two columns,
one for the result and one for the non-detection indicator. And they would all
have different names, such as ceneq1.ag, ceneq1.al, and so on. I think
you'd probably end up with more complicated scripts. This approach is
especially tricky if not all analytes and locations were sampled on the
same days (which is normally the case for my data).

You're probably aware that there are various functions for splitting a
dataframe into subsets and then applying the same function to every
subset, such as by() and aggregate(), and probably others. These may turn
out to be fairly simple to use with a NADA function such as cenros(), but
you won't really know until you start trying them.

One can also do it oneself with constructs like

# one data frame per site/param combination; drop = TRUE skips empty ones
tmpsub <- split( mydf, list(mydf$site, mydf$param), drop = TRUE )
tmpss <- lapply(tmpsub, myfun)

where myfun is a wrapper function around, say, cenros().
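
To make that a bit more concrete, here is a minimal sketch of such a wrapper
on made-up data. The column names (site, param, result, cen) are
assumptions, and the body of myfun is only a placeholder summary; the
cenros() call is left as a comment so the sketch runs without NADA loaded.

```r
# Hypothetical long-format data: one row per site/parameter/sample.
mydf <- data.frame(
  site   = rep(c("A", "B"), each = 6),
  param  = rep(rep(c("ag", "al"), each = 3), times = 2),
  result = c(0.5, 1.2, 0.9, 10, 12, 5, 0.5, 0.5, 2.1, 8, 9, 11),
  cen    = c(TRUE, FALSE, FALSE, FALSE, FALSE, TRUE,
             TRUE, TRUE, FALSE, FALSE, FALSE, FALSE)
)

# Wrapper applied to one subset at a time.
# With NADA loaded, the body could instead be something like
#   cenros(d$result, d$cen)
# (assuming result holds the values and cen the non-detect flags).
myfun <- function(d) {
  c(n = nrow(d), n.cen = sum(d$cen))
}

# Split into one data frame per site/param combination, then apply.
tmpsub <- split(mydf, list(mydf$site, mydf$param), drop = TRUE)
tmpss  <- lapply(tmpsub, myfun)

names(tmpss)  # list elements are named "site.param"
```

Keeping the results in a named list means you can pull out any one
combination later, e.g. tmpss[["A.ag"]].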

This is obviously just an outline.

-Don