ccf and covariance

4 messages · Bob Farmer, Brian Ripley

#
Hi.
It's my understanding that a cross-correlation function of vectors x
and y at lag zero is equivalent to their correlation (or covariance,
depending on how the ccf is defined).
If this is true, could somebody please explain why I get an
inconsistent result between cov() and ccf(type = "covariance"), but a
consistent result between cor() and ccf(type = "correlation")?
Or have I misunderstood what is a cross-correlation?
(unfortunately, I can't seem to get a look at the ccf code, since I
think it's buried in some C function outside of the main environment)

Thanks very much.
--Bob Farmer
PhD candidate, Dalhousie University
Halifax, NS, Canada

Example:
d1 <- data.frame(matrix(ldeaths, nrow = 6, byrow = TRUE))
seventy_4 <- as.numeric(d1[1, ])
seventy_5 <- as.numeric(d1[2, ])

ccf(x = seventy_4, y = seventy_5,
    plot = FALSE, lag.max = 0, type = "covariance")
cov(seventy_4, seventy_5)  # inconsistent

ccf(x = seventy_4, y = seventy_5,
    plot = FALSE, lag.max = 0, type = "correlation")
cor(seventy_4, seventy_5)  # consistent
#
On Wed, 23 Apr 2008, Bob Farmer wrote:

            
The ratio of your values is
[1] 12/11

?  Do you recognize it?

There is an explanation in MASS4, p. 390, for example.
It is in the R sources, not 'buried' at all - that is what 'Open Source' 
means. You can browse them at https://svn.r-project.org/R/trunk, or 
download them for study.
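
For reference, the 12/11 ratio can be worked out from the two divisors. This is a sketch, assuming no missing values and the default mean-centring in ccf(); the symbols c_xy(0), s_xy and n (the series length, 12 here) are notation introduced only for this note.

```latex
\begin{align*}
c_{xy}(0) &= \frac{1}{n}\sum_{t=1}^{n}(x_t-\bar{x})(y_t-\bar{y})
  && \text{lag-0 value from ccf(type = "covariance")} \\
s_{xy} &= \frac{1}{n-1}\sum_{t=1}^{n}(x_t-\bar{x})(y_t-\bar{y})
  && \text{value from cov()} \\
\frac{s_{xy}}{c_{xy}(0)} &= \frac{n}{n-1} = \frac{12}{11}
  && \text{for } n = 12 \\
r_{xy}(0) &= \frac{c_{xy}(0)}{\sqrt{c_{xx}(0)\,c_{yy}(0)}}
  = \frac{s_{xy}}{\sqrt{s_{xx}\,s_{yy}}}
  && \text{the divisor cancels, so cor() agrees}
\end{align*}
```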

  
    
#
Thanks to Prof. Ripley and Phil Spector for pointing out that the
autocorrelation functions must use a "nontraditional" definition of
the covariance, involving a denominator of n (instead of n-1) in order
to satisfy an assumption of second-order stationarity in the
(unbiased) covariance estimators of a time series.
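
The difference in denominators can be checked numerically on the data from the original example. The following is a minimal sketch, assuming the series have no missing values; the names x, y, n and c0 are introduced here for illustration only.

```r
# Rebuild the two yearly series from the example (12 monthly values each)
d1 <- matrix(ldeaths, nrow = 6, byrow = TRUE)
x <- as.numeric(d1[1, ])
y <- as.numeric(d1[2, ])
n <- length(x)

# Lag-0 cross-covariance as computed by ccf(), which divides by n
c0 <- ccf(x, y, plot = FALSE, lag.max = 0, type = "covariance")$acf[1]

# cov() divides by n - 1, so rescaling it by (n - 1) / n should match ccf()
stopifnot(isTRUE(all.equal(c0, cov(x, y) * (n - 1) / n)))

# Equivalently, the ratio of the two estimates should be n / (n - 1)
stopifnot(isTRUE(all.equal(cov(x, y) / c0, n / (n - 1))))
```

Both checks pass silently if the two functions differ only by the divisor. The correlation versions agree because the same factor appears in the numerator and the denominator.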

In terms of getting at the source code for the (apparently compiled)
"R_acf", however, I've had no luck.  While
https://svn.r-project.org/R/trunk
seems to be able to show me the source code for otherwise obscured (in
the R console) functions like print()
(e.g. https://svn.r-project.org/R/trunk/src/library/base/R/print.R ),
I can't seem to find the C code ("R_acf"?) called in this section:
....
array(.C(R_acf, as.double(x), as.integer(sampleT),
        as.integer(nser), as.integer(lag.max), as.integer(type ==
            "correlation"), acf = double((lag.max + 1) * nser *
            nser), NAOK = TRUE)
....

of acf().
For instance, in https://svn.r-project.org/R/trunk/src/library/stats/src/
there is (seemingly) no "R_acf.C" or "stats.C" file that I would expect to see.
I apologize in advance if this question is elementary or naive -- this
is my first time dealing with the source code.

Thanks again.
--Bob Farmer

On Wed, Apr 23, 2008 at 3:31 PM, Prof Brian Ripley
<ripley at stats.ox.ac.uk> wrote:
#
On Wed, 23 Apr 2008, Bob Farmer wrote:

            
Actually, they are biased.  The issue is that the estimates must form a
valid (non-negative definite) covariance sequence.
(It's a longer explanation than I wanted or want to write out, hence my 
reference to a readily accessible source.)
It is easier if you download and search the sources.  In the same way that 
ccf() is not in ccf.R, 'R_acf' is entry point 'acf' in 
src/library/stats/src/filter.c.