Hi.

It's my understanding that a cross-correlation function of vectors x and y at lag zero is equivalent to their correlation (or covariance, depending on how the ccf is defined).

If this is true, could somebody please explain why I get an inconsistent result between cov() and ccf(type = "covariance"), but a consistent result between cor() and ccf(type = "correlation")? Or have I misunderstood what a cross-correlation is?

(Unfortunately, I can't seem to get a look at the ccf code, since I think it's buried in some C function outside of the main environment.)

Thanks very much.

--Bob Farmer
PhD candidate, Dalhousie University
Halifax, NS, Canada

Example:

d1 <- data.frame(matrix(ldeaths, nrow = 6, byrow = TRUE))
seventy_4 <- as.numeric(d1[1, ])
seventy_5 <- as.numeric(d1[2, ])

ccf(x = seventy_4, y = seventy_5, plot = FALSE, lag.max = 0, type = "covariance")
cov(seventy_4, seventy_5)
# inconsistent

ccf(x = seventy_4, y = seventy_5, plot = FALSE, lag.max = 0, type = "correlation")
cor(seventy_4, seventy_5)
# consistent
ccf and covariance
4 messages · Bob Farmer, Brian Ripley
On Wed, 23 Apr 2008, Bob Farmer wrote:
Hi. It's my understanding that a cross-correlation function of vectors x and y at lag zero is equivalent to their correlation (or covariance, depending on how the ccf is defined).
The ratio of your values is

MASS::fractions(282568.5/259021)
[1] 12/11

Do you recognize it? There is an explanation in MASS4, p. 390, for example.
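The ratio 12/11 is n/(n - 1) for series of length n = 12: cov() divides the sum of cross-products by n - 1, while the lag-zero value from ccf(type = "covariance") behaves as if it were divided by n. A minimal sketch of that check, reusing the poster's data (the names ccf0, cov_xy, ccf_cor0 and cor_xy are introduced here for illustration only):

```r
# Rows of the matrix are the six calendar years of the built-in ldeaths series
d1 <- matrix(ldeaths, nrow = 6, byrow = TRUE)
seventy_4 <- as.numeric(d1[1, ])
seventy_5 <- as.numeric(d1[2, ])
n <- length(seventy_4)  # 12 monthly values per year

# Lag-zero cross-covariance from ccf() versus the usual sample covariance
ccf0 <- ccf(seventy_4, seventy_5, lag.max = 0,
            type = "covariance", plot = FALSE)$acf[1]
cov_xy <- cov(seventy_4, seventy_5)

# The two differ only by the divisor: n for ccf(), n - 1 for cov()
print(cov_xy / ccf0)
print(all.equal(ccf0, cov_xy * (n - 1) / n))

# For correlations the divisor cancels between numerator and denominator,
# which is why cor() and ccf(type = "correlation") agree
ccf_cor0 <- ccf(seventy_4, seventy_5, lag.max = 0,
                type = "correlation", plot = FALSE)$acf[1]
cor_xy <- cor(seventy_4, seventy_5)
print(all.equal(ccf_cor0, cor_xy))
```

Rescaling by (n - 1)/n is therefore enough to reconcile the two covariance figures at lag zero.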
If this is true, could somebody please explain why I get an inconsistent result between cov() and ccf(type = "covariance"), but a consistent result between cor() and ccf(type = "correlation")? Or have I misunderstood what is a cross-correlation? (unfortunately, I can't seem to get a look at the ccf code, since I think it's buried in some C function outside of the main environment)
It is in the R sources, not 'buried' at all - that is what 'Open Source' means. You can browse them at https://svn.r-project.org/R/trunk, or download them for study.
--
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
Thanks to Prof. Ripley and Phil Spector for pointing out that the autocorrelation functions must use a "nontraditional" definition of the covariance, involving a denominator of n (instead of n-1) in order to satisfy an assumption of second-order stationarity in the (unbiased) covariance estimators of a time series.

In terms of getting at the source code for the (apparently compiled) "R_acf", however, I've had no luck. While https://svn.r-project.org/R/trunk seems to be able to show me the source code for functions that are otherwise obscured in the R console, like print() (e.g. https://svn.r-project.org/R/trunk/src/library/base/R/print.R ), I can't seem to find the C code ("R_acf"?) called in this section of acf():

....
array(.C(R_acf, as.double(x), as.integer(sampleT), as.integer(nser),
         as.integer(lag.max), as.integer(type == "correlation"),
         acf = double((lag.max + 1) * nser * nser), NAOK = TRUE)
....

For instance, in https://svn.r-project.org/R/trunk/src/library/stats/src/ there is (seemingly) no "R_acf.C" or "stats.C" file that I would expect to see.

I apologize in advance if this question is elementary or naive -- this is my first time dealing with the source code.

Thanks again.

--Bob Farmer
On Wed, 23 Apr 2008, Bob Farmer wrote:
Thanks to Prof. Ripley and Phil Spector for pointing out that the autocorrelation functions must use a "nontraditional" definition of the covariance, involving a denominator of n (instead of n-1) in order to satisfy an assumption of second-order stationarity in the (unbiased) covariance estimators of a time series.
Actually, they are biased. Being a covariance sequence is the issue. (It's a longer explanation than I wanted or want to write out, hence my reference to a readily accessible source.)
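One common reading of "being a covariance sequence" is that the estimated autocovariances, taken together over all lags, should form a non-negative definite sequence. Dividing every lag by n guarantees this, whereas dividing lag k by n - k does not. The sketch below is an illustration under that assumption, not a summary of the MASS4 explanation; the names x, s, gamma_n and gamma_nk are hypothetical and the data are simply the first twelve values of ldeaths.

```r
# Illustrative data: the first 12 monthly values of ldeaths
x <- as.numeric(ldeaths)[1:12]
n <- length(x)
xc <- x - mean(x)
lags <- 0:(n - 1)

# Sum of lagged cross-products for each lag k
s <- sapply(lags, function(k) sum(xc[1:(n - k)] * xc[(1 + k):n]))

gamma_n  <- s / n           # divisor n at every lag
gamma_nk <- s / (n - lags)  # divisor n - k at lag k

# acf(type = "covariance") reproduces the divisor-n version
acf_out <- acf(x, lag.max = n - 1, type = "covariance", plot = FALSE)
print(all.equal(gamma_n, as.numeric(acf_out$acf)))

# Smallest eigenvalue of the Toeplitz matrix built from each sequence;
# a valid covariance sequence needs this to be non-negative
min_eig_n  <- min(eigen(toeplitz(gamma_n),  symmetric = TRUE,
                        only.values = TRUE)$values)
min_eig_nk <- min(eigen(toeplitz(gamma_nk), symmetric = TRUE,
                        only.values = TRUE)$values)
print(c(divisor_n = min_eig_n, divisor_n_minus_k = min_eig_nk))
```

Comparing the two smallest eigenvalues shows whether each choice of divisor yields a usable covariance sequence for this particular series.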
In terms of getting at the source code for the (apparently compiled) "R_acf", however, I've had no luck. While https://svn.r-project.org/R/trunk seems to be able to show me the source code for otherwise obscured (in the R console) functions like print() (e.g. https://svn.r-project.org/R/trunk/src/library/base/R/print.R ), I can't seem to find the C code ("R_acf"?) called in this section:

....
array(.C(R_acf, as.double(x), as.integer(sampleT), as.integer(nser),
         as.integer(lag.max), as.integer(type == "correlation"),
         acf = double((lag.max + 1) * nser * nser), NAOK = TRUE)
....

of acf(). For instance, in https://svn.r-project.org/R/trunk/src/library/stats/src/ there is (seemingly) no "R_acf.C" or "stats.C" file that I would expect to see.

I apologize in advance if this question is elementary or naive -- this is my first time dealing with the source code.
It is easier if you download and search the sources. In the same way that ccf() is not in ccf.R, 'R_acf' is entry point 'acf' in src/library/stats/src/filter.c.
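Before searching the downloaded sources, the R-level code can be inspected from the console to find out which compiled entry point to look for. The following is a rough sketch: the variable names are illustrative, and the exact form of the compiled call is an assumption that varies by R version (the thread shows .C(R_acf, ...), while other versions may use .Call with a differently named symbol).

```r
# ccf() is an ordinary R function; its deparsed source shows it delegates to acf()
src_ccf <- deparse(stats::ccf)
calls_acf <- grep("acf(", src_ccf, fixed = TRUE, value = TRUE)
print(calls_acf)

# acf() in turn hands the numeric work to compiled code via .C() or .Call();
# the first argument of that call is the entry point to search for in the C sources
src_acf <- deparse(stats::acf)
compiled_calls <- grep("\\.(C|Call)\\(", src_acf, value = TRUE)
print(compiled_calls)
```

The symbol printed by the second search is the name to look for under src/library/stats/src/ in a source checkout.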