Message-ID: <CABdHhvFJbESgCru6NZ=ERrY3NVTBptKQP8_tUNKcogCg196x8g@mail.gmail.com>
Date: 2016-06-10T13:32:33Z
From: Hadley Wickham
Subject: About identification of CRAN CHECK machines in logs
In-Reply-To: <CANMhtdwu_zVROLAj3nsYV3FgY8oYngVUpPdpUm8DA8j48J_xKw@mail.gmail.com>

On Fri, Jun 10, 2016 at 8:27 AM, Marcelo Perlin <marceloperlin at gmail.com> wrote:
> I don't know, Hadley. But you can see evidence of "something" systematically
> installing the packages in the log data. From my two CRAN packages I noticed
> a high correlation in the number of downloads.
>
> Try the following script, which will pick 5 random packages from CRAN and
> calculate the correlation matrix between their differenced number of
> downloads. To avoid spurious correlations, I removed the weekends (since we
> can expect some seasonality) and also the zero entries. It's crude, I know,
> but it does show some positive associations between the number of
> installations of the packages.

Which is not at all surprising:

* there are very strong seasonal patterns
* there are big jumps after releases of new versions of R
* some people like to have all packages installed locally

This is an intrinsic problem with download data. There's no way to
tell if a downloader is really using your package or not.
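
As a rough illustration of the third point, here is a minimal sketch in
Python using purely synthetic data, not real CRAN logs and not the script
from the quoted message. It assumes five hypothetical, unrelated packages
whose daily counts are independent noise plus a shared count of "install
everything" downloaders. All names and parameters (simulate, bulk, the
ranges 0-50 and 0-20) are made up for the example.

```python
import random


def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def diff(series):
    """First differences, as in the quoted description."""
    return [b - a for a, b in zip(series, series[1:])]


def simulate(n_days=250, n_packages=5, seed=1):
    """Synthetic weekday download counts (assumed model, not real data)."""
    rng = random.Random(seed)
    # Shared component: users who download every package on a given day.
    bulk = [rng.randint(0, 50) for _ in range(n_days)]
    series = []
    for _ in range(n_packages):
        # Package-specific demand, independent across packages.
        own = [rng.randint(0, 20) for _ in range(n_days)]
        series.append([o + b for o, b in zip(own, bulk)])
    return series


series = simulate()
diffs = [diff(s) for s in series]
corr = [[pearson(a, b) for b in diffs] for a in diffs]

# Print the correlation matrix of the differenced series.
for row in corr:
    print(" ".join(f"{v:5.2f}" for v in row))
```

Under these assumptions the off-diagonal correlations come out strongly
positive even though the packages have nothing to do with each other, so
a positive correlation alone does not single out check machines.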

Hadley

-- 
http://hadley.nz