I'm developing a package that comes with a data set called RutgersMapB36. One of the package's functions requires this data frame. A toy example is:
test <- function() {
  data(RutgersMapB36)
  return(RutgersMapB36[, 1])
}
R CMD check returns a NOTE:
test: no visible binding for global variable 'RutgersMapB36'
Is there any way to avoid this NOTE?
Thanks,
Brad
---
Brad McNeney
Statistics and Actuarial Science
Simon Fraser University
R CMD check returns NOTE about package data set as global variable
16 messages · Brian Ripley, Hadley Wickham, Hervé Pagès +5 more
On 06/04/2012 19:46, Brad McNeney wrote:
Is there any way to avoid this NOTE?
Use data("RutgersMapB36"), which many think is good practice in code.
______________________________________________ R-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
OK, thanks for the tip on good coding practice. I'm still getting the NOTE though when I make the suggested change.
In case it matters, I'm check'ing with
R version 2.15.0 (2012-03-30)
Platform: i386-pc-mingw32/i386 (32-bit)
Brad
Is the dataset something that package users will need, or just your package's functions?
Hadley
Assistant Professor / Dobelman Family Junior Chair Department of Statistics / Rice University http://had.co.nz/
On Fri, 6 Apr 2012, Brad McNeney wrote:
OK, thanks for the tip on good coding practice. I'm still getting the NOTE though when I make the suggested change.
Yes, you will: data() is a function with side effects, which is contrary to the functional programming model being checked. So there is no way to use data() and avoid all notes.
If you want to make your code more understandable, consider using LazyData (see 'Writing R Extensions'). My view is that data() is a kludge from long ago, when R had much less powerful memory management. The possible exception is very large datasets (at least 100 MB), where you may want to control when they are loaded into memory.
Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
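A minimal sketch of the LazyData route suggested above. It assumes the package's DESCRIPTION file carries the field `LazyData: true`, so the dataset can be referenced by name without any data() call. RutgersMapB36 is not available outside that package, so the runnable part uses `mtcars` from the standard `datasets` package as a stand-in. The function name `first_column` is hypothetical.

```r
# In the package's DESCRIPTION file (an assumption, not shown in the thread):
#   LazyData: true
#
# With lazy-loading the dataset behaves like any other object that the
# package provides, so the function needs no data() call and has no
# side effect on the user's workspace.
# Stand-in: 'mtcars' from 'datasets' plays the role of RutgersMapB36.
first_column <- function() {
  datasets::mtcars[, 1]
}

head(first_column())
```

In the package itself the body would presumably reduce to `RutgersMapB36[, 1]`.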
On Apr 6, 2012, at 21:33 , Brad McNeney wrote:
OK, thanks for the tip on good coding practice. I'm still getting the NOTE though when I make the suggested change.
Hm? It's not like Brian to get such things wrong; did you check properly?
Perhaps the code checker is not smart enough to know that data() creates global variables. (That would be heuristic at best. You can't actually be sure that data() creates objects with the name given as the argument -- in fact, several objects might be created, possibly none named as the argument.)
You are not using LazyData, right? You might consider doing that and forgetting about data() entirely.
Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
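The remark that data() need not create an object named after its argument can be checked with the `state` data in the standard `datasets` package, which defines several `state.*` objects and none called `state`. The sketch loads into a scratch environment (the name `scratch` is arbitrary) so that the workspace is left untouched.

```r
# Load the 'state' data into a throwaway environment instead of .GlobalEnv.
scratch <- new.env()
data("state", package = "datasets", envir = scratch)

# Several objects were created (state.abb, state.name, ...).
ls(scratch)

# None of them is called "state", which is why a static checker cannot
# infer variable names from the argument of data().
exists("state", envir = scratch, inherits = FALSE)
```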
On 04/06/2012 12:33 PM, Brad McNeney wrote:
OK, thanks for the tip on good coding practice. I'm still getting the NOTE though when I make the suggested change.
Because when you do return(RutgersMapB36[,1]), the code checker has no
way to know that the RutgersMapB36 variable is actually defined.
Try this:
test<-function() {
RutgersMapB36 <- NULL
data(RutgersMapB36)
return(RutgersMapB36[,1])
}
Cheers,
H.
Hervé Pagès Program in Computational Biology Division of Public Health Sciences Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N, M1-B514 P.O. Box 19024 Seattle, WA 98109-1024 E-mail: hpages at fhcrc.org Phone: (206) 667-5791 Fax: (206) 667-1319
On Apr 6, 2012, at 22:23, Hervé Pagès wrote:
Try this:
test<-function() {
RutgersMapB36 <- NULL
data(RutgersMapB36)
return(RutgersMapB36[,1])
}
That might remove the NOTE, but as far as I can see, it also breaks the code...
Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
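A sketch of why the local `NULL` assignment breaks the function: by default data() loads into the global environment, while the lookup inside the function finds the local `NULL` first. `mtcars` from the `datasets` package stands in for RutgersMapB36, and the function name `broken` is hypothetical. Note that running this leaves a copy of `mtcars` in the workspace.

```r
broken <- function() {
  mtcars <- NULL                        # local binding that silences the checker
  data("mtcars", package = "datasets")  # loads into .GlobalEnv by default
  mtcars[, 1]                           # finds the local NULL, not the dataset
}

# Subsetting NULL yields NULL, so the function no longer returns a column.
is.null(broken())
```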
Package users should have access.
Brad
----- Original Message -----
From: "Hadley Wickham" <hadley at rice.edu>
Sent: Friday, 6 April, 2012 12:38:11 PM
Is the dataset something that package users will need, or just your package's functions?
Hadley
Thanks (to all), using LazyData removes the note.
Brad
On 04/06/2012 01:33 PM, peter dalgaard wrote:
That might remove the NOTE, but as far as I can see, it also breaks the code...
oops, right...
This should remove the NOTE and work (hopefully):
test<-function() {
data("RutgersMapB36") # loads RutgersMapB36 in .GlobalEnv
RutgersMapB36 <- get("RutgersMapB36", envir=.GlobalEnv)
return(RutgersMapB36[,1])
}
Cheers,
H.
Hervé Pagès Program in Computational Biology Division of Public Health Sciences Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N, M1-B514 P.O. Box 19024 Seattle, WA 98109-1024 E-mail: hpages at fhcrc.org Phone: (206) 667-5791 Fax: (206) 667-1319
On Fri, 2012-04-06 at 13:23 -0700, Hervé Pagès wrote:
test<-function() {
RutgersMapB36 <- NULL
data(RutgersMapB36)
return(RutgersMapB36[,1])
}
That won't work, but this should:
RutgersMapB36 <- NULL
test<-function() {
data(RutgersMapB36)
return(RutgersMapB36[,1])
}
Honestly, this is just another example of a non-helpful 'global
variable' NOTE. I've removed many of these from our packages, often by
resorting to useless workarounds like this one, but I have never once
gotten a valid NOTE out of this message. We provided other examples
earlier in a different thread.
Brian G. Peterson http://braverock.com/brian/ Ph: 773-459-4973 IM: bgpbraverock
On Fri, Apr 6, 2012 at 1:33 PM, peter dalgaard <pdalgd at gmail.com> wrote:
That might remove the NOTE, but as far as I can see, it also breaks the code...
For data() per se, which by default clutters up the global environment,
you can do:
test <- function() {
  env <- new.env()
  data("RutgersMapB36", envir = env)
  env$RutgersMapB36[, 1]
}
That is more explicit, and I do believe you won't get a NOTE about it.
Other than that, one can also use the following style (which still
seems to do the trick) for data(), attach(), load() et al., iff you have
to use them:
test <- function() {
  # To avoid NOTEs by R CMD check
  RutgersMapB36 <- NULL; rm(RutgersMapB36)
  data(RutgersMapB36)
  return(RutgersMapB36[, 1])
}
/Henrik
Hi Brian,
On 04/06/2012 02:04 PM, Brian G. Peterson wrote:
Honestly, this is just another example of a non-helpful 'global
variable' NOTE. I've removed many of these from our packages, often by
resorting to useless workarounds like this one, but I have never once
gotten a valid NOTE out of this message. We provided other examples
earlier in a different thread.
Other people might have a different experience. I've personally seen a lot of true positive "no visible binding for global variable" notes in the Bioconductor check results. In that sense 'R CMD check' is no different from other code checking tools like e.g. gcc -Wall. There are sometimes false positives, it's unavoidable. Personally I can live with that.
Cheers,
H.
Hervé Pagès Program in Computational Biology Division of Public Health Sciences Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N, M1-B514 P.O. Box 19024 Seattle, WA 98109-1024 E-mail: hpages at fhcrc.org Phone: (206) 667-5791 Fax: (206) 667-1319
On Apr 6, 2012, at 23:04 , Brian G. Peterson wrote:
Honestly, this is just another example of a non-helpful 'global variable' NOTE. I've removed many of these from our packages, often by resorting to useless workarounds like this one, but I have never once gotten a valid NOTE out of this message. We provided other examples earlier in a different thread.
Actually, this one is perfectly valid. It is saying that you are messing with global variables, which you might not want to do in package code. It is admittedly rather unlikely that the user has a variable called "RutgersMapB36" lying around for you to clobber, but suppose that it was "x" or "mydata"...
Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
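The clobbering risk described above can be reproduced in a few lines. This is only an illustrative sketch: the object name `women` is chosen because a dataset of that name exists in the standard `datasets` package, standing in for a user's "x" or "mydata", and the function name `loader` is hypothetical.

```r
# A user's own object in the workspace.
assign("women", "my unsaved results", envir = globalenv())

# Package-style code that calls data() without an 'envir' argument.
loader <- function() {
  data("women", package = "datasets")  # default target is .GlobalEnv
  invisible(NULL)
}
loader()

# The user's character string has been replaced by the dataset.
class(get("women", envir = globalenv()))
```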
3 days later
On 4/6/12 4:04 PM, "Brian G. Peterson" <brian at braverock.com> wrote:
Honestly, this is just another example of a non-helpful 'global variable' NOTE. I've removed many of these from our packages, often by resorting to useless workarounds like this one, but I have never once gotten a valid NOTE out of this message. We provided other examples earlier in a different thread.
While I have seen a couple of valid ones, it gets really old having to explain these NOTEs to the user community. It would really be nice to have something equivalent to LINT comment directives (e.g., NOTREACHED, ARGSUSED) that could be used to suppress "noise" messages.
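For readers looking for such a suppression mechanism: R releases later than the 2.15.0 used in this thread (2.15.1 onwards) provide `utils::globalVariables()`. It is a package-level declaration rather than a comment directive, and it only tells the checker that a name is intentional; it does not change what data() does. A hedged sketch, reusing the toy function from the original post:

```r
# Placed at top level in one of the package's R files.
# Declares the name so the code checker treats it as a known global.
if (getRversion() >= "2.15.1") utils::globalVariables("RutgersMapB36")

test <- function() {
  data("RutgersMapB36")   # still loads into .GlobalEnv as a side effect
  RutgersMapB36[, 1]
}
```

Since the side effect on the user's workspace remains, LazyData or an explicit `envir` argument is probably still the cleaner choice where it is available.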