Hi,
I am developing a Bioconductor package and cannot get rid of some
warning messages. During devtools::check() I get the following warning
messages:
...
summarizeDataFrame: no visible binding for global variable ‘name’
summarizeDataFrame: no visible binding for global variable ‘gene’
summarizeDataFrame: no visible binding for global variable ‘value’
...
Here is a short version of the function:
#' Collapse rows with duplicated name column
#'
#' @param dat a \cite{tibble} with the columns name, gene and value
#' @importFrom plyr ddply
#' @import tibble
#' @return a \cite{tibble}
#' @export
#'
#' @examples
#' dat <- tibble(name = c(paste0("position", 1:5), paste0("position",
#' c(1:3))), gene = paste0("gene", 1:8), value = 1:8)
#' summarizeDataFrame(dat)
summarizeDataFrame <- function(dat){
  ddply(dat, "name", "summarize",
        name=unique(name),
        gene=paste(unique(gene), collapse = ","),
        value=mean(value))
}
R interprets the "name", "gene" and "value" column names as variables
during the check. Does anyone have an idea how to change the syntax of
ddply or how to get rid of the warning message?
Thanks in advance!
Tobias
[Bioc-devel] ddply causes error during R check
5 messages · web working, Mike Smith, Martin Morgan
If you're sure these are false positives (and it looks like they are) then you can use utils::globalVariables() outside of your function to get rid of the note. It might also be worth pointing out that there are also plenty of Bioconductor packages that don't do this and simply have this mentioned in the check results, e.g. http://bioconductor.org/checkResults/devel/bioc-LATEST/beadarray/malbec2-checksrc.html .
Mike
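For illustration, the globalVariables() suggestion could look roughly like the sketch below. It assumes the call sits at top level in one of the package's R files; the file name R/globals.R is only a hypothetical convention, not something named in this thread.

```r
# Hypothetical file R/globals.R inside the package (the name is an assumption).
# Declares the column names used inside ddply() so that R CMD check no longer
# reports "no visible binding for global variable" for them.
# Note: the declaration applies to the whole package, not to one function.
utils::globalVariables(c("name", "gene", "value"))
```

Because the declaration is package-wide, a genuine typo that happens to match one of these names would also go unreported.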
On Tue, 12 Feb 2019 at 08:35, web working <webworking at posteo.de> wrote:
_______________________________________________ Bioc-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/bioc-devel
Use `globalVariables()` to declare these symbols and quieten the warnings, at the expense of quietening warnings about those symbols being undefined in _all_ code in the package and potentially silencing true positives. Alternatively, avoid non-standard evaluation (this is what ddply is doing, using special rules to resolve symbols like `name`) by using base R functionality. Note also that non-standard evaluation is prone to typos, e.g., the typo `hpx` is looked up in the calling environment rather than in the data frame:
hpx = 1
ddply(mtcars, "cyl", "summarize", value = mean(hpx))  ## oops, meant `mean(hp)`
  cyl summarize
1   4         1
2   6         1
3   8         1
Marginally better is
aggregate(hp ~ cyl, mtcars, mean)
  cyl        hp
1   4  82.63636
2   6 122.28571
3   8 209.21429
where R recognizes symbols in the formula ~ as intentionally unresolved. The wizards on the list might point to constructs in the rlang package.
Martin
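As one possible reading of the "use base R functionality" advice, here is a minimal sketch of summarizeDataFrame() that accesses columns by string, so the check has no free symbols to report. It is an illustration, not the poster's final code: it returns a plain data.frame rather than a tibble, and it assumes the input has at least one row.

```r
# Sketch: collapse rows with a duplicated 'name' column using base R only.
# Columns are looked up by string, so no non-standard evaluation is involved.
summarizeDataFrame <- function(dat) {
    dat <- as.data.frame(dat)
    # Row indices for each distinct value of 'name' (unused levels dropped)
    groups <- split(seq_len(nrow(dat)), dat[["name"]], drop = TRUE)
    res <- lapply(names(groups), function(nm) {
        rows <- groups[[nm]]
        data.frame(
            name = nm,
            gene = paste(unique(dat[["gene"]][rows]), collapse = ","),
            value = mean(dat[["value"]][rows]),
            stringsAsFactors = FALSE
        )
    })
    # Bind the one-row summaries into a single data.frame
    do.call(rbind, res)
}

# Same example data as in the original post, built as a data.frame
dat <- data.frame(
    name = c(paste0("position", 1:5), paste0("position", 1:3)),
    gene = paste0("gene", 1:8),
    value = 1:8,
    stringsAsFactors = FALSE
)
res <- summarizeDataFrame(dat)
```

With this version a mistyped column name yields NULL from `[[` and so fails or returns an obviously empty result, instead of silently picking up a variable from the calling environment.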
1 day later
Hi Mike,
thank you for pointing out that there are other packages in the same situation.
Tobias
On 12.02.19 at 14:47, Mike Smith wrote:
Hi Martin,
thank you for this approach. I will check my code and see where I can use it.
Tobias
On 12.02.19 at 14:58, Martin Morgan wrote: