stdev error

10 messages · Eric Berger, Chris Evans, Ivan Krylov +3 more

#
r-help forum

When I run the following code

my_tbl %>%
  mutate(Bse_bwt = round(Bse_bwt * 2) / 2) %>%
  group_by(Cat, Bse_bwt) %>%
  summarize(count = n(), Bse_ftv = mean(Bse_ftv), stdev = sd(Bse_ftv))

I get the following error:

Error: `stdev` refers to a variable created earlier in this summarise().
Do you need an extra mutate() step?

I suspect it is because the standard deviation of a length-one vector is NA
and R errors out on the standard deviation of a single value. So then I tried

summarize(count = n(), Bse_ftv = mean(Bse_ftv),
          stdev = if (n() > 1) sd(Bse_ftv) else 0)

and this didn't seem to work either. So there has to be a way to add some
sort of check to my standard deviation call that tests whether n > 1 before
taking the standard deviation in dplyr.

Jeff
#
try changing
Bse_ftv = mean(Bse_ftv)
to
Bse_ftv_mean = mean(Bse_ftv)

On Fri, Mar 11, 2022 at 4:15 PM Jeff Reichman <reichmanj at sbcglobal.net>
wrote:

  
  
#
Can't see your data but perhaps:

my_tbl %>%
  mutate(Bse_bwt = round(Bse_bwt * 2) / 2) %>%
  group_by(Cat, Bse_bwt) %>%
  summarize(count = n(), 
     Bse_ftv = mean(Bse_ftv), 
     stdev = if_else(count > 1,
                     sd(Bse_ftv),
                     NA_real_))
 

#
Well, I can see my "ifelse" syntax is wrong, so I've changed it to

  summarize(count = n(), Bse_ftv = mean(Bse_ftv),
            stdev = ifelse(count > 1, sd(Bse_ftv), 0))

but I am still getting

Error: `stdev` refers to a variable created earlier in this summarise().
Do you need an extra mutate() step?

______________________________________________
R-help at r-project.org <mailto:R-help at r-project.org>  mailing list -- To
UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
#
On Fri, 11 Mar 2022 08:14:52 -0600,
"Jeff Reichman" <reichmanj at sbcglobal.net> wrote:
My interpretation of the error message is that summarise() thinks that
you're creating a variable (Bse_ftv = mean(Bse_ftv)) and then asking
summarise() to compute its standard deviation (stdev = sd(Bse_ftv)) in
the same call. This is apparently not always supported, according
to ?summarise [1].

You know that you meant the previously available Bse_ftv column, not
the newly created Bse_ftv = mean(Bse_ftv), but this is not how the call
is interpreted. Try setting a different name for the result of
mean(Bse_ftv).
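
A minimal sketch of the renaming suggested above, using the built-in mtcars data as a stand-in because my_tbl was not posted. The names mean_hp and stdev_hp are illustrative only and do not come from the original code.

```r
library(dplyr)

# mtcars stands in for my_tbl (assumption: the real data was never shared).
res <- mtcars %>%
  group_by(gear, carb) %>%
  summarise(
    count    = n(),
    mean_hp  = mean(hp),  # distinct name, so hp still refers to the original column
    stdev_hp = sd(hp),    # computed from the per-group hp values, not from the mean
    .groups  = "drop"
  )

print(res)
```

Groups with a single row get NA for stdev_hp, because sd() of a length-one vector is NA.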
#
Hallo

with(my_tbl, aggregate(Bse_bwt, list(Cat), function(x) c(n=length(x), mean=mean(x), st_dev=sd(x))))

Or am I missing something?

Cheers
Petr


#
Hello,

I cannot reproduce this error with a built-in data set.
Can you post str(my_tbl)?


suppressPackageStartupMessages(library(dplyr))

mtcars %>%
   mutate(hp = round(hp * 2) / 2) %>%
   group_by(cyl, hp) %>%
   summarise(
     count = n(),
     hp = mean(hp),
     stdev = sd(hp)
   )
#> `summarise()` has grouped output by 'cyl'. You can override using the `.groups`
#> argument.
#> # A tibble: 23 x 4
#> # Groups:   cyl [3]
#>      cyl    hp count stdev
#>    <dbl> <dbl> <int> <dbl>
#>  1     4    52     1    NA
#>  2     4    62     1    NA
#>  3     4    65     1    NA
#>  4     4    66     2    NA
#>  5     4    91     1    NA
#>  6     4    93     1    NA
#>  7     4    95     1    NA
#>  8     4    97     1    NA
#>  9     4   109     1    NA
#> 10     4   113     1    NA
#> # ... with 13 more rows

Hope this helps,

Rui Barradas


At 14:14 on 11/03/2022, Jeff Reichman wrote:
#
Rui

I don't have the data with me. But I did try using the mtcars dataset you used

mtcars %>%
  mutate(hp = round(hp * 2) / 2) %>%
  group_by(gear, carb) %>%
  summarise(count = n(), mean_hp = mean(hp), stdev_hp = sd(hp))

which resulted in 

# A tibble: 11 x 5
# Groups:   gear [3]
    gear  carb count mean_hp stdev_hp
   <dbl> <dbl> <int>   <dbl>    <dbl>
 1     3     1     3   104       6.56
 2     3     2     4   162.     14.4 
 3     3     3     3   180       0   
 4     3     4     5   228      17.9 
 5     4     1     4    72.5    13.7 
 6     4     2     4    79.5    26.9 
 7     4     4     4   116.      7.51
 8     5     2     2   102      15.6 
 9     5     4     1   264      NA   
10     5     6     1   175      NA   
11     5     8     1   335      NA

So maybe there is something odd with my dataset, because the mtcars code ran just fine. Where count == 1, sd() returned NA, which is what I was expecting originally.
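
If a 0 rather than NA is wanted for single-row groups, as asked in the first message of this thread, a plain if/else on n() is one option. The sketch below again assumes the built-in mtcars data as a stand-in for my_tbl, and the column names mean_hp and stdev_hp are made up for illustration.

```r
library(dplyr)

# mtcars stands in for my_tbl (assumption: the real data was never shared).
res0 <- mtcars %>%
  group_by(gear, carb) %>%
  summarise(
    count    = n(),
    mean_hp  = mean(hp),
    # n() is a single number per group, so a scalar if/else is enough here
    stdev_hp = if (n() > 1) sd(hp) else 0,
    .groups  = "drop"
  )

print(res0)
```

Whether 0 is a sensible stand-in for an undefined standard deviation is a separate question; putting NA_real_ in the else branch keeps the default behaviour of sd().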


#
Rui

Found my problem, or at least I think I found the problem. 

# BEWARE: reusing variables may lead to unexpected results - https://dplyr.tidyverse.org/reference/summarise.html

I changed my variable name and the problem was resolved.

Jeff

#
Hello,

Yes, you're right.
Thanks for posting this. Contrary to what I thought, the code in my
original post did reproduce the problem: all stdev values were NA, whereas
after changing the mean to hp1 = mean(hp) some of them are not, and there
are zeros in the output column stdev.

Rui Barradas

At 17:49 on 11/03/2022, Jeff Reichman wrote: