a) Numeric values may be either integers (signed 32 bit) or double
precision (53 bit mantissa).
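b) Numeric constants with no decoration (e.g. 61224) are double precision.
Integer constants carry an L suffix (e.g. 61224L).
c) length() returns an integer, so m1*m1 is integer multiplication, and
61224*61224 > 2^31-1 so that answer cannot fit into an integer. Typing
61224*61224 at the console multiplies two doubles, which is why that works.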
d) Exponentiation is a floating point operation so the result of 61224L^2L
is a floating point answer that CAN fit into the 53-bit mantissa of a double
precision value, so no overflow occurs.
e) Defining a function like yules.k1 and never showing how you called it
does not constitute a reproducible example. To avoid such gaffes you can
use the reprex package to confirm that the errors shown in your question
are in fact reproducible.
f) On this mailing list, the fact that you are using RStudio is at best
irrelevant, and at worst off-topic. If you don't see problems running your
reproducible example from R in the terminal then the question probably
belongs in the RStudio support forum. This is another reason to use the
reprex package to check your reproducibility (this works even if you invoke
it from RStudio).
g) Calling table on the result of table must be one of the more bizarre
calculation sequences I have ever seen in R. I hope you are getting the
answers you are expecting when you do use double precision numeric values.
Also, using the prefix form of multiplication is unnecessarily obscure, and
your use of the return function at the end of your function is redundant.
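Points a) through d) can be checked directly at the R prompt. The following
is a minimal sketch, not your actual session: the vector built with
character(61224) is only a stand-in for curr.lemmas, and yules.k3 is a
hypothetical name for a variant of your function that coerces the length to
double precision before multiplying.

```r
# Stand-in for curr.lemmas (assumption: only its length matters here)
m1 <- length(character(61224))
typeof(m1)                        # "integer": length() returns an integer

# Integer * integer stays integer, so the product overflows to NA
overflowed <- suppressWarnings(m1 * m1)
is.na(overflowed)                 # TRUE

# Undecorated constants are doubles, so this multiplication is fine
typeof(61224)                     # "double"
61224 * 61224                     # 3748378176

# ^ yields a double even when both operands are integers
typeof(m1^2)                      # "double"
m1^2                              # 3748378176

# Hypothetical variant of the posted function: coerce the length once,
# use infix multiplication, and let the last expression be the result
yules.k3 <- function(input) {
  m1 <- as.numeric(length(input)) # double, so m1 * m1 cannot overflow here
  temp <- table(table(input))
  m2 <- sum(temp * as.numeric(names(temp))^2)
  10000 * (m2 - m1) / (m1 * m1)
}

yules.k3(c("a", "a", "b"))        # 10000 * (5 - 3) / 9
```

Coercing with as.numeric() once at the top keeps every later use of m1 in
double precision, so m1*m1 and m1^2 then behave the same.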
On May 8, 2018 7:54:26 PM PDT, "Stefan Th. Gries" <stgries at gmail.com>
wrote:
I have a problem with integer overflow that I cannot understand.
I have a character vector curr.lemmas with the following properties:
length(curr.lemmas) # 61224
length(unique(curr.lemmas)) # 2652
That vector is the input to the following function:
yules.k1 <- function(input) {
m1 <- length(input); temp <- table(table(input))
m2 <- sum("*"(temp, as.numeric(names(temp))^2))
return(10000*(m2-m1) / (m1*m1))
}
When I run this, I get the following output:
[1] NA
Warning message:
In m1 * m1 : NAs produced by integer overflow
But when I change the function to this one by just replacing m1*m1 by
m1^2 ...
yules.k2 <- function(input) {
m1 <- length(input); temp <- table(table(input))
m2 <- sum("*"(temp, as.numeric(names(temp))^2))
return(10000*(m2-m1) / (m1^2))
}
yules.k2(curr.lemmas) # -> 157.261
I am using RStudio 1.1.447 and here's my sessionInfo
######################
R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Linux Mint 18.3
Matrix products: default
BLAS: /usr/lib/openblas-base/libblas.so.3
LAPACK: /usr/lib/libopenblasp-r0.2.18.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
LC_MONETARY=en_US.UTF-8
[6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C
LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] compiler_3.4.4 backports_1.1.2 magrittr_1.5 rprojroot_1.3-2
htmltools_0.3.6 tools_3.4.4 yaml_2.1.19 Rcpp_0.12.16
stringi_1.2.2
[10] rmarkdown_1.9 knitr_1.20 stringr_1.3.0 digest_0.6.15
evaluate_0.10.1
######################
What is even more puzzling is that one time I ran R in the console of
Geany and this happened:
[1] NA
Warning message:
In m1 * m1 : NAs produced by integer overflow
[1] 3748378176
That is, the multiplication worked with the numbers but not the
numeric vectors; the above is literally copied from the console. Why
is that happening?
Any help would be much appreciated!
STG
--
Stefan Th. Gries
----------------------------------
Univ. of California, Santa Barbara
http://tinyurl.com/stgries