Problem with R >3.0.0

14 messages · Peter Langfelder, Brian Ripley, Shelton, Samuel +2 more

#
On Aug 19, 2013, at 20:53 , Shelton, Samuel wrote:

            
What is bicor()? From the WGCNA package? Perhaps the package is doing something incompatible with the long vector support in R 3.0.0. You need to report such queries to the maintainer. So far we have no evidence that the bug is in R itself, and you're not giving us anything reproducible to investigate.
This seems entirely speculative. Please stick to the facts.
#
Hi Sam,

I assume you mean that the correlation for _genes_ (not samples)
11262:30000 is 0? I am the maintainer of the WGCNA package, but
unfortunately I don't have access to a Mac big enough to try a
30000x30000 correlation matrix. I would be thankful if you could
try reproducing the problem with smaller matrices (e.g. 20000x20000)
and produce a small reproducible example by generating the data
with, say, rnorm, like this:

library(WGCNA) # provides bicor()

nGenes = 20000 # as small as possible that still produces the error
nSamples = 100
datExpr1 = matrix(rnorm(nSamples * nGenes), nSamples, nGenes)

simMat = bicor(datExpr1, use = 'p')

Best,

Peter

On Mon, Aug 19, 2013 at 11:53 AM, Shelton, Samuel
<SheltonS at stemcell.ucsf.edu> wrote:
#
On Tue, Aug 20, 2013 at 9:23 AM, peter dalgaard <pdalgd at gmail.com> wrote:

            
The maintainer is reporting for duty :)

The version of WGCNA currently on CRAN uses .C to call compiled code.
If I read the manuals right, long vectors are not allowed in .C calls.
In my .C calls I use explicit type casts (as.integer, as.double etc)
for all arguments.

Once we see a reproducible example, we can figure out the problem.

Peter
#
Hi all,

Thanks for getting back to me. We would like to move over to v3.0.0 on our
cluster so that we can build matrices larger than 46300*46300 (the limit in
R <3.0.0), but so far we can't get things to work with R v3.0.0 and higher.
I am troubleshooting at the moment and I now think that the problem is
actually with the diag function, which has been rewritten in version 3.0.0.


The problem is definitely with the diag function. It does not occur on
smaller matrices (20000*20000), and I think it may be a bug.
This illustrates the problem:

This was done on an iMac i5 with OSX 10.8.5 16GB Ram and with R 3.0.1 (but
I do see the same for 3.0.0). This does not occur when I run it with R
2.15.2. 


mat1=matrix(rexp(20000^2), 20000)

mat1[1:10,1:10]
           [,1]       [,2]       [,3]      [,4]       [,5]      [,6]
 [,7]       [,8]       [,9]      [,10]
 [1,] 0.1829090 0.39867734 0.80499126 4.1746377 0.20717066 1.1477365
0.469843567 2.57767543 0.17449595 0.01949358
 [2,] 0.5731522 0.15835939 0.29165029 0.6781249 0.64553728 2.4438404
2.140374938 0.40091195 0.51201369 0.98904490
 [3,] 0.3250310 0.09934147 0.79962549 0.4933385 0.30473422 0.4556765
0.002640034 0.90606861 2.58772944 0.89884208
 [4,] 1.4195017 0.16082660 0.01377838 0.2115803 1.43992672 0.3883675
0.040903805 0.51011305 0.41998024 0.44209926
 [5,] 0.8328441 1.10335604 0.11875332 0.1600287 0.17333324 0.3388678
3.206179119 0.52170966 1.03084845 0.05843232
 [6,] 1.3179906 0.76376188 1.24231798 0.9424030 0.04440514 1.0237664
2.547528816 1.35629450 0.87983354 0.25236343
 [7,] 0.6990544 1.17003075 0.66063936 0.8632534 0.28965611 0.6718020
1.137348735 0.08371053 0.23144290 0.18915132
 [8,] 0.9908026 1.20471979 0.08816010 0.2652131 0.03537790 0.3295816
0.144371435 3.03299285 0.09728111 0.39890260
 [9,] 0.9557305 0.29196500 0.43955758 0.7332643 2.03457020 0.5858431
2.437192399 0.34689557 0.02039205 0.54898488
[10,] 3.7220703 0.13572389 0.18888673 0.5683698 1.79209016 1.3495723
0.571159401 0.63375850 0.63221987 1.32840290

mat1[19990:20000,19990:20000]
[,1]      [,2]       [,3]         [,4]      [,5]        [,6]
  [,7]      [,8]       [,9]     [,10]     [,11]
 [1,] 1.7910910 0.5719982 0.38689588 1.2157545685 0.8530179 1.464105574
0.5986705 1.1623393 0.55244563 0.1770146 0.4326310
 [2,] 0.2862914 1.5267870 0.98214645 0.0004617244 0.6395319 0.075217874
0.6725620 0.2403549 0.08436217 0.1435451 0.7487862
 [3,] 2.0492301 0.7216115 0.16951284 0.2726676762 2.1893806 1.202518385
0.9897710 1.4813026 2.42517705 0.3398811 0.7285074
 [4,] 0.6538994 0.2437594 2.08848881 0.3917212249 0.4441824 0.433749415
1.3022991 1.3695241 0.07057642 0.4296937 2.9307556
 [5,] 2.3688094 2.3970048 0.03545232 0.5986997508 0.8914097 0.497023176
0.4210650 1.5337767 0.01141066 1.1562830 1.0572076
 [6,] 2.0626934 0.6186995 0.99197835 1.4794321654 0.1549314 1.296227000
0.2790942 0.9327613 0.84131377 0.8782590 0.3279970
 [7,] 1.2423823 0.2385994 0.11390071 2.0745023842 1.9152523 0.754186281
1.5474078 2.5899490 5.19298969 1.4680934 1.0537164
 [8,] 1.3657070 1.9502828 1.07681438 0.9339731540 1.7532474 0.186193421
1.8699504 1.9187339 5.13248671 0.4621520 0.4753582
 [9,] 0.6512000 0.5104660 0.17820166 0.3965162944 0.0919119 0.187808660
0.7391137 0.1574844 0.65985494 0.4066742 0.8072494
[10,] 0.7435028 1.1395666 2.46096009 0.7060164691 1.7965986 0.008278685
0.4642319 0.1582297 1.71676326 0.3662139 0.7864957
[11,] 0.3537041 0.6622001 2.01642141 1.8225423060 0.3295436 1.260737179
0.8430396 0.5132811 0.30547431 1.6088725 0.4001791

diag(mat1)=0

mat1[1:10,1:10]
           [,1]       [,2]       [,3]      [,4]       [,5]      [,6]
 [,7]       [,8]       [,9]      [,10]
 [1,] 0.0000000 0.39867734 0.80499126 4.1746377 0.20717066 1.1477365
0.469843567 2.57767543 0.17449595 0.01949358
 [2,] 0.5731522 0.00000000 0.29165029 0.6781249 0.64553728 2.4438404
2.140374938 0.40091195 0.51201369 0.98904490
 [3,] 0.3250310 0.09934147 0.00000000 0.4933385 0.30473422 0.4556765
0.002640034 0.90606861 2.58772944 0.89884208
 [4,] 1.4195017 0.16082660 0.01377838 0.0000000 1.43992672 0.3883675
0.040903805 0.51011305 0.41998024 0.44209926
 [5,] 0.8328441 1.10335604 0.11875332 0.1600287 0.00000000 0.3388678
3.206179119 0.52170966 1.03084845 0.05843232
 [6,] 1.3179906 0.76376188 1.24231798 0.9424030 0.04440514 0.0000000
2.547528816 1.35629450 0.87983354 0.25236343
 [7,] 0.6990544 1.17003075 0.66063936 0.8632534 0.28965611 0.6718020
0.000000000 0.08371053 0.23144290 0.18915132
 [8,] 0.9908026 1.20471979 0.08816010 0.2652131 0.03537790 0.3295816
0.144371435 0.00000000 0.09728111 0.39890260
 [9,] 0.9557305 0.29196500 0.43955758 0.7332643 2.03457020 0.5858431
2.437192399 0.34689557 0.00000000 0.54898488
[10,] 3.7220703 0.13572389 0.18888673 0.5683698 1.79209016 1.3495723
0.571159401 0.63375850 0.63221987 0.00000000

mat1[19990:20000,19990:20000]
           [,1]      [,2]       [,3]         [,4]      [,5]        [,6]
  [,7]      [,8]       [,9]     [,10]     [,11]
 [1,] 0.0000000 0.5719982 0.38689588 1.2157545685 0.8530179 1.464105574
0.5986705 1.1623393 0.55244563 0.1770146 0.4326310
 [2,] 0.2862914 0.0000000 0.98214645 0.0004617244 0.6395319 0.075217874
0.6725620 0.2403549 0.08436217 0.1435451 0.7487862
 [3,] 2.0492301 0.7216115 0.00000000 0.2726676762 2.1893806 1.202518385
0.9897710 1.4813026 2.42517705 0.3398811 0.7285074
 [4,] 0.6538994 0.2437594 2.08848881 0.0000000000 0.4441824 0.433749415
1.3022991 1.3695241 0.07057642 0.4296937 2.9307556
 [5,] 2.3688094 2.3970048 0.03545232 0.5986997508 0.0000000 0.497023176
0.4210650 1.5337767 0.01141066 1.1562830 1.0572076
 [6,] 2.0626934 0.6186995 0.99197835 1.4794321654 0.1549314 0.000000000
0.2790942 0.9327613 0.84131377 0.8782590 0.3279970
 [7,] 1.2423823 0.2385994 0.11390071 2.0745023842 1.9152523 0.754186281
0.0000000 2.5899490 5.19298969 1.4680934 1.0537164
 [8,] 1.3657070 1.9502828 1.07681438 0.9339731540 1.7532474 0.186193421
1.8699504 0.0000000 5.13248671 0.4621520 0.4753582
 [9,] 0.6512000 0.5104660 0.17820166 0.3965162944 0.0919119 0.187808660
0.7391137 0.1574844 0.00000000 0.4066742 0.8072494
[10,] 0.7435028 1.1395666 2.46096009 0.7060164691 1.7965986 0.008278685
0.4642319 0.1582297 1.71676326 0.0000000 0.7864957
[11,] 0.3537041 0.6622001 2.01642141 1.8225423060 0.3295436 1.260737179
0.8430396 0.5132811 0.30547431 1.6088725 0.0000000

mat1=matrix(rexp(38000^2), 38000)
dim(mat1)
[1] 38000 38000
mat1[1:10,1:10]
           [,1]      [,2]       [,3]      [,4]        [,5]      [,6]
[,7]      [,8]       [,9]      [,10]
 [1,] 3.8622815 0.1357886 1.64090976 0.4494637 0.812315613 0.5328906
0.45672475 0.2891504 1.57087882 1.27375802
 [2,] 3.8229940 0.1540735 0.30189392 1.2100152 0.003323061 0.7195875
0.60251052 1.9820380 0.18637086 0.05154236
 [3,] 2.0411498 0.8371707 0.02714550 1.3032572 0.330472063 0.3502267
0.11908140 0.4155857 3.46471729 0.31890778
 [4,] 1.5503390 0.7377494 2.01433675 0.6109255 1.844484309 1.2492693
0.09365743 0.4006219 2.37769616 0.38643521
 [5,] 0.4815804 0.5824312 0.61003728 0.4782871 0.526454982 0.1207842
0.93567987 1.7369767 4.47922786 0.20033928
 [6,] 0.3791645 0.1015489 1.96832962 1.5417178 1.030250434 0.1362716
1.72807083 0.2570055 0.02127689 0.80225716
 [7,] 1.5212795 2.8133952 0.15990367 0.4337506 0.526532536 2.9926685
0.01432572 0.6064162 0.69264596 0.50871566
 [8,] 1.2600365 0.1901277 2.34806048 1.1472887 0.141571521 2.0355007
1.12583466 0.3391067 0.18707165 3.71877247
 [9,] 3.0197258 2.3693633 0.94571337 0.2756933 0.938999190 1.5892456
0.18612994 1.0498866 1.89162156 1.56643880
[10,] 0.3573243 0.3047595 0.01894034 2.4666841 0.660994174 0.2248711
0.25436398 1.1275389 0.20960212 0.63957112




mat1[37990:38000,37990:38000]
           [,1]       [,2]       [,3]       [,4]      [,5]       [,6]
[,7]       [,8]       [,9]     [,10]     [,11]
 [1,] 0.8349326 1.32098263 2.68879316 0.52361090 1.4066094 0.00754308
2.4489865 1.76284621 0.09097126 2.1598758 0.8805701
 [2,] 2.5754994 1.15753606 1.76895066 3.06645700 0.8767225 0.33247639
1.5726808 0.12698495 0.75271682 6.4336476 0.1330457
 [3,] 2.5737957 0.58234929 0.40403205 0.34882433 0.4074048 1.54867135
2.5971068 0.27276140 0.56926494 0.2129180 1.3027215
 [4,] 0.9352113 0.46839288 0.41284388 1.40216119 0.8936151 2.98304058
0.4350446 1.14864094 0.26756970 2.6662998 0.1802141
 [5,] 1.6430435 1.22137017 0.06943644 0.03251737 0.2142083 0.16964865
1.6099314 0.95709429 0.22884734 0.5087986 0.9048555
 [6,] 2.5340159 1.28618183 3.86698388 1.05946189 3.5006776 0.39320471
0.4927357 0.66159842 0.18235678 0.5172896 2.2221550
 [7,] 0.1961937 0.17305031 0.62327325 3.14622730 3.4905449 3.08823676
0.5282165 0.48879156 0.11807913 0.7372706 0.9975103
 [8,] 0.8392414 0.17107463 1.08732839 0.28981611 0.4722655 0.17587788
1.7426814 0.13985161 1.86885446 1.9031580 1.1088431
 [9,] 3.4648342 0.33834943 0.07891645 0.07206860 1.1792628 0.54092203
0.8141844 2.82687085 0.78395229 0.5313417 3.2164664
[10,] 0.1524299 0.02045616 0.67881610 1.51491647 2.4390115 0.89033581
0.9026651 0.46858464 1.47711638 2.1821178 2.2149071
[11,] 0.9397162 0.42376104 0.20034405 1.47672836 1.0461904 0.97296202
2.1658717 0.04487329 0.30611082 1.1680312 0.9952517

diag(mat1)=0

mat1[1:10,1:10]
           [,1]      [,2]       [,3]      [,4]        [,5]      [,6]
[,7]      [,8]       [,9]      [,10]
 [1,] 0.0000000 0.1357886 1.64090976 0.4494637 0.812315613 0.5328906
0.45672475 0.2891504 1.57087882 1.27375802
 [2,] 3.8229940 0.0000000 0.30189392 1.2100152 0.003323061 0.7195875
0.60251052 1.9820380 0.18637086 0.05154236
 [3,] 2.0411498 0.8371707 0.00000000 1.3032572 0.330472063 0.3502267
0.11908140 0.4155857 3.46471729 0.31890778
 [4,] 1.5503390 0.7377494 2.01433675 0.0000000 1.844484309 1.2492693
0.09365743 0.4006219 2.37769616 0.38643521
 [5,] 0.4815804 0.5824312 0.61003728 0.4782871 0.000000000 0.1207842
0.93567987 1.7369767 4.47922786 0.20033928
 [6,] 0.3791645 0.1015489 1.96832962 1.5417178 1.030250434 0.0000000
1.72807083 0.2570055 0.02127689 0.80225716
 [7,] 1.5212795 2.8133952 0.15990367 0.4337506 0.526532536 2.9926685
0.00000000 0.6064162 0.69264596 0.50871566
 [8,] 1.2600365 0.1901277 2.34806048 1.1472887 0.141571521 2.0355007
1.12583466 0.0000000 0.18707165 3.71877247
 [9,] 3.0197258 2.3693633 0.94571337 0.2756933 0.938999190 1.5892456
0.18612994 1.0498866 0.00000000 1.56643880
[10,] 0.3573243 0.3047595 0.01894034 2.4666841 0.660994174 0.2248711
0.25436398 1.1275389 0.20960212 0.00000000


## After calling the diag function the bottom of the matrix is all set to 0.


mat1[37990:38000,37990:38000]
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
 [1,]    0    0    0    0    0    0    0    0    0     0     0
 [2,]    0    0    0    0    0    0    0    0    0     0     0
 [3,]    0    0    0    0    0    0    0    0    0     0     0
 [4,]    0    0    0    0    0    0    0    0    0     0     0
 [5,]    0    0    0    0    0    0    0    0    0     0     0
 [6,]    0    0    0    0    0    0    0    0    0     0     0
 [7,]    0    0    0    0    0    0    0    0    0     0     0
 [8,]    0    0    0    0    0    0    0    0    0     0     0
 [9,]    0    0    0    0    0    0    0    0    0     0     0
[10,]    0    0    0    0    0    0    0    0    0     0     0
[11,]    0    0    0    0    0    0    0    0    0     0     0

It looks like there is an issue with larger matrices when calling the diag
function, and it has nothing to do with WGCNA.
On 8/20/13 9:43 AM, "Peter Langfelder" <peter.langfelder at gmail.com> wrote:

            
#
Hi Samuel,

WGCNA currently does not support calculations with matrices larger
than the old R limit, and it will take some time to update it to
support the large matrices. Furthermore, WGCNA relies on other
functions (most notably hclust) that would have to be updated as well
to support long vectors.

In the meantime I suggest using the "blockwise" functions to handle
large data sets, or, if possible, reducing the number of genes to less
than the old limit of 46340 or so.
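
For reference, the "old limit" follows from the maximum vector length before R 3.0.0, which was 2^31 - 1 elements; a square matrix therefore could have at most floor(sqrt(2^31 - 1)) rows. A minimal sketch of that arithmetic (the variable names are only illustrative):

```r
# Largest square matrix whose element count fits in a pre-3.0.0 vector
# (at most 2^31 - 1 elements, which equals .Machine$integer.max).
max_elements <- .Machine$integer.max
max_dim <- floor(sqrt(max_elements))

max_dim                          # 46340
max_dim^2 <= max_elements        # TRUE
(max_dim + 1)^2 <= max_elements  # FALSE
```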

Sorry I can't be of more help.

Best,

Peter

On Tue, Aug 20, 2013 at 10:42 AM, Shelton, Samuel
<SheltonS at stemcell.ucsf.edu> wrote:
#
Hi Peter,


But this is still an issue below the old R limit. I just tried the same
with a matrix of 30000*30000 and I see the same problem. This never used
to happen with R v2.15.2 and we could regularly build similarity matrices
of 45000*45000. This behavior of filling up the bottom of the matrix with
0's after calling diag is only happening with v3.0.0 and v3.0.1.

As I said I don't think that this is an issue with WGCNA but it has
implications for WGCNA because it limits the number of genes to be
included in network construction.

Thanks

Sam
On 8/20/13 10:53 AM, "Peter Langfelder" <peter.langfelder at gmail.com> wrote:

            
#
On Aug 20, 2013, at 19:42 , Shelton, Samuel wrote:

            
Thanks. I can condense this to
[1] 23169 23169
[1]     0 23170

and the fact that 2^14.5 is 23170.48 is not likely to be a coincidence...
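
The commands behind the two outputs above were not preserved in the archive. A sketch of a check that would give output of this shape, assuming a matrix of ones whose diagonal is zeroed and whose column sums are then inspected (the helper name `diag_check` is hypothetical, not taken from the thread):

```r
# Hypothetical helper: fill an n x n matrix with ones, zero the diagonal,
# and return the range of the column sums. When diag<-() behaves correctly
# every column sums to n - 1, so the result is c(n - 1, n - 1).
diag_check <- function(n) {
  M <- matrix(1, n, n)
  diag(M) <- 0
  range(colSums(M))
}

diag_check(5)  # small case: 4 4
# diag_check(23171) needs roughly 4.3 GB of RAM for the matrix alone
```

A lower bound of 0 in the result would mean that at least one whole column was zeroed, which matches the all-zero block reported at the bottom of the large matrix.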

It is only happening with some of my builds, though. In particular, my MacPorts build of 3.0.1 does not have the problem on Snow Leopard, nor does the CRAN build of 3.0.0, still on Snow Leopard. It takes forever to check on a 4GB machine....
#
On 21/08/2013 13:45, peter dalgaard wrote:
A much faster check is to look at M[1:3, 1:3]
Note that does not use the diag() function but diag<-(), which is 
essentially unchanged since 2.15.x (the error detection was moved above 
an expensive calculation).

It works correctly on x86_64 Linux and Solaris.  I suspect a 
platform-specific issue in

         x[cbind(i, i)] <- value
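
For readers unfamiliar with this form of indexing: a two-column matrix used as an index selects one element per row of the index, so `cbind(i, i)` addresses the diagonal entries. A small sketch on a 4 x 4 matrix (the size is chosen only for illustration):

```r
x <- matrix(1, 4, 4)
i <- seq_len(4)

# Each row of cbind(i, i) is a (row, column) pair: (1,1), (2,2), (3,3), (4,4)
x[cbind(i, i)] <- 0

diag(x)  # 0 0 0 0
sum(x)   # 12: the off-diagonal entries are untouched
```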
#
On Aug 21, 2013, at 16:00 , Prof Brian Ripley wrote:

            
That doesn't show the issue for larger values of 23171, though.
Likely. I'm not seeing it on the iMac/SnowLeopard, only on the MacPro/Lion. I'm upgrading the MacPorts R on the MacPro now to see whether it has issues too, but of course that reinstalls everything but the kitchen sink...
#
Thanks for looking into this for me.

I made a mistake with the version of OS X I was using; it is actually 10.8.4 Mountain Lion. Interestingly, it was doing the same on this particular iMac with Snow Leopard, and that was the reason that I upgraded it to Mountain Lion. I found the same problem with v3.0.0 and with 3.0.1, both with Snow Leopard and Mountain Lion. 2.15.2, which is the same version that we use on our cluster, did not give this behaviour on either Snow Leopard or Mountain Lion.

Please let me know if you want me to do any more testing as I have access to machines with lots of ram.

Sam

On Aug 21, 2013, at 5:46 AM, "peter dalgaard" <pdalgd at gmail.com> wrote:

            
#
On Aug 21, 2013, at 16:39 , peter dalgaard wrote:

            
Whoops. I don't know what I was thinking there. I seem to have suppressed all memory of the hard disk replacement on the iMac, and its aftereffects. Both machines are in fact running Lion! That makes things even odder...
#
In case this is helpful, I don't see this issue on my Mac Pro with OSX
version 10.7.5. Details below.
[1] 23170 23170
R version 3.0.1 (2013-05-16)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base
$OS.type
[1] "unix"

$file.sep
[1] "/"

$dynlib.ext
[1] ".so"

$GUI
[1] "X11"

$endian
[1] "little"

$pkgType
[1] "mac.binary"

$path.sep
[1] ":"

$r_arch
[1] ""
$platform
[1] "x86_64-apple-darwin10.8.0"

$arch
[1] "x86_64"

$os
[1] "darwin10.8.0"

$system
[1] "x86_64, darwin10.8.0"

$status
[1] ""

$major
[1] "3"

$minor
[1] "0.1"

$year
[1] "2013"

$month
[1] "05"

$day
[1] "16"

$`svn rev`
[1] "62743"

$language
[1] "R"

$version.string
[1] "R version 3.0.1 (2013-05-16)"

$nickname
[1] "Good Sport"
On Wed, Aug 21, 2013 at 12:51 PM, peter dalgaard <pdalgd at gmail.com> wrote:
#
On 21/08/2013 15:00, Prof Brian Ripley wrote:
I have tracked this down to an issue with memcpy on vectors of 2^32 or 
more bytes.  That very likely explains why it appears in some OS X 
builds and not others (depending on the compiler and libc used), and not 
on other platforms.

I am looking into a workaround that only uses smaller sections for 
memcpy without losing all the performance gains.
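
The 2^32-byte threshold lines up with the dimensions reported earlier in the thread. A sketch of the arithmetic, assuming 8 bytes per double, ignoring the small vector header, and assuming the two condensed outputs correspond to n = 23170 and n = 23171:

```r
bytes_per_double <- 8
n_ok  <- 23170  # assumed size of the last matrix that behaved correctly
n_bad <- 23171  # assumed size of the first matrix that showed the problem

bytes_per_double * n_ok^2  <  2^32  # TRUE: 4294791200 bytes
bytes_per_double * n_bad^2 >= 2^32  # TRUE: 4295161928 bytes
sqrt(2^32 / bytes_per_double)       # about 23170.48, i.e. 2^14.5
```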