[R-pkg-devel] Different unit test results on MacOS

Mon, Mar 2, 2026 8:06 PM

Hi,

Quick follow-up: I edited the tests on a branch of the repository so that they print various values for debugging. The simulated datasets generated in the tests are the same on all OSs, but the values the algorithm holds at the point where it starts the EM stage are not. As I am short on time for now, I am going to convert the unit tests of the reordering step so that they apply the reordering to a saved set of pre-run clustering output objects. In the long run, I need to diagnose where the starting points generated on the three OSs diverge.
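
A minimal sketch of that fixture-based approach, using base R only. The object below is a made-up stand-in for a saved clustering result (not real clustord output beyond the $rowc values quoted later in this thread), and ordering by order() is only a hypothetical placeholder for the package's reordering function:

```r
# Hypothetical stand-in for a clustering result that was run once and saved.
fixture <- list(
  out_parlist = list(
    rowc = c(rowc_1 = 0, rowc_2 = 0.08713465,
             rowc_3 = -0.26123294, rowc_4 = 0.05820879)
  )
)

# Save the object once; a real test suite would commit this file to the repo.
fixture_path <- tempfile(fileext = ".rds")
saveRDS(fixture, fixture_path)

# In the test itself, load the saved object instead of re-running clustering.
loaded <- readRDS(fixture_path)

# Placeholder for the deterministic reordering step: indices of the clusters
# sorted by increasing row cluster effect.
reorder_idx <- order(loaded$out_parlist$rowc)
print(reorder_idx)
```

Because the saved object is identical on every OS, any remaining failure would then point at the reordering step itself rather than at the clustering run.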

So I have no need for further assistance right now, but thanks for reading.

Louise

From: Louise McMillan <louise.mcmillan at vuw.ac.nz>
Sent: Tuesday, 3 March 2026 11:36 am
To: r-package-devel at r-project.org <r-package-devel at r-project.org>
Subject: Different unit test results on MacOS

Hi,

My package is called clustord (latest version on GitHub at github.com/vuw-clustering/clustord; version 2.0.0 was pushed to CRAN yesterday, 2nd March 2026).

I have an odd problem: I added a function to my package and an extensive set of unit tests for it, and the unit tests run correctly on Windows and Linux, but half of one single file out of the three test files runs differently on MacOS than it does on Windows and Linux, and fails the tests.

The package is a clustering package, and the new function is designed to be able to reorder the output clusters in order of their cluster effect sizes. The unit tests run the clustering algorithm and then the reorder function on a simulated dataset and then check the output orderings against what I've manually worked out the ordering should be.

The dataset simulation process uses randomness, and the clustering algorithm uses randomness, but the reordering does not. Each section of the test script starts with set.seed() so that the dataset is always the same, and that seed should also fix the output of the clustering algorithm that runs just after the dataset simulation. The results should therefore be the same on all operating systems. This is why I'm so puzzled that almost all of the versions of this test behave the same on MacOS as on Windows and Linux, but this particular version runs differently on MacOS, even though I set the seed at the start of simulating the dataset for this specific test run.
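
One way to check the seeding assumption on each OS is to log the RNG settings and a few draws at full precision, so that the CI output can be compared across platforms. This is only a sketch: simulate_draws() is a hypothetical helper that mimics the kind of draws used in the test, not a function from the package.

```r
# Record the RNG configuration; differing settings would change the draws.
rng_settings <- RNGkind()
print(rng_settings)

# Hypothetical helper: seed, then draw values similar to the test simulation.
simulate_draws <- function(seed, n = 30, p = 5) {
  set.seed(seed)
  list(
    y   = sample(1:3, n * p, replace = TRUE),
    xr1 = runif(n, min = 0, max = 2)
  )
}

first  <- simulate_draws(30)
second <- simulate_draws(30)

# With the same seed and RNG settings, the draws should repeat exactly.
print(identical(first, second))

# Print a few values at full precision for comparison between CI logs.
print(format(first$xr1[1:3], digits = 17))
```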

Since I do not have a Mac, it is difficult for me to debug it, though I can see the error when the push to Github triggers the Github Actions check, which runs on multiple OSs.

The start of the section of the test script that's failing is:

------------------------------
    library(clustord)
    ## Dataset simulation
    set.seed(30)
    n <- 30
    p <- 5
    long_df_sim <- data.frame(Y=factor(sample(1:3,n*p,replace=TRUE)),
                              ROW=rep(1:n,times=p),COL=rep(1:p,each=n))

    xr1 <- runif(n, min=0, max=2)
    xr2 <- sample(c("A","B"),size=n, replace=TRUE, prob=c(0.3,0.7))
    xr3 <- factor(sample(1:4, size=n, replace=TRUE))

    xc1 <- runif(p, min=-1, max=1)

    long_df_sim$xr1 <- rep(xr1, times=5)
    long_df_sim$xr2 <- rep(xr2, times=5)
    long_df_sim$xr3 <- rep(xr3, times=5)
    long_df_sim$xc1 <- rep(xc1, each=30)

    ## Clustering algorithm
    # OSM results --------------------------------------------------------------
    ## Model 1 ----
    orig <- clustord(Y~ROWCLUST*xr1+xr2*xr3+COL, model="OSM", RG=4,
                     long_df=long_df_sim, nstarts=1, constraint_sum_zero = FALSE,
                     control_EM=list(maxiter=3,maxiter_start=2,keep_all_params=TRUE))
------------------------------

This section is just the dataset simulation and the clustering algorithm. The reordering checks afterwards are failing, but I think the more likely cause is that the clustering algorithm is producing a different result on the Mac, rather than the reordering (which is deterministic) behaving differently there.

If you display orig$out_parlist after running the above code, I expect the $rowc values to be

$rowc
     rowc_1      rowc_2      rowc_3      rowc_4
 0.00000000  0.08713465 -0.26123294  0.05820879
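
For anyone comparing their own output against these values, a tolerance-based comparison is safer than an exact one, since the values above are printed to only 8 decimal places. A minimal sketch, where actual_rowc is a hypothetical result made by slightly perturbing the expected values (not real output from any OS):

```r
# Expected values as printed above.
expected_rowc <- c(rowc_1 = 0, rowc_2 = 0.08713465,
                   rowc_3 = -0.26123294, rowc_4 = 0.05820879)

# Hypothetical fitted values: the expected ones plus a tiny perturbation.
actual_rowc <- expected_rowc + c(0, 2e-9, -1e-9, 3e-9)

# Exact comparison treats any difference in the last bits as a mismatch.
exact_match <- identical(actual_rowc, expected_rowc)

# all.equal() compares within a tolerance (a mean relative difference here).
close_match <- isTRUE(all.equal(actual_rowc, expected_rowc, tolerance = 1e-6))

# The largest absolute difference shows how far apart the two results are.
max_abs_diff <- max(abs(actual_rowc - expected_rowc))

print(c(exact = exact_match, close = close_match))
print(max_abs_diff)
```

A difference far larger than rounding error would suggest the runs have genuinely diverged rather than differing only in the last few digits.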

I will keep investigating this myself, but if anyone has any suggestions why the randomness might be working slightly differently on the Mac, or any other possible causes for occasional mismatches between MacOS and other OSs, I would really appreciate reading them.

Thanks very much
Louise
