Dear all,

I am currently having a weird problem with a large-scale optimization routine. It would be nice to know whether any of you have run into something similar, and how you solved it.

I apologize in advance for not providing an example, but I think the non-reproducibility of the error may itself be a key point of this problem.

Simplest possible description of the problem: I have two functions, g(X) and f(v).

g(X):
i) takes a large matrix X as input;
ii) derives four other matrices from X (I'll call them A, B, C and D), then saves them to disk for debugging purposes.

Then f(v):
iii) loads A, B, C and D from disk;
iv) calculates the log-likelihood, which varies according to a vector of parameters, v.

My target application is quite big (X is a 40000x40000 matrix), so I created the following versions to test the code, the math and the parallelization:

#1) A simulated example with X being 100x100
#2) A scaled-down version of the target application, with X being 4000x4000
#3) The target application, with X being 40000x40000

When I use qsub to submit the job, using exactly the same code and processing cluster, #1 and #2 run flawlessly, so no problem there. These results tell me that the code, math and parallelization are fine.

For application #3, the optimizer converges to a vector v*. However, when I manually load A, B, C and D from disk and calculate f(v*), the value I get is completely different. For example:
- the qsub job says v* = c(0, 1, 2, 3) is a minimum with f(v*) = 1;
- when I manually load A, B, C and D from disk and calculate f(v*) on the exact same machine, with the same libraries and environment variables, I get f(v*) = 1000.

This is very confusing behavior. In theory the size of X should not matter, but things seem to become unstable as the dimension grows. The main obstacle to debugging is that g(X) for case #3 takes two hours to run, and I am completely lost on how to find the cause of the problem. Do you have any general advice?
Thank you very much in advance for any suggestions you might have!

Best regards,
Arthur
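A minimal sketch of the two-step structure described above; the derivations of A, B, C and D are placeholders (the thread never shows the real computations), and only the save/load pattern mirrors the description:

```r
# Sketch of the described pipeline. The derivations of A, B, C, D are
# hypothetical stand-ins; only the save/load structure follows the thread.
g <- function(X, path = "abcd.rds") {
  mats <- list(A = crossprod(X),     # placeholder derivations for the
               B = X + t(X),         # four matrices obtained from X
               C = X %*% X,
               D = diag(diag(X)))
  saveRDS(mats, path)                # saved to disk for debugging
}

f <- function(v, path = "abcd.rds") {
  m <- readRDS(path)                 # A, B, C, D reloaded from disk
  # log-likelihood as a function of v; toy placeholder expression:
  sum(diag(m$A)) - sum(v^2)
}
```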
Possible causes of unexpected behavior
7 messages · Eric Berger, Arthur Fendrich
Please confirm that when you do the manual load and check whether f(v*) matches the result from qsub, the check succeeds for cases #1 and #2 but fails only for #3.
On Fri, Mar 4, 2022 at 10:06 AM Arthur Fendrich <arthfen at gmail.com> wrote:
______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Dear Eric,

Thank you for the response. Yes, I can confirm that; please see the behavior below. For #1, the results are identical. For #2, they are not identical but very close. For #3, they are completely different.

Best regards,
Arthur

--
For #1:
- qsub execution:
[1] "ll: 565.7251"
[1] "norm gr @ minimum: 2.96967368608131e-08"
- manual check:
f(v*): 565.7251
gradient norm at v*: 2.969674e-08

For #2:
- qsub execution:
[1] "ll: 14380.8308"
[1] "norm gr @ minimum: 0.0140857561408041"
- manual check:
f(v*): 14380.84
gradient norm at v*: 0.01404779

For #3:
- qsub execution:
[1] "ll: 14310.6812"
[1] "norm gr @ minimum: 6232158.38877002"
- manual check:
f(v*): 97604.69
gradient norm at v*: 6266696595

On Fri, Mar 4, 2022 at 09:48, Eric Berger <ericjberger at gmail.com> wrote:
Can you confirm that you have a distributed calculation running in parallel? Have you determined that it is thread-safe? How? Your check on the smaller examples may not have ruled out such possibilities.
On Fri, Mar 4, 2022 at 11:21 AM Arthur Fendrich <arthfen at gmail.com> wrote:
Dear Eric,

Yes, I can confirm that I have distributed calculations running in parallel. I am not sure this precisely answers the thread-safety question, since I am not familiar with that definition, but what I do is:

i) First, chunks of A, B, C and D are calculated from X in parallel by the worker nodes.
ii) Second, all the chunks are combined on my master node, and the final A, B, C and D are saved to disk.
iii) Then, still on the master node, I optimize f(v) using the final A, B, C and D.

When I debug, I skip steps i) and ii) and check only step iii) manually, by loading A, B, C and D from disk and evaluating f(v*). Does that seem correct?

Best regards,
Arthur

On Fri, Mar 4, 2022 at 10:33, Eric Berger <ericjberger at gmail.com> wrote:
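Steps i) and ii) as described could be sketched with the parallel package (the names, chunking and chunk contents are hypothetical; the real code runs across cluster nodes rather than local workers):

```r
# Sketch of steps i)-ii): per-chunk contributions computed in parallel on
# workers, then combined and saved on the master. All names hypothetical.
library(parallel)

cl <- makeCluster(4)                   # local stand-in for the worker nodes
chunks <- parLapply(cl, 1:4, function(i) {
  Xi <- matrix(rnorm(100), 10, 10)     # stand-in for the i-th block of X
  list(A = crossprod(Xi))              # partial contribution to A
})
stopCluster(cl)

A <- Reduce(`+`, lapply(chunks, `[[`, "A"))  # combine chunks on the master
saveRDS(A, "A.rds")                          # final matrix saved to disk
```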
If I understand correctly, steps i) and ii) can be ignored, i.e. we just focus on step iii) with A, B, C and D fixed. You do the optimization of f(v) to calculate, say, v* = argmin f(v). This optimization is single-threaded.

(A) In that case, I suggest you add some logging so that for each call to f() you record its input and output. Then you can re-confirm your validation test, i.e. that the "manual" calculation of f(v*) gives a different result from what is found in the log file.

(B) If (A) doesn't lead you anywhere: re-reading your original description of the process, it seems that the time-consuming part is creating A, B, C and D. If the evaluation of f(v) is not overly time-consuming, then run the optimization under valgrind. It is possible that you are depending on some uninitialized variables, or trashing memory somewhere.
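Suggestion (A) could be implemented with a small wrapper around the objective, so that every evaluation is recorded (a sketch only; the objective and starting point below are toy placeholders, not the real f):

```r
# Sketch of suggestion (A): log the input and output of every call to f().
make_logged <- function(f, logfile = "f_calls.log") {
  function(v) {
    val <- f(v)
    cat(sprintf("v = [%s]  f(v) = %.10g\n",
                paste(signif(v, 8), collapse = ", "), val),
        file = logfile, append = TRUE)
    val
  }
}

f_toy <- function(v) sum(v^2)               # stand-in for the real f(v)
res <- optim(c(1, 1), make_logged(f_toy))   # every evaluation is now logged
```

Comparing the last logged line against the manual f(v*) then pinpoints exactly where the two computations diverge.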
On Fri, Mar 4, 2022 at 11:54 AM Arthur Fendrich <arthfen at gmail.com> wrote:
Dear Eric,

I followed your suggestion (A) and I believe I finally got to the cause of the problem. It turns out that I was not exporting two environment variables for step iii). Because that part of the code does not run in parallel, I had simply been ignoring them:
export OMP_NUM_THREADS=1
export OPENBLAS_NUM_THREADS=1
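Spelled out as a shell fragment (the R script name is hypothetical): the manual check only matches the batch run when both variables are exported in the session before R starts.

```shell
# Pin OpenMP and OpenBLAS to one thread before launching R, so the manual
# check runs under the same BLAS threading as the qsub job did.
export OMP_NUM_THREADS=1
export OPENBLAS_NUM_THREADS=1
echo "OMP=$OMP_NUM_THREADS OPENBLAS=$OPENBLAS_NUM_THREADS"
# prints: OMP=1 OPENBLAS=1
# Rscript check_fvstar.R   # hypothetical script doing the manual f(v*) check
```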
When I do that, the results change, for some reason that I still have to investigate further. What I get now seems coherent (below). Thank you again for the help.

Best regards,
Arthur

## -- Results for optim(f) --

Case: qsub, with or without the two variables (same result for both):
- initial guess: v = [0 0 0 0 0 0 0 0 0], f(v) = 599765.9
- solution: v = [0.3529 -6.4176 -0.0271 -0.0066 0.0013 -0.0172 -0.0198 -0.0034 -0.0171], f(v) = 14310.68

Case: manual, without the two variables:
- initial guess: v = [0 0 0 0 0 0 0 0 0], f(v) = 643417.1
- solution: v = [1.5669 -6.2815 -0.0091 -0.0022 0.0004 -0.0059 -0.0066 -0.0014 -0.005], f(v) = 19712.85

Case: manual, with the two variables:
- initial guess: v = [0 0 0 0 0 0 0 0 0], f(v) = 599765.9
- solution: v = [0.3529 -6.4176 -0.0271 -0.0066 0.0013 -0.0172 -0.0198 -0.0034 -0.0171], f(v) = 14310.68

On Fri, Mar 4, 2022 at 11:13, Eric Berger <ericjberger at gmail.com> wrote: