[FORGED] Generating random data with non-linear correlation between two variables

Not really...

You can do lots of stuff with random number generation in R, but it is not clear to what extent we should take your requirements seriously. E.g., you say you want the range of v1 to be 500-1500 and the mean to be 1100. It is easy enough to generate uniform random numbers between 500 and 1500 (e.g. with runif()), but the theoretical mean of v1 is then 1000, not 1100. The sample mean of one such draw:
[1] 985.7375
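
For illustration, the uniform case could look something like the sketch below. The seed and the sample size n are assumptions of mine, so the sample mean will not reproduce the figure quoted above exactly.

```r
set.seed(1)   # arbitrary seed, only so the sketch is reproducible
n <- 1000     # assumed sample size

# Uniform draws on [500, 1500]
v1 <- runif(n, min = 500, max = 1500)

range(v1)     # stays inside 500..1500
mean(v1)      # near the theoretical mean (500 + 1500) / 2 = 1000
```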

To increase the mean, one could play around with scaled beta distributions, which gave, e.g., a sample mean of
[1] 1093.685
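
One way to set up such a scaled beta, as a sketch only: a Beta(shape1, shape2) variate has mean shape1 / (shape1 + shape2), so rescaling it to 500-1500 and choosing the shapes in the ratio 3:2 gives a theoretical mean of 1100. The particular shapes 3 and 2, the seed, and n are assumptions of mine and not necessarily what produced the number above.

```r
set.seed(1)   # arbitrary seed
n <- 1000     # assumed sample size

# Beta(3, 2) has mean 3 / (3 + 2) = 0.6; rescale from [0, 1] to [500, 1500]
v1 <- 500 + 1000 * rbeta(n, shape1 = 3, shape2 = 2)

range(v1)     # still inside 500..1500
mean(v1)      # near the theoretical mean 500 + 1000 * 0.6 = 1100
```

Any other pair of shapes with the same 3:2 ratio keeps the mean at 1100 and only changes how concentrated the values are around it.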

but it is not clear how you ever passed the same requirement to Oracle. 

Next, you wanted v2 with the following requirements

v2 between 300 and 850
mean(v2) == 400
v2 nonlinearly related to v1

If we postulate a relation where the conditional expectation is, like, 
E(v2 | v1) = a0 + a1 * v1 - a2 * v1^2, and v1 is as above, then the constants can be twiddled to satisfy E(v2) = 400. Then to generate random output with that mean and range, one could again use a scaled beta distribution. 
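
A rough sketch of how that twiddling might go, under assumptions that are entirely mine: v1 is a scaled Beta(3, 2), a1 and a2 are picked by hand so that the conditional mean stays inside 300-850, and a0 is then solved from E(v2) = a0 + a1 * E(v1) - a2 * E(v1^2) = 400. The concentration constant k, which controls the scatter of v2 around its conditional mean, is also just an illustrative choice.

```r
set.seed(1)   # arbitrary seed
n <- 1000     # assumed sample size

# v1: scaled Beta(3, 2) on [500, 1500] (assumed choice, theoretical mean 1100)
s1 <- 3; s2 <- 2
v1 <- 500 + 1000 * rbeta(n, s1, s2)

# Theoretical first and second moments of v1
Eb    <- s1 / (s1 + s2)
Vb    <- s1 * s2 / ((s1 + s2)^2 * (s1 + s2 + 1))
Ev1   <- 500 + 1000 * Eb
Ev1sq <- 1000^2 * Vb + Ev1^2

# Hand-picked curvature (assumption); a0 is solved so that E(v2) = 400
a1 <- 0.8
a2 <- 4e-4
a0 <- 400 - a1 * Ev1 + a2 * Ev1sq

# Conditional mean E(v2 | v1); with these constants it lies in [320, 420]
m <- a0 + a1 * v1 - a2 * v1^2

# v2: beta scaled to [300, 850] whose mean matches m for each v1
p  <- (m - 300) / 550   # conditional mean on the 0..1 scale
k  <- 20                # assumed concentration; larger k means less scatter
v2 <- 300 + 550 * rbeta(n, shape1 = k * p, shape2 = k * (1 - p))

mean(v2)    # near 400
range(v2)   # inside 300..850
```

With a mean of 400 on a range of 300-850, v2 is necessarily skewed to the right, so the upper part of the range will be visited only rarely.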

It is, however, not at all clear that this is in fact the kind of solution that you want....

-pd