random numbers with constraints - R-help

Wed, Jan 27, 2021 12:03 AM

Hi,
I would like to generate random numbers in R with some constraints:
- my vector of numbers must contain 410 values;
- min value must be 9.6 and max value must be 11.6;
- sum of vector's values must be 4200.
Is there a way to do this in R?
And is it possible to generate this series in such a way that it follows a
specific distribution form (for example exponential)?
Thank you in advance,

D.

Ralf Goertz

Wed, Jan 27, 2021 1:50 AM

On Wed, 27 Jan 2021 09:03:15 +0100,
Denis Francisci <denis.francisci at gmail.com> wrote:

In principle it should be possible. But I guess you are asking too much
with three given values, considering that the exponential distribution
has only one parameter. For instance, if you had been given only min
and max and wanted a normal distribution, you could simply take
410 random values from a standard normal: x=rnorm(410), then center them:
x=x-mean(x), then scale them so that their span equals the one between
your given max (M) and min (m) values: x=x*(M-m)/(max(x)-min(x)), and
finally shift them so that the minimum becomes m: x=x-min(x)+m. Note,
however, that what you are allowed to do with your vector of random
numbers depends on the distribution, if you want the result to still
follow that type of distribution.
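
The steps described above can be put together as a short sketch. It
assumes m = 9.6 and M = 11.6 from the original question, and the seed is
an arbitrary choice made only for reproducibility. Note that nothing in
this recipe controls the sum of the values.

```r
set.seed(1)                 # arbitrary seed, only for reproducibility
m <- 9.6                    # required minimum (from the question)
M <- 11.6                   # required maximum (from the question)

x <- rnorm(410)                          # 410 standard normal values
x <- x - mean(x)                         # center
x <- x * (M - m) / (max(x) - min(x))     # scale so the span is M - m
x <- x - min(x) + m                      # shift so the minimum is m

length(x)
range(x)
sum(x)                      # not constrained by this recipe
```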

Abby Spurdle

Wed, Jan 27, 2021 1:57 AM

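# Draw 410 uniform values once, then rescale them so that the smallest
# is exactly 0 and the largest is exactly 1.
u <- runif (410)
u <- (u - min (u) ) / diff (range (u) )

# Map u through an exponential distribution truncated to [9.6, 11.6]:
# u = 0 gives (approximately) 9.6 and u = 1 gives (approximately) 11.6.
constrained.sample <- function (rate)
{   plim <- pexp (c (9.6, 11.6), rate)
    p <- plim [1] + diff (plim) * u
    qexp (p, rate)
}

# Difference between the sum of the sample and the target sum of 4200.
diff.sum <- function (rate)
    sum (constrained.sample (rate) ) - 4200

# Find the rate at which that difference is zero.
rate <- uniroot (diff.sum, c (1, 2) )$root
q <- constrained.sample (rate)

# Check the three constraints.
length (q)
range (q)
sum (q)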



______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Denis Francisci

Wed, Jan 27, 2021 8:02 AM

Wonderful!
This is exactly what I need!
Thank you very much!!

Denis




Abby Spurdle

Wed, Jan 27, 2021 11:48 AM

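I note that there's a possibility of floating point errors.
If all values have one digit after the decimal point, you could replace
qexp (p, rate) with round (qexp (p, rate), 1).

However, uniroot will sometimes fail, depending on the input: for
example, it stops with an error if diff.sum does not change sign over
the search interval for the particular u that was drawn.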
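
A minimal sketch of the rounded variant, with a check of the interval
ends before uniroot is called. The names rounded.sample and ends, the
seed, and the guard itself are illustrative assumptions, not part of the
code posted earlier. Because the rounded values are multiples of 0.1,
the sum can only land near 4200, not necessarily exactly on it.

```r
set.seed(42)   # arbitrary seed, only so the sketch is reproducible
u <- runif(410)
u <- (u - min(u)) / diff(range(u))

# Same mapping as before, but rounded to one decimal place
rounded.sample <- function(rate) {
    plim <- pexp(c(9.6, 11.6), rate)
    round(qexp(plim[1] + diff(plim) * u, rate), 1)
}
diff.sum <- function(rate) sum(rounded.sample(rate)) - 4200

# uniroot needs values of opposite sign at the two ends of the interval
ends <- c(diff.sum(1), diff.sum(2))
if (prod(sign(ends)) < 0) {
    rate <- uniroot(diff.sum, c(1, 2))$root
    q <- rounded.sample(rate)
    print(range(q))
    print(sum(q))   # near 4200, but not necessarily equal to it
} else {
    message("no sign change on [1, 2] for this u; try another interval")
}
```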


Denis Francisci

Thu, Jan 28, 2021 12:07 AM

Thanks again for your help.
One digit after the decimal point is enough for my purposes, so I can
round the output of qexp, although possible floating-point errors are
not a problem for me anyway.
Thank you very, very much,

Denis





Martin Maechler

Thu, Jan 28, 2021 7:42 AM

> I note that there's a possibility of floating point errors.
> If all values have one digit after the decimal point, you could replace:
> qexp (p, rate) with round (qexp (p, rate), 1).

> However, sometimes uniroot will fail, due to problems with input.

I also think the constrained.sample() function should not
depend on the global variable 'u',
but rather depend on 'n' and construct 'u' from runif(n).

Martin
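
One way to read this suggestion is sketched below. The signature, the
argument names lower and upper, and the rate of 1.2 used in the call are
illustrative assumptions. Each call draws a fresh u, so this version on
its own does not constrain the sum.

```r
# Hypothetical variant: n and rate are arguments, and u is built inside
# the function instead of being read from the global environment.
constrained.sample <- function(n, rate, lower = 9.6, upper = 11.6) {
    u <- runif(n)
    u <- (u - min(u)) / diff(range(u))     # smallest -> 0, largest -> 1
    plim <- pexp(c(lower, upper), rate)
    qexp(plim[1] + diff(plim) * u, rate)
}

q <- constrained.sample(410, rate = 1.2)   # rate chosen arbitrarily here
length(q)
range(q)
```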


Abby Spurdle

Thu, Jan 28, 2021 11:58 AM

I recognize the problems with global data.
And my code could certainly be improved.

However, I also note that the random numbers (ignoring
transformations) need to stay constant while the rate is being computed.
Otherwise, my algorithm wouldn't work well.

As it is, rounding operations can cause "jumps".
Generating new random numbers for each value of rate would make those
jumps bigger.

I'm thinking a better solution would be to put the entire code inside
a single function.
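
A rough sketch of what such a single function might look like. The
function name, its arguments, the default search interval c(0.01, 2)
and the tolerance are all assumptions made for illustration; the
interval in particular is only meant to suit the numbers in this thread.
The uniform values are drawn once inside the function and held fixed
while the rate is solved for.

```r
# Hypothetical wrapper: everything in one function, no global data.
constrained.exp.sample <- function(n, lower, upper, total,
                                   interval = c(0.01, 2)) {
    # u is drawn once and stays constant during root finding
    u <- runif(n)
    u <- (u - min(u)) / diff(range(u))

    sample.at <- function(rate) {
        plim <- pexp(c(lower, upper), rate)
        qexp(plim[1] + diff(plim) * u, rate)
    }
    diff.sum <- function(rate) sum(sample.at(rate)) - total

    # uniroot still stops with an error if diff.sum has the same sign
    # at both ends of the interval
    rate <- uniroot(diff.sum, interval, tol = 1e-10)$root
    list(rate = rate, sample = sample.at(rate))
}

set.seed(2021)   # arbitrary seed, only for reproducibility
res <- constrained.exp.sample(410, 9.6, 11.6, 4200)
length(res$sample)
range(res$sample)
sum(res$sample)
```

Returning the fitted rate together with the sample makes it easy to
inspect which rate the solver settled on.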

Although, having said all of that, the uniroot approach is not the best
here. A more "robust" root-finding algorithm (or solver) would be far
better, but I'm not sure what that would mean, exactly...

Given the possible relevance of generating constrained samples, a
discussion on what a robust solver means could be interesting...
I'll leave it to you to agree or disagree and, if in
agreement, to suggest the best forum for such a discussion.

