
How important is set.seed

Jeff Newmiller makes an interesting point about distributed processing, but I don't know how to use the usual pseudo-random processes to obtain deterministic results when I don't know how the data will be sharded. You might have to replace pseudo-random sampling with deterministic sampling using a hash of something involving the unique key. Then the selection of a salt is the equivalent of a call to set.seed in non-parallel processing. The results should be the same as long as you fix the data set and the salt, and then you can test sensitivity to changes in the salt.
Jorgen Harmse
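
The hash-based sampling described above can be sketched roughly as follows. This is only an illustration in Python, not code from the thread: the function names (keep_row, sample_keys), the key format, the salt strings, and the choice of SHA-256 are all assumptions made for the example. The same idea should carry over to R with any stable string hash.

```python
import hashlib

def keep_row(key, salt, fraction):
    """Decide from the salted hash of the unique key whether a row is sampled."""
    digest = hashlib.sha256(f"{salt}:{key}".encode("utf-8")).digest()
    # Turn the first 8 bytes of the hash into a roughly uniform number between 0 and 1.
    u = int.from_bytes(digest[:8], "big") / 2**64
    return u < fraction

def sample_keys(keys, salt, fraction):
    """Return the sampled keys of one shard; needs no shared RNG state."""
    return {k for k in keys if keep_row(k, salt, fraction)}

# Hypothetical data set identified by unique keys.
keys = [f"id-{i}" for i in range(10_000)]

# Two different ways of sharding the same data.
shards_a = [keys[:2_500], keys[2_500:]]
shards_b = [keys[0::3], keys[1::3], keys[2::3]]

sample_a = set().union(*(sample_keys(s, "salt-1", 0.1) for s in shards_a))
sample_b = set().union(*(sample_keys(s, "salt-1", 0.1) for s in shards_b))

# Same data and same salt give the same sample, however the data is split.
same_across_shardings = sample_a == sample_b

# Changing the salt plays the role of changing the seed.
sample_other_salt = sample_keys(keys, "salt-2", 0.1)
```

Because each row's inclusion depends only on its key and the salt, the order of processing and the number of workers do not matter. Rerunning with a few different salts gives a simple sensitivity check.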


From: Neha gupta <neha.bologna90 at gmail.com>
To: "Ebert,Timothy Aaron" <tebert at ufl.edu>
Cc: Jeff Newmiller <jdnewmil at dcn.davis.ca.us>, "r-help at r-project.org"
        <r-help at r-project.org>
Subject: Re: [R] How important is set.seed

Thank you all.

Actually, I need set.seed because I have to evaluate the consistency of
the feature selection produced by different models, so I think using a
seed is recommended for this.

Warm regards
On Tuesday, March 22, 2022, Ebert,Timothy Aaron <tebert at ufl.edu> wrote:

End of R-help Digest, Vol 229, Issue 20