Skip to content
Prev 391080 / 398506 Next

How important is set.seed

First off, "ML models" do not all use random numbers (for prediction I would guess very few of them do). Learn and pay attention to what the functions you are using do.

Second, if you use random numbers properly and understand the precision that your specific use case offers, then you don't need to use set.seed. However, in practice, using set.seed can allow you to temporarily avoid chasing precision gremlins, or set up specific test cases for testing code, not results. It is your responsibility to not let this become a crutch... a randomized simulation that is actually sensitive to the seed is unlikely to offer an accurate result.

Where to put set.seed depends a lot on how you are performing your simulations. In general each process should set it once uniquely at the beginning, and if you use parallel processing then use the features of your parallel processing framework to insure that this happens. Beware of setting all worker processes to use the same seed.
On March 21, 2022 5:03:30 PM PDT, Neha gupta <neha.bologna90 at gmail.com> wrote: