Setting a different seed for each run of the code - python

I am running code that could potentially benefit from different initializations of the random number generators. I use the libraries NumPy and torch. I am using the following lines of code to set the random seed at the beginning of every iteration.
import numpy as np
import torch
seed = np.random.randint(0, 1000)
print(f"Seed: {seed}")
np.random.seed(seed)
torch.manual_seed(seed)
For some reason though, across (many) iterations I have observed that the seed is always set to one value, 688 in my case. What I do not understand is that the generation of the seed variable is not governed by the seed that is set later. So why does the same seed get set every time and how do I fix it? Thanks.

In your example, you initialize the default random number generator implicitly, because you never provide a seed for the RandomState class. In such cases, NumPy obtains the seed from an alternative source, which may not be random enough.
Furthermore, it is not considered good practice to draw a random number from a small set of numbers and use it to seed the random number generator, because the probability that you'll generate the same seed twice is high. However, if you have similar seed values and a not too good initialization, it is common practice to use a fast, tiny, but maybe not too good random number generator to create good quality seed values, or the whole initial state itself. There is no need to do this manually, though, because NumPy's legacy random implementation follows a specific case of a scientifically sound approach [1] that ensures well-separated initial states even for similar (e.g. adjacent) seed values. That is, you can seed your simulations with 0 to 1000, and the random numbers you get with NumPy in the different iterations will look completely different. You can also use this seed value to identify your calculation when you save it, or when you create statistics.
I am not sure about the implementation of the random number generator in torch, though. It seems to take a 64-bit integer. If it suits your needs, you can generate random numbers with NumPy's engine on this range and use one as the seed value. If you run 2 simulations, the probability that the 2 seed values are the same is 1/2^{64} ~ 5 * 10^{-20}.
With the example below, it is ensured that the state of NumPy's random generator is different in each iteration of the for loop, and the random state of torch is most probably different in each iteration.
import numpy as np
import torch
max_sim = 3 # how many simulations you need
for numpy_seed in range(max_sim):
    np.random.seed(numpy_seed)
    torch_seed = np.random.randint(low=-2**63,
                                   high=2**63,
                                   dtype=np.int64)
    print(torch_seed)
    torch.manual_seed(torch_seed)
    # do the rest of the simulation
# output:
# 900450186894289455
# -1530673954295414549
# -1180685649882019313
[1]: Matsumoto, Makoto; Wada, Isaku; Kuramoto, Ai; Ashihara, Hyo. "Common Defects in Initialization of Pseudorandom Number Generators"; around equation 30.
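To put rough numbers on the "probability is high" remark above, here is a minimal sketch of the birthday-style calculation. It assumes seeds are drawn uniformly and independently; the function name collision_probability is just an illustration, not part of any library.

```python
import math

def collision_probability(n_runs, n_seeds):
    """Probability that at least two of n_runs uniformly drawn seeds coincide."""
    if n_runs > n_seeds:
        return 1.0
    # Sum log(1 - k/n_seeds) so that very small probabilities keep their precision.
    log_no_collision = sum(math.log1p(-k / n_seeds) for k in range(n_runs))
    return -math.expm1(log_no_collision)

for runs in (2, 10, 50):
    small = collision_probability(runs, 1000)    # seeds from randint(0, 1000)
    large = collision_probability(runs, 2**64)   # 64-bit seeds
    print(runs, small, large)
```

With only 1000 possible seeds, a few dozen runs are already more likely than not to repeat a seed, while the 64-bit range keeps the chance negligible.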

Like @iacob, I cannot reproduce your result either, and I believe the script that sets the seed has no problem.
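If the goal is simply a different seed on every run that can still be replayed later, one option is to take the seed from the operating system's entropy source and log it. The sketch below uses only the standard library; fresh_seed is a hypothetical helper name, and the 32-bit width is an assumption chosen so that the same integer would also be a valid argument for np.random.seed (torch.manual_seed accepts it as well).

```python
import random
import secrets

def fresh_seed(bits=32):
    """Draw a seed from the operating system's entropy source."""
    return secrets.randbits(bits)

seed = fresh_seed()
print(f"Seed: {seed}")  # log it so this run can be repeated later

random.seed(seed)
first = [random.random() for _ in range(3)]

# Re-seeding with the logged value replays exactly the same numbers.
random.seed(seed)
second = [random.random() for _ in range(3)]
print(first == second)
```

Because the seed does not come from the generator that is being seeded, it is not affected by any earlier seed call elsewhere in the program.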

Related

How to keep the sequence of random numbers (normal distributions) the same? I tried random.seed(), but it didn't work

I'm trying to write some replicable Monte Carlo simulation, and need to fix the seed for the random number generator (so that when other people run it, they get exactly the same result).
I tried the following code
import numpy as np
import random
random.seed(1)
N=10
mu=[0]
sig=[[1]]
a=np.random.multivariate_normal(mu, sig, N)
print(a)
But each time I run the code, it prints a different sequence. How could this be fixed? Thanks!
random and np.random aren't the same. If you use np.random then use np.random.seed.
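As a small sketch of that fix (assuming NumPy is installed; the helper name draw is made up for the example), seeding np.random instead of random makes the sequence repeat:

```python
import numpy as np

def draw(seed, n=10):
    # Seed NumPy's own global generator; random.seed() would not affect it.
    np.random.seed(seed)
    mu = [0]
    sig = [[1]]
    return np.random.multivariate_normal(mu, sig, n)

first = draw(1)
second = draw(1)
print(np.array_equal(first, second))
```

Both calls start from the same generator state, so they return the same samples.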

Python Numpy: Random number in a loop

I have such code and use Jupyter-Notebook
for j in range(timesteps):
    a_int = np.random.randint(largest_number/2) # int version
and I get random numbers. But when I try to move part of the code into a function, I start to receive the same number in each iteration:
def create_train_data():
    np.random.seed(seed=int(time.time()))
    a_int = np.random.randint(largest_number/2) # int version
    return a_int
for j in range(timesteps):
    c = create_train_data()
Why does this happen, and how do I fix it? I think it may be because of processes in Jupyter Notebook.
The offending line of code is
np.random.seed(seed=int(time.time()))
Since you're executing in a loop that completes fairly quickly, calling int() on the time reduces your random seed to the same number for the entire loop. If you really want to manually set the seed, the following is a more robust approach.
def create_train_data():
    a_int = np.random.randint(largest_number/2) # int version
    return a_int
np.random.seed(seed=int(time.time()))
for j in range(timesteps):
    c = create_train_data()
Note how the seed is set once, before the loop; each call to np.random.randint then advances the generator's state instead of resetting it.
Note that numpy already takes care of a pseudo-random seed. You're not gaining more random results by using it. A common reason for manually setting the seed is to ensure reproducibility. You set the seed at the start of your program (top of your notebook) to some fixed integer (I see 42 in a lot of tutorials), and then all the calculations follow from that seed. If somebody wants to verify your results, the stochasticity of the algorithms can't be a confounding factor.
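The effect is easy to reproduce without NumPy. The sketch below uses the standard library's random module as a stand-in for np.random (an assumption made to keep the example self-contained); the function names are made up for illustration.

```python
import random
import time

# int(time.time()) only changes once per second, so a fast loop sees one value.
stamps = {int(time.time()) for _ in range(1000)}
print(len(stamps))  # typically 1, occasionally 2 if a second boundary is crossed

def draw_reseeding(seed, n):
    """Re-seed before every draw: the generator restarts each time."""
    out = []
    for _ in range(n):
        random.seed(seed)
        out.append(random.randrange(1000))
    return out

def draw_seed_once(seed, n):
    """Seed once, then let the generator state advance between draws."""
    random.seed(seed)
    return [random.randrange(1000) for _ in range(n)]

print(draw_reseeding(42, 5))
print(draw_seed_once(42, 5))
```

The first list repeats a single number, which is what the loop in the question does whenever the timestamp has not changed; the second list varies.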
The other answers are correct in saying that it is because of the seed. If you look at the Documentation From SciPy you will see that seeds are used to create a predictable random sequence. However, I think the following answer from another question regarding seeds gives a better overview of what it does and why/where to use it.
What does numpy.random.seed(0) do?
Hans Musgrave's answer is great if you are happy with pseudo-random numbers. Pseudo-random numbers are good for most applications but they are problematic if used for cryptography.
A common approach for getting a number that differs from run to run is seeding the random number generator with the system time before pulling the number, like you tried. However, as Hans Musgrave pointed out, if you cast the time to int, you get the time in whole seconds, which will most likely be the same throughout the loop. The correct way to seed the RNG from the clock is:
def create_train_data():
    np.random.seed()
    a_int = np.random.randint(largest_number/2) # int version
    return a_int
This works because Numpy already uses the computer clock or another source of randomness for the seed if you pass no arguments (or None) to np.random.seed:
Parameters: seed : {None, int, array_like}, optional Random seed used
to initialize the pseudo-random number generator. Can be any integer
between 0 and 2**32 - 1 inclusive, an array (or other sequence) of
such integers, or None (the default). If seed is None, then
RandomState will try to read data from /dev/urandom (or the Windows
analogue) if available or seed from the clock otherwise.
It all depends on your application though. Do note the warning in the docs:
Warning The pseudo-random generators of this module should not be used
for security purposes. For security or cryptographic uses, see the
secrets module.
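For the security-sensitive case that the warning mentions, a minimal sketch with the secrets module could look like this. The value of largest_number is a hypothetical placeholder, since the question does not show it.

```python
import secrets

largest_number = 256  # hypothetical bound, not given in the question

# randbelow(n) returns an int in [0, n) from the OS's secure random source.
a_int = secrets.randbelow(largest_number // 2)

# token_hex(16) returns 16 random bytes as 32 hexadecimal characters.
token = secrets.token_hex(16)

print(a_int, token)
```

There is nothing to seed here, which also means these values cannot be reproduced; that is the intended trade-off for cryptographic use.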

How does one choose the seed using noise.py module

How does one choose the seed for the noise module in python?
I have this bit of code:
from noise import snoise2
terrainTiles[varX][varY].set_elevation(snoise2(x=varX/20,y=varY/20,octaves=1))
And it does create proper noise; however, I am unable to change the seed. I have been searching for hours and have yet to find a solution. Thanks!
Simpler example of the function:
from noise import snoise2
print(snoise2(10,10))
SOLUTION
I found a solution separate of noise.py. I used this script I found on github: https://gist.github.com/eevee/26f547457522755cb1fb8739d0ea89a1
This also does not have a seed function, BUT, it has an unbias function so that far out coordinates still have proper noise. I used a 3 dimensional noise function where the 3rd dimension value is essentially the seed. Code shown here:
#generate world seed
worldSeed = random.randint(0, 100000000)
#generate noise objects. This is a hack: the 3rd dimension value is essentially the seed. I don't like it, but it works
elevationNoise = noise.PerlinNoiseFactory(dimension=3, octaves=1, unbias=True)
and it being applied to a value:
terrainTiles[varX][varY].set_elevation(elevationNoise(varX/20,varY/20,worldSeed)*1.15)
There is no seed parameter as such; however, there is a base parameter that you can modify, which specifies an offset for the noise coordinates. For example:
import random
from noise import snoise2
seed = random.random()
print(snoise2(10, 10, base=seed))
Here base requires a float.
So for your first example you should just be able to add base=seed to your snoise2(..) call:
terrainTiles[varX][varY].set_elevation(snoise2(x=varX/20,y=varY/20,octaves=1, base=seed))
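Both answers rely on the same idea: feed the seed in as an extra coordinate or offset, so that each seed selects a different slice of one fixed noise field. The sketch below illustrates only that idea. toy_noise is a hypothetical stand-in built from math.sin, not real Perlin or simplex noise and not the API of noise.py or of the gist.

```python
import math
import random

def toy_noise(x, y, z=0.0):
    """Hypothetical deterministic stand-in for a 3D noise function."""
    return math.sin(12.9898 * x + 78.233 * y + 0.7 * z)

def elevation_map(seed, size=4, scale=20):
    # The seed is passed as the third coordinate, so each seed picks a
    # different 2D slice of the same deterministic field.
    return [[toy_noise(x / scale, y / scale, seed) for y in range(size)]
            for x in range(size)]

world_seed = random.randint(0, 100000000)
same_a = elevation_map(world_seed)
same_b = elevation_map(world_seed)
other = elevation_map(world_seed + 1)
print(same_a == same_b, same_a == other)
```

Storing world_seed is then enough to regenerate the same terrain later.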

random: what is the default seed?

For Python 3, I can find many different places on the internet stating that the default seed for the random module is based on system time.
Is this also the case for Python 2.7? I imagine it is, because if I start two different Python processes, and in both I do import random; random.random() then the two different processes return different results.
If it does use system time, what is the actual seed used? (E.g. "number of seconds since midnight" or "number of microseconds since UNIX epoch", or ...)
If not, what is used to seed the PRNG?
This is the source code that generates the default seed for a Random object:
try:
    # Seed with enough bytes to span the 19937 bit
    # state space for the Mersenne Twister
    a = long(_hexlify(_urandom(2500)), 16)
except NotImplementedError:
    import time
    a = long(time.time() * 256) # use fractional seconds
Here _urandom is os.urandom. For more information about urandom, please check this page.
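For readers on Python 3, where long no longer exists, the same logic can be written roughly as follows. This is a hand-written translation of the excerpt above for illustration, not the actual source of any current CPython release, and default_seed is a made-up name.

```python
import os
import random
import time

def default_seed():
    """Mimic the excerpt: prefer OS entropy, fall back to the clock."""
    try:
        # 2500 bytes = 20000 bits, enough to span the 19937-bit
        # state space of the Mersenne Twister.
        return int.from_bytes(os.urandom(2500), "big")
    except NotImplementedError:
        return int(time.time() * 256)  # use fractional seconds

a = default_seed()
rng = random.Random(a)
print(a.bit_length(), rng.random())
```

So the clock is only a fallback; when os.urandom is available, the seed is not derived from the system time at all.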

Sympy reconfigures the randomness seed

Using the Python symbolic computation module "Sympy" in a simulation is proving very difficult. I need reliable, fixed inputs, and for that I use seed() from the random module.
However, every time I call a simple sympy function, it seems to overwrite the seed with a new value, so I get new output every time. I have searched a little bit and found this, but neither of them has a solution.
Consider this code:
from sympy import *
import random
random.seed(1)
for _ in range(2):
    x = symbols('x')
    equ = (x** random.randint(1,5)) ** Rational(random.randint(1,5)/2)
    print(equ)
This outputs
(x**2)**(5/2)
x**4
on the first run, and
(x**2)**(5/2)
(x**5)**(3/2)
on the second run; every time I run the script it returns new output. I need a way to fix this so that seed() is respected.
Does this help? From the docs on random:
"You can instantiate your own instances of Random to get generators that don’t share state"
Usage:
import random
# Create a new pseudo random number generator
prng = random.Random()
prng.seed(1)
This number generator will be unaffected by sympy.
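A slightly fuller sketch of the same idea, using only the standard library: the private generator keeps its own state, so calls that touch the module-level random functions in between do not change what it produces. make_exponents is a hypothetical helper that mirrors the randint(1, 5) calls in the question.

```python
import random

def make_exponents(seed, count=2):
    """Draw (power, numerator) pairs from a private, seeded generator."""
    prng = random.Random(seed)
    return [(prng.randint(1, 5), prng.randint(1, 5)) for _ in range(count)]

first = make_exponents(1)

# Disturb the shared module-level generator, as another library might do.
random.seed(999)
random.random()

second = make_exponents(1)
print(first == second)
```

In the question's loop, prng.randint(1, 5) would replace random.randint(1, 5).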
