The use of Python symbolic computation module "Sympy" in a simulation is very difficult, I need to have reliable fixed inputs, for that I use the seed() in the random module.
However every time I call a simple sympy function, it seems to overwrites the seed with a new value, thus getting new output every time. I have searched a little bit and found this. But neither of them has a solution.
Consider this code:
from sympy import *
import random
random.seed(1)
for _ in range(2):
x = symbols('x')
equ = (x** random.randint(1,5)) ** Rational(random.randint(1,5)/2)
print(equ)
This outputs
(x**2)**(5/2)
x**4
on the first run, and
(x**2)**(5/2)
(x**5)**(3/2)
On the second run, and every-time I run the script it returns new output. I need a way to fix this to enforce the use of seed().
Does this help? From the docs on random:
"You can instantiate your own instances of Random to get generators that don’t share state"
Usage:
import random
# Create a new pseudo random number generator
prng = random.Random()
prng.seed(1)
This number generator will be unaffected by sympy
Related
I'm trying to write some replicable Monte Carlo simulation, and need to fix the seed for the random number generator (so that when other people run it, they get exactly the same result).
I tried the following codes
import numpy as np
import random
random.seed(1)
N=10
mu=[0]
sig=[[1]]
a=np.random.multivariate_normal(mu, sig, N)
print(a)
But each time I run the code, it prints a different sequence. How could this be fixed? Thanks!
random and np.random aren't the same. If you use np.random then use np.random.seed.
I have a large Python code that I've been maintaining/updating/expanding since ~2014. Recently I came across numpy's Random Number Generator Policy (2018-05) and now I'm a bit confused.
I'm not sure what changed, and if I should upgrade my code accordingly to use the new Random Generator. For example, the Random sampling docs say:
# Do this
from numpy.random import default_rng
rng = default_rng()
vals = rng.standard_normal(10)
more_vals = rng.standard_normal(10)
# instead of this
from numpy import random
vals = random.standard_normal(10)
more_vals = random.standard_normal(10)
All my code depends on the (old?) syntax shown in the second block (i.e., I don't use default_rng but simple calls to np.random.seed(), np.random.uniform(), np.random.normal(), etc), and I don't know why I should use the first block instead of the second block.
Could someone shed some light over this please?
1.In python2(old code) default_rng is not available.
2.In python3(new code) both first and second blocks you mentioned will runs without an error and executed.
3.In future they may drop the random.standard_normal from coming versions of python,that's why they mentioned to use of default_rng instead of random.standard_normal
I have such code and use Jupyter-Notebook
for j in range(timesteps):
a_int = np.random.randint(largest_number/2) # int version
and i get random numbers, but when i try to move part of code to the functions, i start to receive same number in each iteration
def create_train_data():
np.random.seed(seed=int(time.time()))
a_int = np.random.randint(largest_number/2) # int version
return a
for j in range(timesteps):
c = create_train_data()
Why it's happend and how to fix it? i think maybe it because of processes in Jupyter-Notebook
The offending line of code is
np.random.seed(seed=int(time.time()))
Since you're executing in a loop that completes fairly quickly, calling int() on the time reduces your random seed to the same number for the entire loop. If you really want to manually set the seed, the following is a more robust approach.
def create_train_data():
a_int = np.random.randint(largest_number/2) # int version
return a
np.random.seed(seed=int(time.time()))
for j in range(timesteps):
c = create_train_data()
Note how the seed is being created once and then used for the entire loop, so that every time a random integer is called the seed changes without being reset.
Note that numpy already takes care of a pseudo-random seed. You're not gaining more random results by using it. A common reason for manually setting the seed is to ensure reproducibility. You set the seed at the start of your program (top of your notebook) to some fixed integer (I see 42 in a lot of tutorials), and then all the calculations follow from that seed. If somebody wants to verify your results, the stochasticity of the algorithms can't be a confounding factor.
The other answers are correct in saying that it is because of the seed. If you look at the Documentation From SciPy you will see that seeds are used to create a predictable random sequence. However, I think the following answer from another question regarding seeds gives a better overview of what it does and why/where to use it.
What does numpy.random.seed(0) do?
Hans Musgrave's answer is great if you are happy with pseudo-random numbers. Pseudo-random numbers are good for most applications but they are problematic if used for cryptography.
The standard approach for getting one truly random number is seeding the random number generator with the system time before pulling the number, like you tried. However, as Hans Musgrave pointed out, if you cast the time to int, you get the time in seconds which will most likely be the same throughout the loop. The correct solution to seed the RNG with a time is:
def create_train_data():
np.random.seed()
a_int = np.random.randint(largest_number/2) # int version
return a
This works because Numpy already uses the computer clock or another source of randomness for the seed if you pass no arguments (or None) to np.random.seed:
Parameters: seed : {None, int, array_like}, optional Random seed used
to initialize the pseudo-random number generator. Can be any integer
between 0 and 2**32 - 1 inclusive, an array (or other sequence) of
such integers, or None (the default). If seed is None, then
RandomState will try to read data from /dev/urandom (or the Windows
analogue) if available or seed from the clock otherwise.
It all depends on your application though. Do note the warning in the docs:
Warning The pseudo-random generators of this module should not be used
for security purposes. For security or cryptographic uses, see the
secrets module.
My question is the exact opposite of this one.
This is an excerpt from my test file
f1 = open('seed1234','r')
f2 = open('seed7883','r')
s1 = eval(f1.read())
s2 = eval(f2.read())
f1.close()
f2.close()
####
test_sampler1.random_inst.setstate(s1)
out1 = test_sampler1.run()
self.assertEqual(out1,self.out1_regress) # this is fine and passes
test_sampler2.random_inst.setstate(s2)
out2 = test_sampler2.run()
self.assertEqual(out2,self.out2_regress) # this FAILS
Some info -
test_sampler1 and test_sampler2 are 2 object from a class that performs some stochastic sampling. The class has an attribute random_inst which is an object of type random.Random(). The file seed1234 contains a TestSampler's random_inst's state as returned by random.getstate() when it was given a seed of 1234 and you can guess what seed7883 is. What I did was I created a TestSampler in the terminal, gave it a random seed of 1234, acquired the state with rand_inst.getstate() and save it to a file. I then recreate the regression test and I always get the same output.
HOWEVER
The same procedure as above doesn't work for test_sampler2 - whatever I do not get the same random sequence of numbers. I am using python's random module and I am not importing it anywhere else, but I do use numpy in some places (but not numpy.random).
The only difference between test_sampler1 and test_sampler2 is that they are created from 2 different files. I know this is a big deal and it is totally dependent on the code I wrote but I also can't simply paste ~800 lines of code here, I am merely looking for some general idea of what I might be messing up...
What might be scrambling the state of test_sampler2's random number generator?
Solution
There were 2 separate issues with my code:
1
My script is a command line script and after I refactored it to use python's optparse library I found out that I was setting the seed for my sampler using something like seed = sys.argv[1] which meant that I was setting the seed to be a str, not an int - seed can take any hashable object and I found it the hard way. This explains why I would get 2 different sequences if I used the same seed - one if I run my script from the command line with sth like python sample 1234 #seed is 1234 and from my unit_tests.py file when I would create an object instance like test_sampler1 = TestSampler(seed=1234).
2
I have a function for discrete distribution sampling which I borrowed from here (look at the accepted answer). The code there was missing something fundamental: it was still non-deterministic in the sense that if you give it the same values and probabilities array, but transformed by a permutation (say values ['a','b'] and probs [0.1,0.9] and values ['b','a'] and probabilities [0.9,0.1]) and the seed is set and you will get the same random sample, say 0.3, by the PRNG, but since the intervals for your probabilities are different, in one case you'll get a b and in one an a. To fix it, I just zipped the values and probabilities together, sorted by probability and tadaa - I now always get the same probability intervals.
After fixing both issues the code worked as expected i.e. out2 started behaving deterministically.
The only thing (apart from an internal Python bug) that can change the state of a random.Random instance is calling methods on that instance. So the problem lies in something you haven't shown us. Here's a little test program:
from random import Random
r1 = Random()
r2 = Random()
for _ in range(100):
r1.random()
for _ in range(200):
r2.random()
r1state = r1.getstate()
r2state = r2.getstate()
with open("r1state", "w") as f:
print >> f, r1state
with open("r2state", "w") as f:
print >> f, r2state
for _ in range(100):
with open("r1state") as f:
r1.setstate(eval(f.read()))
with open("r2state") as f:
r2.setstate(eval(f.read()))
assert r1state == r1.getstate()
assert r2state == r2.getstate()
I haven't run that all day, but I bet I could and never see a failing assert ;-)
BTW, it's certainly more common to use pickle for this kind of thing, but it's not going to solve your real problem. The problem is not in getting or setting the state. The problem is that something you haven't yet found is calling methods on your random.Random instance(s).
While it's a major pain in the butt to do so, you could try adding print statements to random.py to find out what's doing it. There are cleverer ways to do that, but better to keep it dirt simple so that you don't end up actually debugging the debugging code.
I got an issue which is, in my code,anyone can help will be great.
this is the example code.
from random import *
from numpy import *
r=array([uniform(-R,R),uniform(-R,R),uniform(-R,R)])
def Ft(r):
for i in range(3):
do something here, call r
return something
however I found that in python shell, every time I run function Ft, it gives me different
result.....seems like within the function, in each iterate of the for loop,call r once, it gives random numbers once... but not fix the initial random number when I call the function....how can I fix it?
how about use b=copy(r) then call b in the Ft function?
Thanks
Do you mean that you want the calls to randon.uniform() to return the same sequence of values each time you run the function?
If so, you need to call random.seed() to set the start of the sequence to a fixed value. If you don't, the current system time is used to initialise the random number generator, which is intended to cause it to generate a different sequence every time.
Something like this should work
random.seed(42) # Set the random number generator to a fixed sequence.
r = array([uniform(-R,R), uniform(-R,R), uniform(-R,R)])
I think you mean 'list' instead of 'array', you're trying to use functions when you really don't need to. If I understand you correctly, you want to edit a list of random floats:
import random
r=[random.uniform(-R,R) for x in range(3)]
def ft(r):
for i in range(len(r)):
r[i]=???