Is there any way to find out what seed Python used to seed its random number generator?
I know I can specify my own seed, but I'm quite happy with Python managing it. But, I do want to know what seed it used, so that if I like the results I'm getting in a particular run, I could reproduce that run later. If I had the seed that was used then I could.
If the answer is I can't, then what's the best way to generate a seed myself? I want them to always be different from run to run---I just want to know what was used.
UPDATE: yes, I mean random.random()! mistake... [title updated]
It is not possible to get the automatic seed back out of the generator. I normally generate seeds like this:
import random
import sys
seed = random.randrange(sys.maxsize)
rng = random.Random(seed)
print("Seed was:", seed)
This way the seed is drawn from the default generator, which Python seeds itself (from an operating-system randomness source where available, otherwise from the clock), so it will be different each time you run the script (manually). And if you are using multiple generators, they won't end up with the same seed simply because they were created almost simultaneously.
The state of the random number generator isn't always simply a seed. For example, a secure PRNG typically has an entropy buffer, which is a larger block of data.
You can, however, save and restore the entire state of the random number generator, so you can reproduce its results later on:
import random
old_state = random.getstate()
print(random.random())
random.setstate(old_state)
print(random.random())  # same value as the first call
# You can also restore the state into your own instance of the PRNG, to avoid
# thread-safety issues from using the default, global instance.
prng = random.Random()
prng.setstate(old_state)
print(prng.random())  # same value again
The results of getstate can, of course, be pickled if you want to save it persistently.
http://docs.python.org/library/random.html#random.getstate
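As a minimal sketch of the pickling idea mentioned above: the state is serialized to bytes, which you could write to a file, and later loaded into a fresh generator. The variable names are illustrative only, and the state is kept in memory here rather than on disk to keep the example short.

```python
import pickle
import random

rng = random.Random()                  # private instance, seeded by Python
blob = pickle.dumps(rng.getstate())    # bytes that could be written to a file

first_run = [rng.random() for _ in range(3)]

# Later (possibly in another session): load the state into a new generator.
replay = random.Random()
replay.setstate(pickle.loads(blob))
second_run = [replay.random() for _ in range(3)]

print(first_run == second_run)
```

The state has to be captured before the numbers you want to reproduce are drawn, since getstate() records where the generator is now, not where it started.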
You can subclass random.Random and rewrite the seed() method the same way Python does (v3.5 in this example), but store the seed value in a variable before calling super():
import random
class Random(random.Random):
    def seed(self, a=None, version=2):
        from os import urandom as _urandom
        from hashlib import sha512 as _sha512
        if a is None:
            try:
                # Seed with enough bytes to span the 19937 bit
                # state space for the Mersenne Twister
                a = int.from_bytes(_urandom(2500), 'big')
            except NotImplementedError:
                import time
                a = int(time.time() * 256) # use fractional seconds
        if version == 2:
            if isinstance(a, (str, bytes, bytearray)):
                if isinstance(a, str):
                    a = a.encode()
                a += _sha512(a).digest()
                a = int.from_bytes(a, 'big')
        self._current_seed = a
        super().seed(a)
    def get_seed(self):
        return self._current_seed
If you test it, a first random value generated with a new seed and a second value generated using the same seed (with the get_seed() method we created) will be equal:
>>> rnd1 = Random()
>>> seed = rnd1.get_seed()
>>> v1 = rnd1.randint(1, 0x260)
>>> rnd2 = Random(seed)
>>> v2 = rnd2.randint(1, 0x260)
>>> v1 == v2
True
If you store/copy the huge seed value and try using it in another session the value generated will be exactly the same.
Since no one has mentioned that the best random data you can usually get in any programming language comes from the operating system, here is the following code:
import os
random_data = os.urandom(8)
seed = int.from_bytes(random_data, byteorder="big")
This is cryptographically secure.
Source: https://www.quora.com/What-is-the-best-way-to-generate-random-seeds-in-python
With a value of 8, it seems to produce around the same number of digits as sys.maxsize for me.
>>> int.from_bytes(os.urandom(8), byteorder="big")
17520563261454622261
>>> sys.maxsize
9223372036854775807
If you "set" the seed using random.seed(None), the generator is seeded automatically, from the system time or, where available, from an operating-system randomness source. However, you can't access this value, as you observed. What I do when I want to randomize but still know the seed is this:
import datetime
import random
tim = datetime.datetime.now()
randseed = tim.hour*10000+tim.minute*100+tim.second
random.seed(randseed)
Note: the reason I prefer this to using time.time() as proposed by @Abdallah is that this way the randseed is human-readable and immediately understandable, which often has big benefits. Date components and even microseconds could also be added as needed.
I wanted to do the same thing but could not get the seed. Since the default seed is generated from the time anyway, I created my own seed from the system time and used that, so now I know which seed was used.
import random
import time
SEED = int(time.time())
random.seed(SEED)
The seed is an internal variable in the random package which is used to create the next random number. When a new number is requested, the seed is updated, too.
I would simply use 0 as a seed if you want to be sure to have the same random numbers every time, or make it configurable.
CorelDraw once had a random pattern generator, which was initialized with a seed. Patterns varied drastically for different seeds, so the seed was important configuration information of the pattern. It should be part of the config options for your runs.
EDIT: As noted by ephemient, the internal state of a random number generator may be more complex than the seed, depending on its implementation.
Related
I'm looking for some info on generating numbers that are as random as possible when the random module is used inside a function, like so:
import random as rd
def coinFlip():
    flip = rd.random()
    if flip > .5:
        return "Heads"
    else:
        return "Tails"
def main():
    for i in range(1000000):
        print(coinFlip())
main()
Edit: Ideally the above script would always yield different results therefore limiting my ability to use random.seed()
Does the random module embedded within a function initialize with a new seed each time the function is called? (Instead of using the previous generated random number as the seed.)
If so...
Is the default initialization on system time exact enough to pull a truly random number considering that the system times in the for loop here would be so close together or maybe even the same (depending on the precision of the system time.)
Is there a way to initialize a random module outside of the function and have the function pull the next random number (so to avoid multiple initializations.)
Any other more pythonic ways to accomplish this?
Thank you very much!
Use random.seed() if you want to initialize the pseudo-random number generator yourself.
You can have a look here:
If you don’t initialize the pseudo-random number generator using a
random.seed (), internally random generator call the seed function and
use current system current time value as the seed value. That’s why
whenever we execute random.random() we always get a different value
If you always want different numbers, you should not bother initializing the random module yourself, since by default it seeds itself internally from the current system time (which is always different).
Just use:
from random import random
def coinFlip():
    if random() > .5:
        return "Heads"
    else:
        return "Tails"
To be clear, the random module is not re-initialized each time it is used, only at import time, so every call to random.random() gives you the next number in the sequence rather than restarting it.
For starters:
This module implements pseudo-random number generators for various distributions.
[..]
The functions supplied by this module are actually bound methods of a hidden instance of the random.Random class. You can instantiate your own instances of Random to get generators that don’t share state.
https://docs.python.org/3/library/random.html
The random module is a Pseudo-Random Number Generator. All PRNGs are entirely deterministic and have state. Meaning, if the PRNG is in the same state, the next "random" number will always be the same. As the above paragraph explains, your rd.random() call is really a call to an implicitly instantiated Random object.
So:
Does the random module embedded within a function initialize with a new seed each time the function is called?
No.
Is there a way to initialize a random module outside of the function and have the function pull the next random number (so to avoid multiple initializations.)
You don't need to avoid multiple initialisation, as it's not happening. You can instantiate your own Random object if you want to control the state exactly.
class random.Random([seed])
Class that implements the default pseudo-random number generator used by the random module.
random.seed(a=None, version=2)
Initialize the random number generator. If a is omitted or None, the current system time is used. [..]
So, the implicitly instantiated Random object uses the system time as initial seed (read further though), and from there will keep state. So each time you start your Python instance, it will be seeded differently, but will be seeded only once.
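To make the "hidden instance" point concrete, here is a small sketch. The coin_flip function and its rng parameter are my own illustration, not part of the question: it accepts either the module itself or a private random.Random instance, and the private instances are seeded explicitly only to show that they are reproducible and independent of the shared one.

```python
import random

# The module-level functions are bound methods of one shared instance.
shared = random.random.__self__
print(isinstance(shared, random.Random))

def coin_flip(rng=random):
    """Return "Heads" or "Tails" using the supplied generator."""
    return "Heads" if rng.random() > 0.5 else "Tails"

# Two private instances with the same seed produce the same flips,
# regardless of what the shared module-level generator is doing.
a = random.Random(12345)
b = random.Random(12345)
flips_a = [coin_flip(a) for _ in range(10)]
random.random()  # advancing the shared generator does not affect a or b
flips_b = [coin_flip(b) for _ in range(10)]
print(flips_a == flips_b)
```

Calling coin_flip() with no argument keeps using the shared generator, which is seeded once at import and simply advances on every call.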
I have read that the random module in Python uses the previously generated value as the seed except for the first time where it uses the system time.
(https://stackoverflow.com/a/22639752/11455105, https://pynative.com/python-random-seed/)
If this is true, why don't I get the same value when I explicitly set the previously generated value as the new seed like this:
random.seed(random.randint(1, 100))
The same doesn't work for the random.random() method either.
>>> import random
>>> random.seed(20)
>>> random.randint(1,100)
93
>>> random.randint(1,100)
88
>>> random.seed(20)
>>> random.randint(1,100)
93
>>> random.randint(1,100)
88
>>> random.seed(20)
>>> random.seed(random.randint(1,100))
>>> random.randint(1,100)
64
Why didn't the last randint() call give 88?
Thanks in Advance!
Because what you read was false, or you misunderstood what it said. CPython uses the Mersenne Twister generator under the covers, which has a state consuming 19937 bits. What you pass to .seed() is not the new state, but merely a pile of bits which is expanded to a full 19937-bit state via an undocumented (implementation-dependent) algorithm.
Note: if you want to save and restore states, that's what the .getstate() and .setstate() methods are for.
Python's random module does not use the previously generated value as the seed. Python uses the Mersenne Twister generator to create pseudo-randomness, and this algorithm is deterministic: its next state (and so the next generated number) depends on the previous state. That is different from the seed, which is a value used to configure the initial state of the generator.
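The difference between a seed and a state can be sketched as follows. This is only an illustration using a private generator: reseeding with an output value starts an unrelated stream, while restoring a saved state resumes the original one.

```python
import random

rng = random.Random(20)
first = rng.randint(1, 100)
state_after_first = rng.getstate()   # full internal state, not a seed
second = rng.randint(1, 100)

# Seeding with the first output builds a brand-new state from that number;
# it does not put the generator back where it was after the first draw.
reseeded = random.Random(first)
reseeded_next = reseeded.randint(1, 100)

# Restoring the saved state resumes the original stream exactly.
resumed = random.Random()
resumed.setstate(state_after_first)
resumed_next = resumed.randint(1, 100)

print(resumed_next == second)
print(len(state_after_first[1]))     # number of integers in the saved state
```

reseeded_next may happen to equal second by chance, since both lie between 1 and 100, but nothing ties the two streams together.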
The Python documentation page for the random module carries this warning:
Warning: The pseudo-random generators of this module should not be used for security purposes. Use os.urandom() or SystemRandom if you require a cryptographically secure pseudo-random number generator.
So what's the difference between os.urandom() and random?
Is one closer to a true random than the other?
Would the secure random be overkill in non-cryptographic instances?
Are there any other random modules in python?
You can read up on the distinction of cryptographically secure RNG in this fantastic answer over at Crypto.SE.
The main distinction between random and the system RNG like urandom is one of use cases. random implements deterministic PRNGs. There are scenarios where you want exactly those. For instance when you have an algorithm with a random element which you want to test, and you need those tests to be repeatable. In that case you want a deterministic PRNG which you can seed.
urandom, on the other hand, cannot be seeded and draws its entropy from many unpredictable sources, making its output much harder to predict.
True random is something else yet and you'd need a physical source of randomness like something that measures atomic decay; that is truly random in the physical sense, but usually overkill for most applications.
So whats the difference between os.urandom() and random?
random itself is predictable. That means that, given the same seed, the sequence of numbers generated by random is the same. Take a look at this question for a better explanation. This question also illustrates that random isn't really random.
This is generally the case for most programming languages: the generation of random numbers is not truly random. You can use these numbers when cryptographic security is not a concern or if you want the same pattern of numbers to be generated.
Is one closer to a true random than the other?
Not sure how to answer this question because truly random numbers cannot be generated. Take a look at this article or this question for more information.
Since random generates a repeatable pattern I would say that os.urandom() is certainly more "random"
Would the secure random be overkill in non-cryptographic instances?
I wrote the following functions and there doesn't appear to be a huge time difference. However, if you don't need cryptographically secure numbers, it doesn't really make sense to use os.urandom(). Again, it comes down to the use case: do you want a repeatable pattern, how "random" do you want your numbers, etc.?
import time
import os
import random
def generate_random_numbers(x):
    start = time.time()
    random_numbers = []
    for _ in range(x):
        random_numbers.append(random.randrange(1,10,1))
    end = time.time()
    print(end - start)
def generate_secure_randoms(x):
    start = time.time()
    random_numbers = []
    for _ in range(x):
        random_numbers.append(os.urandom(1))
    end = time.time()
    print(end - start)
generate_random_numbers(10000)
generate_secure_randoms(10000)
Results:
0.016040563583374023
0.013456106185913086
Are there any other random modules in python?
Python 3.6 introduces the new secrets module
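For reference, a few of the calls the secrets module offers are sketched below. The particular sizes and the colour list are arbitrary choices for illustration.

```python
import secrets

token = secrets.token_hex(16)      # 16 random bytes, shown as 32 hex characters
roll = secrets.randbelow(6) + 1    # integer from 1 to 6 inclusive
pick = secrets.choice(["red", "green", "blue"])

print(token, roll, pick)
```

Like os.urandom(), these cannot be seeded, so they are the wrong tool when you need a run to be repeatable.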
random implements a pseudo random number generator. Knowing the algorithm and the parameters we can predict the generated sequence. At the end of the text is a possible implementation of a linear pseudo random generator in Python, that shows the generator can be a simple linear function.
os.urandom uses system entropy sources to have better random generation. Entropy sources are something that we cannot predict, like asynchronous events. For instance the frequency that we hit the keyboard keys cannot be predicted.
Interrupts from other devices can also be unpredictable.
In the random module there is a class: SystemRandom which uses os.urandom() to generate random numbers.
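A short sketch of SystemRandom in use follows. It offers the same methods as random.Random, but because the numbers come from the operating system, seeding has no effect and there is no state to save. The has_state flag is just my own way of showing that here.

```python
import random

sysrand = random.SystemRandom()
value = sysrand.random()          # float in [0.0, 1.0)
die = sysrand.randint(1, 6)       # integer from 1 to 6 inclusive

sysrand.seed(42)                  # accepted, but ignored

try:
    sysrand.getstate()
except NotImplementedError:
    has_state = False             # the OS entropy source exposes no state
else:
    has_state = True

print(value, die, has_state)
```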
Actually, it cannot be proven whether a given sequence is random or not. Andrey Kolmogorov worked on this extensively in the 1960s.
One can think that a sequence is random when the rules to obtain the sequence, in any given language, are larger than the sequence itself. Take for instance the following sequence, which seems random:
264338327950288419716939937510
However we can represent it also as:
pi digits 21 to 50
Since we found a way to represent the sequence smaller than the sequence itself, the sequence is not random. We could even think of a more compact language to represent it, say:
pi[21,50]
or yet another.
But the smaller rules, in the most compact language (or the smaller algorithm, if you will), to generate the sequence may never be found, even if it exists.
This finding depends only on human intelligence which is not absolute.
There might be a definitive way to prove if a sequence is random, but we will only know it when someone finds it. Or maybe there is no way to prove if randomness even exists.
An implementation of an LCG (linear congruential generator) in Python can be:
from datetime import datetime
class LCG:
    defaultSeed = 0
    defaultMultiplier = 1664525
    defaultIncrement = 1013904223
    defaultModulus = 0x100000000
    def __init__(self, seed, a, c, m):
        self._x0 = seed  # seed
        self._a = a  # multiplier
        self._c = c  # increment
        self._m = m  # modulus
    @classmethod
    def lcg(cls, seed=None):
        if seed is None:
            seed = cls.defaultSeed
        return LCG(int(seed), cls.defaultMultiplier,
                   cls.defaultIncrement, cls.defaultModulus)
    # pre: bound > 0
    # returns: pseudo random integer in [0, bound[
    def randint(self, bound):
        self._x0 = (self._a * self._x0 + self._c) % self._m
        return int(abs(self._x0 % bound))
# generate a sequence of 20 digits
rnd = LCG.lcg(datetime.now().timestamp())  # diff seed every time
for i in range(20):
    print(rnd.randint(10), end='')
print()
How does Python seed its Mersenne twister pseudorandom number generator used in the built-in random library if no explicit seed value is provided? Is it based on the clock somehow? If so, is the seed found when the random module is imported or when it is first called?
Python's documentation does not seem to have the answer.
In modern versions of python (c.f. http://svn.python.org/projects/python/branches/release32-maint/Lib/random.py) Random.seed tries to use 32 bytes read from /dev/urandom. If that doesn't work, it uses the current time: (a is an optional value which can be used to explicitly seed the PRNG.)
if a is None:
    try:
        a = int.from_bytes(_urandom(32), 'big')
    except NotImplementedError:
        import time
        a = int(time.time() * 256) # use fractional seconds
The seed is based on the clock or (if available) an operating system source. The random module creates (and hence seeds) a shared Random instance when it is imported, not when first used.
References
Python docs for random.seed:
random.seed(a=None, version=2)
Initialize the random number generator.
If a is omitted or None, the current system time is used. If randomness sources are provided by the operating system, they are used instead of the system time (see the os.urandom() function for details on availability).
Source of random.py (heavily snipped):
from os import urandom as _urandom
class Random(_random.Random):
    def __init__(self, x=None):
        self.seed(x)
    def seed(self, a=None, version=2):
        if a is None:
            try:
                a = int.from_bytes(_urandom(32), 'big')
            except NotImplementedError:
                import time
                a = int(time.time() * 256) # use fractional seconds
# Create one instance, seeded from current time, and export its methods
# as module-level functions. The functions share state across all uses
#(both in the user's code and in the Python libraries), but that's fine
# for most programs and is easier for the casual user than making them
# instantiate their own Random() instance.
_inst = Random()
The last line is at the top level, so it is executed when the module is loaded.
From this answer, I found the source of random.py. In the Random class, the seed is set when the object is constructed. The module instantiates a Random object and uses it for all of the module methods. So if the random number is produced with random.random() or another module method, then the seed was set at the time of the import. If the random number is produced by another instance of Random, then the seed was set at the time of the construction of that instance.
From the source:
# Create one instance, seeded from current time, and export its methods
# as module-level functions. The functions share state across all uses
#(both in the user's code and in the Python libraries), but that's fine
# for most programs and is easier for the casual user than making them
# instantiate their own Random() instance.
The other answers are correct, but to summarize something from comments above which might be missed by someone else looking for the answer I tracked down today:
The typical reference implementations of Mersenne Twister take a seed and then internally (usually in the constructor) call this.init_genrand(seed)
If you do that and use a simple number you will get different results than what Python uses -- and probably wonder why like I did.
In order to get the same results in another language (node.js in my case) that you would in python you need an implementation which supports the init_by_array method and then initialize it with init_by_array([seed]).
This example is if you're just using a simple 32 bit int val -- if your seed is something else then python passes it in a different way (e.g. larger than 32 bit numbers are split up and sent in 32 bits per array element, etc) but that should at least help someone get going in the right direction.
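The splitting of a larger seed into 32-bit array elements might look like the sketch below. The helper name seed_to_key is hypothetical, and two details are assumptions on my part rather than something stated above: that the absolute value of the seed is used, and that the least significant 32-bit word comes first. Verify both against your target implementation before relying on them.

```python
def seed_to_key(seed):
    """Split abs(seed) into 32-bit words, least significant word first.

    Hypothetical helper: the word order and the use of abs() are
    assumptions about how the key array is built.
    """
    n = abs(seed)
    words = []
    while True:
        words.append(n & 0xFFFFFFFF)   # take the low 32 bits
        n >>= 32                       # move on to the next word
        if n == 0:
            break
    return words

print(seed_to_key(42))
print(seed_to_key(2**32 + 5))
```

The resulting list is what you would hand to an init_by_array implementation in the other language. For a seed that fits in 32 bits it is simply [seed], matching the init_by_array([seed]) call described above.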
The node.js implementation I ended up using was https://gist.github.com/banksean/300494 and it worked beautifully. I could not find one in npm which had the support I needed -- might have to add one.
This is not a coding question, but I am hoping that someone has come across this in the forums here. I am using Python to run some simulations. I need to run many replications using different random number seeds. I have two questions:
Are negative numbers okay as seeds?
Should I keep some distance in the seeds?
Currently I am using random.org to create 50 numbers between -100000 and +100000, which I use as seeds. Is this okay?
Thanks.
Quoting random.seed([x]):
Optional argument x can be any hashable object.
Both positive and negative numbers are hashable, and many other objects besides.
>>> hash(42)
42
>>> hash(-42)
-42
>>> hash("hello")
-1267296259
>>> hash(("hello", "world"))
759311865
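One caveat worth checking for yourself: CPython has historically reduced an integer seed to its absolute value, in which case -42 and 42 would select the same stream and a range such as -100000 to +100000 would contain fewer distinct streams than it appears to. The sketch below tests this on whatever interpreter you run it on rather than assuming it. The helper first_values is my own name.

```python
import random

def first_values(seed, count=5):
    """Return the first few floats produced by a generator seeded with seed."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(count)]

same_stream = first_values(-42) == first_values(42)
print("seed -42 and seed 42 give the same stream:", same_stream)
```

If this prints True on your version, drawing the seeds from non-negative integers only avoids accidental duplicates.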
Is it important that your simulations are repeatable? The canonical way to seed a RNG is by using the current system time, and indeed this is random's default behaviour:
random.seed([x])
Initialize the basic random number generator. Optional argument x can be any hashable object. If x is omitted or None, current system time is used; current system time is also used to initialize the generator when the module is first imported.
I would only deviate from this behaviour if repeatability is important. If it is important, then your random.org seeds are a reasonable solution.
Should I keep some distance in the seeds?
No. For a good quality RNG, the choice of seed will not affect the quality of the output. A set of seeds [1,2,3,4,5,6,7,8,9,10] should result in the same quality of randomness as any random selection of 10 ints. But even if a selection of random uniformly-distributed seeds were desirable, maintaining some distance would break that distribution.
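A sketch of running replications with plain consecutive seeds, each in its own generator so that runs cannot interfere with one another, is shown below. run_replication is a hypothetical stand-in for a real simulation; here it just averages some uniform draws.

```python
import random

def run_replication(seed, draws=1000):
    """Hypothetical stand-in for one simulation run: mean of uniform draws."""
    rng = random.Random(seed)      # private generator for this replication
    return sum(rng.random() for _ in range(draws)) / draws

seeds = range(1, 11)               # consecutive seeds are fine
results = {seed: run_replication(seed) for seed in seeds}

for seed, mean in results.items():
    print(seed, round(mean, 4))
```

Recording the seed next to each result is what makes an individual replication repeatable later.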