Probability of getting 0.0 from random()? - python

I'm trying to make a simple program to demonstrate something, though I'm a bit perplexed by the math of it.
from random import random
a = random()
I read up on the random function: its distribution is [0.0, 1.0). It uses the Mersenne Twister to generate pseudorandom numbers, and it returns 53-bit precision floats.
I'm assuming that means the probability of it generating exactly 0.0 is 1/2^53?
What would a have to be lower than in order for the probability to be 1/2^28? I tried understanding the 53-bit float conversion but I can't seem to figure it out. What would the actual float value have to be?
a = ?
if random() < a:
    print("Success")

With a continuous uniform distribution over [0, 1), the proportion of samples less than x is x. For example, half the samples are less than 1/2. So the x such that the probability a sample is less than x is 1/2^28 is simply 1/2^28.
With a quantized distribution (only multiples of a certain quantum occur) over [0, 1), the same is true if x is a number in the distribution. If x lies between two numbers in the distribution, the probability a sample is less than x is the distribution value just greater than x. In the situation you describe, however, 1/2^28 is in the distribution, so it is the answer.
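As a quick sanity check (assuming CPython's documented quantum of 2^-53 for random()), the threshold 2^-28 lies exactly on the grid of representable outcomes, and counting the grid points below it reproduces the stated probability:

```python
from fractions import Fraction

quantum = Fraction(1, 2**53)      # spacing between values random() can return
threshold = Fraction(1, 2**28)    # the proposed value of a

# 2**-28 is an exact multiple of the quantum, so no rounding ambiguity:
assert threshold % quantum == 0

# grid points strictly below the threshold: 0, q, 2q, ..., each with prob 2**-53
count = threshold / quantum                # 2**25 values
probability = count * quantum              # 2**25 * 2**-53 = 2**-28
print(probability == Fraction(1, 2**28))   # True
```

Using exact rational arithmetic here avoids any floating-point rounding in the check itself.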

It depends on how it was generated. Almost all libraries use the equidistant method to produce values on [0, 1). Briefly:
Generate a uniform integer (say, 64 bits returned per call)
Throw away the excess bits to match the floating-point precision (keep 24 for singles, 53 for doubles)
Convert the integer to a float (no rounding occurs since the value "fits") and scale to the range by multiplying by 2^-24 or 2^-53
So (taking doubles) the method produces 2^53 unique FP values, and each occurs with probability 2^-53. The number of bits of the underlying integer generator doesn't affect this.
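A minimal sketch of the equidistant method for doubles, using Python's getrandbits as the underlying integer source (the function name equidistant_random is mine, not a library API):

```python
import random

def equidistant_random():
    """Uniform double on [0, 1): a 53-bit integer scaled by 2**-53."""
    bits = random.getrandbits(64)   # underlying 64-bit integer generator
    bits >>= 64 - 53                # discard excess bits, keep the top 53
    return bits * 2.0**-53          # exact conversion, then scale into [0, 1)

x = equidistant_random()
assert 0.0 <= x < 1.0
```

Every value this returns is an exact multiple of 2^-53, which is why each of the 2^53 outcomes is equally likely.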


Random numbers simulating stock market returns

Let me say right from the beginning that I know just a little bit about statistics, but not enough to figure out this problem.
I'm trying to create a list of n random floating point numbers to simulate annual stock market returns. The numbers should range from -30.0 to +30.0 with an average of 7.0. The numbers should be distributed mostly around the average, but they should be well distributed. Basically, it should be a flattened bell curve, so there should be a good chance of having some negative numbers as well as some numbers closer to the upper limit.
I know numpy has functions to create random numbers that are distributed in different ways, but not sure how to specify these parameters.
You need to sample from a normal distribution with your chosen mean (mu) and a standard deviation (sigma) large enough to flatten it.
Here's some code to get you started
import numpy as np
mu, sigma, n = 7.0, 3.0, 1000 # mean and standard deviation
s = np.random.normal(mu, sigma, n)
Note that the larger sigma is, the further your samples will spread from the mean, and they could exceed your ±30 limits. So choose a suitable sigma (for example, one third of the distance from your mean to the extreme, so the limits sit three sigmas out), or you'll have to clip your numbers to the limits.
Also note that stock market returns aren't necessarily normally distributed.
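A sketch of the clipping approach with numpy (the sigma here is an illustrative choice that puts the -30 limit roughly three sigmas below the mean, not a value fitted to market data):

```python
import numpy as np

rng = np.random.default_rng(42)
mu, sigma, n = 7.0, 12.0, 1000        # sigma ~ 37/3 so -30 is about 3 sigmas out
returns = rng.normal(mu, sigma, n)    # draw the raw normal samples
returns = np.clip(returns, -30.0, 30.0)   # force stragglers into the range

assert returns.min() >= -30.0 and returns.max() <= 30.0
```

Note that clipping piles the out-of-range mass onto the endpoints; if that matters, resampling the out-of-range values (rejection) keeps the bell shape intact.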

Why does numpy.random.normal give some negative values in the ndarray?

I know that the normal distribution is always greater than 0 for any chosen value of the mean and the standard deviation.
>>> np.random.normal(scale=0.3, size=x.shape)
[ 0.15038925 -0.34161875 -0.07159422  0.41803414  0.39900799  0.10714512
  0.5770597  -0.16351734  0.00962916  0.03901677]
Here the mean is 0.0 and the standard deviation is 0.3. But some values in the ndarray are negative. Am I wrong in my interpretation that normal distribution curve is always positive?
Edit:
But using the normpdf function in Matlab always gives an array of positive values, which I guess is the probability density function (the y axis), whereas numpy.random.normal gives both positive and negative values (the x axis). Now this is confusing.
Values generated from a normal distribution do take negative values.
For example, consider a normal distribution with mean 0: we need some positive values and some negative values for the average to be zero. In fact, for a mean-0 normal distribution, a sample is equally likely to be positive or negative.
A normal random variable can actually take any real value. You might be confusing this with the probability density function, which is always positive.
Referring to the documentation for np.random.normal at "https://numpy.org/doc/stable/reference/random/generated/numpy.random.normal.html", the output is the sample (x), not the density (y). Therefore, the output can be negative.
In other words, np.random.normal draws samples that follow the normal distribution; it does not generate probability values from the normal curve.
Don't expect the samples themselves to be probabilities: np.random.normal returns values on the x axis, and with mean 0 roughly half of them will be negative.
If you want samples centred elsewhere, use something like np.random.normal(0.5, 0.3, 1000).
Also, take a closer look at the math of the normal distribution so you can construct its probability density function yourself.
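To make the samples-versus-density distinction concrete, here is a sketch that draws from np.random.normal and evaluates the density at those draws (the pdf formula is written out by hand rather than taken from a library):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 0.0, 0.3

samples = rng.normal(mu, sigma, size=10)     # x values: negatives are expected
pdf = np.exp(-(samples - mu)**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

print(samples)   # the draws themselves, scattered around 0
print(pdf)       # the corresponding densities: all strictly positive
```

The samples are points on the x axis; the pdf values are the heights of the bell curve at those points, and only the latter are guaranteed positive.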

Random Number Generator Explanation

from random import *

def main():
    t = 0
    for i in range(1000):  # a thousand samples
        t += random()
    print(t / 1000)

main()
I was looking at the source code for a sample program my professor gave me and I came across this RNG. Can anyone explain how it works?
If you plotted the points, you would see that this actually produces a Gaussian ("normal") distribution about the mean of the random function.
Generate random numbers following a normal distribution in C/C++ talks about random number generation; it's a pretty common technique to do this if all you have is a uniform number generator like in standard C.
What I've given you here is a histogram of 100,000 values drawn from your function (returned rather than printed, if you aren't familiar with Python). The y axis is the frequency of a value, the x axis is the bin the value falls in. As you can see, the average value is 1/2, and beyond 3 standard deviations (which cover 99.7 percent of the data) there are almost no values at all. That should be intuitive; we "usually" get something close to 1/2, and very rarely anything near 0 or 1.
Have a look at the documentation. Its quite well written:
https://docs.python.org/2/library/random.html
The idea is that the program generates a random number 1000 times, which is enough for the average to come out close to 0.5.
The program is illustrating the Central Limit Theorem: sums of independent and identically distributed random variables X with finite variance asymptotically converge to a normal (a.k.a. Gaussian) distribution whose mean is the sum of the means and whose variance is the sum of the variances. Dividing by N, the number of X's summed, gives the sample mean (a.k.a. average). If the expected value of X is μ and the variance of X is σ², the expected value of the sample mean is also μ and it has variance σ²/N.
Since a Uniform(0,1) has mean 0.5 and variance 1/12, your algorithm will generate results that are pretty close to normally distributed with a mean of 0.5 and a variance of 1/12000. Consequently 99.7% of the outcomes should fall within +/-3 standard deviations of the mean, i.e., in the range 0.5+/-0.0274.
This is a ridiculously inefficient way to generate normals. Better alternatives include the Box-Muller method, Polar method, or ziggurat method.
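For comparison, here is a minimal Box-Muller sketch in pure Python: two independent uniforms in, two independent standard normals out, with no summing loop required.

```python
import math
import random

def box_muller():
    """Return two independent standard normal samples from two uniforms."""
    u1 = 1.0 - random.random()       # shift to (0, 1] so log() is safe
    u2 = random.random()
    r = math.sqrt(-2.0 * math.log(u1))
    theta = 2.0 * math.pi * u2
    return r * math.cos(theta), r * math.sin(theta)

random.seed(1)
samples = [x for _ in range(50_000) for x in box_muller()]
mean = sum(samples) / len(samples)
var = sum((x - mean)**2 for x in samples) / len(samples)
```

With 100,000 draws, the sample mean comes out very near 0 and the sample variance very near 1, as a standard normal requires.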
The thing making this random is the random() function being called. random() generates one (for most practical purposes) random float between 0 and 1:
>>> random()
0.1759916412898097
>>> random()
0.5489228122596088
etc.
The rest of it is just adding each random to a total and then dividing by the number of randoms, essentially finding the average of all 1000 randoms, which as Cyber pointed out is actually not a random number at all.

Random integers from an exponential distribution between min and max

I would like to generate random integers on an interval min to max. For a uniform distribution in numpy:
numpy.random.randint(min,max,n)
does exactly what I want.
However, I would now like to give the distribution of random numbers an exponential bias. There are a number of suggestions for this e.g. Pseudorandom Number Generator - Exponential Distribution as well as the numpy function numpy.random.RandomState.exponential, but these do not address how to constrain the distribution to integers between min and max. I'm not sure how to do this, whilst still ensuring a random distribution.
The exponential distribution is a continuous distribution. What you probably want is its discrete equivalent, the geometric distribution. Numpy's implementation generates strictly positive integers, i.e., 1, 2, 3, ..., so you'll want to add min-1 to shift it, and then truncate by rejecting/throwing away results > max. That, in turn, means generating them one by one and adding the non-rejected values to a list until you have the desired number. (You could also determine analytically what proportion you expect to be rejected and scale your n accordingly, but you'll still likely end up a few short or with a few too many.)
It's possible to do this without rejection, but you'd have to derive your own inversion, determine the probability of exceeding max, and generate uniforms between 0 and that probability to feed to your inversion algorithm. Rejection is simpler even though it's less efficient.
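A sketch of the rejection approach described above, in pure Python (the function name truncated_geometric and the parameters p, lo, hi are illustrative choices of mine):

```python
import math
import random

def truncated_geometric(p, lo, hi):
    """Geometric(p) shifted to start at lo, rejecting anything above hi."""
    while True:
        # inverse-CDF sample of Geometric(p) on {1, 2, 3, ...}
        k = 1 + int(math.log(1.0 - random.random()) / math.log(1.0 - p))
        k += lo - 1                 # shift the support to start at lo
        if k <= hi:                 # reject values beyond the upper bound
            return k

random.seed(3)
samples = [truncated_geometric(0.3, 5, 15) for _ in range(1000)]
```

Every returned value lands in [lo, hi], with the probability mass decaying geometrically from lo upward, which gives the exponential-style bias over an integer range.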
Alternatively, you can assign each integer j in [min, max] an exponential weight and sum all the weights; the probability of generating j is then the weight of j divided by the total weight. You can implement this with a Monte Carlo / weighted-sampling simulation.

Drawing floating-point numbers on [0, 1] from a uniform distribution using numpy

I'm currently trying to draw floating-point numbers from a uniform distribution.
Numpy provides numpy.random.uniform:
import numpy as np
sample = np.random.uniform(0, 1, size=(N,) + (2,) + (2,) * K)
However, this generates values over the half-open interval [0, 1).
How can I draw floating-point numbers on the closed interval [0, 1] from a uniform distribution?
Thanks.
It doesn't matter whether you're drawing the uniformly distributed numbers from (0,1), [0,1], [0,1), or (0,1]: the probability of getting exactly 0 or 1 is zero.
random_integers generates integers on a closed interval, so if you can recast your actual problem in terms of integers, you're all set. Otherwise, consider whether a granularity of 1./MAX_INT is sufficient for your problem.
From the standard Python random.uniform documentation :
The end-point value b may or may not be included in the range depending on floating-point rounding in the equation a + (b-a) * random().
So basically, the inclusion of the end point is strictly based on the floating-point rounding scheme used. Therefore, to include 1.0, you need to define the precision required by your operation and round the random number accordingly. If you do not have a defined precision for your problem, you can use numpy.nextafter. Its usage was covered by a previous answer.
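A sketch of the nextafter trick with numpy: pass the smallest double strictly greater than 1.0 as the (excluded) upper bound, so 1.0 itself becomes an attainable value:

```python
import numpy as np

high = np.nextafter(1.0, 2.0)      # smallest double strictly greater than 1.0
sample = np.random.uniform(0.0, high, size=1000)

# `high` itself is excluded, so every draw is <= 1.0, and 1.0 is now in range
assert high > 1.0
assert np.all((sample >= 0.0) & (sample <= 1.0))
```

In practice hitting exactly 1.0 is still vanishingly unlikely; this mainly matters when downstream code checks the closed-interval invariant.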
If your software depends on the difference between [0,1) and [0,1] then you should probably roll your own random number generator, possibly the one mentioned here in order to ensure that it meets these stringent requirements.
