Generating a 10000 bit random sequence - python

Is there a more efficient way to generate a 10 kBit (10,000 bits) random binary sequence in Python than appending 0s and 1s in a loop?

If you want a random binary sequence then it's probably quickest just to generate a random integer in the appropriate range:
import random
s = random.randint(0, 2**10000 - 1)
After this it really depends on what you want to do with your binary sequence. You can query individual bits using bitwise operations:
s & (1 << x) # is bit x set?
or you could use a library like bitarray or bitstring if you want to make checking, setting, slicing, etc. easier:
from bitstring import BitArray
b = BitArray(uint=s, length=10000)
p, = b.find('0b000000')
if b[99]:
b[100] = False
...
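The same integer-based approach can be written with random.getrandbits, which draws from the same distribution as the randint call above (a minimal sketch; the bit position x is arbitrary, for illustration only):

```python
import random

s = random.getrandbits(10000)  # same distribution as randint(0, 2**10000 - 1)
x = 42                         # an arbitrary bit position, for illustration
bit = (s >> x) & 1             # extract bit x as 0 or 1
assert 0 <= s < 2**10000
assert bit in (0, 1)
```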

The numpy package has a subpackage 'random' which can produce arrays of random numbers.
http://docs.scipy.org/doc/numpy/reference/routines.random.html
If you want an array of 'n' random bits, you can use
arr = numpy.random.randint(2, size=(n,))
... but depending on what you are doing with them, it may be more efficient to use e.g.
arr = numpy.random.randint(0x10000, size=(n,))
to get an array of 'n' numbers, each with 16 random bits; then
rstring = arr.astype(numpy.uint16).tobytes()
turns that into a bytes object of 2*n bytes, containing the same random bits. (In older NumPy this method was called tostring(); that name has since been deprecated in favour of tobytes().)
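As a sketch of that pipeline for 10,000 bits (625 words of 16 bits each; the variable names are illustrative):

```python
import numpy

n = 625  # 625 words * 16 bits = 10,000 random bits
arr = numpy.random.randint(0x10000, size=(n,))
rbytes = arr.astype(numpy.uint16).tobytes()  # tobytes() is the modern name for tostring()
assert len(rbytes) == 2 * n
```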

Here is a one-liner (plus the import):
import random
[random.randrange(2) for _ in range(10000)]
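On Python 3.6+, random.choices offers a similar one-liner that is often faster than calling randrange once per element (a minimal alternative sketch):

```python
import random

bits = random.choices((0, 1), k=10000)  # Python 3.6+
assert len(bits) == 10000
assert set(bits) <= {0, 1}
```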


numpy array of arbitrary precision (10 bit int)

I need to simulate a piece of hardware that generates binary files where each word is 10 bits. How can I achieve this with a numpy array?
Something like:
outarray = np.zeros(512, dtype=np.int10)
Thanks!
Numpy doesn't have a uint10 type. But you can use uint16 and a bitmask to check for overflow, and use binary_repr to get the 10 bit binary representations:
import numpy as np
MAX_WORD = 2**10
unused_bits = ~np.array([MAX_WORD-1], dtype="uint16") # Binary mask of the 6 unused bits
words = np.random.randint(MAX_WORD, size=10, dtype="uint16") # Create 10 bit words
assert not np.any(words & unused_bits) # Check for overflow
for word in words:
    print(word, np.binary_repr(word, width=10)) # Get 10 bit binary representation
binary_repr = "".join(np.binary_repr(word, width=10) for word in words)
print(binary_repr) # Full binary representation
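If the goal is to write those 10-bit words out as a packed binary file, as the question suggests, one way is np.packbits. This is a hedged sketch, not part of the original answer; the names are illustrative:

```python
import numpy as np

words = np.random.randint(0, 2**10, size=8, dtype="uint16")  # eight 10-bit words
# Expand each word into its 10 bits, most significant bit first
bits = ((words[:, None] >> np.arange(9, -1, -1)) & 1).astype(np.uint8)
packed = np.packbits(bits.ravel())  # 8 * 10 = 80 bits -> 10 bytes
data = packed.tobytes()             # ready to write to a file
assert len(data) == 10
```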
Another option you could consider, if you're mainly interested in understanding the accuracy of arithmetic operations on 10-bit numbers, is to use the spfpm package. This will simulate the effect of fixed-point arithmetic operations, including multiplication, division, square-roots, trigonometric functions, etc., but doesn't currently support matrix operations.

How to speedup binary transformation from integer values

I wrote the following method (in Python 2.7) that generates a set of integers and transforms them into binary representation. It takes two self-explanatory parameters: total_num_nodes and dim. It returns a numpy matrix containing the binary representation of all these integers:
def generate(total_num_nodes, dim):
    # Generate random nodes in the range (0, 2**dim - 1)
    nodes_matrix = [random.randint(0, 2 ** dim - 1) for _ in range(total_num_nodes)]
    # Remove duplicates
    nodes_matrix = list(set(nodes_matrix))
    # Transform each node from decimal to string representation
    nodes_matrix = [('{0:0' + str(dim) + 'b}').format(x) for x in nodes_matrix]
    # Transform each bit into an integer.
    nodes_matrix = np.asarray([list(map(int, list(x))) for x in nodes_matrix], dtype=np.uint8)
    return nodes_matrix
The problem is that when I pass very large values, say total_num_nodes=10,000,000 and dim=128, generation takes a really long time. A friend of mine hinted to me that the following line is actually the bottleneck, likely responsible for the majority of the computation time:
# Transforms each node from decimal to string representation
nodes_matrix = [('{0:0' + str(dim) + 'b}').format(x) for x in nodes_matrix]
I cannot think of a faster method to replace this line so that I can speed up generation on a single processor. Any suggestion is really appreciated.
Thank you
Do it all in numpy and it will be faster.
The following generates total_num_nodes rows of dim columns of np.uint8 data, then keeps only the unique rows by providing a view of the data suitable for np.unique, and finally translates back to a 2D array:
import numpy as np
def generate(total_num_nodes, dim):
    a = np.random.choice(np.array([0, 1], dtype=np.uint8), size=(total_num_nodes, dim))
    dtype = a.dtype.descr * dim
    temp = a.view(dtype)
    uniq = np.unique(temp)
    return uniq.view(a.dtype).reshape(-1, dim)
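On NumPy 1.13 and newer, np.unique accepts an axis argument, which removes the need for the structured-view trick entirely (a simpler equivalent sketch, with illustrative sizes):

```python
import numpy as np

a = np.random.randint(0, 2, size=(1000, 8), dtype=np.uint8)
uniq = np.unique(a, axis=0)  # unique rows directly (NumPy >= 1.13)
assert uniq.ndim == 2 and uniq.shape[1] == 8
assert len(uniq) <= 1000
```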

Generate a random number of len(10) containing digits only 0-1s

The question is self-explanatory.
I've tried this.
import random
number = "".join([str(random.randint(0,1)) for i in xrange(0,10)])
print number
Is there any in-built functionality for the same?
Either use:
''.join(random.choice('01') for _ in xrange(10))
which avoids the int->str conversion. Alternatively, use randrange with an exclusive upper bound of 2**10, then format the result as a binary string:
format(random.randrange(2**10), '010b')
You can also use getrandbits, specifying 10 as the number of bits, e.g.:
format(random.getrandbits(10), '010b')
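A quick sanity check that the getrandbits version always yields a 10-character string of 0s and 1s:

```python
import random

s = format(random.getrandbits(10), '010b')
assert len(s) == 10
assert set(s) <= {'0', '1'}
```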
Choose a random int in the range 0 to 1023 inclusive and format it in base 2 with a minimum width of 10, with leading 0s filled in.
format(random.randint(0,1023), '010b')
I’m going to throw in another solution that creates an actual (decimal) number containing only ones and zeros:
>>> import random, functools
>>> functools.reduce(lambda x, i: 10 * x + random.randrange(2), range(10), 0)
1011001010
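One caveat worth noting: when the first random bit is 0, the resulting decimal number prints with fewer than 10 digits; str(...).zfill(10) restores the fixed width (a small sketch):

```python
import random, functools

n = functools.reduce(lambda x, i: 10 * x + random.randrange(2), range(10), 0)
digits = str(n).zfill(10)  # re-pad, since a leading 0 bit shortens the decimal form
assert len(digits) == 10
assert set(digits) <= {'0', '1'}
```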
OR, to skip the decimal-binary transition :
from random import randint
def repeat(n):
    if n <= 0:
        return ''
    return str(randint(0, 1)) + repeat(n - 1)
and at the end just call repeat(10) or whatever number of bits you want.

Get the number of zeros and ones of a binary number in Python

I am trying to solve a binary puzzle. My strategy is to transform a grid into zeros and ones, and I want to make sure that every row has the same number of 0s and 1s.
Is there any way to count how many 1s and 0s a number has without iterating through the number?
What I am currently doing is:
def binary(num, length=4):
    return format(num, '#0{}b'.format(length + 2)).replace('0b', '')
n = binary(112, 8)
# '01110000'
and then
n.count('0')
n.count('1')
Is there any more efficient computational (or maths way) of doing that?
What you're looking for is the Hamming weight of a number. In a lower-level language, you'd probably use a nifty SIMD within a register trick or a library function to compute this. In Python, the shortest and most efficient way is to just turn it into a binary string and count the '1's:
def ones(num):
    # Note that bin is a built-in
    return bin(num).count('1')
You can get the number of zeros by subtracting ones(num) from the total number of digits.
def zeros(num, length):
    return length - ones(num)
Demonstration:
>>> bin(17)
'0b10001'
>>> # leading 0b doesn't affect the number of 1s
>>> ones(17)
2
>>> zeros(17, length=6)
4
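On Python 3.10 and newer, int.bit_count() computes the same population count natively, without building a string. A small sketch, guarded so it also runs on older versions:

```python
def ones(num):
    return bin(num).count('1')

assert ones(17) == 2
# int.bit_count() (Python 3.10+) returns the same value natively
if hasattr(int, "bit_count"):
    assert (17).bit_count() == ones(17)
```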
If the length is moderate (say less than 20), you can use a list as a lookup table.
It's only worth generating the list if you're doing a lot of lookups, but it seems you might in this case.
e.g. for 16-bit lookup tables of the zero and one counts, use this:
zeros = [format(n, '016b').count('0') for n in range(1<<16)]
ones = [format(n, '016b').count('1') for n in range(1<<16)]
20 bits still takes under a second to generate on this computer
Edit: this seems slightly faster:
zeros = [20 - bin(n).count('1') for n in range(1<<20)]
ones = [bin(n).count('1') for n in range(1<<20)]
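A quick check of the table approach, using smaller 8-bit tables for illustration:

```python
ones = [bin(n).count('1') for n in range(1 << 8)]
zeros = [8 - o for o in ones]
assert ones[0b10110000] == 3
assert zeros[0b10110000] == 5
```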

Python: Number ranges that are extremely large?

import time
from random import shuffle

val = long(raw_input("Please enter the maximum value of the range:")) + 1
start_time = time.time()
numbers = range(0, val)
shuffle(numbers)
I cannot find a simple way to make this work with extremely large inputs - can anyone help?
I saw a question like this - but I could not implement the range function they described in a way that works with shuffle. Thanks.
To get a random permutation of the range [0, n) in a memory-efficient manner, you could use numpy.random.permutation():
import numpy as np
numbers = np.random.permutation(n)
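A quick check of the permutation approach for a million elements (n chosen for illustration):

```python
import numpy as np

numbers = np.random.permutation(10**6)
assert numbers.shape == (10**6,)
assert numbers.min() == 0 and numbers.max() == 10**6 - 1
```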
If you need only small fraction of values from the range e.g., to get k random values from [0, n) range:
import random
from functools import partial
def sample(n, k):
    # assume n is much larger than k
    randbelow = partial(random.randrange, n)
    # from random.py
    result = [None] * k
    selected = set()
    selected_add = selected.add
    for i in range(k):
        j = randbelow()
        while j in selected:
            j = randbelow()
        selected_add(j)
        result[i] = j
    return result
print(sample(10**100, 10))
If you don't need the full list of numbers (and if you are getting billions, it's hard to imagine why you would need them all), you might be better off taking a random.sample of your number range rather than shuffling them all. In Python 3, random.sample can work on a range object too, so your memory use can be quite modest.
For example, here's code that will sample ten thousand random numbers from a range up to whatever maximum value you specify. It should require only a relatively small amount of memory beyond the 10000 result values, even if your maximum is 100 billion (or whatever enormous number you want):
import random
def get10kRandomNumbers(maximum):
    pop = range(1, maximum + 1)  # this is memory efficient in Python 3
    sample = random.sample(pop, 10000)
    return sample
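For example, Python 3's range is lazy, so even sampling from a trillion-element population stays cheap (a quick check; the bound is illustrative):

```python
import random

pop = range(1, 10**12 + 1)  # lazy in Python 3; no list is materialized
sample = random.sample(pop, 10000)
assert len(sample) == 10000
assert len(set(sample)) == 10000  # sample() draws without replacement
```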
Alas, this doesn't work as nicely in Python 2, since xrange objects don't allow maximum values greater than the system's integer type can hold.
An important point to note is that it will be impossible for a computer to have the list of numbers in memory if it is larger than a few billion elements: its memory footprint becomes larger than the typical RAM size (as it takes about 4 GB for 1 billion 32-bit numbers).
In the question, val is a long integer, which seems to indicate that you are indeed using more than a billion integers, so this cannot be done conveniently in memory (i.e., shuffling will be slow, as the operating system will swap).
That said, if the number of elements is small enough (say, smaller than 0.5 billion), then a list of elements can fit in memory thanks to the compact representation offered by the standard array module, and be shuffled:
import array, random
numbers = array.array('I', xrange(10**8)) # or 'L', if the number of bytes per item (numbers.itemsize) is too small with 'I'
random.shuffle(numbers)
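The same idea works in Python 3, where range replaces xrange (a smaller sketch with 10**6 elements):

```python
import array, random

numbers = array.array('I', range(10**6))  # compact unsigned-int storage
random.shuffle(numbers)
assert len(numbers) == 10**6
assert sorted(numbers) == list(range(10**6))  # same elements, new order
```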
