I need to perform a bitwise AND between two large binary sequences of the same length and then find the indices where the 1s appear in the result.
I used NumPy to do it and here is my code:
>>> c = numpy.array([[0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1],[0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1]]) #initialize 2d array
>>> c = c.all(axis=0)
>>> d = numpy.where(c)[0] #returns indices (the original [False] only worked because False == 0)
I checked the timings for it.
>>> print("Time taken to perform 'numpy.all' : ",timeit.timeit(lambda :c.all(axis=0),number=10000))
>>> Time taken to perform 'numpy.all' : 0.01454929300234653
This operation was slower than what I expected.
Then, to compare, I performed a basic bitwise '&' operation:
>>> print("Time taken to perform bitwise & :",timeit.timeit('a = 0b0000000001111111111100000001111111111; b = 0b0000000001111111111100000001111111111; c = a&b',number=10000))
>>> Time taken to perform bitwise & : 0.0004252859980624635
This is much quicker than NumPy.
I'm using numpy because it lets me find the indices of the 1s, but the numpy.all operation is much slower.
My original data will be an array just like in the first case. Will there be any repercussions if I convert this list into a binary number and then perform the computation as in the second case?
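For reference, a minimal sketch of the conversion being asked about (the variable names here are only illustrative):
import numpy as np

row = np.array([0, 0, 0, 1, 1, 0, 1])        # stand-in for one row of the 2-D array
n = int("".join(map(str, row.tolist())), 2)  # pack the bit list into one Python int
print(bin(n))                                # 0b1101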
I don't think you can beat the speed of a&b (the actual computation is just a handful of elementary CPU ops; I'm pretty sure the result of your timeit is >99% overhead). For example:
>>> from timeit import timeit
>>> import numpy as np
>>> import random
>>>
>>> k = 2**17-2
>>> a = random.randint(0, 2**k-1) + 2**k
>>> b = random.randint(0, 2**k-1) + 2**k
>>> timeit('a & b', globals=globals())
2.520026927930303
That's >100k bits and takes just ~2.5 µs per operation (timeit's default of 1,000,000 loops accounts for the ~2.5 s total).
In any case the cost of & will be dwarfed by the cost of generating the list or array of indices.
numpy comes with significant overhead itself, so for a simple operation like yours one needs to check whether it is worth it.
So let's try a pure python solution first:
>>> c = a & b
>>> timeit("[x for x, y in enumerate(bin(c), -2) if y=='1']", globals=globals(), number=1000)
7.905808186973445
That's ~8 ms and as anticipated several orders of magnitude more than the & operation.
How about numpy?
Let's port the list comprehension to numpy first (note that np.fromstring is deprecated in recent NumPy; np.frombuffer(bin(c).encode(), np.uint8) is the modern equivalent):
>>> timeit("np.where(np.fromstring(bin(c), np.uint8)[2:] - ord('0'))[0]", globals=globals(), number=1000)
1.0363857130287215
So in this case we get a ~8-fold speedup. This shrinks to ~4-fold if we require the result to be a list:
>>> timeit("np.where(np.fromstring(bin(c), np.uint8)[2:] - ord('0'))[0].tolist()", globals=globals(), number=1000)
1.9008758360287175
We can also let numpy do the binary conversion, which gives another small speedup:
>>> timeit("np.where(np.unpackbits(np.frombuffer(c.to_bytes(k//8+1, 'big'), np.uint8))[1:])[0]", globals=globals(), number=1000)
0.869781385990791
In summary:
numpy is not always faster, better leave the & to pure Python
locating nonzero bits seems fast enough in numpy to offset the cost of conversion between list and array
Please note that all this comes with the caveat that my pure Python code is not necessarily optimal. For example, using a lookup table that maps each possible byte value to the positions of its set bits, we can get a bit faster:
>>> lookup = [(np.where(i)[0]-1).tolist() for i in np.ndindex(*8*[2])]
>>> timeit("[(x<<3) + z for x, y in enumerate(c.to_bytes(k//8+1, 'big')) for z in lookup[y]]", globals=globals(), number=1000)
4.687953414046206
>>> c = numpy.random.randint(2, size=(2, 40)) #initialize
>>> c
array([[1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1,
0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0],
[1, 1, 1, 1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1,
0, 0, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 1]])
Accessing this gives you two slow-downs:
You have to access the two rows, whereas your bit-wise test has the constants readily available in registers.
You are performing a series of 40 AND operations, which may include casting from a full integer to a Boolean.
You severely handicapped the all test; the result is not a surprise (any more).
The factor you observe is a direct consequence of the fact that the numpy array c from the question stores each element as an int, and each int here is coded on 32 bits (the default integer width on that platform; it can be 64 bits elsewhere).
Therefore, when you call c.all(), you are operating on 37*32 = 1184 bits.
However, a = 0b0000000001111111111100000001111111111 is composed of 37 bits, so a&b operates on just 37 bits.
You are therefore doing something roughly 32 times more costly with the numpy array.
Let's test that
import timeit
import numpy as np
print("Time taken to perform bitwise & :",timeit.timeit('a=0b0000000001111111111100000001111111111; b = 0b0000000001111111111100000001111111111; c = a&b',number=320000))
a = 0b0000000001111111111100000001111111111
b = 0b0000000001111111111100000001111111111
c=np.array([a,b])
print("Time taken to perform 'numpy.all' : ",timeit.timeit(lambda :c.all(axis=0),number=10000))
The & operation I run 320000 times and the all() operation 10000 times, matching the 32x difference in bits.
Time taken to perform bitwise & : 0.01527938833025152
Time taken to perform 'numpy.all' : 0.01583387375572265
It's the same thing!
Now, back to your initial problem: you want to know the indices where bits are 1 in a large binary number.
Maybe you could try what the bitarray module provides:
import bitarray

a = bitarray.bitarray('0000000001111111111100000001111111111')
b = bitarray.bitarray('0000000001111111111100000001111111111')

data = []
for i, c in enumerate(a & b):  # iterate over the ANDed bits together with their indices
    if c:
        data.append(i)
print(data)
outputs
[9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36]
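Depending on your bitarray version, the module can also locate the set bits for you: search takes a sub-bitarray and returns the positions where it occurs (newer versions return an iterator, so the list() wrapper below covers both cases). A sketch, worth checking against your installed version's API:
import bitarray

a = bitarray.bitarray('0000000001111111111100000001111111111')
b = bitarray.bitarray('0000000001111111111100000001111111111')
data = list((a & b).search(bitarray.bitarray('1')))  # positions of all set bits
print(data)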
Related
So I'm doing the foo.bar challenge, and I've got code in Python that outputs the required answers. I know for a fact that, for at least the first two test cases, my output matches their output, but it still fails all of them. I assumed it could be because it's running on Python 2.7.13, so I found an online sandbox that runs that version of Python, but my code still outputs the required output there too. I've tried using the print function to output the results, and I've tried formatting the results as lists and arrays, but none of this worked. The question is below:
Doomsday Fuel
Making fuel for the LAMBCHOP's reactor core is a tricky process
because of the exotic matter involved. It starts as raw ore, then
during processing, begins randomly changing between forms, eventually
reaching a stable form. There may be multiple stable forms that a
sample could ultimately reach, not all of which are useful as fuel.
Commander Lambda has tasked you to help the scientists increase fuel
creation efficiency by predicting the end state of a given ore sample.
You have carefully studied the different structures that the ore can
take and which transitions it undergoes. It appears that, while
random, the probability of each structure transforming is fixed. That
is, each time the ore is in 1 state, it has the same probabilities of
entering the next state (which might be the same state). You have
recorded the observed transitions in a matrix. The others in the lab
have hypothesized more exotic forms that the ore can become, but you
haven't seen all of them.
Write a function solution(m) that takes an array of array of
nonnegative ints representing how many times that state has gone to
the next state and return an array of ints for each terminal state
giving the exact probabilities of each terminal state, represented as
the numerator for each state, then the denominator for all of them at
the end and in simplest form. The matrix is at most 10 by 10. It is
guaranteed that no matter which state the ore is in, there is a path
from that state to a terminal state. That is, the processing will
always eventually end in a stable state. The ore starts in state 0.
The denominator will fit within a signed 32-bit integer during the
calculation, as long as the fraction is simplified regularly.
For example, consider the matrix m:
m = [
  [0,1,0,0,0,1],  # s0, the initial state, goes to s1 and s5 with equal probability
  [4,0,0,3,2,0],  # s1 can become s0, s3, or s4, but with different probabilities
  [0,0,0,0,0,0],  # s2 is terminal, and unreachable (never observed in practice)
  [0,0,0,0,0,0],  # s3 is terminal
  [0,0,0,0,0,0],  # s4 is terminal
  [0,0,0,0,0,0],  # s5 is terminal
]
So, we can consider different paths to terminal states, such as:
s0 -> s1 -> s3
s0 -> s1 -> s0 -> s1 -> s0 -> s1 -> s4
s0 -> s1 -> s0 -> s5
Tracing the probabilities of each, we find that:
s2 has probability 0
s3 has probability 3/14
s4 has probability 1/7
s5 has probability 9/14
So, putting that together, and making a common denominator, gives an answer in the form of [s2.numerator, s3.numerator, s4.numerator, s5.numerator, denominator] which is [0, 3, 2, 9, 14].
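These fractions can be checked by hand: the only cycle is s0 -> s1 -> s0, taken with probability (1/2)(4/9) = 2/9 per round trip, so each terminal probability P satisfies a small fixed-point equation:
P(s3) = (1/2)(3/9) + (2/9)P(s3)  =>  P(s3) = (1/6)/(7/9) = 3/14
P(s4) = (1/2)(2/9) + (2/9)P(s4)  =>  P(s4) = (1/9)/(7/9) = 1/7
P(s5) = (1/2)      + (2/9)P(s5)  =>  P(s5) = (1/2)/(7/9) = 9/14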
Languages
To provide a Java solution, edit Solution.java
To provide a Python solution, edit solution.py
Test cases
==========
Your code should pass the following test cases. Note that it may also be run against hidden test cases not shown here.
-- Java cases --
Input: Solution.solution({{0, 2, 1, 0, 0}, {0, 0, 0, 3, 4}, {0, 0, 0, 0, 0}, {0, 0, 0, 0, 0}, {0, 0, 0, 0, 0}})
Output: [7, 6, 8, 21]
Input: Solution.solution({{0, 1, 0, 0, 0, 1}, {4, 0, 0, 3, 2, 0}, {0, 0, 0, 0, 0, 0}, {0, 0, 0, 0, 0, 0}, {0, 0, 0, 0, 0, 0}, {0, 0, 0, 0, 0, 0}})
Output: [0, 3, 2, 9, 14]
-- Python cases --
Input: solution.solution([[0, 2, 1, 0, 0], [0, 0, 0, 3, 4], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]])
Output: [7, 6, 8, 21]
Input: solution.solution([[0, 1, 0, 0, 0, 1], [4, 0, 0, 3, 2, 0], [0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0]])
Output: [0, 3, 2, 9, 14]
my code is below:
import numpy as np
from fractions import Fraction
from math import gcd

def solution(M):
    height = len(M)
    length = len(M[0])
    M = np.array(M)
    AB = []
    # Find B
    for i in range(0, height):
        # if B = 1
        if sum(M[:, 0]) == 0:
            sumB = 1
        if M[i, 0] != 0:
            B1 = Fraction(M[i, 0], sum(M[i]))
            B2 = Fraction(M[0, i], sum(M[0]))
            B = B1 * B2
            # Find sum(B) to infinity
            sumB = 1 / (1 - B)
    # Find A
    boolean2 = 0
    count = 0
    index = []
    for i in range(0, height):
        if sum(M[i]) == 0:
            if boolean2 == 0:
                terminalstart = i
            boolean = 0
            boolean2 = 1
            for j in range(0, height):
                # if there is no A
                if j == height - 1 and boolean == 0:
                    index.append(i - terminalstart)
                    count += 1
                if M[j, i] != 0:
                    boolean = 1
                    A1 = Fraction(M[j, i], sum(M[j]))
                    A = A1
                    if j != 0:
                        A2 = Fraction(M[0, j], sum(M[0]))
                        A = A1 * A2
                    # Find AB
                    AB.append(A * sumB)
    # Find common denominators
    x = []
    y = []
    for i in range(0, len(AB)):
        x.append(AB[i].denominator)
    # compute the lcm of the denominators
    lcm = 1
    for i in x:
        lcm = lcm * i // gcd(lcm, i)
    # change numerators to fit
    for i in range(0, len(AB)):
        z = lcm / x[i]
        z = float(z)
        y.append(int(AB[i].numerator * z))
    # insert 0s
    for i in range(0, count):
        y.insert(index[i], 0)
    # insert denominator
    y.append(lcm)
    return y
So the code and the question are basically irrelevant; the main point is, my output (y) is exactly the same as the output in the examples, but when it runs in foo.bar it fails. To test this I used code that simply returned the desired output in foo.bar, and it worked for the test case that had this output:
def solution(M):
    y = [0, 3, 2, 9, 14]
    return y
So I know that since my code gets to the exact same array and data type for y in the Python IDE, it should work in Google foo.bar, but for some reason it's not. Any help would be greatly appreciated.
edit:
I found a code online that works:
import numpy as np

# Returns indexes of active & terminal states
def detect_states(matrix):
    active, terminal = [], []
    for rowN, row in enumerate(matrix):
        (active if sum(row) else terminal).append(rowN)
    return (active, terminal)

# Convert elements of array in simplest form
def simplest_form(B):
    B = B.round().astype(int).A1  # np.matrix --> np.array
    gcd = np.gcd.reduce(B)
    B = np.append(B, B.sum())  # append the common denom
    return (B / gcd).astype(int)

# Finds solution by calculating absorbing probabilities
def solution(m):
    active, terminal = detect_states(m)
    if 0 in terminal:  # special case when s0 is terminal
        return [1] + [0]*len(terminal[1:]) + [1]
    m = np.matrix(m, dtype=float)[active, :]  # list --> np.matrix (active states only)
    comm_denom = np.prod(m.sum(1))  # product of sums of all active rows (used later)
    P = m / m.sum(1)  # divide by row sums to convert to a probability matrix
    Q, R = P[:, active], P[:, terminal]  # separate Q & R
    I = np.identity(len(Q))
    N = (I - Q) ** (-1)  # calc fundamental matrix
    B = N[0] * R * comm_denom / np.linalg.det(N)  # get absorbing probs & get them close to some integer
    return simplest_form(B)
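A quick sanity check against the second sample case from the problem statement:
print(solution([[0, 1, 0, 0, 0, 1], [4, 0, 0, 3, 2, 0], [0, 0, 0, 0, 0, 0],
                [0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0]]))
# prints [ 0  3  2  9 14]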
When I compared the final answer from this working code to mine by adding the lines:
print(simplest_form(B))
print(type(simplest_form(B)))
this is what I got
[ 0 3 2 9 14]
<class 'numpy.ndarray'>
array([ 0, 3, 2, 9, 14])
When I added the lines
y = np.asarray(y)
print(y)
print(type(y))
to my code this is what I got:
[ 0 3 2 9 14]
<class 'numpy.ndarray'>
array([ 0, 3, 2, 9, 14])
when they were both running the same test input. These are the exact same but for some reason mine doesn't work on foo.bar but his does. Am I missing something?
It turns out the
math.gcd(x, y)
function does not exist in Python 2 (it was only added in Python 3.5). I just rewrote it as this:
def grcd(x, y):
    # brute force: count down from the smaller input;
    # the first value that divides both x and y is the gcd
    small = min(x, y)
    while small > 1:
        if x % small == 0 and y % small == 0:
            return small
        small -= 1
    return 1
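For what it's worth, the classic Euclidean algorithm is a much faster replacement that also works in Python 2 (a standard textbook version, not from the original post; the name gcd2 is just to avoid clashing with math.gcd):
def gcd2(x, y):
    # repeatedly replace (x, y) with (y, x % y); the last nonzero value is the gcd
    while y:
        x, y = y, x % y
    return x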
I'm experimenting with a genetic search algorithm, and after building the initial population at random and then selecting the two fittest entries, I need to 'mate' them (with some random mutation) to create 64 'children'. The crossover part, explained here:
https://towardsdatascience.com/introduction-to-genetic-algorithms-including-example-code-e396e98d8bf3
seems easy to follow, but I can't seem to figure out how to implement it in Python. How can I implement this crossover of two integers?
def crossover(a, b, index):
    return b[:index] + a[index:], a[:index] + b[index:]
Should be quite a bit faster than James' solution, since this one lets Python do all the work!
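If your genomes really are packed into plain integers rather than lists, the same idea can be expressed with bit masks. A sketch, assuming the crossover point is counted from the least significant bit (the function name and convention are illustrative):
def crossover_ints(a, b, index):
    # swap the low `index` bits of a and b, keeping the high bits in place
    mask = (1 << index) - 1
    return (a & ~mask) | (b & mask), (b & ~mask) | (a & mask)

# e.g. crossover_ints(0b1111, 0b0000, 2) returns (0b1100, 0b0011)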
Here is a function called crossover that takes two parents and a crossover point. The parents should be lists of integers of the same length. The crossover point is the point before which genes get exchanged, as defined in the article that you linked to.
It returns the two offspring of the parents.
def crossover(a, b, crossover_point):
    a1 = a[:]
    b1 = b[:]
    for i in range(crossover_point):
        a1[i], b1[i] = b1[i], a1[i]
    return [a1, b1]
And here is some code that demonstrates its usage. It creates a population consisting of two lists of length 10, one with only zeros, and the other with only ones. It crosses them over at point 4, and adds the children to the population.
def test_crossover():
    a = [0]*10
    b = [1]*10
    population = [a, b]
    population += crossover(a, b, 4)
    return population

print(test_crossover())
The output of the above is:
[
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
]
Currently I have the following version but it's my main bottleneck and it's quite slow.
def intToBinary(Input):
    bStrInput = format(Input, "016b")
    bStrInput = list(bStrInput)
    bInput = list(map(int, bStrInput))
    return bInput
any ideas how to speed up this code?
I'm using this in a Tensorflow project, for hot-encoding conversion of integers. The function takes a 2-byte integer (in the range [0, 65536)) and outputs a list of integers with values 0 and 1:
>>> intToBinary(50411)
[1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 0, 1, 1]
The result is passed to a tensor with tInput = torch.tensor(bInput, dtype=torch.uint8).
Your version, improved slightly
Your version can avoid a few opcodes by not using intermediary variables and a conversion to list:
def intToBinary(Input):
    return list(map(int, format(Input, "016b")))
Pure Python still, bitshifting option
However, you can make it faster by not converting to string then to integers again. If all you need is bits, then use bit manipulation:
def int_to_binary(v):
    return [(v >> i) & 1 for i in range(15, -1, -1)]
This shifts the bits of the input integer to the right by 15, 14, etc. steps, then masks out that shifted integer with 1 to get the bit value for the right-most bit each time.
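As a quick check, this reproduces the example output from the question:
>>> int_to_binary(50411)
[1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 0, 1, 1]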
Speed comparison, using 1000 random integers to reduce variance to acceptable levels:
>>> import sys, platform, psutil
>>> sys.version_info
sys.version_info(major=3, minor=7, micro=0, releaselevel='final', serial=0)
>>> platform.platform(), psutil.cpu_freq().current / 1000, psutil.cpu_count(), psutil.virtual_memory().total // (1024 ** 3)
('Darwin-17.7.0-x86_64-i386-64bit', 2.9, 8, 16)
>>> from timeit import Timer
>>> from random import randrange
>>> testvalues = [randrange(2**16) for _ in range(1000)]
>>> count, total = Timer("for i in tv: t(i)", "from __main__ import intToBinary as t, testvalues as tv").autorange()
>>> (total / count) * (10 ** 3) # milliseconds
3.2812212200224167
>>> count, total = Timer("for i in tv: t(i)", "from __main__ import int_to_binary as t, testvalues as tv").autorange()
>>> (total / count) * (10 ** 3) # milliseconds
2.2861225200176705
So int_to_binary() is about 1.4 times as fast, at roughly 2.3 milliseconds to produce 1000 results versus about 3.3 for the optimised string manipulation version.
The base loop and function call takes 7.4 microseconds on my machine:
>>> count, total = Timer("for i in tv: pass", "from __main__ import testvalues as tv; t = lambda i: None").autorange()
>>> (total / count) * (10 ** 3)
0.007374252940062434
so the base per-call timings are about 3.27 microseconds vs 2.28 microseconds for the bit-manipulation version.
What can Numpy do
If you are using Tensorflow, you'll also have numpy operations available, which can convert uint8 to binary using the numpy.unpackbits() function; uint16 needs to be 'viewed' as uint8 first:
import numpy as np
def int_to_bits_np(v):
    # use an explicitly big-endian uint16 ('>u2') so the most significant byte
    # is unpacked first regardless of the machine's native byte order
    return np.unpackbits(np.array([v], dtype='>u2').view(np.uint8)).tolist()
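Checking it against the example value from the question:
>>> int_to_bits_np(50411)
[1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 0, 1, 1]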
This converts to numpy array, back to a list of Python integers again, so is not that efficient on just one value:
>>> count, total = Timer("for i in tv: t(i)", "from __main__ import int_to_bits_np as t, testvalues as tv").autorange()
>>> (total / count) * (10 ** 3)
2.654717969999183
Faster than your version, slower than bitshifting.
Numpy vectorised option
You probably want to not convert back to a list, since the numpy array already has the right dtype for your tensor here. You would also use this on a large number of values; such as the whole 1000 integers in the input:
def int_to_bits_array(varray):
    """Convert an array of uint16 values to binary, one row of 16 bits per value"""
    # cast to big-endian uint16 first so each row unpacks MSB-first on any machine
    return np.unpackbits(varray.reshape(varray.shape[0], 1).astype('>u2').view(np.uint8), axis=1)
which is way, way, way faster:
>>> testvalues_array = np.array(testvalues, dtype=np.uint16)
>>> int_to_bits_array(testvalues_array)
array([[1, 1, 0, ..., 1, 1, 0],
[0, 1, 1, ..., 1, 0, 0],
[1, 1, 1, ..., 0, 0, 0],
...,
[1, 1, 1, ..., 0, 1, 0],
[0, 0, 0, ..., 1, 1, 0],
[0, 0, 0, ..., 0, 0, 0]], dtype=uint8)
>>> count, total = Timer("t(tva)", "from __main__ import int_to_bits_array as t, testvalues_array as tva").autorange()
>>> (total / count) * (10 ** 3) # milliseconds
0.007919690339913358
>>> (total / count) * (10 ** 6) # microseconds
7.919690339913359
Yes, that's 1000 values converted to binary in one step, processing all values in 8 microseconds. This scales up linearly to larger numbers; 1 million random values are converted in under 8 milliseconds:
>>> million_testvalues_array = np.random.randint(2 ** 16, size=10 ** 6, dtype=np.uint16)
>>> count, total = Timer("t(tva)", "from __main__ import int_to_bits_array as t, million_testvalues_array as tva").autorange()
>>> (total / count) * (10 ** 3) # milliseconds
7.9162722200271665
I have a binary file 10 MB in size, and I want to read it bit by bit. In Python/NumPy, as far as I know, we cannot read data bit by bit, only byte by byte. So, to read the data bit by bit, I first read the file using the np.fromfile function and then unpack each byte into 8 bits using the np.unpackbits function. Here is how I did it:
fbyte = np.fromfile(binar_file, dtype='uint8')
fbit = np.unpackbits(fbyte)
What I get in fbit is a long binary sequence, but with the order within every 8 bits reversed (MSB - LSB), e.g. 10010011..., whereas what I actually expected is LSB - MSB order, like 11001001. Using a for loop to flip the order of every 8 bits would solve the problem, but it would take some time, which I would like to avoid since I want to read thousands of files. So my question is: is there any way to unpack the bytes into bits directly in LSB - MSB order? Just as a comparison, this is easy in Matlab, since its fread function lets me specify a bit configuration, e.g. 'ubit1' for reading bit by bit, and the result is in the order I expect (LSB - MSB). Any help/hints would be appreciated. Thanks.
You could simply reshape to 2D keeping 8 columns and then flip those, like so -
np.unpackbits(fbyte).reshape(-1,8)[:,::-1]
Sample run -
In [1176]: fbyte
Out[1176]: array([253, 35, 198, 182, 62], dtype=uint8)
In [1177]: np.unpackbits(fbyte).reshape(-1,8)[:,::-1]
Out[1177]:
array([[1, 0, 1, 1, 1, 1, 1, 1],
[1, 1, 0, 0, 0, 1, 0, 0],
[0, 1, 1, 0, 0, 0, 1, 1],
[0, 1, 1, 0, 1, 1, 0, 1],
[0, 1, 1, 1, 1, 1, 0, 0]], dtype=uint8)
Timings on one million elements array -
In [1173]: fbyte = np.random.randint(0,255,(1000000)).astype(np.uint8)
In [1174]: %timeit np.unpackbits(fbyte).reshape(-1,8)[:,::-1]
1000 loops, best of 3: 541 µs per loop
Seems crazy fast to me!
In NumPy 1.17 and newer, unpackbits accepts a bitorder parameter that will accomplish this -- just pass bitorder="little" to the np.unpackbits call.
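A minimal sketch of the updated call, reusing the variables from the question:
import numpy as np

fbyte = np.fromfile(binar_file, dtype='uint8')
fbit = np.unpackbits(fbyte, bitorder='little')  # LSB-first within each byte, no reshape needed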
I'm looking to quickly (hopefully without a for loop) generate a Numpy array of the form:
array([a,a,a,a,0,0,0,0,0,b,b,b,0,0,0, c,c,0,0....])
Where a, b, c and other values are repeated at different points for different ranges. I'm really thinking of something like this:
import numpy as np
a = np.zeros(100)
a[0:3,9:11,15:16] = np.array([a,b,c])
Which obviously doesn't work. Any suggestions?
Edit (jterrace answered the original question):
The data is coming in the form of an N*M Numpy array. Each row is mostly zeros, occasionally interspersed by sequences of non-zero numbers. I want to replace all elements of each such sequence with the last value of the sequence. I'll take any fast method to do this! Using where and diff a few times, we can get the start and stop indices of each run.
raw_data = array([.....][....])
starts = array([0,0,0,1,1,1,1...][3, 9, 32, 7, 22, 45, 57,....])
stops = array([0,0,0,1,1,1,1...][5, 12, 50, 10, 30, 51, 65,....])
last_values = raw_data[stops]
length_to_repeat = stops[1]-starts[1]
Note that starts[0] and stops[0] are the same information (which row the run is occurring on). At this point, since the only route I know of is what jterrace suggest, we'll need to go through some contortions to get similar start/stop positions for the zeros, then interleave the zero start/stop with the values start/stops, and interleave the number 0 with the last_values array. Then we loop over each row, doing something like:
for i in range(N):
    values_in_this_row = where(starts[0] == i)[0]
    output[i] = numpy.repeat(last_values[values_in_this_row], length_to_repeat[values_in_this_row])
Does that make sense, or should I explain some more?
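For what it's worth, here is one hedged sketch of the 'replace each nonzero run with its last value' step for a single row, finding run boundaries along the lines described above (the function name is illustrative, and it loops over runs rather than elements):
import numpy as np

def fill_runs_with_last(row):
    # replace every contiguous nonzero run in `row` with that run's last value
    row = row.copy()
    nz = row != 0
    starts = np.where(nz & ~np.concatenate(([False], nz[:-1])))[0]  # nonzero here, zero before
    stops = np.where(nz & ~np.concatenate((nz[1:], [False])))[0]    # nonzero here, zero after
    for s, e in zip(starts, stops):
        row[s:e + 1] = row[e]
    return row

# fill_runs_with_last(np.array([0, 1, 2, 3, 0, 0, 5, 6, 0]))
# -> array([0, 3, 3, 3, 0, 0, 6, 6, 0])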
If you have the values and repeat counts fully specified, you can do it this way:
>>> import numpy
>>> values = numpy.array([1,0,2,0,3,0])
>>> counts = numpy.array([4,5,3,3,2,2])
>>> numpy.repeat(values, counts)
array([1, 1, 1, 1, 0, 0, 0, 0, 0, 2, 2, 2, 0, 0, 0, 3, 3, 0, 0])
You can use numpy.r_ (here with a = 1, b = 2, c = 3):
>>> np.r_[[a]*4,[b]*3,[c]*2]
array([1, 1, 1, 1, 2, 2, 2, 3, 3])