I am aware that the problem of partitioning an integer is old and there are many questions and answers about it here on SO, but after searching extensively I haven't found exactly what I am looking for. To be fair, my solution is not too bad, but I'd like to know whether there is a faster/better way of doing the following:
I need to partition an integer into a fixed-length partition that may include the value 0, and where each "position" in the partition is subject to a max possible value. For example:
>>> list(partition(number = 5, max_vals = (1,0,3,4)))
[(1, 0, 3, 1),
(1, 0, 2, 2),
(1, 0, 0, 4),
(1, 0, 1, 3),
(0, 0, 1, 4),
(0, 0, 2, 3),
(0, 0, 3, 2)]
My solution is the following:
from collections import Counter
from itertools import combinations

def partition(number: int, max_vals: tuple):
    S = set(combinations((k for i, val in enumerate(max_vals) for k in [i]*val), number))
    for s in S:
        c = Counter(s)
        yield tuple([c[n] for n in range(len(max_vals))])
Essentially, I first create "tokens" for each slot, then I combine the right number of them, and finally I count how many fall into each slot.
I don't particularly like having to instantiate a Counter for each partition, but what I dislike most is that combinations generates many more tuples than needed, and I then discard all of the duplicates with set(), which seems quite inefficient. Is there a better way?
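To illustrate the "tokens" idea on the example above (a small sketch of the same construction used inside the combinations call): slot index i is repeated max_vals[i] times, and choosing number of these tokens selects one partition.

max_vals = (1, 0, 3, 4)
# Slot index i appears max_vals[i] times; choosing 5 tokens picks a partition.
tokens = [i for i, val in enumerate(max_vals) for _ in range(val)]
print(tokens)  # [0, 2, 2, 2, 3, 3, 3, 3]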
Even though there must be better algorithms, a relatively simple and fast solution using itertools.product is:
>>> from itertools import product
>>> def partition_2(number: int, max_vals: tuple):
...     return (comb for comb in
...             product(*(range(min(number, i) + 1) for i in max_vals))
...             if sum(comb) == number)
>>> list(partition_2(number = 5, max_vals = (1,0,3,4)))
[(0, 0, 1, 4),
(0, 0, 2, 3),
(0, 0, 3, 2),
(1, 0, 0, 4),
(1, 0, 1, 3),
(1, 0, 2, 2),
(1, 0, 3, 1)]
Performance:
>>> %timeit list(partition(number = 15, max_vals = (1,0,3,4)*3))
155 ms ± 681 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
>>> %timeit list(partition_2(number = 15, max_vals = (1,0,3,4)*3))
14.7 ms ± 763 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
################################################################################
>>> %timeit list(partition(number = 5, max_vals = (10,20,30,10,10)))
1.17 s ± 26.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
>>> %timeit list(partition_2(number = 5, max_vals = (10,20,30,10,10)))
1.21 ms ± 28.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
#################################################################################
>>> %timeit list(partition_2(number = 35, max_vals = (8,9,10,11,12)))
23.2 ms ± 697 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
>>> %timeit list(partition(number = 35, max_vals = (8,9,10,11,12)))
# Will update when/if it finishes :)
A recursive function is usually an elegant way of approaching this kind of problem:
def partition(N, slots):
    if len(slots) == 1:
        if slots[0] >= N: yield [N]
        return
    for s in range(min(N, slots[0]) + 1):
        yield from ([s] + p for p in partition(N - s, slots[1:]))
for part in partition(5,[1,0,3,4]): print(part)
[0, 0, 1, 4]
[0, 0, 2, 3]
[0, 0, 3, 2]
[1, 0, 0, 4]
[1, 0, 1, 3]
[1, 0, 2, 2]
[1, 0, 3, 1]
This can be further optimized by checking the remaining space at each recursion level and short-circuiting the traversal when the remaining slots are insufficient for the number to spread:
def partition(N, slots, space=None):
    if space is None: space = sum(slots)
    if N > space: return
    if len(slots) == 1:
        if slots[0] >= N: yield [N]
        return
    for s in range(min(N, slots[0]) + 1):
        yield from ([s] + p for p in partition(N - s, slots[1:], space - slots[0]))
This optimization improves performance in scenarios where the number of solutions is smaller than the full product of all slot ranges. It is slower than the iterative approach in cases where most slot combinations are valid.
from timeit import timeit
t = timeit(lambda:list(partition(45,(8,9,10,11,12))),number=1)
print(t) # 0.000679596
t = timeit(lambda:list(partition_2(45,(8,9,10,11,12))),number=1)
print(t) # 0.027492302 (Sayandip's)
t = timeit(lambda:list(partition(15,(1,0,3,4)*3)),number=1)
print(t) # 0.024383259
t = timeit(lambda:list(partition_2(15,(1,0,3,4)*3)),number=1)
print(t) # 0.018362536
To get systematically better performance from the recursive approach, we would need to limit the depth of recursion. This can be done by approaching the problem differently: if we split the slots into two groups and determine the distribution between the two combined slots (left and right), we can then apply the partition to each side and combine the results. This only recurses to a depth of log2 of the number of slots and combines large chunks together instead of adding values one at a time:
from itertools import product

def partition(N, slots, space=None):
    if space is not None and N > space: return
    if len(slots) == 1:
        if slots[0] >= N: yield [N]
        return
    if len(slots) == 2:
        for left in range(max(0, N - slots[1]), min(N, slots[0]) + 1):
            yield [left, N - left]
        return
    leftSlots = slots[:len(slots)//2]
    rightSlots = slots[len(slots)//2:]
    leftSpace, rightSpace = sum(leftSlots), sum(rightSlots)
    for leftN, rightN in partition(N, [leftSpace, rightSpace], leftSpace + rightSpace):
        partLeft = partition(leftN, leftSlots, leftSpace)
        partRight = partition(rightN, rightSlots, rightSpace)
        for leftSide, rightSide in product(partLeft, partRight):
            yield leftSide + rightSide
The performance improvement is then systematic across all scenarios:
t = timeit(lambda:list(partition(45,(8,9,10,11,12))),number=1)
print(t) # 0.00017742
t = timeit(lambda:list(partition_2(45,(8,9,10,11,12))),number=1)
print(t) # 0.02895038
t = timeit(lambda:list(partition(15,(1,0,3,4)*3)),number=1)
print(t) # 0.00338676
t = timeit(lambda:list(partition_2(15,(1,0,3,4)*3)),number=1)
print(t) # 0.02025453
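As a quick sanity check (a sketch, not part of the benchmark), the divide-and-conquer version yields the same partitions for the original example, up to ordering:

# Same partitions as the original example, up to ordering.
result = sorted(tuple(p) for p in partition(5, [1, 0, 3, 4]))
print(result)
# [(0, 0, 1, 4), (0, 0, 2, 3), (0, 0, 3, 2),
#  (1, 0, 0, 4), (1, 0, 1, 3), (1, 0, 2, 2), (1, 0, 3, 1)]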
Related
There is a random 1D array m_0:
np.array([0, 1, 2])
I need to generate two 1D arrays:
np.array([0, 1, 2, 0, 1, 2, 0, 1, 2])
np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])
Is there a faster way to do it than this one:
import numpy as np
import time
N = 3
m_0 = np.arange(N)
t = time.time()
m_1 = np.tile(m_0, N)
m_2 = np.repeat(m_0, N)
t = time.time() - t
Size of m_0 is 10**3
You could use itertools.product to form the Cartesian product of m_0 with itself, then take the result apart again to get your two arrays.
import numpy as np
from itertools import product
N = 3
m_0 = np.arange(N)
m_2, m_1 = map(np.array, zip(*product(m_0, m_0)))
# m_1 is now array([0, 1, 2, 0, 1, 2, 0, 1, 2])
# m_2 is now array([0, 0, 0, 1, 1, 1, 2, 2, 2])
However, for large N this is probably quite a bit less performant than your solution, as it probably can't use many of NumPy's SIMD optimizations.
For alternatives and comparisons, you'll probably want to look at the answers to Cartesian product of x and y array points into single array of 2D points.
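If you want to stay fully inside NumPy, here is a meshgrid-based sketch (my own suggestion, not from the linked question) that produces both patterns in vectorized form:

import numpy as np

N = 3
m_0 = np.arange(N)
# meshgrid builds the full Cartesian grid; ravel() flattens each axis back
# into the tile-like and repeat-like orders.
X, Y = np.meshgrid(m_0, m_0)
m_1 = X.ravel()  # array([0, 1, 2, 0, 1, 2, 0, 1, 2])
m_2 = Y.ravel()  # array([0, 0, 0, 1, 1, 1, 2, 2, 2])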
I guess you could try reshape:
>>> np.reshape([m_0]*3, (-1,), order='C')
array([0, 1, 2, 0, 1, 2, 0, 1, 2])
>>> np.reshape([m_0]*3, (-1,), order='F')
array([0, 0, 0, 1, 1, 1, 2, 2, 2])
It should be a tiny bit faster for larger arrays.
>>> m_0 = np.random.randint(0, 10**3, size=(10**3,))
>>> %timeit np.tile([m_0]*10**3, N)
5.85 ms ± 138 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
>>> %timeit np.reshape([m_0]*10**3, (-1,), order='C')
1.94 ms ± 46.1 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
You can slightly improve speed if you reuse your first variable to create the second.
N=1000
%timeit t = np.arange(N); a = np.tile(t, N); b = np.repeat(t, N)
%timeit t = np.arange(N); a = np.tile(t, N); b = np.reshape(a.reshape((N,N)),-1,'F')
7.55 ms ± 46.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
5.54 ms ± 23.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
If you insist on speeding it up further, you can specify the dtype of your array.
%timeit t = np.arange(N,dtype=np.uint16); a = np.tile(t, N); b = np.repeat(t, N)
%timeit t = np.arange(N,dtype=np.uint16); a = np.tile(t, N); b = np.reshape(a.reshape((N,N)),-1,'F')
6.03 ms ± 587 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
3.2 ms ± 37.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Be sure to keep the data type limit in mind.
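For instance, a quick illustration of that caveat: uint16 tops out at 65535, so larger values wrap around silently.

import numpy as np

t = np.arange(70000)
# Casting to uint16 wraps values modulo 65536 rather than raising an error.
print(t.astype(np.uint16)[-1])  # 4463, not 69999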
I'm looking for a numpy equivalent of my suboptimal Python code. The calculation I want to do can be summarized as:
the average of the peak of each section, for each row.
Here is the code with a sample array and list of indices. Sections can be of different sizes.
x = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8]])
indices = [2]

result = np.empty((1, x.shape[0]))
for i, row in enumerate(x):
    splited = np.array_split(row, indices)
    peak = [np.amax(a) for a in splited]
    result[0, i] = np.average(peak)
Which gives: result = array([[3., 7.]])
What is the optimized numpy way to suppress both loops?
You can drop the for loop and use the axis argument instead:
result2 = np.mean([np.max(arr, 1) for arr in np.array_split(x, indices, 1)], axis=0)
Output:
array([3., 7.])
Benchmark:
x_large = np.array([[1, 2, 3, 4],
[5, 6, 7, 8]] * 1000)
%%timeit
result = []
for row in x_large:
    splited = np.array_split(row, indices)
    peak = [np.amax(a) for a in splited]
    result.append(np.average(peak))
# 29.9 ms ± 177 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit np.mean([np.max(arr, 1) for arr in np.array_split(x_large, indices, 1)], axis=0)
# 37.4 µs ± 499 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
Validation (comparing both approaches on x_large):
result2_large = np.mean([np.max(arr, 1) for arr in np.array_split(x_large, indices, 1)], axis=0)
np.array_equal(result, result2_large)
# True
Given a = [1, 2, 3, 4, 5]
After encoding, a' = [1, 1, 1, 1, 1]; each element represents the difference compared to its previous element.
I know this can be done with
for i in range(len(a) - 1, 0, -1):
    a[i] = a[i] - a[i - 1]
Is there a faster way? I am working with 2 billion numbers here, the process is taking about 30 minutes.
One way using itertools.starmap, islice and operator.sub:
from operator import sub
from itertools import starmap, islice
l = list(range(1, 10000000))
[l[0], *starmap(sub, zip(islice(l, 1, None), l))]
Output:
[1, 1, 1, ..., 1]
Benchmark:
l = list(range(1, 100000000))
# OP's method
%timeit [l[i] - l[i - 1] for i in range(len(l) - 1, 0, -1)]
# 14.2 s ± 373 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
# numpy approach by #ynotzort
%timeit np.diff(l)
# 8.52 s ± 301 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
# zip approach by #Nick
%timeit [nxt - cur for cur, nxt in zip(l, l[1:])]
# 7.96 s ± 243 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
# itertool and operator approach by #Chris
%timeit [l[0], *starmap(sub, zip(islice(l, 1, None), l))]
# 6.4 s ± 255 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
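Worth noting (an observation, not benchmarked above): much of np.diff(l)'s time in this comparison is spent converting the Python list to an array. If the data already lives in a NumPy array, the diff itself is far cheaper:

import numpy as np

arr = np.asarray(l)   # one-off list-to-array conversion cost
%timeit np.diff(arr)  # the diff alone is much faster than diffing the list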
You could use zip to pair the list with an offset version of itself and subtract the paired values:
a = [1, 2, 3, 4, 5]
a[1:] = [nxt - cur for cur, nxt in zip(a, a[1:])]
print(a)
Output:
[1, 1, 1, 1, 1]
Out of interest, I ran this, the original code, and #ynotzort's answer through timeit. This was much faster than the numpy code for short lists, remaining faster up to about 10M values; both were about 30% faster than the original code. As the list size increased beyond 10M values, the numpy code sped up more and was eventually faster, from about 20M values onward.
Update
Also tested the starmap code, and that is about 40% faster than the numpy code at 20M values...
Update 2
#Chris has some more comprehensive performance data in their answer. This answer can be sped up further (by about 10%) by using itertools.islice to generate the offset list:
from itertools import islice
a = [a[0], *[nxt - cur for cur, nxt in zip(a, islice(a, 1, None))]]
You could use numpy.diff. For example:
import numpy as np
a = [1, 2, 3, 4, 5]
npa = np.array(a)
a_diff = np.diff(npa)
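Note that np.diff returns an array one element shorter than its input; to mirror the in-place encoding from the question, which leaves the first element unchanged, you can stitch it back on (a small sketch continuing the snippet above):

# np.diff drops the first element, so prepend it to match the in-place encoding.
encoded = np.concatenate(([npa[0]], np.diff(npa)))
print(encoded)  # [1 1 1 1 1]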
I need to find the index of the minimum per row in a 2-dim array which at the same time satisfies additional constraints on the column values. Given two arrays a and b
a = np.array([[1,0,1],[0,0,1],[0,0,0],[1,1,1]])
b = np.array([[1,-1,2],[4,-1,1],[1,-1,2],[1,2,-1]])
the objective is to find the indices for which it holds that a == 1, b is positive, and b is the minimum value of the row. Fulfilling the first two conditions is easy
idx = np.where(np.logical_and(a == 1, b > 0))
which yields the indices:
(array([0, 0, 1, 3, 3]), array([0, 2, 2, 0, 1]))
Now I need to filter out the duplicate row entries (keeping only the minimum value), but I cannot think of an elegant way to achieve that. In the above example the result should be
(array([0,1,3]), array([0,2,0]))
edit:
It should also work for a containing values other than just 0 and 1.
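For reference, one fully vectorized way to express the whole requirement (a sketch, not taken from the answers below) is to mask invalid entries with +inf and take the argmin per row:

import numpy as np

a = np.array([[1, 0, 1], [0, 0, 1], [0, 0, 0], [1, 1, 1]])
b = np.array([[1, -1, 2], [4, -1, 1], [1, -1, 2], [1, 2, -1]])

masked = np.where((a == 1) & (b > 0), b, np.inf)     # invalid entries -> +inf
rows = np.where(np.isfinite(masked).any(axis=1))[0]  # rows with at least one valid entry
cols = masked[rows].argmin(axis=1)                   # per-row position of the minimum
print(rows, cols)  # [0 1 3] [0 2 0]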
Update: after trying to understand the problem better, try:
c = b*(b*a > 0)
np.where(c==np.min(c[np.nonzero(c)]))
Output:
(array([0, 1, 3], dtype=int64), array([0, 2, 0], dtype=int64))
Timings:
Method 1
a = np.array([[1,0,1],[0,0,1],[0,0,0],[1,1,1]])
b = np.array([[1,-1,2],[4,-1,1],[1,-1,2],[1,2,-1]])
b[b<0] = 100000
cond = [[True if i == b.argmin(axis=1)[k] else False for i in range(b.shape[1])] for k in range(b.shape[0])]
idx = np.where(np.logical_and(np.logical_and(a == 1, b > 0),cond))
idx
Method 2
c = b*(b*a > 0)
idx1 = np.where(c==np.min(c[np.nonzero(c)]))
idx1
Method 1 Timing:
28.3 µs ± 418 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
Method 2 Timing:
12.2 µs ± 144 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
I found a solution based on a list comprehension. It is necessary to change the negative values of b to some high value first, though.
a = np.array([[1,0,1],[0,0,1],[0,0,0],[1,1,1]])
b = np.array([[1,-1,2],[4,-1,1],[1,-1,2],[1,2,-1]])
b[b<0] = 100000
cond = [[True if i == b.argmin(axis=1)[k] else False for i in range(b.shape[1])] for k in range(b.shape[0])]
idx = np.where(np.logical_and(np.logical_and(a == 1, b > 0),cond))
print(idx)
(array([0, 1, 3]), array([0, 2, 0]))
Please let me hear what you think.
edit: I just noticed that this solution is horribly slow.
If I have this list
a = [1,0,0,1,0,0,0,1]
and I want it turned into
a = [1,0,0,2,0,0,0,3]
Setup for solution #1 and #2
from itertools import count
to_add = count()
a = [1,0,0,1,0,0,0,1]
Solution #1
>>> [x + next(to_add) if x else x for x in a]
[1, 0, 0, 2, 0, 0, 0, 3]
Solution #2, hacky but fun
>>> [x and x + next(to_add) for x in a]
[1, 0, 0, 2, 0, 0, 0, 3]
Setup for solution #3 and #4
import numpy as np
a = np.array([1,0,0,1,0,0,0,1])
Solution #3
>>> np.where(a == 0, 0, a.cumsum())
array([1, 0, 0, 2, 0, 0, 0, 3])
Solution #4 (my favorite one yet)
>>> a*a.cumsum()
array([1, 0, 0, 2, 0, 0, 0, 3])
All the cumsum solutions assume that the non-zero elements of a are all ones.
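A quick illustration of that caveat (a sketch): with a non-one entry, the running sum no longer counts occurrences, it sums values.

a = np.array([2, 0, 0, 1])
print(a * a.cumsum())  # [4 0 0 3], not the intended renumbering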
Timings:
# setup
>>> a = [1, 0, 0, 1, 0, 0, 0, 1]*1000
>>> arr = np.array(a)
>>> to_add1, to_add2 = count(), count()
# IPython timings on i5-6200U CPU @ 2.30GHz (though only relative times are of interest)
>>> %timeit [x + next(to_add1) if x else x for x in a] # solution 1
669 µs ± 3.59 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
>>> %timeit [x and x + next(to_add2) for x in a] # solution 2
673 µs ± 15.1 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
>>> %timeit np.where(arr == 0, 0, arr.cumsum()) # solution 3
34.7 µs ± 94.2 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
>>> %timeit arr = np.array(a); np.where(arr == 0, 0, arr.cumsum()) # solution 3 with array creation
474 µs ± 14.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
>>> %timeit arr*arr.cumsum() # solution 4
23.6 µs ± 131 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
>>> %timeit arr = np.array(a); arr*arr.cumsum() # solution 4 with array creation
465 µs ± 6.82 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
Here is how I would do it:
def increase(l):
    count = 0
    for num in l:
        if num == 1:
            yield num + count
            count += 1
        else:
            yield num
c = list(increase(a))
c
[1, 0, 0, 2, 0, 0, 0, 3]
So, you want to increase each 1 except for the first one, right?
How about:
a = [1,0,0,1,0,0,0,1]
current_number = 0
for i, num in enumerate(a):
if num == 1:
a[i] = current_number + 1
current_number += 1
print(a)
>>> [1, 0, 0, 2, 0, 0, 0, 3]
Or, if you prefer:
current_number = 1
for i, num in enumerate(a):
    if num == 1:
        a[i] = current_number
        current_number += 1
Use a list comprehension for this:
print([a[i]+a[:i].count(1) if a[i]==1 else a[i] for i in range(len(a))])
Output:
[1, 0, 0, 2, 0, 0, 0, 3]
Loop version:
for i in range(len(a)):
    if a[i] == 1:
        a[i] = a[i] + a[:i].count(1)
Using numpy cumsum (cumulative sum) to replace each 1 with the running count of 1's seen so far:
In [4]: import numpy as np
In [5]: [i if i == 0 else j for i, j in zip(a, np.cumsum(a))]
Out[5]: [1, 0, 0, 2, 0, 0, 0, 3]
Another option: a one-liner list comprehension, no dependencies.
[ 0 if e == 0 else sum(a[:i+1]) for i, e in enumerate(a) ]
#=> [1, 0, 0, 2, 0, 0, 0, 3]