I'm trying to compute a certain subset of the full outer product of k vectors. The computation of the full outer product is described in this question.
Formally: let v1, v2, ..., vk be vectors of length n, and let K be a positive constant. I want a list containing all products v1[i1] * v2[i2] * ... * vk[ik] for which i1 * i2 * ... * ik <= K (indices start at one). For example, if K = n ** k, the list would contain every combination.
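For illustration only (small values chosen here, not taken from the question), the feasible 1-based index tuples can be enumerated by brute force:

from itertools import product
from functools import reduce

n, k, K = 4, 2, 3
feasible = [idx for idx in product(range(1, n + 1), repeat=k)
            if reduce(lambda a, b: a * b, idx) <= K]
print(feasible)  # [(1, 1), (1, 2), (1, 3), (2, 1), (3, 1)]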
My current approach is to build a hierarchical list of the indices fulfilling the condition above and then compute the products recursively, which has the advantage of reusing some factors.
This implementation is a lot slower than computing the full outer product with NumPy (for the same n and k), but I want to beat the full-product computation. I'm interested in larger values of k and small K (this problem comes from function approximation with sparse bases, i.e. the hyperbolic cross).
Does anyone know a more performant way to get this list, perhaps using more NumPy or a different algorithm? I will try a C implementation next.
Here is my current implementation:
import numpy as np

def get_cross_indices(n, k, K):
    """
    Assume k > 0.
    Returns a hierarchical list containing elements of type
    (i1, list) with
    - i1 being an index (zero based!)
    - list being again a list (possibly empty) with all indices i2, such
      that (i1+1) * (i2+1) * ... * (ik+1) <= K (going down the hierarchy)
    """
    if k == 1:
        num = min(n, K)
        return (num, [(x, []) for x in range(num)])
    else:
        indices = []
        nums = 0
        for i in xrange(min(n, K)):
            (num, tail) = get_cross_indices(n, k - 1, K // (i + 1))
            indices.append((i, tail))
            nums += num
        return (nums, indices)

def calc_cross_outer_product(vectors, result, factor, indices, pos):
    """
    Fills the result list recursively with all products
    vectors[0][i1] * ... * vectors[k-1][ik]
    such that i1,...,ik is a feasible index sequence
    from `indices` (they are in there hierarchically,
    also see `get_cross_indices`).
    """
    for (x, tail) in indices:
        if not tail:
            result[pos] = factor * vectors[0][x]
            pos += 1
        else:
            pos = calc_cross_outer_product(vectors[1:], result,
                                           factor * vectors[0][x], tail, pos)
    return pos

k = 3  # number of vectors
n = 4  # vector length
K = 3

# using random values here just for demonstration purposes
vectors = np.random.rand(k, n)

# get all indices which meet the condition
(count, indices) = get_cross_indices(n, k, K)
result = np.ones(count)
calc_cross_outer_product(vectors, result, 1, indices, 0)

## Equivalent version ##
alt_result = np.ones(count)
# create full outer products
outer_product = reduce(np.multiply, np.ix_(*vectors))
pos = 0
for inds in np.ndindex((n,)*k):
    # current index set is feasible?
    if np.product(np.array(inds) + 1) <= K:
        # compute vectors[0][inds[0]] * ... * vectors[k-1][inds[k-1]]
        values = map(lambda x: vectors[x[0]][x[1]],
                     np.dstack((np.arange(k), inds))[0])
        alt_result[pos] = np.product(values)
        pos += 1
To get a visual idea of the indices I'm interested in, here is a picture for k=3, K=n:
(taken from this website)
A sparse matrix is a matrix in which most entries are zero. To save memory and storage, it is therefore convenient to represent such matrices using a dictionary in the following way: for each non-zero cell of the matrix, the dictionary holds a tuple key representing the coordinates of the cell, and the associated value is the value of that cell (some number of type int or float). As is usual in mathematics, the indices of the matrix start from one.
• A cell coordinate (i, j) consists of natural numbers and represents the cell in the i-th row and the j-th column.
• The order of the entries in the dictionary is not important.
Implement the function sparse_mult(n, mat2, mat1), which receives two dictionaries, mat1 and mat2, representing square sparse matrices of size n×n, and returns a dictionary representing the matrix product mat2×mat1.
Notes:
There is no need to check the correctness of the matrices.
It can be assumed that n is a natural number (n ≥ 1).
The returned dictionary must represent a sparse matrix as defined above.
for i in range(1, n + 1):
    temp = 0
    for j in range(1, n + 1):
        if (mat1.get((i, j), 0) != 0) | (mat2.get((j, i), 0) != 0):
            temp += mat1.get((i, j), 0) * mat2.get((j, i), 0)
    if temp != 0:
        resultrow[(i, i)] = temp
That's my code. I know I got it wrong, but I just don't have a clue.
It is inefficient to iterate over all indices in the 2-dimensional index set when multiplying two sparse matrices. Instead, you can iterate over all pairs of keys where one key is drawn from each sparse matrix. Given such a pair (i, j) and (k, l), it contributes a product of two numbers if and only if j == k; in that case the product goes towards entry (i, l) of the overall result. A final dictionary comprehension can get rid of any zero entries. This last step might be inadequate if the numbers are floats and some entries are non-zero only due to round-off error; in that case you would want a threshold approach that removes entries close to zero rather than merely equal to zero.
def sparse_multiply(a, b):
    c = {}
    for i, j in a.keys():
        for k, l in b.keys():
            if j == k:
                p = a[(i, j)] * b[(k, l)]
                if (i, l) in c:
                    c[(i, l)] += p
                else:
                    c[(i, l)] = p
    return {k: v for k, v in c.items() if v != 0}
Note that n plays no role here. The complexity is O(mk), where m is the number of non-zero entries in the first matrix and k the number of such entries in the second. For matrices that are very sparse, this will be substantially faster than the O(n^3) of straightforward matrix multiplication. There will be some threshold where mk actually becomes larger than n^3, but at that stage the matrices are no longer sparse.
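For example, calling it on two small hand-made matrices (keys are 1-based (row, column) tuples, values chosen arbitrarily here):

a = {(1, 1): 2, (1, 3): 1, (3, 2): 4}
b = {(1, 2): 3, (3, 3): 5, (2, 1): 7}
print(sparse_multiply(a, b))  # {(1, 2): 6, (1, 3): 5, (3, 1): 28}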
So I eventually got it, in case anyone cares:
def sparse_mult(n, mat1, mat2):  # def line added for completeness; the original post only showed the body
    # initialize the result dictionary
    result = {}
    # iterate over the rows and columns of the result matrix
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            # initialize the sum to 0
            sum = 0
            # iterate over the columns of the first matrix and the rows of the second matrix
            for k in range(1, n + 1):
                # check if the cell (i, k) in the first matrix and the cell (k, j) in the second matrix are non-zero
                if (i, k) in mat1 and (k, j) in mat2:
                    sum += mat1[(i, k)] * mat2[(k, j)]
            # add the result to the dictionary if it is non-zero
            if sum != 0:
                result[(i, j)] = sum
    # return the result dictionary
    return result
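A quick sanity check with a small assumed example (not from the exercise), using 1-based (row, column) keys:

mat1 = {(1, 1): 2, (1, 2): 3}   # [[2, 3], [0, 0]]
mat2 = {(2, 1): 4}              # [[0, 0], [4, 0]]
print(sparse_mult(2, mat1, mat2))  # {(1, 1): 12}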
I am looking at GeeksForGeeks problem Kth smallest element:
Given an array arr[] and an integer K where K is smaller than size of array, the task is to find the Kth smallest element in the given array. It is given that all array elements are distinct.
Expected Time Complexity: O(n)
Expected Auxiliary Space: O(log(n))
Constraints:
1 <= N <= 10^5
1 <= arr[i] <= 10^5
1 <= K <= N
My Code:
class Solution:
    def kthSmallest(self, arr, l, r, k):
        '''
        arr : given array
        l : starting index of the array i.e 0
        r : ending index of the array i.e size-1
        k : find kth smallest element and return using this function
        '''
        arr2 = arr[:k]
        arr2.insert(0, None)
        for i in range(k//2, 0, -1):
            arr2 = self.heapify(arr2, i, k-1)
        for i in arr[k:]:
            if i < arr2[1]:
                arr2[1] = i
                arr2 = self.heapify(arr2, 1, k-1)
        return arr2[1]

    def heapify(self, arr, i, r):
        if 2 * i <= r + 1 and arr[2 * i] > arr[i]:
            arr[2 * i], arr[i] = arr[i], arr[i * 2]
            arr = self.heapify(arr, 2 * i, r)
        if 2 * i + 1 <= r + 1 and arr[2 * i + 1] > arr[i]:
            arr[2 * i + 1], arr[i] = arr[i], arr[i * 2 + 1]
            arr = self.heapify(arr, 2 * i + 1, r)
        return arr
I made a sub-array of the first K elements of the array and max-heapified it. Then, for each of the remaining elements, if the element is smaller than the root of the heap, I replaced the root and max-heapified again. I am getting a Time Limit Exceeded error. Any idea?
The problem is that your heapify function is not efficient. In the worst case it makes two recursive calls at the same recursion depth. This may even happen at several recursion depths, so that the number of times heapify is called recursively could become quite large. The goal is to have this only call heapify once (at the most) per recursion level.
It should first find the greatest child, and only then determine whether heapify should be called again, and make that single call if needed.
Some other remarks:
Instead of making heapify recursive, use an iterative solution. This will also save some execution time.
It is strange to pass k-1 to heapify as last argument, when the last element sits at index k, and so you get the weird comparison <= r + 1 in that function. It is more intuitive to pass k as argument, and work with <= r inside the function.
As arr is mutated in place by heapify, there is no need to return it; returning it is just useless overhead.
2 * i is calculated several times. It is better to calculate it only once.
arr[k:] makes a copy of that part of the list. This is not really needed. You could just iterate over the range and take the corresponding value from the array in the loop.
It is not clear why the main function gets l and r as arguments, since the comments explain that l will be 0 and r the index of the last element. But in my opinion, since you get them, you should use them, and so not assume that l is 0, etc.
I would use a more descriptive name for arr2. Why not name it heap?
Here is an improved version of your code:
class Solution:
    def kthSmallest(self, arr, l, r, k):
        '''
        arr : given array
        l : starting index of the array i.e 0
        r : ending index of the array i.e size-1
        k : find kth smallest element and return using this function
        '''
        heap = [arr[i] for i in range(l, r + 1)]
        heap.insert(0, None)
        for i in range(k//2, 0, -1):
            self.heapify(heap, i, k)
        for i in range(l + k, r + 1):
            val = arr[i]
            if val < heap[1]:
                heap[1] = val
                self.heapify(heap, 1, k)
        return heap[1]

    def heapify(self, arr, i, r):
        child = 2 * i
        while child <= r:
            if child + 1 <= r and arr[child + 1] > arr[child]:
                child += 1
            if arr[child] <= arr[i]:
                break
            arr[child], arr[i] = arr[i], arr[child]
            i = child
            child = 2 * i
Finally, there is the heapq module you can use, which simplifies your code:
from heapq import heapify, heapreplace

class Solution:
    def kthSmallest(self, arr, l, r, k):
        '''
        arr : given array
        l : starting index of the array i.e 0
        r : ending index of the array i.e size-1
        k : find kth smallest element and return using this function
        '''
        heap = [-arr[i] for i in range(l, l + k)]
        heapify(heap)
        for i in range(l + k, r + 1):
            val = -arr[i]
            if val > heap[0]:
                heapreplace(heap, val)
        return -heap[0]
The unary minus that occurs here and there is to make the native minheap work as a maxheap.
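As a quick check with an assumed input (not from the original post); both versions return the same value:

arr = [7, 10, 4, 3, 20, 15]
print(Solution().kthSmallest(arr, 0, len(arr) - 1, 3))  # 7, the 3rd smallest element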
I am trying to extract all subsets from a list of elements which add up to a certain value.
Example -
List = [1,3,4,5,6]
Sum - 9
Output Expected = [[3,6],[5,4]]
I have tried different approaches and get the expected output, but on a huge list of elements it takes a significant amount of time.
Can this be optimized using dynamic programming or any other technique?
Approach-1
def subset(array, num):
    result = []
    def find(arr, num, path=()):
        if not arr:
            return
        if arr[0] == num:
            result.append(path + (arr[0],))
        else:
            find(arr[1:], num - arr[0], path + (arr[0],))
            find(arr[1:], num, path)
    find(array, num)
    return result
numbers = [2, 2, 1, 12, 15, 2, 3]
x = 7
subset(numbers,x)
Approach-2
def isSubsetSum(arr, subset, N, subsetSize, subsetSum, index, sum):
    global flag
    if (subsetSum == sum):
        flag = 1
        for i in range(0, subsetSize):
            print(subset[i], end = " ")
        print("")
    else:
        for i in range(index, N):
            subset[subsetSize] = arr[i]
            isSubsetSum(arr, subset, N, subsetSize + 1,
                        subsetSum + arr[i], i + 1, sum)
If you want to output all subsets you can't do better than a sluggish O(2^n) complexity, because in the worst case that will be the size of your output and time complexity is lower-bounded by output size (this is a known NP-Complete problem). But, if rather than returning a list of all subsets, you just want to return a boolean value indicating whether achieving the target sum is possible, or just one subset summing to target (if it exists), you can use dynamic programming for a pseudo-polynomial O(nK) time solution, where n is the number of elements and K is the target integer.
The DP approach involves filling in an (n+1) x (K+1) table, with the sub-problems corresponding to the entries of the table being:
DP[i][k] = subset(A[i:], k) for 0 <= i <= n, 0 <= k <= K
That is, subset(A[i:], k) asks: 'Can I sum to (little) k using the suffix of A starting at index i?' Once you fill in the whole table, the answer to the overall problem, subset(A[0:], K), will be at DP[0][K].
The base cases are for i=n: they indicate that you can't sum to anything except for 0 if you're working with the empty suffix of your array
subset(A[n:], k>0) = False, subset(A[n:], k=0) = True
The recursive cases to fill in the table are:
subset(A[i:], k) = subset(A[i+1:], k) OR (A[i] <= k AND subset(A[i+1:], k - A[i]))
This simply captures the idea that you can use the current array suffix to sum to k either by skipping over the first element of that suffix and using the answer you already had in the previous row (when that first element wasn't in your array suffix), or by using A[i] in your sum and checking whether you could make the reduced sum k - A[i] in the previous row. Of course, you can only use the new element if it doesn't itself exceed your target sum.
ex: subset(A[i:] = [3,4,1,6], k = 8)
would check: could I already sum to 8 with the previous suffix (A[i+1:] = [4,1,6])? No. Or, could I use the 3 which is now available to me to sum to 8? That is, could I sum to k = 8 - 3 = 5 with [4,1,6]? Yes. Because at least one of the conditions was true, I set DP[i][8] = True
Because all the base cases are for i = n, and the recurrence relation for subset(A[i:], k) relies on the answers to the smaller sub-problems subset(A[i+1:], ...), you start at the bottom of the table, where i = n, fill out every k value from 0 to K for each row, and work your way up to row i = 0, ensuring you have the answers to the smaller sub-problems when you need them.
def subsetSum(A: list[int], K: int) -> bool:
    N = len(A)
    DP = [[None] * (K+1) for x in range(N+1)]
    DP[N] = [True if x == 0 else False for x in range(K+1)]
    for i in range(N-1, -1, -1):
        Ai = A[i]
        DP[i] = [DP[i+1][k] or (Ai <= k and DP[i+1][k-Ai]) for k in range(0, K+1)]

    # print result
    print(f"A = {A}, K = {K}")
    print('Ai,k:', *range(0, K+1), sep='\t')
    for (i, row) in enumerate(DP): print(A[i] if i < N else None, *row, sep='\t')
    print(f"DP[0][K] = {DP[0][K]}")

    return DP[0][K]
subsetSum([1,4,3,5,6], 9)
If you want to return an actual possible subset alongside the bool indicating whether or not it's possible to make one, then for every True flag in your DP you should also store the k index for the previous row that got you there (it will either be the current k index or k-A[i], depending on which table lookup returned True, which will indicate whether or not A[i] was used). Then you walk backwards from DP[0][K] after the table is filled to get a subset. This makes the code messier but it's definitely do-able. You can't get all subsets this way though (at least not without increasing your time complexity again) because the DP table compresses information.
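As a rough sketch of that idea (my own illustration, not part of the answer above): rather than storing explicit back-pointers, you can reuse the filled table directly and walk from row 0 down to row n; whenever skipping A[i] would make the remaining sum unreachable, A[i] must be part of the subset.

def one_subset(A, K):
    N = len(A)
    DP = [[False] * (K + 1) for _ in range(N + 1)]
    DP[N][0] = True
    for i in range(N - 1, -1, -1):
        for k in range(K + 1):
            DP[i][k] = DP[i + 1][k] or (A[i] <= k and DP[i + 1][k - A[i]])
    if not DP[0][K]:
        return None
    subset, k = [], K
    for i in range(N):
        if not DP[i + 1][k]:      # k is unreachable without A[i], so A[i] must be used
            subset.append(A[i])
            k -= A[i]
    return subset

print(one_subset([1, 4, 3, 5, 6], 9))  # [3, 6] for this input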
Here is an optimized solution to the problem with a complexity of O(n^2).
def get_subsets(data: list, target: int):
    # initialize final result which is a list of all subsets summing up to target
    subsets = []
    # records the difference between the target value and a group of numbers
    differences = {}
    for number in data:
        prospects = []
        # iterate through every record in differences
        for diff in differences:
            # the number complements a record in differences, i.e. a desired subset is found
            if number - diff == 0:
                new_subset = [number] + differences[diff]
                new_subset.sort()
                if new_subset not in subsets:
                    subsets.append(new_subset)
            # the number fell short to reach the target; add to prospects instead
            elif number - diff < 0:
                prospects.append((number, diff))
        # update the differences record
        for prospect in prospects:
            new_diff = target - sum(differences[prospect[1]]) - prospect[0]
            differences[new_diff] = differences[prospect[1]] + [prospect[0]]
        differences[target - number] = [number]
    return subsets
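For the example from the question this returns both subsets (up to ordering):

print(get_subsets([1, 3, 4, 5, 6], 9))  # [[4, 5], [3, 6]]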
First of all, sorry about the naive question, but I couldn't find help elsewhere.
I'm trying to create an Optimal Search Tree using Dynamic Programming in Python that receives two lists (a set of keys and a set of frequencies) and returns two answers:
1 - The smallest path cost.
2 - The generated tree for that smallest cost.
I basically need to create a tree organized with the most accessed items on top (the most accessed item is the root), and return the smallest path cost of that tree, using the dynamic programming solution.
I've the following implemented code using Python:
import sys  # needed for sys.maxint and sys.getsizeof below

def optimalSearchTree(keys, freq, n):
    #Create an auxiliary 2D matrix to store results of subproblems
    cost = [[0 for x in xrange(n)] for y in xrange(n)]

    #For a single key, cost is equal to frequency of the key
    #for i in xrange (0,n):
    #    cost[i][i] = freq[i]

    # Now we need to consider chains of length 2, 3, ... .
    # L is chain length.
    for L in xrange (2,n):
        for i in xrange(0,n-L+1):
            j = i+L-1
            cost[i][j] = sys.maxint
            for r in xrange (i,j):
                if (r > i):
                    c = cost[i][r-1] + sum(freq, i, j)
                elif (r < j):
                    c = cost[r+1][j] + sum(freq, i, j)
                elif (c < cost[i][j]):
                    cost[i][j] = c
    return cost[0][n-1]

def sum(freq, i, j):
    s = 0
    k = i
    for k in xrange (k,j):
        s += freq[k]
    return s

keys = [10,12,20]
freq = [34,8,50]
n = sys.getsizeof(keys)/sys.getsizeof(keys[0])
print(optimalSearchTree(keys, freq, n))
I'm trying to output answer 1. The smallest cost for that tree should be 142 (the value stored at matrix position [0][n-1], according to the dynamic programming solution), but unfortunately it returns 0. I couldn't find any issue in the code. What's going wrong?
You have several very questionable statements in your code, definitely inspired by C/Java programming practices. For instance,
keys = [10,12,20]
freq = [34,8,50]
n=sys.getsizeof(keys)/sys.getsizeof(keys[0])
I think you think you calculate the number of items in the list. However, n is not 3:
>>> sys.getsizeof(keys) / sys.getsizeof(keys[0])
3.142857142857143
What you need is this:
n = len(keys)
One more issue: elif (r < j) is always True, because r is in the range between i (inclusive) and j (exclusive). The elif (c < cost[i][j]) condition is therefore never checked, and cost[i][j] is never updated in the loop - that's why you always end up with a 0.
Another suggestion: do not overwrite the built-in function sum(). Your namesake function calculates the sum of all items in a slice of a list:
sum(freq[i:j])
import sys

def optimalSearchTree(keys, freq):
    #Create an auxiliary 2D matrix to store results of subproblems
    n = len(keys)
    cost = [[0 for x in range(n)] for y in range(n)]
    storeRoot = [[0 for i in range(n)] for i in range(n)]

    #For a single key, cost is equal to frequency of the key
    for i in range (0,n):
        cost[i][i] = freq[i]

    # Now we need to consider chains of length 2, 3, ... .
    # L is chain length.
    for L in range (2,n+1):
        for i in range(0,n-L+1):
            j = i + L - 1
            cost[i][j] = sys.maxsize
            for r in range (i,j+1):
                c = (cost[i][r-1] if r > i else 0)
                c += (cost[r+1][j] if r < j else 0)
                c += sum(freq[i:j+1])
                if (c < cost[i][j]):
                    cost[i][j] = c
                    storeRoot[i][j] = r
    return cost[0][n-1], storeRoot

if __name__ == "__main__" :
    keys = [10,12,20]
    freq = [34,8,50]
    print(optimalSearchTree(keys, freq))
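Since the question also asks for the generated tree (answer 2), here is a minimal sketch (my addition, assuming the optimalSearchTree above) that prints the tree encoded in storeRoot; note that the diagonal of storeRoot is never filled, so single-key ranges are handled separately:

def print_tree(keys, storeRoot, i, j, depth=0):
    # storeRoot[i][j] holds the root index chosen for the key range keys[i..j]
    if i > j:
        return
    r = storeRoot[i][j] if i != j else i
    print("  " * depth + str(keys[r]))
    print_tree(keys, storeRoot, i, r - 1, depth + 1)
    print_tree(keys, storeRoot, r + 1, j, depth + 1)

cost, storeRoot = optimalSearchTree([10, 12, 20], [34, 8, 50])
print(cost)  # 142 for this example
print_tree([10, 12, 20], storeRoot, 0, 2)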
I have random 2d arrays which I make using
import numpy as np
from itertools import combinations
n = 50
A = np.random.randint(2, size=(n,n))
I would like to determine if the matrix has two pairs of rows which sum to the same row vector. I am looking for a fast method to do this. My current method just tries all possibilities.
for pair in combinations(combinations(range(n), 2), 2):
    if (np.array_equal(A[pair[0][0]] + A[pair[0][1]], A[pair[1][0]] + A[pair[1][1]] )):
        print "Pair found", pair
A method that worked for n = 100 would be really great.
Here is a pure numpy solution; no extensive timings, but I have to push n up to 500 before I can see my cursor blink once before it completes. It is memory intensive though, and will fail due to memory requirements for much larger n. Either way, my intuition is that the odds of finding such a vector decrease geometrically for larger n anyway.
import numpy as np

n = 100
A = np.random.randint(2, size=(n,n)).astype(np.int8)

def base3(a):
    """
    pack the last axis of an array in base 3
    40 base 3 numbers per uint64
    """
    S = np.array_split(a, a.shape[-1]//40+1, axis=-1)
    R = np.zeros(shape=a.shape[:-1]+(len(S),), dtype = np.uint64)
    for i in xrange(len(S)):
        s = S[i]
        r = R[...,i]
        for j in xrange(s.shape[-1]):
            r *= 3
            r += s[...,j]
    return R

def unique_count(a):
    """returns counts of unique elements"""
    unique, inverse = np.unique(a, return_inverse=True)
    count = np.zeros(len(unique), np.int)
    np.add.at(count, inverse, 1)
    return unique, count

def voidview(arr):
    """view the last axis of an array as a void object. can be used as a faster form of lexsort"""
    return np.ascontiguousarray(arr).view(np.dtype((np.void, arr.dtype.itemsize * arr.shape[-1]))).reshape(arr.shape[:-1])

def has_pairs_of_pairs(A):
    #optional; convert rows to base 3
    A = base3(A)
    #precompute sums over a lower triangular set of all combinations
    rowsums = sum(A[I] for I in np.tril_indices(n,-1))
    #count the number of times each row occurs by sorting
    #note that this is not quite O(n log n), since the cost of handling each row is also a function of n
    unique, count = unique_count(voidview(rowsums))
    #print if any pairs of pairs exist;
    #computing their indices is left as an exercise for the reader
    return np.any(count>1)

from time import clock
t = clock()
for i in xrange(100):
    print has_pairs_of_pairs(A)
print clock()-t
Edit: included base-3 packing; now n=2000 is feasible, taking about 2gb of mem, and a few seconds of processing
Edit: added some timings; n=100 takes only 5ms per call on my i7m.
Based on the code in your question, and on the assumption that you're actually looking for pairs of pairs of rows that sum to the same row vector, you could do something like this:
def findMatchSets(A):
    B = A.transpose()
    pairs = tuple(combinations(range(len(A[0])), 2))
    matchSets = [[i for i in pairs if B[0][i[0]] + B[0][i[1]] == z] for z in range(3)]
    for c in range(1, len(A[0])):
        matchSets = [[i for i in block if B[c][i[0]] + B[c][i[1]] == z] for z in range(3) for block in matchSets]
        matchSets = [block for block in matchSets if len(block) > 1]
        if not matchSets:
            return []
    return matchSets
This basically stratifies the matrix into equivalence sets that sum to the same value after one column has been taken into account, then two columns, then three, and so on, until it either reaches the last column or there is no equivalence set left with more than one member (i.e. there is no such pair of pairs). This will work fine for 100x100 arrays, largely because the chances of two pairs of rows summing to the same row vector are infinitesimally small when n is large (n*(n-1)/2 pairs of rows compared to 3^n possible vector sums).
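For instance, with a small hand-made matrix (my own example; numpy and itertools.combinations imported as in the question), where rows 0+1 and rows 2+3 both sum to [1, 1, 1, 1]:

A = np.array([[1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 1, 1, 1],
              [0, 0, 0, 0]])
print(findMatchSets(A))  # [[(0, 1), (2, 3)]]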
UPDATE
Updated code to allow searching for pairs of n-size subsets of all rows, as requested. Default is n=2 as per the original question:
def findMatchSets(A, n=2):
    B = A.transpose()
    pairs = tuple(combinations(range(len(A[0])), n))
    matchSets = [[i for i in pairs if sum([B[0][i[j]] for j in range(n)]) == z] for z in range(n + 1)]
    for c in range(1, len(A[0])):
        matchSets = [[i for i in block if sum([B[c][i[j]] for j in range(n)]) == z] for z in range(n + 1) for block in matchSets]
        matchSets = [block for block in matchSets if len(block) > 1]
        if not matchSets:
            return []
    return matchSets
Here is a 'lazy' approach that scales up to n = 10000, using 'only' 4gb of memory and completing in about 10s per call. Worst case complexity is O(n^3), but for random data the expected performance is O(n^2). At first sight, it seems like you'd need O(n^3) ops; each row combination needs to be produced and inspected at least once. But we need not look at the entire row: we can apply an early-exit strategy to the comparison of row pairs once it is clear they are of no use to us, and for random data we may typically draw this conclusion long before we have considered all columns in a row.
import numpy as np

n = 10
#also works for non-square A
A = np.random.randint(2, size=(n*2,n)).astype(np.int8)
#force the inclusion of some hits, to keep our algorithm on its toes
##A[0] = A[1]

def base_pack_lazy(a, base, dtype=np.uint64):
    """
    pack the last axis of an array as minimal base representation
    lazily yields packed columns of the original matrix
    """
    a = np.ascontiguousarray( np.rollaxis(a, -1))
    init = np.zeros(a.shape[1:], dtype)
    packing = int(np.dtype(dtype).itemsize * 8 / (float(base) / 2))
    for columns in np.array_split(a, (len(a)-1)//packing+1):
        yield reduce(
            lambda acc,inc: acc*base+inc,
            columns,
            init)

def unique_count(a):
    """returns counts of unique elements"""
    unique, inverse = np.unique(a, return_inverse=True)
    count = np.zeros(len(unique), np.int)
    np.add.at(count, inverse, 1)  #note; this scatter operation requires numpy 1.8; use a sparse matrix otherwise!
    return unique, count, inverse

def has_identical_row_sums_lazy(A, combinations_index):
    """
    compute the existence of combinations of rows summing to the same vector,
    given an nxm matrix A and an index matrix specifying all combinations
    naively, we need to compute the sum of each row combination at least once, giving n^3 computations
    however, this isn't strictly required; we can lazily consider the columns, giving an early exit opportunity
    all nicely vectorized of course
    """
    multiplicity, combinations = combinations_index.shape
    #list of indices into combinations_index, denoting possibly interacting combinations
    active_combinations = np.arange(combinations, dtype=np.uint32)
    for packed_column in base_pack_lazy(A, base=multiplicity+1):  #loop over packed cols
        #compute rowsums only for a fixed number of columns at a time.
        #this is O(n^2) rather than O(n^3), and after considering the first column,
        #we can typically already exclude almost all rowpairs
        partial_rowsums = sum(packed_column[I[active_combinations]] for I in combinations_index)
        #find duplicates in this column
        unique, count, inverse = unique_count(partial_rowsums)
        #prune those pairs which we can exclude as having different sums, based on columns inspected thus far
        active_combinations = active_combinations[count[inverse] > 1]
        #early exit; no pairs
        if len(active_combinations)==0:
            return False
    return True

def has_identical_triple_row_sums(A):
    n = len(A)
    idx = np.array([(i,j,k)
                    for i in xrange(n)
                    for j in xrange(n)
                    for k in xrange(n)
                    if i<j and j<k], dtype=np.uint16)
    idx = np.ascontiguousarray( idx.T)
    return has_identical_row_sums_lazy(A, idx)

def has_identical_double_row_sums(A):
    n = len(A)
    idx = np.array(np.tril_indices(n,-1), dtype=np.int32)
    return has_identical_row_sums_lazy(A, idx)

from time import clock
t = clock()
for i in xrange(10):
    print has_identical_double_row_sums(A)
    print has_identical_triple_row_sums(A)
print clock()-t
Extended to include the calculation over sums of triplets of rows, as you asked above. For n=100, this still takes only about 0.2s
Edit: some cleanup; edit2: some more cleanup
Your current code does not test for pairs of rows that sum to the same value.
Assuming that's actually what you want, it's best to stick to pure numpy. This generates the indices of all rows that have equal sums:
import numpy as np
n = 100
A = np.random.randint(2, size=(n,n))
rowsum = A.sum(axis=1)
unique, inverse = np.unique(rowsum, return_inverse = True)
count = np.zeros_like(unique)
np.add.at(count, inverse, 1)
for p in unique[count>1]:
    print p, np.nonzero(rowsum==p)[0]
If all you need to do is determine whether such a pair exists you can do:
exists_unique = np.unique(A.sum(axis=1)).size != A.shape[0]