Angle between planes algorithm is too slow - python

I have written working code that calculates the angle between adjacent planes.
Here's what I have already tried to optimise:
1) I got rid of a couple of NumPy built-in functions, e.g. np.cross() and np.linalg.norm(); that gained me a couple of seconds.
2) The loop was for z in range(1, n); I changed the 1 to k so as not to revisit triangles that have already been processed.
I also tried to speed up the input, but to no avail.
Please, can someone tell me how to make it significantly faster?
I'm not well-acquainted with graphs, and I have a bad feeling about this...

You determine the adjacency of the triangles by matching all triangles to each other. If you create a dictionary of edges, you can find adjacent triangles more efficiently.
Use the two nodes of an edge as key. In order to make the key unique, make the node with the lowest index the first one. You can create the dict when you read the indices:
edge = {}
for i in range(n):
    a, b, c = [int(j) for j in raw_input().split()]
    ind.append((a, b, c))
    k = (min(a, b), max(a, b))
    edge[k] = edge.get(k, []) + [i]
    k = (min(b, c), max(b, c))
    edge[k] = edge.get(k, []) + [i]
    k = (min(c, a), max(c, a))
    edge[k] = edge.get(k, []) + [i]
Use the dict like so:
def calculate_angle():
    angles_list = []
    for tris in edge.values():
        if len(tris) == 2:          # edge shared by exactly two triangles
            i1, i2 = tris
            n1 = norm[i1]
            n2 = norm[i2]
            a = abs(math.acos(max(-1.0, min(1.0, dot(n1, n2)))))
            angles_list.append(a)
    return max(angles_list)
The drawback here is that the angles appear in an arbitrary order in the list, but that's what happens in your original code, too.
You can speed up the program by precalculating the normal as a unit vector for each triangle only once and storing it in the list norm. That's what I have done above. The angle calculation is then only the arc cosine of the dot product.
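For reference, here is a minimal sketch of that precalculation in plain Python; it assumes the vertex coordinates were read into a list points, and that ind holds the triangle index triples collected in the input loop above:

import math

def unit_normal(p0, p1, p2):
    # two edge vectors of the triangle
    ux, uy, uz = p1[0] - p0[0], p1[1] - p0[1], p1[2] - p0[2]
    vx, vy, vz = p2[0] - p0[0], p2[1] - p0[1], p2[2] - p0[2]
    # hand-written cross product, then normalise
    nx, ny, nz = uy*vz - uz*vy, uz*vx - ux*vz, ux*vy - uy*vx
    length = math.sqrt(nx*nx + ny*ny + nz*nz)
    return (nx/length, ny/length, nz/length)

def dot(n1, n2):
    return n1[0]*n2[0] + n1[1]*n2[1] + n1[2]*n2[2]

norm = [unit_normal(points[a], points[b], points[c]) for a, b, c in ind]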
And do you only need the maximum value? Then don't create a list, but keep a running maximum which you update if the current angle is greater than the current maximum.
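For instance (a sketch reusing the edge, norm, and dot names from above):

def calculate_max_angle():
    max_angle = 0.0
    for tris in edge.values():
        if len(tris) == 2:
            i1, i2 = tris
            a = abs(math.acos(max(-1.0, min(1.0, dot(norm[i1], norm[i2])))))
            if a > max_angle:
                max_angle = a
    return max_angle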

Analyzing the complexity of matrix path-finding

Recently in my homework, I was assigned to solve the following problem:
Given a matrix of order nxn of zeros and ones, find the number of paths from [0,0] to [n-1,n-1] that go only through zeros (they are not necessarily disjoint) where you could only walk down or to the right, never up or left. Return a matrix of the same order where the [i,j] entry is the number of paths in the original matrix that go through [i,j], the solution has to be recursive.
My solution in python:
def find_zero_paths(M):
    n,m = len(M),len(M[0])
    dict = {}
    for i in range(n):
        for j in range(m):
            M_top,M_bot = blocks(M,i,j)
            X,Y = find_num_paths(M_top),find_num_paths(M_bot)
            dict[(i,j)] = X*Y
    L = [[dict[(i,j)] for j in range(m)] for i in range(n)]
    return L[0][0],L

def blocks(M,k,l):
    n,m = len(M),len(M[0])
    assert k<n and l<m
    M_top = [[M[i][j] for i in range(k+1)] for j in range(l+1)]
    M_bot = [[M[i][j] for i in range(k,n)] for j in range(l,m)]
    return [M_top,M_bot]

def find_num_paths(M):
    dict = {(1, 1): 1}
    X = find_num_mem(M, dict)
    return X

def find_num_mem(M,dict):
    n, m = len(M), len(M[0])
    if M[n-1][m-1] != 0:
        return 0
    elif (n,m) in dict:
        return dict[(n,m)]
    elif n == 1 and m > 1:
        new_M = [M[0][:m-1]]
        X = find_num_mem(new_M,dict)
        dict[(n,m-1)] = X
        return X
    elif m == 1 and n>1:
        new_M = M[:n-1]
        X = find_num_mem(new_M, dict)
        dict[(n-1,m)] = X
        return X
    new_M1 = M[:n-1]
    new_M2 = [M[i][:m-1] for i in range(n)]
    X,Y = find_num_mem(new_M1, dict),find_num_mem(new_M2, dict)
    dict[(n-1,m)],dict[(n,m-1)] = X,Y
    return X+Y
My code is based on the idea that the number of paths that go through [i,j] in the original matrix is equal to the product of the number of paths from [0,0] to [i,j] and the number of paths from [i,j] to [n-1,n-1]. Another idea is that the number of paths from [0,0] to [i,j] is the sum of the number of paths from [0,0] to [i-1,j] and from [0,0] to [i,j-1]. Hence I decided to use a dictionary whose keys are matrices of the form [[M[i][j] for j in range(k)] for i in range(l)] or [[M[i][j] for j in range(k+1,n)] for i in range(l+1,n)] for some 0<=k,l<=n-1, where M is the original matrix, and whose values are the number of paths from the top of the matrix to the bottom. After analyzing the complexity of my code I arrived at the conclusion that it is O(n^6).
Now, my instructor said this code (find_zero_paths) is exponential; however, I disagree.
The size of the recursion tree (for find_num_paths) is bounded by the number of submatrices of the form above, which is O(n^2). Also, each time we add a new matrix to the dictionary we do it in polynomial time (only slicing lists), so the total complexity is polynomial (poly * poly = poly). The function 'blocks' also runs in polynomial time, and hence 'find_zero_paths' runs in polynomial time (two lists of polynomial size times a function which runs in polynomial time). So, all in all, the code runs in polynomial time.
My question: Is the code polynomial and my O(n^6) bound is wrong or is it exponential and I am missing something?
Unfortunately, your instructor is right.
There is a lot to unpack here:
Before we start, a quick note: please don't use dict as a variable name. It hurts ^^. dict is the name of the built-in dictionary constructor in Python, and it is bad practice to shadow it with your own variable.
First, your approach of counting M_top * M_bot is good if you were to compute only one cell in the matrix. In the way you go about it, you are unnecessarily computing some blocks over and over again. I would use dynamic programming for this one: once from the start to the end, once from the end to the start, then compute the products and be done with it. No need for O(n^6) worth of separate computations. Since you have to use recursion, I would recommend caching the partial results and reusing them wherever possible.
Second, the root of the issue and the cause of your invisible-ish exponent. It is hidden in the find_num_mem function. Say you compute the last element in the matrix - the result[N][N] field and let us consider the simplest case, where the matrix is full of zeroes so every possible path exists.
In the first step, your recursion creates branches [N][N-1] and [N-1][N].
In the second step, you get [N-1][N-1], [N][N-2], [N-2][N], and [N-1][N-1] again.
In the third step, you once again create two branches from every previous step - a beautiful example of an exponential explosion.
Now how to go about it: You will quickly notice that some of the branches are being duplicated over and over. Cache the results.
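To make that concrete, here is a hedged sketch of the two-pass dynamic-programming version (it ignores the assignment's recursion requirement; M is the 0/1 matrix and moves go only down or right):

def paths_through(M):
    n, m = len(M), len(M[0])
    top = [[0] * m for _ in range(n)]  # paths from (0,0) to (i,j)
    bot = [[0] * m for _ in range(n)]  # paths from (i,j) to (n-1,m-1)
    for i in range(n):
        for j in range(m):
            if M[i][j] == 0:
                top[i][j] = ((1 if i == 0 and j == 0 else 0)
                             + (top[i-1][j] if i > 0 else 0)
                             + (top[i][j-1] if j > 0 else 0))
    for i in reversed(range(n)):
        for j in reversed(range(m)):
            if M[i][j] == 0:
                bot[i][j] = ((1 if i == n-1 and j == m-1 else 0)
                             + (bot[i+1][j] if i < n-1 else 0)
                             + (bot[i][j+1] if j < m-1 else 0))
    # paths through (i,j) = paths into it times paths out of it
    return [[top[i][j] * bot[i][j] for j in range(m)] for i in range(n)]

Each pass is O(n*m), so the whole computation is quadratic rather than exponential.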

Generating n binary vectors where each vector has a Hamming distance of d from every other vector

I'm trying to generate n binary vectors of some arbitrary length l, where each vector i has a Hamming distance of d (where d is even) from every other vector j. I'm not sure if there are any theoretical relationships between n, l, and d, but I'm wondering if there are any implementations for this task. My current implementation is shown below. Sometimes I am successful, other times the code hangs, which indicates either a) it's not possible to find n such vectors given l and d, or b) the search takes a long time especially for large values of l.
My questions are:
Are there any efficient implementations of this task?
What kind of theoretical relationships exist between n, l, and d?
import numpy as np

def get_bin(n):
    return ''.join([str(np.random.randint(0, 2)) for _ in range(n)])

def hamming(s1, s2):
    return sum(c1 != c2 for c1, c2 in zip(s1, s2))

def generate_codebook(n, num_codes, d):
    codebooks = []
    seen = []
    while len(codebooks) < num_codes:
        code = get_bin(n)
        if code in seen:
            continue
        else:
            if len(codebooks) == 0:
                codebooks.append(code)
                print len(codebooks), code
            else:
                if all(map(lambda x: int(hamming(code, x)) == d, codebooks)):
                    codebooks.append(code)
                    print len(codebooks), code
            seen.append(code)
    codebook_vectorized = map(lambda x: map(lambda b: int(b), x), codebooks)
    return np.array(codebook_vectorized)
Example:
codebook = generate_codebook(4,3,2)
codebook
1 1111
2 1001
3 0101
Let's build a graph G where every L-bit binary vector v is a vertex, and there is an edge (vi, vj) exactly when the Hamming distance between vi and vj equals d. Now we need to find a clique of size n in this graph.
A clique is a subset of vertices of an undirected graph such that every two distinct vertices in the clique are adjacent.
The task of finding a clique of given size in an arbitrary graph is NP-complete. You can read about this problem and some algorithms in this wikipedia article.
There are many special cases of this problem. For example, for perfect graphs there is a polynomial algorithm. Don't know if it is possible to show that our graph is one of these special cases.
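For small l you can try the clique formulation directly; a sketch, assuming networkx is available (enumerating all 2**l vectors is only feasible for small l):

from itertools import combinations, product
import networkx as nx

def find_codebook(l, n, d):
    vectors = [''.join(bits) for bits in product('01', repeat=l)]
    G = nx.Graph()
    G.add_nodes_from(vectors)
    # connect vectors that are exactly at Hamming distance d
    for u, v in combinations(vectors, 2):
        if sum(a != b for a, b in zip(u, v)) == d:
            G.add_edge(u, v)
    # find_cliques enumerates maximal cliques; take any one of size >= n
    for clique in nx.find_cliques(G):
        if len(clique) >= n:
            return clique[:n]
    return None

print(find_codebook(4, 3, 2))  # prints one size-3 set at pairwise distance 2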
Not a real solution, but more of a partial discussion about the relationship between l, d and n and the process of generating vectors. In any case, you may consider posting the question (or a similar one, in more formal terms) to Mathematics Stack Exchange. I have been reasoning as I was writing, but I hope I didn't make a mistake.
Let's say we have l = 6. Since the Hamming distance depends only on position-wise differences, you can start by putting one arbitrary vector in your set (if there are solutions, some may not include it, but at least one should). So let's begin with an initial v1 = 000000. Now, if d = 6 then obviously n can only be 1 or 2 (with 111111). If d = 1, you will find that n can also only be 1 or 2; for example, you could add 000001, but any other possible vector will have a distance of 2 or more with at least one of the vectors you have.
Let's say d = 4. You need to change 4 positions and keep the other 2, so you have 4-combinations from a 6-element set, which is 15 choices, 001111, 010111, etc. - you can see now that the binomial coefficient C(l, d) plus 1 is an upper bound for n. Let's pick v2 = 001111, and say that the kept positions are T = [1, 2] and the changed ones are S = [3, 4, 5, 6]. Now to go on, we could consider making changes to v2; however, in order to keep the right distances we must follow these rules:
We must make 4 changes to v2.
If we change a position in S, we must make another change in a position in T (and vice versa). Otherwise, the distance to v1 would not be kept.
Logically, if d were odd you would be done now (only sets of two elements could be formed), but fortunately you already said that your distance numbers are even. So we divide our number by two, which is 2, and need to pick 2 elements from S, C(4, 2) = 6, and 2 elements from T, C(2, 2) = 1, giving us 6 * 1 = 6 options - you should note now that C(d, d/2) * C(l - d, d/2) + 2 is a new, lower upper bound for n, if d is even. Let's pick v3 = 111100. v3 has now four kinds of positions: positions that have changed with respect to both v1 and v2, P1 = [1, 2], positions that have not changed with respect to either v1 or v2, P2 = [] (none in this case), positions that have changed with respect to v1 but not with respect to v2, P3 = [3, 4], and positions that have changed with respect to v2 but not with respect to v1, P4 = [5, 6]. Same deal, we need 4 changes, but now each change we make to a P1 position must imply a change in a P2 position, and each change we make to a P3 position must imply a change in a P4 position. The only remaining option is v4 = 110011, and that would be it, the maximum n would be 4.
So, thinking about the problem from a combinatorial point of view: after each change you will have an exponentially increasing number of "types of positions" (2 after the first change, 4 after the second, then 8, 16...), defined in terms of whether they are equal or not in each of the previously added vectors, and these can be arranged in couples through a "symmetry" or "complement" relationship. At each step you can (I think, and this is the part of this reasoning that I am less sure about) greedily choose a set of changes from these couples and compute the sizes of the "types of positions" for the next step. If this is all correct, you should be able to write an algorithm based on this to generate and/or count the possible sets of vectors for particular l and d, and n if given.
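As a small concrete companion to the reasoning above, the two upper bounds on n are easy to compute (assuming d is even, and Python 3.8+ for math.comb):

from math import comb

def n_upper_bounds(l, d):
    first = comb(l, d) + 1                              # after one fixed vector
    second = comb(d, d // 2) * comb(l - d, d // 2) + 2  # after two vectors
    return first, second

print(n_upper_bounds(6, 4))  # (16, 8) for the l = 6, d = 4 walkthrough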

Efficient Particle-Pair Interactions Calculation

I have an N-body simulation that generates a list of particle positions, for multiple timesteps in the simulation. For a given frame, I want to generate a list of the pairs of particles' indices (i, j) such that dist(p[i], p[j]) < masking_radius. Essentially I'm creating a list of "interaction" pairs, where the pairs are within a certain distance of each other. My current implementation looks something like this:
interaction_pairs = []
# going through each unique pair (order doesn't matter)
for i in range(num_particles):
    for j in range(i + 1, num_particles):
        if dist(p[i], p[j]) < masking_radius:
            interaction_pairs.append((i, j))
Because of the large number of particles, this process takes a long time (>1 hr per test), and it is severely limiting to what I need to do with the data. I was wondering if there was any more efficient way to structure the data such that calculating these pairs would be more efficient instead of comparing every possible combination of particles. I was looking into KDTrees, but I couldn't figure out a way to utilize them to compute this more efficiently. Any help is appreciated, thank you!
Since you are using Python, sklearn has multiple implementations for nearest-neighbour searching:
http://scikit-learn.org/stable/modules/neighbors.html
Both KDTree and BallTree are provided.
For a KD-tree, the main point is to push all your particles into the tree and then, for each particle, run the query "give me all particles within range X". A KD-tree usually does this faster than a brute-force search.
You can read more, for example, here: https://www.cs.cmu.edu/~ckingsf/bioinfo-lectures/kdtrees.pdf
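If scipy is also an option, its cKDTree has query_pairs, which returns exactly the index pairs within a given radius; a minimal sketch, with positions standing in for your (N, 3) array of particle coordinates:

import numpy as np
from scipy.spatial import cKDTree

positions = np.random.uniform(-10, 10, size=(1000, 3))  # placeholder data
masking_radius = 4.0

tree = cKDTree(positions)
# set of (i, j) pairs with i < j and distance at most masking_radius
interaction_pairs = sorted(tree.query_pairs(masking_radius))
print(len(interaction_pairs), 'pairs')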
If you are working in 2D or 3D space, another option is to cut the space into a coarse grid (with cell size equal to the masking radius) and assign each particle to one grid cell. Then you can find candidates for interaction just by checking the neighbouring cells (you still have to do a distance check, but for far fewer particle pairs).
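A rough sketch of that grid idea in 3D, assuming positions is an (N, 3) NumPy array of coordinates:

from collections import defaultdict
from itertools import product
import numpy as np

def grid_pairs(positions, radius):
    # bin each particle into a cube of side `radius`
    cells = defaultdict(list)
    for idx, pos in enumerate(positions):
        cells[tuple((pos // radius).astype(int))].append(idx)
    pairs = []
    for cell, members in cells.items():
        for offset in product((-1, 0, 1), repeat=3):
            neighbour = tuple(c + o for c, o in zip(cell, offset))
            for i in members:
                for j in cells.get(neighbour, ()):
                    # i < j keeps each pair unique; then do the exact check
                    if i < j and np.sum((positions[i] - positions[j])**2) < radius**2:
                        pairs.append((i, j))
    return pairs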
Here's a fairly simple technique using plain Python that can reduce the number of comparisons required.
We first sort the points along either the X, Y, or Z axis (selected by axis in the code below). Let's say we choose the X axis. Then we loop over point pairs like your code does, but when we find a pair whose distance is greater than the masking_radius we test whether the difference in their X coordinates is also greater than the masking_radius. If it is, then we can bail out of the inner j loop because all points with a greater j have a greater X coordinate.
My dist2 function calculates the squared distance. This is faster than calculating the actual distance because computing the square root is relatively slow.
I've also included code that behaves similar to your code, i.e., it tests every pair of points, for speed comparison purposes; it also serves to check that the fast code is correct. ;)
from random import seed, uniform
from operator import itemgetter

seed(42)

# Make some fake data
def make_point(hi=10.0):
    return [uniform(-hi, hi) for _ in range(3)]

psize = 1000
points = [make_point() for _ in range(psize)]
masking_radius = 4.0
masking_radius2 = masking_radius ** 2

def dist2(p, q):
    return (p[0] - q[0])**2 + (p[1] - q[1])**2 + (p[2] - q[2])**2

pair_count = 0
test_count = 0
do_fast = 1

if do_fast:
    # Sort the points on one axis
    axis = 0
    points.sort(key=itemgetter(axis))
    # Fast
    for i, p in enumerate(points):
        for j in range(i + 1, psize):
            test_count += 1
            q = points[j]
            if dist2(p, q) < masking_radius2:
                #interaction_pairs.append((i, j))
                pair_count += 1
            elif q[axis] - p[axis] >= masking_radius:
                break
        if i % 100 == 0:
            print('\r {:3} '.format(i), flush=True, end='')
    total_pairs = psize * (psize - 1) // 2
    print('\r {} / {} tests'.format(test_count, total_pairs))
else:
    # Slow
    for i, p in enumerate(points):
        for j in range(i + 1, psize):
            q = points[j]
            if dist2(p, q) < masking_radius2:
                #interaction_pairs.append((i, j))
                pair_count += 1
        if i % 100 == 0:
            print('\r {:3} '.format(i), flush=True, end='')

print('\n', pair_count, 'pairs')
output with do_fast = 1
181937 / 499500 tests
13295 pairs
output with do_fast = 0
13295 pairs
Of course, if most of the point pairs are within masking_radius of each other, there won't be much benefit in using this technique. And sorting the points adds a little bit of time, but Python's TimSort is rather efficient, especially if the data is already partially sorted, so if the masking_radius is sufficiently small you should see a noticeable improvement in the speed.

Number of Vertices Within a Distance from Start Vertex

I was working on a question on a judge that asked about finding the number of vertices that are within a certain distance from it. This has to be done for all vertices on the graph. The full question specifications can be seen here. I have some Python code to solve the problem, but it is too slow.
import sys, collections

raw_input = sys.stdin.readline
n, m, k = map(int, raw_input().split())
dict1 = collections.defaultdict(set)
ans = {i: [set([i]) for j in xrange(k)] for i in xrange(1, n+1)}
for i in xrange(m):
    x, y = map(int, raw_input().split())
    dict1[x].add(y)
    dict1[y].add(x)
for point in dict1:
    ans[point][0].update(dict1[point])
for i in xrange(1, k):
    for point in dict1:
        for neighbour in dict1[point]:
            ans[point][i].update(ans[neighbour][i-1])
for i in xrange(1, n+1):
    print len(ans[i][-1])
What my code does is it initially creates a set of points that are direct neighbours of each vertex (distance of 0 to 1). After that, it creates a new set of neighbours for each vertex from all the previously found neighbours of neighbours (distance of 2). Then it keeps doing this, creating a new set of neighbours and incrementing the distance until the final distance is reached. Is there a better way to solve this problem?
There are plenty of good and fast solutions.
One of them (not the fastest, but fast enough) is to run a BFS up to distance K from every vertex: the BFS simply does not add neighbours to the queue once the distance exceeds K. K is the parameter from the exercise specification.
I would use adjacency matrix multiplication. The adjacency matrix is a boolean square matrix n * n, where n is the number of vertices; adjacency_matrix[i][j] equals 1 if the edge from i to j exists and 0 otherwise. If we multiply the adjacency matrix by itself, we get the walks of length 2; if we do that again, we get the walks of length 3, and so on. In your case K <= 5, so there won't be too many of those multiplications. You can use numpy for that and it will be very fast. In pseudocode, the solution to your problem would look like this:
adjacency_matrix = build_adjacency_matrix_from_input()
initial_adjacency_matrix = adjacency_matrix
result_matrix = adjacency_matrix
for i = 2 to K:
    adjacency_matrix = adjacency_matrix * initial_adjacency_matrix
    result_matrix = result_matrix + adjacency_matrix
for each row of result_matrix print how many values higher than 0 are in it
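A hedged NumPy sketch of that pseudocode; edges is assumed to be the list of 1-based (x, y) pairs from the input, and the diagonal is zeroed because closed walks would otherwise make every non-isolated vertex count itself:

import numpy as np

def reachable_within_k(n, edges, k):
    adj = np.zeros((n, n), dtype=np.int64)
    for x, y in edges:
        adj[x-1, y-1] = adj[y-1, x-1] = 1
    power = adj.copy()   # walks of length exactly i
    result = adj.copy()  # walks of any length from 1 to i
    for _ in range(2, k + 1):
        power = power @ adj
        result += power
    np.fill_diagonal(result, 0)  # drop walks that return to the start vertex
    return (result > 0).sum(axis=1)

Add 1 per vertex if the judge expects the vertex to count itself, as the question's own code does.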
You want paths of length <= K. BFS can easily be used to find paths of bounded length; or, if your graph uses an adjacency matrix representation, matrix multiplication can be used for the same purpose.
If using BFS:
This is equivalent to performing a level-by-level traversal starting from the given source vertex. Here is pseudocode that computes all the vertices within distance K of a given source vertex:
Start: let s be your source vertex and K the maximum path length required
Create two queues Q1 and Q2 and insert source vertex s into Q1
Let queueTobeEmptied = Q1   // the queue emptied at the current level
Let queueTobeFilled = Q2    // the queue receiving newly discovered vertices
Let Result be a vector of vertices, initially empty
Note: source vertex s is at level 0; push it to Result if that is also required

for (current_level = 1; current_level <= K; current_level++) {
    while (queueTobeEmptied is not empty) {
        remove the vertex from queueTobeEmptied and call it u
        for each adjacent vertex v of u {
            if v is not already visited {
                mark v as visited
                insert v into queueTobeFilled
                push v to Result
            }
        }
    }
    // swap the queues for the next iteration of the for loop
    swap(queueTobeEmptied, queueTobeFilled)
}
Empty Q1 and Q2
End: Result contains all the vertices within distance K of s
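A hedged Python rendering of the same pseudocode, collapsing the two queues into per-level lists; adj is assumed to map each vertex to a set of its neighbours:

def vertices_within_k(adj, s, k):
    visited = {s}
    result = []      # excludes the source itself, per the level-0 note above
    frontier = [s]
    for _ in range(k):
        next_frontier = []
        for u in frontier:
            for v in adj.get(u, ()):
                if v not in visited:
                    visited.add(v)
                    next_frontier.append(v)
                    result.append(v)
        frontier = next_frontier
    return result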

Generate "random" matrix of certain rank over a fixed set of elements

I'd like to generate matrices of size mxn and rank r, with elements coming from a specified finite set, e.g. {0,1} or {1,2,3,4,5}. I want them to be "random" in some very loose sense of that word, i.e. I want to get a variety of possible outputs from the algorithm with distribution vaguely similar to the distribution of all matrices over that set of elements with the specified rank.
In fact, I don't actually care that it has rank r, just that it's close to a matrix of rank r (measured by the Frobenius norm).
When the set at hand is the reals, I've been doing the following, which is perfectly adequate for my needs: generate matrices U of size mxr and V of nxr, with elements independently sampled from e.g. Normal(0, 2). Then U V' is an mxn matrix of rank r (well, <= r, but I think it's r with high probability).
If I just do that and then round to binary / 1-5, though, the rank increases.
It's also possible to get a lower-rank approximation to a matrix by doing an SVD and taking the first r singular values. Those values, though, won't lie in the desired set, and rounding them will again increase the rank.
This question is related, but the accepted answer isn't "random," and the other answer suggests SVD, which doesn't work here as noted.
One possibility I've thought of is to make r linearly independent row or column vectors from the set and then get the rest of the matrix by linear combinations of those. I'm not really clear, though, either on how to get "random" linearly independent vectors, or how to combine them in a quasirandom way after that.
(Not that it's super-relevant, but I'm doing this in numpy.)
Update: I've tried the approach suggested by EMS in the comments, with this simple implementation:
import random
import numpy as np

des_rank = 3  # target rank for the 10x10 example below
real = np.dot(np.random.normal(0, 1, (10, 3)), np.random.normal(0, 1, (3, 10)))
bin = (real > .5).astype(int)
rank = np.linalg.matrix_rank(bin)
niter = 0
while rank > des_rank:
    cand_changes = np.zeros((21, 5))
    for n in range(20):
        # propose flipping a random entry
        i, j = random.randrange(bin.shape[0]), random.randrange(bin.shape[1])
        v = 1 - bin[i, j]
        x = bin.copy()
        x[i, j] = v
        x_rank = np.linalg.matrix_rank(x)
        cand_changes[n, :] = (i, j, v, x_rank, max((rank + 1e-4) - x_rank, 0))
    # a final "no-op" candidate so the distribution is never empty
    cand_changes[-1, :] = (0, 0, bin[0, 0], rank, 1e-4)
    cdf = np.cumsum(cand_changes[:, -1])
    cdf /= cdf[-1]
    i, j, v, rank, score = cand_changes[np.searchsorted(cdf, random.random()), :]
    bin[int(i), int(j)] = int(v)
    niter += 1
    if niter % 1000 == 0:
        print(niter, rank)
It works quickly for small matrices but falls apart for e.g. 10x10 -- it seems to get stuck at rank 6 or 7, at least for hundreds of thousands of iterations.
It seems like this might work better with a better (i.e. less flat) objective function, but I don't know what that would be.
I've also tried a simple rejection method for building up the matrix:
import random
import numpy as np

def fill_matrix(m, n, r, vals):
    assert m >= r and n >= r
    trans = False
    if m > n:  # more columns than rows I think is better
        m, n = n, m
        trans = True
    get_vec = lambda: np.array([random.choice(vals) for i in range(n)])
    vecs = []
    n_rejects = 0
    # fill in r linearly independent rows
    while len(vecs) < r:
        v = get_vec()
        if np.linalg.matrix_rank(np.vstack(vecs + [v])) > len(vecs):
            vecs.append(v)
        else:
            n_rejects += 1
    print("have {} independent ({} rejects)".format(r, n_rejects))
    # fill in the rest of the dependent rows: reject any vector that
    # would push the rank above r
    while len(vecs) < m:
        v = get_vec()
        if np.linalg.matrix_rank(np.vstack(vecs + [v])) > r:
            n_rejects += 1
            if n_rejects % 1000 == 0:
                print(n_rejects)
        else:
            vecs.append(v)
    print("done ({} total rejects)".format(n_rejects))
    m = np.vstack(vecs)
    return m.T if trans else m
This works okay for e.g. 10x10 binary matrices with any rank, but not for 0-4 matrices or much larger binaries with lower rank. (For example, getting a 20x20 binary matrix of rank 15 took me 42,000 rejections; with 20x20 of rank 10, it took 1.2 million.)
This is clearly because the space spanned by the first r rows is too small a portion of the space I'm sampling from, e.g. {0,1}^10, in these cases.
We want the intersection of the span of the first r rows with the set of valid values.
So we could try sampling from the span and looking for valid values, but since the span involves real-valued coefficients that's never going to find us valid vectors (even if we normalize so that e.g. the first component is in the valid set).
Maybe this can be formulated as an integer programming problem, or something?
My friend Daniel Johnson, who commented above, came up with an idea but never posted it. It's not very fleshed-out, but you might be able to adapt it.
If A is m-by-r and B is r-by-n and both have rank r then AB has rank r. Now, we just have to pick A and B such that AB has values only in the given set. The simplest case is S = {0,1,2,...,j}.
One choice would be to make A binary with appropriate row/column sums that guarantee the correct rank, and B with column sums adding to no more than j (so that each term in the product is in S) and row sums picked to cause rank r (or at least encourage it, as rejection can be used).

I just think that we can come up with two independent sampling schemes on A and B that are less complicated and quicker than trying to attack the whole matrix at once. Unfortunately, all my matrix sampling code is on the other computer. I know it generalized easily to allowing entries in a bigger set than {0,1} (i.e. S), but I can't remember how the computation scaled with m*n.
I am not sure how useful this solution will be, but you can construct a matrix that will allow you to search for the solution on another matrix with only 0 and 1 as entries. If you search randomly on the binary matrix, it is equivalent to randomly modifying the elements of the final matrix, but it is possible to come up with some rules to do better than a random search.
If you want to generate an m-by-n matrix over the element set E with elements ei, 0 <= i < k, you start off with the m-by-k*m matrix A, block diagonal with one copy of the row vector (e0, ..., ek-1) per row:

A = | e0 .. ek-1    0 ..    0   ...    0 ..    0 |
    |  0 ..    0   e0 .. ek-1   ...    0 ..    0 |
    |  ...                                       |
    |  0 ..    0    0 ..    0   ...   e0 .. ek-1 |

Clearly, this matrix has rank m (as long as E contains a nonzero element). Now, you can construct another matrix, B, that has 1s at certain locations to pick the elements from the set E. The structure of this matrix is m blocks stacked vertically:

B = | B0   |
    | B1   |
    | ...  |
    | Bm-1 |

Each Bi is a k-by-n matrix. So, the size of AB is m-by-n and rank(AB) is min(m, rank(B)). If we want the output matrix to have only elements from our set E, then each column of each Bi has to have exactly one element set to 1, and the rest set to 0.
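A small sketch of this construction (make_A and random_B are my names; the rng just picks an arbitrary valid B with one 1 per column of each Bi):

import numpy as np

def make_A(m, E):
    k = len(E)
    A = np.zeros((m, k * m))
    for i in range(m):
        A[i, i*k:(i+1)*k] = E  # one copy of (e0, ..., ek-1) per row
    return A

def random_B(m, n, k, rng):
    B = np.zeros((k * m, n), dtype=int)
    for i in range(m):
        rows = rng.integers(0, k, size=n)  # which element each column picks
        B[i*k + rows, np.arange(n)] = 1    # exactly one 1 per column of Bi
    return B

rng = np.random.default_rng(0)
E = np.array([0, 1, 2, 3, 4])
A, B = make_A(3, E), random_B(3, 7, len(E), rng)
print(A @ B)  # every entry of the 3-by-7 product is an element of E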
If you want to search for a certain rank on B randomly, you need to start off with a valid B of maximum rank, and rotate a random column j of a random Bi by a random amount. This is equivalent to changing entry (i, j) of A*B to a random element from our set, so it is not a very useful method.
However, you can do certain tricks with the matrices. For example, if k is 2, and there are no overlaps on first rows of B0 and B1, you can generate a linearly dependent row by adding the first rows of these two sub-matrices. The second row will also be linearly dependent on rows of these two matrices. I am not sure if this will easily generalize to k larger than 2, but I am sure there will be other tricks you can employ.
For example, one simple method to generate rank at most k (when m is k+1) is to get a random valid B0, keep rotating all rows of this matrix up to get B1 through Bm-2, set the first row of Bm-1 to all 1s, and the remaining rows to all 0s. The rank cannot be less than k (assuming n > k), because each column of B0 has exactly 1 nonzero element. The remaining rows of the matrices are all linear combinations (in fact, exact copies for almost all submatrices) of these rows: the first row of the last submatrix is the sum of all rows of the first submatrix, and its remaining rows are all zeros. For larger values of m, you can use permutations of the rows of B0 instead of simple rotation.
Once you generate one matrix that satisfies the rank constraint, you may get away with randomly shuffling the rows and columns of it to generate others.
How about like this?
import numpy as np
from sklearn.decomposition import NMF

rank = 30
n1, n2 = 100, 100
model = NMF(n_components=rank, init='random', random_state=0)
U = model.fit_transform(np.random.randint(1, 5, size=(n1, n2)))
V = model.components_
M = np.around(U) @ np.around(V)  # rounding the factors keeps rank(M) <= rank
