2D Bit matrix with every possible combination - python

I need to create a Python generator which yields every possible combination of a 2D bit matrix.
The length of each dimension is variable.
So for a 2x2 matrix:
1.
00
00
2.
10
00
3.
11
00
....
x.
00
01
Larger dimensions (up to 200*1000) need to work too.
In the end, I will not need all of the combinations. I need only the ones where the sum in each line is 1. But I need all combinations where this is the case. I would filter them before yielding. Printing is not required.
I want to use this as filter masks to test all possible variations of a data set.
Generating variations like this must be a common problem. Maybe there is even a good library for Python?

Going through all possible values of a bit vector of a given size is exactly what a counter does. It's not evident from your question what order you want, but it looks much like a Gray counter. Example:
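from sys import stdout
w, h = 2, 2
for val in range(2**(w*h)):  # a w-by-h matrix has w*h bits, so 2**(w*h) patterns
    gray = val ^ (val >> 1)  # convert the binary count to Gray code
    for y in range(h):
        for x in range(w):
            stdout.write('1' if gray & (1 << (w*y + x)) else '0')
        stdout.write('\n')
    stdout.write('\n')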
Note that the dimensions of the vector don't matter to the counter, only the size. Also, while this gives every static pattern, it does not cover all possible transitions.
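Since the question asks for a generator rather than printed output, the same Gray counter could be wrapped roughly like this. It is only a sketch: the name gray_matrices and the list-of-rows return format are my own choices, not something from the question.

```python
def gray_matrices(w, h):
    """Yield every w-by-h bit matrix as a list of rows, in Gray-code order."""
    n = w * h
    for val in range(2 ** n):
        gray = val ^ (val >> 1)  # consecutive Gray codes differ in exactly one bit
        # Bit (w*y + x) of the counter is the cell in row y, column x.
        yield [[(gray >> (w * y + x)) & 1 for x in range(w)] for y in range(h)]


# Collect the first three 2x2 matrices to compare with the order in the question.
first_three = []
for matrix in gray_matrices(2, 2):
    first_three.append(matrix)
    if len(first_three) == 3:
        break
```

Because it is a generator, nothing is stored between steps, but the number of patterns still grows as 2**(w*h), so full enumeration is only realistic for small matrices.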

This can be done using permutations from itertools in the following way.
import itertools
dim = 2
dimension = dim * dim
# dimension zeros and dimension ones, so every mix of 0s and 1s can be drawn
data = [0] * dimension + [1] * dimension
count = 1
# set() drops the many duplicate permutations
for matrix in set(itertools.permutations(data, dimension)):
    print('\n', count, '.')
    for i in range(0, dimension, dim):
        print(' '.join(map(str, matrix[i:i+dim])))
    count += 1
P.S.: This is fine for a 2x2 matrix, but it is time- and memory-consuming for larger sizes. I would be glad if someone could provide a less expensive algorithm.
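One cheaper route, given that the question only needs matrices where each row sums to 1: such a row is fully described by the column index of its single 1, so the matrices can be generated directly instead of being filtered. A minimal sketch, assuming "line" in the question means row; the function name one_hot_row_matrices is made up for illustration.

```python
import itertools


def one_hot_row_matrices(rows, cols):
    """Yield every rows-by-cols bit matrix in which each row contains exactly one 1."""
    # Each row is determined by the column that holds its 1, so we only
    # iterate over tuples of column indices (cols**rows of them in total).
    for positions in itertools.product(range(cols), repeat=rows):
        yield [[1 if c == p else 0 for c in range(cols)] for p in positions]


matrices = list(one_hot_row_matrices(2, 2))
```

This yields cols**rows matrices instead of testing 2**(rows*cols) candidates. That is still far too many to exhaust at sizes like 200*1000, but the generator can be consumed lazily and stopped at any point.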

You can generate every bit pattern of length 2 by using every number from 0 to 3 (there are 2 to the power of 2, i.e. 4, of them).
0 -> 00
1 -> 01
2 -> 10
3 -> 11
To display a number in binary, the bin function can be used.
Since you have a 2x2 matrix, you need 2 numbers (i and j), one for each row. Then you can just convert these numbers to binary and print them.
for i in range(4):
    for j in range(4):
        # bin() returns e.g. '0b1'; strip the prefix and pad to 2 digits
        row1 = bin(i)[2:].zfill(2)
        row2 = bin(j)[2:].zfill(2)
        print(row1 + "\n" + row2 + "\n")
EDIT:
I have found the zfill function, which pads a string with leading zeros to a fixed length.
>>> '1'.zfill(5)
'00001'
Another generic solution might be:
import re
dim1 = 2
dim2 = 2
n = dim1 * dim2
for i in range(2**n):
    # Pad to n bits, then cut the string into rows of dim2 characters
    print('\n'.join(re.findall('.' * dim2, bin(i)[2:].zfill(n))), '\n')

You could do something like this for a 3x3 binary matrix:
for i in range(pow(2, 9)):
    p = '{0:09b}'.format(i)
    print(p)
    x = []
    x.append([p[0], p[1], p[2]])
    x.append([p[3], p[4], p[5]])
    x.append([p[6], p[7], p[8]])
    for r in range(3):  # separate name so the outer counter i is not overwritten
        x[r] = list(map(int, x[r]))  # list() because map is lazy in Python 3

Related

Efficiently adding two different sized one dimensional arrays

I want to add two numpy arrays of different sizes, starting at a specific index. As I need to do this a couple of thousand times with large arrays, it needs to be efficient, and I am not sure how to do that without iterating through each cell.
a = [5,10,15]
b = [0,0,10,10,10,0,0]
res = add_arrays(b,a,2)
print(res)  # => [0,0,15,20,25,0,0]
Naive approach:
# b is the bigger array
def add_arrays(b, a, i):
    for j in range(len(a)):
        b[i+j] += a[j]  # add (not overwrite) so the result matches the example
    return b
You could assign the smaller array into an array of zeros and then add the two. I would do it the following way:
import numpy as np
a = np.array([5,10,15])
b = np.array([0,0,10,10,10,0,0])
z = np.zeros(b.shape,dtype=int)
z[2:2+len(a)] = a # 2 is offset
res = z+b
print(res)
output
[ 0 0 15 20 25 0 0]
Disclaimer: I assume that offset + len(a) is always less than or equal to len(b).
Nothing wrong with your approach. You cannot get better asymptotic time or space complexity. If you want to reduce code lines (which is not an end in itself), you could use slice assignment and some other utils:
def add_arrays(b, a, i):
    # Pairwise sums of the overlapping slice, written back in place (plain lists)
    b[i:i+len(a)] = map(sum, zip(b[i:i+len(a)], a))
But the function-call overhead should make this less performant, if anything.
Some docs: map, sum, zip
This should be faster than Daweo's answer, by a factor of 1.5-5 (depending on the size ratio between a and b):
result = b.copy()
result[offset: offset+len(a)] += a
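For completeness, here is the slice-add idea wrapped into the add_arrays signature from the question. This is a sketch rather than a benchmark; the bounds check is my own addition, based on the assumption stated above that a must fit inside b.

```python
import numpy as np


def add_arrays(b, a, offset):
    """Return a copy of b with a added in, starting at index offset."""
    if offset < 0 or offset + len(a) > len(b):
        raise ValueError("a does not fit inside b at this offset")
    result = b.copy()                      # leave the caller's array untouched
    result[offset:offset + len(a)] += a    # one vectorised add on a view of the slice
    return result


a = np.array([5, 10, 15])
b = np.array([0, 0, 10, 10, 10, 0, 0])
res = add_arrays(b, a, 2)
```

If modifying b in place is acceptable, dropping the copy saves one allocation per call.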

How to only iterate over one argument of a matrix array if both have the same variable in Python?

I am trying to eliminate some non zero entries in a matrix where the 2 adjacent diagonals to the main diagonal are nonzero.
h = np.zeros((n**2,n**2))
for i in np.arange(0, n**2):
    for j in np.arange(0,n**2):
        if(i==j):
            for i in np.arange(0,n**2,n):
                h[i,j-1] = 0
print(h)
I want it to only eliminate the lower triangle non-zero entries, but it's erasing some entries in the upper triangle. I know this is because on the last if statement with the for loop, it is iterating for both arguments of the array, when I only want it to iterate for the first argument i, but since I set i=j, it runs for both.
The matrix I want to obtain is the following:
[image: desired matrix]
PS: sorry for the extremely bad question format, this is my first question.
hamiltonian = np.zeros((n**2,n**2)) # store the Hamiltonian
for i in np.arange(0, n**2):
    for j in np.arange(0,n**2):
        if abs(i-j) == 1:
            hamiltonian[i,j] = 1
Is this what you are looking for?
hamiltonian[0,1] = 1
hamiltonian[n**2-1,n**2-2] = 1
for i in np.arange(1, n**2-1):
    hamiltonian[i,i+1] = 1
    hamiltonian[i,i-1] = 1
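The same tridiagonal pattern can also be built without explicit loops. The sketch below uses np.eye with a diagonal offset; n = 3 is just a placeholder value. The last step rests on an assumption, since the desired-matrix image is not available here: it supposes the goal is to clear the sub-diagonal entry in rows n, 2n, ... while leaving the upper triangle alone.

```python
import numpy as np

n = 3                      # placeholder size, not taken from the question
size = n ** 2

# Ones on the first super-diagonal (k=1) and the first sub-diagonal (k=-1)
hamiltonian = np.eye(size, k=1) + np.eye(size, k=-1)

# Assumption: zero h[i, i-1] for i = n, 2n, ... (lower triangle only).
# Starting at n rather than 0 avoids h[0, -1], which wraps to the top-right corner.
trimmed = hamiltonian.copy()
rows = np.arange(n, size, n)
trimmed[rows, rows - 1] = 0
```

Indexing with rows and rows - 1 touches only the first argument's chosen rows, which is the "iterate over i only" behaviour the question was after.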

amplitude spectrum in Python

I have a given array with a length of over 1'000'000 and values between 0 and 255 (included) as integers. Now I would like to plot on the x-axis the integers from 0 to 255 and on the y-axis the quantity of the corresponding x value in the given array (called Arr in my current code).
I thought about this code:
list = []
for i in range(0, 256):
    icounter = 0
    for x in range(len(Arr)):
        if Arr[x] == i:
            icounter += 1
    list.append(icounter)
But is there any way I can do this a little bit faster (it takes me several minutes at the moment)? I thought about an import ..., but wasn't able to find a good package for this.
Use numpy.bincount for this task (see its documentation for more details):
import numpy as np
list = np.bincount(Arr)
While I completely agree with the previous answers that you should use a standard histogram algorithm, it's quite easy to greatly speed up your own implementation. Its problem is that you pass through the entire input for each bin, over and over again. It would be much faster to only process the input once, and then write only to the relevant bin:
def hist(arr):
    nbins = 256
    result = [0] * nbins  # or np.zeros(nbins, dtype=int)
    for y in arr:
        # Each value is visited once and increments only its own bin
        if y >= 0 and y < nbins:
            result[y] += 1
    return result
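To check that the two approaches agree, something like the following could be used. The random test data merely stands in for Arr, and minlength=256 is there so that bincount always returns 256 bins even when the value 255 never occurs.

```python
import numpy as np

rng = np.random.default_rng(0)
arr = rng.integers(0, 256, size=100_000)   # stand-in for Arr, values 0..255

# Vectorised count: one pass in compiled code, always 256 bins
counts = np.bincount(arr, minlength=256)


def hist(values, nbins=256):
    """Single-pass pure-Python histogram, as in the answer above."""
    result = [0] * nbins
    for y in values:
        if 0 <= y < nbins:
            result[y] += 1
    return result


same = counts.tolist() == hist(arr.tolist())
```

Plotting is then just a matter of passing range(256) and counts to whatever plotting library is in use.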

How to generate a matrix with random entries and with constraints on row and columns?

How can I generate a matrix whose entries are random real numbers between zero and one inclusive, with the additional constraint that the sum of each row must be less than or equal to one and the sum of each column must be less than or equal to one?
Examples:
matrix = [0.3, 0.4, 0.2;
0.7, 0.0, 0.3;
0.0, 0.5, 0.1]
If you want a matrix that is uniformly distributed and fulfills those constraints, you probably need a rejection method. In Matlab it would be:
n = 3;
done = false;
while ~done
    matrix = rand(n);
    done = all(sum(matrix,1)<=1) & all(sum(matrix,2)<=1);
end
Note that this will be slow for large n.
If you're looking for a Python way, this is simply a transcription of Luis Mendo's rejection method. For simplicity, I'll be using NumPy:
import numpy as np
n = 3
done = False
while not done:
    matrix = np.random.rand(n,n)
    done = np.all(np.logical_and(matrix.sum(axis=0) <= 1, matrix.sum(axis=1) <= 1))
If you don't have NumPy, then you can generate your 2D matrix as a list of lists instead:
import random
n = 3
done = False
while not done:
    # Create matrix as a list of lists
    matrix = [[random.random() for _ in range(n)] for _ in range(n)]
    # Compute the row sums and check for each to be <= 1
    row_sums = [sum(matrix[i]) <= 1 for i in range(n)]
    # Compute the column sums and check for each to be <= 1
    col_sums = [sum([matrix[j][i] for j in range(n)]) <= 1 for i in range(n)]
    # Only quit if all row and column sums are less than or equal to 1
    done = all(row_sums) and all(col_sums)
The rejection method will surely give you a uniform solution, but it might take a long time to generate a valid matrix, especially if your matrix is large. So another, more tedious approach is to generate each element such that the sum can be at most 1 in each direction. For this you always generate a new element between 0 and whatever remains until 1:
n = 3
matrix = zeros(n+1); %dummy line in first row/column
for k1=2:n+1
    for k2=2:n+1
        matrix(k1,k2)=rand()*(1-max(sum(matrix(k1,1:k2-1)),sum(matrix(1:k1-1,k2))));
    end
end
matrix = matrix(2:end,2:end)
It's a bit tricky because for each element you check the row-sum and column-sum until that point, and use the larger of the two for generating a new element (in order to stay below a sum of 1 in both directions). For practical reasons I padded the matrix with a zero line and column at the beginning to avoid indexing problems with k1-1 and k2-1.
Note that, as @LuisMendo pointed out, this will have a different distribution than the rejection method. But if your constraints do not consider the distribution, this could do as well (and this will give you a matrix from a single run).
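For Python users, the sequential method could be transcribed roughly as follows. This is a sketch: the function name sequential_matrix and the optional rng argument are my own, and with 0-based lists the zero padding from the Matlab version is not needed, because sums over empty slices are simply 0.

```python
import random


def sequential_matrix(n, rng=random):
    """Fill an n-by-n matrix left to right, top to bottom, so that every
    row sum and column sum stays at or below 1."""
    matrix = [[0.0] * n for _ in range(n)]
    for r in range(n):
        for c in range(n):
            row_used = sum(matrix[r][:c])                    # row sum so far
            col_used = sum(matrix[k][c] for k in range(r))   # column sum so far
            # Scale a uniform draw by the smaller of the two remaining budgets
            matrix[r][c] = rng.random() * (1.0 - max(row_used, col_used))
    return matrix


example = sequential_matrix(3, random.Random(0))  # seeded for repeatability
```

As with the Matlab version, entries towards the bottom right tend to be smaller, so the result is not uniformly distributed over the valid matrices.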

Iterating over a large list

I have a list in the range [465868129, 988379794] both inclusive. When I use the following code I get a Memory Error. What can I do?
r = [465868129, 988379794]
list = [x for x in xrange(r[0], r[1]+1)]
You could iterate over the xrange directly instead of creating a list.
for x in xrange(r[0], r[1] + 1):
    ...
But iterating over such a large range is a very, very slow way to find squares. The fact that you run out of memory should alert you that a different approach is needed.
A much better way would be to take the square roots of each end point and then iterate between the square roots. Each integer between the square roots, when squared, would give you one of the numbers you're searching for.
In fact, if you're clever enough, you can generate all the squares with a single list comprehension and avoid an explicit for loop entirely.
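A sketch of that square-root idea, assuming Python 3.8+ for math.isqrt (an exact integer square root, which sidesteps floating-point rounding at the end points). The variable names are mine.

```python
import math

init, end = 465868129, 988379794

first_root = math.isqrt(init - 1) + 1   # smallest integer whose square is >= init
last_root = math.isqrt(end)             # largest integer whose square is <= end

# Only the roots are iterated, so the list holds thousands of items, not hundreds of millions
squares = [k * k for k in range(first_root, last_root + 1)]
```

The loop now runs over the roots only, which is why it no longer strains memory.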
Unless you have a very good reason to store the list items in a list, iterate over the generator instead, that way Python won't need to allocate a lot of memory (causing your Memory Error) to create that list:
init, end = (465868129, 988379794)
items = xrange(init, end + 1)
for item in items:
    # Do something with item
    pass
To count squares on an arbitrary range consider the following formula:
import math
# Truncate each root separately, matching the "8 - 5" arithmetic in the examples below
number_of_squares = (int(math.sqrt(end)) - int(math.sqrt(init)) +
                     op(is_perfect_square(init), is_perfect_square(end)))
The is_perfect_square(n) is another problem on its own, so check this post if interested.
The op is used to adjust the number of squares depending on whether init, end, both, or neither of the interval's end points are perfect squares. So we need a function with the following characteristics:
Both numbers are perfect squares, e.g. 25, 64 => 8 - 5 = 3, but there are 4 squares in that range (it should add 1 more).
Only end is a perfect square, e.g. 26, 64 => 8 - 5 = 3, and there are 3 squares in that range (it is correct, so it should add 0).
Only init is a perfect square, e.g. 25, 65 => 8 - 5 = 3, but there are 4 squares in that range (it should add 1 more).
Neither number is a perfect square, e.g. 26, 65 => 8 - 5 = 3, and there are 3 squares in that range (it is correct, so it should add 0).
So we need an operator with the following characteristics, based on the past examples:
1 op 1 = 1 (Both numbers are perfect squares)
0 op 1 = 0 (End is a perfect square)
1 op 0 = 1 (Init is a perfect square)
0 op 0 = 0 (None of the numbers are perfect squares)
Note that the max function almost fulfils our needs, but it fails on the second case max(0,1) = 1 and it should be 0.
So it looks like the result only depends on the first operand: if it is 1, the result is 1; if it is 0, the result is 0.
So, it's easy to write the function with that in mind:
import math
# Truncate each root separately, then add 1 when init itself is a perfect square
number_of_squares = (int(math.sqrt(end)) - int(math.sqrt(init)) +
                     int(is_perfect_square(init)))
Thanks to @kojiro, we have this approach (based on a similar idea), which is easier to read:
from math import sqrt, floor, ceil
number_of_squares = 1 + floor(sqrt(end)) - ceil(sqrt(init))
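An integer-only variant of the same count, sketched under the assumption that Python 3.8+ is available (for math.isqrt) and that 1 <= init <= end. The function name count_squares is made up here. It avoids any floating-point rounding in sqrt for very large end points.

```python
import math


def count_squares(init, end):
    """Count perfect squares s with init <= s <= end (assumes 1 <= init <= end)."""
    # isqrt(end) counts the squares in 1..end; isqrt(init - 1) counts those below init
    return math.isqrt(end) - math.isqrt(init - 1)


# The four example ranges discussed above
results = [count_squares(25, 64), count_squares(26, 64),
           count_squares(25, 65), count_squares(26, 65)]
```

For the four example ranges this gives 4, 3, 4 and 3, in line with the counts worked out by hand above.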
