You are given two integers n and r such that 1 <= r < n, and
a two-dimensional array W of size n x n.
Each element of this array is either 0 or 1.
Your goal is to compute the density map D for array W, using a radius of r.
The output density map is also a two-dimensional array,
where each value represents the number of 1's in matrix W within the specified radius.
Given the following input array W of size 5 and radius 1 (n = 5, r = 1)
1 0 0 0 1
1 1 1 0 0
1 0 0 0 0
0 0 0 1 1
0 1 0 0 0
Output (using Python):
3 4 2 2 1
4 5 2 2 1
3 4 3 3 2
2 2 2 2 2
1 1 2 2 2
Logic: Take the element in the first row and first column, whose value is 1, with r = 1. We look at every cell at most 1 step away in each direction (left, right, top, bottom and the four diagonals), skip positions that fall outside the array, and sum those cells together with the element itself.
No third-party libraries should be used.
I did it with a for loop and an inner for loop, checking the neighbourhood of each element. Is there a better approach?
Optimization: for each 1 in W, increment the count of every location whose neighbourhood contains it.
For a dense W of size n x n this still takes on the order of (2r+1)^2 x n^2 steps, the same as the approach in the question. However, if W is sparse, i.e. the number of 1s (say k) is much smaller than n x n, it takes about n x n + (2r+1)^2 x k steps, which is far fewer.
Given r assigned and W stored as
[[1, 0, 0, 0, 1],
[1, 1, 1, 0, 0],
[1, 0, 0, 0, 0],
[0, 0, 0, 1, 1],
[0, 1, 0, 0, 0]]
the following code
output = [[0 for _ in range(len(W[0]))] for _ in range(len(W))]
for i in range(len(W)):
    for j in range(len(W[0])):
        if W[i][j] == 1:
            # this 1 contributes to every in-bounds cell within radius r
            for off_i in range(-r, r + 1):
                for off_j in range(-r, r + 1):
                    if (0 <= i + off_i < len(W)) and (0 <= j + off_j < len(W[0])):
                        output[i + off_i][j + off_j] += 1
stores the required values in output.
For r = 1, output is as required:
[[3, 4, 2, 2, 1],
[4, 5, 2, 2, 1],
[3, 4, 3, 3, 2],
[2, 2, 2, 2, 2],
[1, 1, 2, 2, 2]]
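If W is not sparse, a different technique (not part of the answer above) removes the dependence on r altogether: 2D prefix sums, also known as a summed-area table. The following is a minimal sketch using only the standard library; the function name density_map is an assumption made for illustration.

```python
def density_map(W, r):
    """Count the 1s within radius r of every cell using 2D prefix sums."""
    n_rows, n_cols = len(W), len(W[0])
    # P[i][j] is the sum of W[0..i-1][0..j-1]; the extra zero row and
    # column avoid special cases at the top and left edges.
    P = [[0] * (n_cols + 1) for _ in range(n_rows + 1)]
    for i in range(n_rows):
        for j in range(n_cols):
            P[i + 1][j + 1] = W[i][j] + P[i][j + 1] + P[i + 1][j] - P[i][j]

    D = [[0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        # clip the window to the array bounds (half-open ranges)
        top, bottom = max(i - r, 0), min(i + r + 1, n_rows)
        for j in range(n_cols):
            left, right = max(j - r, 0), min(j + r + 1, n_cols)
            D[i][j] = (P[bottom][right] - P[top][right]
                       - P[bottom][left] + P[top][left])
    return D


W = [[1, 0, 0, 0, 1],
     [1, 1, 1, 0, 0],
     [1, 0, 0, 0, 0],
     [0, 0, 0, 1, 1],
     [0, 1, 0, 0, 0]]
D = density_map(W, 1)
for row in D:
    print(row)
```

Building P and reading each window both take constant work per cell, so the total is about n x n steps regardless of r.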
Related
I'm trying to write a script that generates a contracted basis set that I need for my scientific calculations. The output file is basically a 2D grid of numbers. The first column holds the exponents, and the second and subsequent columns hold the contraction coefficients. In the example below I have four functions (corresponding to 4 exponents), and in this contraction scheme they are contracted into two functions: the first has 3 exponents and hence 3 nonzero coefficients, while the second consists of only 1 function and hence 1 nonzero coefficient. So 4 functions contracted into two as 3 + 1 would look like this:
exponent1 coefficient1 0
exponent2 coefficient2 0
exponent3 coefficient3 0
exponent4 0 coefficient4
so that the nonzero numbers in a column are contracted together. In my script I of course want a general scheme, so that with n exponents I can build any contraction into between 1 and n functions.
The script currently looks like this:
#!/usr/bin/python3
def contract(e, i, c, n):
    """
    e = list of exponents == number of functions.
    Functions e are contracted to i functions whose coefficients
    are determined by c and n is a list which tells how the contraction
    is done (e.g. n = [3, 1] --> functions contracted into two as 3 + 1.)
    """
    num = len(e)
    grid = [[0 for i in range(i + 1)] for x in range(num)]
    for num1, row1 in enumerate(grid):
        row1[0] = e[num1] #add exponents
    for g in grid:
        print(g)

i = 2
e = [0, 1, 2, 3]
c = [4, 5, 6, 7]
n = [3, 1]
contract(e, i, c, n)
This works so far and the output this produces now is:
[0, 0, 0]
[1, 0, 0]
[2, 0, 0]
[3, 0, 0]
The output should be:
[0, 4, 0]
[1, 5, 0]
[2, 6, 0]
[3, 0, 7]
The problem is that I don't know how to write the rest of the code. How can I put the coefficients in the correct places? I will then print the numbers in the list, so the final output file should in this case look like:
0 4 0
1 5 0
2 6 0
3 0 7
Does anyone have an idea how I could do this?
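One possible way to place the coefficients is to walk through the contraction pattern n and move one column to the right each time a group is finished. This is only a sketch under two assumptions that the question does not state explicitly: coefficient c[k] belongs to exponent e[k], and the number of contracted functions equals len(n), so the separate i argument is dropped here.

```python
def contract(e, c, n):
    """Build the exponent/coefficient grid for contraction pattern n.

    Assumes c[k] is the coefficient of exponent e[k] and that the
    group sizes in n add up to len(e).
    """
    if len(c) != len(e) or sum(n) != len(e):
        raise ValueError("e, c and n are inconsistent")
    # first column: exponents; one zero-filled column per contracted function
    grid = [[exponent] + [0] * len(n) for exponent in e]
    row = 0
    for col, size in enumerate(n, start=1):
        # the next `size` rows belong to contracted function number `col`
        for _ in range(size):
            grid[row][col] = c[row]
            row += 1
    return grid


grid = contract([0, 1, 2, 3], [4, 5, 6, 7], [3, 1])
for line in grid:
    print(" ".join(map(str, line)))
```

Joining each row with spaces produces the plain-text layout asked for above.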
Let's say I have a machine that sends out 4 bits every second and I want to see the number of times a certain bit signature is sent over time.
I am given an input list of lists that contains the message bits as they change over time.
For my output I would like a list of dictionaries, one per bit-pair position, with each unique bit pair as the key and the number of times it appears as the value.
Edit, new example:
For example, the following data set is a representation of that data, with the horizontal axis being bit position and the vertical axis being samples over time. So in the following example I have 4 total bits and 6 total samples.
a = [
[0, 0, 1, 1],
[0, 1, 1, 1],
[1, 1, 1, 1],
[1, 1, 1, 1],
[0, 0, 0, 0],
[1, 0, 1, 0]]
For this data set I am trying to count how many times a certain bit string occurs. The length should be able to vary, but for this example let's say I am taking 2 bits at a time.
So the first sample [0,0,1,1] would be split into
[00,01,11], the second into [01,11,11], the third into [11,11,11], and so on, producing a list like so:
y = [
[00,01,11],
[01,11,11],
[11,11,11],
[11,11,11],
[00,00,00],
[10,01,10]]
From this I want to be able to count each unique signature and to produce a dictionary with keys corresponding to the signature and values to the counts.
The list of dictionaries would look like this
z = [
{'00':2, '01':1, '11':2, '10':1},
{'00':1, '01':2, '11':3},
{'00':1, '11':4, '10':1}]
Finding the counts is easy if I have a list of parsed items. However, getting from the raw data to that parsed list is where I am currently having some trouble. I have an implementation, but it's essentially 3 for loops and it runs really slowly over a large dataset. Surely there is a better and more pythonic way to go about this?
I am using numpy for some additional calculation later on in my program so I would not be against using it here.
UPDATE:
I have been looking around at other things and came to this. Not sure if this is the best solution either.
import numpy as np
a = np.array([
[0, 0, 1, 1],
[0, 1, 1, 1],
[1, 1, 1, 1]])
my_list = a.astype(str).tolist()
# How many elements each
# window should have
n = 2
# using list comprehension: one window starting at every position
final = [[''.join(c[i:i + n]) for i in range(len(c) - n + 1)] for c in my_list]
# final == [['00', '01', '11'], ['01', '11', '11'], ['11', '11', '11']]
UPDATE 2:
I have run the following implementations and tested their speeds, and here is what I came up with.
Running on the small example of 4 bits and 3 samples with a width of 2:
x = [
[0,0,1,1],
[0,1,1,1],
[1,1,1,1]]
My implementation took 0.0003 seconds
Kasrâmvd's implementation took 0.0002 seconds
Chris' implementation took 0.0002 seconds
Paul's implementation took 0.0243 seconds
However, when running against an actual dataset of 64 bits and 23,497 samples with a width of 2, I got these results:
My implementation took 1.5302 seconds
Kasrâmvd's implementation took 0.3913 seconds
Chris' Implementation took 2.0802 seconds
Paul's implementation took 0.0204 seconds
Here is an approach using convolution. Because fast convolution relies on the FFT and therefore computes with floats, which have a 52-bit mantissa, 53 is the maximum pattern length we can handle.
import itertools as it
import numpy as np
import scipy.signal as ss
MAX_BITS = np.finfo(float).nmant + 1
def sliding_window(data, width, return_keys=True, return_dict=True, prune_empty=True):
    n, m = data.shape
    if width > MAX_BITS:
        raise ValueError(f"max window width is {MAX_BITS}")
    patterns = ss.convolve(data, 1<<np.arange(width)[None], 'valid', 'auto').astype(int)
    patterns += np.arange(m-width+1)*(1<<width)
    cnts = np.bincount(patterns.ravel(), None, (m-width+1)*(1<<width)).reshape(m-width+1,-1)
    if return_keys or return_dict:
        keys = np.array([*map("".join, it.product(*width*("01",)))], 'S')
        if return_dict:
            dt = np.dtype([('key', f'S{width}'), ('value', int)])
            aux = np.empty(cnts.shape, dt)
            aux['value'] = cnts
            aux['key'] = keys
            if prune_empty:
                i,j = np.where(cnts)
                return [*map(dict, np.split(aux[i,j],
                                            i.searchsorted(np.arange(1,m-width+1))))]
            return [*map(dict, aux.tolist())]
        return keys, cnts
    return cnts

example = np.random.randint(0, 2, (10,10))
print(example)
print(sliding_window(example,3))
Sample run:
[[0 1 1 1 0 1 1 1 1 1]
[0 0 1 0 1 0 0 1 0 1]
[0 0 1 0 1 1 1 0 1 1]
[1 1 1 1 1 0 0 0 1 0]
[0 0 0 0 1 1 1 0 0 0]
[1 1 0 0 0 1 0 0 1 1]
[0 1 1 1 0 1 1 1 1 1]
[0 1 0 0 0 1 1 0 0 1]
[1 0 1 1 0 1 1 0 1 0]
[0 0 1 1 0 1 0 1 0 0]]
[{b'000': 1, b'001': 3, b'010': 1, b'011': 2, b'101': 1, b'110': 1, b'111': 1}, {b'000': 1, b'010': 2, b'011': 2, b'100': 2, b'111': 3}, {b'000': 2, b'001': 1, b'101': 2, b'110': 4, b'111': 1}, {b'001': 2, b'010': 1, b'011': 2, b'101': 4, b'110': 1}, {b'010': 2, b'011': 4, b'100': 2, b'111': 2}, {b'000': 1, b'001': 1, b'100': 1, b'101': 1, b'110': 4, b'111': 2}, {b'001': 2, b'010': 2, b'100': 2, b'101': 2, b'111': 2}, {b'000': 1, b'001': 1, b'010': 2, b'011': 2, b'100': 1, b'101': 1, b'111': 2}]
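As I read the code above, the convolution with powers of two amounts to reading every window as a binary number, and the per-position offset lets a single bincount tally each window position separately. The following pure-Python sketch illustrates that idea without NumPy or SciPy; the helper names window_codes and count_per_position are made up for this illustration, and it makes no claim to match the speed of the vectorized version.

```python
def window_codes(row, width):
    """Encode each length-`width` window of a 0/1 row as an integer.

    The leftmost bit of the window is the most significant one.
    """
    codes = []
    for start in range(len(row) - width + 1):
        code = 0
        for bit in row[start:start + width]:
            code = (code << 1) | bit
        codes.append(code)
    return codes


def count_per_position(data, width):
    """counts[p][code] = how often `code` occurs at window position p."""
    n_positions = len(data[0]) - width + 1
    counts = [[0] * (1 << width) for _ in range(n_positions)]
    for row in data:
        for pos, code in enumerate(window_codes(row, width)):
            counts[pos][code] += 1
    return counts


a = [[0, 0, 1, 1],
     [0, 1, 1, 1],
     [1, 1, 1, 1],
     [1, 1, 1, 1],
     [0, 0, 0, 0],
     [1, 0, 1, 0]]
counts = count_per_position(a, 2)
print(counts)
```

Index 0 to 3 in each inner list corresponds to the patterns 00, 01, 10 and 11.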
If you want a geometrical or algebraic analysis/solution, you can do the following:
In [108]: x = np.array([[0,0,1,1],
...: [0,1,1,1],
...: [1,1,1,1]])
...:
In [109]: pairs = np.dstack((x[:, :-1], x[:, 1:]))
In [110]: x, y, z = pairs.shape
In [111]: uniques = np.unique(pairs.reshape(x*y, z), axis=0)
In [112]: uniques
Out[112]:
array([[0, 0],
       [0, 1],
       [1, 1]])
# Note: 3d broadcasting is not recommended in any situation, please read doc for more details,
In [113]: R = (uniques[:,None][:,None,:] == pairs).all(3).sum(-1)
In [114]: R
Out[114]:
array([[1, 0, 0],
[1, 1, 0],
[1, 2, 3]])
Each column of matrix R holds, for one row of your original array, the count of each unique pair in the uniques object.
You can then get a Python object like the one you want as follows:
In [116]: [{tuple(i): j for i,j in zip(uniques, i) if j} for i in R.T]
Out[116]: [{(0, 0): 1, (0, 1): 1, (1, 1): 1}, {(0, 1): 1, (1, 1): 2}, {(1, 1): 3}]
This solution doesn't pair the bits, but gives them as tuples (although that should be simple enough to do).
EDIT: formed strings of bits as needed.
from collections import Counter
x = [[0,0,1,1],
[0,1,1,1],
[1,1,1,1]]
y = [[''.join(map(str, ref[j:j+2])) for j in range(len(x[0])-1)] \
for ref in x]
for bit in y:
    d = Counter(bit)
    print(d)
Prints
Counter({'00': 1, '01': 1, '11': 1})
Counter({'11': 2, '01': 1})
Counter({'11': 3})
EDIT: To increase the window from 2 to 3, you might add this to your code:
window = 3
offset = window - 1
y = [[''.join(map(str, ref[j:j+window])) for j in range(len(x[0])-offset)] \
for ref in x]
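The Counter loop above counts within each sample (row). If the goal is instead one dictionary per window position, as in the question's z, one option is to transpose y with zip before counting. A minimal sketch, using the six-sample data set from the question:

```python
from collections import Counter

a = [[0, 0, 1, 1],
     [0, 1, 1, 1],
     [1, 1, 1, 1],
     [1, 1, 1, 1],
     [0, 0, 0, 0],
     [1, 0, 1, 0]]

window = 2
# one string per window start, for every sample
y = [[''.join(map(str, row[j:j + window]))
      for j in range(len(row) - window + 1)]
     for row in a]

# zip(*y) groups the strings by window position instead of by sample
z = [dict(Counter(column)) for column in zip(*y)]
for d in z:
    print(d)
```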
I can't wrap my head around the csr_matrix examples in the scipy documentation: https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.csr_matrix.html
Can someone explain how this example works?
>>> row = np.array([0, 0, 1, 2, 2, 2])
>>> col = np.array([0, 2, 2, 0, 1, 2])
>>> data = np.array([1, 2, 3, 4, 5, 6])
>>> csr_matrix((data, (row, col)), shape=(3, 3)).toarray()
array([[1, 0, 2],
[0, 0, 3],
[4, 5, 6]])
I believe this is following this format.
csr_matrix((data, (row_ind, col_ind)), [shape=(M, N)])
where data, row_ind and col_ind satisfy the relationship a[row_ind[k], col_ind[k]] = data[k].
What is a here?
row = np.array([0, 0, 1, 2, 2, 2])
col = np.array([0, 2, 2, 0, 1, 2])
data = np.array([1, 2, 3, 4, 5, 6])
from the above arrays;
for k in 0~5
a[row_ind[k], col_ind[k]] = data[k]
a
row[0],col[0] = [0,0] = 1 (from data[0])
row[1],col[1] = [0,2] = 2 (from data[1])
row[2],col[2] = [1,2] = 3 (from data[2])
row[3],col[3] = [2,0] = 4 (from data[3])
row[4],col[4] = [2,1] = 5 (from data[4])
row[5],col[5] = [2,2] = 6 (from data[5])
so let's arrange matrix 'a' in shape(3X3)
a
0 1 2
0 [1, 0, 2]
1 [0, 0, 3]
2 [4, 5, 6]
This is a sparse matrix, so it stores explicit indices and the values at those indices. For example, row=0 and col=0 correspond to 1 (the first entries of all three arrays in your example), hence the [0,0] entry of the matrix is 1, and so on.
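The relationship a[row_ind[k], col_ind[k]] = data[k] can be replayed by hand with plain lists. This is only an illustration of the rule, not how scipy stores the matrix internally:

```python
row = [0, 0, 1, 2, 2, 2]
col = [0, 2, 2, 0, 1, 2]
data = [1, 2, 3, 4, 5, 6]

# start from an all-zero 3 x 3 matrix
a = [[0] * 3 for _ in range(3)]
for r, c, value in zip(row, col, data):
    # += mirrors scipy's documented behaviour of summing duplicate (row, col) entries
    a[r][c] += value

for line in a:
    print(line)
```

Every position that never appears in (row, col) simply keeps its initial 0.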
Represent the "data" in a 4 X 4 Matrix:
data = np.array([10,0,5,99,25,9,3,90,12,87,20,38,1,8])
indices = np.array([0,1,2,3,0,2,3,0,1,2,3,1,2,3])
indptr = np.array([0,4,7,11,14])
'indptr' (index pointers) is an array of offsets into 'indices' (the column
indices) and 'data'; it is not a linked list.
indptr[i] and indptr[i+1] delimit the slice of 'indices' and 'data' that belongs to row i.
The last entry, 14, is the length of data, len(data), so
indptr = np.array([0,4,7,11,len(data)]) is another way of writing 'indptr'.
0,4 --> slice 0:4 points to column indices 0,1,2,3 (row 0)
4,7 --> slice 4:7 points to column indices 0,2,3 (row 1)
7,11 --> slice 7:11 points to column indices 0,1,2,3 (row 2)
11,14 --> slice 11:14 points to column indices 1,2,3 (row 3)
# Representing the data in a 4,4 matrix
a = csr_matrix((data,indices,indptr),shape=(4,4),dtype=int)
a.todense()
matrix([[10, 0, 5, 99],
[25, 0, 9, 3],
[90, 12, 87, 20],
[ 0, 38, 1, 8]])
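To see how the three arrays fit together, the dense matrix can be rebuilt by hand: row i owns the entries between indptr[i] and indptr[i+1]. A small sketch with plain lists, intended only to illustrate the layout:

```python
data = [10, 0, 5, 99, 25, 9, 3, 90, 12, 87, 20, 38, 1, 8]
indices = [0, 1, 2, 3, 0, 2, 3, 0, 1, 2, 3, 1, 2, 3]
indptr = [0, 4, 7, 11, 14]

n_rows, n_cols = 4, 4
dense = [[0] * n_cols for _ in range(n_rows)]
for i in range(n_rows):
    start, end = indptr[i], indptr[i + 1]
    # indices[start:end] are the columns of row i, data[start:end] the values
    for col, value in zip(indices[start:end], data[start:end]):
        dense[i][col] = value

for line in dense:
    print(line)
```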
Another Stackoverflow explanation
As far as I understand, the row and col arrays hold the indices that correspond to the non-zero values in the matrix: a[0, 0] = 1, a[0, 2] = 2, a[1, 2] = 3 and so on. Since there are no indices for a[0, 1], a[1, 0] and a[1, 1], the corresponding values in the matrix are 0.
Also, maybe this little intro will be helpful for you:
https://www.youtube.com/watch?v=Lhef_jxzqCg
@Rohit Pandey stated it correctly; I just want to add an example.
When most of the elements of a matrix are 0, we call it a sparse matrix. The sparse representation drops the zero elements from the matrix, saving memory and computing time. We store only the non-zero items together with their respective row and column indices, e.g.
0 3 0 4
0 5 7 0
0 0 0 0
0 2 6 0
We calculate the sparse matrix by putting non-zero items row index first, then column index, and finally non-zero values like the following:
Row Column Value
0 1 3
0 3 4
1 1 5
1 2 7
3 1 2
3 2 6
By reversing the process we get the simple matrix form from the sparse form.
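Both directions can be sketched in a few lines of plain Python: first collect (row, column, value) triplets for the non-zero items, then rebuild the full matrix from them. The variable names are chosen for this illustration only.

```python
matrix = [[0, 3, 0, 4],
          [0, 5, 7, 0],
          [0, 0, 0, 0],
          [0, 2, 6, 0]]

# dense -> sparse: keep only the non-zero items with their positions
triplets = [(i, j, value)
            for i, row in enumerate(matrix)
            for j, value in enumerate(row)
            if value != 0]

# sparse -> dense: start from zeros and put every stored value back
restored = [[0] * len(matrix[0]) for _ in matrix]
for i, j, value in triplets:
    restored[i][j] = value

print(triplets)
print(restored == matrix)
```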
I have a matrix similar to this:
1 0 0
1 0 0
0 2 0
0 2 0
0 0 3
0 0 3
(Non-zero numbers denote parts that I'm interested in. Actual number inside matrix could be random.)
And I need to produce a vector like this:
[ 1 1 2 2 3 3 ].T
I can do this with a loop:
result = np.zeros([rows])
for y in range(rows):
    x = y // (rows // cols) # pick index of the corresponding column
    result[y] = mat[y][x]
But I can't figure out how to do this in vectorized form.
This might be what you want.
import numpy as np
m = np.array([
[1, 0, 0],
[1, 0, 0],
[0, 2, 0],
[0, 2, 0],
[0, 0, 3],
[0, 0, 3]
])
rows, cols = m.shape
# axis1 indices
y = np.arange(rows)
# axis2 indices
x = y // (rows // cols)
result = m[y,x]
print(result)
Result:
[1 1 2 2 3 3]
For example, the binary table for 3 bits:
0 0 0
0 0 1
0 1 0
1 1 1
1 0 0
1 0 1
And I want to store this into an n*n*2 array so it would be:
0 0 0
0 0 1
0 1 0
1 1 1
1 0 0
1 0 1
To generate the combinations automatically, you can use itertools.product from the standard library, which generates all possible combinations of the supplied sequences, i.e. the cartesian product of the input iterables. The repeat argument comes in handy here because all of our sequences are identical ranges.
from itertools import product
x = [i for i in product(range(2), repeat=3)]
Now if we want an array instead of a list of tuples, we can just pass this to numpy.array.
import numpy as np
x = np.array(x)
# [[0 0 0]
# [0 0 1]
# [0 1 0]
# [0 1 1]
# [1 0 0]
# [1 0 1]
# [1 1 0]
# [1 1 1]]
If you want all elements in a single list, so you could index them with a single index, you could chain the iterable:
from itertools import chain, product
x = list(chain.from_iterable(product(range(2), repeat=3)))
result: [0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1]
Most people would expect 2^n x n as in
np.c_[tuple(i.ravel() for i in np.mgrid[:2,:2,:2])]
# array([[0, 0, 0],
# [0, 0, 1],
# [0, 1, 0],
# [0, 1, 1],
# [1, 0, 0],
# [1, 0, 1],
# [1, 1, 0],
# [1, 1, 1]])
Explanation: np.mgrid as used here creates the coordinates of the corners of a unit cube which happen to be all combinations of 0 and 1. The individual coordinates are then ravelled and joined as columns by np.c_
Here's a recursive, native python (no libraries) version of it:
def allBinaryPossiblities(maxLength, s=""):
    if len(s) == maxLength:
        return s
    else:
        temp = allBinaryPossiblities(maxLength, s + "0") + "\n"
        temp += allBinaryPossiblities(maxLength, s + "1")
        return temp

print (allBinaryPossiblities(3))
It prints all possible combinations:
000
001
010
011
100
101
110
111
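If a list of lists of integers is preferred over one long string, a non-recursive variant can read the bits straight out of the numbers 0 to 2^n - 1. This is a sketch with a made-up function name, binary_table, and it needs no libraries either:

```python
def binary_table(n):
    """Return all 2**n rows of n bits, most significant bit first."""
    return [[(value >> (n - 1 - bit)) & 1 for bit in range(n)]
            for value in range(2 ** n)]


table = binary_table(3)
for row in table:
    print(*row)
```

Row k of the table is simply the binary representation of k, so the rows come out in the same order as in the listings above.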