I have an array A. I want to identify all locations of the element 1 and convert them to a list, as shown in the expected output, but I am getting an error.
import numpy as np
A=np.array([0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
B=np.where(A==1)
B=B.tolist()
print(B)
The error is
in <module>
B=B.tolist()
AttributeError: 'tuple' object has no attribute 'tolist'
The expected output is
[1, 2, 5, 7, 10, 11]
np.where used with only the condition returns a tuple of index arrays, one array per dimension of the input. According to the docs, this is much like np.nonzero, which is the recommended approach over np.where. So, since your array is one-dimensional, np.where will return a tuple with one element, inside of which is the array containing the indices in your expected output. You can resolve your problem by indexing into the tuple: np.where(A == 1)[0].tolist().
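For instance, a minimal sketch of that fix, using a shortened version of your array (the trailing zeros don't change the result):
import numpy as np
A = np.array([0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1])
idx = np.where(A == 1)   # a tuple with one index array per dimension
B = idx[0].tolist()      # index into the tuple, then convert
print(B)                 # [1, 2, 5, 7, 10, 11]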
However, I recommend using np.flatnonzero instead, which avoids the hassle entirely:
import numpy as np
A = np.array([0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
B = np.flatnonzero(A).tolist()
print(B)
# [1, 2, 5, 7, 10, 11]
PS: when all other elements are 0, you don't have to explicitly compare to 1 ;).
import numpy as np
A = np.array([0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
indices = np.where(A == 1)[0]
B = indices.tolist()
print(B)
You should access the first element of this tuple with B[0]:
import numpy as np
A=np.array([0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
B=np.where(A==1)
B = B[0].tolist()
print(B) # [1, 2, 5, 7, 10, 11]
I'd like to take the difference of non-adjacent values within a 2D numpy array along axis=-1 (per row). An array can consist of a large number of rows.
Each row is a selection of values along a timeline from 1 to N.
For N=12, the array could look like below 3x12 shape:
timeline = np.array([[ 0, 0, 0, 4, 0, 6, 0, 0, 9, 0, 11, 0],
[ 1, 0, 3, 4, 0, 0, 0, 0, 9, 0, 0, 12],
[ 0, 0, 0, 4, 0, 0, 0, 0, 9, 0, 0, 0]])
The desired result should look like the following (the size of the array is unchanged and position is important):
diff = np.array([[ 0, 0, 0, 4, 0, 2, 0, 0, 3, 0, 2, 0],
[ 1, 0, 2, 1, 0, 0, 0, 0, 5, 0, 0, 3],
[ 0, 0, 0, 4, 0, 0, 0, 0, 5, 0, 0, 0]])
I am aware of the 1D solution from Diff on non-adjacent values:
imask = np.flatnonzero(timeline)
diff = np.zeros_like(timeline)
diff[imask] = np.diff(timeline[imask], prepend=0)
within which the last line can be replaced with
diff[imask[0]] = timeline[imask[0]]
diff[imask[1:]] = timeline[imask[1:]] - timeline[imask[:-1]]
and the first line can be replaced with
imask = np.where(timeline != 0)[0]
Attempting to generalise the 1D solution, I can see that imask = np.flatnonzero(timeline) is undesirable, as the rows become inter-dependent. Thus I am trying the alternative np.nonzero:
imask = np.nonzero(timeline)
diff = np.zeros_like(timeline)
diff[imask] = np.diff(timeline[imask], prepend=0)
However, this solution results in a connection between the rows' end values (the rows remain inter-dependent):
array([[ 0, 0, 0, 4, 0, 2, 0, 0, 3, 0, 2, 0],
[-10, 0, 2, 1, 0, 0, 0, 0, 5, 0, 0, 3],
[ 0, 0, 0, -8, 0, 0, 0, 0, 5, 0, 0, 0]])
How can I make the "prepend" start each row with a zero?
Wow, I did it... (it is an interesting problem for me too).
I made a non_adjacent_diff function to handle a single row and applied it to every row using np.apply_along_axis. Try this code:
import numpy as np

timeline = np.array([[ 0, 0, 0, 4, 0, 6, 0, 0, 9, 0, 11, 0],
[ 1, 0, 3, 4, 0, 0, 0, 0, 9, 0, 0, 12],
[ 0, 0, 0, 4, 0, 0, 0, 0, 9, 0, 0, 0]])
def non_adjacent_diff(row):
not_zero_index = np.where(row != 0)
diff = row[not_zero_index][1:] - row[not_zero_index][:-1]
np.put(row, not_zero_index[0][1:], diff)
return row
np.apply_along_axis(non_adjacent_diff, 1, timeline)
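A fully vectorized alternative, if you want to avoid the Python-level loop hidden inside np.apply_along_axis: keep the np.nonzero approach from the question and then overwrite the first nonzero of each row with its own value, which removes the cross-row dependency. This is only a sketch, using the timeline array defined above, and it relies on np.nonzero returning indices in row-major order (which it does for C-ordered arrays):
rows, cols = np.nonzero(timeline)
diff = np.zeros_like(timeline)
diff[rows, cols] = np.diff(timeline[rows, cols], prepend=0)
# the first nonzero of each row should keep its own value, not the cross-row difference
first = np.diff(rows, prepend=-1) != 0   # True where a new row starts
diff[rows[first], cols[first]] = timeline[rows[first], cols[first]]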
Say I have a 2-dimensional boolean array. I would like to get a list of slices (or similar), where each slice represents the smallest subregion of the array containing True values while its border contains only False values.
I could loop over each row and column and store the indices where that condition is met, but I wonder if you know another way, or a library, that does this efficiently? You can assume that the boundary of the original boolean array is always False/0.
Example 1
Example 2
Edit! I added new examples with the correct solutions. Sorry for the confusion.
That's connected component analysis, which has been asked and answered before. Adapting the accepted answer from there to your needs, a possible solution is quite short:
import numpy as np
from scipy.ndimage import label
def analysis(array):
    labeled, _ = label(array, np.ones((3, 3), dtype=int))
for i in np.arange(1, np.max(labeled)+1):
pixels = np.array(np.where(labeled == i))
x1 = np.min(pixels[1, :])
x2 = np.max(pixels[1, :])
y1 = np.min(pixels[0, :])
y2 = np.max(pixels[0, :])
print(str(i) + ' | slice: array[' + str(y1) + ':' + str(y2) + ', ' + str(x1) + ':' + str(x2) + ']')
example1 = np.array([
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
[0, 0, 0, 1, 0, 1, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 1, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 1, 1, 0],
[0, 0, 0, 0, 0, 0, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
]).astype(bool)
example2 = np.array([
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 1, 0, 1, 0, 0],
[0, 0, 0, 1, 0, 0, 1, 0, 1, 0],
[0, 0, 0, 0, 0, 0, 0, 1, 0, 0],
[0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 1, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 1, 1, 0, 0, 1, 0],
[0, 0, 0, 0, 1, 0, 1, 0, 0, 0],
[0, 0, 0, 1, 0, 0, 0, 0, 1, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
]).astype(bool)
for a in [example1, example2]:
print(a, '\n')
analysis(a)
print('\n')
That's the output (without the examples):
[[...]]
1 | slice: array[1:2, 3:5]
2 | slice: array[4:6, 6:8]
3 | slice: array[8:8, 2:2]
[[...]]
1 | slice: array[1:3, 5:8]
2 | slice: array[2:2, 3:3]
3 | slice: array[4:5, 1:1]
4 | slice: array[5:8, 3:6]
5 | slice: array[6:6, 8:8]
6 | slice: array[8:8, 8:8]
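As a side note, scipy.ndimage also provides find_objects, which returns the minimal bounding slices per label directly, so the per-label min/max bookkeeping can be skipped. A sketch of that variant (note these are real Python slices, so the stop index is one past the inclusive maximum printed above):
import numpy as np
from scipy.ndimage import label, find_objects
def analysis_slices(array):
    # 8-connected labelling, as above
    labeled, _ = label(array, np.ones((3, 3), dtype=int))
    # one tuple of slice objects per component, covering its bounding box
    return find_objects(labeled)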
Hope that helps!
------------------
System information
------------------
Python: 3.8.1
SciPy: 1.4.1
------------------
You can approach the problem from a graph perspective: the coordinates of the ones are the graph nodes, connected 8-way, and then you just have to find the connected components of the graph. If the data is sparse, this should be quite a bit faster than looping through possible square sizes. This is an example of how it could work:
from itertools import combinations
def find_squares(a):
# Find ones
ones = [(i, j) for i, row in enumerate(a) for j, val in enumerate(row) if val]
# Make graph of connected ones
graph = {a: [] for a in ones}
for a, b in combinations(ones, 2):
if abs(a[0] - b[0]) <= 1 and abs(a[1] - b[1]) <= 1:
graph[a].append(b)
graph[b].append(a)
# Find connected components in graph
components = []
for a, a_neigh in graph.items():
if any(a in c for c in components):
continue
component = {a, *a_neigh}
pending = [*a_neigh]
visited = {a}
while pending:
b = pending.pop()
if b in visited:
continue
visited.add(b)
component.add(b)
b_neigh = graph[b]
component.update(b_neigh)
pending.extend(c for c in b_neigh if c not in visited)
components.append(component)
# Find bounds for each component
bounds = [((min(a[0] for a in c), min(a[1] for a in c)),
(max(a[0] for a in c), max(a[1] for a in c)))
for c in components]
return bounds
a = [[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 1, 0, 1, 0, 0],
[0, 0, 0, 1, 0, 0, 1, 0, 1, 0],
[0, 0, 0, 0, 0, 0, 0, 1, 0, 0],
[0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 1, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 1, 1, 0, 0, 1, 0],
[0, 0, 0, 0, 1, 0, 1, 0, 0, 0],
[0, 0, 0, 1, 0, 0, 0, 0, 1, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]]
square_bounds = find_squares(a)
print(*square_bounds, sep='\n')
# ((1, 5), (3, 8))
# ((2, 3), (2, 3))
# ((4, 1), (5, 1))
# ((5, 3), (8, 6))
# ((6, 8), (6, 8))
# ((8, 8), (8, 8))
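If you prefer the bounds as actual slice objects, as the question asks, a small follow-up sketch converts them (the +1 turns the inclusive maximum into an exclusive stop):
slices = [(slice(r0, r1 + 1), slice(c0, c1 + 1))
          for (r0, c0), (r1, c1) in square_bounds]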
I've got a 219 by 219 np.array with mostly 0s and about 2% nonzeros, and I now want to create new arrays where each of the nonzero values has a 90% chance of becoming a zero.
I know how to change the n-th nonzero value to 0, but how do I work with probabilities?
Probably this can be modified:
index=0
for x in range(0, 219):
for y in range(0, 219):
if (index+1) % 10 == 0:
B[x][y] = 0
index+=1
print(B)
You could use np.random.random to create an array of random numbers to compare with 0.9, and then use np.where to select either the original value or 0. Since each draw is independent, it doesn't matter if we replace a 0 with a 0, so we don't need to treat zero and nonzero values differently. For example:
In [184]: A = np.random.randint(0, 2, (8,8))
In [185]: A
Out[185]:
array([[1, 1, 1, 0, 0, 0, 0, 1],
[1, 1, 1, 0, 0, 0, 0, 0],
[1, 1, 1, 1, 1, 0, 0, 0],
[0, 1, 0, 1, 0, 0, 0, 1],
[0, 1, 0, 1, 1, 1, 1, 0],
[1, 1, 0, 1, 1, 0, 0, 0],
[1, 0, 0, 1, 0, 0, 1, 0],
[1, 1, 0, 0, 0, 1, 0, 1]])
In [186]: np.where(np.random.random(A.shape) < 0.9, 0, A)
Out[186]:
array([[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 1],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 0, 0]])
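Applied to your case, a minimal sketch; the array A below is only a stand-in with roughly 2% nonzeros, and np.random.default_rng is the newer Generator API (plain np.random.random works the same way):
import numpy as np
rng = np.random.default_rng()
# stand-in for your 219x219 array: ~2% of entries are nonzero
A = (rng.random((219, 219)) < 0.02).astype(int)
# each entry is zeroed with probability 0.9; zeros are unaffected either way
B = np.where(rng.random(A.shape) < 0.9, 0, A)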
import random
import numpy as np

# First method: draw each value independently with np.random.choice
prob = 0.3
print(np.random.choice([2, 5], (5,), p=[prob, 1 - prob]))

# Second method (which I prefer): build the exact counts of each value, then shuffle
def randomZerosOnes(a, b, N, prob):
if prob > 1-prob:
n1=int((1-prob)*N)
n0=N-n1
else:
n0=int(prob*N)
n1=N-n0
zo=np.concatenate(([a for _ in range(n0)] ,[b for _ in range(n1)] ), axis=0 )
random.shuffle(zo)
return zo
zo=randomZerosOnes(2,5, N=5, prob=0.3)
print(zo)
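If, instead, you want exactly 90% of the nonzero entries of the original 219x219 array zeroed (rather than each one independently with probability 0.9), a small sketch in the same spirit of drawing exact counts; the array A here is only a stand-in:
import numpy as np
rng = np.random.default_rng()
A = (rng.random((219, 219)) < 0.02).astype(int) * 2   # stand-in sparse array
B = A.copy()
nz = np.flatnonzero(B)                                 # flat indices of nonzero entries
kill = rng.choice(nz, size=int(round(0.9 * nz.size)), replace=False)
B.flat[kill] = 0                                       # zero out a random 90% of them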
I need to have something like this:
arr = array([[1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0,
0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1,
0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
[0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0,
0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0],
[1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0,
0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0],
[0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1,
0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0]])
Each row contains 36 elements; every 6 elements in a row represent a hidden row, and each hidden row needs exactly one 1 and 0 everywhere else. In other words, each consecutive group of 6 entries must contain exactly one 1. This is my requirement for arr.
I have a table that's going to be used to compute a "fitness" value for each row. That is, I have a
table = np.array([10, 5, 4, 6, 5, 1, 6, 4, 9, 7, 3, 2, 1, 8, 3,
6, 4, 6, 5, 3, 7, 2, 1, 4, 3, 2, 5, 6, 8, 7, 7, 6, 4, 1, 3, 2])
table = table.T
and I'm going to multiply each row of arr with table. The result of that multiplication (a single scalar, the dot product of the row with table) will be stored as the "fitness" value of that corresponding row, UNLESS the row does not fit the requirement described above, in which case its fitness should be 0.
An example of what should be returned is
result = array([5,12,13,14,20,34])
I need a way to do this, but I'm too new to numpy to know how.
(I'm assuming you want the validity check you've asked about in the first half.)
I believe better or more elegant solutions exist, but this is what I think can do the job:
np.all(arr.reshape(arr.shape[0], -1, 6).sum(axis=2) == 1)  # True when every group of 6 entries has exactly one 1
Alternatively, you can construct the array (with 0's and 1's) and then just compare with it, say using not_equal.
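For concreteness, a short sketch of how such a check can feed into the fitness computation the question asks for, assuming arr and table as defined in the question:
# valid[i] is True when every group of 6 entries in row i sums to exactly 1
valid = (arr.reshape(arr.shape[0], -1, 6).sum(axis=2) == 1).all(axis=1)
# dot product per row, zeroed where the row is ill-formed
result = np.where(valid, arr.dot(table), 0)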
I'm also not 100% sure of your question, but I'll try to answer to the best of my knowledge.
Since you're saying your matrix has "hidden rows", to check whether it is well formed, the easiest way seems to be to just reshape it:
# First check, returns true if all elements are either 0 or 1
np.in1d(arr, [0,1]).all()
# Second check, provided the above was True, returns True if
# each "hidden row" has exactly one 1 and other 0.
(arr.reshape(6,6,6).sum(axis=2) == 1).all()
Both checks return "True" for your arr.
Now, my understanding is that for each "large" row of 36 elements, you want a scalar product with your "table" vector, unless that "large" row has an ill-formed "hidden small" row. In this case, I'd do something like:
# The following computes the result, not checking for integrity
results = arr.dot(table)
# Now remove the results that are not well formed.
# First, compute "large" rows where at least one "small" subrow
# fails the condition.
mask = (arr.reshape(6,6,6).sum(axis=2) != 1).any(axis=1)
# And set the corresponding answer to 0
results[mask] = 0
However, running this code against your data returns the answer
array([38, 31, 24, 24, 32, 20])
which is not what you mention; did I misunderstand your requirement, or was the example based on different data?