I am trying to create a block matrix with NumPy in which each entry is itself a 4x4 matrix.
As an example, let's fill all the entries with 4x4 zero matrices.
import numpy as np

N = 9
sizeOfBlock = 4
A = np.zeros((N, N), dtype=object)
for i in np.arange(N):
    for j in np.arange(N):
        A[i, j] = np.zeros((sizeOfBlock, sizeOfBlock))
This will create the matrix with the correct shape.
Now I would like to convert it from a 9x9 matrix of objects to a 36x36 matrix with all the entries.
Is there any way to do this?
Best Regards
Your use of an object dtype array has few, if any, advantages over a list of lists.
np.block can make an array from nested lists of arrays:
In [294]: A=np.arange(4).reshape(2,2); Z=np.zeros((2,2),int)
In [295]: np.block([[A,Z],[Z,A]])
Out[295]:
array([[0, 1, 0, 0],
[2, 3, 0, 0],
[0, 0, 0, 1],
[0, 0, 2, 3]])
An object dtype array:
In [296]: O = np.empty((2,2),object)
In [298]: O[0,0]=O[1,1]=A
In [300]: O[0,1]=O[1,0]=Z
In [301]: O
Out[301]:
array([[array([[0, 1],
[2, 3]]), array([[0, 0],
[0, 0]])],
[array([[0, 0],
[0, 0]]), array([[0, 1],
[2, 3]])]], dtype=object)
block doesn't do anything useful with that:
In [302]: np.block(O)
Out[302]:
array([[array([[0, 1],
[2, 3]]), array([[0, 0],
[0, 0]])],
[array([[0, 0],
[0, 0]]), array([[0, 1],
[2, 3]])]], dtype=object)
However, if we turn it into a list of lists, as in [295]:
In [303]: np.block(O.tolist())
Out[303]:
array([[0, 1, 0, 0],
[2, 3, 0, 0],
[0, 0, 0, 1],
[0, 0, 2, 3]])
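Applied to the question's layout, the same conversion can be sketched as follows. This is a minimal sketch: the blocks are filled with distinct values (rather than zeros) only so that the layout can be checked, and the names size_of_block and dense are illustrative.

```python
import numpy as np

N = 9
size_of_block = 4

# Build the 9x9 object array of 4x4 blocks, as in the question.
A = np.empty((N, N), dtype=object)
for i in range(N):
    for j in range(N):
        # Each block holds a distinct value so its position can be verified.
        A[i, j] = np.full((size_of_block, size_of_block), i * N + j)

# np.block needs nested lists, so convert the object array first.
dense = np.block(A.tolist())
print(dense.shape)  # (36, 36)
```

Block (i, j) then occupies rows 4*i to 4*i+3 and columns 4*j to 4*j+3 of dense.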
The current code of np.block is obscured by the dispatching layer, but when I looked at it in the past, it was using a recursive hstack and vstack:
In [307]: np.vstack((np.hstack(O[0]),np.hstack(O[1])))
Out[307]:
array([[0, 1, 0, 0],
[2, 3, 0, 0],
[0, 0, 0, 1],
[0, 0, 2, 3]])
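The same row-by-row idea extends to an object array with any number of block rows. A small sketch (the helper name stack_blocks is made up for illustration and assumes all blocks have the same shape):

```python
import numpy as np

def stack_blocks(obj_arr):
    """Join a 2d object array of equally sized 2d blocks into one 2d array."""
    # Each row of the object array is a 1d sequence of blocks; join it side by side.
    rows = [np.hstack(list(row)) for row in obj_arr]
    # Stack the joined rows on top of each other.
    return np.vstack(rows)

A = np.arange(4).reshape(2, 2)
Z = np.zeros((2, 2), int)
O = np.empty((2, 2), object)
O[0, 0] = O[1, 1] = A
O[0, 1] = O[1, 0] = Z
print(stack_blocks(O))
```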
concatenate and family can handle 1d object dtype arrays, treating them as a sequence, or list, of arrays. They can't work directly with 2d object arrays.
Comments suggest working with a 4d array:
In [308]: arr = np.array([[A,Z],[Z,A]])
In [309]: arr
Out[309]:
array([[[[0, 1],
[2, 3]],
[[0, 0],
[0, 0]]],
[[[0, 0],
[0, 0]],
[[0, 1],
[2, 3]]]])
A simple reshape does not preserve the block layout:
In [310]: arr.reshape(4,4)
Out[310]:
array([[0, 1, 2, 3],
[0, 0, 0, 0],
[0, 0, 0, 0],
[0, 1, 2, 3]])
but an axis swap will:
In [311]: arr.transpose(0,2,1,3).reshape(4,4)
Out[311]:
array([[0, 1, 0, 0],
[2, 3, 0, 0],
[0, 0, 0, 1],
[0, 0, 2, 3]])
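The axis swap generalizes to any grid of equally sized blocks. A sketch, assuming the blocks are stored in a 4d array of shape (block_rows, block_cols, h, w); the function name blocks_to_dense is illustrative:

```python
import numpy as np

def blocks_to_dense(arr4d):
    """Flatten an (R, C, h, w) array of blocks into an (R*h, C*w) matrix."""
    R, C, h, w = arr4d.shape
    # Move each block's row axis next to the block-row axis before merging.
    return arr4d.transpose(0, 2, 1, 3).reshape(R * h, C * w)

blocks = np.arange(2 * 3 * 2 * 2).reshape(2, 3, 2, 2)
dense = blocks_to_dense(blocks)

# Compare with np.block applied to a nested list of the same blocks.
nested = [[blocks[r, c] for c in range(3)] for r in range(2)]
print(np.array_equal(dense, np.block(nested)))
```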
For some layouts, the Kronecker product does the job:
In [315]: np.kron(np.eye(2).astype(int),A)
Out[315]:
array([[0, 1, 0, 0],
[2, 3, 0, 0],
[0, 0, 0, 1],
[0, 0, 2, 3]])
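More generally, np.kron places a scaled copy of the second argument at every position of the first, so a 0/1 pattern matrix chooses where the block is repeated. A sketch (pattern and tiled are illustrative names); this only helps when every nonzero block is the same matrix up to a scalar factor:

```python
import numpy as np

A = np.arange(4).reshape(2, 2)

# 1 marks a position that receives a copy of A, 0 a block of zeros.
pattern = np.array([[1, 0, 1],
                    [0, 1, 0]])
tiled = np.kron(pattern, A)
print(tiled.shape)  # (4, 6)
```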
I have a NumPy array with integer values 1 or 0 (they can be cast to booleans if necessary). The array is square and symmetric (see note below), and I want a list of the indices where a 1 appears:
Note that array[i][j] == array[j][i] and array[i][i] == 0 by design. Also, I cannot have any duplicates.
import numpy as np
array = np.array([
[0, 0, 1, 0, 1, 0, 1],
[0, 0, 1, 1, 0, 1, 0],
[1, 1, 0, 0, 0, 0, 1],
[0, 1, 0, 0, 1, 1, 0],
[1, 0, 0, 1, 0, 0, 1],
[0, 1, 0, 1, 0, 0, 0],
[1, 0, 1, 0, 1, 0, 0]
])
I would like a result that is like this (order of each sub-list is not important, nor is the order of each element within the sub-list):
[
[0, 2],
[0, 4],
[0, 6],
[1, 2],
[1, 3],
[1, 5],
[2, 6],
[3, 4],
[3, 5],
[4, 6]
]
I would also prefer not to loop over all indices twice with the condition j < i, because my array can be large, although I am aware that this is a possibility. Here is an example of this using two for loops:
import pandas as pd

result = []
for i in range(array.shape[0]):
    for j in range(i):
        if array[i][j]:
            result.append([i, j])
print(pd.DataFrame(result).sort_values(1).values)
# using dataframes and arrays for formatting but looking for
# 'result' which is a list
# Returns (same as above but columns are the opposite way round):
[[2 0]
[4 0]
[6 0]
[2 1]
[3 1]
[5 1]
[6 2]
[4 3]
[5 3]
[6 4]]
idx = np.argwhere(array)
idx = idx[idx[:,0]<idx[:,1]]
Another way:
idx = np.argwhere(np.triu(array))
output:
[[0 2]
[0 4]
[0 6]
[1 2]
[1 3]
[1 5]
[2 6]
[3 4]
[3 5]
[4 6]]
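Since the question asks for a plain list rather than an array, the index array can be converted with tolist(). A minimal sketch using the question's data:

```python
import numpy as np

array = np.array([
    [0, 0, 1, 0, 1, 0, 1],
    [0, 0, 1, 1, 0, 1, 0],
    [1, 1, 0, 0, 0, 0, 1],
    [0, 1, 0, 0, 1, 1, 0],
    [1, 0, 0, 1, 0, 0, 1],
    [0, 1, 0, 1, 0, 0, 0],
    [1, 0, 1, 0, 1, 0, 0],
])

# Keep only the upper triangle so each symmetric pair appears once.
idx = np.argwhere(np.triu(array))
result = idx.tolist()  # nested Python list of [i, j] pairs
print(result)
```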
Comparison:
# bousof's solution
def method1(array):
    return np.vstack(np.where(np.logical_and(array, np.diff(np.ogrid[:array.shape[0],:array.shape[0]])[0]>=0))).transpose()[:,::-1]

# also mentioned by hpaulj
def method2(array):
    return np.argwhere(np.triu(array))

def method3(array):
    idx = np.argwhere(array)
    return idx[idx[:,0]<idx[:,1]]

# the original method in the question by the OP (d-man)
def method4(array):
    result = []
    for i in range(array.shape[0]):
        for j in range(i):
            if array[i][j]:
                result.append([i, j])
    return result

# suggested by bousof in the comments
def method5(array):
    return np.vstack(np.where(np.triu(array))).transpose()

inputs = [np.random.randint(0,2,(n,n)) for n in [10,100,1000,10000]]
It seems that method1, method2 and method5 are slightly faster for large arrays, while method3 is faster for smaller ones.
In [249]: arr = np.array([
...: [0, 0, 1, 0, 1, 0, 1],
...: [0, 0, 1, 1, 0, 1, 0],
...: [1, 1, 0, 0, 0, 0, 1],
...: [0, 1, 0, 0, 1, 1, 0],
...: [1, 0, 0, 1, 0, 0, 1],
...: [0, 1, 0, 1, 0, 0, 0],
...: [1, 0, 1, 0, 1, 0, 0]
...: ])
The most common way of getting the indices of the nonzero (True) elements is np.nonzero (which is what np.where does when called with only a condition):
In [250]: idx = np.nonzero(arr)
In [251]: idx
Out[251]:
(array([0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 6, 6, 6]),
array([2, 4, 6, 2, 3, 5, 0, 1, 6, 1, 4, 5, 0, 3, 6, 1, 3, 0, 2, 4]))
This is a tuple - 2 arrays for a 2d array. It can be used directly to index the array (or anything like it): arr[idx] will give all 1s.
Applying np.transpose to that gives an array of 'pairs', which is what np.argwhere does:
In [252]: np.argwhere(arr)
Out[252]:
array([[0, 2],
[0, 4],
[0, 6],
[1, 2],
[1, 3],
[1, 5],
[2, 0],
[2, 1],
[2, 6],
[3, 1],
[3, 4],
[3, 5],
[4, 0],
[4, 3],
[4, 6],
[5, 1],
[5, 3],
[6, 0],
[6, 2],
[6, 4]])
Using such an array to index arr is harder - requiring a loop and conversion to tuple.
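To illustrate the difference, here is a sketch of indexing with the 'pairs' array, either one pair at a time or by turning its columns back into a tuple of index arrays (a smaller array is used for brevity, and the variable names are illustrative):

```python
import numpy as np

arr = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]])
pairs = np.argwhere(arr)

# One element at a time: each row must become a tuple to act as a 2d index.
values_loop = [arr[tuple(p)] for p in pairs]

# All at once: split the columns back into a tuple of index arrays,
# which is the form np.nonzero returns in the first place.
values_vec = arr[tuple(pairs.T)]
print(values_vec)
```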
To weed out the symmetric duplicates we could make a lower-triangular array:
In [253]: np.tril(arr)
Out[253]:
array([[0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0],
[1, 1, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 0],
[1, 0, 0, 1, 0, 0, 0],
[0, 1, 0, 1, 0, 0, 0],
[1, 0, 1, 0, 1, 0, 0]])
In [254]: np.argwhere(np.tril(arr))
Out[254]:
array([[2, 0],
[2, 1],
[3, 1],
[4, 0],
[4, 3],
[5, 1],
[5, 3],
[6, 0],
[6, 2],
[6, 4]])
You can use numpy.where:
>>> np.vstack(np.where(np.logical_and(array, np.diff(np.ogrid[:array.shape[0],:array.shape[0]])[0]<=0))).transpose()
array([[2, 0],
[2, 1],
[3, 1],
[4, 0],
[4, 3],
[5, 1],
[5, 3],
[6, 0],
[6, 2],
[6, 4]])
np.diff(np.ogrid[:array.shape[0],:array.shape[0]])[0]<=0 is true only on the lower part of the matrix (diagonal included). If the order is important, you can get the same order as in the question by selecting the upper part instead and reversing the columns:
>>> np.vstack(np.where(np.logical_and(array, np.diff(np.ogrid[:array.shape[0],:array.shape[0]])[0]>=0))).transpose()[:,::-1]
array([[2, 0],
[4, 0],
[6, 0],
[2, 1],
[3, 1],
[5, 1],
[6, 2],
[4, 3],
[5, 3],
[6, 4]])
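The same masks can be written without np.diff by unpacking np.ogrid and comparing the two index grids directly. This is a sketch of an equivalent formulation, not the answer's original code; it may be useful on recent NumPy releases, where passing the two differently shaped ogrid arrays to np.diff may raise an error instead of building an object array. The names rows, cols, lower and upper_reversed are illustrative.

```python
import numpy as np

array = np.array([
    [0, 0, 1, 0, 1, 0, 1],
    [0, 0, 1, 1, 0, 1, 0],
    [1, 1, 0, 0, 0, 0, 1],
    [0, 1, 0, 0, 1, 1, 0],
    [1, 0, 0, 1, 0, 0, 1],
    [0, 1, 0, 1, 0, 0, 0],
    [1, 0, 1, 0, 1, 0, 0],
])

# rows has shape (n, 1) and cols has shape (1, n); cols - rows broadcasts to (n, n).
rows, cols = np.ogrid[:array.shape[0], :array.shape[1]]

# Lower part (column index <= row index).
lower = np.argwhere(np.logical_and(array, cols - rows <= 0))

# Upper part with the columns reversed, giving the order shown in the question.
upper_reversed = np.argwhere(np.logical_and(array, cols - rows >= 0))[:, ::-1]
print(upper_reversed)
```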
Matrix A:
A = np.array([[3, 0, 0, 8, 3],
[9, 3, 2, 2, 6],
[5, 5, 4, 2, 8],
[3, 8, 7, 1, 2],
[3, 9, 1, 5, 5]])
Matrix B: the values in row i are the column indices to select in row i of matrix A.
B = np.array([[1, 2],
[3, 4],
[1, 3],
[0, 1],
[2, 3]])
We will set the values in A whose indices are in B to 1, and the others to 0.
Then the result will be:
A = np.array([[0, 1, 1, 0, 0],
[0, 0, 0, 1, 1],
[0, 1, 0, 1, 0],
[1, 1, 0, 0, 0],
[0, 0, 1, 1, 0]])
I don't want to use a for loop. How can I do it with numpy?
We can index using arrays. For axis 0, we make a range from 0 to len(B) to cover each row. For axis 1, we transpose B so that it broadcasts against that range and supplies the column indices we want to access.
>>> C = np.zeros_like(A)
>>> C[np.arange(len(B)), B.T] = 1
>>> C
array([[0, 1, 1, 0, 0],
[0, 0, 0, 1, 1],
[0, 1, 0, 1, 0],
[1, 1, 0, 0, 0],
[0, 0, 1, 1, 0]])
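To see why the transpose is needed: np.arange(len(B)) has shape (5,) and B.T has shape (2, 5), so the two broadcast to shape (2, 5) and pair each row number with that row's entries in B. The sketch below checks the result against an explicit loop (C_loop is an illustrative name):

```python
import numpy as np

B = np.array([[1, 2],
              [3, 4],
              [1, 3],
              [0, 1],
              [2, 3]])

# Vectorized assignment from the answer.
C = np.zeros((5, 5), dtype=int)
C[np.arange(len(B)), B.T] = 1

# Equivalent explicit loop, for comparison only.
C_loop = np.zeros((5, 5), dtype=int)
for i, cols in enumerate(B):
    for j in cols:
        C_loop[i, j] = 1

print(np.array_equal(C, C_loop))  # True
```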
>>> B = np.array([[1, 2],
... [3, 4],
... [1, 3],
... [0, 1],
... [2, 3]])
Convenient but a bit wasteful:
>>> np.identity(5,int)[B].sum(1)
array([[0, 1, 1, 0, 0],
[0, 0, 0, 1, 1],
[0, 1, 0, 1, 0],
[1, 1, 0, 0, 0],
[0, 0, 1, 1, 0]])
More economical but also more typing:
>>> out = np.zeros((5,5),int)
>>> out[np.c_[:5],B] = 1
>>> out
array([[0, 1, 1, 0, 0],
[0, 0, 0, 1, 1],
[0, 1, 0, 1, 0],
[1, 1, 0, 0, 0],
[0, 0, 1, 1, 0]])
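Another option, not used in the answers above, is np.put_along_axis, which writes values at the given indices along one axis. A minimal sketch, assuming a NumPy version that provides this function:

```python
import numpy as np

B = np.array([[1, 2],
              [3, 4],
              [1, 3],
              [0, 1],
              [2, 3]])

out = np.zeros((5, 5), dtype=int)
# For each row i, set out[i, B[i, k]] = 1 for every k; out is modified in place.
np.put_along_axis(out, B, 1, axis=1)
print(out)
```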
I have a list in the form:
lst = [[1, 0, 0, 0], [1, 1, 0, 0], [2, 0, 0, 0], [2, 1, 0, 0], [2, 1, 0, 0], [1, 1, 0, 0], [3, 1, 0, 0], [1, 3, 0, 0], [2, 1, 0, 0], [2, 0, 0, 0]]
However, the last two sub-elements will always be zero at the start, so it could be like:
lst = [[1, 0], [1, 1], [2, 0], [2, 1], [2, 1], [1, 1], [3, 1], [1, 3], [2, 1], [2, 0]]
if that is easier.
I want to remove the duplicates from this list, count them, and set the 3rd sub-element to the count. For the list above, I want:
lst = [[1, 0, 1, 0], [1, 1, 2, 0], [2, 0, 2, 0], [2, 1, 3, 0], [3, 1, 1, 0], [1, 3, 1, 0]]
I have found explanations of how to remove duplicates at:
Removing Duplicates from Nested List Based on First 2 Elements
and
Removing duplicates from list of lists in Python
but I don't know how to count the duplicates. The order of the elements in the overall list doesn't matter but the order of the elements in the sub-lists must be preserved as [1,3] and [3,1] aren't the same thing.
If this turns out to be a dead end I could do something like hash the first two elements for counting but only if I could get them back after counting.
Any help is appreciated.
Sorry for dyslexia!
For example:
lst = [[1, 0, 0, 0], [1, 1, 0, 0], [2, 0, 0, 0], [2, 1, 0, 0], [2, 1, 0, 0], [1, 1, 0, 0], [3, 1, 0, 0], [1, 3, 0, 0], [2, 1, 0, 0], [2, 0, 0, 0]]
from collections import Counter
c = Counter(tuple(i) for i in lst)
print([list(item[0][0:2] + (item[1], 0)) for item in c.items()])
# e.g. [[1, 0, 1, 0], [1, 1, 2, 0], [3, 1, 1, 0], [2, 1, 3, 0], [1, 3, 1, 0], [2, 0, 2, 0]] (the order of the sublists may differ)
To elaborate on the great hint provided by njzk2:
Turn your list of lists into a list of tuples
Create a Counter from it
Get a dict from the Counter
Set the 3rd element of the sublists to the frequency from the Counter
from collections import Counter
lst = [[1, 0, 0, 0], [1, 1, 0, 0], [2, 0, 0, 0], [2, 1, 0, 0], [2, 1, 0, 0], [1, 1, 0, 0], [3, 1, 0, 0], [1, 3, 0, 0], [2, 1, 0, 0], [2, 0, 0, 0]]
list_of_tuples = [tuple(elem) for elem in lst]
dct = dict(Counter(list_of_tuples))
lst = [list(e) for e in dct]
for elem in lst:
    elem[2] = dct[tuple(elem)]
Edit: removed duplicates with the line before the for loop. Didn't see that requirement before.
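If only the first two elements identify a sublist, as the question suggests, the counting can key on those alone. A sketch under that assumption (pair_counts and result are illustrative names):

```python
from collections import Counter

lst = [[1, 0], [1, 1], [2, 0], [2, 1], [2, 1],
       [1, 1], [3, 1], [1, 3], [2, 1], [2, 0]]

# Tuples are hashable, so they can serve as Counter keys.
pair_counts = Counter((a, b) for a, b in lst)

# Rebuild the four-element sublists: first two elements, count, trailing 0.
result = [[a, b, n, 0] for (a, b), n in pair_counts.items()]
print(result)
```

Because the key is a tuple of the first two elements in their original order, [1, 3] and [3, 1] are counted separately.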
You can do this to keep count of the duplicates:
lst = [[1, 0], [1, 1], [2, 0], [2, 1], [2, 1], [1, 1], [3, 1], [1, 3], [2, 1], [2, 0]]
for x in lst:
    count = 1
    tmpLst = list(lst)
    tmpLst.remove(x)
    for y in tmpLst:
        if x[0] == y[0] and x[1] == y[1]:
            count = count + 1
    x.append(count)
    #x.append(0) #if you want to add that 4th element
print(lst)
Result:
[[1, 0, 1], [1, 1, 2], [2, 0, 2], [2, 1, 3], [2, 1, 3], [1, 1, 2], [3, 1, 1], [1, 3, 1], [2, 1, 3], [2, 0, 2]]
Then you can take lst and remove duplicates as mentioned in the link you posted.
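That last step could look like the sketch below, which keeps the first occurrence of each counted sublist (counted, seen and unique are illustrative names):

```python
counted = [[1, 0, 1], [1, 1, 2], [2, 0, 2], [2, 1, 3], [2, 1, 3],
           [1, 1, 2], [3, 1, 1], [1, 3, 1], [2, 1, 3], [2, 0, 2]]

seen = set()
unique = []
for item in counted:
    key = tuple(item)  # lists are unhashable, tuples are not
    if key not in seen:
        seen.add(key)
        unique.append(item)
print(unique)
```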
A different (maybe functional) approach.
lst = [[1, 0, 0, 0], [1, 1, 0, 0], [2, 0, 0, 0], [2, 1, 0, 0],\
[2, 1, 0, 0], [1, 1, 0, 0], [3, 1, 0, 0], [1, 3, 0, 0],\
[2, 1, 0, 0], [2, 0, 0, 0]]
def rec_counter(lst):
    # Inner method that is called at the end. Receives a
    # list, the current element to be compared and an accumulator
    # that will contain the result.
    def counter(lst, elem, acc):
        new_lst = [x for x in lst if x != elem]
        elem[2] = lst.count(elem)
        acc.append(elem)
        if len(new_lst) == 0:
            return acc
        else:
            return counter(new_lst, new_lst[0], acc)
    # This part starts the recursion of the inner method. If the list
    # is empty, nothing to do. Otherwise, count starting with the first
    # element of the list and an empty accumulator.
    if len(lst) == 0:
        return []
    else:
        return counter(lst, lst[0], [])
print(rec_counter(lst))
# [[1, 0, 1, 0], [1, 1, 2, 0], [2, 0, 2, 0], \
# [2, 1, 3, 0], [3, 1, 1, 0], [1, 3, 1, 0]]
I create skeleton images from binary images.
I detect the end points of those skeletons with the mahotas Python library, but it returns the full image array, with value 1 at the end points and zero elsewhere.
I would prefer to get the end points' coordinates.
How can I get them?
My code that computes the endpoints:
import numpy as np
import cv2
import pymorph
import mahotas as mah

branch1=np.array([[2, 1, 2], [1, 1, 1], [2, 2, 2]])
branch2=np.array([[1, 2, 1], [2, 1, 2], [1, 2, 1]])
branch3=np.array([[1, 2, 1], [2, 1, 2], [1, 2, 2]])
branch4=np.array([[2, 1, 2], [1, 1, 2], [2, 1, 2]])
branch5=np.array([[1, 2, 2], [2, 1, 2], [1, 2, 1]])
branch6=np.array([[2, 2, 2], [1, 1, 1], [2, 1, 2]])
branch7=np.array([[2, 2, 1], [2, 1, 2], [1, 2, 1]])
branch8=np.array([[2, 1, 2], [2, 1, 1], [2, 1, 2]])
branch9=np.array([[1, 2, 1], [2, 1, 2], [2, 2, 1]])
endpoint1=np.array([[0, 0, 0], [0, 1, 0], [2, 1, 2]])
endpoint2=np.array([[0, 0, 0], [0, 1, 2], [0, 2, 1]])
endpoint3=np.array([[0, 0, 2], [0, 1, 2], [0, 2, 1]])
endpoint4=np.array([[0, 2, 1], [0, 1, 2], [0, 0, 0]])
endpoint5=np.array([[2, 1, 2], [0, 1, 0], [0, 0, 0]])
endpoint6=np.array([[1, 2, 0], [2, 1, 0], [0, 0, 0]])
endpoint7=np.array([[2, 0, 0], [1, 1, 0], [2, 0, 0]])
endpoint8=np.array([[0, 0, 0], [2, 1, 0], [1, 2, 0]])
jpg = 'skel.jpg'
skel = cv2.imread(jpg, 0)
sk = pymorph.binary(skel)
complete_path = 'skel.jpg'
print(pymorph.gray(sk).dtype)
br1=mah.morph.hitmiss(sk,branch1)
br2=mah.morph.hitmiss(sk,branch2)
br3=mah.morph.hitmiss(sk,branch3)
br4=mah.morph.hitmiss(sk,branch4)
br5=mah.morph.hitmiss(sk,branch5)
br6=mah.morph.hitmiss(sk,branch6)
br7=mah.morph.hitmiss(sk,branch7)
br8=mah.morph.hitmiss(sk,branch8)
br9=mah.morph.hitmiss(sk,branch9)
ep1=mah.morph.hitmiss(sk,endpoint1)
ep2=mah.morph.hitmiss(sk,endpoint2)
ep3=mah.morph.hitmiss(sk,endpoint3)
ep4=mah.morph.hitmiss(sk,endpoint4)
ep5=mah.morph.hitmiss(sk,endpoint5)
ep6=mah.morph.hitmiss(sk,endpoint6)
ep7=mah.morph.hitmiss(sk,endpoint7)
ep8=mah.morph.hitmiss(sk,endpoint8)
br=br1+br2+br3+br4+br5+br6+br7+br8+br9
ep=ep1+ep2+ep3+ep4+ep5+ep6+ep7+ep8
br and ep are the arrays with all the branch points and endpoints whose coordinates I want to get.
So ep should just be a binary numpy array that is 1 only at your endpoint coordinates? In which case you can use either numpy.where or numpy.nonzero to get the indices of the nonzero values in ep:
pseudo_ep = (np.random.rand(512,512) > 0.9)
rows,cols = np.where(pseudo_ep)
These should correspond to the y,x coordinates of your endpoints.
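If a single list of coordinate pairs is more convenient than two separate arrays, np.argwhere returns one (row, col) pair per nonzero pixel. A sketch with a small stand-in for ep (the array contents are made up for illustration):

```python
import numpy as np

# Stand-in for the endpoint image: nonzero only at two "endpoints".
ep = np.zeros((5, 5), dtype=bool)
ep[0, 3] = ep[4, 1] = True

coords = np.argwhere(ep)       # each row is (row, col), i.e. (y, x)
xy = coords[:, ::-1].tolist()  # swap the columns if (x, y) order is needed
print(coords.tolist())  # [[0, 3], [4, 1]]
print(xy)               # [[3, 0], [1, 4]]
```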