Progressively sort an array, like excel - python

I'd like to progressively sort an array, the way I can in Excel. For example:
randomMatrix = np.asarray(
[[0, 1, 0, 1, 0, 0, 2, 0, 1, 0],
[1, 0, 0, 0, 0, 1, 0, 0, 1, 2],
[1, 1, 0, 0, 2, 0, 0, 1, 0, 0]])
I'd like something like Excel's "Sort by column 1, then by column 2, then by column 3, and so on" to produce the following:
sortedMatrix = np.asarray(
[[0, 1, 2, 0, 1, 0, 0, 1, 0, 0],
[0, 0, 0, 1, 1, 2, 0, 0, 1, 0],
[0, 0, 0, 0, 0, 0, 1, 1, 1, 2]])
How can I accomplish this? This answer recommends using lexsort, but when I do I get:
randomMatrix[np.lexsort(randomMatrix.T[::-1])]
array([[0, 1, 0, 1, 0, 0, 2, 0, 1, 0],
[1, 0, 0, 0, 0, 1, 0, 0, 1, 2],
[1, 1, 0, 0, 2, 0, 0, 1, 0, 0]])

You are sorting the rows, whereas the linked answer sorts by columns. A small adaptation of that answer should work for you:
# No transpose is needed here, but the sorting index has to be applied to the second axis.
randomMatrix[:, np.lexsort(randomMatrix)]
# array([[0, 1, 2, 0, 1, 0, 0, 1, 0, 0],
# [0, 0, 0, 1, 1, 2, 0, 0, 1, 0],
# [0, 0, 0, 0, 0, 0, 1, 1, 1, 2]])
Also from the documentation:
If a 2D array is provided for the keys argument, its rows are
interpreted as the sorting keys and sorting is according to the last
row, second last row etc.
So here the last row is the primary sorting key, the second row is the secondary key, and the first row is the least significant key. With a stable sorting algorithm, this is carried out by sorting on the first row first, then on the second row, and finally on the primary key (the last row). np.lexsort returns the integer indices that give this sorting order. Applying that order along the second axis, that is, to every row of your matrix, gives the desired output.
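To make the key order concrete, here is a minimal sketch (names such as random_matrix and idx are only illustrative) that compares np.lexsort with a chain of stable sorts run from the least significant row to the primary one, which is the process described above:

```python
import numpy as np

random_matrix = np.asarray(
    [[0, 1, 0, 1, 0, 0, 2, 0, 1, 0],
     [1, 0, 0, 0, 0, 1, 0, 0, 1, 2],
     [1, 1, 0, 0, 2, 0, 0, 1, 0, 0]])

# lexsort treats the LAST row as the primary key.
order = np.lexsort(random_matrix)
sorted_matrix = random_matrix[:, order]

# The same order, built by hand: stable-sort on the first row,
# then on the second, and finally on the last (primary) row.
idx = np.arange(random_matrix.shape[1])
for key in random_matrix:
    idx = idx[np.argsort(key[idx], kind="stable")]

print(np.array_equal(idx, order))  # True
print(sorted_matrix)
```

If the first row should be the primary key instead, passing the rows in reverse order, np.lexsort(random_matrix[::-1]), should do it.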

Related

How to find the indices of a certain value that exists in the same location in two matrices?

In what I am working on, I have two numpy matrices, both the same size, filled with 0's and 1's for simplicity (but let's say it could be filled with any numbers). What I would like to know is a way to extract, from these two matrices, the position of the 1's that exist in the same position in both matrices.
For example, if I have the following two matrices and value:
a = np.array([[0, 0, 0, 1, 0, 1],
[1, 1, 0, 1, 1, 1],
[1, 0, 1, 1, 0, 1],
[1, 0, 1, 1, 1, 0],
[0, 0, 1, 0, 0, 0]])
b = np.array([[0, 0, 0, 0, 0, 1],
[0, 1, 0, 0, 0, 0],
[0, 1, 0, 1, 0, 1],
[0, 0, 0, 0, 0, 1],
[1, 1, 1, 1, 1, 0]])
value = 1
then I would like a way to somehow get the information of all the locations where the value "1" exists in both matrices, i.e.:
result = [(0,5),(1,1),(2,3),(2,5),(4,2)]
I guess the result could be thought of as an intersection, but in my case the position is important which is the reason I don't think np.intersect1d() would be much help. In the actual matrices I am working with, they are on the order of about 10,000 by 10,000, so this list would probably be a lot longer.
Thanks in advance for any help!
You could use numpy.argwhere:
import numpy as np
a = np.array([[0, 0, 0, 1, 0, 1],
[1, 1, 0, 1, 1, 1],
[1, 0, 1, 1, 0, 1],
[1, 0, 1, 1, 1, 0],
[0, 0, 1, 0, 0, 0]])
b = np.array([[0, 0, 0, 0, 0, 1],
[0, 1, 0, 0, 0, 0],
[0, 1, 0, 1, 0, 1],
[0, 0, 0, 0, 0, 1],
[1, 1, 1, 1, 1, 0]])
result = np.argwhere(a & b)
print(result)
Output
[[0 5]
[1 1]
[2 3]
[2 5]
[4 2]]
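Note that a & b relies on the matrices holding only 0s and 1s. If they can hold arbitrary numbers, a sketch along these lines, which compares both arrays against value explicitly, should be more general (mask and result are illustrative names):

```python
import numpy as np

a = np.array([[0, 0, 0, 1, 0, 1],
              [1, 1, 0, 1, 1, 1],
              [1, 0, 1, 1, 0, 1],
              [1, 0, 1, 1, 1, 0],
              [0, 0, 1, 0, 0, 0]])
b = np.array([[0, 0, 0, 0, 0, 1],
              [0, 1, 0, 0, 0, 0],
              [0, 1, 0, 1, 0, 1],
              [0, 0, 0, 0, 0, 1],
              [1, 1, 1, 1, 1, 0]])
value = 1

# True only where BOTH matrices hold `value` at the same position.
mask = (a == value) & (b == value)

# argwhere returns one (row, column) pair per True entry.
result = [tuple(pair) for pair in np.argwhere(mask).tolist()]
print(result)  # [(0, 5), (1, 1), (2, 3), (2, 5), (4, 2)]
```

For matrices around 10,000 by 10,000 it is probably better to keep the (N, 2) array returned by np.argwhere rather than converting it to a list of tuples.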

pattern restriction in substring- Python

I found a question on Glassdoor. I do not have any additional clarification.
Input: an int array [1,0,0,1,1,0,0,1,0,1,0,0,0,1]
You have to come up with a program that gives all possible sub-arrays of the array that match the pattern.
The pattern restriction is that a sub-array should start with 1 and end with 1. So there will be many sub-arrays, such as index 0 to 3, index 0 to 4, and index 7 to 9.
To solve this I was thinking of using two for loops and, if the values at both indices are equal to 1, printing the slice between them.
v=[1,0,0,1,1,0,0,1,0,1,0,0,0,1]
resultList=[]
for i in range(0,len(v)-1):
    for j in range(i+1, len(v)):
        if v[i]==1 and v[j]==1:
            r=v[i:j]
            resultList.append(r)
print(resultList)
Output:[[1, 0, 0], [1, 0, 0, 1], [1, 0, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0, 0, 1, 0], [1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0], [1], [1, 1, 0, 0],
I only see 1 correct value so far in output [1, 0, 0, 1]. Should I have used set instead of list? I tried that but that approach did not work either. Can someone kindly give some directions on how to solve this problem?
Thanks for your time.
You can use itertools.combinations to pick 2 indices where the values are non-zeroes in the list:
from itertools import combinations
a = [1,0,0,1,1,0,0,1,0,1,0,0,0,1]
[a[i: j + 1] for i, j in combinations((i for i, n in enumerate(a) if n), 2)]
This returns:
[[1, 0, 0, 1], [1, 0, 0, 1, 1], [1, 0, 0, 1, 1, 0, 0, 1], [1, 0, 0, 1, 1, 0, 0, 1, 0, 1], [1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1], [1, 1], [1, 1, 0, 0, 1], [1, 1, 0, 0, 1, 0, 1], [1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1], [1, 0, 0, 1], [1, 0, 0, 1, 0, 1], [1, 0, 0, 1, 0, 1, 0, 0, 0, 1], [1, 0, 1], [1, 0, 1, 0, 0, 0, 1], [1, 0, 0, 0, 1]]
The problem is in v[i:j]: a slice excludes its end index, so the closing 1 at index j is dropped. Change v[i:j] to v[i:j+1].
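For reference, a sketch of the original two-loop approach with the slice corrected (the early continue is an optional addition, not part of the question's code):

```python
v = [1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1]

result_list = []
for i in range(len(v) - 1):
    if v[i] != 1:
        continue  # a sub-array must start with 1
    for j in range(i + 1, len(v)):
        if v[j] == 1:
            # j + 1 so that the closing 1 is included in the slice
            result_list.append(v[i:j + 1])

print(len(result_list))  # 15
print(result_list[0])    # [1, 0, 0, 1]
```

This yields the same sub-arrays, in the same order, as the itertools.combinations version.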

finding continuous signal in noisy binary time series

Suppose I have a time series such as:
[1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1]
and I know there is some noise in the signal. I want to remove the noise as best I can and still output a binary signal. The above example would turn into something like:
[1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
I have implemented a naive rule-based approach where I iterate through the values and have some minimum amount of 1s or 0s I need to "swap" the signal.
It seems like there must be a better way to do it. A lot of the results from googling around give non-binary output. Is there some scipy function I could leverage for this?
There are two similar functions that can help you: scipy.signal.argrelmin and scipy.signal.argrelmax. They search for local minima and maxima in discrete arrays. Pass your array, and the neighbour search radius as order. Your problem can be solved by combining them:
>>> import numpy as np
>>> from scipy import signal
>>> a = np.asarray([1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1], int)
>>> signal.argrelmin(a, order=3)
(array([4], dtype=int32),)
>>> signal.argrelmax(a, order=3)
(array([15], dtype=int32),)
Then you can just replace these elements.
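The replacement step might look like the sketch below (cleaned is an illustrative name). Because argrelmin and argrelmax compare strictly, this only catches isolated single-sample glitches within the chosen radius; longer noise bursts would need a different approach.

```python
import numpy as np
from scipy import signal

a = np.asarray([1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0,
                0, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1], dtype=int)

cleaned = a.copy()
# An isolated 0 surrounded by 1s is a strict local minimum: flip it to 1.
cleaned[signal.argrelmin(a, order=3)[0]] = 1
# An isolated 1 surrounded by 0s is a strict local maximum: flip it to 0.
cleaned[signal.argrelmax(a, order=3)[0]] = 0

print(cleaned.tolist())
```

For the example series this reproduces the desired output from the question.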

How to shift the columns of a 2D array multiple times, while still considering its original position?

Alright, so consider that I have a matrix m, as follows:
m = [[0, 1, 0, 0, 0, 1],
[4, 0, 0, 3, 2, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0]]
My goal is to check each row of the matrix and see if the sum of that row is zero. If the sum is not zero, I want to shift the column that corresponds to that row to the end of the matrix. If the sum of the row is zero, nothing happens. So in the given matrix above the following should occur:
The program discovers that the 0th row has a sum that does not equal zero
The 0th column of the matrix is shifted to the end of the matrix, as follows:
m = [[1, 0, 0, 0, 1, 0],
[0, 0, 3, 2, 0, 4],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0]]
The program checks the next row and does the same, shifting the column to the end of the matrix
m = [[0, 0, 0, 1, 0, 1],
[0, 3, 2, 0, 4, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0]]
Each of the other rows is checked, but since all of their sums are zero, no shift is made, and the final result is the matrix above.
The issue arises after shifting the columns of the matrix for the first time: once all of the values are shifted, it becomes tricky to tell which column corresponds to which row.
I can't use numpy to solve this problem as I can only use the original Python 2 libraries.
Use a simple loop: whenever a row's sum is not equal to zero, loop over the rows again and, for each row, pop its first item and append it to the end.
>>> from pprint import pprint
>>> m = [[0, 1, 0, 0, 0, 1],
...      [4, 0, 0, 3, 2, 0],
...      [0, 0, 0, 0, 0, 0],
...      [0, 0, 0, 0, 0, 0],
...      [0, 0, 0, 0, 0, 0],
...      [0, 0, 0, 0, 0, 0]]
>>> for row in m:
...     # If all numbers are >= 0 then we can short-circuit this using `if any(row):`.
...     if sum(row) != 0:
...         # Move the first column to the end in every row.
...         for r in m:
...             r.append(r.pop(0))
...
>>> pprint(m)
[[0, 0, 0, 1, 0, 1],
[0, 3, 2, 0, 4, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0]]
list.pop(0) is an O(N) operation; if you need something faster, use collections.deque, which can rotate its elements.
from collections import deque
def rotate(matrix):
    matrix_d = [deque(row) for row in matrix]
    for row in matrix:
        if sum(row) != 0:
            for row_d in matrix_d:
                row_d.rotate(-1)
    return [list(row) for row in matrix_d]
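Both versions above always move whichever column is currently first. That matches the example, but if a zero-sum row comes before a non-zero one, the column that gets moved no longer has the same index as the row. A possible sketch that tracks original column positions instead (shift_columns and order are hypothetical names; it uses only built-ins and should run on Python 2 and 3):

```python
def shift_columns(matrix):
    """Move column i to the end whenever row i has a non-zero sum."""
    n_cols = len(matrix[0])
    # order[k] is the ORIGINAL index of the column now at position k.
    order = list(range(n_cols))
    for i, row in enumerate(matrix):
        if i < n_cols and sum(row) != 0:
            order.remove(i)   # find column i wherever it currently sits
            order.append(i)   # and move it to the end
    # Build the result once, from the untouched input rows.
    return [[row[j] for j in order] for row in matrix]


m = [[0, 1, 0, 0, 0, 1],
     [4, 0, 0, 3, 2, 0],
     [0, 0, 0, 0, 0, 0],
     [0, 0, 0, 0, 0, 0],
     [0, 0, 0, 0, 0, 0],
     [0, 0, 0, 0, 0, 0]]

shifted = shift_columns(m)
print(shifted[0])  # [0, 0, 0, 1, 0, 1]
print(shifted[1])  # [0, 3, 2, 0, 4, 0]
```

Because the input matrix is never mutated, row i always lines up with original column i, which sidesteps the bookkeeping problem described in the question.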

Python Image.fromarray() doesn't accept my ndarray input which is built from a list

I'm trying to visualize a list of 2048280 integers which are either 1's or 0's. There is a function that outputs this list from a (width=1515 height=1352) image file. The function
test_results = [(numpy.argmax(SomeFunctionReturningAnArrayForEachGivenPixel))
                for y in xrange(1352) for x in range(1515)]
returns an array of size 2048280 (= 1515 x 1352), as expected. For each y, 1515 values of 1/0 are returned and stored in the array.
Now, when this "test_results" array is returned, I want to save it as an image. So I np.reshape() the array to size (1352,1515,1). All is fine. Logically, I should save this list as a grayscale image. I changed the ndarray data type to 'uint8' and multiplied the pixel values by 127 or 255.
But no matter what I do, the Image.fromarray() function keeps saying that either 'it cannot handle this data type' or 'too many dimensions' or simply gives an error. When I debug it into the Image functions, it looks like the Image library cannot retrieve the array's 'stride'!
All the examples on the net simply reshape the list into an array and save them as an image! Is there anything wrong with my list?
I have already tried various modes ('RGB' , 'L' , '1'). I also changed the data type of my array into uint8, int8, np.uint8(), uint32..
result=self.evaluate(test_data,box) #returns the array
reray = np.asarray(result,dtype='uint8')
res2 = np.reshape(reray,(1352,1515,1))
res3 =(res2*255)
i = Image.fromarray(res3,'1') ## Raises the exception
i.save('me.png')
For a grayscale image, don't add the trivial third dimension to your array. Leave it as a two-dimensional array: res2 = np.reshape(reray, (1352, 1515)) (assuming reray is the one-dimensional array).
Here's a simple example that worked for me. data is a two-dimensional array with type np.uint8 containing 0s and 1s:
In [29]: data
Out[29]:
array([[0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1],
[0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0],
[1, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0],
[0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1],
[1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0, 0],
[0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 0],
[1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1],
[1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0],
[0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0],
[1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0],
[1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1],
[0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 0, 0]], dtype=uint8)
Create an image from 255*data with mode 'L', and save it as a PNG file:
In [30]: img = Image.fromarray(255*data, mode='L')
In [31]: img.save('foo.png')
When I tried to create the image using mode='1', I wasn't able to get a correct PNG file. Pillow has some known problems with moving between numpy arrays and images with bit depth 1.
Another option is to use numpngw. (I'm the author of numpngw.) It allows you to save the data to a PNG file with bit depth 1:
In [40]: import numpngw
In [41]: numpngw.write_png('foo.png', data, bitdepth=1)
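Putting the grayscale approach together for a flat list like the one in the question, a minimal sketch could look like this. The flat array here is made-up stand-in data, not the asker's real values, and the mode is left for Pillow to infer from the 2-D uint8 input instead of being passed explicitly:

```python
import numpy as np
from PIL import Image

width, height = 1515, 1352

# Stand-in for the flat list of 0/1 values (hypothetical data).
flat = np.zeros(width * height, dtype=np.uint8)
flat[::2] = 1

# Shape is (rows, columns) = (height, width), with no third axis.
pixels = flat.reshape(height, width) * np.uint8(255)

# A 2-D uint8 array is interpreted as an 8-bit grayscale ('L') image.
img = Image.fromarray(pixels)
img.save("me.png")
```

Note that the array shape is (height, width), while img.size reports (width, height).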
