Problems sorting a 2D array - python

I have the following 2D numpy array:
array([[1 0]
[2 0]
[4 0]
[1 1]
[2 1]
[3 1]
[4 2]])
I want to sort by the ID in the first column and then by the value in the second column, such that I get back:
array([[1 0]
[1 1]
[2 0]
[2 1]
[3 1]
[4 0]
[4 2]])
I am getting O(n^2) complexity and want to improve it further.

A better way to sort a list of lists:
import numpy as np
a = np.array([[1, 0], [2 ,0], [4 ,0], [1 ,1], [2 ,1], [3 ,1], [4 ,2]])
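# Sort by the first column, then by the second, instead of relying on the input order for ties
s_a = np.asarray(sorted(a, key=lambda x: (x[0], x[1])))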
print(s_a)
Output:
[[1 0]
[1 1]
[2 0]
[2 1]
[3 1]
[4 0]
[4 2]]

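Try the code below; I hope it helps:
import numpy as np
# The 'i8,i8' view needs 8-byte integers, so the dtype is set explicitly
a = np.array([[1, 0],
              [2, 0],
              [4, 0],
              [1, 1],
              [2, 1],
              [3, 1],
              [4, 2]], dtype=np.int64)
# View each row as one (i8, i8) record and sort by field f0 (ties fall back to f1),
# then view the result as plain int64 again (np.int was removed in NumPy 1.24)
np.sort(a.view('i8,i8'), order=['f0'], axis=0).view(np.int64)
Output will be:
array([[1, 0],
       [1, 1],
       [2, 0],
       [2, 1],
       [3, 1],
       [4, 0],
       [4, 2]])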
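Another option, offered only as a sketch: np.lexsort sorts by several keys at once and returns the row order, which should run in roughly O(n log n). The variable names order and sorted_a are mine, not from the question.

```python
import numpy as np

a = np.array([[1, 0], [2, 0], [4, 0], [1, 1], [2, 1], [3, 1], [4, 2]])

# np.lexsort uses the LAST key as the primary key,
# so the first column goes last and the second column breaks ties.
order = np.lexsort((a[:, 1], a[:, 0]))
sorted_a = a[order]
print(sorted_a)
```

Because lexsort returns indices rather than sorted values, the same order can be reused to reorder any other array that is aligned with a.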

Related

Removing empty rows in tf sparse tensor

This is a follow-up question to Tensorflow sparse tensor row-wise mask. It seems tf doesn't provide a convenient way to remove empty rows in a sparse tensor, e.g. going from
SparseTensor(indices=tf.Tensor(
[[0 0]
[2 0]
[2 1]
[3 0]
[4 0]], shape=(5, 2), dtype=int64), values=tf.Tensor([b'a', b'b', b'c', b'd', b'e'], shape=(5,), dtype=string), dense_shape=tf.Tensor([5 2], shape=(2,), dtype=int64))
to
SparseTensor(indices=tf.Tensor(
[[0 0]
[1 0]
[1 1]
[2 0]
[3 0]], shape=(5, 2), dtype=int64), values=tf.Tensor([b'a', b'b', b'c', b'd', b'e'], shape=(5,), dtype=string), dense_shape=tf.Tensor([4 2], shape=(2,), dtype=int64))
How can I handle the case above without converting the sparse tensor to dense?
Thx, J
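Not a TensorFlow answer, but the index remapping itself can be sketched in NumPy on the indices and dense_shape from the example above. The idea is to renumber the row ids that actually occur so they become consecutive. The names kept_rows, new_row_ids, new_indices and new_dense_shape are illustrative; I believe tf.unique returns a similar (values, positions) pair, but treat that as an assumption to check against the TensorFlow docs.

```python
import numpy as np

# indices and dense_shape copied from the SparseTensor in the question
indices = np.array([[0, 0], [2, 0], [2, 1], [3, 0], [4, 0]], dtype=np.int64)
dense_shape = np.array([5, 2], dtype=np.int64)

# kept_rows: the row ids that actually occur (sorted)
# new_row_ids: for every entry, the position of its row id inside kept_rows
kept_rows, new_row_ids = np.unique(indices[:, 0], return_inverse=True)

# Rebuild the indices with the compacted row ids; columns are unchanged
new_indices = np.stack([new_row_ids, indices[:, 1]], axis=1)
new_dense_shape = np.array([len(kept_rows), dense_shape[1]], dtype=np.int64)

print(new_indices)
print(new_dense_shape)
```

The values vector does not change, since only the row numbering is touched.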

How to save iterations in a loop as follows?

I use two for loops to go through two arrays, and I have set conditions so that only the combinations that meet them are printed.
The individual positions are in the arrays comb_x and comb_y; these give the conditions in the inner for loop.
This is fine and works as it should.
This is my code:
import os
import numpy as np
from itertools import combinations
a=np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])
b=np.array([0, 1, 2, 0, 1, 2, 0, 1, 2])
z1=np.array([1, 1])
z2=np.array([1, 1])
comb_x=np.array([0, 0, 1, 1])
comb_y=np.array([0, 1, 0, 1])
for (j), (k) in zip(a,b):
    z1[:]=0
    z1[:j]=1
    x12=z1
    z2[:]=0
    z2[:k]=1
    y12=z2
    #print(x12,y12)
    for (h),(n),(r) in zip(comb_x,comb_y,np.arange(0,4)):
        if x12[h]==1 and y12[n]==1:
            print('pravda',x12,y12)
My output sorts those numbers by values
my output:
[1 0] [1 0]
[1 0] [1 1]
[1 0] [1 1]
[1 1] [1 0]
[1 1] [1 0]
[1 1] [1 1]
[1 1] [1 1]
[1 1] [1 1]
[1 1] [1 1]
I need them to appear grouped by iteration, as follows
required output:
[1 0] [1 0]
[1 0] [1 1]
[1 1] [1 0]
[1 1] [1 1]
[1 0] [1 1]
[1 1] [1 1]
[1 1] [1 0]
[1 1] [1 1]
[1 1] [1 1]
To make it easier to see, I'll show below what the individual iterations and their results should look like
first iteration:
[1 0] [1 0]
[1 0] [1 1]
[1 1] [1 0]
[1 1] [1 1]
second iteration:
[1 0] [1 1]
[1 1] [1 1]
third iteration:
[1 1] [1 0]
[1 1] [1 1]
fourth iteration:
[1 1] [1 1]
But as you can see, it does not save them by iteration; it sorts them in ascending order by value.
Could the results be arranged by iteration and still be stored in a numpy array?
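One possible reading of the required output is that each "iteration" corresponds to one (comb_x, comb_y) pair. Under that assumption, a sketch is to swap the loop order so the positions form the outer loop, and to collect the matches in a list that is turned into an array at the end. The names rows and result are mine.

```python
import numpy as np

a = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])
b = np.array([0, 1, 2, 0, 1, 2, 0, 1, 2])
comb_x = np.array([0, 0, 1, 1])
comb_y = np.array([0, 1, 0, 1])

rows = []
# Outer loop over the positions, so results are grouped per position pair
for h, n in zip(comb_x, comb_y):
    for j, k in zip(a, b):
        # Fresh arrays each time, so earlier results are not overwritten
        x12 = np.zeros(2, dtype=int)
        x12[:j] = 1
        y12 = np.zeros(2, dtype=int)
        y12[:k] = 1
        if x12[h] == 1 and y12[n] == 1:
            rows.append(np.concatenate([x12, y12]))

# One row per match: first two columns are x12, last two are y12
result = np.array(rows)
print(result)
```

Building fresh arrays inside the loop matters here: appending z1 and z2 directly would store references to the same two arrays, and every stored row would end up showing their final contents.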

Unique symmetrical elements Numpy Array

I have a NumPy array like this:
[1 4]
[2 3]
[3 0]
[4 1]
[5 6]
[6 5]
[7 6]]
This is the output of the NearestNeighbors algorithm of scikit-learn. I want to remove the duplicated pairs, to get something like this:
[[0 3]
[1 4]
[2 3]
[6 5]
[7 6]]
I searched a lot, but have not found a solution.
One way with sorting and np.unique -
np.unique(np.sort(a, axis=1), axis=0)
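To make the behaviour concrete, here is a small sketch that uses only the rows as printed in the question. Note that this approach returns each pair with its two values in ascending order, so [6 5] comes back as [5 6]. If the original orientation matters, one variation (my assumption about what is wanted) is to use return_index and pick the first occurrence of each pair; the names deduped, first_idx and first_seen are illustrative.

```python
import numpy as np

# Rows exactly as they are printed in the question
a = np.array([[1, 4], [2, 3], [3, 0], [4, 1], [5, 6], [6, 5], [7, 6]])

# Sort inside each row so (4, 1) and (1, 4) look the same, then drop duplicate rows
deduped = np.unique(np.sort(a, axis=1), axis=0)
print(deduped)

# Variation: keep the first occurrence of each pair, in its original orientation
_, first_idx = np.unique(np.sort(a, axis=1), axis=0, return_index=True)
first_seen = a[np.sort(first_idx)]
print(first_seen)
```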

Add field to memmapped numpy record array

With normal memmapped numpy arrays, you can "add" a new column by opening the memmap file with an additional column in the shape.
k = np.memmap('input', dtype='int32', shape=(10, 2), mode='r+', order='F')
k[:] = 1
l = np.memmap('input', dtype='int32', shape=(10, 3), mode='r+', order='F')
print(k)
print(l)
[[1 1]
[1 1]
[1 1]
[1 1]
[1 1]
[1 1]
[1 1]
[1 1]
[1 1]
[1 1]]
[[1 1 0]
[1 1 0]
[1 1 0]
[1 1 0]
[1 1 0]
[1 1 0]
[1 1 0]
[1 1 0]
[1 1 0]
[1 1 0]]
Is it possible to do something similar with record arrays? It seems possible with rows, but I can't find a way to do so with a new field when the dtype has heterogeneous types.

Numpy swap the values of nested columns

Given the following data structure, is there some way I can swap the first and last columns such that each row is [3, 2, 1] (I don't wish to sort them) without looping through each row?
[[[1 2 3]
[1 2 3]
[1 2 3]
...
[1 2 3]
[1 2 3]
[1 2 3]]
[[1 2 3]
[1 2 3]
[1 2 3]
...
[1 2 3]
[1 2 3]
[1 2 3]]]
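A possible sketch, assuming the data is a 3D NumPy array whose last axis holds the three columns: index the last axis with a list on both sides of the assignment. The array below is a small stand-in built with np.tile, not the asker's real data, and the names arr and swapped are mine.

```python
import numpy as np

# Stand-in data: 2 blocks of 4 rows, every row is [1, 2, 3]
arr = np.tile(np.array([1, 2, 3]), (2, 4, 1))

swapped = arr.copy()
# The right-hand side is evaluated into a temporary copy first,
# so columns 0 and 2 are exchanged safely without a Python loop
swapped[..., [0, 2]] = swapped[..., [2, 0]]
print(swapped)
```

With exactly three columns, arr[..., ::-1] gives the same values as a reversed view, but the list-index form keeps working when there are more columns and only the first and last should trade places.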
