I'm trying to sort the rows of one array by the values of another. For example:
import numpy as np
arr1 = np.random.normal(1, 1, 80)
arr2 = np.random.normal(1, 1, (80, 100))
I want to sort arr1 in descending order while keeping the current relationship between arr1 and arr2 (i.e., after sorting both, arr1[0] still corresponds to the row arr2[0, :]).
Use argsort as follows:
arr1inds = arr1.argsort()
sorted_arr1 = arr1[arr1inds[::-1]]
sorted_arr2 = arr2[arr1inds[::-1]]
This example sorts in descending order.
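Putting it together with the arrays from the question (a minimal sketch to check that the pairing survives the sort):
import numpy as np

arr1 = np.random.normal(1, 1, 80)
arr2 = np.random.normal(1, 1, (80, 100))

order = arr1.argsort()[::-1]      # indices that sort arr1 in descending order
sorted_arr1 = arr1[order]
sorted_arr2 = arr2[order]

# arr1 is now descending, and row i of sorted_arr2 is the row that
# originally accompanied sorted_arr1[i]
print(np.all(np.diff(sorted_arr1) <= 0))                    # True
print(np.array_equal(sorted_arr2[0], arr2[arr1.argmax()]))  # True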
Use the zip function: zip(*sorted(zip(arr1, arr2))). This will do what you need.
Now the explanation:
zip(arr1, arr2) pairs each value of arr1 with the corresponding row of arr2, so you've got [(arr1[0], arr2[0]), (arr1[1], arr2[1]), ...]
Next we run sorted(...), which by default sorts the pairs by their first element (the arr1 value), in ascending order.
Then we run zip(...) again, which takes the sorted pairs and splits them back into two sequences: one of the first elements (the arr1 values) and one of the second elements (the arr2 rows).
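Note that sorted() sorts in ascending order and, in Python 3, zip() returns an iterator rather than a list. For the descending order asked for, and to avoid comparing the arr2 rows when two arr1 values tie, one possible variant (a sketch, not part of the original answer, assuming the arrays from the question):
import numpy as np

arr1 = np.random.normal(1, 1, 80)
arr2 = np.random.normal(1, 1, (80, 100))

# sort by the arr1 value only, largest first
pairs = sorted(zip(arr1, arr2), key=lambda p: p[0], reverse=True)
sorted_arr1, sorted_arr2 = zip(*pairs)
sorted_arr2 = np.array(sorted_arr2)   # back to a (80, 100) array if needed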
I want to select multiple groups of three or more elements from a list, according to some indices.
I thought of using itemgetter, but it does not work for multiple sets, for example
import itertools
from operator import itemgetter

labels = ['C1','C2','C3','C4','C5','C6','C7','C8','C10','C13','C14','C15']
indexlist = list(itertools.combinations(range(1, 10), 3))
ixs = [4, 5]
a = [indexlist[ix] for ix in ixs]
print(*itemgetter(*a[0])(labels))
where
a=[(1, 2, 7), (1, 2, 8)]
works well, whereas, with the same setup,
print(*itemgetter(*a)(labels))
gives the error
list indices must be integers or slices, not list
Is there a way to pass multiple sets of indices to itemgetter, or is there some other convenient alternative?
You are trying to index with several groups of indices at once, as stated. The easiest way is to use NumPy:
import numpy as np

labels = np.array(['C1','C2','C3','C4','C5','C6','C7','C8','C10','C13','C14','C15'])
print([list(labels[list(index_tuple)]) for index_tuple in a])
What this does is index labels once per tuple (each tuple is converted to a list so NumPy treats it as a fancy index rather than a multidimensional one) and collect each selection into a list.
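With a = [(1, 2, 7), (1, 2, 8)] from above, the selected labels are (a quick check):
[['C2', 'C3', 'C8'], ['C2', 'C3', 'C10']]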
Source: Access multiple elements of list knowing their index
I have a list made up of arrays. All have shape (2,).
Minimal example: mylist = [np.array([1,2]),np.array([1,2]),np.array([3,4])]
I would like to get a unique list, e.g.
[np.array([1,2]),np.array([3,4])]
or perhaps even better, a dict with counts, e.g. {np.array([1,2]) : 2, np.array([3,4]) : 1}
So far I tried list(set(mylist)), but the error is TypeError: unhashable type: 'numpy.ndarray'
As the error indicates, NumPy arrays aren't hashable. You can turn them into tuples, which are hashable, and build a collections.Counter from the result:
from collections import Counter
Counter(map(tuple,mylist))
# Counter({(1, 2): 2, (3, 4): 1})
If you wanted a list of unique tuples, you could construct a set:
set(map(tuple,mylist))
# {(1, 2), (3, 4)}
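If you then need NumPy arrays rather than tuples, you can convert the unique tuples back (a small follow-up sketch, not part of the original answer; note that a set does not preserve order):
unique_arrays = [np.array(t) for t in set(map(tuple, mylist))]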
In general, the best option is to use np.unique with the appropriate parameters:
u, idx, counts = np.unique(X, axis=0, return_index=True, return_counts=True)
Then, according to documentation:
u is an array of the unique rows
idx is the indices of X that give the unique values
counts is the number of times each unique row appears in X
If you need a dictionary, note that NumPy arrays can't be used as its keys (keys must be hashable), so you might like to store them as tuples, as in @yatu's answer, or like this:
dict(zip([tuple(n) for n in u], counts))
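Applied to the list from the question (a quick sketch, with the output shown in comments):
import numpy as np

mylist = [np.array([1, 2]), np.array([1, 2]), np.array([3, 4])]
u, idx, counts = np.unique(mylist, axis=0, return_index=True, return_counts=True)
print(u)        # [[1 2]
                #  [3 4]]
print(counts)   # [2 1]
print(dict(zip([tuple(n) for n in u], counts)))
# {(1, 2): 2, (3, 4): 1} (the counts may display as NumPy integer scalars)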
Pure numpy approach:
numpy.unique(mylist, axis=0)
which produces a 2d array with your unique arrays in rows:
array([[1, 2],
       [3, 4]])
This works if all your arrays have the same length (as in your example).
This solution can be useful depending on what you do earlier in your code: perhaps you would not need to get into plain Python at all, but stick to numpy instead, which should be faster.
Use the following:
import numpy as np
mylist = [np.array([1,2]),np.array([1,2]),np.array([3,4])]
np.unique(mylist, axis=0)
This gives an array of the unique rows:
array([[1, 2],
[3, 4]])
Source: https://numpy.org/devdocs/user/absolute_beginners.html#how-to-get-unique-items-and-counts
My array looks like this:
a = ([1,2],[2,3],[4,5],[3,8])
I did the following to delete the odd indexes:
a = [v for i, v in enumerate(a) if i % 2 == 0]
but it now gives me a list of two separate arrays instead of one two-dimensional array:
a= [array([1, 2]), array([4, 5])]
How can I keep the same format as at the beginning? Thank you!
That is as simple as
a[::2]
which yields the rows with even indices.
Use numpy array indexing, not comprehensions:
c = a[list(range(0, len(a), 2)), :]
If you define c as the output of a list comprehension, it will be a list of one-dimensional NumPy arrays. Using proper indexing instead keeps the result a NumPy array.
Note that instead of "deleting" the odd indices, we specify what to keep: take all rows with an even index (the list(range(0, len(a), 2)) part) and, for each row, take all elements (the : part).
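With the array from the question (a minimal sketch, assuming a is a 2-D NumPy array):
import numpy as np

a = np.array([[1, 2], [2, 3], [4, 5], [3, 8]])
print(a[::2])                             # rows 0 and 2: [[1 2], [4 5]]
print(a[list(range(0, len(a), 2)), :])    # same result via fancy indexing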
In normal situations a list with integers can be used as indices for an array. Let's say
arr = np.arange(10)*2
l = [1,2,5]
arr[l] # this gives np.array([2,4,10])
Instead of one list of indices, I have several, with different lengths, and I want to get arr[l] for each sublist in my list of indices. How can I achieve this without a sequential approach (a for loop), or better, in less time than a for loop would take, using numpy?
For example:
lists = [[1,2,5], [5,6], [2,8,4]]
arr = np.arange(10)*2
result = [np.array([2, 4, 10]), np.array([10, 12]), np.array([4, 16, 8])]  # this is what I want to get
It depends on the size of your lists whether this makes sense. One option is to concatenate them all, do the slicing and then redistribute into lists.
lists = [[1,2,5], [5,6], [2,8,4]]
arr = np.arange(10)*2
extracted = arr[np.concatenate(lists)]
indices = [0] + list(np.cumsum([len(l) for l in lists]))
result = [extracted[indices[i]:indices[i + 1]] for i in range(len(lists))]
Or, taking into account @unutbu's comment:
result = np.split(extracted, indices[1:-1])
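For the example data above, both variants reproduce the requested groups (a quick check):
print(result)   # three arrays: [2 4 10], [10 12], [4 16 8]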
Say that I have 4 numpy arrays
[1,2,3]
[2,3,1]
[3,2,1]
[1,3,2]
In this case, I've determined [1,2,3] is the "minimum array" for my purposes, as it is one of two arrays with the lowest value at index 0, and of those two arrays it has the lowest value at index 1. If there were more arrays with similar values, I would need to compare the next index values, and so on.
How can I extract the array [1,2,3] in that same order from the pile?
How can I extend that to x arrays of size n?
Thanks
Using plain Python's .sort() or sorted() on a list of lists (not NumPy arrays) does this automatically, e.g.
a = [[1,2,3],[2,3,1],[3,2,1],[1,3,2]]
a.sort()
gives
[[1,2,3],[1,3,2],[2,3,1],[3,2,1]]
NumPy's sort works along a single axis (by default it sorts the elements within each row) rather than comparing rows lexicographically, so the simplest route is to convert to a Python list first. Assuming you have an array of arrays you want to pick the minimum of, you could get it as
sorted(a.tolist())[0]
As someone pointed out, you could also use min(a.tolist()), which uses the same type of comparison as sort and would be faster for large arrays (linear vs. n log n asymptotic run time).
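For the arrays from the question (a minimal sketch):
import numpy as np

a = np.array([[1, 2, 3], [2, 3, 1], [3, 2, 1], [1, 3, 2]])
print(min(a.tolist()))    # [1, 2, 3]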
Here's an idea using numpy:
import numpy

a = numpy.array([[1, 2, 3], [2, 3, 1], [3, 2, 1], [1, 3, 2]])

col = 0
while a.shape[0] > 1 and col < a.shape[1]:
    # keep only the rows whose entry in this column equals the column minimum
    a = a[a[:, col] == a[:, col].min()]
    col += 1

print(a)   # [[1 2 3]] -- one row left
This checks column by column until only one row is left.
numpy's lexsort is close to what you want. It sorts on the last key first, but that's easy to get around:
>>> a = np.array([[1,2,3],[2,3,1],[3,2,1],[1,3,2]])
>>> order = np.lexsort(a[:, ::-1].T)
>>> order
array([0, 3, 1, 2])
>>> a[order]
array([[1, 2, 3],
[1, 3, 2],
[2, 3, 1],
[3, 2, 1]])
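If you only need the single minimal row rather than the whole ordering, the first index of order gives it directly (a small follow-up, not part of the original answer):
>>> a[order[0]]
array([1, 2, 3])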