Get all the rows with same values in python? - python

So, suppose I have this 2D array in python
a = [[1,2],
     [2,3],
     [3,2],
     [1,3]]
How do I get all rows with the same value in the first column and store them in a new matrix?
For example, I will have
b = [[1,2],
     [1,3]]
after the query.
My approach is b = [a[i] for i in a if a[i][0] == 1]
but it doesn't seem to work.
I am new to Python and the whole index slicing thing is kind of confusing. Thanks!

Since you tagged numpy, you can perform this task with NumPy arrays. First define your array:
import numpy as np
a = np.array([[1, 2],
              [2, 3],
              [3, 2],
              [1, 3]])
For all unique values in the first column, you can use a dictionary comprehension. This is useful to avoid duplicating operations.
d = {i: a[a[:, 0] == i] for i in np.unique(a[:, 0])}
{1: array([[1, 2],
[1, 3]]),
2: array([[2, 3]]),
3: array([[3, 2]])}
Then access your array where first column is equal to 1 via d[1].
For a single query, you can simply use a[a[:, 0] == 1].

The for i in a syntax gives you the actual items in the list, not their indices. For example:
list_of_strs = ['first', 'second', 'third']
first_letters = [s[0] for s in list_of_strs]
# first_letters == ['f', 's', 't']
What you are actually doing with b = [a[i] for i in a if a[i][0]==1] is trying to index a with each of its own elements. But since each element of a is itself a list, this won't work (you can't index a list with another list).
Something like this should work:
b = [row for row in a if row[0] == 1]
Bonus points if you write it as a function so that you can pick which thing you want to filter on.
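As a rough sketch of the "write it as a function" suggestion, the comprehension can be wrapped so that both the column and the value are parameters. The name filter_rows and its arguments are made up for illustration; they are not from the question.

```python
def filter_rows(rows, column, value):
    """Return the rows whose entry at index `column` equals `value`."""
    return [row for row in rows if row[column] == value]


a = [[1, 2], [2, 3], [3, 2], [1, 3]]

b = filter_rows(a, 0, 1)  # rows whose first element is 1
c = filter_rows(a, 1, 3)  # rows whose second element is 3

print(b)  # [[1, 2], [1, 3]]
print(c)  # [[2, 3], [1, 3]]
```

The returned list holds references to the original inner lists, so mutating a row in b also changes it in a.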
If you're working with arrays a lot, you might also check out the numpy library. With numpy, you can do stuff like this.
import numpy as np
a = np.array([[1,2], [2,3], [3,2], [1,3]])
b = a[a[:,0] == 1]
The last line is basically indexing the original array a with a boolean array defined inside the first set of square brackets. It's very flexible, so you could also modify this to filter on the second element, filter on other conditions (like > some_number), etc. etc.
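A few variations on the boolean-mask idea, as a small sketch. The variable names are only illustrative; note that combining conditions needs & (or |) with parentheses around each comparison, not the Python keyword and.

```python
import numpy as np

a = np.array([[1, 2], [2, 3], [3, 2], [1, 3]])

# Rows whose second element equals 3
second_is_3 = a[a[:, 1] == 3]

# Rows whose first element is greater than 1
first_gt_1 = a[a[:, 0] > 1]

# Rows matching both conditions at once
both = a[(a[:, 0] > 1) & (a[:, 1] == 2)]

print(second_is_3.tolist())  # [[2, 3], [1, 3]]
print(first_gt_1.tolist())   # [[2, 3], [3, 2]]
print(both.tolist())         # [[3, 2]]
```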

Related

Set rows in Python 2D array to another row without Numpy?

I want to "set" the values of a row of a Python nested list to another row without using NumPy.
I have a sample list:
lst = [[0, 0, 1],
[0, 2, 3],
[5, 2, 3]]
I want to make row 1 to row 2, row 2 to row 3, and row 3 to row 1. My desired output is:
lst = [[0, 2, 3],
[5, 2, 3],
[0, 0, 1]]
How can I do this without using Numpy?
I tried to do something like arr[[0, 1]] = arr[[1, 0]] but it gives the error 'NoneType' object is not subscriptable.
One very straightforward way:
arr = [arr[-1], *arr[:-1]]
Or another way to achieve the same:
arr = [arr[-1]] + arr[:-1]
arr[-1] is the last element of the array. And arr[:-1] is everything up to the last element of the array.
The first solution builds a new list and adds the last element first and then all the other elements. The second one constructs a list with only the last element and then extends it with the list containing the rest.
Note: naming your list an array doesn't make it one. Although you can access a list of lists like arr[i1][i2], it's still just a list of lists. Look at the array documentation for Python's actual array.
The solution @MadPhysicist provided comes down to the second solution given here, since [arr[-1]] == arr[-1:]
Since Python does not actually support multidimensional lists, your task becomes simpler by virtue of the fact that you are dealing with a list containing lists of rows.
To roll the list, just reassemble the outer container:
result = lst[-1:] + lst[:-1]
Numpy has a special interpretation for lists of integers, like [0, 1], tuples, like :, -1, single integers, and slices. Python lists only understand single integers and slices as indices, and do not accept tuples as multidimensional indices, because, again, lists are fundamentally one-dimensional.
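The direction of the roll matters, so here is a small sketch that runs both directions on the list from the question. The names rolled_down and rolled_up are made up for illustration. With the sample data, it is the second form that reproduces the desired output shown in the question.

```python
lst = [[0, 0, 1],
       [0, 2, 3],
       [5, 2, 3]]

# Last row moves to the front, every other row shifts down by one
rolled_down = lst[-1:] + lst[:-1]

# First row moves to the end, every other row shifts up by one
rolled_up = lst[1:] + lst[:1]

print(rolled_down)  # [[5, 2, 3], [0, 0, 1], [0, 2, 3]]
print(rolled_up)    # [[0, 2, 3], [5, 2, 3], [0, 0, 1]]
```

Both forms build a new outer list but reuse the same inner row objects, so no row data is copied.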
Use this generalisation:
arr = [arr[-1]] + arr[:-1]
which for your three-row example means
arr[0], arr[1], arr[2] = arr[2], arr[0], arr[1]
or
arr = [arr[2]] + arr[:2]
or
arr = [arr[2]] + arr[:-1]
You can use this:
>>> lst = [[0, 0, 1],
...        [0, 2, 3],
...        [5, 2, 3]]
>>> lst = [*lst[1:], *lst[:1]]
>>> lst
[[0, 2, 3], [5, 2, 3], [0, 0, 1]]

group list elements based on another list

I have two lists: inp and base.
I want to add each item in inp to a list in out based on the position in base.
The following code works fine:
from pprint import pprint
num = 3
out = [[] for _ in range(num)]
inp = [[1,1],[2,1],[3,2],[7,11],[9,99],[0,-1]]
base = [0,1,0,2,0,1]
# Append each item of inp to the sub-list selected by the matching entry of base
for i, group in enumerate(base):
    out[group].append(inp[i])
pprint(out, width=40)
[[[1, 1], [3, 2], [9, 99]],
[[2, 1], [0, -1]],
[[7, 11]]]
I would like to do this using the NumPy module (np.array, np.append, etc.).
Can anyone help me?
Assuming base and inp are NumPy arrays, we could do something like this:
# Get sorted indices for base
sidx = base.argsort()
# Get where the sorted version of base changes groups
split_idx = np.flatnonzero(np.diff(base[sidx])>0)+1
# OR np.unique(base[sidx],return_index=True)[1][1:]
# Finally sort inp based on the sorted indices and split based on split_idx
out = np.split(inp[sidx], split_idx)
To make it work for lists, we need a few tweaks, mainly in the indexing, where np.take can replace the array indexing used in the earlier approach. The modified version would be:
sidx = np.argsort(base)
split_idx = np.flatnonzero(np.diff(np.take(base,sidx))>0)+1
out = np.split(np.take(inp,sidx,axis=0), split_idx)
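For reference, here is the array version as one runnable sketch on the data from the question. One assumption of mine: kind="stable" is passed to argsort so that items keep their original order inside each group. The default sort is not guaranteed to be stable, and the expected output in the question relies on that order.

```python
import numpy as np

inp = np.array([[1, 1], [2, 1], [3, 2], [7, 11], [9, 99], [0, -1]])
base = np.array([0, 1, 0, 2, 0, 1])

# Indices that sort base; a stable sort keeps ties in their original order
sidx = base.argsort(kind="stable")

# Positions in the sorted base where the group label changes
split_idx = np.flatnonzero(np.diff(base[sidx]) > 0) + 1

# Reorder inp by group and cut it at the group boundaries
out = np.split(inp[sidx], split_idx)

for group in out:
    print(group.tolist())
# [[1, 1], [3, 2], [9, 99]]
# [[2, 1], [0, -1]]
# [[7, 11]]
```

Unlike the loop in the question, this produces one array per label that actually occurs in base, so a label that never appears gets no empty group.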

Basic NumPy array replacement

I have a rather basic question about the NumPy module in Python 2, particularly the version on trinket.io. I do not see how to replace values in a multidimensional array several layers in, regardless of the method. Here is an example:
a = numpy.array([1,2,3])
a[0] = 0
print a
a = numpy.array([[1,2,3],[1,2,3]])
a[0][0] = a[1][0] = 0
print a
Result:
array([0, 2, 3], '<class 'int'>')
array([[1, 2, 3], [1, 2, 3]], '<class 'int'>')
I need the ability to change individual values, my specific code being:
a = numpy.empty(shape = (8,8,2),dtype = str)
for row in range(a.shape[0]):
    for column in range(a.shape[1]):
        a[row][column][1] = 'a'
Thank you for your time and any help provided.
To change individual values you can simply do something like:
a[1,2] = 'b'
If you want to change all the array, you can do:
a[:,:] = 'c'
Use commas (array[a,b]) instead of chained brackets (array[a][b]).
With NumPy version 1.11.0, I get
[[0 2 3]
 [0 2 3]]
when I run your code. I guess your numpy version is different from mine.
As user3408085 said, the correct thing is to write a[0,0] = 0 to change one element, or a[:,0] = 0 if you actually want to zero the entire first column.
The reason a[0][0]=0 does not modify a (at least in your version of numpy) is that a[0] is a new array. If you break down your command a[0][0]=0 into 2 lines:
b=a[0]
b[0]=0
Then the fact that this modifies a is counterintuitive.
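For comparison, here is a small sketch of how this behaves in standard NumPy under Python 3 (an assumption on my part; the trinket.io build in the question may differ). There, a[0] is a view that shares memory with a, so both the two-step form and the comma form modify the original. The name grid is made up for illustration.

```python
import numpy as np

a = np.array([[1, 2, 3], [1, 2, 3]])

b = a[0]                        # basic indexing returns a view, not a copy
print(np.shares_memory(a, b))   # True

b[0] = 0       # writes through the view into a
a[1, 0] = 0    # comma indexing writes directly into a
print(a.tolist())  # [[0, 2, 3], [0, 2, 3]]

# The 8x8x2 string array from the question, filled without loops
grid = np.empty(shape=(8, 8, 2), dtype=str)
grid[:, :, 1] = 'a'
print(grid[3, 5, 1])  # a
```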

Python Set the matrix value of multiple row and each rows with multiple different columns without for loop

How can I set the same value in multiple rows of a matrix, with a different set of columns for each row, without a for loop?
For example, for matrix a:
a=matrix([[1,2,3],
          [8,2,9],
          [1,8,7]])
row = [1,2,3]
col = [[1,2],
       [1,3],
       [2,3]]
I want to set a[1,1],a[1,2],a[2,1],a[2,3],a[3,2],a[3,3] to the same value.
I know how to do it with a for loop:
for i in xrange(len(row)):
    a[row[i],col[i]] = setvalue
But is there any way to do this without a for loop?
Using numpy, you can avoid loops:
import numpy as np
from numpy.matlib import repmat
a = np.array([[1,2,3],
[8,2,9],
[1,8,7]])
row = np.array([[1],
[2],
[3]])
col = np.array([[1,2],
[1,3],
[2,3]])
row = repmat(row,1,col.shape[1])
setvalue = 0
a[row.ravel(),col.ravel()] = setvalue
However, note that indexing in Python starts at 0, so the 1-based indices above are off by one (index 3 is out of bounds for this 3x3 array). With them you should actually do
a[row-1,col-1] = setvalue
Or even better, use the correct (zero-based) indices to initialise your row and col arrays.
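A minimal sketch of the zero-based variant. It relies on NumPy broadcasting the (3, 1) row array against the (3, 2) col array, so repmat is not needed here. The index values are the question's 1-based indices shifted down by one.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [8, 2, 9],
              [1, 8, 7]])

# Zero-based indices: one row index per line, two column indices per row
row = np.array([[0],
                [1],
                [2]])
col = np.array([[0, 1],
                [0, 2],
                [1, 2]])

setvalue = 0
# row has shape (3, 1) and broadcasts against col with shape (3, 2)
a[row, col] = setvalue

print(a.tolist())  # [[0, 0, 3], [0, 2, 0], [1, 0, 0]]
```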
Case 1: Use list comprehension
You can do it like this:
value = 2
col_length = 3
line_length = 3
a = [[value for x in range(col_length)] for x in range(line_length)]
If you print a,
[[2, 2, 2], [2, 2, 2], [2, 2, 2]]
EDIT: Case 2: Use map()
I am not very used to this one, so I can only give the general idea about its performance: it seems faster when used with one function and no lambda expression.
You'll have to use a for loop.
Usually you want to avoid for loops (by using comprehensions) when following the functional paradigm, building new instances instead of mutating the old one. As your goal is to mutate the old one, you will need a loop somewhere. The best you can do is to wrap it up in a function:
def set_items_to(mx, indices, value=0):
    for row, cols in indices:
        for col in cols:
            mx[row, col] = value
a = matrix([[1,2,3],[4,5,6],[7,8,9]])
set_items_to(a, [
    [0, [0,1]],
    [1, [0,2]],
    [2, [1,2]]
], setvalue)
EDIT
In case it is a programming challenge, there are ways to accomplish that without explicit for loops by using one of the built-in aggregator functions. But this approach makes the code neither clearer nor shorter. Just for completeness, it would look something like this:
def set_items_to(mx, indices, value=0):
    sum(map(lambda item: [0,
        sum(map(lambda col: [0,
            mx.__setitem__((item[0], col), value)
        ][0], item[1]))
    ][0], indices))

Acquiring the Minimum array out of Multiple Arrays by order in Python

Say that I have 4 numpy arrays
[1,2,3]
[2,3,1]
[3,2,1]
[1,3,2]
In this case, I've determined [1,2,3] is the "minimum array" for my purposes, as it is one of two arrays with the lowest value at index 0, and of those two arrays it has the lowest value at index 1. If there were more arrays with similar values, I would need to compare the next index values, and so on.
How can I extract the array [1,2,3] in that same order from the pile?
How can I extend that to x arrays of size n?
Thanks
Python's built-in list.sort() or sorted() on a list of lists (not numpy arrays) does this automatically, e.g.
a = [[1,2,3],[2,3,1],[3,2,1],[1,3,2]]
a.sort()
gives
[[1,2,3],[1,3,2],[2,3,1],[3,2,1]]
NumPy's sort only sorts within each subarray (along the last axis by default), so the simplest way is to convert the array to a Python list first. Assuming a is a 2D array whose minimum row you want, you could get it as
sorted(a.tolist())[0]
As someone pointed out you could also do min(a.tolist()) which uses the same type of comparisons as sort, and would be faster for large arrays (linear vs n log n asymptotic run time).
Here's an idea using numpy:
import numpy
a = numpy.array([[1,2,3],[2,3,1],[3,2,1],[1,3,2]])
col = 0
# Keep only the rows that hold the smallest value in the current column
while a.shape[0] > 1 and col < a.shape[1]:
    b = a[:, col]
    a = a[b == b.min()]
    col += 1
print(a)
This checks column by column until only one row is left.
numpy's lexsort is close to what you want. It sorts on the last key first, but that's easy to get around:
>>> a = np.array([[1,2,3],[2,3,1],[3,2,1],[1,3,2]])
>>> order = np.lexsort(a[:, ::-1].T)
>>> order
array([0, 3, 1, 2])
>>> a[order]
array([[1, 2, 3],
[1, 3, 2],
[2, 3, 1],
[3, 2, 1]])
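To get just the "minimum array" rather than the full ordering, a short sketch is to take the first index from lexsort. The names min_row and min_row_list are made up for illustration; the second line shows the plain-Python route via min() on nested lists for comparison.

```python
import numpy as np

a = np.array([[1, 2, 3], [2, 3, 1], [3, 2, 1], [1, 3, 2]])

# lexsort treats the LAST key as primary, so reverse the columns first
order = np.lexsort(a[:, ::-1].T)
min_row = a[order[0]]

# Plain-Python alternative: lists compare lexicographically
min_row_list = min(a.tolist())

print(min_row.tolist())  # [1, 2, 3]
print(min_row_list)      # [1, 2, 3]
```

Both approaches work for any number of rows and any row length, as long as all rows have the same length.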
