group list elements based on another list - python

I have two lists: inp and base.
I want to add each item in inp to a list in out based on the position in base.
The following code works fine:
from pprint import pprint as print
num = 3
out = [[] for i in range(num)]
inp = [[1,1],[2,1],[3,2],[7,11],[9,99],[0,-1]]
base = [0,1,0,2,0,1]
for i, num in enumerate(base):
    out[num].append(inp[i])
print(out,width=40)
[[[1, 1], [3, 2], [9, 99]],
 [[2, 1], [0, -1]],
 [[7, 11]]]
I would like to do this using the NumPy module (np.array and np.append or etc.).
Can anyone help me?

Assuming base and inp are NumPy arrays, we could do something like this -
# Get sorted indices for base
sidx = base.argsort()
# Get where the sorted version of base changes groups
split_idx = np.flatnonzero(np.diff(base[sidx])>0)+1
# OR np.unique(base[sidx],return_index=True)[1][1:]
# Finally sort inp based on the sorted indices and split based on split_idx
out = np.split(inp[sidx], split_idx)
To make it work for lists, we need a few tweaks, mainly in the indexing part, for which we can use np.take to replace the array indexing used in the earlier approach. So, the modified version would be -
sidx = np.argsort(base)
split_idx = np.flatnonzero(np.diff(np.take(base,sidx))>0)+1
out = np.split(np.take(inp,sidx,axis=0), split_idx)
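For reference, a runnable sketch of the list version applied to the question's data (kind='stable' is added here so ties keep their original order, matching the pure-Python result):
import numpy as np

inp = [[1,1],[2,1],[3,2],[7,11],[9,99],[0,-1]]
base = [0,1,0,2,0,1]

sidx = np.argsort(base, kind='stable')                             # order that sorts base, keeping ties in place
split_idx = np.flatnonzero(np.diff(np.take(base, sidx)) > 0) + 1   # positions where the group changes
out = np.split(np.take(inp, sidx, axis=0), split_idx)
print(out)
# [array([[ 1,  1], [ 3,  2], [ 9, 99]]),
#  array([[ 2,  1], [ 0, -1]]),
#  array([[ 7, 11]])]   (whitespace condensed; each group matches the pure-Python output)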

Related

How to split lists that have multiple items within a dimension?

How would I split this? I've been trying to find documentation on splitting, especially with Pandas, but unfortunately I am unable to find something suitable for this.
Essentially I have two lists:
final = [([1,2],[0]),([0,1],[2]),([0,2],[1])]
How would I split this back so it could show:
first_tuple = [[1,2],[0,1],[0,2]]
second_tuple = [[0],[2],[1]]
You may use zip:
final = [([1,2],[0]),([0,1],[2]),([0,2],[1])]
first_tuple, second_tuple = zip(*final)
print(first_tuple)
print(second_tuple)
Which yields
([1, 2], [0, 1], [0, 2])
([0], [2], [1])
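Note that zip hands back tuples; if you want plain lists like in your expected output, you can convert them, for example:
first_tuple, second_tuple = map(list, zip(*final))
# first_tuple  == [[1, 2], [0, 1], [0, 2]]
# second_tuple == [[0], [2], [1]]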
Adding to the answer above, another option is to simply loop and subscript the values using list comprehension.
final = [([1,2],[0]),([0,1],[2]),([0,2],[1])]
first = [item[0] for item in final]
second = [item[1] for item in final]
print(first)    # [[1, 2], [0, 1], [0, 2]]
print(second)   # [[0], [2], [1]]

Iterate Python List of Lists and Remove Final Index of Each Sublist, No Imports

There are a few similar questions to this one but not exactly the same:
I want to dynamically decrease a given input array or list of lists. For example:
matrix = [[0,1,2], [3,4,5],[6,7,8]]
Starting at 0 I need to iterate through and remove the final index - the iterative. So the output I would like to store in a new list is:
#output
[[0,1,2], [3,4], [6]]  ==> which then flattens to [0,1,2,3,4,6]
Here's what I'm currently going after:
def get_list(matrix, stop_index):
    temp = []
    for i in range(0, stop_index):
        for m in matrix:
            temp.append(matrix[0:stop_index])
    outside_list.append(temp)
    return outside_list
I realize I may be relying too much on packages and libraries, so I am really trying to do this without outside packages or imports.
Thank you for any help! I won't forget the green check mark.
Using list comprehension
l = [[0,1,2], [3,4,5],[6,7,8]]
ll = [ x[:len(l)-l.index(x)] for x in l]
# [[0, 1, 2], [3, 4], [6]]
print([x for y in ll for x in y ])
# [0, 1, 2, 3, 4, 6]
Simpler syntax:
matrix = [[0,1,2], [3,4,5],[6,7,8]]
outside_list = list()
for i in range(len(matrix)):
    # matrix[i] accesses every sublist in the matrix;
    # [:3-i] slices each sublist from the beginning up to (3 - current position)
    outside_list.append(matrix[i][:3-i])
print(outside_list)
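If you'd prefer not to hard-code the 3, the same idea can be written with enumerate (just a sketch of the approach above):
matrix = [[0,1,2], [3,4,5], [6,7,8]]
outside_list = [row[:len(matrix) - i] for i, row in enumerate(matrix)]
print(outside_list)                                # [[0, 1, 2], [3, 4], [6]]
print([x for row in outside_list for x in row])    # [0, 1, 2, 3, 4, 6]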
Some useful references:
List slicing: https://stackoverflow.com/a/509295/8692977
List comprehension: https://stackoverflow.com/a/34835952/8692977

Get all the rows with same values in python?

So, suppose I have this 2D array in python
a = [[1,2],
     [2,3],
     [3,2],
     [1,3]]
How do get all array entries with the same row value and store them in a new matrix.
For example, I will have
b = [[1,2],
     [1,3]]
after the query.
My approach is b = [a[i] for i in a if a[i][0] == 1][0]]
but it didn't seem to work?
I am new to Python and the whole index slicing thing is kind of confusing. Thanks!
Since you tagged numpy, you can perform this task with NumPy arrays. First define your array:
a = np.array([[1, 2],
              [2, 3],
              [3, 2],
              [1, 3]])
To group the rows by every unique value in the first column, you can use a dictionary comprehension. This is useful to avoid repeating the same operation for each value.
d = {i: a[a[:, 0] == i] for i in np.unique(a[:, 0])}
{1: array([[1, 2],
           [1, 3]]),
 2: array([[2, 3]]),
 3: array([[3, 2]])}
Then access your array where first column is equal to 1 via d[1].
For a single query, you can simply use a[a[:, 0] == 1].
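A quick check of both approaches, assuming the array a defined above:
d[1]              # -> array([[1, 2], [1, 3]])
a[a[:, 0] == 1]   # -> the same rows, without building the dictionary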
The for i in a syntax gives you the actual items in the list, so for example:
list_of_strs = ['first', 'second', 'third']
first_letters = [s[0] for s in list_of_strs]
# first_letters == ['f', 's', 't']
What you are actually doing with b = [a[i] for i in a if a[i][0]==1] is trying to index an element of a with each of the elements of a. But since each element of a is itself a list, this won't work (you can't index lists with other lists)
Something like this should work:
b = [row for row in a if row[0] == 1]
Bonus points if you write it as a function so that you can pick which thing you want to filter on.
If you're working with arrays a lot, you might also check out the numpy library. With numpy, you can do stuff like this.
import numpy as np
a = np.array([[1,2], [2,3], [3,2], [1,3]])
b = a[a[:,0] == 1]
The last line is basically indexing the original array a with a boolean array defined inside the first set of square brackets. It's very flexible, so you could also modify this to filter on the second element, filter on other conditions (like > some_number), etc. etc.
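For instance, a couple of variations on the boolean mask (using the same a as above):
a[a[:, 1] == 2]   # rows whose second element is 2  -> [[1, 2], [3, 2]]
a[a[:, 0] > 1]    # rows whose first element is > 1 -> [[2, 3], [3, 2]]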

generating conditional data with Hypothesis Python

I want to generate a list of lists of integers, where each inner list has size 2, with the following conditions:
the first element should be smaller than the second and
all the data should be unique.
I could generate each tuple with a custom function but don't know how to use that to satisfy the second condition.
from hypothesis import assume, strategies as st

@st.composite
def generate_data(draw):
    min_val, max_val = draw(st.lists(st.integers(1, 100), min_size=2, max_size=2))
    assume(min_val < max_val)
    return [min_val, max_val]
I could generate the data by iterating over generate_data a few times in this (inefficient?) way:
>>> [generate_data().example() for _ in range(3)]
[[5, 31], [1, 12], [33, 87]]
But how can I check that the data is unique?
E.g, the following values are invalid:
[[1, 2], [1, 5], ...] # (1 is repeated)
[[1, 2], [1, 2], ...] # (repeated data)
but the following is valid:
[[1, 2], [3, 4], ...]
I think the following strategy satisfies your requirements:
import hypothesis.strategies as st
@st.composite
def unique_pair_lists(draw):
    data = draw(st.lists(st.integers(), unique=True))
    if len(data) % 2 != 0:
        data.pop()
    result = [data[i:i+2] for i in range(0, len(data), 2)]
    for pair in result:
        pair.sort()
    return result
The idea here is that we generate something that gives the right elements, and then we transform it into something of the right shape. Rather than trying to generate pairs of lists of integers, we just generate a list of unique integers and then group them into pairs (we drop the last element if there's an odd number of integers). We then sort each pair to ensure it's in the right order.
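As a quick sanity check, a minimal test using this strategy (the test name and assertions are just illustrative) could look like:
from hypothesis import given

@given(unique_pair_lists())
def test_pairs_are_ordered_and_unique(pairs):
    flat = [n for pair in pairs for n in pair]
    assert len(flat) == len(set(flat))     # no integer appears twice anywhere
    assert all(a < b for a, b in pairs)    # first element is strictly smaller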
David's solution permits an integer to appear in two sub-lists - for totally unique integers I'd use the following:
@st.composite
def list_of_pairs_of_unique_elements(draw):
    seen = set()
    new_int = (
        st.integers(1, 100)
        .filter(lambda n: n not in seen)   # Check that it's unique
        .map(lambda n: seen.add(n) or n)   # Add to seen before the next draw
    )
    return draw(st.lists(st.tuples(new_int, new_int).map(sorted)))
The .filter(...) method is probably what you're looking for.
.example() is only for interactive use - you'll get a warning (or error) if you use it inside @given().
If you might end up filtering out most elements in the range (e.g. an outer list of length > 30, meaning 60 of the 100 possible unique elements), you might get better performance by creating a list of possible elements and popping out of it rather than rejecting seen elements, as sketched below.
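A sketch of that pooled idea; the strategy name and the max_pairs cap are made up for illustration, and st.permutations does the shuffling:
import hypothesis.strategies as st

@st.composite
def unique_pairs_from_pool(draw, max_pairs=30):
    # Shuffle the whole pool of candidates once, then slice off as many pairs as needed.
    pool = draw(st.permutations(list(range(1, 101))))
    n_pairs = draw(st.integers(0, max_pairs))
    return [sorted(pool[2 * i: 2 * i + 2]) for i in range(n_pairs)]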

Python: Complex for-loops

I am working through some code trying to understand some Python mechanics, which I just do not get. I guess it is pretty simple, and I also know what it does, but I do not know how it works. I understand the normal use of for-loops, but this here... I do not know.
Remark: I know some Python, but I am not an expert.
np.array([[[S[i,j]] for i in range(order+1)] for j in range(order+1)])
The second piece of code, I have problems with is this one:
for i in range(len(u)):
    for j in range(len(v)):
        tmp += [rm[i,j][k]*someFunction(name,u[i],v[j])[k] for k in range(len(rm[i,j])) if rm[i,j][k]]
How does the innermost for-loop work? And also what does the if do here?
Thank you for your help.
EDIT: Sorry that the code is so unreadable, I am just trying to understand it myself. S and rm are NumPy matrices, someFunction returns an array with scalar entries, and tmp is just a helper variable.
There are quite a few different concepts inside your code. Let's start with the most basic ones. Python lists and NumPy arrays use different indexing syntax. Also, you can build a NumPy array by providing it a list:
S_list = [[1,2,3], [4,5,6], [7,8,9]]
S_array = np.array(S_list)
print(S_list)
print(S_array)
print(S_list[0][2]) # indexing element 2 from list 0
print(S_array[0,2]) # indexing element at position 0,2 of 2-dimensional array
This results in:
[[1, 2, 3], [4, 5, 6], [7, 8, 9]]
[[1 2 3]
[4 5 6]
[7 8 9]]
3
3
So for your first line of code:
np.array([[[S[i,j]] for i in range(order+1)] for j in range(order+1)])
You are building a NumPy array by providing it a list. This list is being built with the concept of list comprehension. So the code inside the np.array(...) call:
[[[S[i,j]] for i in range(order+1)] for j in range(order+1)]
... is equivalent to:
order = 2
full_list = []
for j in range(order+1):
    local_list = []
    for i in range(order+1):
        local_list.append([S_array[i, j]])
    full_list.append(local_list)
print(full_list)
This results in:
[[[1], [4], [7]], [[2], [5], [8]], [[3], [6], [9]]]
As for your second snippet, it's important to notice that although NumPy arrays typically have a single, constant cell type for the whole array, you can actually give a NumPy array the data type object. So creating a 2-dimensional array of lists is possible. It is also possible to create a 3-dimensional array. Both are compatible with the indexing rm[i,j][k]. You can check this in the following example:
rm = np.array(["A", 3, [1,2,3]], dtype="object")
print(rm, rm[2][0]) # Accessing element 0 of the list at position 2 of the array
rm2 = np.zeros((3, 3, 3))
print(rm2[0, 1][2]) # This is also valid
The following code:
[rm[i,j][k]*someFunction(name,u[i],v[j])[k] for k in range(len(rm[i,j])) if rm[i,j][k]]
... could be written as such:
some_list = []
for k in range(len(rm[i,j])):
    if rm[i, j][k]:  # Expecting a truthy value (e.g. a non-zero weight)
        a_list = rm[i,j][k]*someFunction(name,u[i],v[j])
        some_list.append(a_list[k])
The final detail is the tmp += some_list. When you sum two lists they'll be concatenated, as can be seen in this simple example:
tmp = []
tmp += [1, 2, 3]
print(tmp)
tmp += [4, 5, 6]
print(tmp)
Which results in this:
[1, 2, 3]
[1, 2, 3, 4, 5, 6]
Also notice that multiplying a list by a number will effectively be the same as summing the list several times. So 2*[1,2] will result in [1,2,1,2].
It's a list comprehension, albeit a pretty unreadable one. That was someone doing something very 'pythonic' at the expense of readability. Just look up list comprehensions and try to rewrite it yourself as a traditional for loop. List comprehensions are very useful, but I'm not sure I would have gone that route here.
The syntax for a list comprehension is
[expression for var in iterable if optional_condition]
So this bottom line can be rewritten like so:
for k in range(len(rm[i,j])):
    if rm[i,j][k]:
        tmp.append(rm[i,j][k]*someFunction(name,u[i],v[j])[k])
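To see the whole second snippet run end to end, here is a small self-contained sketch; rm, u, v, name and someFunction are all made up purely for illustration:
import numpy as np

# Made-up stand-ins: a 2x2 object array whose cells are lists of weights,
# plus a dummy someFunction that returns an array with one scalar per weight.
rm = np.empty((2, 2), dtype=object)
rm[0, 0], rm[0, 1] = [1, 0, 2], [0, 0, 1]
rm[1, 0], rm[1, 1] = [3, 1, 0], [2, 2, 2]
u, v = [0.5, 1.5], [2.0, 3.0]
name = "demo"

def someFunction(name, ui, vj):
    return np.array([ui + vj, ui * vj, ui - vj])

tmp = []
for i in range(len(u)):
    for j in range(len(v)):
        tmp += [rm[i, j][k] * someFunction(name, u[i], v[j])[k]
                for k in range(len(rm[i, j]))
                if rm[i, j][k]]

print(tmp)  # a flat list of weighted values; entries with weight 0 are skipped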
