Delete rows in matrix containing certain elements (python) - python

The following problem might be trivial for a more advanced Python programmer, but I, as a Python beginner, can't figure it out.
I just want to delete a row from a 2D-list, if it matches a certain condition --- in my case, if the row contains a certain character. I wanted to do it in a more functional, python way, rather than looping over all list items. Therefore, my attempt was
alist = [[1,2],[3,4]]
map(lambda ele : (if 2 in ele: tmp3.remove(ele)), alist)
which should just delete the first row, because it contains a "2". But I just get an error "invalid syntax" and I don't know why!
(I also came across some solution which uses dataframes from the pandas package, but as I'm learning python, I want to avoid pandas at this stage ;) )
Thanks in advance!

You can't use an if statement in a lambda, because a lambda body must be a single expression. A list comprehension is clearer:
alist = [row for row in alist if 2 not in row]
This also has the advantage of iterating through the list once, as opposed to using map and list.remove, although you get a new list.

If you are trying to remove elements from a list, you need filter instead of map. map is typically used for transformation and doesn't change the length of the list:
alist = [[1,2],[3,4]]
list(filter(lambda ele: 2 not in ele, alist))  # list() is needed on Python 3, where filter returns an iterator
# [[3, 4]]
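As a minimal sketch of how the two suggestions above compare: both build a new list and leave the original untouched. If other names refer to the same list and should see the change, slice assignment is one option. The names alias, kept and kept_filter are made up for this illustration.

```python
alist = [[1, 2], [3, 4]]
alias = alist  # a second name for the same list object

# List comprehension: builds a new list; `alist` itself is unchanged so far.
kept = [row for row in alist if 2 not in row]

# filter() is lazy on Python 3, so wrap it in list() to see the rows.
kept_filter = list(filter(lambda row: 2 not in row, alist))

# Slice assignment replaces the contents in place, so `alias` changes too.
alist[:] = kept

print(kept)         # [[3, 4]]
print(kept_filter)  # [[3, 4]]
print(alias)        # [[3, 4]]
```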

Related

Getting the indexes of an element in a subset

I have a list and a subset of it, and want to find the index of each element in the subset. I have currently tried this code:
def convert_toindex(listof_elements, listof_indices):
    for i in range(len(listof_elements)):
        listof_elements[:] = [listof_indices.index(x) for x in listof_elements]
    return listof_elements
list1 = ['lol', 'please', 'help']
list2 = ['help', 'lol', 'please', 'extra']
What I want to happen when I do convert_toindex(list1, list2) is the output to be [2, 0, 1]
However, when I do this I get a ValueError: '0' is not in list.
0, however, appears nowhere in either list so I am not sure why this is happening.
Secondly, if I have a list of lists, and I want to do this process for all the nested lists inside the big list, would I do something like this?
for smalllist in biglist:
    smalllist[:] = [dict_of_indices[x] for x in smalllist]
Where dict_of_indices is the dictionary of indices created following the top answer.
The problem is that, instead of doing this once, you're doing it over and over, N times:
for i in range(len(listof_elements)):
    listof_elements[:] = [listof_indices.index(x) for x in listof_elements]
The first time through, you replace every value in listof_elements with its index in listof_indices. So far, so good. In fact, you should be done there.
But then you do it a second time. You look up each of those indices, as if they were values, in listof_indices. And some of them aren't there. So you get an error.
You can solve this just by removing the outer loop. You're already done after the first time.
You may be confused because this problem seems to inherently require two loops—but you already do have two loops. The first is the obvious one in the list comprehension, and the second one is the one hidden inside listof_indices.index.
While we're at it: while this problem does require two loops, it doesn't require them to be nested.
Instead of looping over listof_indices to find each x, you can loop over it in advance to build a dictionary:
dict_of_indices = {value: index for index, value in enumerate(listof_indices)}
And then just do a direct lookup in that dictionary:
listof_elements[:] = [dict_of_indices[x] for x in listof_elements]
Besides being a whole lot faster (O(N+M) time rather than O(N*M)), I think this might also be easier to understand and to debug. The first line may be a bit tricky, but you can easily print out the dict and verify that it's correct. And then the second line is about as trivial as you can get.
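Putting the two steps together, a minimal sketch might look like this. The function names build_index and convert_to_index and the sample biglist are made up for illustration, and the sketch assumes the values in the reference list are unique (with duplicates, the dict keeps the last position, whereas list.index returns the first).

```python
def build_index(reference):
    """Map each value in `reference` to its position (one pass, O(M))."""
    return {value: index for index, value in enumerate(reference)}

def convert_to_index(elements, index_of):
    """Look up every element in the prebuilt dict (one pass, O(N))."""
    return [index_of[x] for x in elements]

list1 = ['lol', 'please', 'help']
list2 = ['help', 'lol', 'please', 'extra']

index_of = build_index(list2)  # built once, reused for every lookup
flat = convert_to_index(list1, index_of)

# Hypothetical nested input: the same dict serves every inner list.
biglist = [['lol', 'help'], ['extra'], []]
nested = [convert_to_index(small, index_of) for small in biglist]

print(flat)    # [1, 2, 0]
print(nested)  # [[1, 0], [3], []]
```

With the lists from the question this gives [1, 2, 0], the position of each list1 item in list2. The [2, 0, 1] mentioned in the question is the reverse mapping (where list2's first three items sit in list1).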

How to create new array deducting segments of existing array

I am trying to create new array out of an existing array in Python.
I read some of already existing and similar questions but I still can not solve the problem.
For example:
I have array A = [4,6,9,15] and I want to create B =[(6-4),(9-6),(15-9)].
I tried to do it in for loop like this:
deltaB=[]
for i in range(0,len(A)):
    deltaB[i]=A[i]-A[i-1]
    deltaB.append(deltaB[i])
But that does not work... probably because I am writing code completely wrong since I'm new in Python and programming in general.
Can you help and write me code for this?
Many thanks upfront
List comprehension
Probably the best way to do this is using list comprehension:
[xj-xi for xi,xj in zip(A,A[1:])]
which generates:
>>> [xj-xi for xi,xj in zip(A,A[1:])]
[2, 3, 6]
Here zip(..) pairs the list A with A[1:], the slice of the list that omits the first element, producing tuples of neighbouring elements. For each such tuple (xi,xj) we add xj-xi to the new list.
The error
The error occurs because deltaB starts out empty, so deltaB[i]=... assigns to an index that does not exist yet; you need to append the value directly. In addition, the loop should start from 1 rather than 0 (and still stop before len(A)), because at i=0 the expression A[i-1] is A[-1], the last element:
deltaB=[]
for i in range(1,len(A)):
    deltaB.append(A[i]-A[i-1])
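To check that the two approaches agree, here is a small sketch that wraps each one in a function (the names deltas_loop and deltas_zip are made up for this example) and also tries a one-element list, where there are no neighbours to subtract.

```python
def deltas_loop(values):
    """Differences of neighbouring elements, using the corrected for loop."""
    result = []
    for i in range(1, len(values)):  # start at 1 so values[i - 1] is valid
        result.append(values[i] - values[i - 1])
    return result

def deltas_zip(values):
    """Same result, pairing each element with its successor via zip."""
    return [later - earlier for earlier, later in zip(values, values[1:])]

A = [4, 6, 9, 15]
print(deltas_loop(A))   # [2, 3, 6]
print(deltas_zip(A))    # [2, 3, 6]
print(deltas_zip([7]))  # []
```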

Python: How to insert into a nested list via iteration at a variable index position?

I've been banging my head over this one for a while, so hopefully you can help me! So here is what I have:
grouped_list = [[["0","1","1","1"],["1","0","1","1"]],[["1","1","0","1","1","1"]],[["1","1","1","0","1"]]]
index_list = [[2,3],[],[4]]
and I want to insert a "-" into the sublists of grouped_list at the corresponding index positions indicated in the index_list. The result would look like:
[[["0","1","-","-","1","1"],["1","0","-","-","1","1"]],[["1","1","0","1","1","1"]],[["1","1","1","0","-","1"]]]
And since I'm new to python, here is my laughable attempt at this:
for groups in grouped_list:
    for columns in groups:
        [[columns[i:i] = ["-"] for i in index] for index in index_list]
I get a syntax error, pointing at the = in the list comprehension, but I didn't think it would really work to start. I would prefer not to do this manually, because I'm dealing with rather large datasets, so some sort of iteration would be nice! Do I need to use numpy or pandas for something like this? Could this be solved with clever use of zipping? Any help is greatly appreciated!
I am sadly unable to make this a one liner:
def func(x, il):
    for i in il:
        x.insert(i,'-')
    return x
s = [[func(l, il) for l in ll] for (ll, il) in zip(grouped_list, index_list)]
I think what you want is
for k, groups in enumerate(grouped_list):
    for columns in groups:
        for i in sorted(index_list[k], reverse=True):
            columns.insert(i, "-")
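Here, I iterate over the grouped lists and keep the index k to determine which indices to use from index_list. I modify the lists in place using list.insert. Note that going from the largest index to the smallest (hence sorted with reverse=True) treats each index as a position in the original row, since otherwise earlier insertions shift the later positions. For [2, 3] this produces ["0","1","-","1","-","1"]; if the indices are meant as positions in the final row, as in the expected output in the question, iterate in ascending order instead.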
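The effect of the insertion order can be seen in a short sketch. The helper names insert_ascending and insert_descending are made up for this illustration, and both work on a copy so the input row is left alone.

```python
def insert_ascending(row, indices, marker="-"):
    """Indices are positions in the FINAL row (smallest first)."""
    out = list(row)  # copy, so the caller's row is not modified
    for i in sorted(indices):
        out.insert(i, marker)
    return out

def insert_descending(row, indices, marker="-"):
    """Indices are positions in the ORIGINAL row (largest first)."""
    out = list(row)
    for i in sorted(indices, reverse=True):
        out.insert(i, marker)
    return out

row = ["0", "1", "1", "1"]
print(insert_ascending(row, [2, 3]))   # ['0', '1', '-', '-', '1', '1']
print(insert_descending(row, [2, 3]))  # ['0', '1', '-', '1', '-', '1']
```

Only the ascending variant reproduces the expected output given in the question for this row; which one is right depends on what the indices are meant to refer to.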

Collapse list of lists to eliminate redundancy

I have a couple of long lists of lists of related objects that I'd like to group to reduce redundancy. Pseudocode:
>>>list_of_lists = [[1,2,3],[3,4],[5,6,7],[1,8,9,10]...]
>>>remove_redundancy(list_of_lists)
[[1,2,3,4,8,9,10],[5,6,7]...]
So lists that contain the same elements would be collapsed into single lists. Collapsing them is easy, once I find lists to combine I can make the lists into sets and take their union, but I'm not sure how to compare the lists. Do I need to do a series of for loops?
My first thought was that I should loop through and check whether each item in a sublist is in any of the other lists, if yes, merge the lists and then start over, but that seems terribly inefficient. I did some searching and found this: Python - dividing a list-of-lists to groups but my data isn't structured. Also, my actual data is a series of strings and thus not sortable in any meaningful sense.
I can write some gnarly looping code to make this work, but I was wondering if there are any built-in functions that would make this sort of comparison easier. Maybe something in list comprehensions?
I think this is a reasonably efficient way of doing it, if I understand your question correctly. The result here will be a list of sets.
Maybe the missing bit of knowledge was d & g (also written d.intersection(g)) for finding the set intersection, along with the fact that an empty set is "falsey" in Python:
data = [[1,2,3],[3,4],[5,6,7],[1,8,9,10]]
result = []
for d in data:
    d = set(d)
    matched = [d]
    unmatched = []
    # first divide into matching and non-matching groups
    for g in result:
        if d & g:
            matched.append(g)
        else:
            unmatched.append(g)
    # then combine all matching groups into one group
    # while leaving unmatched groups intact
    result = unmatched + [set().union(*matched)]
print(result)
# [{5, 6, 7}, {1, 2, 3, 4, 8, 9, 10}]  (Python 2 prints the sets as set([...]))
We start with no groups at all (result = []). Then we take the first list from the data. We then check which of the existing groups intersect this list and which don't. Then we merge all of these matching groups along with the list (achieved by starting with matched = [d]). We don't touch the non-matching groups (though maybe some of these will end up being merged in a later iteration). If you add a line print(result) in each loop you should be able to see how it's built up.
The union of all the sets in matched is computed by set().union(*matched). For reference:
Pythonic Way to Create Union of All Values Contained in Multiple Lists
What does the Star operator mean?
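Since the question mentions that the real data consists of strings, here is a sketch of the same loop wrapped in a function and applied to strings. The function name merge_overlapping and the sample names are invented for this illustration.

```python
def merge_overlapping(groups):
    """Merge all groups that share at least one element; returns a list of sets."""
    result = []
    for group in groups:
        current = set(group)
        matched = [current]
        unmatched = []
        # split the groups found so far into overlapping and non-overlapping
        for existing in result:
            if current & existing:  # a non-empty intersection is truthy
                matched.append(existing)
            else:
                unmatched.append(existing)
        # everything that overlaps the current group collapses into one set
        result = unmatched + [set().union(*matched)]
    return result

# Hypothetical string data; no ordering of the elements is needed.
names = [["ann", "bob"], ["cat"], ["bob", "dan"], ["cat", "eve"], ["dan", "ann"]]
merged = merge_overlapping(names)
print(sorted(sorted(group) for group in merged))
# [['ann', 'bob', 'dan'], ['cat', 'eve']]
```

Sorting is used here only to make the printed output stable, since sets have no defined order.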
I assume that you want to merge lists that contain any common element.
Here is a function that checks (efficiently, to the best of my knowledge) whether two lists contain at least one common element (according to the == operator):
import functools  # functools.reduce is available in Python 2.6+ and 3
def seematch(X,Y):
    return functools.reduce(lambda x,y : x|y,functools.reduce(lambda x,y : x+y, [[k==l for k in X ] for l in Y]))
It would be even faster if you used a reduce that can be interrupted as soon as it finds True, as described here:
Stopping a Reduce() operation mid way. Functional way of doing partial running sum
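As one possible way to get that early exit without reduce, the built-in any() stops at the first True. This is a sketch of an alternative, not the function above rewritten; the names share_element and share_element_hashable are made up here.

```python
def share_element(xs, ys):
    """True if some x == y; any() stops at the first matching pair."""
    return any(x == y for x in xs for y in ys)

def share_element_hashable(xs, ys):
    """Same question answered via sets; only valid for hashable elements."""
    return not set(xs).isdisjoint(ys)

print(share_element([1, 2, 3], [3, 4]))           # True
print(share_element([5, 6, 7], [1, 8, 9, 10]))    # False
print(share_element_hashable(["a", "b"], ["b"]))  # True
print(share_element([], [1, 2]))                  # False
```

Unlike a reduce without an initial value, any() also handles empty inputs and simply returns False.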
I was trying to find an elegant way to iterate quickly once that is in place, but I think a good way would be simply to loop once and build another container that holds the "merged" lists: loop once over the lists in the original list and, for each one, over the lists already created in the new container.
Having said that - it seems there might be a much better option - see if you can do away with that redundancy by some sort of book-keeping on the previous steps.
I know this is an incomplete answer - hope that helped anyway!

(Unintentionally) skipping items when iterating over a list

I have a list and I want to remove from it the items that don't appear in another list. I've tried the following:
for w in common:
    for i in range(1,n):
        if not w in words[i]:
            common.remove(w)
However, this fails to remove some of the items. Adding print statements:
for w in common:
    for i in range(1,n):
        print(w)
        if not w in words[i]:
            print(w)
            common.remove(w)
results in some w never being printed. Any ideas as to what's happening? I assume the answer's simple and I just don't have adequate Python knowledge, but I'm completely out of ideas.
I think you can simplify your statement with something like this:
filtered = filter(lambda x: x in words, common)
That checks each element in common for its presence in words and keeps or removes it accordingly. You may need to try x not in words depending on what your desired result is, but I think that should come close.
I wanted to add one other approach, that might also come close, though I would need to see examples of your initial lists to test it fully.
filtered = [x for x in common if x in words]
-- EDITED -- I had the syntax in the list comprehension backwards, but caught it after I saw the comment. Thanks!
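You shouldn't delete items from the list you're iterating over, because the loop will then skip elements. Try iterating over a copy of the list instead.
for w in common[:]:
    for i in range(1,n):
        if not w in words[i]:
            common.remove(w)
            break  # w is gone; a second remove(w) would raise ValueError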
From the Python docs:
It is not safe to modify the sequence being iterated over in the loop (this can only happen for mutable sequence types, such as lists). If you need to modify the list you are iterating over (for example, to duplicate selected items) you must iterate over a copy.
You are modifying the list while trying to iterate through it.
You could modify the first line of the code to iterate through a copy of the list (using common[:]).
If you delete (say) item 5, then the old item 6 will now be item 5. So if you think to move to item 6 you will skip it.
Is it possible to iterate backwards over that list? Then index-changes happen in parts you already processed.
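The skipping described in these answers can be reproduced with a small sketch. The data and the function names are invented for this illustration; allowed stands in for the other list from the question.

```python
allowed = {"d"}  # hypothetical stand-in for the items that should be kept

def remove_while_iterating(common):
    """Buggy: mutates the list being looped over; returns the items visited."""
    visited = []
    for w in common:
        visited.append(w)
        if w not in allowed:
            common.remove(w)  # later items shift left, so the next one is skipped
    return visited

def remove_via_copy(common):
    """Iterate over a copy (common[:]) while removing from the original."""
    for w in common[:]:
        if w not in allowed:
            common.remove(w)

def remove_backwards(common):
    """Walk the indices from the end, so deletions only shift visited items."""
    for i in range(len(common) - 1, -1, -1):
        if common[i] not in allowed:
            del common[i]

buggy = ["a", "b", "c", "d"]
print(remove_while_iterating(buggy), buggy)  # ['a', 'c'] ['b', 'd']

fixed_copy = ["a", "b", "c", "d"]
remove_via_copy(fixed_copy)
print(fixed_copy)  # ['d']

fixed_back = ["a", "b", "c", "d"]
remove_backwards(fixed_back)
print(fixed_back)  # ['d']
```

In the buggy version "b" is never visited and therefore never removed, which matches the symptom of some w never being printed.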
