Remove all integers from a list in Python
How do I remove all integers in my list except the last integer?
From
mylist = [('a',1,'b',2,'c',3), ('d',1,'e',2),('f',1,'g',2,'h',3,'i',4)]
To
[('a','b','c',3), ('d','e',2),('f','g','h','i',4)]
I tried the following, but the list comes back unchanged:
no_integers = [x for x in mylist if not isinstance(x, int)]
One way using filter with packing:
[(*filter(lambda x: isinstance(x, str), i), j) for *i, j in mylist]
Output:
[('a', 'b', 'c', 3), ('d', 'e', 2), ('f', 'g', 'h', 'i', 4)]
Explanation:
for *i, j in mylist: unpacks each element of mylist (i.e. ('a',1,'b',2,'c',3), ...) into everything up to the last item (*i) and the last item (j).
So the first element yields i = ['a',1,'b',2,'c'] (a list) and j = 3, and so on.
filter(lambda x: isinstance(x, str), i): from i, keeps only the str objects.
So ['a',1,'b',2,'c'] is reduced to 'a', 'b', 'c'.
(*filter(...), j): unpacks the filtered result into a tuple whose last element is j.
So it becomes ('a', 'b', 'c', 3).
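The three steps above can be traced on a single tuple. This is only a minimal sketch; the names row, head, last, strings and result are illustrative and do not appear in the original answer.

```python
row = ('a', 1, 'b', 2, 'c', 3)

# Step 1: starred assignment splits off the final element.
*head, last = row
print(head)     # ['a', 1, 'b', 2, 'c']  (a list, not a tuple)
print(last)     # 3

# Step 2: filter keeps only the str objects from head.
strings = list(filter(lambda x: isinstance(x, str), head))
print(strings)  # ['a', 'b', 'c']

# Step 3: unpack the filtered items into a new tuple ending in last.
result = (*strings, last)
print(result)   # ('a', 'b', 'c', 3)
```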
In Python, use the list methods clear(), pop(), and remove() to remove items (elements) from a list. It is also possible to delete items with the del statement by specifying a position or range with an index or slice; specifying the entire range deletes all items. See the following example.
l=list(range(10))
print(l)
#[0,1,2,3,4,5,6,7,8,9]
l.clear()
print(l)
#[]
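The example above only covers clear(). A short sketch of the other removal options mentioned (pop(), remove(), and del with an index or slice), using the same list l, might look like this:

```python
l = list(range(10))

# pop() removes the item at an index and returns it (default: last item).
print(l.pop(0))  # 0
print(l)         # [1, 2, 3, 4, 5, 6, 7, 8, 9]

# remove() deletes the first item equal to the given value.
l.remove(5)
print(l)         # [1, 2, 3, 4, 6, 7, 8, 9]

# del with a slice removes a range of positions.
del l[1:3]
print(l)         # [1, 4, 6, 7, 8, 9]

# del with the full slice empties the list, like clear().
del l[:]
print(l)         # []
```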
Using only list comprehensions
there will always be the last integer on each tuple. There's no scenario where the end will be a string.
Keeping your above comment in mind,
Your approach is almost right, except for the fact that your list is actually a list of tuples. This means you need a nested loop to iterate through the items inside the sublist.
Knowing that the last element is an integer (last integer) that has to be kept, you can simply just do the above iteration on n-1 items of each sublist, and then append the last item regardless of the condition.
Check comments for an explanation of each component.
[tuple([item for item in sublist[:-1] if not isinstance(item, int)] + [sublist[-1]]) for sublist in mylist]
# [item for item in sublist[:-1] if not isinstance(item, int)]
#     Same as your method, iterating on the first n-1 items of each tuple
# + [sublist[-1]]
#     Append the last item
# tuple(...) for sublist in mylist
#     Iterate over the list, then over each sublist (tuple) with the condition
[('a', 'b', 'c', 3), ('d', 'e', 2), ('f', 'g', 'h', 'i', 4)]
So, simply use your method for iterating inside the sublist, while having a separate for loop to iterate through the sublists.
You can use listcomp and unpacking like this.
[
    (*[c for c in subl if not isinstance(c, int)], last)
    for *subl, last in mylist
]
Out: [('a', 'b', 'c', 3), ('d', 'e', 2), ('f', 'g', 'h', 'i', 4)]
PS: Updated after this answer. Thanks Chris.
Assumptions
If we can assume the int we want to keep will always be the last element in each tuple, this is very easy. We'll just filter everything except the last element and reconstruct a tuple using the result.
>>> [tuple(list(filter(lambda x: not isinstance(x, int), l[:-1])) + [l[-1]]) for l in mylist]
[('a', 'b', 'c', 3), ('d', 'e', 2), ('f', 'g', 'h', 'i', 4)]
>>>
Less assumption
But what if, as originally specified in your question, we just want to get rid of all ints... except the last one and we don't know what order or positions things will occur in?
First, a function that might prove handy:
>>> def partition(pred, lst):
...     t, f = [], []
...     for x in lst:
...         (t if pred(x) else f).append(x)
...     return (t, f)
...
>>>
We can easily partition your data into ints and not ints with this:
>>> partition(lambda x: isinstance(x, int), mylist[0])
([1, 2, 3], ['a', 'b', 'c'])
>>> [partition(lambda x: isinstance(x, int), lst) for lst in mylist]
[([1, 2, 3], ['a', 'b', 'c']), ([1, 2], ['d', 'e']), ([1, 2, 3, 4], ['f', 'g', 'h', 'i'])]
>>>
Then we'd just need to discard all but the last of the ints, but how would we know where to put them back together? Well, if we were to enumerate them first...
>>> [partition(lambda x: isinstance(x[1], int), enumerate(lst)) for lst in mylist]
[([(1, 1), (3, 2), (5, 3)], [(0, 'a'), (2, 'b'), (4, 'c')]), ([(1, 1), (3, 2)], [(0, 'd'), (2, 'e')]), ([(1, 1), (3, 2), (5, 3), (7, 4)], [(0, 'f'), (2, 'g'), (4, 'h'), (6, 'i')])]
>>>
Now we just discard all but the last int, and add it back into the list of not ints to create a single list.
>>> [(i[-1], ni) for i, ni in [partition(lambda x: isinstance(x[1], int), enumerate(lst)) for lst in mylist]]
[((5, 3), [(0, 'a'), (2, 'b'), (4, 'c')]), ((3, 2), [(0, 'd'), (2, 'e')]), ((7, 4), [(0, 'f'), (2, 'g'), (4, 'h'), (6, 'i')])]
>>> [ni + [i[-1]] for i, ni in [partition(lambda x: isinstance(x[1], int), enumerate(lst)) for lst in mylist]]
[[(0, 'a'), (2, 'b'), (4, 'c'), (5, 3)], [(0, 'd'), (2, 'e'), (3, 2)], [(0, 'f'), (2, 'g'), (4, 'h'), (6, 'i'), (7, 4)]]
>>>
If we sort by the indexes we added on with enumerate:
>>> [sorted(ni + [i[-1]], key=lambda x: x[0]) for i, ni in [partition(lambda x: isinstance(x[1], int), enumerate(lst)) for lst in mylist]]
[[(0, 'a'), (2, 'b'), (4, 'c'), (5, 3)], [(0, 'd'), (2, 'e'), (3, 2)], [(0, 'f'), (2, 'g'), (4, 'h'), (6, 'i'), (7, 4)]]
>>>
Now we need to discard the indexes and convert back to tuples.
>>> [tuple(map(lambda x: x[1], sorted(ni + [i[-1]], key=lambda x: x[0]))) for i, ni in [partition(lambda x: isinstance(x[1], int), enumerate(lst)) for lst in mylist]]
[('a', 'b', 'c', 3), ('d', 'e', 2), ('f', 'g', 'h', 'i', 4)]
>>>
Now, even if we change up the input data, we keep only the last int, and where it originally was:
>>> mylist = [('a',1,'b',2,'c',3,'d',6,'e','f'), ('d',1,'e',2),('f',1,'g',2,'h',3,'i',4)]
>>> [tuple(map(lambda x: x[1], sorted(ni + [i[-1]], key=lambda x: x[0]))) for i, ni in [partition(lambda x: isinstance(x[1], int), enumerate(lst)) for lst in mylist]]
[('a', 'b', 'c', 'd', 6, 'e', 'f'), ('d', 'e', 2), ('f', 'g', 'h', 'i', 4)]
>>>
The inner list comprehension in the above expression can be replaced by a generator expression and the solution still works.
>>> mylist = [('a',1,'b',2,'c',3,'d',6,'e','f'), ('d',1,'e',2),('f',1,'g',2,'h',3,'i',4)]
>>> [tuple(map(lambda x: x[1], sorted(ni + [i[-1]], key=lambda x: x[0]))) for i, ni in (partition(lambda x: isinstance(x[1], int), enumerate(lst)) for lst in mylist)]
[('a', 'b', 'c', 'd', 6, 'e', 'f'), ('d', 'e', 2), ('f', 'g', 'h', 'i', 4)]
>>>
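For comparison, the same "keep only the last int, wherever it sits" behaviour can be sketched without partitioning and re-sorting, by first locating the position of the last int. The helper name keep_last_int is hypothetical and not part of the original answer; a tuple with no ints at all is returned unchanged.

```python
def keep_last_int(row):
    """Drop every int in row except the one that occurs last."""
    # Index of the last int, or None when the row contains no ints.
    last_pos = max(
        (i for i, item in enumerate(row) if isinstance(item, int)),
        default=None,
    )
    # Keep all non-ints, plus the single int sitting at last_pos.
    return tuple(
        item for i, item in enumerate(row)
        if not isinstance(item, int) or i == last_pos
    )

mylist = [('a',1,'b',2,'c',3,'d',6,'e','f'), ('d',1,'e',2), ('f',1,'g',2,'h',3,'i',4)]
print([keep_last_int(row) for row in mylist])
# [('a', 'b', 'c', 'd', 6, 'e', 'f'), ('d', 'e', 2), ('f', 'g', 'h', 'i', 4)]
```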
Use loop and list comprehension
You can iterate over each tuple in the list, use a list comprehension to remove all the integers, and then add the last element back.
mylist = [('a',1,'b',2,'c',3), ('d',1,'e',2),('f',1,'g',2,'h',3,'i',4)]
newlist = []
for ele in mylist:
    # isinstance is used instead of str(x).isdigit(), which would also
    # drop digit-only strings such as '12'.
    list_1 = tuple([x for x in ele if not isinstance(x, int)] + [ele[-1]])
    newlist.append(list_1)
print("Final List :")
print(newlist)
After the list comprehension, add the last element by index, convert the list to a tuple, and append it to your new list.
Related
How can I get list levels of nested lists and add them as tuples?
Given a nested list, such as my_list = ['A', 'B', ['C', ['D', 'E'], 'F'], 'G'] Is there a way I can read the list level of each element, and return it such that: new_list = [('A', 0), ('B', 0), [('C', 1), [('D', 2), ('E', 2)], ('F', 1)], ('G', 0)] Any help or suggestions are greatly appreciated.
You can use recursion:
my_list = ['A', 'B', ['C', ['D', 'E'], 'F'], 'G']
def to_level(d, l = 0):
    return [(a, l) if not isinstance(a, list) else to_level(a, l+1) for a in d]
print(to_level(my_list))
Output:
[('A', 0), ('B', 0), [('C', 1), [('D', 2), ('E', 2)], ('F', 1)], ('G', 0)]
This is nasty and doesn't work if there are two identical letters in the list, but here it is; the idea might be useful for someone:
import re
def foo(my_list):
    s = str(my_list)
    letters = re.findall("'.'", s)
    for letter, tup in zip(letters, [f'({res}, {min(s.count("[", 0, s.index(res))-1, s.count("]", s.index(res))-1)})' for res in letters]):
        s = s.replace(letter, tup)
    return eval(s)
It stringifies the list, searches for the letters, then counts the brackets to the left and right of each letter to determine its level. The letters are then replaced with (letter, level) tuples and the string is converted back to a list using eval.
You can also use a generator. Note that str is itself a Sequence, so strings must be excluded from the recursive branch; otherwise a one-character string recurses forever.
from collections.abc import Sequence
from typing import Any, Iterator, Tuple
def get_level(value: Any, level: int = 0) -> Iterator[Tuple[Any, int]]:
    for item in value:
        # Recurse into nested sequences, but treat strings as leaf values.
        if isinstance(item, Sequence) and not isinstance(item, str):
            yield from get_level(item, level + 1)
            continue
        yield item, level
demo_list = ['A', 'B', ['C', ['D', 'E'], 'F'], 'G']
print(list(get_level(demo_list)))
demo_tuple = ('A', 'B')
print(list(get_level(demo_tuple)))
demo_single = 'A'
print(list(get_level(demo_single)))
Output:
[('A', 0), ('B', 0), ('C', 1), ('D', 2), ('E', 2), ('F', 1), ('G', 0)]
[('A', 0), ('B', 0)]
[('A', 0)]
How to do a full outer join / merge of iterators by key?
I have multiple sorted iterators that yield keyed data, representable by lists:
a = iter([(1, 'a'), (2, 't'), (4, 'c')])
b = iter([(1, 'a'), (3, 'g'), (4, 'g')])
I want to merge them, using the key and keeping track of which iterator had a value for a key. This should be equivalent to a full outer join in SQL:
>>> list(full_outer_join(a, b, key=lambda x: x[0]))
[(1, 'a', 'a'), (2, 't', None), (3, None, 'g'), (4, 'c', 'g')]
I tried using heapq.merge and itertools.groupby, but with merge I already lose information about the iterators:
>>> list(heapq.merge(a, b, key=lambda x: x[0]))
[(1, 'a'), (1, 'a'), (2, 't'), (3, 'g'), (4, 'c'), (4, 'g')]
So what I could use is a tag generator
def tagged(it, tag):
    for item in it:
        yield (tag, *item)
and merge the tagged iterators, group by the key and create a dict using the tag:
merged = merge(tagged(a, 'a'), tagged(b, 'b'), key=lambda x: x[1])
grouped = groupby(merged, key=lambda x: x[1])
[(key, {g[0]: g[2] for g in group}) for key, group in grouped]
Which gives me this usable output:
[(1, {'a': 'a', 'b': 'a'}), (2, {'a': 't'}), (3, {'b': 'g'}), (4, {'a': 'c', 'b': 'g'})]
However, I think creating dicts for every group is quite costly performance-wise, so maybe there is a more elegant way?
Edit: To clarify, the dataset is too big to fit into memory, so I definitely need to use generators/iterators.
Edit 2: To further clarify, a and b should only be iterated over once, because they represent huge files that are slow to read.
You can alter your groupby solution by using reduce and a generator in a function:
from itertools import groupby
from functools import reduce
def group_data(a, b):
    sorted_data = sorted(a+b, key=lambda x:x[0])
    data = [reduce(lambda x, y:(*x, y[-1]), list(b)) for _, b in groupby(sorted_data, key=lambda x:x[0])]
    current = iter(range(len(list(filter(lambda x:len(x) == 2, data)))))
    yield from [i if len(i) == 3 else (*i, None) if next(current)%2 == 0 else (i[0], None, i[-1]) for i in data]
print(list(group_data([(1, 'a'), (2, 't'), (4, 'c')], [(1, 'a'), (3, 'g'), (4, 'g')])))
Output:
[(1, 'a', 'a'), (2, 't', None), (3, None, 'g'), (4, 'c', 'g')]
Here is one solution via dictionaries. I provide it here as it's not clear to me that dictionaries are inefficient in this case. I believe dict_of_lists can be replaced by an iterator, but I use it in the below solution for demonstration purposes.
a = [(1, 'a'), (2, 't'), (4, 'c')]
b = [(1, 'a'), (3, 'g'), (4, 'g')]
dict_of_lists = {'a': a, 'b': b}
def gen_results(dict_of_lists):
    keys = {num for k, v in dict_of_lists.items()
            for num, val in v}
    for key in keys:
        d = {k: val for k, v in dict_of_lists.items()
             for num, val in v if num == key}
        yield (key, d)
Result
list(gen_results(dict_of_lists))
[(1, {'a': 'a', 'b': 'a'}), (2, {'a': 't'}), (3, {'b': 'g'}), (4, {'a': 'c', 'b': 'g'})]
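Since the question asks for a solution that reads each input only once, here is a hedged streaming sketch that builds on the tagging idea from the question, with heapq.merge and itertools.groupby. It assumes exactly two inputs, each sorted by key, with at most one (key, value) pair per key in each input. Unlike the signature in the question, this full_outer_join takes no key argument and always uses the first tuple element.

```python
from heapq import merge
from itertools import groupby

def tagged(it, tag):
    # Prefix every (key, value) pair with the position of its source.
    for item in it:
        yield (tag, *item)

def full_outer_join(a, b):
    # merge() is lazy and advances each input only as needed,
    # so a and b are each iterated over exactly once.
    merged = merge(tagged(a, 0), tagged(b, 1), key=lambda t: t[1])
    for key, group in groupby(merged, key=lambda t: t[1]):
        row = [None, None]          # one slot per input
        for tag, _, value in group:
            row[tag] = value
        yield (key, *row)

a = iter([(1, 'a'), (2, 't'), (4, 'c')])
b = iter([(1, 'a'), (3, 'g'), (4, 'g')])
print(list(full_outer_join(a, b)))
# [(1, 'a', 'a'), (2, 't', None), (3, None, 'g'), (4, 'c', 'g')]
```

A fixed two-slot list per key avoids building a dict for every group, which was the cost the question was worried about.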
Check list of tuples where first element of tuple is specified by defined string
This question is similar to Check that list of tuples has tuple with 1st element as defined string but no one has properly answered the "wildcard" question. Say I have [('A', 2), ('A', 1), ('B', 0.2)] And I want to identify the tuples where the FIRST element is A. How do I return just the following? [('A', 2), ('A', 1)]
Using a list comprehension:
>>> l = [('A', 2), ('A', 1), ('B', 0.2)]
>>> print([el for el in l if el[0] == 'A'])
[('A', 2), ('A', 1)]
You could use Python's filter function for this as follows (in Python 3, filter returns an iterator, so wrap it in list to print the items):
l = [('A', 2), ('A', 1), ('B', 0.2)]
print(list(filter(lambda x: x[0] == 'A', l)))
Giving:
[('A', 2), ('A', 1)]
Simple enough list comprehension:
>>> L = [('A', 2), ('A', 1), ('B', 0.2)]
>>> [(x,y) for (x,y) in L if x == 'A']
[('A', 2), ('A', 1)]
Generator to merge sorted dictionary-like iterables
This is a variation on Generator to yield gap tuples from zipped iterables. I wish to design a generator function that:
Accepts an arbitrary number of iterables
Each input iterable yields zero or more (k, v), k not necessarily unique
Input keys are assumed to be sorted in ascending order
Output should yield (k, (v1, v2, ...))
Output keys are unique, and appear in the same order as the input
The number of output tuples is equal to the number of unique keys in the input
The output values correspond to all input tuples matching the output key
Since the inputs and outputs are potentially large, they should be treated as iterables and not loaded as an in-memory dict or list. As an example,
i1 = ((2, 'a'), (3, 'b'), (5, 'c'))
i2 = ((1, 'd'), (2, 'e'), (3, 'f'))
i3 = ((1, 'g'), (3, 'h'), (5, 'i'), (5, 'j'))
result = sorted_merge(i1, i2, i3)
print(list(result))
This would output:
[(1, ('d', 'g')), (2, ('a', 'e')), (3, ('b', 'f', 'h')), (5, ('c', 'i', 'j'))]
If I'm not mistaken, there's nothing built into the Python standard library to do this out of the box.
While there isn't a single standard library function to do what you want, there are enough building blocks to get you most of the way:
from heapq import merge
from itertools import groupby
from operator import itemgetter
def sorted_merge(*iterables):
    for key, group in groupby(merge(*iterables), itemgetter(0)):
        yield key, [pair[1] for pair in group]
Example:
>>> i1 = ((2, 'a'), (3, 'b'), (5, 'c'))
>>> i2 = ((1, 'd'), (2, 'e'), (3, 'f'))
>>> i3 = ((1, 'g'), (3, 'h'), (5, 'i'), (5, 'j'))
>>> result = sorted_merge(i1, i2, i3)
>>> list(result)
[(1, ['d', 'g']), (2, ['a', 'e']), (3, ['b', 'f', 'h']), (5, ['c', 'i', 'j'])]
Note that in the version of sorted_merge above, we're yielding int, list pairs for the sake of producing readable output. There's nothing to stop you changing the relevant line to
yield key, (pair[1] for pair in group)
if you want to yield int, <generator> pairs instead.
Something a little different:
from collections import defaultdict
def sorted_merged(*items):
    result = defaultdict(list)
    for t in items:
        for k, v in t:
            result[k].append(v)
    return sorted(list(result.items()))
i1 = ((2, 'a'), (3, 'b'), (5, 'c'))
i2 = ((1, 'd'), (2, 'e'), (3, 'f'))
i3 = ((1, 'g'), (3, 'h'), (5, 'i'), (5, 'j'))
result = sorted_merged(i1, i2, i3)
combine list elements
How can I merge/combine two or three elements of a list? For instance, if there are two elements, the list 'l'
l = [(a,b,c,d,e),(1,2,3,4,5)]
is merged into
[(a,1),(b,2),(c,3),(d,4),(e,5)]
However, if there are three elements
l = [(a,b,c,d,e),(1,2,3,4,5),(I,II,III,IV,V)]
the list is converted into
[(a,1,I),(b,2,II),(c,3,III),(d,4,IV),(e,5,V)]
Many thanks in advance.
Use zip (in Python 3, zip returns an iterator, so wrap it in list to print the pairs):
l = [('a', 'b', 'c', 'd', 'e'), (1, 2, 3, 4, 5)]
print(list(zip(*l)))
Result:
[('a', 1), ('b', 2), ('c', 3), ('d', 4), ('e', 5)]
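The same zip(*l) call covers the three-element case from the question, because the star unpacks however many tuples the list holds. A small sketch, with the Roman numerals written as strings for illustration:

```python
l = [('a', 'b', 'c', 'd', 'e'), (1, 2, 3, 4, 5), ('I', 'II', 'III', 'IV', 'V')]

# zip(*l) is equivalent to zip(l[0], l[1], l[2]); list() materialises it.
combined = list(zip(*l))
print(combined)
# [('a', 1, 'I'), ('b', 2, 'II'), ('c', 3, 'III'), ('d', 4, 'IV'), ('e', 5, 'V')]
```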