I have a collection of lists like this:
example = [['a','b','c'],['d','e','f'],[ ],['z'],['g','h','i'],[ ],['z']]
I want to remove [] and ['z'] from the list.
The desired output is:
example = [['a','b','c'],['d','e','f'],['g','h','i']]
How can I do it? Can I remove both using a one-liner?
I am familiar with the .pop() and .remove() methods, but I doubt whether they will work for an empty list like [ ].
You can use a list comprehension for the filtering:
example = [['a','b','c'],['d','e','f'],[ ],['z'],['g','h','i'],[ ],['z']]
output = [sublst for sublst in example if sublst not in ([], ['z'])]
print(output) # [['a', 'b', 'c'], ['d', 'e', 'f'], ['g', 'h', 'i']]
You can remove them like this:
example = list(filter(lambda val: val != [] and val!=['z'], example))
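If the set of unwanted sublists grows, the same filter can be wrapped in a small helper. This is only a minimal sketch: the name drop_sublists and its unwanted parameter are hypothetical and do not come from either answer above.

```python
def drop_sublists(nested, unwanted=([], ['z'])):
    """Return a new list without any sublist equal to one in `unwanted`."""
    # Lists are unhashable, so membership is tested by equality against a tuple.
    return [sub for sub in nested if sub not in unwanted]

example = [['a', 'b', 'c'], ['d', 'e', 'f'], [], ['z'], ['g', 'h', 'i'], [], ['z']]
cleaned = drop_sublists(example)
print(cleaned)  # [['a', 'b', 'c'], ['d', 'e', 'f'], ['g', 'h', 'i']]
```

As for the doubt in the question: example.remove([]) does work on an empty sublist, but each call removes only the first match and raises ValueError once none is left, so a comprehension is usually simpler.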
I have a text file like this:
a w
b x
c,d y
e,f z
And I want to get the values of the first column into a list without duplicates. For now I get the values from the first column, which I am doing like this:
f=open("file.txt","r")
lines=f.readlines()
firstCol=[]
for x in lines:
    firstCol.append(x.split(' ')[0])
f.close()
In the next step I want to separate the values by a comma delimiter the same way I did before, but then I get an output like this:
[['a'], ['b'], ['c', 'd'], ['e', 'f']]
How can I convert this into a one dimensional thing to be able to remove duplicates afterwards?
I am a beginner in python.
You can split again immediately after the first split; you must use extend instead of append so that the values are added individually rather than as a nested list.
f=open("file.txt","r")
lines=f.readlines()
firstCol=[]
for x in lines:
    firstCol.extend(x.split(' ')[0].split(','))
f.close()
print(firstCol)
Result
['a', 'b', 'c', 'd', 'e', 'f']
Or, if you want to keep firstCol:
f=open("file.txt","r")
lines=f.readlines()
firstCol=[]
for x in lines:
    firstCol.append(x.split(' ')[0])
f.close()
one_dimension = []
for col in firstCol:
    one_dimension.extend(col.split(','))
print(firstCol)
print(one_dimension)
Result
['a', 'b', 'c,d', 'e,f']
['a', 'b', 'c', 'd', 'e', 'f']
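The loops above can also be written with a with block, which closes the file automatically. The following is a sketch under assumptions: the helper name first_column_values is hypothetical, and io.StringIO stands in for file.txt so that the example runs on its own.

```python
import io

def first_column_values(lines):
    """Collect the comma-separated values from the first column of each line."""
    values = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        first_col = line.split()[0]          # text before the first whitespace
        values.extend(first_col.split(','))  # extend flattens, append would nest
    return values

# io.StringIO stands in for open("file.txt") so the sketch is self-contained.
sample = io.StringIO("a w\nb x\nc,d y\ne,f z\n")
with sample as f:
    print(first_column_values(f))  # ['a', 'b', 'c', 'd', 'e', 'f']
```

With a real file, replace the StringIO object with open("file.txt") inside the with statement.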
You can use itertools.chain to flatten your list of lists, and then use the built-in set class to remove the duplicates:
from itertools import chain
l = [['a'], ['b'], ['c', 'd'], ['e', 'f']]
set(chain.from_iterable(l))
# {'a', 'b', 'c', 'd', 'e', 'f'}
To flatten your list, you can also use a list comprehension:
my_l = [e for i in l for e in i]
# ['a', 'b', 'c', 'd', 'e', 'f']
The same with two simple for loops:
my_l = []
for i in l:
    for e in i:
        my_l.append(e)
Possible solution 1
If you are happy with your code, you can keep it as is and remove duplicates from the list of lists as follows:
import itertools
firstCol.sort()
firstCol = list(x for x,_ in itertools.groupby(firstCol))
Possible solution 2
If you want to convert the list of lists into one list of items:
firstCol = [x for y in firstCol for x in y]
If you want to also remove duplicates:
firstCol = list(set([x for y in firstCol for x in y]))
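Note that list(set(...)) does not guarantee any particular order in the result. If the original order matters, one possible sketch, assuming Python 3.7 or later (where dict preserves insertion order), is to deduplicate with dict.fromkeys. The sample data below is made up for illustration.

```python
# Hypothetical sample with duplicates spread across the sublists.
nested = [['a'], ['b'], ['c', 'd'], ['a', 'e'], ['d', 'f']]

# Flatten first, then keep only the first occurrence of each item.
flat = [item for sub in nested for item in sub]
unique_in_order = list(dict.fromkeys(flat))

print(unique_in_order)  # ['a', 'b', 'c', 'd', 'e', 'f']
```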
I have a list of values in which some values are words separated by commas, but are considered single strings as shown:
l = ["a",
"b,c",
"d,e,f"]
#end result should be
#new_list = ['a','b','c','d','e','f']
I want to split those strings and was wondering if there's a one-liner or something short to do such a transformation. So far, I was thinking of just iterating through l, calling .split(',') on each element, and then merging the results, but that seems like it would take a while to run.
import itertools
new_list = list(itertools.chain(*[x.split(',') for x in l]))
print(new_list)
>>> ['a', 'b', 'c', 'd', 'e', 'f']
Kind of unusual but you could join all your elements with , and then split them:
l = ["a",
"b,c",
"d,e,f"]
newList = ','.join(l).split(',')
print(newList)
Output:
['a', 'b', 'c', 'd', 'e', 'f']
Here's a one-liner using a (nested) list comprehension:
new_list = [item for csv in l for item in csv.split(',')]
Not exactly a one-liner, but 2 lines:
>>> l = ["a",
"b,c",
"d,e,f"]
>>> ll =[]
>>> [ll.extend(x.split(',')) for x in l]
[None, None, None]
>>> ll
['a', 'b', 'c', 'd', 'e', 'f']
The accumulator needs to be created separately since x.split(',') can not be unpacked inside a comprehension.
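The comprehension in the last snippet is run only for its side effect and builds a throwaway list of None values. A possible alternative, sketched here with itertools.chain.from_iterable and a generator expression, needs no separate accumulator:

```python
from itertools import chain

l = ["a", "b,c", "d,e,f"]

# Each string is split lazily and the pieces are chained into one flat list.
new_list = list(chain.from_iterable(part.split(',') for part in l))

print(new_list)  # ['a', 'b', 'c', 'd', 'e', 'f']
```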
I am searching through a list like this:
my_list = [['a','b'],['b','c'],['a','x'],['f','r']]
and I want to see which elements come with 'a'. So first I have to find the lists in which 'a' occurs, then get access to the other element of the list. I do this with abs(pair.index('a')-1):
for pair in my_list:
    if 'a' in pair:
        print( pair[abs(pair.index('a')-1)] )
Is there any better pythonic way to do that?
Something like: pair.index(not 'a') maybe?
UPDATE:
Maybe it is good to point out that 'a' is not necessarily the first element.
In my case, ['a','a'] doesn't happen, but in general it may be good to choose a solution that handles this situation too.
Are you looking for elements that accompany a? If so, a simple list comprehension will do:
In [110]: [x for x in my_list if 'a' in x]
Out[110]: [['a', 'b'], ['a', 'x']]
If you just want the elements and not the pairs, how about getting rid of a before printing:
In [112]: [(set(x) - {'a'}).pop() for x in my_list if 'a' in x]
Out[112]: ['b', 'x']
I use a set because a could either be the first or second element in the pair.
If I understand your question correctly, the following should work:
my_list = filter(
    lambda e: 'a' not in e,
    my_list
)
Note that in Python 3, this returns a filter object. You may want to wrap the call in list() to get a list instead.
That technique works ok here, but it may be more efficient, and slightly more readable, to do it using sets. Here's one way to do that.
def paired_with(seq, ch):
    chset = set(ch)
    return [(set(pair) - chset).pop() for pair in seq if ch in pair]
my_list = [['a','b'], ['b','c'], ['x','a'], ['f','r']]
print(paired_with(my_list, 'a'))
output
['b', 'x']
If you want to do lots of tests on the same list, it would be more efficient to build a list of sets.
def paired_with(seq, ch):
    chset = set(ch)
    return [(pair - chset).pop() for pair in seq if ch in pair]
my_list = [['a','b'], ['b','c'], ['x','a'], ['f','r']]
my_sets = [set(u) for u in my_list]
print(my_sets)
print(paired_with(my_sets, 'a'))
output
[{'b', 'a'}, {'c', 'b'}, {'x', 'a'}, {'r', 'f'}]
['b', 'x']
This will fail if there's a pair like ['a', 'a'], but we can easily fix that:
def paired_with(seq, ch):
    chset = set(ch)
    # Fall back to a fresh {ch} so that .pop() never empties the shared chset.
    return [(pair - chset or {ch}).pop() for pair in seq if ch in pair]
my_list = [['a','b'], ['b','c'], ['x','a'], ['f','r'], ['a', 'a']]
my_sets = [set(u) for u in my_list]
print(paired_with(my_sets, 'a'))
output
['b', 'x', 'a']
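For comparison, here is a sketch that stays with lists and uses index arithmetic instead of sets. It assumes every pair has exactly two elements; the function name partner_of is hypothetical. Because it never hashes the elements, it also works when they are unhashable.

```python
def partner_of(pairs, target):
    """Return the other element of every two-item pair that contains `target`."""
    # pair.index(target) is 0 or 1, so 1 - index selects the other position.
    return [pair[1 - pair.index(target)] for pair in pairs if target in pair]

my_list = [['a', 'b'], ['b', 'c'], ['x', 'a'], ['f', 'r'], ['a', 'a']]
print(partner_of(my_list, 'a'))  # ['b', 'x', 'a']
```

For ['a', 'a'] the index is 0, so the second 'a' is returned, which matches the set-based result above.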
Suppose I have a list of items like this:
mylist=['a','b','c','d','e','f','g','h','i']
I want to pop two items from the left (i.e. a and b) and two items from the right (i.e. h and i). I want the most concise and clean way to do this. I could do it this way myself:
for x in range(2):
    mylist.pop()
    mylist.pop(0)
Any other alternatives?
From a performance point of view:
mylist = mylist[2:-2] and del mylist[:2]; del mylist[-2:] take about the same time.
Both are around 3 times faster than the first solution, for _ in range(2): mylist.pop(0); mylist.pop().
Code
import timeit
iterations = 1000000
print(timeit.timeit('''mylist=list(range(9))\nfor _ in range(2): mylist.pop(0); mylist.pop()''', number=iterations)/iterations)
print(timeit.timeit('''mylist=list(range(9))\nmylist = mylist[2:-2]''', number=iterations)/iterations)
print(timeit.timeit('''mylist=list(range(9))\ndel mylist[:2];del mylist[-2:]''', number=iterations)/iterations)
output
1.07710313797e-06
3.44465017319e-07
3.49956989288e-07
You could slice out a new list, keeping the old list as is:
mylist=['a','b','c','d','e','f','g','h','i']
newlist = mylist[2:-2]
newlist now contains:
['c', 'd', 'e', 'f', 'g']
You can overwrite the reference to the old list too:
mylist = mylist[2:-2]
Both of the above approaches use more memory than the one below, because slicing creates a new list.
What you're attempting to do yourself is memory-friendly, with the downside that it mutates your old list. Note that popleft is not available for lists in Python; it is a method of collections.deque objects.
This works well in Python 3:
for x in range(2):
    mylist.pop(0)
    mylist.pop()
In Python 2, use xrange and pop only:
for _ in xrange(2):
    mylist.pop(0)
    mylist.pop()
The fastest way to delete, as Martijn suggests (this only deletes the list's references to the items, not necessarily the items themselves):
del mylist[:2]
del mylist[-2:]
If you don't want to retain the values, you could delete the indices:
del myList[-2:], myList[:2]
This does still require that all remaining items are moved up in the list. Two .popleft() calls require this too, but at least now the list object can handle the moves in one step.
No new list object is created.
Demo:
>>> myList = ['a','b','c','d','e','f','g','h','i']
>>> del myList[-2:], myList[:2]
>>> myList
['c', 'd', 'e', 'f', 'g']
However, from your use of popleft I strongly suspect you are, in fact, working with a collections.deque object. If so, stick to using popleft(), as that is far more efficient than slicing or del on a list object.
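If the data really does live in a deque, the removal might look like the following sketch. The variable names removed_left and removed_right are chosen here for illustration only.

```python
from collections import deque

items = deque(['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i'])
removed_left, removed_right = [], []

for _ in range(2):
    removed_left.append(items.popleft())  # O(1) removal from the left end
    removed_right.append(items.pop())     # O(1) removal from the right end

print(list(items))    # ['c', 'd', 'e', 'f', 'g']
print(removed_left)   # ['a', 'b']
print(removed_right)  # ['i', 'h']
```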
To me, this is the prettiest way to do it using a list comprehension:
>> mylist=['a','b','c','d','e','f','g','h','i']
>> newlist1 = [mylist.pop(0) for idx in range(2)]
>> newlist2 = [mylist.pop() for idx in range(2)]
That will pull the first two elements from the beginning and the last two elements from the end of the list. The remaining items stay in the list.
First 2 elements: mylist[:2]
Last 2 elements: mylist[-2:]
So what remains is mylist[2:-2]
mylist = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o']
new_list = [mylist.pop(0) for _ in range(6) if len(mylist) > 0]
>>> new_list
['a', 'b', 'c', 'd', 'e', 'f']
new_list = [mylist.pop(0) for _ in range(6) if len(mylist) > 0]
>>> new_list
['g', 'h', 'i', 'j', 'k', 'l']
new_list = [mylist.pop(0) for _ in range(6) if len(mylist) > 0]
>>> new_list
['m', 'n', 'o']
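The repeated pop(0) calls above empty the list in batches of six, and each pop(0) has to shift the remaining items. If the original list may be left intact, the same batches can be produced by slicing. This is a sketch, with chunk_size as an assumed name:

```python
mylist = list('abcdefghijklmno')
chunk_size = 6

# Step through the list in strides of chunk_size; the last slice may be shorter.
chunks = [mylist[i:i + chunk_size] for i in range(0, len(mylist), chunk_size)]

for chunk in chunks:
    print(chunk)
# ['a', 'b', 'c', 'd', 'e', 'f']
# ['g', 'h', 'i', 'j', 'k', 'l']
# ['m', 'n', 'o']
```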
Python 3 has something cool, similar to rest in JS (but a pain if you need to pop out a lot of stuff):
mylist=['a','b','c','d','e','f','g','h','i']
_, _, *mylist, _, _ = mylist
mylist == ['c', 'd', 'e', 'f', 'g'] # True
I'm new to programming in general, so looking to really expand my skills here. I'm trying to write a script that will grab a list of strings from an object, then order them based on a template of my design. Any items not in the template will be added to the end.
Here's how I'm doing it now, but could someone suggest a better/more efficient way?
originalList = ['b', 'a', 'c', 'z', 'd']
listTemplate = ['a', 'b', 'c', 'd']
listFinal = []
for thing in listTemplate:
    if thing in originalList:
        listFinal.append(thing)
        originalList.pop(originalList.index(thing))
for thing in originalList:
    listFinal.append(thing)
    originalList.pop(originalList.index(thing))
Try this:
originalList = ['b', 'a', 'c', 'z', 'd']
listTemplate = ['a', 'b', 'c', 'd']
order = { element:index for index, element in enumerate(listTemplate) }
sorted(originalList, key=lambda element: order.get(element, float('+inf')))
=> ['a', 'b', 'c', 'd', 'z']
This is how it works:
First, we build a dictionary indicating, for each element in listTemplate, its relative order with respect to the others. For example a is 0, b is 1 and so on
Then we sort originalList. If one of its elements is present in the order dictionary, then use its relative position for ordering. If it's not present, return a positive infinite value - this will guarantee that the elements not in listTemplate will end up at the end, with no further ordering among them.
The solution in the question, although correct, is not very pythonic. In particular, whenever you have to build a new list, try to use a list comprehension instead of explicit looping/appending. And it's not a good practice to "destroy" the input list (using pop() in this case).
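One detail worth illustrating: Python's sorted is guaranteed to be stable, so elements that share the infinite key keep their original relative order. The sketch below uses a made-up list with several elements missing from the template to show this.

```python
# Hypothetical input: 'y', 'z' and 'x' are not in the template.
original = ['b', 'y', 'a', 'c', 'z', 'd', 'x']
template = ['a', 'b', 'c', 'd']

order = {element: index for index, element in enumerate(template)}
result = sorted(original, key=lambda element: order.get(element, float('inf')))

# Unknown elements all get the key inf and stay in their input order.
print(result)  # ['a', 'b', 'c', 'd', 'y', 'z', 'x']
```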
You can create a dict from the listTemplate list; that way the expensive (O(N)) list.index operations are replaced by O(1) lookups.
>>> lis1 = ['b', 'a', 'c', 'z', 'd']
>>> lis2 = ['a', 'b', 'c', 'd']
Use enumerate to create a dict with the items as keys (assuming the items are hashable) and their indices as values.
>>> dic = { x:i for i,x in enumerate(lis2) }
Now dic looks like:
{'a': 0, 'c': 2, 'b': 1, 'd': 3}
Now, for each item in lis1, we need to look up its index in dic; if the key is not found, we return float('inf').
Function used as key:
def get_index(key):
    return dic.get(key, float('inf'))
Now sort the list:
>>> lis1.sort(key=get_index)
>>> lis1
['a', 'b', 'c', 'd', 'z']
For the final step, you can just use:
listFinal += originalList
and it will add these items to the end.
There is no need to create a new dictionary at all:
>>> len_lis1=len(lis1)
>>> lis1.sort(key = lambda x: lis2.index(x) if x in lis2 else len_lis1)
>>> lis1
['a', 'b', 'c', 'd', 'z']
Here is a way that has better computational complexity:
# add all elements of originalList not found in listTemplate to the back of listTemplate
s = set(listTemplate)
listTemplate.extend(el for el in originalList if el not in s)
# now sort
rank = {el:index for index,el in enumerate(listTemplate)}
listFinal = sorted(originalList, key=rank.get)
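The snippet above extends listTemplate in place. If neither input should be modified, a sort-free variant is possible. The sketch below assumes the items are hashable and that template items appear at most once in the original list; the function name order_by_template is hypothetical.

```python
def order_by_template(original, template):
    """Return template items found in `original`, followed by the leftovers."""
    present = set(original)
    known = set(template)
    # Template order first, restricted to items that actually occur.
    ordered = [item for item in template if item in present]
    # Then everything the template does not mention, in its original order.
    leftovers = [item for item in original if item not in known]
    return ordered + leftovers

original_list = ['b', 'a', 'c', 'z', 'd']
list_template = ['a', 'b', 'c', 'd']
print(order_by_template(original_list, list_template))  # ['a', 'b', 'c', 'd', 'z']
```

Both passes are linear, and neither original_list nor list_template is changed.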