Python - Remove nested lists in dictionary - python

I have a dict named results, structured in this format:
{'a': [['1', '2', '4'],['1', '2', '2'],['1', '2', '2']], 'b': [['2', '2', '4'],['2', '2', '2'],['1', '2', '4']], 'c': [['1', '2', '4'],['1', '2', '2'],['1', '2', '2']]}
I wish to remove the duplicate nested lists for each key, therefore leaving the dict with:
{'a': [['1', '2', '4'],['1', '2', '2']], 'b': [['2', '2', '4'],['2', '2', '2'],['1', '2', '4']], 'c': [['1', '2', '4'],['1', '2', '2']]}
I have tried:
newdict = {}
for k, v in results.items():
    for i in v:
        if i not in i:
            newdict[k] = i
Any help? Thanks in advance!

Your code is wrong beyond repair (sorry), mostly because of those 2 lines:
if i not in i: # makes no sense testing if something is inside itself
newdict[k] = i # overwrites the key with one list
You'd have to count each list, and only keep one occurrence of each.
If order doesn't matter you could do that with a nested dict/set/list comprehension.
results = {'a': [['1', '2', '4'],['1', '2', '2'],['1', '2', '2']], 'b': [['2', '2', '4'],['2', '2', '2'],['1', '2', '4']], 'c': [['1', '2', '4'],['1', '2', '2'],['1', '2', '2']]}
newdict = {k:[list(y) for y in {tuple(x) for x in v}] for k,v in results.items()}
print(newdict)
result:
{'a': [['1', '2', '2'], ['1', '2', '4']], 'b': [['2', '2', '4'], ['1', '2', '4'], ['2', '2', '2']], 'c': [['1', '2', '2'], ['1', '2', '4']]}
Using a set guarantees uniqueness, but lists are not hashable and cannot be put in a set, so the expression converts each list to a tuple first (which is hashable) and converts it back to a list once the processing is done.
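A small sketch of the hashability point, in case it helps. The names row, list_is_hashable and unique are only illustrative and do not come from the answer above:

```python
row = ['1', '2', '2']

# Putting a list into a set raises TypeError because lists are mutable
# and therefore not hashable.
try:
    {row}
    list_is_hashable = True
except TypeError:
    list_is_hashable = False

# Tuples of strings are hashable, so two equal tuples collapse into one member.
unique = {tuple(row), tuple(['1', '2', '2'])}

print(list_is_hashable)  # False
print(unique)            # {('1', '2', '2')}
```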

If order is important, you can use something like this:
results = {'a': [['1', '2', '4'],['1', '2', '2'],['1', '2', '2']],
'b': [['2', '2', '4'],['2', '2', '2'],['1', '2', '4']],
'c': [['1', '2', '4'],['1', '2', '2'],['1', '2', '2']]}
print({k: [y for x, y in enumerate(v)
           if y not in v[:x]] for k, v in results.items()})
Output:
{'a': [['1', '2', '4'], ['1', '2', '2']],
'b': [['2', '2', '4'], ['2', '2', '2'], ['1', '2', '4']],
'c': [['1', '2', '4'], ['1', '2', '2']]}
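To leave the first sub-list out of the check, so that each item is compared only against the earlier items from index 1 onward (which means one later copy of the first sub-list is kept), you could do:
print({k: [y for x, y in enumerate(v)
           if y not in v[1:x]] for k, v in results.items()})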
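As a possible alternative, here is a sketch that keeps the first occurrence of each sub-list without rescanning a slice for every item. It assumes Python 3.7+, where dicts keep insertion order, and the name deduped is just illustrative:

```python
results = {'a': [['1', '2', '4'], ['1', '2', '2'], ['1', '2', '2']],
           'b': [['2', '2', '4'], ['2', '2', '2'], ['1', '2', '4']],
           'c': [['1', '2', '4'], ['1', '2', '2'], ['1', '2', '2']]}

# dict.fromkeys keeps one key per distinct tuple, in first-seen order;
# each tuple is then turned back into a list.
deduped = {k: [list(t) for t in dict.fromkeys(map(tuple, v))]
           for k, v in results.items()}

print(deduped)
```

This gives the same result as the enumerate-based version for this data, while doing one hash lookup per sub-list instead of a linear scan.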

Related

How to retrieve the position of a list in a list of lists in python

A = [['0', '6', '4', '3'], ['0', '2', '8', '3'], ['0', '4', '1', '5'], ['0', '3', '2', '5']]
B = ['0', '4', '1', '5']
Say I want to find out at which line of A the list equals B. How do I write a solution for this?
The answer would be the third line.
I tried using a for loop.
You can try list.index(element) to get the index of the element in the original list (A). In your terminology, to get the line just add one to the index.
line = A.index(B) + 1
You don't need to use loops. You can get the element's position with list.index:
A = [['0', '6', '4', '3'], ['0', '2', '8', '3'], ['0', '4', '1', '5'], ['0', '3', '2', '5']]
B = ['0', '4', '1', '5']
print(A.index(B))
>>> 2
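One thing worth knowing is that list.index raises ValueError when the element is not present. A minimal sketch that handles that case, where find_line is a hypothetical helper name, not something from the answers above:

```python
A = [['0', '6', '4', '3'], ['0', '2', '8', '3'], ['0', '4', '1', '5'], ['0', '3', '2', '5']]
B = ['0', '4', '1', '5']


def find_line(rows, target):
    """Return the 1-based position of target in rows, or None if it is absent."""
    try:
        return rows.index(target) + 1
    except ValueError:  # list.index raises ValueError when nothing matches
        return None


print(find_line(A, B))                     # 3
print(find_line(A, ['9', '9', '9', '9']))  # None
```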

Loop to gradually remove elements from a dictionary in Python

I have a dictionary that for example looks like this:
new_dict = {
0: ['1'],
1: ['1', '2'],
2: ['1', '2', '3'],
3: ['1', '2', '3', '4'],
4: ['1', '2', '3', '4', '5'],
5: ['1', '2', '3', '4', '5', '6']
}
The values are just gradually appended, meaning that each value contains the elements of the previous value plus its own new element.
My question is: how can I, starting from key=3 (included), gradually remove all the starting values?
So for example, for key=3, after the code the new key=3 should look like this:
3: ['2', '3', '4'] #Should remove the value of key=0
then for key=4, the new key=4 should look like this:
4: ['3', '4', '5'] #Should remove the value of key=1
And so on until the end of new_dict.
Here is a relatively compact solution:
new_dict = {0: ['1'], 1: ['1', '2'], 2: ['1', '2', '3'], 3: ['1', '2', '3', '4'], 4: ['1', '2', '3', '4', '5'], 5: ['1', '2', '3', '4', '5', '6']}
up_dict={i:[item for item in new_dict[i] if item not in new_dict[i-3]] if i>2 else new_dict[i] for i in new_dict}
print(up_dict)
Output:
{0: ['1'], 1: ['1', '2'], 2: ['1', '2', '3'], 3: ['2', '3', '4'], 4: ['3', '4', '5'], 5: ['4', '5', '6']}
You can loop through the items and reassign the list to the last three elements:
In [1]: new_dict = {0: ['1'], 1: ['1', '2'], 2: ['1', '2', '3'], 3: ['1', '2', '3', '4'], 4: ['1', '2', '3', '4', '5'], 5: ['1', '2', '3', '4', '5', '6']}
In [2]: for k, v in new_dict.items():
   ...:     if len(v) > 3:
   ...:         new_dict[k] = v[-3:]
   ...:
In [3]: new_dict
Out[3]:
{0: ['1'],
1: ['1', '2'],
2: ['1', '2', '3'],
3: ['2', '3', '4'],
4: ['3', '4', '5'],
5: ['4', '5', '6']}
As a dict comprehension:
out_dict = {k: v[-3:] if len(v) > 3 else v for k, v in new_dict.items()}
You can use a dictionary comprehension that keeps the key-value pair as it is when the key is below 3, and otherwise keeps only the items that are not in the value of the key three positions earlier (keys 0, 1, 2, and so on).
{k:new_dict[k] if idx<0 else [i
for i in new_dict[k] if i not in new_dict[idx]]
for idx,k in enumerate(new_dict, -3)}
OUTPUT:
{0: ['1'],
1: ['1', '2'],
2: ['1', '2', '3'],
3: ['2', '3', '4'],
4: ['3', '4', '5'],
5: ['4', '5', '6']}
This should do the job and be quite efficient as well, if that's important to you. It assumes that each list has exactly key + 1 items, as in the example:
new_dict = {0: ['1'], 1: ['1', '2'], 2: ['1', '2', '3'], 3: ['1', '2', '3', '4'], 4: ['1', '2', '3', '4', '5'], 5: ['1', '2', '3', '4', '5', '6']}
print({key: (value, value[key-2:])[key > 2] for key, value in new_dict.items()})
>>> {0: ['1'], 1: ['1', '2'], 2: ['1', '2', '3'], 3: ['2', '3', '4'], 4: ['3', '4', '5'], 5: ['4', '5', '6']}
Shorter comprehension-based solution (needs Python 3.8+ for the := operator):
new_dict = {0: ['1'], 1: ['1', '2'], 2: ['1', '2', '3'], 3: ['1', '2', '3', '4'], 4: ['1', '2', '3', '4', '5'], 5: ['1', '2', '3', '4', '5', '6']}
c = 0
r = {a:b if a < 3 else b[(c:=c+1):] for a, b in new_dict.items()}
Output:
{0: ['1'], 1: ['1', '2'], 2: ['1', '2', '3'], 3: ['2', '3', '4'], 4: ['3', '4', '5'], 5: ['4', '5', '6']}
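Since the values are cumulative, removing the leading items amounts to keeping the last three items of each list. A minimal sketch of that idea as a reusable function follows. The name keep_last and the size parameter are assumptions made for illustration:

```python
def keep_last(d, size=3):
    """Return a new dict whose lists are trimmed to their last `size` items."""
    # v[-size:] returns the whole list when it has `size` items or fewer.
    return {k: v[-size:] for k, v in d.items()}


new_dict = {0: ['1'], 1: ['1', '2'], 2: ['1', '2', '3'],
            3: ['1', '2', '3', '4'], 4: ['1', '2', '3', '4', '5'],
            5: ['1', '2', '3', '4', '5', '6']}

trimmed = keep_last(new_dict)
print(trimmed)
```

Unlike the variants that index with the key, this does not depend on the keys being consecutive integers, only on each list being built up cumulatively.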

How to find all possible combinations from nested list containing list and strings?

I am trying to get all possible pattern from list like:
input_x = ['1', ['2', '2x'], '3', '4', ['5', '5x']]
As we see, it has 2 nested list ['2', '2x'] and ['5', '5x'] here.
That means all possible pattern is 4 (2 case x 2 case), the expect output is:
output1 = ['1','2' , '3', '4', '5']
output2 = ['1','2x', '3', '4', '5']
output3 = ['1','2' , '3', '4', '5x']
output4 = ['1','2x', '3', '4', '5x']
I tried to search for how to do this, but I could not find any examples (because I have no idea which keyword to search for).
I think Python has built-in libraries/methods to handle it.
One way to achieve this is with itertools.product. To use it, you first need to wrap the single elements of your list in lists of their own.
For example, we first need to convert your list:
['1', ['2', '2x'], '3', '4', ['5', '5x']]
to:
[['1'], ['2', '2x'], ['3'], ['4'], ['5', '5x']]
This can be done with the list comprehension below:
formatted_list = [(l if isinstance(l, list) else [l]) for l in input_x]
# Here `formatted_list` is containing the elements in your desired format, i.e.:
# [['1'], ['2', '2x'], ['3'], ['4'], ['5', '5x']]
Now call itertools.product on the unpacked version of the above list:
>>> from itertools import product
# v `*` is used to unpack the `formatted_list` list
>>> list(product(*formatted_list))
[('1', '2', '3', '4', '5'), ('1', '2', '3', '4', '5x'), ('1', '2x', '3', '4', '5'), ('1', '2x', '3', '4', '5x')]
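Putting the two steps together, a minimal runnable sketch could look like this. The names wrapped and patterns are illustrative, and the tuples are converted to lists to match the output format asked for in the question:

```python
from itertools import product

input_x = ['1', ['2', '2x'], '3', '4', ['5', '5x']]

# Wrap plain strings in one-element lists so every position is a list of choices.
wrapped = [item if isinstance(item, list) else [item] for item in input_x]

# product(*wrapped) picks one choice per position; convert each tuple to a list.
patterns = [list(combo) for combo in product(*wrapped)]

for pattern in patterns:
    print(pattern)
```

Because each element stays in its own position, the original order is kept without any sorting.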
If you don't want to wrap every element in a sub-list, you can try something like this:
import itertools
input_x = ['1', ['2', '2x'], '3', '4', ['5', '5x'],['6','6x']]
non_li=[]
li=[]
for i in input_x:
    if isinstance(i,list):
        li.append(i)
    else:
        non_li.append(i)
for i in itertools.product(*li):
    sub=non_li[:]
    sub.extend(i)
    # sorting restores the original order only because these labels happen to sort that way
    print(sorted(sub))
output:
['1', '2', '3', '4', '5', '6']
['1', '2', '3', '4', '5', '6x']
['1', '2', '3', '4', '5x', '6']
['1', '2', '3', '4', '5x', '6x']
['1', '2x', '3', '4', '5', '6']
['1', '2x', '3', '4', '5', '6x']
['1', '2x', '3', '4', '5x', '6']
['1', '2x', '3', '4', '5x', '6x']

Combine/group/merge partially the list of lists by equal first field, joining second field in string and writing in other fields same data

For example, I have a list that I can sort by its first two fields, which works fine:
import operator
list = [
['1', '2', '3'],
['1', '5', '6'],
['2', '8', '9', '8', '17'],
['2', '3', '5', '3'],
['1', '14', '89', '34', '15'],
]
sorted_list = sorted(list, key=operator.itemgetter(0, 1))
getting:
['1', '14', '89', '34', '15']
['1', '2', '3']
['1', '5', '6']
['2', '3', '5', '3']
['2', '8', '9', '8', '17']
So what I need is to combine those lists by the first field. In the first step that would be '1' from [0][0], [1][0], [2][0] of the sorted list. Then I join the second fields with a separator such as a colon: "14:2:5". I don't care which of those 3 lists supplies the remaining fields, so after the '|' it can be any of:
['1', '14:2:5',| '89', '34', '15']
['1', '14:2:5',| '3']
['1', '14:2:5',| '6']
(in most cases data after '|' will match for first field)
In the end, I want something like:
['1', '14:2:5', '89', '34', '15']
['2', '3:8', '5', '3']
I'm currently stuck in some sort of for loop and getting IndexError all the way :-(
I feel there should be a much easier, more Pythonic way.
I don't yet know how to find this algorithm or what it is called. Something like a reduce/map/shrink/normalization of a list of lists, appending elements based on their values?
Thanks a lot for the help. Python still surprises me with how neatly things can be done. In the end, combining all the answers, for Python 3:
# -*- coding: utf-8 -*-
import operator
import itertools
from natsort import humansorted
list_to_sort = [
['1', 'A', '3'],
['1', '5', '6'],
['1', '1', '10', '11', '12'],
['t', 'S', '7', '0asdf'],
['2', '8', '9', '8', '17'],
['2', '705', '5', '3'],
['2', 'checks', 'df', '1'],
['1', '14', '89', '34', '15'],
]
sorted_list = humansorted(list_to_sort, key=operator.itemgetter(0, 1))
grouped = [list(g) for k, g in itertools.groupby(sorted_list, key=lambda x: x[0])]
out = [[gg[0][0], ':'.join([g[1] for g in gg])] + gg[0][2:] for gg in grouped]
for elem in out:
    print(elem)
Once you've sorted your list so that your first-field groups are contiguous, you can use itertools.groupby to do the heavy lifting:
>>> from itertools import groupby
>>> grouped = [list(g) for k,g in groupby(sorted_list, key=lambda x: x[0])]
>>> grouped
[[['1', '14', '89', '34', '15'], ['1', '2', '3'], ['1', '5', '6']], [['2', '3', '5', '3'], ['2', '8', '9', '8', '17']]]
>>> out = [[gg[0][0], ':'.join([g[1] for g in gg])] + gg[0][2:] for gg in grouped]
>>> out
[['1', '14:2:5', '89', '34', '15'], ['2', '3:8', '5', '3']]
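For reference, a self-contained sketch of the same idea using only the standard library and the data from the question. The names rows, sorted_rows and merged are illustrative. Note that the sort here is lexicographic on strings, as in the question, which is why '14' comes before '2':

```python
from itertools import groupby
from operator import itemgetter

rows = [
    ['1', '2', '3'],
    ['1', '5', '6'],
    ['2', '8', '9', '8', '17'],
    ['2', '3', '5', '3'],
    ['1', '14', '89', '34', '15'],
]

# groupby only merges adjacent rows, so sort by the grouping field first.
sorted_rows = sorted(rows, key=itemgetter(0, 1))

merged = []
for first, group in groupby(sorted_rows, key=itemgetter(0)):
    group = list(group)
    # Join the second fields with ':' and take the remaining fields from the first row.
    merged.append([first, ':'.join(row[1] for row in group)] + group[0][2:])

print(merged)
```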
You can use a dictionary to collect the sublists that share a common first item, then use a list comprehension with zip and join to create the desired result:
>>> d = {}
>>> for i, *j in sorted_list:
...     d.setdefault(i, []).append(iter(j))
...
>>> [[i, ':'.join(next(zip(*j)))] + list(j[0]) for i, j in d.items()]
[['1', '14:2:5', '89', '34', '15'], ['2', '3:8', '5', '3']]
Note that this code was written for Python 3. In Python 2 you can use itertools.izip instead of zip, and create the dictionary like this:
>>> for i in sorted_list:
...     d.setdefault(i[0], []).append(iter(i[1:]))
You could do it with a pair of functions as shown below. The first one, named grouper, is a generator function. Generators are often useful when a process needs a non-trivial amount of initialization and/or housekeeping before it returns multiple intermediate results.
As @Ashwini Chaudhary pointed out in a comment, you were sorting the fields lexicographically, not numerically, in your code, so that issue was also corrected.
import operator
def grouper(a_list):
    # Yield runs of consecutive rows that share the same first field.
    if a_list:
        get_items = operator.itemgetter(0, 1)
        # tuple(...) is needed on Python 3, where map objects cannot be compared
        sorted_list = sorted(a_list,
                             key=lambda e: tuple(map(int, get_items(e))))
        g = [sorted_list[0]]
        for x in sorted_list[1:]:
            if x[0] == g[-1][0]:
                g.append(x)
            else:
                yield g
                g = [x]
        yield g
def combiner(a_list):
    return [[g[0][0], ':'.join(e[1] for e in g)] + g[0][2:]
            for g in grouper(a_list)]
a_list = [
    ['1', '2', '3'],
    ['1', '1', '10', '11', '12'],  # element added to test sorting
    ['1', '5', '6'],
    ['2', '8', '9', '8', '17'],
    ['2', '3', '5', '3'],
    ['1', '14', '89', '34', '15'],
]
print(combiner(a_list))
Output:
[['1', '1:2:5:14', '10', '11', '12'], ['2', '3:8', '5', '3']]

Multiple sorting in Python

I have an array with this data:
[['1', '7', '14'], ['1', '1', '3'], ['1', '12', '3'], ['2', '3', '1'], ['1', '4', '9']]
I'd like to sort it by multiple fields:
>>> sorted(datas,key=lambda x:(x[0], x[1]))
[['1', '1', '3'], ['1', '12', '3'], ['1', '4', '9'], ['1', '7', '14'], ['2', '3', '1']]
But after sorting, it seems that 12 < 4. It should be:
[['1', '1', '3'], ['1', '4', '9'], ['1', '7', '14'], ['1', '12', '3'], ['2', '3', '1']]
Any idea? I need not natural sorting.
There is nothing wrong with sorted's behaviour. Your data are lists of strings, so they are compared as strings:
>>> data = ['1', '12', '3', '2']
>>> sorted(data)
['1', '12', '2', '3']
If you want to sort as integers, the values must be converted:
>>> data = [['1', '7', '14'], ['1', '1', '3'], ['1', '12', '3'], ['2', '3', '1'], ['1', '4', '9']]
>>> sorted(data, key=lambda x: list(map(int, x)))
[['1', '1', '3'], ['1', '4', '9'], ['1', '7', '14'], ['1', '12', '3'], ['2', '3', '1']]
Convert x[1] to int(x[1]):
sorted(d,key=lambda x:(int(x[0]), int(x[1])))
Output:
[['1', '1', '3'], ['1', '4', '9'], ['1', '7', '14'], ['1', '12', '3'], ['2', '3', '1']]
You are comparing strings, not ints. Therefore the order you get is the lexicographical order.
If you convert to int first
sorted(data, key=lambda x:(int(x[0]), int(x[1])))
you will get the desired result
[['1', '1', '3'], ['1', '4', '9'], ['1', '7', '14'], ['1', '12', '3'], ['2', '3', '1']]
Currently your sort is working on tuples of string values. Strings are compared in the same way as any other sequence: Python goes character by character from left to right (index 0 to index n-1, where n is the length) until it finds a position where the characters differ. So when comparing '12' and '4', it sees that '4' is greater than '1' and stops right there. This system of ordering is known as lexicographical order.
To see the "value" of a character (In Python, a character is just a string of length 1), just use the ord function:
>>> ord('1')
49
>>> ord('4')
52
And to validate that the string '12' is indeed less than '4' because ord('1') < ord('4'):
>>> '12' < '4'
True
If you want to sort by the integer values of the strings, you have to convert the strings to ints by using the built-in int constructor.
sorted(datas,key=lambda x: (int(x[0]), int(x[1])))
Or if you want to cleanly handle iterables of all sizes, simply use a tuple-generator for the key:
sorted(datas,key=lambda x: tuple(int(e) for e in x))
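To see the difference side by side, here is a small sketch using the data from the question. The names as_strings and as_ints are only illustrative:

```python
datas = [['1', '7', '14'], ['1', '1', '3'], ['1', '12', '3'], ['2', '3', '1'], ['1', '4', '9']]

# String keys compare character by character, so '12' sorts before '4'.
as_strings = sorted(datas, key=lambda x: (x[0], x[1]))

# Integer keys compare numerically, so 4 sorts before 12.
as_ints = sorted(datas, key=lambda x: (int(x[0]), int(x[1])))

print(as_strings)
print(as_ints)
```

The rows themselves stay lists of strings in both cases, because the key function only affects the comparison and not the stored values.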
