Compare list of list with dictionary - python

My list of lists is:
candidates= [[714, 1023, 768, 1078], [803, 938, 868, 995]]
My dictionary is:
main_dict = {(1561, 48, 1623, 105): [[[1592, 58]],
[[1591, 59]],
[[1585, 59]],
[[1600, 58]]],
(714, 1023, 768, 1078): [[[1, 5]],
[[2, 6]],
[[3, 3]],
[[4, 3]]],
(803, 938, 868, 995): [[[14, 5]],
[[22, 64]],
[[34, 31]],
[[43, 32]]]
}
I would like two lists: candidate_values_exists_in_dict_key, which contains the values of main_dict whose keys appear in candidates, and a second list containing the values of main_dict whose keys do not appear in candidates.
Here is what I tried; it is cluttered and slow. Is there a faster way? Also, how can I add an else branch that collects the values v whose keys are in main_dict but not in candidates?
It is guaranteed that every candidate is a key of main_dict, and that the keys appear in main_dict in the same order as in candidates.
candidate_values_exists_in_dict_key = []
values_of_main_dict_not_in_candidates_values_list = []
for x in candidates:
    for k, v in main_dict.items():
        if x == list(k):
            candidate_values_exists_in_dict_key.append(v)

A plain list comprehension with a dict lookup is enough; there is no need for nested loops:
candidate_values_exists_in_dict_key = [main_dict[tuple(c)] for c in candidates]
values_of_main_dict_not_in_candidates_values_list = [v for k,v in main_dict.items() if list(k) not in candidates]

Given that you want to output a list of values that are not in the candidates in addition to a list of those that are, you are going to have to iterate through the dictionary in one form or another in any case.
Therefore the code in the question is basically fine. But you could improve it slightly by using a set of the candidates (converted to the tuples used as dictionary keys) to test inclusion, rather than having nested loops. Because you know that these are used as dictionary keys, you also know that they can be used as elements of a set.
candidate_tuples = set(map(tuple, candidates))
candidate_values_exists_in_dict_key = []
values_of_main_dict_not_in_candidates_values_list = []
for k, v in main_dict.items():
    if k in candidate_tuples:
        candidate_values_exists_in_dict_key.append(v)
    else:
        values_of_main_dict_not_in_candidates_values_list.append(v)
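One thing to keep in mind about the loop above is that it collects values in dictionary order rather than candidate order. The following is a minimal sketch, assuming candidate order should be preserved for the first list and that every candidate is a key of main_dict (as the question states). The shortened sample data and the names matched and unmatched are made up for illustration.

```python
candidates = [[803, 938, 868, 995], [714, 1023, 768, 1078]]
main_dict = {
    (1561, 48, 1623, 105): "a",
    (714, 1023, 768, 1078): "b",
    (803, 938, 868, 995): "c",
}

# Convert each candidate once; tuples are hashable, lists are not.
candidate_keys = [tuple(c) for c in candidates]
candidate_set = set(candidate_keys)

# Direct lookups keep the order of `candidates`.
matched = [main_dict[k] for k in candidate_keys]
# One pass over the dictionary collects everything else.
unmatched = [v for k, v in main_dict.items() if k not in candidate_set]

print(matched)    # ['c', 'b']
print(unmatched)  # ['a']
```

The set makes each membership test constant time on average, so the whole thing is linear in the size of the dictionary plus the number of candidates.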

Related

assign elements from a list of list to a dictionary

I have this code:
listplayers = [['player1', 5, 1, 300, 100, ..., n], ['player2', 10, 5, 650, 150, ..., n], ['player3', 17, 6, 1100, 1050, ..., n]]
dictionary = {
    'playersname': [],
    'totalwin': [],
    'totalloss': [],
    'moneywon': [],
    'moneyloss': [],
}
for x in listplayers:
    dictionary['playersname'].append(x[0])
    dictionary['totalwin'].append(x[1])
    dictionary['totalloss'].append(x[2])
    dictionary['moneywon'].append(x[3])
    dictionary['moneyloss'].append(x[4])
My output:
dictionary = {
    'playersname': ['player1', 'player2', 'player3', ..., n],
    'totalwin': [5, 10, 17, ..., n],
    'totalloss': [1, 5, 6],
    'moneywon': [300, 650, 1100],
    'moneyloss': [100, 150, 1050],
}
It works fine, but I have to write out every dictionary key and append every item individually
(e.g. dictionary['totalwin'].append(x[1])),
so if I had a dictionary with 30 keys and a list with 30 different player characteristics (e.g. wins, losses, etc.), I would have to write 30 lines.
Is there a way to write the same code in fewer lines (e.g. by looping through everything) instead of writing 30 lines like this:
for x in listplayers:
    dictionary['playersname'].append(x[0])
    dictionary['totalwin'].append(x[1])
    ...
    dictionary['key30'].append(x[29])
If you make a list of keys, you can transpose the player rows with zip(*listplayers), zip the resulting columns with the keys, and pass the whole thing to dict():
listplayers = [['player1',5,1,300,100], ['player2',10,5,650,150], ['player3',17,6,1100,1050]]
keys = ['playersname','totalwins','totalloss','moneywon','moneylost']
dictionary = dict(zip(keys, zip(*listplayers)))
dictionary
# {'playersname': ('player1', 'player2', 'player3'),
# 'totalwins': (5, 10, 17),
# 'totalloss': (1, 5, 6),
# 'moneywon': (300, 650, 1100),
# 'moneylost': (100, 150, 1050)}
Notice that this gives you tuples, not lists. If that's a problem, you can wrap the zips in a dict comprehension or use map to convert them:
dictionary = {key: list(values) for key, values in zip(keys, zip(*listplayers))}
or
dictionary = dict(zip(keys, map(list,zip(*listplayers))))
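To see why this works, it may help to look at the intermediate step: zip(*listplayers) transposes the rows into columns, and the outer zip pairs each column with its key. Below is a small sketch using the same data (the variable name columns is introduced here for illustration). Note that zip silently stops at the shortest input, so a missing key or a short row would drop data without raising an error.

```python
listplayers = [['player1', 5, 1, 300, 100],
               ['player2', 10, 5, 650, 150],
               ['player3', 17, 6, 1100, 1050]]
keys = ['playersname', 'totalwins', 'totalloss', 'moneywon', 'moneylost']

# Step 1: unpack the rows and zip them, which yields one tuple per column.
columns = list(zip(*listplayers))
print(columns[0])  # ('player1', 'player2', 'player3')

# Step 2: pair each key with its column and convert the column to a list.
dictionary = {key: list(col) for key, col in zip(keys, columns)}
print(dictionary['moneywon'])  # [300, 650, 1100]
```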
You could do the following.
list1 = [['player1', 5,1,300,100],['player2', 10,5,650,150]]
dictionary = {f'key_{i}':[*x] for i,x in enumerate(zip(*list1))}
The resulting dictionary:
{'key_0': ['player1', 'player2'],
'key_1': [5, 10],
'key_2': [1, 5],
'key_3': [300, 650],
'key_4': [100, 150]}
Or, if you have some key names in mind:
list1 = [['player1', 5,1,300,100],['player2', 10,5,650,150]]
keys = ['playersname',
'totalwin',
'totalloss',
'moneywon',
'moneyloss']
{keys[i]:[*x] for i,x in enumerate(zip(*list1))}
The result:
{'playersname': ['player1', 'player2'],
'totalwin': [5, 10],
'totalloss': [1, 5],
'moneywon': [300, 650],
'moneyloss': [100, 150]}

Calculating the average vector for each unique element in a list

I have a list of the form:
mylist =[([256, 408, 147, 628], 'size'), ([628, 526, 236, 676], 'camera'),
([526, 876, 676, 541], 'camera'), ([567, 731, 724, 203], 'size'),.....]
It has around 8000+ entries.
It contains many duplicates: there are actually only 100 unique words in the list, so I would like to reduce it to 100 entries (one per unique word) by taking the average vector over every occurrence of each word.
For example, my new list will have the form:
newlist = [([411.5, 569.5, 435.5, 415.5], 'size'), .....]  # average of the 'size' vectors; repeat for each unique word
and will be of length 100.
How would I do this?
You can do this by collecting all the data for each 'key' into a dict, then work out the average for each element in each list assigned to that key. Something like:
from statistics import mean
data = [([1, 2, 3, 4], 'size'), ([10, 20, 30, 40], 'camera'),
        ([100, 200, 300, 400], 'camera'), ([10, 20, 30, 40], 'size')]
# Group the vectors by their word
ddata = {}
for entry in data:
    key = entry[-1]
    if key not in ddata:
        ddata[key] = []
    ddata[key].append(entry[0])
#print(ddata)
# Average each position across all vectors of a word
out = []
for k, v in ddata.items():
    out.append((list(map(mean, zip(*v))), k))
print(out)
# [([5.5, 11, 16.5, 22], 'size'), ([55, 110, 165, 220], 'camera')]
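The same grouping can be written a little more compactly with collections.defaultdict, which removes the explicit key check. This is a sketch of that variant rather than a different algorithm; the helper name average_by_word is made up for illustration, and the sample vectors are the four shown in the question.

```python
from collections import defaultdict
from statistics import mean

def average_by_word(pairs):
    """Return [(mean_vector, word), ...] with one entry per unique word."""
    groups = defaultdict(list)
    for vector, word in pairs:
        groups[word].append(vector)
    # zip(*vectors) walks the vectors position by position.
    return [([mean(col) for col in zip(*vectors)], word)
            for word, vectors in groups.items()]

mylist = [([256, 408, 147, 628], 'size'), ([628, 526, 236, 676], 'camera'),
          ([526, 876, 676, 541], 'camera'), ([567, 731, 724, 203], 'size')]
newlist = average_by_word(mylist)
print(newlist[0])  # ([411.5, 569.5, 435.5, 415.5], 'size')
```

Because dictionaries keep insertion order, the words come out in the order in which they first appear in the input.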
You can try this. The final output is my_new_list, so print it at the end to check the result. This uses NumPy, and because the unique names come from a set, the order of the entries is arbitrary.
import numpy as np
mylist_names = set(item[1] for item in mylist)
my_new_list = []
for name in mylist_names:
    # Collect every vector for this word, then average column-wise
    name_list = [item[0] for item in mylist if item[1] == name]
    name_list = np.mean(name_list, axis=0).tolist()
    my_new_list.append((name_list, name))
print(my_new_list)

Create new list based on ordered elements of two lists

I have two lists and want to create a new list that alternates between the elements of each.
list1 = [23, 57, 223, 246, 286, 429]
list2 = [17, 138, 425, 680, 535, 1063]
and I want a new list such as:
list3 = [23, 17, 57, 138, 223, 425]
which is
list3 = [list1[0], list2[0], list1[1], list2[1], ...]
How should I proceed? I know that append doesn't work, and neither does zip.
You can use itertools.chain to flatten the output of zip:
from itertools import chain
list3 = [*chain(*zip(list1, list2))]
Or plainer with a nested comprehension:
list3 = [x for pair in zip(list1, list2) for x in pair]
If you only need some part of the input lists, just use an appropriate slice:
list3 = [*chain(*zip(list1[:3], list2))]
You can use double for loop. One to iterate over the zipped tuple and one to iterate over the elements of the tuple:
x=[k for j in zip(list1,list2) for k in j]
Trying for a one-liner is nice, but it does mean that whoever reads it (including you in a few months' time) is going to have to sit there working through what it does.
It may be much more immediately readable to do it the slightly more verbose way:
out = []
for x in zip(list1, list2):
    out.extend(x)
Loop over the indices of the lists and append one element from each list to the result on every iteration:
list1 = [23, 57, 223, 246, 286, 429]
list2 = [17, 138, 425, 680, 535, 1063]
result = []
for i in range(len(list1)):
    result.append(list1[i])
    result.append(list2[i])
print(result)
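All of the approaches above rely on the two lists having the same length: zip stops at the shorter input, and the index loop raises IndexError if list2 is shorter than list1. A sketch for inputs of unequal length, assuming leftover items should simply follow in order, could use itertools.zip_longest with a sentinel. The names interleave and _MISSING are made up here.

```python
from itertools import zip_longest

_MISSING = object()  # sentinel that cannot collide with real data

def interleave(a, b):
    """Alternate items from a and b; leftovers from the longer list follow."""
    return [x for pair in zip_longest(a, b, fillvalue=_MISSING)
            for x in pair if x is not _MISSING]

list1 = [23, 57, 223, 246, 286, 429]
list2 = [17, 138, 425, 680, 535, 1063]
print(interleave(list1, list2)[:6])  # [23, 17, 57, 138, 223, 425]
print(interleave([1, 2, 3], [10]))   # [1, 10, 2, 3]
```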

Python intersection of arrays in dictionary

I have a dictionary of arrays like this:
y_dict= {1: np.array([5, 124, 169, 111, 122, 184]),
2: np.array([1, 2, 3, 4, 5, 6, 111, 184]),
3: np.array([169, 5, 111, 152]),
4: np.array([0, 567, 5, 78, 90, 111]),
5: np.array([]),
6: np.array([])}
I need to find the intersection of the arrays in my dictionary, y_dict.
As a first step, I removed the empty arrays from the dictionary like this:
dic = {i:j for i,j in y_dict.items() if np.array(j).size != 0}
So dic now looks like this:
dic = { 1: np.array([5, 124, 169, 111, 122, 184]),
2: np.array([1, 2, 3, 4, 5, 6, 111, 184]),
3: np.array([169, 5, 111, 152]),
4: np.array([0, 567, 5, 78, 90, 111])}
To find the intersection, I tried a tuple-based approach like this:
result_dic = list(set.intersection(*({tuple(p) for p in v} for v in dic.values())))
Actual result is empty list: [];
Expected result should be: [5, 111]
Could you please help me to find intersection of arrays in dictionary? Thanks
The code you posted is more complex than necessary, and it is wrong because it has one extra inner iteration that needs to go. You want:
result_dic = list(set.intersection(*(set(v) for v in dic.values())))
or with map and without a for loop:
result_dic = list(set.intersection(*(map(set,dic.values()))))
Result:
[5, 111]
How it works:
- iterate over the values (ignoring the keys)
- convert each NumPy array to a set (converting to tuple also works, but intersection would convert those to sets anyway)
- pass the lot to set.intersection with argument unpacking
We can even skip the preliminary step of removing the empty arrays from the dictionary by building a set from every array and dropping the empty sets with filter:
result_dic = list(set.intersection(*(filter(None,map(set,y_dict.values())))))
That's for the sake of a one-liner; in real code, it is better to break the expression up so it is more readable and easier to comment. Breaking it up also lets us avoid the TypeError that set.intersection raises when it is called with no arguments (which happens when there are no non-empty sets), a weakness of this otherwise neat way of intersecting sets (first described in Best way to find the intersection of multiple sets?).
Just build the list of sets beforehand, and call intersection only if that list is not empty. If it is empty, use an empty list instead:
non_empty_sets = [set(x) for x in y_dict.values() if x.size]
result_dic = list(set.intersection(*non_empty_sets)) if non_empty_sets else []
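Wrapped up as a function, the guarded version might look like the sketch below. The function name common_elements is made up, plain lists stand in for the NumPy arrays so that the example runs without NumPy (len() behaves the same for 1-D arrays), and returning a sorted list is a choice made here to get a predictable order.

```python
def common_elements(arrays_by_key):
    """Intersect all non-empty sequences; return [] if there are none."""
    non_empty_sets = [set(a) for a in arrays_by_key.values() if len(a)]
    if not non_empty_sets:
        return []
    return sorted(set.intersection(*non_empty_sets))

y_dict = {1: [5, 124, 169, 111, 122, 184],
          2: [1, 2, 3, 4, 5, 6, 111, 184],
          3: [169, 5, 111, 152],
          4: [0, 567, 5, 78, 90, 111],
          5: [],
          6: []}
print(common_elements(y_dict))          # [5, 111]
print(common_elements({1: [], 2: []}))  # []
```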
You should use NumPy's own intersection function, np.intersect1d, here rather than converting to Python sets. You will also need special handling for the case where there is nothing to intersect.
>>> intersection = None
>>> for a in y_dict.values():
...     if a.size:
...         if intersection is None:
...             intersection = a
...             continue
...         intersection = np.intersect1d(intersection, a)
...
>>> if intersection is not None:
...     print(intersection)
...
[  5 111]
If intersection is still None at the end, all of the arrays in y_dict had size zero (no elements). In that case the intersection is not well-defined, and you have to decide for yourself what the code should do: probably raise an exception, but it depends on the use case.

Python- printing n'th level sublist

I have a complicated list arrangement. There are many lists, and some of them have sub-lists. Some of the elements from these lists are to be printed. What makes it more complicated is that the index of each value to be printed is stored in an Excel file, as shown here:
[list_1,1,2] # Means list_1[1][2] is to be printed (sub-lists are there)
[list_2,7] # Means list_2[7] is to be printed (no sub-list)
................
[list_100,3,6] # Means list_100[3][6] is to be printed (sub-list is there)
There are so many lists that I was using a for loop and multiple if statements. For example (pseudocode):
for i in range(100):  # because there are 100 lists in the Excel file
    if len(row_i) == 3:
        print(list_name[excel_column_1_value][excel_column_2_value])
    else:
        print(list_name[excel_column_1_value])
Please note that the Excel sheet is only used to get the list name and indexes; the lists themselves are all defined in the main code.
Is there any way to avoid the if statements and automate that part as well? I am asking because the if condition depends only on the row length given by the Excel sheet. Thanks in advance.
Suppose you have data like this:
data = {
"list1": [[100, 101, 102], [110, 111, 112], [120, 121, 123]],
"list2": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
"list3": [[200, 201, 202], [210, 211, 212], [220, 221, 223]],
}
If this is homework, your teacher probably wants you to solve it using recursion, but I recommend an iterative version in Python unless you can be sure the recursion will never go deeper than about 999 calls (the default recursion limit is 1000):
def fetch_element(data, listname, *indices):
    # Start at the named list, then descend one level per index
    value = data[listname]
    for index in indices:
        value = value[index]
    return value
Then you have the list of elements you want:
desired = [
["list1", 0, 0],
["list2", 7],
["list3", 2, 2],
]
Now you can do:
>>> [fetch_element(data, *line) for line in desired]
[100, 7, 223]
Which is the same as:
>>> [data["list1"][0][0], data["list2"][7], data["list3"][2][2]]
[100, 7, 223]
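The loop in fetch_element is a left fold over the indices, so the same idea can also be expressed with functools.reduce and operator.getitem. This is only an alternative spelling under the same assumptions (the name fetch_element_reduce is made up here); the explicit loop is arguably easier to read.

```python
from functools import reduce
from operator import getitem

data = {
    "list1": [[100, 101, 102], [110, 111, 112], [120, 121, 123]],
    "list2": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
}

def fetch_element_reduce(data, listname, *indices):
    # reduce applies getitem repeatedly: ((data[listname])[i0])[i1] ...
    return reduce(getitem, indices, data[listname])

print(fetch_element_reduce(data, "list1", 2, 2))  # 123
print(fetch_element_reduce(data, "list2", 7))     # 7
```

With no indices at all, reduce returns its initial value, so the function hands back the whole named list.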
Can you post a better example? What do your lists look like, and what is the desired output when printing?
You can open the file, read the list names and indexes you want to print into a list, and iterate over that list to print what you want.
There are many ways to print a list. Two simple ones:
mylist = ['hello', 'world', ':)']
print(', '.join(mylist))
mylist2 = [['hello', 'world'], ['Good', 'morning']]
for l in mylist2:
    print(*l)
