Splitting a nested dictionary based on a key's value - Python

Would anyone be able to give me a tip on how to split a nested dictionary into separate nested dictionaries based on a common key name called score and the value of score?
For example, break up this nested dictionary, nested_group_map_original:
nested_group_map_original = {
'group_l1n': {'score': 0.12949562072753906, 'VMA-1-1': '14', 'VMA-1-2': '13', 'VMA-1-3': '15', 'VMA-1-4': '11', 'VMA-1-5': '9', 'VMA-1-7': '7', 'VMA-1-10': '21', 'VMA-1-11': '16'},
'group_l1s': {'score': -0.40303707122802734, 'VMA-1-6': '8', 'VMA-1-8': '6', 'VMA-1-9': '10', 'VMA-1-12': '19', 'VMA-1-13': '20', 'VMA-1-14': '37', 'VMA-1-15': '38', 'VMA-1-16': '39'},
'group_l2n': {'score': 0.6091512680053768, 'VAV-2-1': '12032', 'VAV-2-2': '12033', 'VMA-2-3': '31', 'VMA-2-4': '29', 'VAV-2-5': '12028', 'VMA-2-6': '27', 'VMA-2-7': '30', 'VMA-2-12': '26'},
'group_l2s': {'score': 0.11078681945799929, 'VMA-2-8': '34', 'VAV-2-9': '12035', 'VMA-2-10': '36', 'VMA-2-11': '25', 'VMA-2-13': '23', 'VMA-2-14': '24'}
}
Make it look like the result below, but programmatically, as two separate nested dictionaries named nested_group_map_copy_lows and nested_group_map_copy_highs:
nested_group_map_copy_lows = {
'group_l1s': {'score': -0.40303707122802734, 'VMA-1-6': '8', 'VMA-1-8': '6', 'VMA-1-9': '10', 'VMA-1-12': '19', 'VMA-1-13': '20', 'VMA-1-14': '37', 'VMA-1-15': '38', 'VMA-1-16': '39'},
'group_l2s': {'score': 0.11078681945799929, 'VMA-2-8': '34', 'VAV-2-9': '12035', 'VMA-2-10': '36', 'VMA-2-11': '25', 'VMA-2-13': '23', 'VMA-2-14': '24'}
}
nested_group_map_copy_highs = {
'group_l2n': {'score': 0.6091512680053768, 'VAV-2-1': '12032', 'VAV-2-2': '12033', 'VMA-2-3': '31', 'VMA-2-4': '29', 'VAV-2-5': '12028', 'VMA-2-6': '27', 'VMA-2-7': '30', 'VMA-2-12': '26'},
'group_l1n': {'score': 0.12949562072753906, 'VMA-1-1': '14', 'VMA-1-2': '13', 'VMA-1-3': '15', 'VMA-1-4': '11', 'VMA-1-5': '9', 'VMA-1-7': '7', 'VMA-1-10': '21', 'VMA-1-11': '16'},
}
I'm not really sure how to tackle this. I think I need to use enumerate to create entirely new dictionaries. So far I can collect the scores in a separate list, scores_, and sort them:
scores_ = []
for i in nested_group_map_original:
    scores_.append(nested_group_map_original[i]["score"])
scores_sorted = sorted(scores_, key = float)
Then slice for highest and lowest values:
scores_sorted_highs = scores_sorted[2:]
scores_sorted_lows = scores_sorted[:2]
I am stuck here. I don't think del is the way to go; any tips are greatly appreciated. I know my code doesn't even define new dictionaries, which I think I could do with Python's enumerate, but I'm not sure how to implement that:
for i in nested_group_map_original:
    if nested_group_map_original[i]["score"] in scores_sorted_highs:
        del nested_group_map_original[i]
This errors out:
RuntimeError: dictionary changed size during iteration

You can sort the keys of the original dictionary according to their corresponding scores, like so:
sorted_keys = sorted(nested_group_map_original, key = lambda x: nested_group_map_original[x]['score'])
You can then split into two different dictionaries according to how many values you want in each. Like in your example of two, you could do the following:
scores_sorted_lows = {k:nested_group_map_original[k] for k in sorted_keys[:2]}
scores_sorted_highs = {k:nested_group_map_original[k] for k in sorted_keys[2:]}
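The approach above never deletes anything, which sidesteps the RuntimeError from the question. If deleting from the original dictionary in place is still preferred, a minimal sketch is to loop over a snapshot of the keys instead of the live dictionary. The shortened dictionary and the name high_keys are assumptions made for illustration only:

```python
# Shortened stand-in for nested_group_map_original (illustrative data only).
nested_group_map = {
    "group_l1n": {"score": 0.12949562072753906},
    "group_l1s": {"score": -0.40303707122802734},
    "group_l2n": {"score": 0.6091512680053768},
    "group_l2s": {"score": 0.11078681945799929},
}

# Keys ordered from lowest to highest score.
sorted_keys = sorted(nested_group_map, key=lambda k: nested_group_map[k]["score"])
high_keys = set(sorted_keys[2:])  # hypothetical name: the two highest-scoring groups

# list(...) copies the keys first, so the dictionary can shrink safely inside the loop.
for key in list(nested_group_map):
    if key in high_keys:
        del nested_group_map[key]

print(nested_group_map)  # only the two lowest-scoring groups remain
```

Building new dictionaries, as in the answer above, is usually the simpler option because the original data stays intact.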

If the dict isn't super huge, one easy way to do this is to build the two dictionaries by filtering the original twice with a dict comprehension (is_low_score is a predicate you define yourself):
nested_group_map_low = {k:v for k,v in nested_group_map_original.items() if is_low_score(v["score"])}
nested_group_map_high = {k:v for k,v in nested_group_map_original.items() if not is_low_score(v["score"])}
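The snippet above leaves is_low_score undefined. One possible definition, purely as an assumption, is to treat every score below the median of all scores as low. The sketch below uses a shortened copy of the data and that hypothetical cutoff:

```python
from statistics import median

# Shortened stand-in for nested_group_map_original (illustrative data only).
nested_group_map_original = {
    "group_l1n": {"score": 0.12949562072753906, "VMA-1-1": "14"},
    "group_l1s": {"score": -0.40303707122802734, "VMA-1-6": "8"},
    "group_l2n": {"score": 0.6091512680053768, "VAV-2-1": "12032"},
    "group_l2s": {"score": 0.11078681945799929, "VMA-2-8": "34"},
}

# Assumed cutoff: the median score. Any other threshold would work the same way.
threshold = median(v["score"] for v in nested_group_map_original.values())


def is_low_score(score):
    """Hypothetical predicate: True when the score is below the cutoff."""
    return score < threshold


nested_group_map_low = {k: v for k, v in nested_group_map_original.items()
                        if is_low_score(v["score"])}
nested_group_map_high = {k: v for k, v in nested_group_map_original.items()
                         if not is_low_score(v["score"])}

print(sorted(nested_group_map_low))   # ['group_l1s', 'group_l2s']
print(sorted(nested_group_map_high))  # ['group_l1n', 'group_l2n']
```

With an even number of groups the median splits them in half, which matches the two-and-two split asked for in the question.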

Related

how can I combine list elements inside list based on element value?

If I want to combine lists inside list based on element value how can I achieve that?
Suppose the list is:
lis = [['steve','reporter','12','34','22','98'],['megan','arch','44','98','32','22'],['jack','doctor','80','32','65','20'],['steve','dancer','66','31','54','12']]
Here the sublist starting with 'steve' appears twice, so I want to combine them as below:
new_lis = [['steve','reporter','12','34','22','98','dancer','66','31','54','12'],['megan','arch','44','98','32','22'],['jack','doctor','80','32','65','20']]
I tried below code to achieve this
new_dic = {}
for i in range(len(lis)):
    name = lis[i][0]
    if name in new_dic:
        new_dic[name].append([lis[i][1],lis[i][2],lis[i][3],lis[i][4],lis[i][5]])
    else:
        new_dic[name] = [lis[i][1],lis[i][2],lis[i][3],lis[i][4],lis[i][5]]
print(new_dic)
I ended up creating a dictionary with multiple values of lists as below
{'steve': ['reporter', '12', '34', '22', '98', ['dancer', '66', '31', '54', '12']], 'megan': ['arch', '44', '98', '32', '22'], 'jack': ['doctor', '80', '32', '65', '20']}
but I wanted it as single list so I can convert into below format
new_lis = [['steve','reporter','12','34','22','98','dancer','66','31','54','12'],['megan','arch','44','98','32','22'],['jack','doctor','80','32','65','20']]
Is there a different way to tackle this?
There is a different way to do it, using the groupby function from itertools. There are also ways to convert your dict to a list. It depends on what you want.
from itertools import groupby
lis = [['steve','reporter','12','34','22','98'],['megan','arch','44','98','32','22'],['jack','doctor','80','32','65','20'],['steve','dancer','66','31','54','12']]
lis.sort(key = lambda x: x[0])
output = []
for name, groups in groupby(lis, key = lambda x: x[0]):
    temp_list = [name]
    for group in groups:
        temp_list.extend(group[1:])
    output.append(temp_list)
print(output)
OUTPUT
[['jack', 'doctor', '80', '32', '65', '20'], ['megan', 'arch', '44', '98', '32', '22'], ['steve', 'reporter', '12', '34', '22', '98', 'dancer', '66', '31', '54', '12']]
I'm not sure whether this snippet answers your question. It is not the fastest approach in terms of time complexity. I will update this answer if I find a better way.
lis = [['steve','reporter','12','34','22','98'],['megan','arch','44','98','32','22'],['jack','doctor','80','32','65','20'],['steve','dancer','66','31','54','12']]
new_lis = []
element_value = 'steve'
for inner_lis in lis[:]:  # iterate over a copy, because lis is modified inside the loop
    if element_value in inner_lis:
        if not new_lis:
            new_lis += inner_lis
        else:
            inner_lis.remove(element_value)
            new_lis += inner_lis
        lis.remove(inner_lis)
print([new_lis] + lis)
Output
[['steve', 'reporter', '12', '34', '22', '98', 'dancer', '66', '31', '54', '12'], ['megan', 'arch', '44', '98', '32', '22'], ['jack', 'doctor', '80', '32', '65', '20']]
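The dictionary attempt in the question nests the second 'steve' row because append adds the whole list as a single element, whereas extend adds its items one by one. A minimal sketch of the same idea with extend follows; the name merged is an assumption, and the result relies on dictionaries keeping insertion order (Python 3.7+):

```python
lis = [['steve', 'reporter', '12', '34', '22', '98'],
       ['megan', 'arch', '44', '98', '32', '22'],
       ['jack', 'doctor', '80', '32', '65', '20'],
       ['steve', 'dancer', '66', '31', '54', '12']]

merged = {}  # hypothetical name: maps each name to its combined row
for name, *rest in lis:
    # setdefault starts a row with the name the first time it is seen;
    # extend then adds the remaining items flat instead of as a nested list.
    merged.setdefault(name, [name]).extend(rest)

new_lis = list(merged.values())
print(new_lis)
```

This handles every repeated name in one pass, not just 'steve', and it leaves the original lis unchanged.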

Search exact string in list

I am doing an exercise where I need to search the exact function name from the fun list and get the corresponding information from another list detail.
Here is the dynamic list detail:
csvCpReportContents =[
['[PLT] rand (DEBUG INFO NOT FOUND)', '11', '15'],
['rand', '10', '11', '12'],
['__random_r', '23', '45'],
['__random', '10', '11', '12'],
[],
['multiply_matrices()','23','45'] ]
Here is the fun list, which contains the function names to search for:
fun = ['multiply_matrices()','__random_r','__random']
Expected Output for function fun[2]
['__random', '10', '11', '12']
Expected Output for function fun[1]
['__random_r', '23', '45']
Here is what I have tried for fun[2]:
import re
for i in range(0, len(csvCpReportContents)):
    row = csvCpReportContents[i]
    if len(row) != 0:
        search1 = re.search("\\b" + str(fun[2]).strip() + "\\b", str(row))
        if search1:
            print(csvCpReportContents[i])
Please suggest to me how to search for the exact word and fetch only that information.
For each function in fun, you can iterate through the CSV list and check whether the first element of each row starts with it:
csvCpReportContents = [
['[PLT] rand (DEBUG INFO NOT FOUND)', '11', '15'],
['rand', '10', '11', '12'],
[],
['multiply_matrices()', '23', '45']]
fun=['multiply_matrices()','[PLT] rand','rand']
for f in fun:
    for c in csvCpReportContents:
        if len(c) and c[0].startswith(f):
            print(f'fun function {f} is in csv row {c}')
OUTPUT
fun function multiply_matrices() is in csv row ['multiply_matrices()', '23', '45']
fun function [PLT] rand is in csv row ['[PLT] rand (DEBUG INFO NOT FOUND)', '11', '15']
fun function rand is in csv row ['rand', '10', '11', '12']
Updated code, since you changed the test cases and the requirement in the question. My first answer was based on your original test cases, where you wanted to match rows that start with an item from fun. You now seem to want an exact match, falling back to a starts-with match only if there is no exact match. The code below handles that scenario. Next time, though, please be clear in your question and don't change the criteria after several people have answered.
csvCpReportContents =[
['[PLT] rand (DEBUG INFO NOT FOUND)', '11', '15'],
['rand', '10', '11', '12'],
['__random_r', '23', '45'],
['__random', '10', '11', '12'],
[],
['multiply_matrices()','23','45'] ]
fun = ['multiply_matrices()','__random_r','__random','asd']
for f in fun:
    result = []
    for c in csvCpReportContents:
        if len(c):
            if f == c[0]:
                result = c
            elif not result and c[0].startswith(f):
                result = c
    if result:
        print(f'fun function {f} is in csv row {result}')
    else:
        print(f'fun function {f} is not found in csv')
OUTPUT
fun function multiply_matrices() is in csv row ['multiply_matrices()', '23', '45']
fun function __random_r is in csv row ['__random_r', '23', '45']
fun function __random is in csv row ['__random', '10', '11', '12']
fun function asd is not found in csv
The input above is a nested list, so you have to use 2D indexing. For example, given:
l = [[1,2,3,4],[2,5,7,9]]
to get the element 3,
you have to index it as l[0][2].
With custom search_by_func_name function:
csvCpReportContents = [
['[PLT] rand (DEBUG INFO NOT FOUND)', '11', '15'],
['rand', '10', '11', '12'],
[],
['multiply_matrices()', '23', '45']]
fun = ['multiply_matrices()', '[PLT] rand', 'rand']
def search_by_func_name(name, content_list):
    for lst in content_list:
        if any(i.startswith(name) for i in lst):
            return lst
print(search_by_func_name(fun[1], csvCpReportContents)) # ['[PLT] rand (DEBUG INFO NOT FOUND)', '11', '15']
print(search_by_func_name(fun[2], csvCpReportContents)) # ['rand', '10', '11', '12']
You can also use call_fun function as I did in the below code.
def call_fun(fun_name):
    for ind, i in enumerate(csvCpReportContents):
        if i:
            if i[0].startswith(fun_name):
                return csvCpReportContents[ind]
# call_fun(fun[2])
# ['rand', '10', '11', '12']
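If only exact matches on the first column are wanted, one alternative to scanning with startswith is to index the rows by name once and then look names up directly. This is a sketch under that assumption; rows_by_name is a hypothetical name, and if a function name appeared in several rows the last one would win:

```python
csvCpReportContents = [
    ['[PLT] rand (DEBUG INFO NOT FOUND)', '11', '15'],
    ['rand', '10', '11', '12'],
    ['__random_r', '23', '45'],
    ['__random', '10', '11', '12'],
    [],
    ['multiply_matrices()', '23', '45'],
]
fun = ['multiply_matrices()', '__random_r', '__random']

# Map the first column to its row, skipping empty rows.
rows_by_name = {row[0]: row for row in csvCpReportContents if row}

for name in fun:
    # .get returns None instead of raising when the name is missing.
    print(name, '->', rows_by_name.get(name))
```

Because dictionary keys are compared for equality, '__random' cannot accidentally match '__random_r', and no regular expression escaping is needed for names such as 'multiply_matrices()'.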

Searching relations between values in python dictionary

I'm trying to search relations between values contained in several keys from a dictionary, like this:
dictionary = {'103': ['26', '69', '91', '47', '19', '53'], '022': ['19', '92', '57', '48', '36', '46'], '507': ['47', '13', '91', '24', '74', '27'], '061': ['06', '27', '26', '71', '86', '46'], '875': ['25', '16', '28', '62', '80', '21']}
[value for key, value in dictionary.items() if value in key.lower()]
However, I get this error:
TypeError: 'in <string>' requires string as left operand, not list
And I don't understand why! Can anyone help me?
You can't test for the containment of a list in a string. For strings, the left hand side of the in operator should also be a string.
If you need to find values that contain any string which is contained in the key, you can use any:
lst = [value for key, value in dictionary.items() if any(v in key for v in value)]
print(lst)
# [['06', '27', '26', '71', '86', '46']]
If you need to find the contained string itself, you can loop on the dictionary values and test each string on the key:
lst = [v for key, value in dictionary.items() for v in value if v in key]
print(lst)
# ['06']

How do I use other elements from one list?

I have a list of lists:
data = [['2001', '20', '0', '0', '10', '0', '15', '0'],
['2004', '15', '0', '9.5', '13', '10', '18', '30']]
My work is to use items of sublists in this list of lists:
def FinalMark(studentNum):
    if studentNum in data:
I don't know what to do next. Let's say 2001 is the first item of a sublist; I want to know how to use the other items of that sublist.
There are better ways to do it by storing the data as a dictionary. But with what you have, you can loop through data:
def FinalMark(studentNum):
    for marks in data:
        if marks[0] == studentNum:
            return sum([float(i) for i in marks[1:]])
marks[1:] is a slice of marks that skips the first element (the student number).
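The dictionary alternative mentioned above could look roughly like this. It is a sketch only: marks_by_student and final_mark are hypothetical names, and summing the marks simply mirrors the loop version above rather than any grading rule stated in the question.

```python
data = [['2001', '20', '0', '0', '10', '0', '15', '0'],
        ['2004', '15', '0', '9.5', '13', '10', '18', '30']]

# Build the lookup once: student number -> list of marks as floats.
marks_by_student = {row[0]: [float(m) for m in row[1:]] for row in data}


def final_mark(student_num):
    """Return the sum of a student's marks, or None if the student is unknown."""
    marks = marks_by_student.get(student_num)
    return sum(marks) if marks is not None else None


print(final_mark('2001'))  # 45.0
print(final_mark('2004'))  # 95.5
print(final_mark('9999'))  # None
```

The lookup avoids scanning every sublist on each call, which matters more as the list grows.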

using a for loop to compare lists

The problem at hand is I have a list of lists that I need to iterate through and compare one by one.
import csv
def stockcheck():
    stock = open("Stock.csv", "r")
    reader = csv.reader(stock)
    stockList = []
    for row in reader:
        stockList.append(row)
The output from print(stockList) is:
[['Product', 'Current Stock', 'Reorder Level', 'Target Stock'], ['plain blankets', '5', '10', '50'], ['mugs', '15', '20', '120'], ['100m rope', '60', '15', '70'], ['burner', '90', '20', '100'], ['matches', '52', '10', '60'], ['bucket', '85', '15', '100'], ['spade', '60', '10', '65'], ['wood', '100', '10', '200'], ['sleeping bag', '50', '10', '60'], ['chair', '30', '10', '60']]
I've searched the basics for this but I've had no luck... I'm sure the solution is simple but it's escaping me! Essentially I need to check whether the current stock is less than the re-order level, and if it is, save it to a CSV (that part I can do no problem).
for item in stockList:
    if stockList[1][1] < stockList[1][2]:
        print("do the add to CSV jiggle")
This is as much as I can do but it doesn't iterate through... Any ideas? Thanks in advance!
You could iterate through stockList with a list comprehension and then print the results:
[sl for sl in stockList[1:] if sl[1] < sl[2]]
You will get the following results:
[['mugs', '15', '20', '120']]
In case you were wondering stockList[1:] is to ensure that you ignore the header.
However, you must note that the values are strings that are being compared. Hence, the values are compared char by char. If you want integer comparisons then you must convert the strings to integers, assuming you are absolutely sure that sl[1] and sl[2] will always be integers - just being presented as strings. Just try doing:
[sl for sl in stockList[1:] if int(sl[1]) < int(sl[2])]
The result changes:
[['plain blankets', '5', '10', '50'], ['mugs', '15', '20', '120']]
Use [1:] to skip the header, then make the comparison (converting to int so the values are compared numerically):
for item in stockList[1:]:
    if int(item[1]) < int(item[2]):
        print(item)
        print("do the add to CSV jiggle")
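Putting the pieces together, the whole check can also be sketched with csv.DictReader, which uses the header row for column names so no [1:] slice is needed. The in-memory io.StringIO objects below stand in for Stock.csv and for the output file, and only a few rows of the data are reproduced; both are assumptions made to keep the example self-contained.

```python
import csv
import io

# Stand-in for Stock.csv (a few rows only, for illustration).
stock_csv = io.StringIO(
    "Product,Current Stock,Reorder Level,Target Stock\n"
    "plain blankets,5,10,50\n"
    "mugs,15,20,120\n"
    "100m rope,60,15,70\n"
)

reader = csv.DictReader(stock_csv)

# Convert to int so that '5' < '10' is compared numerically, not character by character.
to_reorder = [row for row in reader
              if int(row["Current Stock"]) < int(row["Reorder Level"])]

# Stand-in for the output CSV file.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=reader.fieldnames)
writer.writeheader()
writer.writerows(to_reorder)

print(out.getvalue())
```

With real files, the two StringIO objects would be replaced by open(..., newline="") calls, ideally inside with blocks so the files are closed automatically.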
