Finding largest areas in dictionary - python

I'm writing a function where I go through a dictionary. The dictionary contains artists as keys and their paintings as values. I need to find the painting in a dictionary that has the largest area and if there are two that have equal area they should be returned as a list of tuples.
Example Dictionary:
{
'A, Jr.':[("One",1400,10,20.5,"oil paint","Austria"),("Three",1400,100.0,100.0,"oil paint","France"),("Twenty",1410,50.0,200.0,"oil paint","France")],
'X':[("Eight",1460, 100.0, 20.0, "oil paint","France"),("Six",1465,10.0, 23.0, "oil paint", "France"),("Ten",1465,12.0,15.0,"oil paint","Austria"),("Thirty",1466,30.0,30.0,"watercolor","Germany")],
'M':[("One, Two", 1500, 10.0, 10.0, "panel","Germany")]
}
Basically the four digit number is the year that the painting or work of art was created and the next two numbers are the length and width. I need to return the values that have the largest area when multiplying the lengths and widths. So for the above dictionary the function find_largest should return
find_largest(dictionary2())
[('A, Jr.', 'Three'), ('A, Jr.', 'Twenty')]
Since 100 * 100 = 10,000 for the "Three" painting and 50 * 200 = 10,000 for the "Twenty" painting they are both returned as tuples within a list.
Does anyone have advice on how to do this? I have started the code below, but I don't think it's the right approach.
def find_largest(dictionary):
    matches = {}
    for key, the_list in db.items():
        for record in the_list:
            value = record[4]
            if dictionary in record:
                if key in matches:
                    max(the_list)
                    max(lst, key=lambda tupl: tupl[2]*tupl[3])
                    matches[key].append(record)
                else:
                    matches[key] = [record]
    return matches
This is basically my code from an earlier function with a few significant changes. This basic framework has worked for a few of my goals. I added max(matches) but I realize this isn't doing much unless the function multiplies the lengths and widths and then looks for the max. If anyone has advice it would be helpful

It would probably be easier to just keep track of your current max instead:
data = {
'A, Jr.':[("One",1400,10,20.5,"oil paint","Austria"),("Three",1400,100.0,100.0,"oil paint","France"),("Twenty",1410,50.0,200.0,"oil paint","France")],
'X':[("Eight",1460, 100.0, 20.0, "oil paint","France"),("Six",1465,10.0, 23.0, "oil paint", "France"),("Ten",1465,12.0,15.0,"oil paint","Austria"),("Thirty",1466,30.0,30.0,"watercolor","Germany")],
'M':[("One, Two", 1500, 10.0, 10.0, "panel","Germany")]
}
def find_largest(d):
    matches = []
    max_value = 0
    for key in d:
        for record in d[key]:
            value = record[2] * record[3]
            if value > max_value:
                matches = [(key, record[0])]
                max_value = value
            elif value == max_value:
                matches.append((key, record[0]))
    return matches
# Output
>>> find_largest(data)
[('A, Jr.', 'Three'), ('A, Jr.', 'Twenty')]
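A two-pass variant of the same idea, as a minimal sketch: find the largest area with max() first, then collect every (artist, title) pair that reaches it. It assumes the record layout from the question (title at index 0, length and width at indices 2 and 3); the name find_largest_two_pass is made up for illustration.

```python
def find_largest_two_pass(d):
    # Flatten the dictionary into (artist, title, area) triples.
    areas = [(artist, rec[0], rec[2] * rec[3])
             for artist, records in d.items()
             for rec in records]
    if not areas:
        return []  # nothing to compare
    largest = max(area for _, _, area in areas)
    # Keep every painting whose area ties with the maximum.
    return [(artist, title) for artist, title, area in areas if area == largest]

data = {
    'A, Jr.': [("One", 1400, 10, 20.5, "oil paint", "Austria"),
               ("Three", 1400, 100.0, 100.0, "oil paint", "France"),
               ("Twenty", 1410, 50.0, 200.0, "oil paint", "France")],
    'M': [("One, Two", 1500, 10.0, 10.0, "panel", "Germany")],
}
print(find_largest_two_pass(data))
# [('A, Jr.', 'Three'), ('A, Jr.', 'Twenty')]
```

Because it does not start from max_value = 0, this version returns an empty list for an empty dictionary and does not rely on areas being positive.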

Related

Dictionary to classify speed of objects in a list

I'm having difficulty in using a dictionary to classify speed of objects.
Input:
Object_dict={"Airbus 380":{"Country":"France,Germany,Spain,UK","Top Speed(Mach)":0.89},
"Concorde":{"Country":"France,UK","Top Speed(Mach)":2.01},
"Boeing X-43":{"Country": "USA","Top Speed(Mach)":9.6}}
Output:
Objects_by_Mach={"Subsonic":["Airbus 380"],"Transonic":[],"Supersonic":["Concorde"],"Hypersonic":["Boeing X-43"]}
This is my code:
Mach_scale = {"Subsonic": 0,
"Transonic": 1,
"Supersonic":5,
"Hypersonic":5 ,
}
#Subsonic object has speed of Mach<0
#Transsonic object has speed of Mach=1
#Supersonic object has speed of 1<Mach<5
#Hypersonic object has speed of Mach>5
def mach_speeds(dict1):
    Objects_by_Mach={}
    for object,data in dict1.items():
        for value in data["Top Speed(Mach)"]:
            Subsonic=[object for object in dict1 if value<=Mach_scale["Transonic"] and value>Mach_scale["Subsonic"] in dict1["Top Speed(Mach)"] in dict1.values()]
            Transonic=[object for object in dict1 if value==Mach_scale["Transonic"] in Mach_scale["Top Speed(Mach)"] in dict1.values()]
            Supersonic=[object for object in dict1 if value<=Mach_scale["Supersonic"] and value>Mach_scale["Transonic"] in dict1["Top Speed(Mach)"] in dict1.values()]
            Hypersonic=[object for object in dict1 if value>Mach_scale["Hypersonic"] in dict1["Top Speed(Mach)"] in dict1.values()]
    return Objects_by_Mach.update({"Subsonic":Subsonic,"Transonic":Transonic,"Supersonic":Supersonic,"Hypersonic":Hypersonic})
print(mach_speeds(Object_dict))
Thanks in advance again fellow SO'ers.
You can generalise and therefore shorten your code by specifying ranges for the Mach scale. The values used here may not be correct but can be easily adjusted to suit.
Mach_scale = {"Subsonic": (0.0, 0.8),
"Transonic": (0.8, 1.2),
"Supersonic": (1.2, 5.0),
"Hypersonic": (5.0, 10.0),
"High-hypersonic": (10.0, float('inf'))
}
Object_dict = {"Airbus 380": {"Country": "France,Germany,Spain,UK", "Top Speed(Mach)": 0.89},
"Concorde": {"Country": "France,UK", "Top Speed(Mach)": 2.01},
"Boeing X-43": {"Country": "USA", "Top Speed(Mach)": 9.6}}
result = dict()
def getmach(m):
    for k, v in Mach_scale.items():
        if m >= v[0] and m < v[1]:
            return k
for k, v in Object_dict.items():
    result.setdefault(getmach(v['Top Speed(Mach)']), []).append(k)
print(result)
Output:
{'Transonic': ['Airbus 380'], 'Supersonic': ['Concorde'], 'Hypersonic': ['Boeing X-43']}
First of all, please note that this question is very specific and will most likely help only you. We love questions that are general and will help as many people as possible!
There are some things in your code that are considered bad practice and are problematic.
First, it looks like the indentation is not correct.
Second, don't use object as a variable name. It is not a reserved word, but it shadows Python's built-in object type, so pick a different name.
Third, notice that in each iteration you are creating new lists. It looks to me like you want to update the existing ones, not create new ones.
I would try something like the following code:
input = {
"Airbus 380":{"Country":"France,Germany,Spain,UK","Top Speed(Mach)":0.89},
"Concorde":{"Country":"France,UK","Top Speed(Mach)":2.01},
"Boeing X-43":{"Country": "USA","Top Speed(Mach)":9.6}
}
mach_scale = {"Subsonic": 0.8,
"Transonic": 1.2,
"Supersonic":5,
"Hypersonic":5
}
"""
subsonic speed - below 0.8 mach
transonic speed - between 0.8 - 1.2 mach
supersonic speed - between 1.2 - 5 mach
hypersonic speed - above 5 mach
"""
def mach_speeds(airplane_data):
    subsonic, transonic, supersonic, hypersonic = [], [], [], []
    for plane, data in airplane_data.items():
        top_speed = data["Top Speed(Mach)"]
        if top_speed <= mach_scale["Subsonic"]:
            subsonic.append(plane)
        elif top_speed <= mach_scale["Transonic"]:
            transonic.append(plane)
        elif top_speed <= mach_scale["Supersonic"]:
            supersonic.append(plane)
        else:
            hypersonic.append(plane)
    result = {}
    result["Subsonic"] = subsonic
    result["Transonic"] = transonic
    result["Supersonic"] = supersonic
    result["Hypersonic"] = hypersonic
    return result
if __name__ == "__main__":
    print(mach_speeds(input))
Output:
{'Subsonic': [], 'Transonic': ['Airbus 380'], 'Supersonic': ['Concorde'], 'Hypersonic': ['Boeing X-43']}
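The same threshold lookup can be sketched with the standard library's bisect module, which finds the band a speed falls into without a chain of if/elif tests. The boundary values 0.8, 1.2 and 5.0 are taken from the answers above and may need adjusting; the names BOUNDS, LABELS and classify are made up for illustration.

```python
from bisect import bisect_right

# Upper bounds of the first three regimes; the last regime is open-ended.
BOUNDS = [0.8, 1.2, 5.0]
LABELS = ["Subsonic", "Transonic", "Supersonic", "Hypersonic"]

def classify(objects):
    result = {label: [] for label in LABELS}
    for name, info in objects.items():
        # bisect_right returns how many bounds are <= the speed,
        # which is exactly the index of the matching label.
        idx = bisect_right(BOUNDS, info["Top Speed(Mach)"])
        result[LABELS[idx]].append(name)
    return result

objects = {
    "Airbus 380": {"Country": "France,Germany,Spain,UK", "Top Speed(Mach)": 0.89},
    "Concorde": {"Country": "France,UK", "Top Speed(Mach)": 2.01},
    "Boeing X-43": {"Country": "USA", "Top Speed(Mach)": 9.6},
}
print(classify(objects))
# {'Subsonic': [], 'Transonic': ['Airbus 380'], 'Supersonic': ['Concorde'], 'Hypersonic': ['Boeing X-43']}
```

With bisect_right, a speed exactly on a boundary (for example Mach 0.8) falls into the higher band, which matches the half-open ranges of the first answer rather than the <= comparisons of the second.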

How to check a list for duplicates and add values if there are any?

I'm a total beginner at coding and just need help with a few things.
My dream is to write a smart shopping list that automatically detects duplicates and adds up the weights of duplicate products.
I get the shopping list from an external file which has the following form:
weight\n
ingredient\n
eg.
60
eggs
120
beef meat
25
pasta
120
eggs
etc...
I convert these files to dictionaries with this code:
final_list = []
def get_list(day_list):
    for day in range(len(day_list)):
        day += 1
        day_to_open = f'Days/day{str(day)}.txt'
        with open(day_to_open, 'r') as file:
            day1 = file.readlines()
            day1 = [item.rstrip() for item in day1]
            x = 0
            y = 1
            list = []
            for item in range(0, len(day1), 2):
                dictio = {day1[y]: day1[x]}
                x += 2
                y += 2
                list.append(dictio)
            final_list.append(list)
    list = []
    for item in final_list:
        list += item
    return list
days = [1, 2, 3]
list = get_list(day_list=days)
Finally, I get a list of dictionaries like this:
[{'eggs': '60'}, {'beef meat': '120'}, {'pasta': '25'}, {'eggs': '120'}]
How can I iterate through the dictionary to check if any products are repeating, and if so leave one with the added weight?
For three weeks I have been trying to solve it, unfortunately to no avail.
Thank you very much for all your help!
#Edit
my goal is to make it look like this:
[{'eggs': 180}, {'beef meat': 120}, {'pasta': 25}]
#egg weight added (120 + 60)#
lis = [{'eggs': '60'}, {'beef meat': '120'}, {'pasta': '25'}, {'eggs': '120'}]
# make 1 dict from list of dicts and update max value
new = {}
for d in lis:
    for k, v in d.items():
        if (k not in new) or (int(v) > int(new[k])):
            new[k] = v
# rebuild list of dicts
lis = [{k:v} for k, v in new.items()]
print(lis)
# [{'eggs': '120'}, {'beef meat': '120'}, {'pasta': '25'}]
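Note that the snippet above keeps the largest weight per product rather than the sum. If the goal is the total shown in the question's edit (eggs: 180), a minimal sketch that adds the weights instead could look like this; the names totals and merged are made up for illustration.

```python
lis = [{'eggs': '60'}, {'beef meat': '120'}, {'pasta': '25'}, {'eggs': '120'}]

# Collapse the list of one-item dicts into a single dict of summed weights.
totals = {}
for d in lis:
    for name, weight in d.items():
        # Weights arrive as strings, so convert before adding.
        totals[name] = totals.get(name, 0) + int(weight)

# Rebuild the list-of-dicts shape requested in the question.
merged = [{name: weight} for name, weight in totals.items()]
print(merged)
# [{'eggs': 180}, {'beef meat': 120}, {'pasta': 25}]
```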
As ShadowRanger has pointed out, it's not common practice to have a list of multiple dictionaries as you have done. Dictionaries are very useful if used correctly.
I'm not entirely sure of the structure of the files you are reading, so I will just explain a way forward and leave it up to you to implement it. What I would suggest is that you first initialize a dictionary with all the necessary keys (ingredients in your case), with each of the values set to 0 (as an integer or float, rather than a string), so you would get a dictionary like this:
shopping_list = {'eggs': 0, 'beef meat': 0, 'pasta': 0}
Then, you will be able to access each of the values by calling the shopping_list dictionary and specifying the key of interest. For example, if you wanted to print the value of eggs, you would write:
print(shopping_list['eggs']) # this would print 0
You can then easily increase/decrease a value of interest; for example, to add 10 to pasta, you would write:
shopping_list['pasta'] += 10
Using this method, you can then iterate through each of your items, select the ingredient of interest and add the weight. So if you have duplicates, it will just add to the same ingredient. Again, I'm not sure of the structure of the files you are reading, but it would be something along the lines of:
for ingredient, weight in file:
    shopping_list[ingredient] += weight
Good luck with your dream - all the best!
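To make the idea above concrete, here is a minimal sketch that assumes the file layout described in the question: a weight line followed by an ingredient line, repeated. The function name total_weights is made up for illustration, and a list of strings stands in for the lines returned by file.readlines().

```python
def total_weights(lines):
    """Sum weights per ingredient from alternating weight/ingredient lines."""
    cleaned = [line.strip() for line in lines if line.strip()]
    shopping_list = {}
    # Even positions hold weights, odd positions hold ingredient names.
    for weight, ingredient in zip(cleaned[0::2], cleaned[1::2]):
        shopping_list[ingredient] = shopping_list.get(ingredient, 0) + int(weight)
    return shopping_list

sample = ["60\n", "eggs\n", "120\n", "beef meat\n", "25\n", "pasta\n", "120\n", "eggs\n"]
print(total_weights(sample))
# {'eggs': 180, 'beef meat': 120, 'pasta': 25}
```

Using dict.get with a default of 0 avoids having to know every ingredient up front, so the dictionary does not need to be pre-filled with zeros.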

Iterate over part of tuple key in Python dictionary

I am working on an optimization project where I have a series of dictionaries with tuples as keys and another dictionary (a decision variable with Gurobi) where the key is the first element of the tuples in the other dictionaries. I need to be able to do the following:
data1 = {(place, person): q}
data2 = {person: s}
x = {place: var}
qx = {k: x[k]*data1[k] for k in x}
total1 = {}
for key, value in qx.items():
    person = key[1]
    if person in total1:
        total1[person] = total1[person] + value
    else:
        total1[person] = value
total2 = {k: total1[k]/data2[k] for k in total1}
(Please note that the data1, data2, and x dictionaries are very large, 10,000+ distinct place/person pairs).
This same process works when I use the raw data in place of the decision variable, which uses the same (place, person) key. Unfortunately, my variable within the Gurobi model itself must be a dictionary and it cannot contain the person key value.
Is there any way to iterate over just the first value in the tuple key?
EDIT:
Here are some sample values (sensitive data, so placeholder values):
data1 = {(1, a): 28, (1, c): 57, (2, b): 125}
data2 = {a: 7.8, b: 8.5, c: 8.4}
x = {1: 0.002, 2: 0.013}
Values in data1 are all integers, data2 are hours, and x are small decimals.
Outputs in total2 should look similar to the following (assuming there are many other rows for each person):
total2 = {a: 0.85, b: 1.2, c: 1.01}
This code is essentially calculating a "productivity score" for each person. The decision variable, x, is looking only at each individual place for business purposes, so it cannot include the person identifiers. Also, the Gurobi package is very limiting about how things can be formatted, so I have not found a way to even use the tuple key for x.
Generally, the most efficient way to aggregate values into bins is to use a for loop and store the values in a dictionary, as you did with total1 in your example. In the code below, I have fixed your qx line so it runs, but I don't know if this matches your intention. I also used total1.setdefault to streamline the code a little:
a, b, c = 'a', 'b', 'c'
data1 = {(1, a): 28, (1, c): 57, (2, b): 125}
data2 = {a: 7.8, b: 8.5, c: 8.4}
x = {1: 0.002, 2: 0.013}
qx = {(place, person): x[place] * value for (place, person), value in data1.items()}
total1 = {}
for (place, person), value in qx.items():
    total1.setdefault(person, 0.0)
    total1[person] += value
total2 = {k: total1[k] / data2[k] for k in total1}
print(total2)
# {'a': 0.0071794871794871795, 'c': 0.013571428571428571, 'b': 0.19117647058823528}
But this doesn't produce the result you asked for. I can't tell at a glance how you get the result you showed, but this may help you move in the right direction.
It might also be easier to read if you moved the qx logic into the loop, like this:
total1 = {}
for (place, person), value in data1.items():
    total1.setdefault(person, 0.0)
    total1[person] += x[place] * value
total2 = {k: total1[k] / data2[k] for k in total1}
Or, if you want to do this often, it might be worth creating a cross-reference between persons and their matching places, as @martijn-pieters suggested (note, you still need a for loop to do the initial cross-referencing):
# create a list of valid places for each person
places_for_person = {}
for place, person in data1:
    places_for_person.setdefault(person, [])
    places_for_person[person].append(place)
# now do the calculation
total2 = {
    person:
        sum(
            data1[place, person] * x[place]
            for place in places_for_person[person]
        ) / data2[person]
    for person in data2
}
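The per-person aggregation can also be sketched with collections.defaultdict, which removes the explicit setdefault call. This uses the placeholder values from the question, with plain floats standing in for the decision variables; whether the same accumulation works unchanged on solver variables is an assumption that is not verified here.

```python
from collections import defaultdict

data1 = {(1, 'a'): 28, (1, 'c'): 57, (2, 'b'): 125}
data2 = {'a': 7.8, 'b': 8.5, 'c': 8.4}
x = {1: 0.002, 2: 0.013}

# Unpack the tuple key so x can be indexed by place alone.
total1 = defaultdict(float)
for (place, person), q in data1.items():
    total1[person] += x[place] * q

total2 = {person: total / data2[person] for person, total in total1.items()}
print(total2)
```

The key point is the tuple unpacking in the for statement: the loop sees both parts of each (place, person) key, while x is only ever indexed by the first part.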
For creating a new dictionary removing the tuple:
a, b, c = "a", "b", "c"
data1 = {(1, a): 28, (1, c): 57, (2, b): 125}
total = list()
spot = 0
for key in data1:
    total.append([key[1]]) # Start a new list holding the person; [key[1]] also works for names longer than one character
    total[spot].append(data1[key]) # Append the value to the list at the current spot
    spot += 1 # Move on to the next spot in the list
total = dict(total) # Convert the [person, value] pairs to a dictionary
print(total)
Output:
{'a': 28, 'c': 57, 'b': 125}

Merging duplicate lists and deleting a field in each list depending on the value in Python

I am still a beginner in Python. I have a list of tuples to be filtered, merged and sorted.
The list looks like this:
id, ts, val
tup = [(213,5,10.0),
(214,5,20.0),
(215,5,30.0),
(313,5,60.0),
(314,5,70.0),
(315,5,80.0),
(213,10,11.0),
(214,10,21.0),
(215,10,31.0),
(313,10,61.0),
(314,10,71.0),
(315,10,81.0),
(315,15,12.0),
(314,15,22.0),
(215,15,32.0),
(313,15,62.0),
(214,15,72.0),
(213,15,82.0)] and so on
Description of the list: The first column (id) can have only these 6 values: 213, 214, 215, 313, 314, 315, in any order. The second column (ts) has the same value for every 6 rows. The third column (val) holds arbitrary floating-point values.
Now my final result should be something like this:
result = [(5,10.0,20.0,30.0,60.0,70.0,80.0),
(10,11.0,21.0,31.0,61.0,71.0,81.0),
(15,82.0,72.0,32.0,62.0,22.0,12.0)]
That is, the first column in each row is to be deleted, and there should be only one row for each unique value in the second column. The order within each result row should be:
(ts, val for id 213, val for id 214, val for id 215, val for id 313, val for id 314, val for id 315)
Note: I am restricted to the standard Python libraries, so pandas and numpy cannot be used.
I tried a lot of possibilities but couldn't solve it. Please help me do this. Thanks in advance.
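You can use itertools.groupby. Note that groupby only groups consecutive rows, so tup must already be ordered by ts:
from itertools import groupby
result = []
for i, g in groupby(tup, lambda x: x[1]):
    # sort each group by id, then keep only the values
    group = [i] + [row[-1] for row in sorted(g, key=lambda x: x[0])]
    result.append(tuple(group))
print(result)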
Output:
[(5, 10.0, 20.0, 30.0, 60.0, 70.0, 80.0),
(10, 11.0, 21.0, 31.0, 61.0, 71.0, 81.0),
(15, 82.0, 72.0, 32.0, 62.0, 22.0, 12.0)]
With a slight change to your code you can fix it. If you change i[1] in ssd[cnt] to i[1] == ssd[cnt][0], your code may work. Also, in the else part you should add another list to ssd, because you are starting another set of data. And if the values should be ordered by their ids, sort the data by (ts, id) first. After applying the changes:
tup.sort( key = lambda x: (x[1],x[0]) )
ssd = [[]]
cnt = 0
ssd[0].append(tup[0][1])
for i in tup:
    if i[1] == ssd[cnt][0]:
        ssd[cnt].append(i[2])
    else:
        cnt = cnt + 1
        ssd.append([])
        ssd[cnt].append(i[1])
        ssd[cnt].append(i[2])
Output
[[5, 10.0, 20.0, 30.0, 60.0, 70.0, 80.0],
[10, 11.0, 21.0, 31.0, 61.0, 71.0, 81.0],
[15, 82.0, 72.0, 32.0, 62.0, 22.0, 12.0]]
Here's a vanilla python solution, although I do think that using groupby is more pythonic. This does have the disadvantage that it has to build the dicts in memory, so it won't scale to a large tup list.
This does, however, obey the ordering requirement.
from collections import defaultdict
tup = ...
tup_dict = defaultdict(dict)
for id, ts, val in tup:
    print(id, ts, val)
    tup_dict[ts][id] = val
for tup_key in sorted(tup_dict):
    id_dict = tup_dict[tup_key]
    print(tuple([tup_key] + [id_dict[id_key] for id_key in sorted(id_dict)]))
We want to iterate on a sorted instance of your tup, unpacking the items as we go, but first we need an auxiliary variable to store the keys and a variable to store our results
keys, res = [], []
for t0, t1, t2 in sorted(tup, key=lambda x:(x[1],x[0])):
The key argument is a lambda function that instructs the sorted function to sort on the second and then the first item of each tuple. Here is the first part of the loop body:
    if t1 not in keys:
        keys.append(t1)
        res.append([t1])
that is, if the second integer in the tuple was not already processed, we have to memorize the fact that it's being processed and we want to add a new list in our result variable, that starts with the value of the second integer
To finish the operation on an individual tuple, we are sure that there is a list in res that starts with t1, indexing the aux variable we know the index of that list and so we can append the float to it...
    i = keys.index(t1)
    res[i].append(t2)
To have all of that in short
keys, res = [], []
for t0, t1, t2 in sorted(tup, key=lambda x:(x[1],x[0])):
    if t1 not in keys:
        keys.append(t1)
        res.append([t1])
    i = keys.index(t1)
    res[i].append(t2)
Now, in res you have a list of lists, if you really need a list of tuples you can convert with a list comprehension
res = [tuple(elt) for elt in res]
Adding to the answer of @Ahsanul Haque: the values also need to be in id order, so instead of list(g) use sorted(g, key=lambda y: y[0]). You can also build a tuple from the start:
result = []
for i, g in groupby(tup, lambda x: x[1]):
    gro = (i,) + tuple(map(lambda x: x[-1], sorted(g, key=lambda y: y[0])))
    result.append(gro)
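One caveat with the groupby approach is that it only groups consecutive rows, so rows that arrive out of ts order would be split into several groups. A minimal sketch that sorts by (ts, id) first and therefore does not depend on the input order; the function name pivot_by_ts and the shortened sample data are made up for illustration.

```python
from itertools import groupby
from operator import itemgetter

def pivot_by_ts(rows):
    # Sort by ts, then id, so each ts forms one consecutive run in id order.
    ordered = sorted(rows, key=itemgetter(1, 0))
    return [(ts,) + tuple(val for _, _, val in group)
            for ts, group in groupby(ordered, key=itemgetter(1))]

rows = [(315, 15, 12.0), (213, 5, 10.0), (214, 15, 72.0),
        (214, 5, 20.0), (213, 15, 82.0)]
print(pivot_by_ts(rows))
# [(5, 10.0, 20.0), (15, 82.0, 72.0, 12.0)]
```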

Merging python lists based on a 'similar' float value

I have a list (containing tuples) and I want to merge entries whose first elements are within a maximum distance of each other (delta < 0.05). I have the following list as an example:
[(0.0, 0.9811758192941256), (1.00422, 0.9998252466431066), (0.0, 0.9024831978342827), (2.00425, 0.9951777494430947)]
This should yield something like:
[(0.0, 1.883659017),(1.00422, 0.9998252466431066),(2.00425,0.9951777494430947)]
I am thinking that I can use something similar to this question (Merge nested list items based on a repeating value), although a lot of other questions yield a similar answer. The only problem I see is that they use collections.defaultdict or itertools.groupby, which require exact matching of the element. An important addition here is that I want the first element of a merged tuple to be the weighted mixture of the elements, for example:
(1.001,80) and (0.99,20) are matched then the result should be (0.9988,100).
Is something similar possible but with the matching based on value difference and not exact match?
What I was trying myself (but don't really like the look of it) is:
Res = 0.05
combinations = itertools.combinations(list,2)
for i in combinations:
    if i[0][0] > i[1][0]-Res and i[0][0] < i[1][0]+Res:
        newValue = ...
-- UPDATE --
Based on some comments and Dawgs answer I tried the following approach:
for fv, v in total:
    k=round(fv, 2)
    data[k]=data.get(k, 0)+v
using the following list (actual data example, instead of short example list):
total = [(0.0, 0.11630591852564721), (1.00335, 0.25158664272201053), (2.0067, 0.2707487305913156), (3.0100499999999997, 0.19327075057473678), (4.0134, 0.10295042331357719), (5.01675, 0.04364856520231155), (6.020099999999999, 0.015342958201863783), (0.0, 0.9811758192941256), (1.00422, 0.018649427348981), (0.0, 0.9024831978342827), (2.00425, 0.09269455160881204), (0.0, 0.6944298762418107), (0.99703, 0.2536959281304138), (1.99406, 0.045877927988415786)]
This causes problems with values such as 2.0067 (rounded to 2.01) and 1.99406 (rounded to 1.99), where the difference is only 0.01264 (far below 0.05, the value I have in mind as a limit for now, but which should be changeable). Rounding the values to 1 decimal place is also not an option, since that would result in a window of ~0.1, with values such as 2.04999 and 1.95001 both yielding 2.0.
The exact output was:
{0.0: 2.694394811895866, 1.0: 0.5239319982014053, 4.01: 0.10295042331357719, 5.02: 0.04364856520231155, 2.0: 0.09269455160881204, 1.99: 0.045877927988415786, 3.01: 0.19327075057473678, 6.02: 0.015342958201863783, 2.01: 0.2707487305913156}
accum = list()
data = [(0.0, 0.9811758192941256), (1.00422, 0.9998252466431066), (0.0, 0.9024831978342827), (2.00425, 0.9951777494430947)]
EPSILON = 0.05
newdata = {d: True for d in data}
for k, v in data:
    if not newdata[(k,v)]: continue
    newdata[(k,v)] = False
    # use each piece of data only once
    keys,values = [k*v],[v]
    for kk, vv in [d for d in data if newdata[d]]:
        if abs(k-kk) < EPSILON:
            keys.append(kk*vv)
            values.append(vv)
            newdata[(kk,vv)] = False
    accum.append((sum(keys)/sum(values),sum(values)))
You can round the float values then use setdefault:
li=[(0.0, 0.9811758192941256), (1.00422, 0.9998252466431066), (0.0, 0.9024831978342827), (2.00425, 0.9951777494430947)]
data={}
for fv, v in li:
    k=round(fv, 5)
    data.setdefault(k, 0)
    data[k]+=v
print(data)
# {0.0: 1.8836590171284082, 1.00422: 0.9998252466431066, 2.00425: 0.9951777494430947}
If you want some more complex comparison (other than fixed rounding) you can create a hashable object based on the epsilon value you want and use the same method from there.
As pointed out in the comments, this works too:
data={}
for fv, v in li:
    k=round(fv, 5)
    data[k]=data.get(k, 0)+v
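Since rounding splits values that sit on either side of a rounding boundary, another option is to sort the pairs and sweep through them, starting a new cluster whenever a value is at least eps away from the first value of the current cluster. The following is a minimal sketch of that idea, not a definitive implementation: the names merge_close and _collapse are made up for illustration, weights are assumed to be positive, and anchoring the window at the first member of each cluster is one design choice among several.

```python
def merge_close(pairs, eps=0.05):
    """Merge (position, weight) pairs whose positions lie within eps of the
    first position in the current cluster."""
    merged = []
    cluster = []
    for pos, weight in sorted(pairs):
        # Close the current cluster once we step outside its window.
        if cluster and pos - cluster[0][0] >= eps:
            merged.append(_collapse(cluster))
            cluster = []
        cluster.append((pos, weight))
    if cluster:
        merged.append(_collapse(cluster))
    return merged

def _collapse(cluster):
    # Weighted mean of the positions, and the summed weight.
    total = sum(w for _, w in cluster)
    centre = sum(p * w for p, w in cluster) / total
    return (centre, total)

print(merge_close([(1.001, 80), (0.99, 20)]))
# one merged pair, approximately (0.9988, 100)
```

Because the window is anchored at the smallest value of each cluster, a cluster never spans more than eps, and values such as 1.99406 and 2.0067 from the question end up merged instead of being rounded into different bins.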
