Python Dictionary - find average of value from other values - python

I have the following list
count = 3.5, price = 2500
count = 3, price = 400
count = 2, price = 3000
count = 3.5, price = 750
count = 2, price = 500
I want to find the average price for all where the count is the same. For example:
count = 2, price = 3000
count = 2, price = 500
3000 + 500 = 3500
3500/2 = 1750
Avg for 'count 2' is 1750
Here's my code so far
avg_list = [value["average"] for value in dictionary_database_list]
counter_obj = collections.Counter(count_list)
print ("AVG:")
for i in counter_obj:
    print (i, counter_obj[i])

I'll admit I'm not 100% clear on what you're looking for here, but I'll give it a shot:
A good strategy when you want to iterate over a list of "things" and accumulate some kind of information about "the same kind of thing" is to use a hash table. In Python, we usually use a dict for algorithms that require a hash table.
To collect enough information to get the average price for each item in your list, we need:
a) the total number of items with a specific "count"
b) the total price of items with a specific "count"
So let's build a data structure that maps a "count" to a dict containing "total items" and "total price" for the item with that "count".
Let's take our input in the format:
item_list = [
{'count': 3.5, 'price': 2500},
{'count': 3, 'price': 400},
{'count': 2, 'price': 3000},
{'count': 3.5, 'price': 750},
{'count': 2, 'price': 500},
]
Now let's map the info about "total items" and "total price" in a dict called items_by_count:
for item in item_list:
    count, price = item['count'], item['price']
    items_by_count[count]['total_items'] += 1
    items_by_count[count]['total_price'] += price
But wait! items_by_count[count] will throw a KeyError if count isn't already in the dict. This is a good use case for defaultdict. Let's define the default value of a count we've never seen before as 0 total price, and 0 total items:
from collections import defaultdict
items_by_count = defaultdict(lambda: {
    'total_items': 0,
    'total_price': 0
})
Now our code won't throw an exception every time we see a new value for count.
Finally, we need to actually take the average. Let's get the information we need in another dict, mapping count to average price. This is a good use case for a dict comprehension:
{count: item['total_price'] / item['total_items']
 for count, item in items_by_count.items()}
This iterates over the items_by_count dict and creates the new dict that we want.
Putting it all together:
from collections import defaultdict
def get_average_price(item_list):
    # Map each "count" to running totals for that count
    items_by_count = defaultdict(lambda: {
        'total_items': 0,
        'total_price': 0
    })
    for item in item_list:
        count, price = item['count'], item['price']
        items_by_count[count]['total_items'] += 1
        items_by_count[count]['total_price'] += price
    # Average = total price / number of items, per count
    return {count: item['total_price'] / item['total_items']
            for count, item in items_by_count.items()}
If we pass in our example input list, this function returns (on Python 3):
{3.5: 1625.0, 3: 400.0, 2: 1750.0}
Which is hopefully the output you want! Be cautious of gotchas like integer division: on Python 2, / between two ints truncates the result, so convert one operand to float there.
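As a variation on the same hash-table idea, here is a minimal sketch that collects the prices for each count in a list and averages them at the end. The function name average_price_by_count is my own, and it assumes Python 3, where / always does true division:

```python
from collections import defaultdict

def average_price_by_count(item_list):
    # Collect every price under its "count" key
    prices_by_count = defaultdict(list)
    for item in item_list:
        prices_by_count[item['count']].append(item['price'])
    # Average each group of prices
    return {count: sum(prices) / len(prices)
            for count, prices in prices_by_count.items()}

item_list = [
    {'count': 3.5, 'price': 2500},
    {'count': 3, 'price': 400},
    {'count': 2, 'price': 3000},
    {'count': 3.5, 'price': 750},
    {'count': 2, 'price': 500},
]
averages = average_price_by_count(item_list)
print(averages)  # {3.5: 1625.0, 3: 400.0, 2: 1750.0}
```

Keeping the raw prices costs a little more memory than keeping running totals, but it makes it easy to compute other statistics (min, max, median) later.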

You need to iterate over your items
See documentation
Python has no built-in avg(); sum(values) / len(values) over the values you collected (or statistics.mean(values)) is probably what you want

Related

How to check a list for duplicates and add values if there are any?

I'm a total beginner at coding and just need help with some stuff.
My dream is to write a smart shopping list that automatically detects duplicates and adds up the weights of duplicate products.
I get the shopping list from an external file which has the following form:
weight\n
ingredient\n
eg.
60
eggs
120
beef meat
25
pasta
120
eggs
etc...
After converting these files to dictionaries with this code:
final_list = []
def get_list(day_list):
    for day in range(len(day_list)):
        day += 1
        day_to_open = f'Days/day{str(day)}.txt'
        with open(day_to_open, 'r') as file:
            day1 = file.readlines()
            day1 = [item.rstrip() for item in day1]
        x = 0
        y = 1
        list = []
        for item in range(0, len(day1), 2):
            dictio = {day1[y]: day1[x]}
            x += 2
            y += 2
            list.append(dictio)
        final_list.append(list)
    list = []
    for item in final_list:
        list += item
    return list
days = [1, 2, 3]
list = get_list(day_list=days)
Finally I get a list of dictionaries like this:
[{'eggs': '60'}, {'beef meat': '120'}, {'pasta': '25'}, {'eggs': '120'}]
How can I iterate through the dictionary to check if any products are repeating, and if so leave one with the added weight?
For three weeks I have been trying to solve it, unfortunately to no avail.
Thank you very much for all your help!
#Edit
my goal is to make it look like this:
[{'eggs': 180}, {'beef meat': 120}, {'pasta': 25}]
#egg weight added (120 + 60)#
lis = [{'eggs': '60'}, {'beef meat': '120'}, {'pasta': '25'}, {'eggs': '120'}]
# make 1 dict from the list of dicts, summing the weights of duplicates
new = {}
for d in lis:
    for k, v in d.items():
        # weights are read as strings, so convert before adding
        new[k] = new.get(k, 0) + int(v)
# rebuild list of dicts
lis = [{k:v} for k, v in new.items()]
print(lis)
# [{'eggs': 180}, {'beef meat': 120}, {'pasta': 25}]
As ShadowRanger has pointed out, it's not common practice to have a list of multiple dictionaries as you have done. Dictionaries are very useful if used correctly.
I'm not entirely sure the structure of the files you are reading, so I will just explain a way forward and leave it up to you to implement it. What I would suggest is that you first initiate a dictionary with all the necessary keys (ingredients in your case) with each of the values set to 0 (as an integer or float, rather than a string), so you would get a dictionary like this:
shopping_list = {'eggs': 0, 'beef meat': 0, 'pasta': 0}
Then, you will be able to access each of the values by calling the shopping_list dictionary and specifying the key of interest. For example, if you wanted to print the value of eggs, you would write:
print(shopping_list['eggs']) # this would print 0
You can then easily increase/decrease a value of interest; for example, to add 10 to pasta, you would write:
shopping_list['pasta'] += 10
Using this method, you can then iterate through each of your items, select the ingredient of interest and add the weight. So if you have duplicates, it will just add to the same ingredient. Again, I'm not sure the structure of the files you are reading, but it would be something along the lines of:
for ingredient, weight in file:
    shopping_list[ingredient] += weight
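To make that loop concrete, here is a minimal sketch that assumes the file layout described in the question (a weight line followed by an ingredient line). The function name total_weights and the sample_lines list are made up for illustration; in practice the lines would come from file.readlines():

```python
from collections import defaultdict

def total_weights(lines):
    """Sum weights per ingredient from alternating weight/ingredient lines."""
    shopping_list = defaultdict(int)  # unseen ingredients start at 0
    # Pair each weight line (even index) with the ingredient line after it
    for weight, ingredient in zip(lines[0::2], lines[1::2]):
        shopping_list[ingredient.strip()] += int(weight)
    return dict(shopping_list)

sample_lines = ['60', 'eggs', '120', 'beef meat', '25', 'pasta', '120', 'eggs']
totals = total_weights(sample_lines)
print(totals)  # {'eggs': 180, 'beef meat': 120, 'pasta': 25}
```

Using defaultdict(int) means you do not have to know the ingredient names in advance; a single dict of totals is also easier to work with than a list of one-item dicts.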
Good luck for your dream - all the best!

How To Append String To Dictionary in Python within Forloop

I need to append a string value to a specific key in a dictionary inside a for loop, and if the data for an item is empty, store an empty string instead. I can't get this right. Here is some of the code:
top100 = {}
for product in product_list:
    title = product.xpath('a[@class="someClass"]/text()') # LIST of 100
    price = product.xpath('div[@class="someClass"]/text()') # LIST of 100
    # the value in the title is list of 100 title
    # more like ['title1', 'title2', ...] and so the price [100, 230, ...]
    # how to append each pairs of title and price so i have list of dictionary
    top100['title'].append(title)
    top100['price'].append(price)
print( top100)
output:
KeyError: 'title'
but i need something more like:
top100 = [{'title': 'title1', 'price': 'price1'},
{'title': 'title2', 'price': 'price2'}
]
The top100 variable should be a list; then append a dictionary to it for each product:
top100 = []
for product in product_list:
    title = product.xpath('a[@class="someClass"]/text()') # LIST of 100
    price = product.xpath('div[@class="someClass"]/text()') # LIST of 100
    top100.append({'title':title,'price':price})
print( top100)
You need to make top100 a list of dictionaries, with the following code:
top100 = []
for product in product_list:
    title = product.xpath('a[@class="someClass"]/text()') # LIST of 100
    price = product.xpath('div[@class="someClass"]/text()') # LIST of 100
    top100.append({'title':title,'price':price})
Here is another version of the result, a single dictionary of lists:
top100 = {'title':[],'price':[]}
for product in product_list:
    title = product.xpath('a[@class="someClass"]/text()') # LIST of 100
    price = product.xpath('div[@class="someClass"]/text()') # LIST of 100
    top100['title'].append(title)
    top100['price'].append(price)
print( top100)
This should output
{'title':[..., ..., ... ],'price':[..., ..., ...]}
Thanks for the answers; I was able to find my own solution:
top100 = []
for product in product_list:
    titles = product.xpath('a[@class="someLink"]/text()')
    prices = product.xpath('div[@class="somePrice"]/text()')
    for title, price in zip(titles, prices):
        top100.append({'title':title, 'price':price})
output:
top100 = [{'title': 'title1', 'price': '100'},
{'title': 'title2', 'price': '200'}]
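The question also asked for an empty string when a value is missing. zip() stops at the shorter list, so one possible sketch uses itertools.zip_longest with fillvalue='' instead. The titles and prices lists here are made-up stand-ins for the xpath results:

```python
from itertools import zip_longest

# Hypothetical scraped values; the last title has no matching price
titles = ['title1', 'title2', 'title3']
prices = ['100', '200']

# Pair items by position, filling the shorter list with empty strings
top100 = [{'title': title, 'price': price}
          for title, price in zip_longest(titles, prices, fillvalue='')]
print(top100)
# [{'title': 'title1', 'price': '100'}, {'title': 'title2', 'price': '200'}, {'title': 'title3', 'price': ''}]
```

Note that this pairs values purely by position, so it only gives the right answer when the missing values are at the end. If a price can be missing in the middle, it is safer to query the title and price per product element and fall back to '' when the result is empty.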

Calculate averages in Python nested list: if elements 1 and 2 match, average element 3

I'm attempting to calculate averages without using Pandas dataframes or the mean function (for practice). I have nested lists and would like to average the third element of the inner lists if the first and second elements match.
Example input:
mylist = [['USD', 2000, 13.40], ['USD', 2000, 13.68], ['USD', 2001, 13.99], ['EUR', 2000, 10.50], ['EUR', 2000, 11.02]]
The desired output is:
avlist = [['USD', 2000, 13.54], ['USD', 2001, 13.99], ['EUR', 2000, 10.76]]
The furthest I've gotten is to make a set from the first 2 elements and find the intersection with the original lists:
unique_list = list(set([x[0:2] for x in mylist]))
if (y for y in ([x[0:2] for x in mylist]) if y in unique_list):
    # av_list =
Is it possible to then do something like 'where this intersection is true, add the third elements in my_list to a third element in unique_list and divide by the number of elements added'?
I hope the question is clear.
Start by grouping your data according to the keys you want to use to control the averaging:
>>> mylist = [['USD', 2000, 13.40], ['USD', 2000, 13.68], ['USD', 2001, 13.99], ['EUR', 2000, 10.50], ['EUR', 2000, 11.02]]
>>> from collections import defaultdict
>>> mydict = defaultdict(list)
>>> for curr, year, value in mylist:
...     mydict[(curr,year)].append(value)
That will give you the numbers you want to average as lists:
>>> mydict
defaultdict(<type 'list'>, {('USD', 2000): [13.4, 13.68], ('USD', 2001): [13.99], ('EUR', 2000): [10.5, 11.02]})
Then average each of the lists:
>>> for (curr, year), values in mydict.items():
...     print (curr, year, sum(values)/len(values))
USD 2000 13.54
USD 2001 13.99
EUR 2000 10.76
You could create a dictionary, keyed by the items you want to match on e.g.
data = {}
for item in mylist:
    key = tuple(item[0:2])
    values = data.get(key, [])
    values.append(item[2])
    data[key] = values
# {('EUR', 2000): [10.5, 11.02], ('USD', 2000): [13.4, 13.68], ('USD', 2001): [13.99]}
Then you can go through each item in the dictionary and calculate your average.
for key in data:
    average = sum(data[key])/len(data[key])
    print('{}, average = {}'.format(key, average))
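If you want the result back in the nested-list shape from the question, a minimal sketch could look like the following. The function name average_by_key is my own, rounding to two decimals is an assumption based on the desired output, and the output order relies on dicts preserving insertion order (Python 3.7+):

```python
from collections import defaultdict

def average_by_key(rows):
    # Group the third element under a (first, second) tuple key
    grouped = defaultdict(list)
    for currency, year, value in rows:
        grouped[(currency, year)].append(value)
    # Rebuild [currency, year, average] rows, rounded to 2 decimals
    return [[currency, year, round(sum(values) / len(values), 2)]
            for (currency, year), values in grouped.items()]

mylist = [['USD', 2000, 13.40], ['USD', 2000, 13.68], ['USD', 2001, 13.99],
          ['EUR', 2000, 10.50], ['EUR', 2000, 11.02]]
avlist = average_by_key(mylist)
print(avlist)
# [['USD', 2000, 13.54], ['USD', 2001, 13.99], ['EUR', 2000, 10.76]]
```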

Python - sorting a list of numbers based on indexes

I need to create a program that has a class that creates "Food" objects and a list called "fridge" that holds the objects created by the class "Food".
class Food:
    def __init__(self, name, expiration):
        self.name = name
        self.expiration = expiration
fridge = [Food("beer",4), Food("steak",1), Food("hamburger",1), Food("donut",3),]
This was not hard. Then I created a function that gives you the highest expiration number among the foods.
def exp(fridge):
    expList=[]
    xen = 0
    for i in range(0,len(fridge)):
        expList.append(fridge[xen].expiration)
        xen += 1
    print(expList)
    sortedList = sorted(expList)
    return sortedList.pop()
exp(fridge)
This one works too. Now I have to create a function that returns a list where each index is an expiration date and the value at that index is the number of foods with that expiration date.
The output should look like: [0,2,0,1,1]. The value 0 at index 0 means that there is no food with expiration date "0". Index 1 means that there are 2 pieces of food with 1 day left until expiration, and so on. I got stuck with too many if lines and I can't get this one to work at all. How should I approach this? Thanks for the help.
In order to return it as a list, you will first need to figure out the maximum expiration date in the fridge.
max_expiration = max(food.expiration for food in fridge) +1 # need +1 since 0 is also a possible expiration
exp_list = [0] * max_expiration
for food in fridge:
    exp_list[food.expiration] += 1
print(exp_list)
prints [0, 2, 0, 1, 1]
You can iterate on the list of Food objects and update a dictionary keyed on expiration, with the values as number of items having that expiration. Avoid redundancy such as keeping zero counts in a list by using a collections.Counter object (a subclass of dict):
from collections import Counter
d = Counter(food.expiration for food in fridge)
# fetch number of food with expiration 0
print(d[0]) # -> 0
# fetch number of food with expiration 1
print(d[1]) # -> 2
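If you still need the list form from the question, one way to sketch it is to index the Counter for every day from 0 up to the largest expiration. Looking up a missing key in a Counter returns 0 without adding the key. The Food class and fridge below are copied from the question so the example runs on its own:

```python
from collections import Counter

class Food:
    def __init__(self, name, expiration):
        self.name = name
        self.expiration = expiration

fridge = [Food("beer", 4), Food("steak", 1), Food("hamburger", 1), Food("donut", 3)]

# Count how many foods share each expiration value
counts = Counter(food.expiration for food in fridge)
# Index i holds the number of foods expiring in i days (0 when absent)
exp_list = [counts[day] for day in range(max(counts) + 1)]
print(exp_list)  # [0, 2, 0, 1, 1]
```

This assumes the fridge is not empty; max() raises ValueError on an empty Counter.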
You can use itertools.groupby to create a dict where key will be the food expiration date and value will be the number of times it occurs in the list
>>> from itertools import groupby
>>> fridge = [Food("beer",4), Food("steak",1), Food("hamburger",1), Food("donut",3),]
>>> d = dict((k,len(list(v))) for k,v in groupby(sorted(fridge,key=lambda x: x.expiration), key=lambda x: x.expiration))
Here we specify groupby to group all elements of list that have same expiration(Note the key argument in groupby). The output of groupby operation is roughly equivalent to (k,[v]), where k is the group key and [v] is the list of values belong to that particular group.
This will produce output like this:
>>> d
{1: 2, 3: 1, 4: 1}
At this point we have expiration and number of times a particular expiration occurs in a list, stored in a dict d.
Next we need to create a list such that if an expiration is present in the dict d we output its count, else we output 0. We need to iterate from 0 up to the largest key in d. To do this we can do:
>>> [d.get(x, 0) for x in range(0, max(d.keys())+1)]
This will yield your required output
[0, 2, 0, 1, 1]
Here is a flexible method using collections.defaultdict:
from collections import defaultdict
def ReverseDictionary(input_dict):
    reversed_dict = defaultdict(set)
    for k, v in input_dict.items():
        reversed_dict[v].add(k)
    return reversed_dict
fridge_dict = {f.name: f.expiration for f in fridge}
exp_food = ReverseDictionary(fridge_dict)
# defaultdict(set, {1: {'hamburger', 'steak'}, 3: {'donut'}, 4: {'beer'}})
exp_count = {k: len(exp_food.get(k, set())) for k in range(max(exp_food)+1)}
# {0: 0, 1: 2, 2: 0, 3: 1, 4: 1}
Modify yours with count().
def exp(fridge):
    output = []
    exp_list = [i.expiration for i in fridge]
    for i in range(0, max(exp_list)+1):
        output.append(exp_list.count(i))
    return output

Optimal algorithm for comparing two dictionaries in Python 3

I have a list of dictionaries like:
Stock=[
{'ID':1,'color':'red','size':'L','material':'cotton','weight':100,'length':300,'location':'China'},
{'ID':2,'color':'green','size':'M','material':'cotton','weight':200,'length':300,'location':'China'},
{'ID':3,'color':'blue','size':'L','material':'cotton','weight':100,'length':300,'location':'China'}
]
And another list of dictionaries like:
Prices=[
{'color':'red','size':'L','material':'cotton','weight':100,'length':300,'location':'China'},
{'color':'blue','size':'S','weight':500,'length':150,'location':'USA', 'cost':'1$'},
{'color':'pink','size':'L','material':'cotton','location':'China','cost':'5$'},
{'cost':'5$','color':'blue','size':'L','material':'silk','weight':100,'length':300}
]
I need to find the 'cost' for each record in Stock from Prices. However, there may be no entry in Prices that matches a Stock record on every field; in that case I need the most similar entry and its 'cost'.
output=[{'ID':1,'cost':'1$'},{'ID':2,'cost':'5$'},...]
Please suggest an optimal solution for this task. I imagine something like a loop from highest to lowest similarity: first try to find a record with the maximum number of matching fields, and if none is found, try a less strict matching condition.
How about this:
Stock=[
{'ID':1,'color':'red','size':'L','material':'cotton','weight':100,'length':300,'location':'China'},
{'ID':2,'color':'green','size':'M','material':'cotton','weight':200,'length':300,'location':'China'},
{'ID':3,'color':'blue','size':'L','material':'cotton','weight':100,'length':300,'location':'China'}
]
Prices=[
{'color':'red','size':'L','material':'cotton','weight':100,'length':300,'location':'China'},
{'cost':'2$','color':'blue','size':'S','weight':500,'length':150,'location':'USA'},
{'cost':'5$','color':'pink','size':'L','material':'cotton','location':'China'},
{'cost':'15$','color':'blue','size':'L','material':'silk','weight':100,'length':300}
]
Prices = [p for p in Prices if "cost" in p] #make sure that everything have a 'cost'
result = []
for s in Stock:
    field = set(s.items())
    best_match = max(Prices, key=lambda p: len( field.intersection(p.items()) ), default=None)
    if best_match:
        result.append( {"ID":s["ID"], "cost":best_match["cost"] } )
print(result)
#[{'ID': 1, 'cost': '5$'}, {'ID': 2, 'cost': '5$'}, {'ID': 3, 'cost': '15$'}]
To find the most similar entry, I first transform the stock dict's items into a set, then use max with a lambda key function to pick the price entry whose items have the largest intersection with that set. When several price entries tie, max returns the first one.
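To see the scoring on its own, here is a small sketch of the same idea written as a plain counting function. The name match_score and the ignore parameter are my own additions; the data is taken from the example above:

```python
def match_score(stock_item, price_item, ignore=('ID', 'cost')):
    """Count the key/value pairs that both dicts have in common."""
    return sum(1 for key, value in stock_item.items()
               if key not in ignore and price_item.get(key) == value)

stock_item = {'ID': 3, 'color': 'blue', 'size': 'L', 'material': 'cotton',
              'weight': 100, 'length': 300, 'location': 'China'}
prices = [
    {'cost': '2$', 'color': 'blue', 'size': 'S', 'weight': 500, 'length': 150, 'location': 'USA'},
    {'cost': '5$', 'color': 'pink', 'size': 'L', 'material': 'cotton', 'location': 'China'},
    {'cost': '15$', 'color': 'blue', 'size': 'L', 'material': 'silk', 'weight': 100, 'length': 300},
]

# One score per price entry: the number of matching fields
scores = [match_score(stock_item, p) for p in prices]
print(scores)  # [1, 3, 4]

# The entry with the highest score is the best match
best = max(prices, key=lambda p: match_score(stock_item, p))
print(best['cost'])  # 15$
```

Every field counts equally here. If some fields matter more than others (say, material over location), the function could add a per-field weight instead of 1, but that is a design decision the question does not settle.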
It reminds me of fuzzy-matching or neural-network solutions.
Anyway, here is a NumPy solution (originally written on Python 2):
Stock=[
{'ID':1,'color':'red','size':'L','material':'cotton','weight':100,'length':300,'location':'China'},
{'ID':2,'color':'green','size':'M','material':'cotton','weight':200,'length':300,'location':'China'},
{'ID':3,'color':'blue','size':'L','material':'cotton','weight':100,'length':300,'location':'China'}
]
Prices=[
{'color':'red','size':'L','material':'cotton','weight':100,'length':300,'location':'China'},
{'cost':2,'color':'blue','size':'S','weight':500,'length':150,'location':'USA'},
{'cost':5,'color':'pink','size':'L','material':'cotton','location':'China'},
{'cost':15,'color':'blue','size':'L','material':'silk','weight':100,'length':300}
]
import numpy as np
# drop records that have no 'cost'
Prices = [p for p in Prices if 'cost' in p]
def numerize(lst_of_dics):
    r=[]
    for d in lst_of_dics:
        r1=[]
        for n in ['color','size','material','weight','length','location']:
            try:
                if n=='color':
                    # it is 0s if unknown
                    # only 3 letters, should work ,bug!!!
                    z=[0,0,0]
                    r1+=[ord(d[n][0]),ord(d[n][1]),ord(d[n][2])]
                elif n=='size':
                    z=[0,0,0]
                    r1+=[ord(d[n])]*3
                elif n=='material':
                    z=[0,0,0]
                    r1+=[ord(d[n][0]),ord(d[n][1]),ord(d[n][2])]
                elif n=='location':
                    z=[0,0,0]
                    r1+=[ord(d[n][0]),ord(d[n][1]),ord(d[n][2])]
                else:
                    z=[0,0,0]
                    r1+=[d[n]]*3
            except:
                r1+=z
        r.append(r1)
    return r
St = numerize(Stock)
Pr = np.array(numerize(Prices))
output=[]
for i,s in enumerate(St):
    s0 = np.reshape(s*Pr.shape[0],Pr.shape)
    # stage 0: repeat the stock row to the shape of Pr so it can be subtracted easily
    s1 = abs(Pr -s0)
    # abs diff
    s2 = s1 * Pr.astype('bool') * s0.astype('bool')
    # a non-existent field doesn't mean a match, so penalize it
    s21 = np.logical_not(Pr.astype('bool') * s0.astype('bool'))*25
    s2 = s2+s21
    # sum the differences per price record (zero fields were handled above)
    s3 = np.sum(s2,1)
    # take the smallest
    s4 = np.where(s3==np.min(s3))[0][0]
    c = Prices[s4]['cost']
    #print c,i
    output.append( {'ID':i+1 ,'cost':c})
print(output)
That gives me the following results (with many assumptions):
[{'cost': 15, 'ID': 1}, {'cost': 5, 'ID': 2}, {'cost': 15, 'ID': 3}]
Note that this is the correct comparison result based on the values and kinds of the properties.
Please upvote and accept the answer if it satisfies you.
