pandas from_dict with nested dictionaries - python

I have a dictionary like this:
{12: {'Soccer': {'value': 31, 'year': 2013}},
23: {'Volley': {'value': 24, 'year': 2012},'Yoga': {'value': 3, 'year': 2014}},
39: {'Baseball': {'value': 2, 'year': 2014},'basket': {'value': 4, 'year': 2012}}}
and I would like to have a DataFrame like this:
index column
12 {'Soccer': {'value': 31, 'year': 2013}}
23 {'Volley': {'value': 24, 'year': 2012},'Yoga': {'value': 3, 'year': 2014}}
39 {'Baseball': {'value': 2, 'year': 2014},'basket': {'value': 4, 'year': 2012}}
with each nested dictionary placed in a single column and the row label given by the key of the outer dictionary. When I use from_dict with orient='index', it treats the keys of the nested dictionaries as column labels and builds a square DataFrame instead of a single column.
Thanks a lot

Use:
df = pd.DataFrame({'column':d})
Or:
df = pd.Series(d).to_frame('column')
print (df)
column
12 {'Soccer': {'year': 2013, 'value': 31}}
23 {'Volley': {'year': 2012, 'value': 24}, 'Yoga'...
39 {'Baseball': {'year': 2014, 'value': 2}, 'bask...

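Passing d.values() straight to the constructor does not put one dictionary in each row. Wrap the values in a list and pass them as a named column instead:
In [65]: pd.DataFrame({'column': list(d.values())}, index=list(d.keys()))
Out[65]:
column
12 {'Soccer': {'value': 31, 'year': 2013}}
23 {'Volley': {'value': 24, 'year': 2012}, 'Yoga'...
39 {'Baseball': {'value': 2, 'year': 2014}, 'bask...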

Related

Lowercase dictionary items within a list python

I am trying to lowercase all the keys of the dictionaries in a list. I already have code that prints the lowercase output I want inside a for loop, but I'm not sure how to get the changed dictionaries back into my list.
amdardict = [{'1031': 98, '1032': 1, '33007': 70, 'AIRCRAFT_FLIGHT_NUMBER': 'CNFNXQ', 'DAY': 5, 'HEIGHT_OR_ALTITUDE': 1490.0, 'HOUR': 0, 'LATITUDE': 39.71, 'LONGITUDE': -41.79, 'MINUTE': 0, 'MONTH': 10, 'PHASE_OF_AIRCRAFT_FLIGHT': 5, 'TEMPERATURE_DRY_BULB_TEMPERATURE': 289.0, 'WIND_DIRECTION': 219, 'WIND_SPEED': 3.0, 'YEAR': 2019},
{'12101': 248.75, '4006': 55, '7010': 6135, '8009': 3, 'aircraft_flight_number': '????????', 'aircraft_registration_number_or_other_identification': 'AU0155', 'aircraft_tail_number': '??????', 'day': 5, 'destination_airport': '???', 'hour': 0, 'latitude': -34.3166, 'longitude': 151.9333, 'minute': 8, 'month': 10, 'observation_sequence_number': 64, 'origination_airport': '???', 'wind_direction': 208, 'wind_speed': 23.0, 'year': 2019}
]
for d in amdardict: print(dict((k.lower(), v) for k, v in d.items()))
Why modify the original list? Can you create a new empty list and slightly modify your code to append to that new list instead of printing:
new_list = []
for d in amdardict:
    new_list.append(dict((k.lower(), v) for k, v in d.items()))
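If the goal is for amdardict itself to end up holding the lowercased dictionaries, the same idea can be written as a list comprehension and assigned back. A minimal sketch, using a shortened sample in place of the full amdardict (the name lowered is only illustrative):

```python
# Shortened sample standing in for the full amdardict from the question
amdardict = [
    {'1031': 98, 'AIRCRAFT_FLIGHT_NUMBER': 'CNFNXQ', 'DAY': 5},
    {'4006': 55, 'aircraft_tail_number': '??????', 'day': 5},
]

# Build a new list of dicts with every key lowercased
lowered = [{k.lower(): v for k, v in d.items()} for d in amdardict]

# Slice assignment replaces the contents of the existing list object,
# so other references to amdardict see the lowercased dicts too
amdardict[:] = lowered

print(amdardict)
```

If two keys in one dictionary differ only by case, the later one overwrites the earlier one.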
To change the keys in-place, you can use the dict.pop method.
>>> # Copy the list in case we make a mistake
>>> import copy
>>> backup = copy.deepcopy(amdardict)
>>> for d in amdardict:
...     # Make a list of keys() because we can't loop over keys()
...     # and change keys simultaneously
...     for k in list(d.keys()):
...         if not k.islower():
...             # pop removes the key from the dict and returns the value
...             d[k.lower()] = d.pop(k)
...
>>> amdardict
[{'aircraft_flight_number': 'CNFNXQ', 'day': 5, 'height_or_altitude': 1490.0, 'temperature_dry_bulb_temperature': 289.0, 'wind_direction': 219, 'wind_speed': 3.0, 'year': 2019, 'hour': 0, 'latitude': 39.71, 'longitude': -41.79, 'minute': 0, 'month': 10, 'phase_of_aircraft_flight': 5, '1031': 98, '1032': 1, '33007': 70}, {'aircraft_flight_number': '????????', 'aircraft_registration_number_or_other_identification': 'AU0155', 'aircraft_tail_number': '??????', 'day': 5, 'destination_airport': '???', 'hour': 0, 'latitude': -34.3166, 'longitude': 151.9333, 'minute': 8, 'month': 10, 'observation_sequence_number': 64, 'origination_airport': '???', 'wind_direction': 208, 'wind_speed': 23.0, 'year': 2019, '12101': 248.75, '4006': 55, '7010': 6135, '8009': 3}]

Sorting list by date Python [duplicate]

My data is in the form:
[{'value': 2, 'year': u'2015'}, {'value': 4, 'year': u'2016'}, {'value': 3, 'year': u'2018'}, {'value': 0, 'year': u'2014'}, {'value': 0, 'year': u'2017'}]
I want to sort it by year. Can you please help?
You need to specify the key function applied for comparing when sorting:
my_data = sorted(my_data, key=lambda x: x["year"])
You should use itemgetter for that:
>>> from operator import itemgetter
>>> data = [{'value': 2, 'year': u'2015'}, {'value': 4, 'year': u'2016'}, {'value': 3, 'year': u'2018'}, {'value': 0, 'year': u'2014'}, {'value': 0, 'year': u'2017'}]
>>> result = sorted(data, key=itemgetter('year'))
>>> print(result)
[{'value': 0, 'year': '2014'}, {'value': 2, 'year': '2015'}, {'value': 4, 'year': '2016'}, {'value': 0, 'year': '2017'}, {'value': 3, 'year': '2018'}]
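Both answers compare the years as strings, which works here because every year has four digits. If the strings could differ in length, converting to int in the key function is safer. A small sketch with the same data (by_year is just an illustrative name):

```python
data = [{'value': 2, 'year': '2015'}, {'value': 4, 'year': '2016'},
        {'value': 3, 'year': '2018'}, {'value': 0, 'year': '2014'},
        {'value': 0, 'year': '2017'}]

# Sort numerically rather than lexicographically
by_year = sorted(data, key=lambda item: int(item['year']))

# list.sort() with the same key sorts the existing list in place instead
data.sort(key=lambda item: int(item['year']), reverse=True)

print([item['year'] for item in by_year])  # ascending
print([item['year'] for item in data])     # descending
```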

Convert dataframe to dictionary in Python

I have a CSV file that I converted into a DataFrame using pandas. Here's the DataFrame:
Customer  ProductID  Count
John      1          50
John      2          45
Mary      1          75
Mary      2          10
Mary      5          15
I need an output in the form of a dictionary that looks like this:
{ProductID:1, Count:{John:50, Mary:75}},
{ProductID:2, Count:{John:45, Mary:10}},
{ProductID:5, Count:{John:0, Mary:15}}
I read the following answers:
python pandas dataframe to dictionary
and
Convert dataframe to dictionary
This is the code I have:
import pandas as pd
df = pd.read_csv('customer.csv')
dict1 = df.set_index('Customer').T.to_dict('dict')
dict2 = df.to_dict(orient='records')
and this is my current output:
dict1 = {'John': {'Count': 45, 'ProductID': 2}, 'Mary': {'Count': 15, 'ProductID': 5}}
dict2 = [{'Count': 50, 'Customer': 'John', 'ProductID': 1},
{'Count': 45, 'Customer': 'John', 'ProductID': 2},
{'Count': 75, 'Customer': 'Mary', 'ProductID': 1},
{'Count': 10, 'Customer': 'Mary', 'ProductID': 2},
{'Count': 15, 'Customer': 'Mary', 'ProductID': 5}]
IIUC you can use:
# The parentheses let the method chain span several lines
d = (df.groupby('ProductID')
       .apply(lambda x: dict(zip(x.Customer, x.Count)))
       .reset_index(name='Count')
       .to_dict(orient='records'))
print (d)
[{'ProductID': 1, 'Count': {'John': 50, 'Mary': 75}},
 {'ProductID': 2, 'Count': {'John': 45, 'Mary': 10}},
 {'ProductID': 5, 'Count': {'Mary': 15}}]
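The output above leaves John out of ProductID 5, while the question asks for John: 0. One way to fill such gaps, sketched here in plain Python on the records from dict2 rather than with pandas (the names records, customers, counts and result are illustrative):

```python
# Records as produced by df.to_dict(orient='records') in the question
records = [
    {'Count': 50, 'Customer': 'John', 'ProductID': 1},
    {'Count': 45, 'Customer': 'John', 'ProductID': 2},
    {'Count': 75, 'Customer': 'Mary', 'ProductID': 1},
    {'Count': 10, 'Customer': 'Mary', 'ProductID': 2},
    {'Count': 15, 'Customer': 'Mary', 'ProductID': 5},
]

# Every customer that appears anywhere, in sorted order
customers = sorted({r['Customer'] for r in records})

# Start each product with a zero count for every customer, then fill in
counts = {}
for r in records:
    per_product = counts.setdefault(r['ProductID'], dict.fromkeys(customers, 0))
    per_product[r['Customer']] += r['Count']

result = [{'ProductID': pid, 'Count': counts[pid]} for pid in sorted(counts)]
print(result)
```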

Check unique values for a key in a list of dicts [duplicate]

I have a list of dictionaries and I want to drop any dictionary that repeats an id already seen. What's the best way to do this? For example:
example dict:
product_1 = {'id': 1234, 'price': 234}
list_of_products = [product_1, product_2, ...]
How can I filter the list of products so that no two products share the same product['id']?
If dictionaries with the same id can differ in their other values, keep one dictionary per id using itertools.groupby:
import itertools
list_products = [{'id': 12, 'price': 234},
                 {'id': 34, 'price': 456},
                 {'id': 12, 'price': 456},
                 {'id': 34, 'price': 78}]
list_dicts = list()
# groupby only groups adjacent items, so sort by id first
for name, group in itertools.groupby(sorted(list_products, key=lambda d: d['id']), key=lambda d: d['id']):
    # keep the first dictionary of each id group
    list_dicts.append(next(group))
print(list_dicts)
# Output
[{'price': 234, 'id': 12}, {'price': 456, 'id': 34}]
If the product dictionaries with the same id are totally the same, there is an easier way as described in Remove duplicate dict in list in Python. Here is a MWE.
list_products= [{'id': 12, 'price': 234},
{'id': 34, 'price': 456},
{'id': 12, 'price': 234},
{'id': 34, 'price': 456}]
result = [dict(t) for t in set([tuple(d.items()) for d in list_products])]
print(result)
# Output
[{'price': 456, 'id': 34}, {'price': 234, 'id': 12}]
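Note that tuple(d.items()) depends on the order in which keys were inserted, so two equal dictionaries built in a different key order would not be merged. Sorting the items first avoids that. A sketch, assuming the keys are mutually comparable and the values are hashable (the sample data is illustrative):

```python
list_products = [{'id': 12, 'price': 234},
                 {'price': 234, 'id': 12},   # same content, different key order
                 {'id': 34, 'price': 456}]

# Sorting the items gives equal dicts the same tuple regardless of key order
unique_items = {tuple(sorted(d.items())) for d in list_products}

# Sort the tuples as well so the result order is deterministic
result = [dict(t) for t in sorted(unique_items)]
print(result)
```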
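a = [{'id': 124, 'price': 234}, {'id': 125, 'price': 234}, {'id': 1234, 'price': 234}, {'id': 1234, 'price': 234}]
# Dicts cannot be ordered directly in Python 3, so sort by id
a.sort(key=lambda d: d['id'])
# Walk backwards so that deleting an item does not shift the indices still to be visited
for indx in range(len(a) - 1, 0, -1):
    if a[indx]['id'] == a[indx - 1]['id']:
        del a[indx]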
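A dictionary keyed by id does the same job without sorting and keeps the first occurrence of each id in its original position (dictionaries preserve insertion order from Python 3.7 on). A minimal sketch; the sample data and the names unique_by_id and deduplicated are illustrative:

```python
products = [{'id': 124, 'price': 234}, {'id': 1234, 'price': 234},
            {'id': 125, 'price': 99}, {'id': 1234, 'price': 500}]

unique_by_id = {}
for product in products:
    # setdefault stores the product only if this id has not been seen yet
    unique_by_id.setdefault(product['id'], product)

deduplicated = list(unique_by_id.values())
print(deduplicated)
```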

sum of one value in the list of dictionary based on one key in dict

I want to sum one value across a list of dictionaries, grouped by the value of another key.
There is a much easier Stack Overflow question that only sums the total value:
I have a very big list of dictionaries and I want to sum the insides
For example, if we have:
lst = [{'year': 2013, 'snow': 64.8, 'month': 1},
{'year': 2013, 'snow': 66.5, 'month': 2},
{'year': 2013, 'snow': 68.3, 'month': 12},
{'year': 2013, 'snow': 68.8, 'month': 3},
{'year': 2013, 'snow': 70.9, 'month': 11},
{'year': 2012, 'snow': 76.8, 'month': 7},
{'year': 2012, 'snow': 79.6, 'month': 5},
{'year': 1951, 'snow': 86.6, 'month': 12}]
I want the total snowfall for each year, so the output should be:
snowfall = [{'year': 2013, 'totalsnow': 339.3},
            {'year': 2012, 'totalsnow': 156.4},
            {'year': 1951, 'totalsnow': 86.6}]
Here is my code:
for i in range(len(lst)):
    while lst[i]['year']:
        sum(value['snow'] for value in lst)
but it goes wrong and outputs:
582.3000000000001
How do I get it right? Please keep it simple and explain as well. I am new to Python.
Use a dictionary to track snow-per-year; a collections.defaultdict() object is ideal here:
from collections import defaultdict
snowfall = defaultdict(float)
for info in lst:
    snowfall[info['year']] += info['snow']
snowfall = [{'year': year, 'totalsnow': snowfall[year]}
            for year in sorted(snowfall, reverse=True)]
This first creates a defaultdict() object that'll create new float() objects (value 0.0) for keys that don't exist yet. It sums the values per year for you.
The last lines create your desired structure, sorted by year in descending order.
Demo:
>>> from collections import defaultdict
>>> lst = [{'year': 2013, 'snow': 64.8, 'month': 1},
... {'year': 2013, 'snow': 66.5, 'month': 2},
... {'year': 2013, 'snow': 68.3, 'month': 12},
... {'year': 2013, 'snow': 68.8, 'month': 3},
... {'year': 2013, 'snow': 70.9, 'month': 11},
... {'year': 2012, 'snow': 76.8, 'month': 7},
... {'year': 2012, 'snow': 79.6, 'month': 5},
... {'year': 1951, 'snow': 86.6, 'month': 12}]
>>> snowfall = defaultdict(float)
>>> for info in lst:
...     snowfall[info['year']] += info['snow']
...
>>> snowfall = [{'year': year, 'totalsnow': snowfall[year]}
...             for year in sorted(snowfall, reverse=True)]
>>> from pprint import pprint
>>> pprint(snowfall)
[{'totalsnow': 339.30000000000007, 'year': 2013},
{'totalsnow': 156.39999999999998, 'year': 2012},
{'totalsnow': 86.6, 'year': 1951}]
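The long decimals in the demo are ordinary floating-point rounding. If the totals should read as in the question (339.3, 156.4), rounding when building the result is one option. A sketch, assuming one decimal place is enough precision for this data (the name totals is illustrative):

```python
from collections import defaultdict

lst = [{'year': 2013, 'snow': 64.8, 'month': 1},
       {'year': 2013, 'snow': 66.5, 'month': 2},
       {'year': 2013, 'snow': 68.3, 'month': 12},
       {'year': 2013, 'snow': 68.8, 'month': 3},
       {'year': 2013, 'snow': 70.9, 'month': 11},
       {'year': 2012, 'snow': 76.8, 'month': 7},
       {'year': 2012, 'snow': 79.6, 'month': 5},
       {'year': 1951, 'snow': 86.6, 'month': 12}]

# Accumulate the snowfall per year; missing years start at 0.0
totals = defaultdict(float)
for info in lst:
    totals[info['year']] += info['snow']

# Round each total to one decimal place while building the result
snowfall = [{'year': year, 'totalsnow': round(totals[year], 1)}
            for year in sorted(totals, reverse=True)]
print(snowfall)
```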
