DataFrame pop function removing wanted values in nested dictionary - python

I have a DataFrame with a nested dict in one column. I am unpacking the nested values and creating a column for each associated key. When I use the pop function, the keys under pricings are removed, and those are the values I want to keep: '1 color', '2 color', '3 color', '4 color', '5 color', '6 color'.
The nested dict looks like this; the column is named variations:
{'name': 'printing on a DARK shirt',
'pricings': {'1 color': [{'max': 47, 'min': 1, 'price': 100.0},
{'max': 71, 'min': 48, 'price': 40.25},
{'max': 143, 'min': 72, 'price': 2.8},
{'max': 287, 'min': 144, 'price': 2.5}],
'2 color': [{'max': 47, 'min': 1, 'price': 200.0},
{'max': 71, 'min': 48, 'price': 4.25},
{'max': 143, 'min': 72, 'price': 3.8},
{'max': 287, 'min': 144, 'price': 3.5}],
'3 color': [{'max': 47, 'min': 1, 'price': 300.0},
{'max': 71, 'min': 48, 'price': 5.25},
{'max': 143, 'min': 72, 'price': 4.8},
{'max': 287, 'min': 144, 'price': 4.5}],
'4 color': [{'max': 47, 'min': 1, 'price': 400.0},
{'max': 71, 'min': 48, 'price': 6.25},
{'max': 143, 'min': 72, 'price': 5.8},
{'max': 287, 'min': 144, 'price': 5.5}],
'5 color': [{'max': 47, 'min': 1, 'price': 500.0},
{'max': 71, 'min': 48, 'price': 7.5},
{'max': 143, 'min': 72, 'price': 7.0},
{'max': 287, 'min': 144, 'price': 6.6}],
'6 color': [{'max': 47, 'min': 1, 'price': 600.0},
{'max': 71, 'min': 48, 'price': 8.5},
{'max': 143, 'min': 72, 'price': 8.0},
{'max': 287, 'min': 144, 'price': 7.6}]}}
The code I'm using looks like this
df2 = (pd.concat({i: pd.DataFrame(x) for i, x in df1.pop('variations').items()})
.reset_index(level=1, drop=True)
.join(df1 , how='left', lsuffix='_left', rsuffix='_right')
.reset_index(drop=True))
The output is as follows, with the new column name pricing added.
[{'max': 47, 'min': 1, 'price': 20.0},
{'max': 71, 'min': 48, 'price': 4.25},
{'max': 143, 'min': 72, 'price': 3.8},
{'max': 287, 'min': 144, 'price': 3.5}]
In case it's not clear: in the resulting DataFrame the color labels for the ranges ('1 color', '2 color', '3 color', '4 color', '5 color', '6 color') have fallen off. They are important and are the portion I want most. To be clear, the colors have not been turned into their own columns either.
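One way to keep the color labels is to walk the pricings dict explicitly and carry each key along as a value. The following is only a minimal sketch, assuming every entry in variations has the shape shown above; `variation` is a trimmed, hypothetical sample and `flatten_pricings` is an illustrative helper name, not part of pandas.

```python
# Hypothetical sample shaped like the 'variations' entry in the question
# (trimmed to two colors and two quantity tiers each).
variation = {
    "name": "printing on a DARK shirt",
    "pricings": {
        "1 color": [{"max": 47, "min": 1, "price": 100.0},
                    {"max": 71, "min": 48, "price": 40.25}],
        "2 color": [{"max": 47, "min": 1, "price": 200.0},
                    {"max": 71, "min": 48, "price": 4.25}],
    },
}


def flatten_pricings(variation):
    """Return one flat row per (color, quantity tier), keeping the color label."""
    rows = []
    for color, tiers in variation["pricings"].items():
        for tier in tiers:
            # Copy the dict key into the row so it is not lost when flattening.
            rows.append({"name": variation["name"], "color": color, **tier})
    return rows


rows = flatten_pricings(variation)
for row in rows:
    print(row)
```

Passing `rows` to `pd.DataFrame(rows)` would then give one DataFrame row per tier, with a color column next to max, min and price.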

Related

How to Iterate through an array of dictionaries to copy only relevant keys to new dictionary?

I want to iterate through a list of dictionaries like the following and copy only the 'symbol' and 'product_progress' keys and their corresponding values into a new list of dictionaries.
[{'coin_name': 'Bitcoin', 'coin_id': 'bitcoin', 'symbol': 'btc', 'rank': 1, 'product_progress': 93, 'team': 100, 'token_fundamentals': 100, 'github_activity': 95, 'marketing': 5, 'partnership': 5, 'uniqueness': 5, 'total_score': 96, 'exchange_name': 'Bitfinex', 'exchange_link': 'https://www.bitfinex.com/t/BTCUSD', 'website': 'https://bitcoin.org/en/', 'twitter': 'https://twitter.com/Bitcoin', 'telegram': None, 'whitepaper': 'https://bitcoin.org/en/bitcoin-paper'}, {'coin_name': 'Ethereum', 'coin_id': 'ethereum', 'symbol': 'eth', 'rank': 2, 'product_progress': 87, 'team': 98, 'token_fundamentals': 97, 'github_activity': 100, 'marketing': 5, 'partnership': 5, 'uniqueness': 5, 'total_score': 94, 'exchange_name': 'Gemini', 'exchange_link': 'https://gemini.com/', 'website': 'https://www.ethereum.org/', 'twitter': 'https://twitter.com/ethereum', 'telegram': None, 'whitepaper': 'https://ethereum.org/en/whitepaper/'}] ...
The code I have so far is:
# need to iterate through list of dictionaries
for index in range(len(projectlist3)):
    for key in projectlist3[index]:
        d['symbol'] = projectlist3[index]['symbol']
        d['token_fundamentals'] = projectlist3[index]['token_fundamentals']
print(d)
It's just saving the last entry rather than all of the entries: {'symbol': 'eth', 'token_fundamentals': 97}
Given your data:
l = [{
'coin_name': 'Bitcoin',
'coin_id': 'bitcoin',
'symbol': 'btc',
'rank': 1,
'product_progress': 93,
'team': 100,
'token_fundamentals': 100,
'github_activity': 95,
'marketing': 5,
'partnership': 5,
'uniqueness': 5,
'total_score': 96,
'exchange_name': 'Bitfinex',
'exchange_link': 'https://www.bitfinex.com/t/BTCUSD',
'website': 'https://bitcoin.org/en/',
'twitter': 'https://twitter.com/Bitcoin',
'telegram': None,
'whitepaper': 'https://bitcoin.org/en/bitcoin-paper'
}, {
'coin_name': 'Ethereum',
'coin_id': 'ethereum',
'symbol': 'eth',
'rank': 2,
'product_progress': 87,
'team': 98,
'token_fundamentals': 97,
'github_activity': 100,
'marketing': 5,
'partnership': 5,
'uniqueness': 5,
'total_score': 94,
'exchange_name': 'Gemini',
'exchange_link': 'https://gemini.com/',
'website': 'https://www.ethereum.org/',
'twitter': 'https://twitter.com/ethereum',
'telegram': None,
'whitepaper': 'https://ethereum.org/en/whitepaper/'
}]
You can use a list comprehension:
new_l = [{field: d[field] for field in ['symbol', 'token_fundamentals']}
         for d in l]
which is a more concise equivalent of this:
new_l = []
for d in l:
    new_d = {}
    for field in ['symbol', 'token_fundamentals']:
        new_d[field] = d[field]
    new_l.append(new_d)
Judging by what you're writing into d, you want to save a list of dicts, so this would work:
[{"symbol": i['symbol'], "token_fundamentals": i['token_fundamentals']} for i in projectlist3]
Result:
[{'symbol': 'btc', 'token_fundamentals': 100}, {'symbol': 'eth', 'token_fundamentals': 97}]
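The loop in the question keeps only the last entry because it writes into the same dict on every pass, so each project overwrites the previous one. The sketch below contrasts that with building a fresh dict per project; `projects` is a trimmed, hypothetical stand-in for projectlist3.

```python
# Trimmed, hypothetical stand-in for projectlist3 from the question.
projects = [
    {"symbol": "btc", "rank": 1, "token_fundamentals": 100},
    {"symbol": "eth", "rank": 2, "token_fundamentals": 97},
]

# Reusing one dict: every iteration overwrites the same two keys,
# so only the values from the last project survive.
d = {}
for project in projects:
    d["symbol"] = project["symbol"]
    d["token_fundamentals"] = project["token_fundamentals"]
print(d)

# Creating a new dict per project and appending it keeps every entry.
selected = []
for project in projects:
    selected.append({
        "symbol": project["symbol"],
        "token_fundamentals": project["token_fundamentals"],
    })
print(selected)
```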

how to make json_normalize build a dataframe from openweather response

Hi, I'm struggling to extract the data from an OpenWeather response. I am using json_normalize to build the table, but how to construct the call is not clear to me. I managed to split the data into smaller pieces and normalize those, but I wonder if there is a nicer, smoother way of doing it.
'daily': [{'dt': 1612432800, 'sunrise': 1612419552, 'sunset': 1612452288,'temp': {'day': -4.21, 'min': -10.24, 'max': -2.31, 'night': - 10.24, 'eve': -5.11, 'morn': -3.43},
'feels_like': {'day': -10.78, 'night': -13.48, 'eve': -9.52, 'morn': -11.35}, 'pressure': 1010, 'humidity': 96,
'dew_point': -5.84, 'wind_speed': 5.69, 'wind_deg': 13,
'weather': [{'id': 601, 'main': 'Snow', 'description': 'snow', 'icon': '13d'}], 'clouds': 100, 'pop': 1,
'snow': 10.24, 'uvi': 0.89}, {'dt': 1612519200, 'sunrise': 1612505843, 'sunset': 1612538809,
'temp': {'day': -3.7, 'min': -10.24, 'max': -2.6, 'night': -9.09, 'eve': -6.92,'morn': -8.96},
'feels_like': {'day': -8.01, 'night': -13.25, 'eve': -10.96, 'morn': -13.11},
'pressure': 1023, 'humidity': 98, 'dew_point': -4.64, 'wind_speed': 2.59,
'wind_deg': 273, 'weather': [{'id': 802, 'main': 'Clouds', 'description': 'scattered clouds','icon': '03d'}], 'clouds': 29,
'pop': 0.16, 'uvi': 0.91},{'dt': 1612605600, 'sunrise': 1612592132, 'sunset': 1612625330,
'temp': {'day': -8.27, 'min': -15.93, 'max': -7.49, 'night': -15.93, 'eve': -12.8, 'morn': -10.72},
'feels_like': {'day': -12.82, 'night': -20.74, 'eve': -17.38, 'morn': -14.93}, 'pressure': 1024,
'humidity': 92, 'dew_point': -11.71, 'wind_speed': 2.21, 'wind_deg': 32,
'weather': [{'id': 803, 'main': 'Clouds', 'description': 'broken clouds', 'icon': '04d'}], 'clouds': 67,
'pop': 0, 'uvi': 0.86}, {'dt': 1612692000, 'sunrise': 1612678420, 'sunset': 1612711851,
'temp': {'day': -11.72, 'min': -16.93, 'max': -9.81, 'night': -14.36, 'eve': -11.18,'morn': -16.76},
'feels_like': {'day': -17.5, 'night': -20.73, 'eve': -17.09, 'morn': -22},
'pressure': 1023, 'humidity': 94, 'dew_point': -13.77, 'wind_speed': 3.65,
'wind_deg': 81, 'weather': [{'id': 803, 'main': 'Clouds', 'description': 'broken clouds', 'icon': '04d'}], 'clouds': 54, 'pop': 0,'uvi': 0.98}, {'dt': 1612778400, 'sunrise': 1612764705, 'sunset': 1612798372,
'temp': {'day': -12.41, 'min': -15.94, 'max': -8.43, 'night': -11.33,'eve': -9.23, 'morn': -15.94},
'feels_like': {'day': -20.36, 'night': -19.04, 'eve': -17.44,'morn': -22.64}, 'pressure': 1015, 'humidity': 90,
'dew_point': -16.35, 'wind_speed': 6.64, 'wind_deg': 69, 'weather': [{'id': 804, 'main': 'Clouds', 'description': 'overcast clouds', 'icon': '04d'}], 'clouds': 97, 'pop': 0,'uvi': 1.01},{'dt': 1612864800, 'sunrise': 1612850989, 'sunset': 1612884894,'temp': {'day': -13.58, 'min': -14.7, 'max': -11.21, 'night': -11.4, 'eve': -11.26, 'morn': 13.48},'feels_like': {'day': -19.95, 'night': -17.27, 'eve': -17.3, 'morn': -20.35}, 'pressure': 1014, 'humidity': 94,'dew_point': -15.84, 'wind_speed': 4.33, 'wind_deg': 60,'weather': [{'id': 600, 'main': 'Snow', 'description': 'light snow', 'icon': '13d'}], 'clouds': 100,
'pop': 0.73, 'snow': 0.83, 'uvi': 0.98}, {'dt': 1612951200, 'sunrise': 1612937272, 'sunset': 1612971415,
'temp': {'day': -13.58, 'min': -17.87, 'max': -11.37,'night': -17.87, 'eve': -13.19, 'morn': -13.34},
'feels_like': {'day': -19.11, 'night': -23.19, 'eve': -18.44,'morn': -18.75}, 'pressure': 1021, 'humidity': 94,'dew_point': -15.74, 'wind_speed': 3.14, 'wind_deg': 54, 'weather': [{'id': 600, 'main': 'Snow', 'description': 'light snow', 'icon': '13d'}], 'clouds': 82, 'pop': 0.73,'snow': 0.78, 'uvi': 1},{'dt': 1613037600, 'sunrise': 1613023553, 'sunset': 1613057936,
'temp': {'day': -16.26, 'min': -20.28, 'max': -13.32, 'night': -19.55, 'eve': -14.36, 'morn': -19.46},
'feels_like': {'day': -22.12, 'night': -25.23, 'eve': -20, 'morn': -24.97}, 'pressure': 1028, 'humidity': 93,
'dew_point': -18.8, 'wind_speed': 3.41, 'wind_deg': 77,'weather': [{'id': 801, 'main': 'Clouds', 'description': 'few clouds', 'icon': '02d'}], 'clouds': 18, 'pop': 0,'uvi': 1}]}
day = temp_Json['daily']
data_frame_day = pd.json_normalize(day, 'weather', ['dt', 'sunrise', 'sunset', 'pressure', 'humidity', 'dew_point', 'wind_speed','wind_deg', 'clouds', 'pop', 'snow', 'uvi', ['temp', 'day'],['temp', 'min'],['temp', 'max'], ['temp', 'night'], ['temp', 'eve'], ['temp', 'morn'],['feels_like', 'day'], ['feels_like', 'night'], ['feels_like', 'eve'],['feels_like', 'morn']], errors='ignore')
The error is:
Traceback (most recent call last):
File "C:\Users\Jakub\PycharmProjects\Tests\main.py", line 263, in <module>
data_frame_day = pd.json_normalize(day, 'weather',
File "C:\Users\Jakub\PycharmProjects\Tests\venv\lib\site-packages\pandas\io\json\_normalize.py", line 336, in _json_normalize
_recursive_extract(data, record_path, {}, level=0)
File "C:\Users\Jakub\PycharmProjects\Tests\venv\lib\site-packages\pandas\io\json\_normalize.py", line 329, in _recursive_extract
raise KeyError(
KeyError: "Try running with errors='ignore' as key 'snow' is not always present"
This is how I would normalize the records:
df = pd.DataFrame(day)
# the weather column holds a list, so turn each element into its own row;
# reset the index so it lines up with the normalized frames below
df = df.explode('weather').reset_index(drop=True)
# normalize the columns that hold dicts
weather = pd.json_normalize(df['weather']).add_prefix('weather.')
feels_like = pd.json_normalize(df['feels_like']).add_prefix('feels_like.')
temp = pd.json_normalize(df['temp']).add_prefix('temp.')
# join the normalized columns and drop the original nested columns
df_normalized = pd.concat([weather, temp, feels_like, df], axis=1).drop(columns=['weather', 'temp', 'feels_like'])
This will give you the normalized dataframe.
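A shorter variant is also possible, since pd.json_normalize called without record_path flattens nested dicts into dotted column names on its own and fills keys that are missing from some records (such as snow) with NaN. The sketch below assumes that behavior and uses a trimmed, hypothetical two-day sample in place of the full daily list.

```python
import pandas as pd

# Hypothetical two-day sample trimmed from the 'daily' list in the question;
# only the first record has the optional 'snow' key.
day = [
    {"dt": 1612432800, "temp": {"day": -4.21, "min": -10.24},
     "weather": [{"id": 601, "main": "Snow"}], "snow": 10.24},
    {"dt": 1612519200, "temp": {"day": -3.7, "min": -10.24},
     "weather": [{"id": 802, "main": "Clouds"}]},
]

# Without record_path, nested dicts become dotted columns (temp.day, temp.min)
# and the 'weather' list is left untouched in its own column.
flat = pd.json_normalize(day)

# One row per weather entry, then expand each weather dict into columns.
flat = flat.explode("weather").reset_index(drop=True)
weather = pd.json_normalize(flat["weather"].tolist()).add_prefix("weather.")
result = pd.concat([flat.drop(columns="weather"), weather], axis=1)

print(result.columns.tolist())
print(result)
```

This avoids listing every meta field by hand, which is where the KeyError for the optional snow key came from.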

Pandas, create dictionary from df, with one column as replacing another

I have an unknown number of DataFrames.
Two, for example:
date week_score daily_score site_name
0 2014-07-04 100 90 demo 2
1 2014-07-05 80 55 demo 2
2 2015-07-06 70 60 demo 2
date week_score daily_score site_name
0 2014-07-04 85 100 demo 1
1 2014-07-05 50 80 demo 1
2 2015-07-06 45 30 demo 1
I know the DataFrames all have the same shape and column names.
I want to combine them into a list of dictionaries (df.to_dict(orient='records')), but with the site_name as the key, and to do this for every score.
The desired output is a bit tricky:
{'week_score': [{'date': '2014-07-04', 'demo 2': 100, 'demo 1': 85},
{'date': '2014-07-05', 'demo 2': 80, 'demo 1': 50},
{'date': '2014-07-06', 'demo 2': 70, 'demo 1': 45}],
'daily_score': [{'date': '2014-07-04', 'demo 2': 90, 'demo 1': 100},
{'date': '2014-07-05', 'demo 2': 55, 'demo 1': 80},
{'date': '2014-07-06', 'demo 2': 60, 'demo 1': 30}],
}
You can try this code (dfs is the list of DataFrames):
d = dict()
# the score columns are everything between 'date' (first) and 'site_name' (last)
for col in dfs[0].columns[1:-1].tolist():
    new_df = pd.DataFrame({'date': dfs[0]['date']})
    for df in dfs:
        site_name = df['site_name'].unique()[0]
        new_df[site_name] = df[col]
    d[col] = new_df.to_dict('records')
>>>d
output:
{'week_score': [{'date': '2014-07-04', 'demo1': 85, 'demo2': 100},
{'date': '2014-07-05', 'demo1': 50, 'demo2': 80},
{'date': '2015-07-06', 'demo1': 45, 'demo2': 70}],
'daily_score': [{'date': '2014-07-04', 'demo1': 100, 'demo2': 90},
{'date': '2014-07-05', 'demo1': 80, 'demo2': 55},
{'date': '2015-07-06', 'demo1': 30, 'demo2': 60}]}
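The loop above lines the frames up by row position. An alternative sketch, assuming each (date, site_name) pair occurs only once, is to concatenate the frames and pivot on date so that rows are matched by their date value instead. The two sample frames are copied from the question; `combined`, `score_cols` and `result` are illustrative names.

```python
import pandas as pd

# Sample frames copied from the question.
demo2 = pd.DataFrame({
    "date": ["2014-07-04", "2014-07-05", "2015-07-06"],
    "week_score": [100, 80, 70],
    "daily_score": [90, 55, 60],
    "site_name": ["demo 2"] * 3,
})
demo1 = pd.DataFrame({
    "date": ["2014-07-04", "2014-07-05", "2015-07-06"],
    "week_score": [85, 50, 45],
    "daily_score": [100, 80, 30],
    "site_name": ["demo 1"] * 3,
})
dfs = [demo2, demo1]

# Stack all frames, then build one wide table per score column.
combined = pd.concat(dfs, ignore_index=True)
score_cols = [c for c in combined.columns if c not in ("date", "site_name")]

result = {}
for col in score_cols:
    # Rows are dates, columns are site names, cells are the chosen score.
    wide = combined.pivot(index="date", columns="site_name", values=col)
    result[col] = wide.reset_index().to_dict("records")

print(result["week_score"])
```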

change format of python dictionary

I have a python dictionary in this format:
{('first', 'negative'): 57, ('first', 'neutral'): 366, ('first', 'positive'): 249, ('second', 'negative'): 72, ('second', 'neutral'): 158, ('second', 'positive'): 99, ('third', 'negative'): 156, ('third', 'neutral'): 348, ('third', 'positive'): 270}
I want to convert it to:
{'first': [{'sentiment':'negative', 'value': 57}, {'sentiment': 'neutral', 'value': 366}, {'sentiment': 'positive', 'value': 249}], 'second': [{'sentiment':'negative', 'value': 72}, {'sentiment': 'neutral', 'value': 158}, {'sentiment': 'positive', 'value': 99}], 'third': [{'sentiment':'negative', 'value': 156}, {'sentiment': 'neutral', 'value': 348}, {'sentiment': 'positive', 'value': 270}]}
Thanks in advance
This should help.
o = {('first', 'negative'): 57, ('first', 'neutral'): 366, ('first', 'positive'): 249, ('second', 'negative'): 72, ('second', 'neutral'): 158, ('second', 'positive'): 99, ('third', 'negative'): 156, ('third', 'neutral'): 348, ('third', 'positive'): 270}
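d = {}
for k, v in o.items():  # iterate over your dict; each key k is a (label, sentiment) tuple
    if k[0] not in d:
        d[k[0]] = [{"sentiment": k[1], "value": v}]
    else:
        d[k[0]].append({"sentiment": k[1], "value": v})
print(d)
Output (key order may differ depending on the Python version):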
{'second': [{'value': 72, 'sentiment': 'negative'}, {'value': 99, 'sentiment': 'positive'}, {'value': 158, 'sentiment': 'neutral'}], 'third': [{'value': 156, 'sentiment': 'negative'}, {'value': 348, 'sentiment': 'neutral'}, {'value': 270, 'sentiment': 'positive'}], 'first': [{'value': 57, 'sentiment': 'negative'}, {'value': 366, 'sentiment': 'neutral'}, {'value': 249, 'sentiment': 'positive'}]}
from collections import defaultdict
out = defaultdict(list)
# input_dict is the original dict keyed by (label, sentiment) tuples
for (label, sentiment), value in input_dict.items():
    out[label].append(dict(sentiment=sentiment, value=value))
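For reference, the defaultdict approach can be run end to end as in the sketch below. It assumes Python 3.7 or later, where dicts preserve insertion order, so the groups come out in the same order as in the question's desired output. `counts` is a trimmed, hypothetical stand-in for the full input dict.

```python
from collections import defaultdict

# Trimmed, hypothetical stand-in for the dict in the question.
counts = {
    ("first", "negative"): 57,
    ("first", "neutral"): 366,
    ("second", "negative"): 72,
    ("second", "neutral"): 158,
}

grouped = defaultdict(list)
for (label, sentiment), value in counts.items():
    # Unpack the tuple key and file each entry under its label.
    grouped[label].append({"sentiment": sentiment, "value": value})

# Convert back to a plain dict so it prints and serializes like one.
grouped = dict(grouped)
print(grouped)
```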

Converting data from Json to String in python

Could anybody please explain how to convert the following JSON data into a string in Python? It's very big, but I need your help...
You can see it at the following link: http://api.openweathermap.org/data/2.5/forecast/daily?q=delhi&mode=json&units=metric&cnt=7&appid=146f5f89c18a703450d3bd6737d4fc94
Please suggest a solution; it is important for my project :-)
You can decode a JSON string in Python like this:
import json
data = json.loads(json_string)  # json_string is a str containing the JSON text
Source: https://docs.python.org/2/library/json.html
import requests
url = 'http://api.openweathermap.org/data/2.5/forecast/daily?q=delhi&mode=json&units=metric&cnt=7&appid=146f5f89c18a703450d3bd6737d4fc94'
response = requests.get(url)
response.text  # the raw response body as a string
response.json()  # the body parsed into a Python dictionary
s = "The City is {city[name]} today's HIGH is {list[0][temp][max]}".format(**response.json())
print(s)
Some simple code that will read the JSON from your page and produce a Python dictionary follows. I have used the implicit concatenation of adjacent strings to improve the layout of the code.
import json
import urllib.request
f = urllib.request.urlopen(
    url="http://api.openweathermap.org/data/2.5/forecast/daily?"
        "q=delhi&mode=json&units=metric&"
        "cnt=7&appid=146f5f89c18a703450d3bd6737d4fc94")
content = f.read()
result = json.loads(content.decode("utf-8"))
print(result)
This gives me the following output (which I have not shown in code style as it would appear in a single long line):
{'city': {'coord': {'lat': 28.666668, 'lon': 77.216667}, 'country': 'IN', 'id': 1273294, 'population': 0, 'name': 'Delhi'}, 'cnt': 7, 'message': 0.0081, 'list': [{'dt': 1467093600, 'weather': [{'icon': '01n', 'id': 800, 'description': 'clear sky', 'main': 'Clear'}], 'humidity': 82, 'clouds': 0, 'pressure': 987.37, 'speed': 2.63, 'temp': {'max': 32, 'eve': 32, 'night': 30.67, 'min': 30.67, 'day': 32, 'morn': 32}, 'deg': 104}, {'dt': 1467180000, 'weather': [{'icon': '10d', 'id': 501, 'description': 'moderate rain', 'main': 'Rain'}], 'humidity': 74, 'clouds': 12, 'pressure': 989.2, 'speed': 4.17, 'rain': 9.91, 'temp': {'max': 36.62, 'eve': 36.03, 'night': 31.08, 'min': 29.39, 'day': 35.61, 'morn': 29.39}, 'deg': 126}, {'dt': 1467266400, 'weather': [{'icon': '02d', 'id': 801, 'description': 'few clouds', 'main': 'Clouds'}], 'humidity': 71, 'clouds': 12, 'pressure': 986.56, 'speed': 3.91, 'temp': {'max': 36.27, 'eve': 35.19, 'night': 30.87, 'min': 29.04, 'day': 35.46, 'morn': 29.04}, 'deg': 109}, {'dt': 1467352800, 'weather': [{'icon': '10d', 'id': 502, 'description': 'heavy intensity rain', 'main': 'Rain'}], 'humidity': 100, 'clouds': 48, 'pressure': 984.48, 'speed': 0, 'rain': 18.47, 'temp': {'max': 30.87, 'eve': 30.87, 'night': 28.24, 'min': 24.96, 'day': 27.16, 'morn': 24.96}, 'deg': 0}, {'dt': 1467439200, 'weather': [{'icon': '10d', 'id': 501, 'description': 'moderate rain', 'main': 'Rain'}], 'humidity': 0, 'clouds': 17, 'pressure': 983.1, 'speed': 6.54, 'rain': 5.31, 'temp': {'max': 35.48, 'eve': 32.96, 'night': 27.82, 'min': 27.82, 'day': 35.48, 'morn': 29.83}, 'deg': 121}, {'dt': 1467525600, 'weather': [{'icon': '10d', 'id': 501, 'description': 'moderate rain', 'main': 'Rain'}], 'humidity': 0, 'clouds': 19, 'pressure': 984.27, 'speed': 3.17, 'rain': 7.54, 'temp': {'max': 34.11, 'eve': 34.11, 'night': 27.88, 'min': 27.53, 'day': 33.77, 'morn': 27.53}, 'deg': 133}, {'dt': 1467612000, 'weather': [{'icon': '10d', 'id': 503, 'description': 'very heavy rain', 'main': 
'Rain'}], 'humidity': 0, 'clouds': 60, 'pressure': 984.82, 'speed': 5.28, 'rain': 54.7, 'temp': {'max': 33.12, 'eve': 33.12, 'night': 26.15, 'min': 25.78, 'day': 31.91, 'morn': 25.78}, 'deg': 88}], 'cod': '200'}
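The answers above go from JSON text to a Python dictionary. Going the other way, from a dictionary back to a string, is done with json.dumps. The sketch below makes no network call; `raw` is a trimmed, hypothetical stand-in for the response body.

```python
import json

# Trimmed, hypothetical stand-in for the API response body (JSON text).
raw = ('{"city": {"name": "Delhi", "country": "IN"}, "cnt": 7, '
       '"list": [{"temp": {"max": 32, "min": 30.67}}]}')

data = json.loads(raw)                 # JSON text -> Python dict
high = data["list"][0]["temp"]["max"]  # navigate the dict like any other

compact = json.dumps(data)                           # dict -> one-line JSON string
pretty = json.dumps(data, indent=2, sort_keys=True)  # dict -> readable JSON string

print(type(compact).__name__, high)
print(pretty)
```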
