Parsing json string into dataframe - python

I have a nested json string as follows:
[{'id': 'tfghnbkivbgdcse',
'authorization': None,
'operation_type': 'in',
'card': {'type': 'debit',
'brand': 'mastercard',
'address': None,
'card_number': '123456XXXXXX7890',
'holder_name': 'aaaa bbbb’,
'expiration_year': '21',
'expiration_month': '11',
'bank_name': 'XXXXBANK',
'bank_code': '000'},
'status': 'failed',
'creation_date': '2018-06-30T23:59:16-05:00',
'error_message': 'Bank authorization is required for this charge',
'order_id': '1743790',
'amount': 2668.0,
'currency': 'USD',
'customer': {'name': 'AAAA',
'last_name': 'BBBB',
'email': 'XXXX_1234#outlook.com',
'phone_number': '1234567890',
'address': None,
'creation_date': '2018-06-30T23:59:17-05:00',
'external_id': None,
'clabe': None},
'fee': {'amount': 0.95, 'tax': 0.152, 'currency': 'USD'}}]
I want to convert this JSON string into a data frame. I have used json_normalize from pandas.io.json, but I am getting an error.

It works if you:
Change all None to "None"
Change 'aaaa bbbb’ to 'aaaa bbbb' (The last character was a different single quote character)
Change all <'> to <">
It's probably just the quote character you need to fix if the JSON data is part of Python code.
In [39]: ord("'")
Out[39]: 39
In [40]: ord("’")
Out[40]: 8217
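For illustration, here is a minimal sketch of that fix, assuming the pasted text is held in a Python string (the variable raw below is hypothetical). ast.literal_eval accepts the single quotes and None that plain json.loads rejects, and json_normalize then flattens the nested dicts. pd.json_normalize needs pandas 1.0+; older versions expose it as pandas.io.json.json_normalize, as in the question.
import ast
import pandas as pd

# Trimmed stand-in for the pasted data; note the curly quote after 'bbbb'.
raw = "[{'id': 'tfghnbkivbgdcse', 'authorization': None, 'card': {'holder_name': 'aaaa bbbb’, 'bank_code': '000'}}]"

cleaned = raw.replace('\u2019', "'")   # swap the curly quote (code point 8217) for a straight one
records = ast.literal_eval(cleaned)    # parses the Python-style literal, including None
df = pd.json_normalize(records)        # nested dicts become dotted columns such as 'card.holder_name'
print(df.columns.tolist())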

Related

if statement in python airtable records - json

These are 3 records from Airtable.
I want to make a for loop in Python (if the value in 'check' is Update2, do something, else do something else).
{'createdTime': '2022-11-09T15:57:28.000Z',
'fields': {'Last Modified': '2022-11-10T00:22:31.000Z',
'Name': 'Daniel',
'Status': 'Todo',
'check': 'update2'},
'id': 'recbvBuBBrgWO98pZ'}
{'createdTime': '2022-11-09T16:58:15.000Z',
'fields': {'Last Modified': '2022-11-10T00:22:32.000Z',
'Name': 'CLaudia',
'Status': 'In progress',
'check': 'update2'},
'id': 'reck3BB7lOVKG0cPI'}
{'createdTime': '2022-11-09T15:57:28.000Z',
'fields': {'Last Modified': '2022-11-10T00:22:32.000Z',
'Name': 'Isabella',
'Status': 'Done',
'check': 'update2'},
'id': 'recveGd8w9ukxLkk9'}
if record['fields']['check'] == 'update2':
    do something
else:
    do something else
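A minimal sketch of that loop, assuming the three records are collected in a Python list named records (a hypothetical name; the Airtable client typically hands back such a list), and using .lower() so that both 'Update2' and 'update2' match:
records = [
    {'id': 'recbvBuBBrgWO98pZ', 'fields': {'Name': 'Daniel', 'Status': 'Todo', 'check': 'update2'}},
    {'id': 'reck3BB7lOVKG0cPI', 'fields': {'Name': 'CLaudia', 'Status': 'In progress', 'check': 'update2'}},
    {'id': 'recveGd8w9ukxLkk9', 'fields': {'Name': 'Isabella', 'Status': 'Done', 'check': 'update2'}},
]

for record in records:
    # .get() avoids a KeyError if a record has no 'check' field
    if record['fields'].get('check', '').lower() == 'update2':
        print(record['fields']['Name'], '-> do something')
    else:
        print(record['fields']['Name'], '-> do something else')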

How to save a json file in python from an api response when the class is a list and the object is not serializable

I have tried to find the answer but I could not find it.
I am looking for a way to save a JSON file to my computer from Python.
I call the API:
configuration = api.Configuration()
configuration.api_key['X-XXXX-Application-ID'] = 'xxxxxxx'
configuration.api_key['X-XXX-Application-Key'] = 'xxxxxxxx1'
## List our parameters as search operators
opts = {
    'title': 'Deutsche Bank',
    'body': 'fraud',
    'language': ['en'],
    'published_at_start': 'NOW-7DAYS',
    'published_at_end': 'NOW',
    'per_page': 1,
    'sort_by': 'relevance'
}

try:
    ## Make a call to the Stories endpoint for stories that meet the criteria of the search operators
    api_response = api_instance.list_stories(**opts)
    ## Print the returned story
    pp(api_response.stories)
except ApiException as e:
    print('Exception when calling DefaultApi->list_stories: %s\n' % e)
I got a response like this:
[{'author': {'avatar_url': None, 'id': 1688440, 'name': 'Pranav Nair'},
'body': 'The law firm will investigate whether the bank or its officials have '
'engaged in securities fraud or unlawful business practices. '
'Industries: Bank Referenced Companies: Deutsche Bank',
'categories': [{'confident': False,
'id': 'IAB11-5',
'level': 2,
'links': {'_self': 'https://,
'parent': 'https://'},
'score': 0.39,
'taxonomy': 'iab-qag'},
{'confident': False,
'id': 'IAB3-12',
'level': 2,
'links': {'_self': 'https://api/v1/classify/taxonomy/iab-qag/IAB3-12',
'score': 0.16,
'taxonomy': 'iab-qag'},
'clusters': [],
'entities': {'body': [{'indices': [[168, 180]],
'links': {'dbpedia': 'http://dbpedia.org/resource/Deutsche_Bank'},
'score': 1.0,
'text': 'Deutsche Bank',
'types': ['Bank',
'Organisation',
'Company',
'Banking',
'Agent']},
{'indices': [[80, 95]],
'links': {'dbpedia': 'http://dbpedia.org/resource/Securities_fraud'},
'score': 1.0,
'text': 'securities fraud',
'types': ['Practice', 'Company']},
'hashtags': ['#DeutscheBank', '#Bank', '#SecuritiesFraud'],
'id': 3004661328,
'keywords': ['Deutsche',
'behalf',
'Bank',
'firm',
'investors',
'Deutsche Bank',
'bank',
'fraud',
'unlawful'],
'language': 'en',
'links': {'canonical': None,
'coverages': '/coverages?story_id=3004661328',
'permalink': 'https://www.snl.com/interactivex/article.aspx?KPLT=7&id=58657069',
'related_stories': '/related_stories?story_id=3004661328'},
'media': [],
'paragraphs_count': 1,
'published_at': datetime.datetime(2020, 5, 19, 16, 8, 5, tzinfo=tzutc()),
'sentences_count': 2,
'sentiment': {'body': {'polarity': 'positive', 'score': 0.599704},
'title': {'polarity': 'neutral', 'score': 0.841333}},
'social_shares_count': {'facebook': [],
'google_plus': [],
'source': {'description': None,
'domain': 'snl.com',
'home_page_url': 'http://www.snl.com/',
'id': 8256,
'links_in_count': None,
'locations': [{'city': 'Charlottesville',
'country': 'US',
'state': 'Virginia'}],
'logo_url': None,
'name': 'SNL Financial',
'scopes': [{'city': None,
'country': 'US',
'level': 'national',
'state': None},
{'city': None,
'country': None,
'level': 'international',
'state': None}],
'title': None},
'summary': {'sentences': ['The law firm will investigate whether the bank or '
'its officials have engaged in securities fraud or '
'unlawful business practices.',
'Industries: Bank Referenced Companies: Deutsche '
'Bank']},
'title': "Law firm to investigate Deutsche Bank's US ops on behalf of "
'investors',
'translations': {'en': None},
'words_count': 26}]
The documentation says "Stories you retrieve from the API are returned as JSON objects by default. These JSON story objects contain 22 top-level fields, whereas a full story object will contain 95 unique data points."
The class is a list. When I tried to save the JSON file I got the error "TypeError: Object of type Story is not JSON serializable".
How can I save a JSON file on my computer?
The response you got is not JSON: JSON uses double quotes, but here it's single quotes. Copy and paste your response into the following link to see the issues: http://json.parser.online.fr/.
If you change it like
[{"author": {"avatar_url": null, "id": 1688440, "name": "Pranav Nair"},
"body": "......
it will work. You can use Python's json module to do it:
import json
json.loads(the_string_you_got_from_the_response)
It should really be the API provider's job to return valid JSON, but to make it work you can JSON-load the result you got.
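If editing the text by hand is not practical, here is a minimal sketch of writing the response straight to disk. It assumes the generated client's Story objects expose a to_dict() method (common for Swagger/OpenAPI-generated clients); if yours does not, the vars() fallback in the encoder below does roughly the same job. The datetime branch handles fields like published_at.
import json
import datetime

def to_serializable(obj):
    # Fallback used by json.dump for anything it cannot encode natively
    if isinstance(obj, datetime.datetime):
        return obj.isoformat()
    if hasattr(obj, 'to_dict'):        # generated API models usually provide this (assumption)
        return obj.to_dict()
    return vars(obj)                   # last resort: the object's attribute dict

# api_response comes from the list_stories call shown in the question
with open('stories.json', 'w', encoding='utf-8') as f:
    json.dump(api_response.stories, f, default=to_serializable, ensure_ascii=False, indent=2)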

How to flatten nested dict formatted '_source' column of csv, into dataframe

I have a csv with 500+ rows where one column "_source" is stored as JSON. I want to extract that into a pandas dataframe, and I need each key to be its own column. The data is a 1 MB JSON file of online social media data (Facebook, Twitter, web-crawled, etc.) that I need to convert so that the dictionaries and their key values become separate columns. There are approximately 528 separate rows of posts/tweets/text, each with many dictionaries nested inside dictionaries. I am attaching a few steps from my Jupyter notebook below to give a more complete understanding. I need to turn all key/value pairs of the nested dictionaries into columns of a dataframe.
Thank you so much this will be a huge help!!!
I have tried changing it to a dataframe by doing this
source = pd.DataFrame.from_dict(source, orient='columns')
And it returns something like this... I thought it might unpack the dictionary but it did not.
#source.head()
#_source
#0 {'sub_organization_id': 'default', 'uid': 'aba...
#1 {'sub_organization_id': 'default', 'uid': 'ab0...
#2 {'sub_organization_id': 'default', 'uid': 'ac0...
below is the shape
#source.shape (528, 1)
Below is what an actual "_source" row looks like stretched out. There are many dictionaries and key:value pairs where each key needs to be its own column. Thanks! The actual links have been altered/scrambled for privacy reasons.
{'sub_organization_id': 'default',
'uid': 'ac0fafe9ba98327f2d0c72ddc365ffb76336czsa13280b',
'project_veid': 'default',
'campaign_id': 'default',
'organization_id': 'default',
'meta': {'rule_matcher': [{'atribs': {'website': 'github.com/res',
'source': 'Explicit',
'version': '1.1',
'type': 'crawl'},
'results': [{'rule_type': 'hashtag',
'rule_tag': 'Far',
'description': None,
'project_veid': 'A7180EA-7078-0C7F-ED5D-86AD7',
'campaign_id': '2A6DA0C-365BB-67DD-B05830920',
'value': '#Far',
'organization_id': None,
'sub_organization_id': None,
'appid': 'ray',
'project_id': 'CDE2F42-5B87-C594-C900E578C',
'rule_id': '1838',
'node_id': None,
'metadata': {'campaign_title': 'AF',
'project_title': 'AF '}}]}],
'render': [{'attribs': {'website': 'github.com/res',
'version': '1.0',
'type': 'Page Render'},
'results': [{'render_status': 'success',
'path': 'https://east.amanaws.com/rays-ime-store/renders/b/b/70f7dffb8b276f2977f8a13415f82c.jpeg',
'image_hash': 'bb7674b8ea3fc05bfd027a19815f82c',
'url': 'https://discooprdapp.com/',
'load_time': 32}]}]},
'norm_attribs': {'website': 'github.com/res',
'version': '1.1',
'type': 'crawl'},
'project_id': 'default',
'system_timestamp': '2019-02-22T19:04:53.569623',
'doc': {'appid': 'subtter',
'links': [],
'response_url': 'https://discooprdapp.com',
'url': 'https://discooprdapp.com/',
'status_code': 200,
'status_msg': 'OK',
'encoding': 'utf-8',
'attrs': {'uid': '2ab8f2651cb32261b911c990a8b'},
'timestamp': '2019-02-22T19:04:53.963',
'crawlid': '7fd95-785-4dd259-fcc-8752f'},
'type': 'crawl',
'norm': {'body': '\n',
'domain': 'discordapp.com',
'author': 'crawl',
'url': 'https://discooprdapp.com',
'timestamp': '2019-02-22T19:04:53.961283+00:00',
'id': '7fc5-685-4dd9-cc-8762f'}}
Before you post, please make sure the actual code works for the data attached. Thanks!
The code below is what I tried, but it did not work; there was a syntax error that I could not figure out.
pd.io.json.json_normalize(source_data.[_source].apply(json.loads))
pd.io.json.json_normalize(source_data.[_source].apply(json.loads))
^
SyntaxError: invalid syntax
Whoever can help me with this will be a saint!
I had to do something like that a while back. Basically I used a function that completely flattens the json to identify the keys that will be turned into columns, then iterated through the flattened keys to reconstruct each row and append it to a "results" dataframe. With the data you provided, it created a row of 52 columns, and looking through it, it looks like every key got its own column. Anything nested, for example 'meta': {'rule_matcher': [{'atribs': {'website': ...}}]}, should then get a column name like meta.rule_matcher.atribs.website, where the '.' denotes the nested keys.
data_source = {'sub_organization_id': 'default',
'uid': 'ac0fafe9ba98327f2d0c72ddc365ffb76336czsa13280b',
'project_veid': 'default',
'campaign_id': 'default',
'organization_id': 'default',
'meta': {'rule_matcher': [{'atribs': {'website': 'github.com/res',
'source': 'Explicit',
'version': '1.1',
'type': 'crawl'},
'results': [{'rule_type': 'hashtag',
'rule_tag': 'Far',
'description': None,
'project_veid': 'A7180EA-7078-0C7F-ED5D-86AD7',
'campaign_id': '2A6DA0C-365BB-67DD-B05830920',
'value': '#Far',
'organization_id': None,
'sub_organization_id': None,
'appid': 'ray',
'project_id': 'CDE2F42-5B87-C594-C900E578C',
'rule_id': '1838',
'node_id': None,
'metadata': {'campaign_title': 'AF',
'project_title': 'AF '}}]}],
'render': [{'attribs': {'website': 'github.com/res',
'version': '1.0',
'type': 'Page Render'},
'results': [{'render_status': 'success',
'path': 'https://east.amanaws.com/rays-ime-store/renders/b/b/70f7dffb8b276f2977f8a13415f82c.jpeg',
'image_hash': 'bb7674b8ea3fc05bfd027a19815f82c',
'url': 'https://discooprdapp.com/',
'load_time': 32}]}]},
'norm_attribs': {'website': 'github.com/res',
'version': '1.1',
'type': 'crawl'},
'project_id': 'default',
'system_timestamp': '2019-02-22T19:04:53.569623',
'doc': {'appid': 'subtter',
'links': [],
'response_url': 'https://discooprdapp.com',
'url': 'https://discooprdapp.com/',
'status_code': 200,
'status_msg': 'OK',
'encoding': 'utf-8',
'attrs': {'uid': '2ab8f2651cb32261b911c990a8b'},
'timestamp': '2019-02-22T19:04:53.963',
'crawlid': '7fd95-785-4dd259-fcc-8752f'},
'type': 'crawl',
'norm': {'body': '\n',
'domain': 'discordapp.com',
'author': 'crawl',
'url': 'https://discooprdapp.com',
'timestamp': '2019-02-22T19:04:53.961283+00:00',
'id': '7fc5-685-4dd9-cc-8762f'}}
Code:
def flatten_json(y):
    out = {}

    def flatten(x, name=''):
        if type(x) is dict:
            for a in x:
                flatten(x[a], name + a + '_')
        elif type(x) is list:
            i = 0
            for a in x:
                flatten(a, name + str(i) + '_')
                i += 1
        else:
            out[name[:-1]] = x

    flatten(y)
    return out

flat = flatten_json(data_source)

import pandas as pd
import re

results = pd.DataFrame()
special_cols = []
columns_list = list(flat.keys())
for item in columns_list:
    try:
        row_idx = re.findall(r'\_(\d+)\_', item)[0]
    except:
        special_cols.append(item)
        continue
    column = re.findall(r'\_\d+\_(.*)', item)[0]
    column = re.sub(r'\_\d+\_', '.', column)
    row_idx = int(row_idx)
    value = flat[item]
    results.loc[row_idx, column] = value

for item in special_cols:
    results[item] = flat[item]
Output:
print (results.to_string())
atribs_website atribs_source atribs_version atribs_type results.rule_type results.rule_tag results.description results.project_veid results.campaign_id results.value results.organization_id results.sub_organization_id results.appid results.project_id results.rule_id results.node_id results.metadata_campaign_title results.metadata_project_title attribs_website attribs_version attribs_type results.render_status results.path results.image_hash results.url results.load_time sub_organization_id uid project_veid campaign_id organization_id norm_attribs_website norm_attribs_version norm_attribs_type project_id system_timestamp doc_appid doc_response_url doc_url doc_status_code doc_status_msg doc_encoding doc_attrs_uid doc_timestamp doc_crawlid type norm_body norm_domain norm_author norm_url norm_timestamp norm_id
0 github.com/res Explicit 1.1 crawl hashtag Far NaN A7180EA-7078-0C7F-ED5D-86AD7 2A6DA0C-365BB-67DD-B05830920 #Far NaN NaN ray CDE2F42-5B87-C594-C900E578C 1838 NaN AF AF github.com/res 1.0 Page Render success https://east.amanaws.com/rays-ime-store/render... bb7674b8ea3fc05bfd027a19815f82c https://discooprdapp.com/ 32.0 default ac0fafe9ba98327f2d0c72ddc365ffb76336czsa13280b default default default github.com/res 1.1 crawl default 2019-02-22T19:04:53.569623 subtter https://discooprdapp.com https://discooprdapp.com/ 200 OK utf-8 2ab8f2651cb32261b911c990a8b 2019-02-22T19:04:53.963 7fd95-785-4dd259-fcc-8752f crawl \n discordapp.com crawl https://discooprdapp.com 2019-02-22T19:04:53.961283+00:00 7fc5-685-4dd9-cc-8762f
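As a side note, the json_normalize attempt from the question fails only because of the stray dot before the bracket. A minimal sketch of the corrected call, assuming source_data is the dataframe read from the csv and its '_source' column holds valid JSON strings (if the column holds Python-style reprs like the sample above, ast.literal_eval would be needed instead of json.loads):
import json
import pandas as pd

# source_data['_source'] is assumed to be a column of JSON strings
flat_df = pd.json_normalize(source_data['_source'].apply(json.loads).tolist())
print(flat_df.columns.tolist())   # nested keys show up as dotted names, e.g. 'doc.attrs.uid'
Unlike the flatten_json approach above, lists such as meta.rule_matcher stay nested inside a single column rather than being expanded into their own columns.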

Python parsing JSON nested data

I am trying to parse this JSON data from the setlist.fm api. I am trying to get all the song names in order from each setlist. I have looked around, but none of the methods described on the internet are working.
Here is the JSON data
{'itemsPerPage': 20,
'page': 1,
'setlist': [{'artist': {'disambiguation': '',
'mbid': 'cc197bad-dc9c-440d-a5b5-d52ba2e14234',
'name': 'Coldplay',
'sortName': 'Coldplay',
'tmid': 806431,
'url': 'https://www.setlist.fm/setlists/coldplay-3d6bde3.html'},
'eventDate': '15-11-2017',
'id': '33e0845d',
'info': 'Last show of the A Head Full of Dreams Tour',
'lastUpdated': '2017-11-23T14:51:05.000+0000',
'sets': {'set': [{'song': [{'cover': {'disambiguation': '',
'mbid': '9dee40b2-25ad-404c-9c9a-139feffd4b57',
'name': 'Maria Callas',
'sortName': 'Callas, Maria',
'url': 'https://www.setlist.fm/setlists/maria-callas-33d6706d.html'},
'name': 'O mio babbino caro',
'tape': True},
{'info': 'extended intro with Charlie '
'Chaplin speech',
'name': 'A Head Full of Dreams'},
{'name': 'Yellow'},
{'name': 'Every Teardrop Is a '
'Waterfall'},
{'name': 'The Scientist'},
{'info': 'with "Oceans" excerpt in '
'intro',
'name': 'God Put a Smile Upon Your '
'Face'},
{'info': 'with Tiësto Remix outro',
'name': 'Paradise'}]},
{'name': 'B-Stage',
'song': [{'name': 'Always in My Head'},
{'name': 'Magic'},
{'info': 'single version',
'name': 'Everglow'}]},
{'name': 'A-Stage',
'song': [{'info': 'with "Army of One" excerpt '
'in intro',
'name': 'Clocks'},
{'info': 'partial',
'name': 'Midnight'},
{'name': 'Charlie Brown'},
{'name': 'Hymn for the Weekend'},
{'info': 'with "Midnight" excerpt in '
'intro',
'name': 'Fix You'},
{'name': 'Viva la Vida'},
{'name': 'Adventure of a Lifetime'},
{'cover': {'disambiguation': '',
'mbid': '3f8a5e5b-c24b-4068-9f1c-afad8829e06b',
'name': 'Soda Stereo',
'sortName': 'Soda Stereo',
'tmid': 1138263,
'url': 'https://www.setlist.fm/setlists/soda-stereo-7bd6d204.html'},
'name': 'De música ligera'}]},
{'name': 'C-Stage',
'song': [{'info': 'extended',
'name': 'Kaleidoscope',
'tape': True},
{'info': 'acoustic',
'name': 'In My Place'},
{'name': 'Amor Argentina'}]},
{'name': 'A-Stage',
'song': [{'cover': {'mbid': '2c82c087-8300-488e-b1e4-0b02b789eb18',
'name': 'The Chainsmokers '
'& Coldplay',
'sortName': 'Chainsmokers, '
'The & '
'Coldplay',
'url': 'https://www.setlist.fm/setlists/the-chainsmokers-and-coldplay-33ce5029.html'},
'name': 'Something Just Like This'},
{'name': 'A Sky Full of Stars'},
{'info': 'Extended Outro; followed by '
'‘Believe In Love’ Tour '
'Conclusion Video',
'name': 'Up&Up'}]}]},
'tour': {'name': 'A Head Full of Dreams'},
'url': 'https://www.setlist.fm/setlist/coldplay/2017/estadio-ciudad-de-la-plata-la-plata-argentina-33e0845d.html',
'venue': {'city': {'coords': {'lat': -34.9313889,
'long': -57.9488889},
'country': {'code': 'AR', 'name': 'Argentina'},
'id': '3432043',
'name': 'La Plata',
'state': 'Buenos Aires',
'stateCode': '01'},
'id': '3d62153',
'name': 'Estadio Ciudad de La Plata',
'url': 'https://www.setlist.fm/venue/estadio-ciudad-de-la-plata-la-plata-argentina-3d62153.html'},
'versionId': '7b4ce6d0'},
{'artist': {'disambiguation': '',
'mbid': 'cc197bad-dc9c-440d-a5b5-d52ba2e14234',
'name': 'Coldplay',
'sortName': 'Coldplay',
'tmid': 806431,
'url': 'https://www.setlist.fm/setlists/coldplay-3d6bde3.html'},
'eventDate': '14-11-2017',
'id': '63e08ec7',
'info': '"Paradise", "Something Just Like This" and "De música '
'ligera" were soundchecked',
'lastUpdated': '2017-11-15T02:40:25.000+0000',
'sets': {'set': [{'song': [{'cover': {'disambiguation': '',
'mbid': '9dee40b2-25ad-404c-9c9a-139feffd4b57',
'name': 'Maria Callas',
'sortName': 'Callas, Maria',
'url': 'https://www.setlist.fm/setlists/maria-callas-33d6706d.html'},
'name': 'O mio babbino caro',
'tape': True},
{'info': 'extended intro with Charlie '
'Chaplin speech',
'name': 'A Head Full of Dreams'},
{'name': 'Yellow'},
{'name': 'Every Teardrop Is a '
'Waterfall'},
{'name': 'The Scientist'},
{'info': 'with "Oceans" excerpt in '
'intro',
'name': 'Birds'},
{'info': 'with Tiësto Remix outro',
'name': 'Paradise'}]},
{'name': 'B-Stage',
'song': [{'name': 'Always in My Head'},
{'name': 'Magic'},
{'info': 'single version; dedicated '
'to the Argentinian victims '
'of the New York terrorist '
'attack',
'name': 'Everglow'}]},
{'name': 'A-Stage',
'song': [{'info': 'with "Army of One" excerpt '
'in intro',
'name': 'Clocks'},
{'info': 'partial',
'name': 'Midnight'},
{'name': 'Charlie Brown'},
{'name': 'Hymn for the Weekend'},
{'info': 'with "Midnight" excerpt in '
'intro',
'name': 'Fix You'},
{'name': 'Viva la Vida'},
{'name': 'Adventure of a Lifetime'},
{'cover': {'disambiguation': '',
'mbid': '3f8a5e5b-c24b-4068-9f1c-afad8829e06b',
'name': 'Soda Stereo',
'sortName': 'Soda Stereo',
'tmid': 1138263,
'url': 'https://www.setlist.fm/setlists/soda-stereo-7bd6d204.html'},
'info': 'Coldplay debut',
'name': 'De música ligera'}]},
{'name': 'C-Stage',
'song': [{'info': 'Part 1: "The Guest House"',
'name': 'Kaleidoscope',
'tape': True},
{'info': 'acoustic; Will on lead '
'vocals',
'name': 'In My Place'},
{'info': 'song made for Argentina',
'name': 'Amor Argentina'},
{'info': 'Part 2: "Amazing Grace"',
'name': 'Kaleidoscope',
'tape': True}]},
{'name': 'A-Stage',
'song': [{'name': 'Life Is Beautiful'},
{'cover': {'mbid': '2c82c087-8300-488e-b1e4-0b02b789eb18',
'name': 'The Chainsmokers '
'& Coldplay',
'sortName': 'Chainsmokers, '
'The & '
'Coldplay',
'url': 'https://www.setlist.fm/setlists/the-chainsmokers-and-coldplay-33ce5029.html'},
'name': 'Something Just Like This'},
{'name': 'A Sky Full of Stars'},
{'name': 'Up&Up'}]}]},
This is part of the JSON I grabbed from the URL.
Below is the code I am trying to use:
import requests
import json
from pprint import *

url = "https://api.setlist.fm/rest/1.0/artist/cc197bad-dc9c-440d-a5b5-d52ba2e14234/setlists?p=1"
headers = {'x-api-key': 'API-KEY',
           'Accept': 'application/json'}
r = requests.get(url, headers=headers)
data = json.loads(r.text)
#pprint(r.json())
response = data['setlist']
#pprint(response)
for item in response:
    pprint(item['sets']['set']['song']['name'])
However I get this error that I cannot resolve nor find any help online with:
pprint(item['sets']['set']['song']['name'])
TypeError: list indices must be integers or slices, not str
Dictionaries (Dict) are accessed by keys.
Lists are accessed by indexes.
i.e.
# Dict get 'item'.
data = {'key': 'item'}
data['key']
# List get 'item0'.
data = ['item0', 'item1']
data[0]
# Dict with List get 'item0'.
data = {'key': ['item0', 'item1']}
data['key'][0]
Both storage types can be nested in JSON, and each needs to be accessed in a different manner. You have nested lists which need to be indexed through, and that can be done with for loops. I have no access to workable JSON data except for the incomplete Python object that you show, so I have not tested my code; there is no assurance that it is correct, but if not, it should still demonstrate how to do the task.
import requests
import json
from pprint import *

url = "https://api.setlist.fm/rest/1.0/artist/cc197bad-dc9c-440d-a5b5-d52ba2e14234/setlists?p=1"
headers = {'x-api-key': 'API-KEY',
           'Accept': 'application/json'}
r = requests.get(url, headers=headers)
data = json.loads(r.text)

result = []
for setlist_item in data['setlist']:
    for set_item in setlist_item['sets']['set']:
        for song_item in set_item['song']:
            result += [song_item['name']]
print(result)
Each for loop processes one level of the nested lists, finally extending result with each song name. Run against the sample above, result should come out as a flat list of song names in order, starting with 'O mio babbino caro', 'A Head Full of Dreams', 'Yellow', and so on.

Python - Extracting information from dictionary in a specific way

I am trying to extract certain values from a dictionary and display them in a specific way. I'll show you my code example below:
dict = [{'Titel': 'Rush', 'Name': 'Floris', 'Starttime': '20:30', 'Email': 'Floris#email.com', 'Supplier': 'RTL8', 'Surname': 'Cake', 'Code': 'ABC123'},
{'Titel': 'Rush', 'Voornaam': 'Jaron', 'Starttime': '20:30', 'Email': 'JaronPie#email.com', 'Supplier': 'RTL8', 'Surname': 'Pie', 'Code': 'XYZ123'},
{'Titel': 'Underneath', 'Name': 'Klaas', 'Starttime': '04:00', 'Email': 'Klassieboy#gmail.com', 'Supplier': 'RTL8', 'Surname': 'Klassie', 'Code': 'fbhwuq8674'}]
That is my dictionary. What I want as output is:
Titel, Starttime,
Surname, Name, Email
So it would look like this:
Rush, 20:30,
Cake, Floris, Floris#email.com
Pie, Jaron, JaronPie#email.com
Underneath, 04:00,
Klassie, Klaas, Klassieboy#gmail.com
I used VKS's code to build it thanks VKS :P
Using groupby, list comprehension and string formatting.
Code:
from itertools import groupby

k = [{'Titel': 'Rush', 'Name': 'Floris', 'Starttime': '20:30', 'Email': 'Floris#email.com', 'Supplier': 'RTL8', 'Surname': 'Cake', 'Code': 'ABC123'},
     {'Titel': 'Rush', 'Voornaam': 'Jaron', 'Starttime': '20:30', 'Email': 'JaronPie#email.com', 'Supplier': 'RTL8', 'Surname': 'Pie', 'Code': 'XYZ123'},
     {'Titel': 'Underneath', 'Name': 'Klaas', 'Starttime': '04:00', 'Email': 'Klassieboy#gmail.com', 'Supplier': 'RTL8', 'Surname': 'Klassie', 'Code': 'fbhwuq8674'}]

lst = [(i["Titel"], i["Starttime"], i["Surname"], i.get("Name", "None"), i["Email"]) for i in k]
lst.sort(key=lambda x: (x[0], x[1]))
for key, groups in groupby(lst, key=lambda x: (x[0], x[1])):
    print("{}\t{}".format(key[0], key[1]))
    for value in groups:
        print("{}\t{}\t{}".format(value[2], value[3], value[4]))
    print("")
Output:
Rush 20:30
Cake Floris Floris#email.com
Pie None JaronPie#email.com
Underneath 04:00
Klassie Klaas Klassieboy#gmail.com
k = [{'Titel': 'Rush', 'Name': 'Floris', 'Starttime': '20:30', 'Email': 'Floris#email.com', 'Supplier': 'RTL8', 'Surname': 'Cake', 'Code': 'ABC123'},
     {'Titel': 'Rush', 'Voornaam': 'Jaron', 'Starttime': '20:30', 'Email': 'JaronPie#email.com', 'Supplier': 'RTL8', 'Surname': 'Pie', 'Code': 'XYZ123'},
     {'Titel': 'Underneath', 'Name': 'Klaas', 'Starttime': '04:00', 'Email': 'Klassieboy#gmail.com', 'Supplier': 'RTL8', 'Surname': 'Klassie', 'Code': 'fbhwuq8674'}]
print([(i["Titel"], i["Starttime"], i["Surname"], i["Email"]) for i in k])
You can use this and you will get a tuple with all the info you want for each record.
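For the sample data above, that print call should produce something like the following (a sketch, not verified output):
[('Rush', '20:30', 'Cake', 'Floris#email.com'), ('Rush', '20:30', 'Pie', 'JaronPie#email.com'), ('Underneath', '04:00', 'Klassie', 'Klassieboy#gmail.com')]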
def print_my_d(d):
    print(d[0]['Titel'], d[0]['Starttime'])
    for l in d[:2]:
        # the second record stores the first name under 'Voornaam', so fall back to it
        print(l['Surname'], l.get('Name', l.get('Voornaam')), l['Email'])
    print()
    print(d[2]['Titel'], d[2]['Starttime'])
    print("{}, {}, {}".format(d[2]['Surname'], d[2]['Name'], d[2]['Email']))

print_my_d(d)  # d is the list of dicts from the question
Rush 20:30
Cake Floris Floris#email.com
Pie Jaron JaronPie#email.com
Underneath 04:00
Klassie, Klaas, Klassieboy#gmail.com
