Converting from json to dataframe to sql - python

I'm trying to save all the JSON data to a SQL database. I'm using Python, so I decided to use pandas.
Part of the JSON:
{
"stores": [
{
"ID": "123456",
"name": "Store 1",
"status": "Active",
"date": "2019-03-28T15:20:00Z",
"tagIDs": null,
"location": {
"cityID": 2,
"countryID": 4,
"geoLocation": {
"latitude": 1.13121,
"longitude": 103.4324231
},
"postcode": "123456",
"address": ""
},
"new": false
},
{
"ID": "223456",
"name": "Store 2",
"status": "Active",
"date": "2020-03-28T15:20:00Z",
"tagIDs": [
12,
35
],
"location": {
"cityID": 21,
"countryID": 5,
"geoLocation": {
"latitude": 1.12512,
"longitude": 103.23342
},
"postcode": "223456",
"address": ""
},
"new": true
}
]
}
My Code:
response = requests.get(.....)
result = response.text
data = json.loads(result)
df = pd.json_normalize(data["stores"])
.....
db_connection = sqlalchemy.create_engine(.....)
df.to_sql(con=db_connection, name="store", if_exists="append" )
Error: _mysql_connector.MySQLInterfaceError: Python type list cannot be converted
How I want the dataframe to look:
       ID      tagIDs                  date
0  123456          []  2020-04-23T09:32:26Z
1  223456     [12,35]  2019-05-24T03:21:39Z
2  323456  [709,1493]  2019-03-28T15:38:39Z
I tried using different dataframes & json objects so far and they all work.
So I discovered the issue is with the json object.
Without the "tagIDs", everything else works fine.
I was thinking that if I converted the object to a string it could be written to SQL, but that didn't work either. How do I change tagIDs so that I can write everything to SQL? Or is there a more efficient way to do this?

I think the problem is that the tagIDs field holds a Python list, and the MySQL connector cannot convert a list into a column value (that is what the error message says).
I'm not sure this is the best way, but you can try converting it from a list to a string:
df['tagIDs'] = df['tagIDs'].apply(lambda x: str(x))
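A variation on the same idea, as a minimal sketch: serialize the lists with json.dumps instead of str, so the stored text can be turned back into a list with json.loads later. This is an assumption-laden example, not the asker's setup: it uses an in-memory SQLite database in place of MySQL, a trimmed copy of the sample data, and a hypothetical helper named tags_to_json that maps a missing tagIDs value to an empty list.

```python
import json
import sqlite3

import pandas as pd

# Trimmed copy of the sample data from the question.
data = {
    "stores": [
        {"ID": "123456", "date": "2019-03-28T15:20:00Z", "tagIDs": None},
        {"ID": "223456", "date": "2020-03-28T15:20:00Z", "tagIDs": [12, 35]},
    ]
}

df = pd.json_normalize(data["stores"])


def tags_to_json(value):
    """Hypothetical helper: store a list as JSON text, treating null as []."""
    return json.dumps(value if isinstance(value, list) else [])


df["tagIDs"] = df["tagIDs"].apply(tags_to_json)

# Assumption: SQLite stands in for the MySQL engine used in the question.
conn = sqlite3.connect(":memory:")
df[["ID", "tagIDs", "date"]].to_sql("store", conn, if_exists="append", index=False)

# Reading the rows back and decoding the JSON text restores the lists.
rows = conn.execute("SELECT ID, tagIDs FROM store ORDER BY ID").fetchall()
restored = [(store_id, json.loads(tags)) for store_id, tags in rows]
print(restored)
```

Storing JSON text keeps the round trip lossless, whereas str() produces a Python repr that is less convenient to parse back.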

Related

Getting the specified data from the json format using python?

I have JSON data like this,
and I want to extract specific fields from it based on the condition given below:
{
"ok": true,
"members": [
{
"id": "W012A3CDE",
"team_id": "T012AB3C4",
"name": "spengler",
"deleted": false,
"color": "9f69e7",
"real_name": "spengler",
"tz": "America/Los_Angeles",
"tz_label": "Pacific Daylight Time",
"tz_offset": -25200,
"profile": {
"avatar_hash": "ge3b51ca72de",
"status_text": "Print is dead",
"status_emoji": ":books:"
},
{
"id": "W07QCRPA4",
"team_id": "T0G9PQBBK",
"name": "glinda",
"deleted": false,
"color": "9f69e7",
"real_name": "Glinda Southgood",
"tz": "America/Los_Angeles",
"tz_label": "Pacific Daylight Time",
"tz_offset": -25200,
"profile": {
"phone": "",
"skype": "",
"real_name": "Glinda Southgood",
"real_name_normalized": "Glinda Southgood",
"display_name": "Glinda the Fairly Good",
"display_name_normalized": "Glinda the Fairly Good",
"email": "glenda#south.oz.coven"
}
}
],
"cache_ts": 1498777272,
"response_metadata": {
"next_cursor": "dXNlcjpVMEc5V0ZYTlo="
}
}
Now I want to get only the name and id of the members where deleted is false, using Python. Can anyone help me?
Just a simple list comprehension should be enough assuming you've managed to load the json into a dict.
response = json.loads(<your_json_string_here>)
undeleted_members = [dict(id=member['id'], name=member['name']) for member in response['members'] if not member['deleted']]
print(undeleted_members)
This returns a list of dicts with just the ID and name, for example:
[{'id': 'W012A3CDE', 'name': 'spengler'}, {'id': 'W07QCRPA4', 'name': 'glinda'}]
Or if you want separate lists for IDs & names:
all_ids = [member['id'] for member in response['members'] if not member['deleted']]
all_names = [member['name'] for member in response['members'] if not member['deleted']]
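For reference, here is a self-contained sketch of the same filter that can be run as-is. The payload is a trimmed version of the question's data, and the third member ("former_user", marked deleted) is a made-up entry added only to show the filter dropping something. The function name active_members is also just illustrative.

```python
import json

# Trimmed sample; the last member is hypothetical and exists only to be filtered out.
payload = """
{
  "ok": true,
  "members": [
    {"id": "W012A3CDE", "name": "spengler", "deleted": false},
    {"id": "W07QCRPA4", "name": "glinda", "deleted": false},
    {"id": "W0EXAMPLE", "name": "former_user", "deleted": true}
  ]
}
"""


def active_members(raw_json):
    """Return id/name pairs for every member that is not marked deleted."""
    response = json.loads(raw_json)
    return [
        {"id": member["id"], "name": member["name"]}
        for member in response.get("members", [])
        # A member without a "deleted" key is treated as not deleted.
        if not member.get("deleted", False)
    ]


members = active_members(payload)
for member in members:
    print(member["id"], member["name"])
```

Using .get("deleted", False) is a design choice for this sketch: it avoids a KeyError if some records omit the field.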

I need help figuring out how to turn online data into a usable list that I can print data from

In a program I am working on, I use ACRCloud's music fingerprinting service. After uploading the data I need identified, I am given back this piece of data:
re = ACRCloudRecognizer(config)
data = (re.recognize_by_file('audio_name.mp3', 0))
>>>data
'{"metadata":{"timestamp_utc":"2020-05-18 23:00:59","music":[{"label":"NoCopyrightSounds","play_offset_ms":125620,"duration_ms":326609,"external_ids":{},"artists":[{"name":"Culture Code & Regoton"}],"result_from":1,"acrid":"a53ea40c6a8b4a6795ac3d799f6a4aec","title":"Waking Up","genres":[{"name":"Electro"}],"album":{"name":"Waking Up"},"score":100,"external_metadata":{},"release_date":"2014-05-25"}]},"cost_time":5.5099999904633,"status":{"msg":"Success","version":"1.0","code":0},"result_type":0}\n'
I think it's a list, but I am unable to figure out how to navigate it or grab specific information from it. I'm unsure how they set up the information, and what patterns to look for. Ideally, I would like to create a print function that would print the title, artists, and album.
Any help is much appreciated!
Formatting the JSON makes it more legible:
{
"metadata": {
"timestamp_utc": "2020-05-18 23:00:59",
"music": [
{
"label": "NoCopyrightSounds",
"play_offset_ms": 125620,
"duration_ms": 326609,
"external_ids": {},
"artists": [
{
"name": "Culture Code & Regoton"
}
],
"result_from": 1,
"acrid": "a53ea40c6a8b4a6795ac3d799f6a4aec",
"title": "Waking Up",
"genres": [
{
"name": "Electro"
}
],
"album": {
"name": "Waking Up"
},
"score": 100,
"external_metadata": {},
"release_date": "2014-05-25"
}
]
},
"cost_time": 5.5099999904633,
"status": {
"msg": "Success",
"version": "1.0",
"code": 0
},
"result_type": 0
}
Looks like you're looking for .metadata.music[0].title (presumably; note that music is a list), but only if .status.code is 0.
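In Python terms, the value returned by the recognizer is a JSON string, so it has to be parsed with json.loads before it can be navigated. The following is a rough sketch under that assumption; the raw string is a shortened copy of the response in the question, and describe_tracks is a hypothetical helper name.

```python
import json

# Shortened copy of the response string shown in the question.
raw = (
    '{"metadata":{"music":[{"artists":[{"name":"Culture Code & Regoton"}],'
    '"title":"Waking Up","album":{"name":"Waking Up"}}]},'
    '"status":{"msg":"Success","version":"1.0","code":0}}\n'
)


def describe_tracks(raw_json):
    """Return one summary line per recognized track, or [] on a non-zero status."""
    result = json.loads(raw_json)
    if result.get("status", {}).get("code") != 0:
        return []
    lines = []
    for track in result.get("metadata", {}).get("music", []):
        # "artists" is a list of {"name": ...} dicts; join the names.
        artists = ", ".join(artist["name"] for artist in track.get("artists", []))
        album = track.get("album", {}).get("name", "")
        lines.append(f"{track.get('title', '')} by {artists} (album: {album})")
    return lines


track_lines = describe_tracks(raw)
for line in track_lines:
    print(line)
```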

Printing each instance of a single line item from a JSON using python

Does anyone know how to print multiple instances of the same field from a JSON output?
The output I wish to decipher looks similar to:
[
{
"project": {
"id": 6514847,
"name": "Trial_1",
"code": "123",
"created_at": "2014-10-08T04:22:14Z",
"updated_at": "2017-04-11T00:32:43Z",
"starts_on": "2014-10-08"
}
},
{
"project": {
"id": 6514864,
"name": "Trial_2",
"code": "456",
"created_at": "2014-10-08T04:26:39Z",
"updated_at": "2017-04-11T00:32:46Z",
"starts_on": "2014-10-08"
}
},
{
"project": {
"id": 12502453,
"name": "Trial_3",
"code": "789",
"created_at": "2016-12-08T05:14:38Z",
"updated_at": "2017-04-11T00:32:38Z",
"starts_on": "2016-12-08"
}
}
]
This output came from a requests.get() call.
I know I can print a single instance of this using
req = requests.get(url, headers=headers)
read_req = req.json()
trial = read_req[0]['project']['code']
print(trial)  # 123
The final product I wish to see is linking each Project Name to its relevant Project Code.
You have a list of dicts of dicts. To iterate over each "project" dict you just use a for loop.
for entry in read_req:
    trial = entry['project']['code']
    print(trial)
In this case, each time through the loop entry will be a dictionary containing the "project" key.
You need a for loop.
read_req = req.json()
for project in read_req:
    print(project['project']['code'])
This should work for you:
# assuming jsontxt holds the parsed input data
for entry in jsontxt:
    print(entry['project']['name'], entry['project']['code'])
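Since the goal is to link each project name to its code, a dict comprehension is one way to keep the pairs around after the loop. This is only a sketch: read_req here is a hard-coded, trimmed copy of the sample response rather than the result of a real request, and it assumes project names are unique.

```python
# Trimmed copy of the sample response; in the question this comes from req.json().
read_req = [
    {"project": {"id": 6514847, "name": "Trial_1", "code": "123"}},
    {"project": {"id": 6514864, "name": "Trial_2", "code": "456"}},
    {"project": {"id": 12502453, "name": "Trial_3", "code": "789"}},
]

# Map each project name to its code (later duplicates would overwrite earlier ones).
codes_by_name = {
    entry["project"]["name"]: entry["project"]["code"] for entry in read_req
}

for name, code in codes_by_name.items():
    print(f"{name}: {code}")
```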

Flask python json parsing

Hello, I am completely new to Flask and Python. I am using an API to geocode
and I get a JSON response, which is
{
"info": {
"copyright": {
"imageAltText": "\u00a9 2015 MapQuest, Inc.",
"imageUrl": "http://api.mqcdn.com/res/mqlogo.gif",
"text": "\u00a9 2015 MapQuest, Inc."
},
"messages": [],
"statuscode": 0
},
"options": {
"ignoreLatLngInput": false,
"maxResults": -1,
"thumbMaps": true
},
"results": [
{
"locations": [
{
"adminArea1": "US",
"adminArea1Type": "Country",
"adminArea3": "",
"adminArea3Type": "",
"adminArea4": "",
"adminArea4Type": "County",
"adminArea5": "",
"adminArea5Type": "City",
"adminArea6": "",
"adminArea6Type": "Neighborhood",
"displayLatLng": {
"lat": 33.663512,
"lng": -111.958849
},
"dragPoint": false,
"geocodeQuality": "ADDRESS",
"geocodeQualityCode": "L1AAA",
"latLng": {
"lat": 33.663512,
"lng": -111.958849
},
"linkId": "25438895i35930428r65831359",
"mapUrl": "http://www.mapquestapi.com/staticmap/v4/getmap?key=&rand=1009123942",
"postalCode": "",
"sideOfStreet": "R",
"street": "",
"type": "s",
"unknownInput": ""
}
],
"providedLocation": {
"city": " ",
"postalCode": "",
"state": "",
"street": "E Blvd"
}
}
]
}
Right now I am doing this
data=json.loads(r)
return jsonify(data)
and this prints all the data as shown above. I need to get the latLng object from locations, which is inside results. I have tried
data.get("results").get("locations") and hundreds of combinations like that, but I still can't get it to work. I basically need to store the lat and lng in a session variable. Any help is appreciated.
Assuming you just have one location as in your example:
from __future__ import print_function
import json
r = ...
data = json.loads(r)
latlng = data['results'][0]['locations'][0]['latLng']
latitude = latlng['lat']
longitude = latlng['lng']
print(latitude, longitude) # 33.663512 -111.958849
data.get("results") returns a list. Since a list object has no get method, data.get("results").get("locations") raises an AttributeError.
Given the JSON you provided, you can index into the list first:
data.get('results')[0].get('locations') # also a list
This gives you the locations list. You can then get the lat and lng like this:
data.get('results')[0].get('locations')[0].get('latLng').get('lat') # lat
data.get('results')[0].get('locations')[0].get('latLng').get('lng') # lng
To summarize my comments:
You can treat data as a dict of dicts and lists.
A quick reference for dict and list, from the official docs on built-in types (stdtypes):
A dictionary's keys are almost arbitrary values.
get(key[, default])
Return the value for key if key is in the dictionary, else default. If default is not given, it defaults to None, so that this method never raises a KeyError.
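The lookups above can be wrapped in a small helper that tolerates an empty results or locations list instead of raising an IndexError. This is a sketch only: first_lat_lng is a hypothetical name, the sample dict is a trimmed version of the response in the question, and storing the values in a Flask session is left out so the example runs without Flask.

```python
def first_lat_lng(data):
    """Return (lat, lng) from the first location of the first result, or None."""
    results = data.get("results") or []
    if not results:
        return None
    locations = results[0].get("locations") or []
    if not locations:
        return None
    lat_lng = locations[0].get("latLng") or {}
    if "lat" not in lat_lng or "lng" not in lat_lng:
        return None
    return lat_lng["lat"], lat_lng["lng"]


# Trimmed copy of the geocoding response from the question.
sample = {
    "results": [
        {"locations": [{"latLng": {"lat": 33.663512, "lng": -111.958849}}]}
    ]
}

coords = first_lat_lng(sample)
print(coords)
```

Returning None for a missing value lets the caller decide what to do before writing anything into a session variable.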

python querying a json objectpath

I have a nested JSON structure and I'm using objectpath (the Python API version), but I don't understand how to select and filter some of the information (more precisely, the nested information in the structure).
For example,
I want to select the "description" of the action "reading" for the user "John".
JSON:
{
"user":
{
"actions":
[
{
"name": "reading",
"description": "blablabla"
}
]
"name": "John"
}
}
CODE:
$.user[#.name is 'John' and #.actions.name is 'reading'].actions.description
but it doesn't work (it returns an empty set, although the matching data is in my JSON).
Any suggestions?
Is this what you are trying to do?
import objectpath
data = {
"user": {
"actions": {
"name": "reading",
"description": "blablabla"
},
"name": "John"
}
}
tree = objectpath.Tree(data)
result = tree.execute("$.user[#.name is 'John'].actions[#.name is 'reading'].description")
for entry in result:
    print(entry)
Output
blablabla
I had to fix your JSON. Also, tree.execute returns a generator. You could replace the for loop with print(next(result)), but the for loop seemed clearer.
from objectpath import Tree
your_json = {"name": "felix", "last_name": "diaz"}
# This JSON path will bring all the key-values of your JSON
your_json_path='$.*'
my_key_values = Tree(your_json).execute(your_json_path)
# If you want to retrieve the name node, specify it.
my_name= Tree(your_json).execute('$.name')
# If you want to retrieve the last_name node, specify it.
last_name= Tree(your_json).execute('$.last_name')
I believe you're just missing a comma in JSON:
{
"user":
{
"actions": [
{
"name": "reading",
"description": "blablabla"
}
],
"name": "John"
}
}
Assuming there is only one "John", with only one "reading" activity, the following query works:
$.user[#.name is 'John'].actions[0][#.name is 'reading'][0].description
If there could be multiple "John"s, with multiple "reading" activities, the following query will almost work:
$.user.*[#.name is 'John'].actions..*[#.name is 'reading'].description
I say almost because the use of .. will be problematic if there are other nested dictionaries with "name" and "description" entries, such as
{
"user": {
"actions": [
{
"name": "reading",
"description": "blablabla",
"nested": {
"name": "reading",
"description": "broken"
}
}
],
"name": "John"
}
}
To get a correct query, there is an open issue to correctly implement queries into arrays: https://github.com/adriank/ObjectPath/issues/60
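If ObjectPath is not a hard requirement, the same selection can be written in plain Python, which sidesteps the `..` recursion problem because only the top-level entries of "actions" are inspected. To be clear, this is a swapped-in technique, not an ObjectPath query; action_descriptions is a hypothetical helper, and the data is the nested example from above.

```python
# The problematic example from above: the nested dict must not be matched.
data = {
    "user": {
        "actions": [
            {
                "name": "reading",
                "description": "blablabla",
                "nested": {"name": "reading", "description": "broken"},
            }
        ],
        "name": "John",
    }
}


def action_descriptions(document, user_name, action_name):
    """Return descriptions of the user's top-level actions with the given name."""
    user = document.get("user", {})
    if user.get("name") != user_name:
        return []
    return [
        action["description"]
        for action in user.get("actions", [])
        if action.get("name") == action_name and "description" in action
    ]


descriptions = action_descriptions(data, "John", "reading")
print(descriptions)  # ['blablabla']
```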
