I need to figure out how to structure my data in Python so that, when I call dumps and write the result to a file, I get a structure like the following example:
{
"Prop_a": [
{
"car": "brown",
"color": "yellow",
"engine": [
{
"mod_a": "x1",
"name": [
{
"diesel": "yes",
}
...
As you can see, I have nested elements that need to be expanded. The end goal is to import the data into a database, and I need JSON- or CSV-formatted data to do it.
EDIT: To all: I can easily print a single level of the dict of lists to JSON. What I need assistance with is how to format the nested structure.
EDIT #2:
Since code is being requested...
my_dict = {}
for x in group:
    my_dict[x] = []
    for y in sub_group:
        my_dict[x].append(data_symbol_reference)
Produces an output like:
{
"Prop_a" : [
"car",
"color",
],
...
I need assistance with nesting the dict of lists within the list structure.
See the official Python 3.6 docs: JSON encoder and decoder
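A minimal sketch of one way to get the nested shape from the question, assuming every nested level is a list of dicts. The helper name `build_prop` and the sample values are made up for illustration, since the question does not show where the real data comes from.

```python
import json


def build_prop(car, color, mod, diesel):
    """Return one entry of the "Prop_a" list (hypothetical helper)."""
    return {
        "car": car,
        "color": color,
        # Each nested level is a list that holds dicts, as in the target output.
        "engine": [
            {
                "mod_a": mod,
                "name": [{"diesel": diesel}],
            }
        ],
    }


my_dict = {}
for group_name in ["Prop_a"]:  # assumed group names
    my_dict[group_name] = []
    my_dict[group_name].append(build_prop("brown", "yellow", "x1", "yes"))

text = json.dumps(my_dict, indent=2)
print(text)
```

The point is that json.dumps serializes whatever nesting the Python object already has, so the work is in appending dicts (rather than plain strings) to each list.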
I'm writing a program in Python to use an API that needs to get input from a JSON payload in a really specific way which is shown below. The poid element will contain a different number with each run of the program, the inventories element contains a list of dictionaries that I am trying to send to the API.
[
{
"poid":"22130",
"inventories":
[
{
"item": "SAMPLE-ITEM-1",
"mfgr": "SAMPLE-MANUFACTURER-1",
"quantity": "1",
"condition": "REF"
},
{
"item": "SAMPLE-ITEM-2",
"mfgr": "SAMPLE-MANUFACTURER-2",
"quantity": "3",
"condition": "REF"
}
]
}
]
The data I need to put into the file is stored in a dictionary and a list as shown below. For simplicity of this post, I'm showing what the dictionary and list would look like after another method creates them. I'm not sure if this is the most efficient way of storing this data when I'm having to write it to JSON.
pn_and_mfgr_dict = {'SAMPLE-ITEM-1': 'SAMPLE-MANUFACTURER-1', 'SAMPLE-ITEM-2': 'SAMPLE-MANUFACTURER-2'}
quantities = ["1","3"]
poid = 22130 #this will be different each run
If it makes sense from what I've written above, I need to generate a JSON file that looks like the first code block given the information from the second code block. The item at index 0 in the quantities list corresponds to the first key/value pair in the dictionary, and so on. The "condition" value in the first code block will always be "REF" for my use, but I still need to include it in the final payload that gets sent to the API.
Since the part number and manufacturer dictionary will be a different length with each run, I also need this method to work regardless of how many entries are in the dictionary. The dictionary and the quantities list will always be the same length, though.
I think the best way I can solve this is making a for loop that iterates through the dictionary and puts respective data where it needs to be, then reading the file when the for loop is done and sending it as the payload, but please correct me if there's a better way to do this, like storing everything in variables. I also have no experience with JSON, so I have attempted to use JSON libraries without knowing what I'm doing wrong. I can edit this with my attempts tonight, but I wanted to post this as soon as possible.
Here is one possible solution:
import json
pn_and_mfgr_dict = {
    'SAMPLE-ITEM-1': 'SAMPLE-MANUFACTURER-1',
    'SAMPLE-ITEM-2': 'SAMPLE-MANUFACTURER-2'
}
quantities = ['1', '3']
poid = 22130
payload = {
    'poid': poid,
    'inventories': [{
        'item': item,
        'mfgr': mfgr,
        'quantity': quantity,
        'condition': 'REF'
    } for (item, mfgr), quantity in zip(pn_and_mfgr_dict.items(), quantities)]
}
print(json.dumps(payload, indent=2))
The code above will result in:
{
"poid": 22130,
"inventories": [
{
"item": "SAMPLE-ITEM-1",
"mfgr": "SAMPLE-MANUFACTURER-1",
"quantity": "1",
"condition": "REF"
},
{
"item": "SAMPLE-ITEM-2",
"mfgr": "SAMPLE-MANUFACTURER-2",
"quantity": "3",
"condition": "REF"
}
]
}
Naturally, you can adjust that for multiple poids with something like this:
poids = [22130, 22131, 22132]
for poid in poids:
    # implement here the logic to get items and quantities for
    # each poid
    payload = {
        'poid': poid,
        'inventories': [{
            'item': item,
            'mfgr': mfgr,
            'quantity': quantity,
            'condition': 'REF'
        } for (item, mfgr), quantity in zip(pn_and_mfgr_dict.items(), quantities)]
    }
    print(json.dumps(payload, indent=2))
You will need to change it to use the corresponding items and quantities for each poid; I leave that as a starting point for you to implement.
Your second block is your input, so you could start right away by writing a function that takes those inputs and returns a JSON string.
import json
from typing import Dict, List
def jsonify_data(pn_and_mfgr_dict: Dict, quantities: List, poid: int):
    constructed_data = []  # TODO
    return json.dumps(constructed_data)
Then you could start using the inputs to construct the output data you want. And you already know how to do it:
I think the best way I can solve this is making a for loop that iterates through the dictionary and puts respective data where it needs to be
Yes, that's the way to do it.
Here's my version of the solution:
import json
from typing import Dict, List
def jsonify_data(pn_and_mfgr_dict: Dict, quantities: List, poid: int):
    inventories = [
        {
            'item': item,
            'mfgr': mfgr,
            'quantity': quantity,
            'condition': 'REF',
        } for (item, mfgr), quantity in zip(pn_and_mfgr_dict.items(), quantities)
    ]
    constructed_data = [
        {
            'poid': f'{poid}',
            'inventories': inventories,
        }
    ]
    return json.dumps(constructed_data)
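One detail worth guarding against with this approach: zip() stops at the shorter input, so a length mismatch would silently drop entries. Below is a hedged variant that checks the lengths first and keeps the payload in a variable instead of a file. The name `build_payload` is an assumption for illustration, and the pairing relies on dicts preserving insertion order (Python 3.7+).

```python
import json


def build_payload(pn_and_mfgr, quantities, poid):
    """Build the list-of-one-dict payload shape shown in the question."""
    # zip() truncates to the shorter input, so fail loudly on a mismatch.
    if len(pn_and_mfgr) != len(quantities):
        raise ValueError("items and quantities differ in length")
    inventories = [
        {"item": item, "mfgr": mfgr, "quantity": qty, "condition": "REF"}
        for (item, mfgr), qty in zip(pn_and_mfgr.items(), quantities)
    ]
    # The target payload shows poid as a string, hence str().
    return [{"poid": str(poid), "inventories": inventories}]


payload = build_payload(
    {
        "SAMPLE-ITEM-1": "SAMPLE-MANUFACTURER-1",
        "SAMPLE-ITEM-2": "SAMPLE-MANUFACTURER-2",
    },
    ["1", "3"],
    22130,
)
body = json.dumps(payload)  # string that can be sent as the request body
print(body)
```

Writing the JSON to a file and reading it back is not required; the string in `body` can be handed to whatever HTTP client is in use.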
import json
data = {'inventories': [{'SAMPLE-ITEM-1': 'SAMPLE-MANUFACTURER-1'}, {'SAMPLE-ITEM-2': 'SAMPLE-MANUFACTURER-2'}]}
quantities = ["1", "3"]
poid = 22130
# Add poid to data
data['poid'] = poid
# Add quantities to data
for item in data['inventories']:
    item['quantity'] = quantities.pop(0)
# Serializing json
json_object = json.dumps(data, indent=4)
print(json_object)
I am coding in Python.
I have a carV.json file with content
{"CarValue": "59", "ID": "100043" ...}
{"CarValue": "59", "ID": "100013" ...}
...
How can I sort the file content into
{"CarValue": "59", "ID": "100013" ...}
{"CarValue": "59", "ID": "100043" ...}
...
using the "ID" key to sort?
I tried different methods to read and perform the sort, but always ended up getting errors like "no sort attribute" or "'unicode' object has no attribute 'sort'".
There are several steps:
Read the file using json.load()
Sort the list of objects using list.sort()
Use a key-function to specify the sort field.
Use operator.itemgetter() to extract the field of interest
Write the data with json.dump()
Here's some code to get you started:
import json, operator
s = '''\
[
{"CarValue": "59", "ID": "100043"},
{"CarValue": "59", "ID": "100013"}
]
'''
data = json.loads(s)
data.sort(key=operator.itemgetter('ID'))
print(json.dumps(data, indent=2))
This outputs:
[
{
"CarValue": "59",
"ID": "100013"
},
{
"CarValue": "59",
"ID": "100043"
}
]
For your application, open the input file and use json.load() instead of json.loads(). Likewise, open an output file and use json.dump() instead of json.dumps(). You can drop the indent parameter as well; it is only there to make the output nicely formatted.
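If the file really holds one JSON object per line, as the sample in the question suggests, json.load() on the whole file would fail, and each line has to be parsed separately. A minimal sketch under that assumption; the function name `sort_json_lines` and the temporary file paths are made up for the demonstration.

```python
import json
import operator
import os
import tempfile


def sort_json_lines(src_path, dst_path, key="ID"):
    """Sort a file with one JSON object per line by the given key."""
    with open(src_path) as src:
        # Parse each non-empty line as its own JSON document.
        records = [json.loads(line) for line in src if line.strip()]
    records.sort(key=operator.itemgetter(key))
    with open(dst_path, "w") as dst:
        for record in records:
            dst.write(json.dumps(record) + "\n")
    return records


# Demo with a throwaway input file standing in for carV.json.
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "carV.json")
dst_path = os.path.join(workdir, "carV_sorted.json")
with open(src_path, "w") as f:
    f.write('{"CarValue": "59", "ID": "100043"}\n')
    f.write('{"CarValue": "59", "ID": "100013"}\n')

sorted_records = sort_json_lines(src_path, dst_path)
print([r["ID"] for r in sorted_records])
```

Because the IDs are strings, the sort is lexicographic. That matches numeric order only while all IDs have the same number of digits; otherwise a key such as `lambda r: int(r["ID"])` would be needed.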
A simple and probably faster option for large data is pandas, using pandas.DataFrame.to_json:
>>> import pandas as pd
>>> unsorted = pd.read_json("test.json")
>>> (unsorted.sort_values("ID")).to_json("sorted_test.json")
>>> sorted = unsorted.sort_values("ID")
>>> sorted
CarValue ID
1 59 100013
0 59 100043
>>> sorted.to_json("n.JSON")
I am wondering how I can convert a json list to a dictionary using the two values of the JSON objects as the key/value pair.
The JSON looks like this:
"test": [
{
"name": "default",
"range": "100-1000"
},
{
"name": "bigger",
"range": "1000-10000"
}
]
I basically want the dictionary to use the name as the key and the range as the value. So the dictionary in this case would be {default: 100-1000, bigger: 1000-10000}.
Is that possible?
You can first load the JSON string into a dictionary with json.loads. Next, you can use a dictionary comprehension to post-process it:
from json import loads
{ d['name'] : d['range'] for d in loads(json_string)['test'] }
We then obtain:
>>> { d['name'] : d['range'] for d in loads(json_string)['test'] }
{'bigger': '1000-10000', 'default': '100-1000'}
In case there are two sub-dictionaries with the same name, then the last one will be stored in the result.
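For a self-contained version, here is a sketch that assumes the fragment from the question is wrapped in an outer JSON object (the fragment on its own is not a complete JSON document). The variable names are illustrative.

```python
import json

# The question's fragment, wrapped in braces so that it parses.
json_string = '''
{
  "test": [
    {"name": "default", "range": "100-1000"},
    {"name": "bigger", "range": "1000-10000"}
  ]
}
'''

# Map each entry's "name" to its "range"; later duplicates overwrite earlier ones.
ranges = {d["name"]: d["range"] for d in json.loads(json_string)["test"]}
print(ranges)
```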
I have JSON output as follows:
{
"service": [{
"name": ["Production"],
"id": ["256212"]
}, {
"name": ["Non-Production"],
"id": ["256213"]
}]
}
I wish to find all IDs where the pair contains "Non-Production" as a name.
I was thinking along the lines of running a loop to check, something like this:
data = json.load(urllib2.urlopen(URL))
for key, value in data.iteritems():
    if "Non-Production" in key[value]: print key[value]
However, I can't seem to get the name and ID from the "service" tree, it returns:
if "Non-Production" in key[value]: print key[value]
TypeError: string indices must be integers
Assumptions:
The JSON is in a fixed format; this can't be changed
I do not have root access and am unable to install any additional packages
Essentially, the goal is to obtain a list of IDs of non-production "services" in the most efficient way.
Here you go:
data = {
    "service": [
        {"name": ["Production"],
         "id": ["256212"]
        },
        {"name": ["Non-Production"],
         "id": ["256213"]}
    ]
}
for item in data["service"]:
    if "Non-Production" in item["name"]:
        print(item["id"])
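Since the goal is a list of IDs rather than printed output, the same loop can collect them. This is a small sketch using only built-ins; the function name `ids_for_name` is hypothetical, and the .get() defaults are an assumption to tolerate missing keys.

```python
def ids_for_name(data, wanted):
    """Collect the IDs of every service whose "name" list contains `wanted`."""
    ids = []
    for item in data.get("service", []):
        if wanted in item.get("name", []):
            # "id" is itself a list in this format, so extend rather than append.
            ids.extend(item.get("id", []))
    return ids


data = {
    "service": [
        {"name": ["Production"], "id": ["256212"]},
        {"name": ["Non-Production"], "id": ["256213"]},
    ]
}

non_prod_ids = ids_for_name(data, "Non-Production")
print(non_prod_ids)
```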
Whenever I see JSON, I think about functional programming! Anyone else?!
I think it is a better idea to use functions like concat or flat, filter, reduce, etc.
E.g., a one-liner:
[s.get('id', [0])[0] for s in filter(lambda srv: "Non-Production" in srv.get('name', []), data.get('service', []))]
EDIT:
I updated the code: even if data = {}, the result will be [], an empty ID list.
I've got some JSON from last.fm's API which I've deserialised into a dictionary using simplejson. A quick example of the basic structure is below.
{
"artist": "similar": {
"artist": {
"name": "Blah",
"image": [{
"#text": "URLHERE",
"size": "small"
}, {
"#text": "URLHERE",
"size": "medium"
}, {
"#text": "URLHERE",
"size": "large"
}]
}
}
}
Any ideas how I can access the image URLs of the various sizes?
Thanks,
Jack
Python does not have any problem with # in strings used as dict keys.
>>> import json
>>> j = '{"#foo": 6}'
>>> print json.loads(j)
{u'#foo': 6}
>>> print json.loads(j)[u'#foo']
6
>>> print json.loads(j)['#foo']
6
There are, however, problems with the JSON you post. For one, it isn't valid (perhaps you're missing a couple commas?). For two, you have a JSON object with the same key "image" three times, which cannot coexist and do anything useful.
In JavaScript, these two syntaxes are equivalent:
o.foo
o['foo']
In Python they are not. The first gives you the foo attribute, the second gives you the foo key. (It's debatable whether this was a good idea or not.) In Python, you wouldn't be able to access #text as:
o.#text
because the hash will start a comment, and you'll have a syntax error.
But you want
o['#text']
in any case.
You can get what you want from the image list with a list comprehension. Something like
desired = [x for x in images if minSize < x['size'] < maxSize]
Here, images would be the list of dicts from the inner level of your data structure.
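Since the sizes in the sample are labels such as "small" rather than numbers, a lookup keyed by size may be the more direct route. The sketch below assumes the intended nesting is artist -> similar -> artist -> image (the JSON in the question is not valid as posted), and the URL values are placeholders.

```python
import json

# Assumed, corrected version of the structure from the question.
raw = '''
{
  "artist": {
    "similar": {
      "artist": {
        "name": "Blah",
        "image": [
          {"#text": "URL-SMALL", "size": "small"},
          {"#text": "URL-MEDIUM", "size": "medium"},
          {"#text": "URL-LARGE", "size": "large"}
        ]
      }
    }
  }
}
'''

data = json.loads(raw)
images = data["artist"]["similar"]["artist"]["image"]

# "#text" is an ordinary dict key, so plain subscript access works.
url_by_size = {img["size"]: img["#text"] for img in images}
print(url_by_size["medium"])
```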