I basically have a file with this structure:
root \
{
field1 {
subfield_a {
"value1"
}
subfield_b {
"value2"
}
subfield_c {
"value1"
"value2"
"value3"
}
subfield_d {
}
}
field2 {
subfield_a {
"value1"
}
subfield_b {
"value1"
}
subfield_c {
"value1"
"value2"
"value3"
"value4"
"value5"
}
subfield_d {
}
}
}
I want to parse this file with python to get a multidimensional array that contains all the values of a specific subfield (for examples subfield_c). E.g. :
tmp = magic_parse_function("subfield_c",file)
print tmp[0] # [ "value1", "value2", "value3"]
print tmp[1] # [ "value1", "value2", "value3", "value4", "value5"]
I'm pretty sure I've to use the pyparsing class, but I don't where to start to set the regex (?) expression. Can someone give me some pointers ?
You can let pyparsing take care of the matching and iterating over the input, just define what you want it to match, and pass it the body of the file as a string:
def magic_parse_function(fld_name, source):
from pyparsing import Keyword, nestedExpr
# define parser
parser = Keyword(fld_name).suppress() + nestedExpr('{','}')("content")
# search input string for matching keyword and following braced content
matches = parser.searchString(source)
# remove quotation marks
return [[qs.strip('"') for qs in r[0].asList()] for r in matches]
# read content of file into a string 'file_body' and pass it to the function
tmp = magic_parse_function("subfield_c",file_body)
print(tmp[0])
print(tmp[1])
prints:
['value1', 'value2', 'value3']
['value1', 'value2', 'value3', 'value4', 'value5']
Related
I have some json that I would like to transform from this:
[
{
"name":"field1",
"intValue":"1"
},
{
"name":"field2",
"intValue":"2"
},
...
{
"name":"fieldN",
"intValue":"N"
}
]
into this:
{ "field1" : "1",
"field2" : "2",
...
"fieldN" : "N",
}
For each pair, I need to change the value of the name field to a key, and the values of the intValue field to a value. This doesn't seem like flattening or denormalizing. Are there any tools that might do this out-of-the-box, or will this have to be brute-forced? What's the most pythonic way to accomplish this?
parameters = [ # assuming this is loaded already
{
"name":"field1",
"intValue":"1"
},
{
"name":"field2",
"intValue":"2"
},
{
"name":"fieldN",
"intValue":"N"
}
]
field_int_map = dict()
for p in parameters:
field_int_map[p['name']] = p['intValue']
yields {'field1': '1', 'field2': '2', 'fieldN': 'N'}
or as a dict comprehension
field_int_map = {p['name']:p['intValue'] for p in parameters}
This works to combine the name attribute with the intValue as key:value pairs, but the result is a dictionary instead of the original input type which was a list.
Use dictionary comprehension:
json_dct = {"parameters":
[
{
"name":"field1",
"intValue":"1"
},
{
"name":"field2",
"intValue":"2"
},
{
"name":"fieldN",
"intValue":"N"
}
]}
dct = {d["name"]: d["intValue"] for d in json_dct["parameters"]}
print(dct)
# {'field1': '1', 'field2': '2', 'fieldN': 'N'}
I have a JSON file where I need to replace the UUID and update it with another one. I'm having trouble replacing the deeply nested keys and values.
Below is my JSON file that I need to read in python, replace the keys and values and update the file.
JSON file - myfile.json
{
"name": "Shipping box"
"company":"Detla shipping"
"description":"---"
"details" : {
"boxes":[
{
"box_name":"alpha",
"id":"a3954710-5075-4f52-8eb4-1137be51bf14"
},
{
"box_name":"beta",
"id":"31be3763-3d63-4e70-a9b6-d197b5cb6929"
}
]
}
"container": [
"a3954710-5075-4f52-8eb4-1137be51bf14":[],
"31be3763-3d63-4e70-a9b6-d197b5cb6929":[]
]
"data":[
{
"data_series":[],
"other":50
},
{
"data_series":[],
"other":40
},
{
"data_series":
{
"a3954710-5075-4f52-8eb4-1137be51bf14":
{
{
"dimentions":[2,10,12]
}
},
"31be3763-3d63-4e70-a9b6-d197b5cb6929":
{
{
"dimentions":[3,9,12]
}
}
},
"other":50
}
]
}
I want achieve something like the following-
"details" : {
"boxes":[
{
"box_name":"alpha"
"id":"replace_uuid"
},
}
.
.
.
"data":[ {
"data_series":
{
"replace_uuid":
{
{
"dimentions":[2,10,12]
}
}
]
In such a type of deeply nested dictionary, how can we replace all the occurrence of keys and values with another string, here replace_uuid?
I tried with pop() and dotty_dict but I wasn't able to replace the nested list.
I was able to achieve it in the following way-
def uuid_change(): #generate a random uuid
new_uuid = uuid.uuid4()
return str(new_uuid)
dict = json.load(f)
for uid in dict[details][boxes]:
old_id = uid['id']
replace_id = uuid_change()
uid['id'] = replace_id
for i in range(n):
for uid1 in dict['container'][i].keys()
if uid1 == old_id:
dict['container'][i][replace_id]
= dict['container'][i].pop(uid1) #replace the key
for uid2 in dict['data'][2]['data_series'].keys()
if uid2 == old_id:
dict['data'][2]['data_series'][replace_id]
= dict['data'][2]['data_series'].pop(uid2) #replace the key
I have a JSON file like this below and the keys in the custom_fields can vary for each id. I need to import this data into BigQuery but they don't allow field names to begin with a number. So, using Python 3.7, I am trying to find out how can I dynamically concatenate a value to the beginning of those keys within custom_fields without manually specifying each field name?
{
"response":[{
"id": "123",
"custom_fields":{
"5c30673efc89f7000400001d":"val1",
"5e34770a8e3d1b010a757981":"val2",
"5e3477d28e3d1b0140757993":"val3"
}},
{
"id": "456",
"custom_fields":{
"5c30673efc89f7000400001d":"val1",
"5e34770a8e3d1b010a757981":"val2",
"5e3477d28e3d1b0140757993":"val3"
}}]
}
The data is coming from an API and saved to cloud storage, with the output being retrieved and formatted to JSON with this:
response = urllib.request.Request('https://www.test.com')
result = urllib.request.urlopen(response)
resulttext = result.read()
jsonResponse = json.loads(resulttext.decode('utf-8'))
Desired output would be like:
{
"response":[{
"id": "123",
"custom_fields":{
"_5c30673efc89f7000400001d":"val1",
"_5e34770a8e3d1b010a757981":"val2",
"_5e3477d28e3d1b0140757993":"val3"
}},
{
"id": "456",
"custom_fields":{
"_5c30673efc89f7000400001d":"val1",
"_5e34770a8e3d1b010a757981":"val2",
"_5e3477d28e3d1b0140757993":"val3"
}}]
}
If the jsonResponse is like what you've shown in your post then this should do the job fine.
for d in jsonResponse["response"]:
d["custom_fields"] = {f"_{k}": v for k, v in d["custom_fields"].items()}
import pprint
a_dict = {
"id": "123",
"custom_fields":{
"5c30673efc89f7000400001d":"val1",
"5e34770a8e3d1b010a757981":"val2",
"5e3477d28e3d1b0140757993":"val3"
}
}
print('before')
pprint.pprint(a_dict)
for key in a_dict['custom_fields']:
k_new = '_' + key
a_dict['custom_fields'][k_new] = a_dict['custom_fields'].pop(key)
print('after')
pprint.pprint(a_dict)
outputs:
before
{'custom_fields': {'5c30673efc89f7000400001d': 'val1',
'5e34770a8e3d1b010a757981': 'val2',
'5e3477d28e3d1b0140757993': 'val3'},
'id': '123'}
after
{'custom_fields': {'_5c30673efc89f7000400001d': 'val1',
'_5e34770a8e3d1b010a757981': 'val2',
'_5e3477d28e3d1b0140757993': 'val3'},
'id': '123'}
following Update json nodes in Python using jsonpath, would like to know how one might update the JSON data given a certain context.
So, say we pick the exact same JSON example:
{
"SchemeId": 10,
"nominations": [
{
"nominationId": 1
}
]
}
But this time, would like to double the value of the original value, hence some lambda function is needed which takes into account the current node value.
No need for lambdas; for example, to double SchemeId, something like this should work:
data = json.loads("""the json string above""")
jsonpath_expr = parse('$.SchemeId')
jsonpath_expr.find(data)
val = jsonpath_expr.find(data)[0].value
jsonpath_expr.update(data, val*2)
print(json.dumps(data, indent=2))
Output:
{
"SchemeId": 20,
"nominations": [
{
"nominationId": 1
}
]
}
Here is example with lambda expression:
import json
from jsonpath_ng import parse
settings = '''{
"choices": {
"atm": {
"cs": "Strom",
"en": "Tree"
},
"bar": {
"cs": "Dům",
"en": "House"
},
"sea": {
"cs": "Moře",
"en": "Sea"
}
}
}'''
json_data = json.loads(settings)
pattern = parse('$.choices.*')
def magic(f: dict, to_lang='cs'):
return f[to_lang]
pattern.update(json_data,
lambda data_field, data, field: data.update({field: magic(data[field])}))
json_data
returns
{
'choices': {
'atm': 'Strom',
'bar': 'Dům',
'sea': 'Moře'
}
}
I have this structure, converted using json.load(json)
jsonData = [ {
thing: [
name: 'a name',
keys: [
key1: 23123,
key2: 83422
]
thing: [
name: 'another name',
keys: [
key1: 67564,
key2: 93453
]
etc....
} ]
I have key1check = 67564,
I want to check if a thing's key1 matches this value
if key1check in val['thing']['keys']['key1'] for val in jsonData:
print ('key found, has name of: {}'.format(jsonData['thing']['name'])
Should this work? Is there a better was to do this?
Not quite:
in is for inclusion in a sequence, such as a string or a list. You're comparing integer values, so a simple == is what you need.
Your given structure isn't legal Python: you have brackets in several places where you're intending a dictionary; you need braces instead.
Otherwise, you're doing fine ... but you should not ask us if it will work: ask the Python interpreter by running the code.
Try this for your structure:
jsonData = [
{ "thing": {
"name": 'a name',
"keys": {
"key1": 23123,
"key2": 83422
} } },
{ "thing": {
"name": 'another name',
"keys": {
"key1": 67564,
"key2": 93453
} } }
]
You can loop through #Prune 's dictionary using something like this as long as the structure is consistent.
for item in jsonData:
if item['thing']['keys']['key1'] == key1check:
print("true")
else:
print("false")