Recursive function to create hierarchical JSON object?

I'm just not a good enough computer scientist to figure this out by myself :(
I have an API that returns JSON responses that look like this:
// call to /api/get/200
{ id: 200, name: 'France', childNode: [ { id: 400 }, { id: 500 } ] }
// call to /api/get/400
{ id: 400, name: 'Paris', childNode: [ { id: 882 }, { id: 417 } ] }
// call to /api/get/500
{ id: 500, name: 'Lyon', childNode: [ { id: 998 }, { id: 104 } ] }
// etc
I would like to parse it recursively and build a hierarchical JSON object that looks something like this:
{ id: 200,
  name: 'France',
  children: [
    { id: 400,
      name: 'Paris',
      children: [...]
    },
    { id: 500,
      name: 'Lyon',
      children: [...]
    }
  ]
}
So far, I have this, which does parse every node of the tree, but doesn't save it into a JSON object. How can I expand this to save it into the JSON object?
hierarchy = {}
def get_child_nodes(node_id):
    request = urllib2.Request(ROOT_URL + node_id)
    response = json.loads(urllib2.urlopen(request).read())
    for childnode in response['childNode']:
        temp_obj = {}
        temp_obj['id'] = childnode['id']
        temp_obj['name'] = childnode['name']
        children = get_child_nodes(temp_obj['id'])
        # How to save temp_obj into the hierarchy?
get_child_nodes(ROOT_NODE)
This isn't homework, but maybe I need to do some homework to get better at solving this kind of problem :( Thank you for any help.

def get_node(node_id):
    # str() because the ids in the responses are numbers, not strings
    request = urllib2.Request(ROOT_URL + str(node_id))
    response = json.loads(urllib2.urlopen(request).read())
    temp_obj = {}
    temp_obj['id'] = response['id']
    temp_obj['name'] = response['name']
    # Recurse once per child; each call returns that child's finished subtree
    temp_obj['children'] = [get_node(child['id']) for child in response['childNode']]
    return temp_obj
hierarchy = get_node(ROOT_NODE)

You could use this (a more compact and readable version):
def get_child_nodes(node_id):
    request = urllib2.Request(ROOT_URL + str(node_id))
    response = json.loads(urllib2.urlopen(request).read())
    return {
        "id": response['id'],
        "name": response['name'],
        # assumes each childNode entry is an object with an 'id' key, as in the question's code
        "children": [get_child_nodes(child['id']) for child in response['childNode']]
    }
get_child_nodes(ROOT_NODE)

You're not returning anything from each call to the recursive function. So, it seems like you just want to append each temp_obj dictionary into a list on each iteration of the loop, and return it after the end of the loop. Something like:
def get_child_nodes(node_id):
    request = urllib2.Request(ROOT_URL + str(node_id))
    response = json.loads(urllib2.urlopen(request).read())
    nodes = []
    for childnode in response['childNode']:
        temp_obj = {}
        temp_obj['id'] = childnode['id']
        temp_obj['name'] = childnode['name']
        temp_obj['children'] = get_child_nodes(temp_obj['id'])
        nodes.append(temp_obj)
    return nodes
my_json_obj = json.dumps(get_child_nodes(ROOT_NODE))
(BTW, please beware of mixing tabs and spaces as Python isn't very forgiving of that. Best to stick to just spaces.)

I had the same problem this afternoon, and ended up rejigging some code I found online.
I've uploaded the code to GitHub (https://github.com/abmohan/objectjson) as well as PyPI (https://pypi.python.org/pypi/objectjson/0.1) under the package name 'objectjson'. Here it is as well:
Code (objectjson.py)
import json
class ObjectJSON:
    def __init__(self, json_data):
        self.json_data = ""
        if isinstance(json_data, str):
            json_data = json.loads(json_data)
            self.json_data = json_data
        elif isinstance(json_data, dict):
            self.json_data = json_data
    def __getattr__(self, key):
        if key in self.json_data:
            if isinstance(self.json_data[key], (list, dict)):
                return ObjectJSON(self.json_data[key])
            else:
                return self.json_data[key]
        else:
            raise Exception('There is no json_data[\'{key}\'].'.format(key=key))
    def __repr__(self):
        out = self.__dict__
        return '%r' % (out['json_data'])
Sample Usage
from objectjson import ObjectJSON
json_str = '{ "test": {"a":1,"b": {"c":3} } }'
json_obj = ObjectJSON(json_str)
print(json_obj) # {'test': {'b': {'c': 3}, 'a': 1}}
print(json_obj.test) # {'b': {'c': 3}, 'a': 1}
print(json_obj.test.a) # 1
print(json_obj.test.b.c) # 3

Disclaimer : I have no idea what json is about, so you may have to sort out how to write it correctly in your language :p. If the pseudo-code in my example is too pseudo, feel free to ask more details.
You need to return something somewhere. If you never return something in your recursive call, you can't get the reference to your new objects and store it in the objects you have where you called the recursion.
def getChildNodes (node) returns [array of childNodes]
    data = getData(fromServer(forThisNode))
    new childNodes array
    for child in data :
        new temp_obj
        temp_obj.stores(child.interestingStuff)
        for grandchild in getChildNodes(child) :
            temp_obj.arrayOfchildren.append(grandchild)
        array.append(temp_obj)
    return array
Alternatively, you can use an iterator instead of a return, if your language supports it.
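For anyone who wants to try the "return the subtree from each call" idea without a live API, here is a minimal Python 3 sketch. FAKE_API, fetch and build_tree are made-up names for illustration, and the response shape (childNode as a list of objects with an id) is assumed from the question's own code rather than confirmed.

```python
import json

# Hypothetical stand-in for the HTTP API: maps a node id to the response
# that /api/get/<id> is assumed to return. Real code would do the request here.
FAKE_API = {
    200: {"id": 200, "name": "France", "childNode": [{"id": 400}, {"id": 500}]},
    400: {"id": 400, "name": "Paris", "childNode": []},
    500: {"id": 500, "name": "Lyon", "childNode": []},
}

def fetch(node_id):
    """Return the (fake) API response for one node."""
    return FAKE_API[node_id]

def build_tree(node_id, fetch=fetch):
    """Recursively build the nested dict for node_id and all its descendants."""
    response = fetch(node_id)
    return {
        "id": response["id"],
        "name": response["name"],
        # Each recursive call returns a finished subtree, so no global is needed.
        "children": [build_tree(child["id"], fetch) for child in response["childNode"]],
    }

hierarchy = build_tree(200)
print(json.dumps(hierarchy, indent=2))
```

Passing fetch in as a parameter is just a convenience so the recursion can be tested without network access; swapping in a function that performs the real request should not change the structure.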

Related

Can't access nested dictionary data **kwargs python and best practices for nested data mutation in GraphQL

In Graphene-Django and GraphQL I am trying to create a resolve_or_create method for nested data creation inside my mutations.
I'm trying to pass a dictionary with the input from the user as a **kwarg to my resolve_or_create function. Although I can see "location" in the variable watcher (in VSCode), I keep getting the error: 'dict' object has no attribute 'location'
Here's my resolve_or_create method:
def resolve_or_create(*args, **kwargs):
    input = {}
    result = {}
    input.location = kwargs.get('location', None)
    if input.location is not None:
        input.location = Location.objects.filter(pk=input.location.id).first()
        if input.location is None:
            location = Location.objects.create(
                location_city = input.location.location_city,
                location_state = input.location.location_state,
                location_sales_tax_rate = input.location.location_sales_tax_rate
            )
            if location is None:
                return None
            result.location = location
    return result
and my CreateCustomer definition, where this method is being called
class CreateCustomer(graphene.Mutation):
    class Arguments:
        input = CustomerInput(required=True)
    ok = graphene.Boolean()
    customer = graphene.Field(CustomerType)
    @staticmethod
    def mutate(root, info, input=None):
        ok = True
        resolved = resolve_or_create(**{'location':input.customer_city})
        customer_instance = Customer(
            customer_name = input.customer_name,
            customer_address = input.customer_address,
            customer_city = resolved.location,
            customer_state = input.customer_state,
            customer_zip = input.customer_zip,
            customer_email = input.customer_email,
            customer_cell_phone = input.customer_cell_phone,
            customer_home_phone = input.customer_home_phone,
            referred_from = input.referred_from
        )
        customer_instance.save()
        return CreateCustomer(ok=ok, customer=customer_instance)
Here is an example mutation that would create a new customer with an existing location
mutation createCustomer {
  createCustomer(input: {
    customerName: "Ricky Bobby",
    customerAddress: "1050 Airport Drive",
    customerCity: {id:1},
    customerState: "TX",
    customerZip: "75222",
    customerEmail: "mem",
    customerCellPhone: "124567894",
    referredFrom: "g"
  }) {
    ok,
    customer{
      id,
      customerName,
      customerAddress,
      customerCity {
        id
      },
      customerState,
      customerZip,
      customerEmail,
      customerCellPhone,
      referredFrom
    }
  }
}
and here is an example mutation that would create a customer with a new location
mutation createCustomer {
  createCustomer(input: {
    customerName: "Ricky Bobby",
    customerAddress: "1050 Airport Drive",
    customerCity: {locationCity: "Dallas", locationState: "TX", locationSalesTaxRate:7.77},
    customerState: "TX",
    customerZip: "75222",
    customerEmail: "mem",
    customerCellPhone: "124567894",
    referredFrom: "g"
  }) {
    ok,
    customer{
      id,
      customerName,
      customerAddress,
      customerCity {
        id
      },
      customerState,
      customerZip,
      customerEmail,
      customerCellPhone,
      referredFrom
    }
  }
}
So my question is two-fold.
First, how can I retrieve my passed in location dict from kwargs?
Second, is there a better way to be resolving and creating nested data than this? Would this be expected behavior in a best-practice GraphQL API?
I have also tried resolve_or_create(location=input.customer_city)

I realized my syntax for dictionary assignment and access was wrong.
Corrected function:
def resolve_or_create(*args, **kwargs):
    input = {}
    result = {}
    candidate = None
    input['location'] = kwargs['location']
    input['customer'] = kwargs.get('customer', None)
    input['technician'] = kwargs.get('tech', None)
    if input['location'] is not None:
        if 'id' in input['location']:
            candidate = Location.objects.filter(pk=input['location']['id']).first()
            result['location'] = candidate
        if candidate is None:
            result['location'] = Location.objects.create(
                location_city = input['location']['location_city'],
                location_state = input['location']['location_state'],
                location_sales_tax_rate = input['location']['location_sales_tax_rate']
            )
            if result['location'] is None:
                return None
    return result
I would still like to discuss the most effective way to create and mutate nested data in graphene-django. Is there a DRYer way to achieve what I'm looking to do, or something I'm missing?
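For anyone hitting the same error, the difference between the two syntaxes can be reproduced without Django or Graphene. This is only a small illustration using a plain dict; the variable names are made up for the example.

```python
data = {}

try:
    # Attribute assignment, as in the first version of resolve_or_create
    data.location = {"id": 1}
except AttributeError as exc:
    error_message = str(exc)

# Subscript assignment and access, as in the corrected version
data["location"] = {"id": 1}
location_id = data["location"]["id"]

print(error_message)
print(location_id)
```

Plain dicts only support subscripts (and .get()) for their entries, which is why the corrected function switches every input.location to input['location'].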

Appending dictionaries when in a for loop

Sorry if this has been asked, I've searched a ton and not finding what I'm looking for. Lots of ways to combine dictionaries, but haven't found anything for combining in a loop that applies.
I have a function (masternodes()) that returns a filtered queryset. For each object in that queryset, I loop through and call another function (masternodeDetail()) to get some details about that object. masternodeDetail() returns a dictionary. I'd like to combine the returned dictionaries into a single dictionary so I can pass that to my template.
Here's my code I've been working with:
def masternodes(request):
    user = request.user
    mn = {}
    queryset = Masternode.objects.filter(created_by=user)
    for n in queryset:
        mn = masternodeDetail(n.wallet_address, n.coin_symbol)
        print(mn)
    args = {'masternodes': queryset, 'masternodetail': mn}
    return render(request, 'monitor.html', args)
def masternodeDetail(wallet, ticker):
    monitorinfo = MonitorInfo.objects.get(coin_symbol=ticker)
    coin_symbol = monitorinfo.coin_symbol
    daemon = monitorinfo.daemon
    rpc_user = monitorinfo.rpc_user
    rpc_password = monitorinfo.rpc_password
    rpc_port = monitorinfo.rpc_port
    coin_name = monitorinfo.coin_name
    notes = monitorinfo.notes
    masternode_reward = monitorinfo.masternode_reward
    coingecko_id = monitorinfo.coingecko_id
    json_output = monitorinfo.json_output
    headers = {'content-type': 'text/plain'}
    payload = {"method": "masternodelist", "params": ['full', wallet], "jsonrpc": "1.0", "id": "MN Monitoring v0.1"}
    response = requests.post("http://" + daemon + ":" + rpc_port, json=payload, headers=headers, auth=(rpc_user,rpc_password)).json()
    res = response["result"]
    if json_output == False:
        for k,v in res.items():
            list = v.split()
            ## status protocol payee lastseen activeseconds lastpaidtime lastpaidblock IP ##
            status = list[0]
            protocol = list[1]
            payee = list[2]
            lastseen = time.strftime('%Y-%m-%d %H:%M:%S', time.gmtime(float(list[3])))
            activeseconds = datetime.timedelta(seconds=float(list[4]))
            lastpaidtime = time.strftime('%Y-%m-%d %H:%M:%S', time.gmtime(float(list[5])))
            lastpaidblock = list[6]
            nodeip = list[7].split(':', 1)[0]
            masternode_result = {payee: {'coin':coin_symbol, 'coin_name':coin_name, 'notes':notes, 'masternode_reward':masternode_reward,
                                         'coingecko_id':coingecko_id, 'status':status, 'protocol':protocol, 'payee':payee, 'lastseen': lastseen,
                                         'activeseconds': activeseconds, 'lastpaidtime': lastpaidtime, 'lastpaidblock': lastpaidblock, 'nodeip': nodeip}}
            masternodes_result = {**masternode_result, **masternode_result}
        return masternodes_result
    else:
        for txid in res:
            status = res[txid]['status']
            nodeip = res[txid]['address'].split(':', 1)[0]
            protocol = res[txid]['protocol']
            payee = res[txid]['payee']
            lastseen = time.strftime('%Y-%m-%d %H:%M:%S', time.gmtime(float(res[txid]['lastseen'])))
            activeseconds = datetime.timedelta(seconds=float(res[txid]['activeseconds']))
            lastpaidtime = time.strftime('%Y-%m-%d %H:%M:%S', time.gmtime(float(res[txid]['lastpaidtime'])))
            lastpaidblock = res[txid]['lastpaidblock']
            masternode_result = {payee: {'coin':coin_symbol, 'coin_name':coin_name, 'notes':notes, 'masternode_reward':masternode_reward,
                                         'coingecko_id':coingecko_id, 'status':status, 'protocol':protocol, 'payee':payee, 'lastseen': lastseen,
                                         'activeseconds': activeseconds, 'lastpaidtime': lastpaidtime, 'lastpaidblock': lastpaidblock, 'nodeip': nodeip}}
            masternodes_result = {**masternode_result, **masternode_result}
        return masternodes_result
How can I combine these into a single dictionary, or otherwise get all the info I need back to my template? You can see one of my many attempts using the {**a, **b} merge syntax I've used before, but since I'm not merging two differently named dictionaries, I'm guessing that's why it's not working. Right now, only the last dictionary returned in my loop gets passed to the template. Maybe I'm approaching this completely wrong and there's a better way to get this info? Suggestions?

I don't think you want a "single dictionary", I can't imagine what that would look like. Instead you probably want a list of dictionaries - in which case you could do:
mn = []
for n in queryset:
    mn.append(masternodeDetail(n.wallet_address, n.coin_symbol))
or even shorter:
mn = [masternodeDetail(n.wallet_address, n.coin_symbol) for n in queryset]

Start with an empty dictionary full_dict and use full_dict.update(d) to add dictionary d to full_dict.
Be aware that if full_dict and d have keys in common, the values in full_dict will be replaced by those from d.
The following example code starts with an empty dict and updates it with each dict passed in a for loop. Hopefully you can adapt this idea into your code:
full_dict = {}
dict_a = {"a": 10, "b": 20}
dict_b = {"b": 100, "c": 120}
for d in (dict_a, dict_b):
    full_dict.update(d)
print(full_dict) # {"a": 10, "b": 100, "c": 120}
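Applied to a loop like the one in the question, the same idea might look like the sketch below. node_detail and the nodes list are hypothetical stand-ins for masternodeDetail() and the queryset, so that the example runs without Django or an RPC daemon.

```python
def node_detail(wallet, ticker):
    """Hypothetical stand-in for masternodeDetail(): returns one dict keyed by wallet."""
    return {wallet: {"coin": ticker, "status": "ENABLED"}}

# Hypothetical (wallet_address, coin_symbol) pairs standing in for the queryset.
nodes = [("walletA", "ABC"), ("walletB", "XYZ")]

all_details = {}
for wallet, ticker in nodes:
    # update() merges this node's entries into the accumulated dict
    all_details.update(node_detail(wallet, ticker))

print(all_details)
```

This only keeps every node if the top-level keys are unique per node; if two nodes can share a key, the later one overwrites the earlier one, and a list of dictionaries is probably the safer structure.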

Python container troubles

Basically what I am trying to do is generate a json list of SSH keys (public and private) on a server using Python. I am using nested dictionaries and while it does work to an extent, the issue lies with it displaying every other user's keys; I need it to list only the keys that belong to the user for each user.
Below is my code:
def ssh_key_info(key_files):
    for f in key_files:
        c_time = os.path.getctime(f) # gets the creation time of file (f)
        username_list = f.split('/') # splits on the / character
        user = username_list[2] # element at index 2 of the split is the username (/home/<user>/...)
        key_length_cmd = check_output(['ssh-keygen','-l','-f', f]) # Run the ssh-keygen command on the file (f)
        attr_dict = {}
        attr_dict['Date Created'] = str(datetime.datetime.fromtimestamp(c_time)) # converts file create time to string
        attr_dict['Key_Length]'] = key_length_cmd[0:5] # assigns the first 5 characters of the key_length_cmd variable
        ssh_user_key_dict[f] = attr_dict
        user_dict['SSH_Keys'] = ssh_user_key_dict
        main_dict[user] = user_dict
A list containing the absolute path of the keys (/home/user/.ssh/id_rsa for example) is passed to the function. Below is an example of what I receive:
{
    "user1": {
        "SSH_Keys": {
            "/home/user1/.ssh/id_rsa": {
                "Date Created": "2017-03-09 01:03:20.995862",
                "Key_Length]": "2048 "
            },
            "/home/user2/.ssh/id_rsa": {
                "Date Created": "2017-03-09 01:03:21.457867",
                "Key_Length]": "2048 "
            },
            "/home/user2/.ssh/id_rsa.pub": {
                "Date Created": "2017-03-09 01:03:21.423867",
                "Key_Length]": "2048 "
            },
            "/home/user1/.ssh/id_rsa.pub": {
                "Date Created": "2017-03-09 01:03:20.956862",
                "Key_Length]": "2048 "
            }
        }
    },
As can be seen, user2's key files are included in user1's output. I may be going about this completely wrong, so any pointers are welcomed.

Thanks for the replies. I read up on nested dictionaries and found that the top answer to this post helped me solve the issue: What is the best way to implement nested dictionaries?
Instead of using all those dictionaries, I simplified the code and now have just one dictionary. This is the working code:
class Vividict(dict):
    def __missing__(self, key): # Sets and return a new instance
        value = self[key] = type(self)() # retain local pointer to value
        return value # faster to return than dict lookup
main_dict = Vividict()
def ssh_key_info(key_files):
    for f in key_files:
        c_time = os.path.getctime(f)
        username_list = f.split('/')
        user = username_list[2]
        key_bit_cmd = check_output(['ssh-keygen','-l','-f', f])
        date_created = str(datetime.datetime.fromtimestamp(c_time))
        key_type = key_bit_cmd[-5:-2]
        key_bits = key_bit_cmd[0:5]
        main_dict[user]['SSH Keys'][f]['Date Created'] = date_created
        main_dict[user]['SSH Keys'][f]['Key Type'] = key_type
        main_dict[user]['SSH Keys'][f]['Bits'] = key_bits
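The symptom in the question looks consistent with every user sharing the same inner dictionary objects (user_dict and ssh_user_key_dict appear to be created once, outside the loop). As a rough alternative to Vividict, here is a sketch using collections.defaultdict from the standard library. group_keys_by_user and the sample paths are invented for the example, and the ssh-keygen and file-time calls are left out so it runs anywhere.

```python
from collections import defaultdict

def group_keys_by_user(key_files):
    """Group key paths of the form /home/<user>/.ssh/<file> by user."""
    # Each user gets a fresh inner dict the first time the name is seen,
    # so entries for one user never end up under another user.
    result = defaultdict(lambda: {"SSH Keys": {}})
    for path in key_files:
        user = path.split("/")[2]  # ['', 'home', '<user>', ...]
        result[user]["SSH Keys"][path] = {}  # file attributes would go here
    return dict(result)

paths = [
    "/home/user1/.ssh/id_rsa",
    "/home/user2/.ssh/id_rsa",
    "/home/user1/.ssh/id_rsa.pub",
]
grouped = group_keys_by_user(paths)
print(sorted(grouped["user1"]["SSH Keys"]))
```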

How to decode dataTables Editor form in python flask?

I have a flask application which is receiving a request from dataTables Editor. Upon receipt at the server, request.form looks like (e.g.)
ImmutableMultiDict([('data[59282][gender]', u'M'), ('data[59282][hometown]', u''),
('data[59282][disposition]', u''), ('data[59282][id]', u'59282'),
('data[59282][resultname]', u'Joe Doe'), ('data[59282][confirm]', 'true'),
('data[59282][age]', u'27'), ('data[59282][place]', u'3'), ('action', u'remove'),
('data[59282][runnerid]', u''), ('data[59282][time]', u'29:49'),
('data[59282][club]', u'')])
I am thinking of using something like this really ugly code to decode it. Is there a better way?
from collections import defaultdict
# request.form comes in multidict [('data[id][field]',value), ...]
# so we need to exec this string to turn into python data structure
data = defaultdict(lambda: {}) # default is empty dict
# need to define text for each field to be received in data[id][field]
age = 'age'
club = 'club'
confirm = 'confirm'
disposition = 'disposition'
gender = 'gender'
hometown = 'hometown'
id = 'id'
place = 'place'
resultname = 'resultname'
runnerid = 'runnerid'
time = 'time'
# fill in data[id][field] = value
for formkey in request.form.keys():
    # formkey is e.g. 'data[59282][gender]', so this executes data[59282][gender] = u'M'
    exec '{} = {}'.format(formkey,repr(request.form[formkey]))

This question has an accepted answer and is a bit old, but since DataTables still seems to be pretty popular in the jQuery community, I believe this approach may be useful to someone else. I've just written a simple parsing function based on regular expressions and the dpath module, though dpath does not appear to be a very reliable module. The snippet may not be very straightforward because it relies on catching an exception, but that was the only way I found to prevent dpath from trying to resolve strings as integer indices.
import re, dpath.util
rxsKey = r'(?P<key>[^\W\[\]]+)'
rxsEntry = r'(?P<primaryKey>[^\W]+)(?P<secondaryKeys>(\[' \
    + rxsKey \
    + r'\])*)\W*'
rxKey = re.compile(rxsKey)
rxEntry = re.compile(rxsEntry)
def form2dict( frmDct ):
    res = {}
    for k, v in frmDct.iteritems():
        m = rxEntry.match( k )
        if not m: continue
        mdct = m.groupdict()
        if not 'secondaryKeys' in mdct.keys():
            res[mdct['primaryKey']] = v
        else:
            fullPath = [mdct['primaryKey']]
            for sk in re.finditer( rxKey, mdct['secondaryKeys'] ):
                k = sk.groupdict()['key']
                try:
                    dpath.util.get(res, fullPath)
                except KeyError:
                    dpath.util.new(res, fullPath, [] if k.isdigit() else {})
                fullPath.append(int(k) if k.isdigit() else k)
            dpath.util.new(res, fullPath, v)
    return res
In practice, it is used with Flask's native request.form.to_dict() method:
# ... somewhere in a view code
pars = form2dict(request.form.to_dict())
The output structure includes both dictionaries and lists, as one would expect. E.g.:
# A little test:
rs = form2dict( {
    'columns[2][search][regex]' : False,
    'columns[2][search][value]' : None,
    'columns[2][search][regex]' : False,
} )
generates:
{
    "columns": [
        null,
        null,
        {
            "search": {
                "regex": false,
                "value": null
            }
        }
    ]
}
Update: to treat lists as dictionaries (which is more efficient), the snippet can be simplified by replacing the else branch of the if clause with the following block:
        # ...
        else:
            fullPathStr = mdct['primaryKey']
            for sk in re.finditer( rxKey, mdct['secondaryKeys'] ):
                fullPathStr += '/' + sk.groupdict()['key']
            dpath.util.new(res, fullPathStr, v)

I decided on a way that is more secure than using exec:
from collections import defaultdict
def get_request_data(form):
    '''
    return dict list with data from request.form
    :param form: MultiDict from `request.form`
    :rtype: {id1: {field1:val1, ...}, ...} [fieldn and valn are strings]
    '''
    # request.form comes in multidict [('data[id][field]',value), ...]
    # fill in id field automatically
    data = defaultdict(lambda: {})
    # fill in data[id][field] = value
    for formkey in form.keys():
        if formkey == 'action': continue
        datapart,idpart,fieldpart = formkey.split('[')
        # ParameterError is an application-defined exception, not shown here
        if datapart != 'data': raise ParameterError("invalid input in request: {}".format(formkey))
        idvalue = int(idpart[0:-1])
        fieldname = fieldpart[0:-1]
        data[idvalue][fieldname] = form[formkey]
    # return decoded result
    return data
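A variation on the same idea, written for Python 3 and using a regular expression instead of split('['), might look like the sketch below. decode_editor_form and KEY_PATTERN are names made up for this example, and it takes any plain mapping, so it can be tried without Flask. Unlike the version above, it silently skips keys that do not match rather than raising.

```python
import re
from collections import defaultdict

# Matches keys of the form data[<id>][<field>]
KEY_PATTERN = re.compile(r"^data\[(\d+)\]\[(\w+)\]$")

def decode_editor_form(form):
    """Turn {'data[59282][age]': '27', ...} into {59282: {'age': '27', ...}}."""
    rows = defaultdict(dict)
    for key, value in form.items():
        match = KEY_PATTERN.match(key)
        if match is None:
            continue  # e.g. the 'action' entry
        row_id, field = match.groups()
        rows[int(row_id)][field] = value
    return dict(rows)

sample = {
    "action": "remove",
    "data[59282][age]": "27",
    "data[59282][gender]": "M",
}
decoded = decode_editor_form(sample)
print(decoded)
```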

Datastructure for csv data

As indata I get for example the following (CSV-file):
1;data;data;data
1.1;data;data;data
1.1.1;data;data;data
1.1.2;data;data;data
2;data;data;data
2.1;data;data;data
etc...
I have tried to use a Tree data structure:
import collections
def Tree():
    return collections.defaultdict(Tree)
But the problem is that I would like to store data as follows:
t = Tree()
t[1][1] = [data,data,...,data]
...
t[1][1][1] = [data,data,...,data]
And this won't work with the data-structure defined above.

You cannot store both a value and children in a single dict entry:
tree[1] = [myData]
tree[1][5] = [myOtherData]
In this case, tree[1] would have to be both [myData] and {5: [myOtherData]}.
To work around it, you could store the data as a separate element in the dict:
tree = {
    1: {
        'data': [myData],
        5: {
            'data': [myOtherData]
        }
    }
}
Now you can use:
tree[1]['data'] = [myData]
tree[1][5]['data'] = [myOtherData]
Or if you want, you can use another magic index like 0 (if 0 never occurs), but 'data' is probably clearer.
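Combined with the Tree() helper from the question, the 'data' key approach can be sketched as follows. The sample rows and their field values are invented for illustration; only the semicolon-separated layout with dotted keys is taken from the question.

```python
from collections import defaultdict

def Tree():
    return defaultdict(Tree)

# Invented sample rows in the same layout as the question's CSV
lines = [
    "1;a;b;c",
    "1.1;d;e;f",
    "1.1.1;g;h;i",
    "2;j;k;l",
]

tree = Tree()
for line in lines:
    key, *fields = line.split(";")
    node = tree
    for part in key.split("."):
        node = node[int(part)]  # descend, creating levels on demand
    node["data"] = fields       # store the row next to any child nodes

print(tree[1][1]["data"])
print(tree[1][1][1]["data"])
```

Because the integer keys are children and the string key 'data' holds the row, the two can never collide.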
