Datastructure for csv data

As input I get, for example, the following CSV file:
1;data;data;data
1.1;data;data;data
1.1.1;data;data;data
1.1.2;data;data;data
2;data;data;data
2.1;data;data;data
etc...
I have tried to use a tree data structure:
import collections
def Tree():
    return collections.defaultdict(Tree)
But the problem is that I would like to store data as follows:
t = Tree()
t[1][1] = [data,data,...,data]
...
t[1][1][1] = [data,data,...,data]
And this won't work with the data structure defined above.

You cannot both store a key and children in a single dict entry:
tree[1] = [myData]
tree[1][5] = [myOtherData]
In this case, tree[1] would have to be both [myData] and {5: [myOtherData]}.
To work around it, you could store the data as a separate element in the dict:
tree = {
    1: {
        'data': [myData],
        5: {
            'data': [myOtherData]
        }
    }
}
Now you can use:
tree[1]['data'] = [myData]
tree[1][5]['data'] = [myOtherData]
Or if you want, you can use another magic index like 0 (if 0 never occurs), but 'data' is probably clearer.
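A minimal sketch of how rows like the ones in the question could be loaded into such a structure, assuming semicolon-separated rows whose first field is a dotted numeric path such as 1.1.2. The helper name build_tree and the sample values are illustrative, not part of the original code:

```python
import collections
import csv
import io


def Tree():
    # Every missing key creates another nested Tree automatically.
    return collections.defaultdict(Tree)


def build_tree(lines):
    """Build a nested tree from rows such as '1.1.2;a;b;c' (hypothetical helper)."""
    tree = Tree()
    for row in csv.reader(lines, delimiter=';'):
        if not row:
            continue  # skip blank lines
        node = tree
        # Walk down the dotted path, e.g. '1.1.2' -> tree[1][1][2].
        for part in row[0].split('.'):
            node = node[int(part)]
        # Keep the payload under a separate key so it cannot clash with children.
        node['data'] = row[1:]
    return tree


sample = io.StringIO("1;a;b;c\n1.1;d;e;f\n1.1.1;g;h;i\n2;j;k;l\n")
tree = build_tree(sample)
print(tree[1]['data'])
print(tree[1][1][1]['data'])
```

Because the payload lives under 'data', a node can hold both its own row and any number of integer-keyed children.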

Related

Read List from stringname append

I have the following problem: I want to look up a variable by a name built from a string, so that I can retrieve a list.
I pass the user into the function fetch(user), e.g. name1.
From name1 I would like to read the list name1_skiplist,
or from name2 the list name2_skiplist.
name1_skiplist = [('home', '/pic'),('home', '/jpg'),]
name2_skiplist = [('etc', '/pic'),('etc', '/jpg'),]
name3_skiplist = [('tmp', '/pic'),('tmp', '/jpg'),]
def fetch(user):
    joinedlist = []
    joinedlist = user + '_skiplist'
    if joinedlist:
        ....

A dict is better suited to your use case of retrieving a list by key.
data = {'name1_skiplist': [('home', '/pic'), ('home', '/jpg'), ],
        'name2_skiplist': [('etc', '/pic'), ('etc', '/jpg'), ],
        'name3_skiplist': [('tmp', '/pic'), ('tmp', '/jpg'), ]}
def fetch(user):
    joinedlist = user + '_skiplist'
    result = data.get(joinedlist)
    return result

Organize related information in collections -- data structures like dicts, lists,
tuples, namedtuples, dataclasses, etc. In your case, assuming I understand
your goal, a dict is probably a decent choice. For example:
skips = {
'home': [('home', '/pic'), ('home', '/jpg')],
'etc': [('etc', '/pic'), ('etc', '/jpg')],
'tmp': [('tmp', '/pic'), ('tmp', '/jpg')],
}
An illustrative usage:
for name in skips:
    sks = skips[name]
    print(name, sks)
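Tying this back to the fetch(user) function from the question, a small sketch might look like the following. It assumes the dict is keyed directly by the user name (name1, name2, name3), which is a guess at the intended usage, and that an unknown user should yield an empty list:

```python
# Skip lists keyed by user name (the keys are an assumption for illustration).
skips = {
    'name1': [('home', '/pic'), ('home', '/jpg')],
    'name2': [('etc', '/pic'), ('etc', '/jpg')],
    'name3': [('tmp', '/pic'), ('tmp', '/jpg')],
}


def fetch(user):
    # dict.get returns the fallback (an empty list here) for unknown users.
    return skips.get(user, [])


print(fetch('name1'))
print(fetch('nobody'))
```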

Python container troubles

Basically what I am trying to do is generate a JSON list of SSH keys (public and private) on a server using Python. I am using nested dictionaries, and while it does work to an extent, the problem is that every user's entry also lists every other user's keys; each user's entry should list only that user's own keys.
Below is my code:
def ssh_key_info(key_files):
    for f in key_files:
        c_time = os.path.getctime(f) # gets the creation time of file (f)
        username_list = f.split('/') # splits on the / character
        user = username_list[2] # assigns the element at index 2 of the above split to the user variable
        key_length_cmd = check_output(['ssh-keygen','-l','-f', f]) # Run the ssh-keygen command on the file (f)
        attr_dict = {}
        attr_dict['Date Created'] = str(datetime.datetime.fromtimestamp(c_time)) # converts file create time to string
        attr_dict['Key_Length]'] = key_length_cmd[0:5] # assigns the first 5 characters of the key_length_cmd variable
        ssh_user_key_dict[f] = attr_dict
        user_dict['SSH_Keys'] = ssh_user_key_dict
        main_dict[user] = user_dict
A list containing the absolute path of the keys (/home/user/.ssh/id_rsa for example) is passed to the function. Below is an example of what I receive:
{
    "user1": {
        "SSH_Keys": {
            "/home/user1/.ssh/id_rsa": {
                "Date Created": "2017-03-09 01:03:20.995862",
                "Key_Length]": "2048 "
            },
            "/home/user2/.ssh/id_rsa": {
                "Date Created": "2017-03-09 01:03:21.457867",
                "Key_Length]": "2048 "
            },
            "/home/user2/.ssh/id_rsa.pub": {
                "Date Created": "2017-03-09 01:03:21.423867",
                "Key_Length]": "2048 "
            },
            "/home/user1/.ssh/id_rsa.pub": {
                "Date Created": "2017-03-09 01:03:20.956862",
                "Key_Length]": "2048 "
            }
        }
    },
As can be seen, user2's key files are included in user1's output. I may be going about this completely wrong, so any pointers are welcomed.

Thanks for the replies. I read up on nested dictionaries and found that the best answer on this post helped me solve the issue: What is the best way to implement nested dictionaries?
Instead of all the dictionaries, I simplified the code and now have just one dictionary. This is the working code:
class Vividict(dict):
    def __missing__(self, key): # Sets and return a new instance
        value = self[key] = type(self)() # retain local pointer to value
        return value # faster to return than dict lookup
main_dict = Vividict()
def ssh_key_info(key_files):
    for f in key_files:
        c_time = os.path.getctime(f)
        username_list = f.split('/')
        user = username_list[2]
        key_bit_cmd = check_output(['ssh-keygen','-l','-f', f])
        date_created = str(datetime.datetime.fromtimestamp(c_time))
        key_type = key_bit_cmd[-5:-2]
        key_bits = key_bit_cmd[0:5]
        main_dict[user]['SSH Keys'][f]['Date Created'] = date_created
        main_dict[user]['SSH Keys'][f]['Key Type'] = key_type
        main_dict[user]['SSH Keys'][f]['Bits'] = key_bits
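The question does not show where ssh_user_key_dict and user_dict are created, but if they are created once outside the loop, every user would end up pointing at the same dict objects, which would explain the mixed output. A hedged sketch of the same grouping with collections.defaultdict, where each user gets a fresh inner dict, could look like this. It handles only the paths (no filesystem or ssh-keygen calls), and the helper name group_keys_by_user is made up for illustration:

```python
import collections
import json


def group_keys_by_user(key_files):
    """Group key paths like /home/<user>/.ssh/id_rsa by user (illustrative helper)."""
    # Each new user gets its own fresh inner dict, so nothing is shared.
    result = collections.defaultdict(lambda: {'SSH_Keys': {}})
    for path in key_files:
        user = path.split('/')[2]  # '/home/user1/...' -> 'user1'
        # Placeholder attributes; real code would add the date and key length here.
        result[user]['SSH_Keys'][path] = {}
    return dict(result)


paths = [
    '/home/user1/.ssh/id_rsa',
    '/home/user2/.ssh/id_rsa',
    '/home/user1/.ssh/id_rsa.pub',
]
grouped = group_keys_by_user(paths)
print(json.dumps(grouped, indent=2, sort_keys=True))
```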

Python to dynamically build JSON with sub arrays

I can build JSON from a simple dictionary {} and list [], but when I try to build more complex structures, I get '\' embedded in the output JSON.
The structure I want:
{"name": "alpha",
 "results": [{"entry1":
                 [
                     {"sub1": "one"},
                     {"sub2": "two"}
                 ]
             },
             {"entry2":
                 [
                     {"sub1": "one"},
                     {"sub2": "two"}
                 ]
             }
            ]
}
This is what I get:
{'name': 'alpha',
'results': '[{"entry1": "[{\\\\"sub1\\": \\\\"one\\\\"}, {\\\\"sub2\\\\": '
'\\\\"two\\\\"}]"}, {"entry2": "[{\\\\"sub1\\\\": \\\\"one\\\\"},
{\\\\"sub2\\\\": '
'\\\\"two\\\\"}]"}]'}
Note the embedded \\. Every time the code goes through json.dumps another \ is appended.
Here's code that almost works, but doesn't:
import json
import pprint
testJSON = {}
testJSON["name"] = "alpha"
#build sub entry List
entry1List = []
entry2List = []
topList = []
a1 = {}
a2 = {}
a1["sub1"] = "one"
a2["sub2"] = "two"
entry1List.append(a1)
entry1List.append(a2)
entry2List.append(a1)
entry2List.append(a2)
# build sub entry JSON values for Top List
tmpDict1 = {}
tmpDict2 = {}
tmpDict1["entry1"] = json.dumps(entry1List)
tmpDict2["entry2"] = json.dumps(entry2List)
topList.append(tmpDict1)
topList.append(tmpDict2)
# Now let's add the list with 2 sub-lists to the JSON
testJSON["results"] = json.dumps(topList)
pprint.pprint(testJSON)

Look at this line:
tmpDict1["entry1"] = json.dumps(entry1List)
This sets the key entry1 to the string produced by converting entry1List to JSON. In essence, it puts JSON inside a JSON string, so it gets escaped. To nest the data structure, I'd go with:
tmpDict1["entry1"] = entry1List
The same applies in the other places. Once there is a tree of lists and dicts, you should only need to call json.dumps() once, on the root container (either a dict or a list).
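As a compact illustration of that advice, the structure from the question can be built from plain dicts and lists and serialized in a single step. This is only a sketch; the variable names sub_entries and payload are made up here:

```python
import json

sub_entries = [{"sub1": "one"}, {"sub2": "two"}]

# Build the whole structure from plain dicts and lists first...
payload = {
    "name": "alpha",
    "results": [
        {"entry1": sub_entries},
        {"entry2": sub_entries},
    ],
}

# ...then serialize exactly once, at the root.
text = json.dumps(payload, indent=2)
print(text)
```

Since nothing is serialized twice, the output contains no backslash escapes and parses back to the same structure.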

How to decode dataTables Editor form in python flask?

I have a flask application which is receiving a request from dataTables Editor. Upon receipt at the server, request.form looks like (e.g.)
ImmutableMultiDict([('data[59282][gender]', u'M'), ('data[59282][hometown]', u''),
('data[59282][disposition]', u''), ('data[59282][id]', u'59282'),
('data[59282][resultname]', u'Joe Doe'), ('data[59282][confirm]', 'true'),
('data[59282][age]', u'27'), ('data[59282][place]', u'3'), ('action', u'remove'),
('data[59282][runnerid]', u''), ('data[59282][time]', u'29:49'),
('data[59282][club]', u'')])
I am thinking of using something similar to this really ugly code to decode it. Is there a better way?
from collections import defaultdict
# request.form comes in multidict [('data[id][field]',value), ...]
# so we need to exec this string to turn into python data structure
data = defaultdict(lambda: {}) # default is empty dict
# need to define text for each field to be received in data[id][field]
age = 'age'
club = 'club'
confirm = 'confirm'
disposition = 'disposition'
gender = 'gender'
hometown = 'hometown'
id = 'id'
place = 'place'
resultname = 'resultname'
runnerid = 'runnerid'
time = 'time'
# fill in data[id][field] = value
for formkey in request.form.keys():
    exec('{} = {}'.format(formkey, repr(request.form[formkey])))

This question has an accepted answer and is a bit old, but since the DataTables module still seems to be pretty popular in the jQuery community, I believe this approach may be useful for someone else. I've just written a simple parsing function based on regular expressions and the dpath module, though dpath does not appear to be a very reliable module. The snippet may not be very straightforward because of the exception-based fragment, but that was the only way I found to prevent dpath from trying to resolve strings as integer indices.
import re, dpath.util
rxsKey = r'(?P<key>[^\W\[\]]+)'
rxsEntry = r'(?P<primaryKey>[^\W]+)(?P<secondaryKeys>(\[' \
           + rxsKey \
           + r'\])*)\W*'
rxKey = re.compile(rxsKey)
rxEntry = re.compile(rxsEntry)
def form2dict( frmDct ):
    res = {}
    for k, v in frmDct.items():
        m = rxEntry.match( k )
        if not m: continue
        mdct = m.groupdict()
        if not 'secondaryKeys' in mdct.keys():
            res[mdct['primaryKey']] = v
        else:
            fullPath = [mdct['primaryKey']]
            for sk in re.finditer( rxKey, mdct['secondaryKeys'] ):
                k = sk.groupdict()['key']
                try:
                    dpath.util.get(res, fullPath)
                except KeyError:
                    dpath.util.new(res, fullPath, [] if k.isdigit() else {})
                fullPath.append(int(k) if k.isdigit() else k)
            dpath.util.new(res, fullPath, v)
    return res
The practical usage is based on the native Flask request.form.to_dict() method:
# ... somewhere in a view code
pars = form2dict(request.form.to_dict())
The output structure includes both dictionaries and lists, as one would expect. E.g.:
# A little test:
rs = form2dict( {
    'columns[2][search][regex]' : False,
    'columns[2][search][value]' : None,
    'columns[2][search][regex]' : False,
} )
generates:
{
    "columns": [
        null,
        null,
        {
            "search": {
                "regex": false,
                "value": null
            }
        }
    ]
}
Update: to handle lists as dictionaries (which is more efficient), one may simplify the snippet by using the following block in the else branch of the if statement:
# ...
        else:
            fullPathStr = mdct['primaryKey']
            for sk in re.finditer( rxKey, mdct['secondaryKeys'] ):
                fullPathStr += '/' + sk.groupdict()['key']
            dpath.util.new(res, fullPathStr, v)

I decided on a way that is more secure than using exec:
from collections import defaultdict
def get_request_data(form):
    '''
    return dict list with data from request.form
    :param form: MultiDict from `request.form`
    :rtype: {id1: {field1:val1, ...}, ...} [fieldn and valn are strings]
    '''
    # request.form comes in multidict [('data[id][field]',value), ...]
    # fill in id field automatically
    data = defaultdict(lambda: {})
    # fill in data[id][field] = value
    for formkey in form.keys():
        if formkey == 'action': continue
        datapart,idpart,fieldpart = formkey.split('[')
        # ParameterError is presumably an exception class defined elsewhere in the application
        if datapart != 'data': raise ParameterError("invalid input in request: {}".format(formkey))
        idvalue = int(idpart[0:-1])
        fieldname = fieldpart[0:-1]
        data[idvalue][fieldname] = form[formkey]
    # return decoded result
    return data
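A variation on the same idea, sketched here with a regular expression so that malformed keys are rejected in one place. This is not the answer's code: the pattern assumes keys shaped exactly like data[<id>][<field>] as in the question's example, the helper name parse_editor_form is invented, a plain dict stands in for request.form (for instance the result of request.form.to_dict()), and ValueError replaces the application-specific exception:

```python
import re
from collections import defaultdict

# Matches keys of the form data[<id>][<field>]; this pattern is an assumption
# based on the request.form example shown in the question.
KEY_RE = re.compile(r'^data\[(\d+)\]\[(\w+)\]$')


def parse_editor_form(form):
    """Return {id: {field: value}} from a flat mapping (hypothetical helper)."""
    data = defaultdict(dict)
    for key, value in form.items():
        if key == 'action':
            continue  # the action is not part of the row data
        match = KEY_RE.match(key)
        if match is None:
            raise ValueError('invalid input in request: {}'.format(key))
        row_id, field = match.groups()
        data[int(row_id)][field] = value
    return dict(data)


form = {
    'action': 'remove',
    'data[59282][gender]': 'M',
    'data[59282][age]': '27',
}
parsed = parse_editor_form(form)
print(parsed)
```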

Recursive function to create hierarchical JSON object?

I'm just not a good enough computer scientist to figure this out by myself :(
I have an API that returns JSON responses that look like this:
// call to /api/get/200
{ id : 200, name : 'France', childNode: [ id: 400, id: 500] }
// call to /api/get/400
{ id : 400, name : 'Paris', childNode: [ id: 882, id: 417] }
// call to /api/get/500
{ id : 500, name : 'Lyon', childNode: [ id: 998, id: 104] }
// etc
I would like to parse it recursively and build a hierarchical JSON object that looks something like this:
{ id: 200,
  name: 'France',
  children: [
    { id: 400,
      name: 'Paris',
      children: [...]
    },
    { id: 500,
      name: 'Lyon',
      children: [...]
    }
  ],
}
So far, I have this, which does parse every node of the tree, but doesn't save it into a JSON object. How can I expand this to save it into the JSON object?
hierarchy = {}
def get_child_nodes(node_id):
    request = urllib2.Request(ROOT_URL + node_id)
    response = json.loads(urllib2.urlopen(request).read())
    for childnode in response['childNode']:
        temp_obj = {}
        temp_obj['id'] = childnode['id']
        temp_obj['name'] = childnode['name']
        children = get_child_nodes(temp_obj['id'])
        # How to save temp_obj into the hierarchy?
get_child_nodes(ROOT_NODE)
This isn't homework, but maybe I need to do some homework to get better at solving this kind of problem :( Thank you for any help.

def get_node(node_id):
    request = urllib2.Request(ROOT_URL + node_id)
    response = json.loads(urllib2.urlopen(request).read())
    temp_obj = {}
    temp_obj['id'] = response['id']
    temp_obj['name'] = response['name']
    temp_obj['children'] = [get_node(child['id']) for child in response['childNode']]
    return temp_obj
hierarchy = get_node(ROOT_NODE)

You could use this (a more compact and readable version):
def get_child_nodes(node_id):
    request = urllib2.Request(ROOT_URL + node_id)
    response = json.loads(urllib2.urlopen(request).read())
    return {
        "id":response['id'],
        "name":response['name'],
        "children":map(lambda childId: get_child_nodes(childId), response['childNode'])
    }
get_child_nodes(ROOT_NODE)

You're not returning anything from each call to the recursive function. So, it seems like you just want to append each temp_obj dictionary into a list on each iteration of the loop, and return it after the end of the loop. Something like:
def get_child_nodes(node_id):
    request = urllib2.Request(ROOT_URL + node_id)
    response = json.loads(urllib2.urlopen(request).read())
    nodes = []
    for childnode in response['childNode']:
        temp_obj = {}
        temp_obj['id'] = childnode['id']
        temp_obj['name'] = childnode['name']
        temp_obj['children'] = get_child_nodes(temp_obj['id'])
        nodes.append(temp_obj)
    return nodes
my_json_obj = json.dumps(get_child_nodes(ROOT_ID))
(BTW, please beware of mixing tabs and spaces as Python isn't very forgiving of that. Best to stick to just spaces.)
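To see the "return something from each call" idea run without a network, here is a self-contained sketch in which an in-memory dict stands in for the API. The FAKE_API contents, the fetch helper, and the assumption that childNode is a plain list of ids are all made up for illustration; a real version would fetch each response over HTTP instead:

```python
import json

# Stand-in for the HTTP API: node id -> response (made-up data).
FAKE_API = {
    200: {'id': 200, 'name': 'France', 'childNode': [400, 500]},
    400: {'id': 400, 'name': 'Paris', 'childNode': []},
    500: {'id': 500, 'name': 'Lyon', 'childNode': []},
}


def fetch(node_id):
    # Hypothetical replacement for the urllib2 request in the question.
    return FAKE_API[node_id]


def get_node(node_id):
    response = fetch(node_id)
    return {
        'id': response['id'],
        'name': response['name'],
        # Each recursive call returns a finished subtree, which is stored here.
        'children': [get_node(child_id) for child_id in response['childNode']],
    }


hierarchy = get_node(200)
print(json.dumps(hierarchy, indent=2))
```

The recursion ends naturally at nodes whose childNode list is empty, because the list comprehension then makes no further calls.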

I had the same problem this afternoon, and ended up rejigging some code I found online.
I've uploaded the code to GitHub (https://github.com/abmohan/objectjson) as well as PyPI (https://pypi.python.org/pypi/objectjson/0.1) under the package name 'objectjson'. Here it is below as well:
Code (objectjson.py)
import json
class ObjectJSON:
    def __init__(self, json_data):
        self.json_data = ""
        if isinstance(json_data, str):
            json_data = json.loads(json_data)
            self.json_data = json_data
        elif isinstance(json_data, dict):
            self.json_data = json_data
    def __getattr__(self, key):
        if key in self.json_data:
            if isinstance(self.json_data[key], (list, dict)):
                return ObjectJSON(self.json_data[key])
            else:
                return self.json_data[key]
        else:
            raise Exception('There is no json_data[\'{key}\'].'.format(key=key))
    def __repr__(self):
        out = self.__dict__
        return '%r' % (out['json_data'])
Sample Usage
from objectjson import ObjectJSON
json_str = '{ "test": {"a":1,"b": {"c":3} } }'
json_obj = ObjectJSON(json_str)
print(json_obj) # {'test': {'b': {'c': 3}, 'a': 1}}
print(json_obj.test) # {'b': {'c': 3}, 'a': 1}
print(json_obj.test.a) # 1
print(json_obj.test.b.c) # 3

Disclaimer: I have no idea what JSON is about, so you may have to sort out how to write it correctly in your language :p. If the pseudo-code in my example is too pseudo, feel free to ask for more details.
You need to return something somewhere. If you never return anything from your recursive call, you can't get a reference to the new objects and store it in the objects you hold at the point where you called the recursion.
def getChildNodes (node) returns [array of childNodes]
    data = getData(fromServer(forThisNode))
    new childNodes array
    for child in data :
        new temp_obj
        temp_obj.stores(child.interestingStuff)
        for grandchild in getChildNodes(child) :
            temp_obj.arrayOfchildren.append(grandchild)
        array.append(temp_obj)
    return array
Alternatively, you can use an iterator instead of a return, if your language supports it.
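In Python, the iterator variant could be sketched as a generator that yields one finished child at a time. The CHILDREN mapping below is made-up data standing in for the server, and iter_child_nodes is a hypothetical name:

```python
# Made-up parent -> children mapping standing in for the server data.
CHILDREN = {
    'France': ['Paris', 'Lyon'],
    'Paris': ['Montmartre'],
    'Lyon': [],
    'Montmartre': [],
}


def iter_child_nodes(name):
    """Yield one dict per child of `name`, each with its own children filled in."""
    for child in CHILDREN[name]:
        # list() drains the recursive generator to collect the grandchildren.
        yield {'name': child, 'children': list(iter_child_nodes(child))}


tree = {'name': 'France', 'children': list(iter_child_nodes('France'))}
print(tree)
```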
