I want to reduce the number of times I hit the database to retrieve data. I think loading the entire data set into a tree will improve the system's performance, since I will not be hitting the database as often.
I am a beginner in Python, so please help and advise me on creating the tree structure.
You can use nested dictionaries. Nested means that the value of a key:value pair can be another dictionary.
JerseyMike has given a nice example; I just want to point out that his addItemAttributes function is equivalent to the more concise:
def addItemAttributes(tree, idList):
    (menu, cat, subcat, item, attribs) = idList
    currDict = tree.setdefault(menu, {})\
                   .setdefault(cat, {})\
                   .setdefault(subcat, {})\
                   .setdefault(item, {})
    for a in attribs:
        currDict[a[0]] = a[1]
...and that you might like to wrap getItemAttributes in a try block so you can deal with the case that one of the keys is missing, eg.
try:
    getItemAttributes(...)
except KeyError:
    # key was incorrect, deal with the situation
    pass
I'm putting this together on the fly since I don't have my Python interpreter handy. Here are two functions for populating and reading from the nested dictionary structures. The first parameter passed to each is the base dictionary that will hold all of your menu information.
This function adds entries to the dictionaries. It could probably be more efficient, code-wise, but I wanted you to understand what was going on. For each item of each subcategory of each category of each menu, you will need to construct a list of the attributes to pass in.
# idList is a tuple consisting of the following elements:
# menu: string - menu name
# cat: string - category name
# subcat: string - subcategory name
# item: string - item name
# attribs: list - a list of the attributes tied to this item in the form of
# [('Price', '7.95'),('ContainsPeanuts', 'Yes'),('Vegan', 'No'),...].
# You can do the attribs another way, this was convenient for
# the example.
def addItemAttributes(tree, idList):
    (menu, cat, subcat, item, attribs) = idList
    if menu not in tree:              # if the menu does not exist...
        tree[menu] = {}               # create a new dictionary for this menu
    currDict = tree[menu]             # currDict now holds the menu dictionary
    if cat not in currDict:           # if the category does not exist...
        currDict[cat] = {}            # create a new dictionary for this category
    currDict = currDict[cat]          # currDict now holds the category dictionary
    if subcat not in currDict:        # if the subcategory does not exist...
        currDict[subcat] = {}         # create a new dictionary for this subcategory
    currDict = currDict[subcat]       # currDict now holds the subcategory dictionary
    if item not in currDict:          # if the item does not exist...
        currDict[item] = {}           # create a new dictionary for this item
    currDict = currDict[item]         # currDict now holds the item dictionary
    for a in attribs:
        currDict[a[0]] = a[1]         # a is a (name, value) pair
The function to read from the nested structure is easier to follow:
# Understand that if any of the values passed to the following function
# have not been loaded, the lookup raises a KeyError. This is the quick and
# dirty way. Thank you to Janne for jarring my mind to the try/except.
def getItemAttributes(tree, menu, cat, subcat, item):
    try:
        return tree[menu][cat][subcat][item].items()
    except KeyError:
        # take care of missing keys
        return None  # placeholder; handle the missing key as your application needs
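To see the two functions working together, here is a minimal self-contained sketch. It uses the setdefault form of addItemAttributes; the menu data ("Lunch", "Mains" and so on) is made up for illustration, and returning None for a missing key is only one possible choice, not something the original answers prescribe.

```python
def addItemAttributes(tree, idList):
    # Walk down menu -> category -> subcategory -> item,
    # creating each level on first use.
    menu, cat, subcat, item, attribs = idList
    currDict = (tree.setdefault(menu, {})
                    .setdefault(cat, {})
                    .setdefault(subcat, {})
                    .setdefault(item, {}))
    for name, value in attribs:
        currDict[name] = value


def getItemAttributes(tree, menu, cat, subcat, item):
    try:
        return dict(tree[menu][cat][subcat][item])
    except KeyError:
        return None  # assumption: signal a missing key with None


tree = {}
addItemAttributes(tree, ("Lunch", "Mains", "Noodles", "Pad Thai",
                         [("Price", "7.95"), ("ContainsPeanuts", "Yes")]))

found = getItemAttributes(tree, "Lunch", "Mains", "Noodles", "Pad Thai")
missing = getItemAttributes(tree, "Lunch", "Mains", "Noodles", "Lasagne")
print(found)    # {'Price': '7.95', 'ContainsPeanuts': 'Yes'}
print(missing)  # None
```

After the tree is filled once from the database, every later lookup is a plain in-memory dictionary access.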
I hope this helps. :)
Related
I am working on code that pulls data from a database and, depending on the type of table, stores the data in dictionaries for further use.
This code handles around 20-30 different tables, so there are 20-30 dictionaries and a few lists, which I have defined as class variables for use elsewhere in the code.
For example:
class ImplVars(object):
    # dictionary capturing data from Asset-Feed table
    general_feed_dict = {}
    ports_feed_dict = {}
    vulns_feed_dict = {}
    app_list = []
    ...
I want to clear these dictionaries before I add data to them.
The easiest or most common way is to use the clear() function, but that makes the code repetitive, as I would have to write a call for each dict.
Another option I am exploring is the dir() function, but it returns the variable names as strings.
Is there an elegant way to fetch all these class variables and clear them?
You can use introspection as you suggest:
for d in filter(dict.__instancecheck__, ImplVars.__dict__.values()):
    d.clear()
Or less cryptic, covering lists and dicts:
for obj in ImplVars.__dict__.values():
    if isinstance(obj, (list, dict)):
        obj.clear()
But I would recommend you choose a bit of a different data structure so you can be more explicit:
class ImplVars(object):
    data_dicts = {
        "general_feed_dict": {},
        "ports_feed_dict": {},
        "vulns_feed_dict": {},
    }
Now you can explicitly loop over ImplVars.data_dicts.values() and still have other class variables that you may not want to clear.
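A short sketch of that explicit loop, assuming the data_dicts layout above. The app_name attribute and the sample entries are invented here purely to show that other class variables are left alone.

```python
class ImplVars(object):
    data_dicts = {
        "general_feed_dict": {},
        "ports_feed_dict": {},
        "vulns_feed_dict": {},
    }
    app_name = "scanner"  # hypothetical attribute that must NOT be cleared


# Simulate data left over from a previous run (made-up values).
ImplVars.data_dicts["ports_feed_dict"]["host-a"] = [22, 443]
ImplVars.data_dicts["vulns_feed_dict"]["host-a"] = ["example-finding"]

# Clear every feed dictionary in place before loading fresh data.
for d in ImplVars.data_dicts.values():
    d.clear()

print(ImplVars.data_dicts)
# {'general_feed_dict': {}, 'ports_feed_dict': {}, 'vulns_feed_dict': {}}
print(ImplVars.app_name)  # scanner
```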
code:
a_dict = {1:2}
b_dict = {2:4}
c_list = [3,6]
vars_copy = vars().copy()
for variable, value in vars_copy.items():
    if variable.endswith("_dict"):
        vars()[variable] = {}
    elif variable.endswith("_list"):
        vars()[variable] = []
print(a_dict)
print(b_dict)
print(c_list)
result:
{}
{}
[]
Maybe one of the easier kinds of implementation would be to create a list of dictionaries and lists you want to clear and later make the loop clear them all.
d = [general_feed_dict, ports_feed_dict, vulns_feed_dict, app_list]
for element in d:
    element.clear()
You could also use list comprehension for that.
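One difference between the approaches in this thread is worth keeping in mind: clear() empties the existing object in place, while assigning {} or [] binds the name to a brand-new object. A small sketch (the names cache and alias are made up for illustration):

```python
cache = {"a": 1}
alias = cache      # second reference to the same dict object

cache.clear()      # empties the shared object in place
print(alias)       # {}

cache["b"] = 2     # still the shared object
cache = {}         # rebinds the name only; alias keeps the old dict
print(alias)       # {'b': 2}
print(cache)       # {}
```

If other parts of the code hold references to the dictionaries, clear() is usually the behaviour you want.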
I am trying to create a python script and I am stuck with the dictionaries. I have read through some of the other forums but can't seem to get anywhere. I am a very new python programmer so please be gentle.
What I want to do:
1) set up a dictionary like this: {'Name': userid, 'jobid': jobid, 'walltime': walltime, 'nodes': nds}
2) iterate through a list of entries created from an external function call and extract information to populate the dictionary
3) Problem: I cannot figure out how to append entries to the appropriate keys
For example, I want this:
{'Name': 'jose', 'jobid': '001,002,003,005', 'walltime': '32:00,240:00,04:00,07:00', 'nodes': '32,32,500'}
Notice for one userid, I have multiple jobids, walltimes and nodes.
(len(jobids)==len(walltimes)==len(nodes) for any one userid but can vary across userids)
I am able to get the script to find the first value for each username, but it never appends. How can I get this to append?
Here is what I have tried
from collections import defaultdict
pdict = defaultdict(list)
start the loop:
# get new values – add these to the dictionary keyed
# on username (create a new entry or append to existing entry)
…
(jobid,userid,jobname, sessid, nds, tsk, walltime,rest)= m.groups()
...
if userid in pdict:
    print "DEBUG: %s is currently in the dictionary -- appending entries" %(userid)
    pdict[userid]['jobid'] = pdict[userid][jobid].append(jobid)
    # repeat for nodes, walltime, etc
if not userid in pdict:
    print "DEBUG: %s is not in the dictionary creating entry" %(userid)
    pdict[userid] = {} # define a dictionary within a dictionary with key off userid
    pdict[userid]['jobid'] = jobid
    pdict[userid]['jobname'] = jobname
    pdict[userid]['nodes'] = nds
    pdict[userid]['walltime'] = walltime
I know this is wrong but can’t figure out how to get the append to work. I have tried many of the suggestions offered on this site. I need to append (to the dictionary) the most recent values from the loop keyed to userid
Here is an example of the output. It does not append multiple entries for each userid but rather takes only the first value for each userid:
userid jmreill contains data: {'nodes': '1', 'jobname':
'A10012a_ReMig_Q', 'walltime': '230:0', 'jobid': '1365582'}
userid igorysh contains data: {'nodes': '20', 'jobname':
'emvii_Beam_fwi6', 'walltime': '06:50', 'jobid': '1398100'}
Any suggestions? This should be easy but I can’t figure it out!
from collections import defaultdict
pdict = defaultdict(dict)
start the loop:
# get new values – add these to the dictionary keyed
# on username (create a new entry or append to existing entry)
…
(jobid,userid,jobname, sessid, nds, tsk, walltime,rest)= m.groups()
...
if userid in pdict:
    print("DEBUG: %s is currently in the dictionary -- appending entries" % userid)
    pdict[userid]['jobid'].append(jobid)
    # repeat for nodes, walltime, etc
if userid not in pdict:
    print("DEBUG: %s is not in the dictionary creating entry" % userid)
    pdict[userid]['jobid'] = [jobid]
    pdict[userid]['jobname'] = jobname
    pdict[userid]['nodes'] = [nds]            # lists, so later appends work
    pdict[userid]['walltime'] = [walltime]
The value corresponding to key 'jobid' should be a list of strings rather than a string. If you create your dictionary this way, you can append new jobid's to the list simply by:
pdict[userid]['jobid'].append(jobid)
You have to define a defaultdict of a defaultdict. The lambda is needed because defaultdict expects a callable that creates the default value, and defaultdict(list) on its own is an instance rather than a callable:
pdict = defaultdict(lambda: defaultdict(list))
pdict[userid]['jobid'].append('1234')
will work.
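As a rough sketch of how this behaves inside a loop: the rows list below stands in for the values the question extracts with m.groups(), and the user names and job data are invented.

```python
from collections import defaultdict

# Outer default: a new inner defaultdict; inner default: a new empty list.
pdict = defaultdict(lambda: defaultdict(list))

# Hypothetical parsed rows: (userid, jobid, walltime, nodes)
rows = [
    ("jose", "001", "32:00", "32"),
    ("jose", "002", "240:00", "32"),
    ("maria", "003", "04:00", "500"),
]

for userid, jobid, walltime, nds in rows:
    # No "if userid in pdict" check needed: missing levels are created on access.
    pdict[userid]["jobid"].append(jobid)
    pdict[userid]["walltime"].append(walltime)
    pdict[userid]["nodes"].append(nds)

print(pdict["jose"]["jobid"])     # ['001', '002']
print(pdict["maria"]["nodes"])    # ['500']
```

Be aware that merely reading pdict[some_user] creates an empty entry for that user, so use "in" when you only want to test for membership.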
The append() method does not return the list...it is modified in place. Also, you need to initialize your elements as lists (using square brackets):
if userid in pdict:
    print "DEBUG: %s is currently in the dictionary -- appending entries" %(userid)
    pdict[userid]['jobid'].append(jobid) ## just call append here
    # repeat for nodes, walltime, etc
if not userid in pdict:
    print "DEBUG: %s is not in the dictionary creating entry" %(userid)
    pdict[userid] = {} # define a dictionary within a dictionary with key off userid
    pdict[userid]['jobid'] = [jobid,] ## initialize with lists
    pdict[userid]['jobname'] = [jobname,]
    pdict[userid]['nodes'] = [nds,]
    pdict[userid]['walltime'] = [walltime,]
append doesn't return a value; it modifies the list in place. You also forgot to quote 'jobid' on the right of the equals sign. So you should replace pdict[userid]['jobid'] = pdict[userid][jobid].append(jobid) with pdict[userid]['jobid'].append(jobid). Also take into account the comment from @Jasper.
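The point about append is easy to check in isolation. In this small sketch (variable names are illustrative only), assigning the result of append back would overwrite the list with None:

```python
jobids = ["001"]

result = jobids.append("002")  # append mutates the list and returns None

print(result)   # None
print(jobids)   # ['001', '002']
```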
Are you looking for a dict of dicts? Autovivification is a good fit. You can implement Perl's autovivification feature in Python:
class AutoVivification(dict):
    """Implementation of perl's autovivification feature."""
    def __getitem__(self, item):
        try:
            return dict.__getitem__(self, item)
        except KeyError:
            value = self[item] = type(self)()
            return value
This makes everything easier. Note that the value of pdict[userid]['jobid'] should be a list [jobid] rather than the bare variable jobid, since you have multiple jobids.
pdict = AutoVivification()
if userid in pdict:
    pdict[userid]['jobid'].append(jobid)
else:
    pdict[userid]['jobid'] = [jobid] # a list
Refer to "What is the best way to implement nested dictionaries in Python?".
I have a list of dictionaries that map different IDs to a central ID. I also have a document in which these different IDs are associated with terms. I have written a function that should build a dictionary keyed by the central ID for the IDs found in the document. goFile is the document: its first column holds an ID and its second a GO term. mappingList is a list of dictionaries in which each ID from goFile is mapped to a main ID.
My expected output is a dictionary with a main ID as key and a set of the GO terms associated with it as value.
def parseGO(mappingList, goFile):
    # open the file
    file = open(goFile)
    # this will be the dictionary that this function returns
    # entries will have as a key an Ensembl ID
    # and the value will be a set of GO terms
    GOdict = {}
    GOset = set()
    for line in file:
        splitline = line.split(' ')
        GO_term = splitline[1]
        value_ID = splitline[0]
        for dict in mappingList:
            if value_ID in dict:
                ENSB_term = dict[value_ID]
                #my best try
                for dict in mappingList:
                    for key in GOdict.keys():
                        if value_ID in dict and key == dict[value_ID]:
                            GOdict[ENSB_term].add(GO_term)
                GOdict[ENSB_term] = GOset
    return GOdict
My problem is that I now have to add, under each central ID in my GOdict, the terms that the document associates with the different IDs. To avoid duplicates I use a set (GOset). How do I do it? All my attempts end up with all the terms mapped to all the main IDs.
Some sample:
mappingList = [{'1234': 'mainID1', '456': 'mainID2'}, {'789': 'mainID2'}]
goFile:
1234 GOTERM1
1234 GOTERM2
456 GOTERM1
456 GOTERM3
789 GOTERM1
expected output:
GOdict = {'mainID1': set([GOTERM1, GOTERM2]), 'mainID2': set([GOTERM1, GOTERM3])}
First off, you shouldn't use the variable name 'dict', as it shadows the built-in dict class, and will cause you problems at some point.
The following should work for you:
from collections import defaultdict
def parse_go(mapping_list, go_file):
    go_dict = defaultdict(set)
    with open(go_file) as f:  # 'with' closes the file for us
        for line in f:
            # Feel free to change the split behaviour to work better for you.
            (value_id, go_term) = line.split()
            for map_dict in mapping_list:
                if value_id in map_dict:
                    go_dict[map_dict[value_id]].add(go_term)
    return go_dict
The code is fairly straightforward, but here's a breakdown anyway.
We use a default dictionary instead of a normal dictionary so we can eliminate all the "if ... in" or setdefault() boilerplate.
For each line in the file, we check whether the first item (value_id) is a key in any of the mapping dictionaries and, if so, add the line's second item (go_term) to the set for the main ID that value_id maps to.
EDIT: Request for doing this without defaultdict(). Assume that go_dict is just a normal dictionary (go_dict = {}), your for loop would look like:
for map_dict in mapping_list:
    if value_id in map_dict:
        esnb_entry = go_dict.setdefault(map_dict[value_id], set())
        esnb_entry.add(go_term)
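To check the approach against the sample data from the question without creating a file, here is a sketch of a variant that takes an iterable of lines. parse_go_lines is a hypothetical helper introduced only for this example; the logic inside the loop is the same as in parse_go.

```python
from collections import defaultdict


def parse_go_lines(mapping_list, lines):
    """Same logic as parse_go, but reads from any iterable of lines."""
    go_dict = defaultdict(set)
    for line in lines:
        value_id, go_term = line.split()
        for map_dict in mapping_list:
            if value_id in map_dict:
                # Key by the main ID; the set drops duplicate terms.
                go_dict[map_dict[value_id]].add(go_term)
    return go_dict


mapping_list = [{'1234': 'mainID1', '456': 'mainID2'}, {'789': 'mainID2'}]
lines = [
    "1234 GOTERM1",
    "1234 GOTERM2",
    "456 GOTERM1",
    "456 GOTERM3",
    "789 GOTERM1",
]

result = parse_go_lines(mapping_list, lines)
for main_id in sorted(result):
    print(main_id, sorted(result[main_id]))
# mainID1 ['GOTERM1', 'GOTERM2']
# mainID2 ['GOTERM1', 'GOTERM3']
```

GOTERM1 appears only once under mainID2 even though both 456 and 789 map to it, which is the deduplication the set provides.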
I wish to use a Python dictionary to keep track of some running tasks. Each of these tasks has a number of attributes which makes it unique, so I'd like to use a function of these attributes to generate the dictionary keys, so that I can find them in the dictionary again by using the same attributes; something like the following:
class Task(object):
    def __init__(self, a, b):
        pass
#Init task dictionary
d = {}
#Define some attributes
attrib_a = 1
attrib_b = 10
#Create a task with these attributes
t = Task(attrib_a, attrib_b)
#Store the task in the dictionary, using a function of the attributes as a key
d[[attrib_a, attrib_b]] = t
Obviously this doesn't work (the list is mutable, and so can't be used as a key ("unhashable type: list")) - so what's the canonical way of generating a unique key from several known attributes?
Use a tuple in place of the list. Tuples are immutable and can be used as dictionary keys:
d[(attrib_a, attrib_b)] = t
The parentheses can be omitted:
d[attrib_a, attrib_b] = t
However, some people seem to dislike this syntax.
Use a tuple
d[(attrib_a, attrib_b)] = t
That should work fine
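A minimal sketch of storing and finding a task by a tuple key. The Task body that stores a and b is an assumption added here so the example has something to show.

```python
class Task(object):
    def __init__(self, a, b):
        # Assumption: the task simply remembers its attributes.
        self.a = a
        self.b = b


d = {}
attrib_a = 1
attrib_b = 10

t = Task(attrib_a, attrib_b)
d[(attrib_a, attrib_b)] = t      # tuple of attributes as the key

print(d[1, 10] is t)     # True: an equal tuple finds the same task
print((1, 10) in d)      # True
print((10, 1) in d)      # False: element order is part of the key
```

This works as long as every attribute in the tuple is itself hashable (numbers, strings, other tuples).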
Say I have a dictionary full of parameters:
{'speed': 1, 'intelligence': 3, 'dexterity': 2}
I want to call a loop that creates a Label and a SpinBox for each item in this dictionary procedurally, in case I want to add more attributes later. I can create the window and return the updated values just fine. My only issue is that I want all the widgets to be created as necessary, whether I have 7 or 20 attributes to edit.
So the label object could be called speed_Label and the intelligence label object intelligence_Label, and the spinbox containing the value of speed would be speed_SpinBox and so on, which then I could pass back easily. However, this
a) seems like poor naming practice
b) seems difficult seeing as I can't find out how to give objects names procedurally, say
for KEY in dict.keys(): # say the KEY is "Speed"
    # this would produce a Label object called Speed_Label
    # which displays the text "Speed"
    "KEY" + "_Label" = QLabel("KEY")
Why not simply use a list or dict?
Something like this should work:
widgets = {}
form = QFormLayout()
for key, value in your_dict.items():  # items() works on both Python 2 and 3
    widgets[key] = widget = {}
    widget['spinbox'] = spinbox = QSpinBox()
    spinbox.setValue(value)
    form.addRow(key, spinbox)