Create many empty dictionaries in Python

I'm trying to create many dictionaries in a for loop in Python 2.7. I have a list as follows:
sections = ['main', 'errdict', 'excdict']
I want to access these variables, and create new dictionaries with the variable names. I could only access the list sections and store an empty dictionary in the list but not in the respective variables.
for i, _ in enumerate(sections):
    sections[i] = dict()
The point of this question is that I'm going to obtain the list sections from a .ini file, so its contents will vary. I can create an array of dictionaries, but that doesn't work well with the further function requirements. Hence my question.

Robin Spiess answered your question beautifully.
I just want to add the one-liner way:
section_dict = {sec : {} for sec in sections}
For maintaining the order of insertion, you'll need an OrderedDict:
from collections import OrderedDict
section_dict = OrderedDict((sec, {}) for sec in sections)
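One pitfall is worth noting when building such a mapping. The comprehension creates a separate dict for every key, whereas dict.fromkeys(sections, {}) would reuse a single dict for all keys. A minimal sketch (the 'code' key is purely illustrative and not part of the question):

```python
sections = ['main', 'errdict', 'excdict']

# The comprehension evaluates {} once per key, so every value is independent.
section_dict = {sec: {} for sec in sections}
section_dict['main']['code'] = 1

# dict.fromkeys evaluates its default once, so all keys share one dict.
shared = dict.fromkeys(sections, {})
shared['main']['code'] = 1

print(section_dict)  # only 'main' was modified
print(shared)        # every key shows the change
```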

To clear dictionaries
If the variables in your list are already dictionaries use:
for var in sections:
    var.clear()
Note that here var = {} does not work, see Difference between dict.clear() and assigning {} in Python.
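The difference between clearing and reassigning can be seen in a small sketch. The names alias and other are only there to hold a second reference to the same object:

```python
main = {'a': 1}
alias = main          # second reference to the same dict

main.clear()          # empties the object in place
print(alias)          # {}

main['b'] = 2
other = main
main = {}             # rebinds the name only; the old object is untouched
print(other)          # {'b': 2}
```

This is why var = {} inside the loop has no effect on the dictionaries stored in the list: it only points the loop variable at a new object.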
To create new dictionaries
As long as you only have a handful of dicts, the best way is probably the easiest one:
main = {} #same meaning as main = dict() but slightly faster
errdict = {}
excdict = {}
sections = [main,errdict,excdict]
The variables need to be declared first before you can put them in a list.
For more dicts I support @dslack's answer in the comments (all credit to him):
sections = [dict() for _ in range(numberOfDictsYouWant)]
If you want to be able to access the dictionaries by name, the easiest way is to make a dictionary of dictionaries:
sectionsdict = {}
for var in sections:
    sectionsdict[var] = {}
You might also be interested in: Using a string variable as a variable name
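Since the section names are meant to come from a .ini file, the dictionary-of-dictionaries idea could be sketched as follows. This assumes Python 3's standard configparser module (in Python 2.7 the module is named ConfigParser and has no read_string), and the INI text and the 'code' key are made up for illustration:

```python
import configparser

# Made-up INI content; a real program would call parser.read(path) instead.
INI_TEXT = """
[main]
[errdict]
[excdict]
"""

parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)

# One independent, initially empty dict per section name found in the file.
section_dicts = {name: {} for name in parser.sections()}

section_dicts['errdict']['code'] = 404
print(section_dicts)
```

The rest of the program can then look up section_dicts[name] without knowing in advance which sections the file contains.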

Related

Python Refactoring - Changing Variable Type and Value Within a Loop

I'm working on automating some word and PDF documents that need to be updated on a certain cadence.
The way I'm doing this is using dictionaries that replace variables within word documents.
My code works, but because my area is not tech savvy I'm using an Excel file so people can replace the values in that file whenever they need to update the documents.
I was also successful in pulling the dictionary keys and values from Excel, but I'm trying to refactor this code, which is repetitive. Here is an excerpt with 2 of the 7 dictionaries I'm creating:
dic = pd.read_excel('test.xlsx',"AD")
AD = dict(zip(dic.Key,dic.Value))
dic = pd.read_excel('test.xlsx',"RSM")
RSM = dict(zip(dic.Key,dic.Value))
I'm trying to refactor this so I can run it all within a single loop and trying something like this:
import pandas as pd
AD = "AD"
RSM = "RSM"
groups = [AD, RSM]
for item in groups:
    dic = pd.read_excel('test.xlsx',item)
    item = dict(zip(dic.Key,dic.Value))
So I'm basically first using the variable as a string to call the excel tab within the read_excel method and then I want to replace that same variable to become the output dictionary.
When I print item within the loop I do get the correct dictionaries but I'm not able to output a variable that stores each dictionary that the loop creates.
Any help would be appreciated.
Thanks!
You're almost there, you can just have a dictionary of dictionaries:
import pandas as pd
groups = ['AD', 'RSM']
dicts = {}
for item in groups:
    dic = pd.read_excel('test.xlsx', item)
    dicts[item] = dict(zip(dic.Key, dic.Value))
Now you can just access them like this:
print(dicts['AD']['some key'])
The values of a dictionary can be anything, including other dictionaries. Keys of dictionaries can be many things as well, as long as they're hashable, and strings are a common choice of course - and the names of your groups are just that.
Also note that I removed the variables named AD and RSM. You don't really achieve anything by having variables that are named after the string value they are assigned. It only serves to be able to leave off the quotes where you use the values, but it creates an additional indirection that serves no purpose.
If you don't even need the list of groups, but just want groups to be the actual dictionaries:
import pandas as pd
groups = {}
for item in ['AD', 'RSM']:
    dic = pd.read_excel('test.xlsx', item)
    groups[item] = dict(zip(dic.Key, dic.Value))
The problem is that you assign the result to the item variable and not to an entry in the list.
A simple fix would be to use a dictionary instead of a list to save the result, e.g.:
import pandas as pd
AD = "AD"
RSM = "RSM"
groups = {AD: None, RSM: None}
for item in groups.keys():
    dic = pd.read_excel('test.xlsx',item)
    groups[item] = dict(zip(dic.Key,dic.Value))
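The underlying behaviour can be reproduced without pandas. In the sketch below a small placeholder dict stands in for the data read from each sheet; that placeholder is an assumption made only so the example runs on its own:

```python
groups = ['AD', 'RSM']

# Rebinding the loop variable only changes what the name `item` points to.
for item in groups:
    item = {'sheet': item}
print(groups)   # ['AD', 'RSM']

# Assigning through a key stores the result in the container itself.
results = {}
for name in groups:
    results[name] = {'sheet': name}
print(results)  # {'AD': {'sheet': 'AD'}, 'RSM': {'sheet': 'RSM'}}
```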
My suggestion would be to use an overall dictionary to track your work and also to save the results there. I refactored your code slightly to this:
import pandas as pd
groups = dict.fromkeys(('AD', 'RSM')) # setup main dict containing dicts
for item in groups:
    dic = pd.read_excel('test.xlsx', item)
    groups[item] = dict(zip(dic.Key, dic.Value)) # store individual dict
There's no need for your global constants that are used only once, so I removed those. I also added some spaces to help your Python code conform with PEP 8, the standard Python style guide.
Now you can access each dictionary as you like, for example, groups['AD'].

How to reset value of multiple dictionaries elegantly in python

I am working on code that pulls data from a database and, depending on the type of table, stores the data in a dictionary for later use.
This code handles around 20-30 different tables, so there are 20-30 dictionaries and a few lists, which I have defined as class variables for use elsewhere in the code.
For example:
class ImplVars(object):
    # dictionary capturing data from Asset-Feed table
    general_feed_dict = {}
    ports_feed_dict = {}
    vulns_feed_dict = {}
    app_list = []
    ...
I want to clear these dictionaries before I add data to them.
The easiest or most common way is to use the clear() method, but that is repetitive because I would have to write it for each dict.
Another option I am exploring is the dir() function, but it returns the variable names as strings.
Is there an elegant way to fetch all these class variables and clear them?
You can use introspection as you suggest:
for d in filter(dict.__instancecheck__, ImplVars.__dict__.values()):
    d.clear()
Or less cryptic, covering lists and dicts:
for obj in ImplVars.__dict__.values():
    if isinstance(obj, (list, dict)):
        obj.clear()
But I would recommend you choose a bit of a different data structure so you can be more explicit:
class ImplVars(object):
    data_dicts = {
        "general_feed_dict": {},
        "ports_feed_dict": {},
        "vulns_feed_dict": {},
    }
Now you can explicitly loop over ImplVars.data_dicts.values() and still have other class variables that you may not want to clear.
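A minimal sketch of how that explicit loop might look. The reset classmethod and the schema_version attribute are hypothetical additions, used only to show that unrelated class variables are left alone:

```python
class ImplVars(object):
    # Containers that should be emptied before each load.
    data_dicts = {
        "general_feed_dict": {},
        "ports_feed_dict": {},
        "vulns_feed_dict": {},
    }
    # Hypothetical attribute that must survive a reset.
    schema_version = 3

    @classmethod
    def reset(cls):
        """Empty every registered dict in place."""
        for d in cls.data_dicts.values():
            d.clear()


ImplVars.data_dicts["ports_feed_dict"][443] = "https"
ImplVars.reset()
print(ImplVars.data_dicts["ports_feed_dict"])  # {}
print(ImplVars.schema_version)                 # 3
```

Because clear() works in place, any other code holding a reference to one of these dicts sees it emptied as well.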
code:
a_dict = {1:2}
b_dict = {2:4}
c_list = [3,6]
vars_copy = vars().copy()
for variable, value in vars_copy.items():
    if variable.endswith("_dict"):
        vars()[variable] = {}
    elif variable.endswith("_list"):
        vars()[variable] = []
print(a_dict)
print(b_dict)
print(c_list)
result:
{}
{}
[]
Maybe one of the easier kinds of implementation would be to create a list of dictionaries and lists you want to clear and later make the loop clear them all.
d = [general_feed_dict, ports_feed_dict, vulns_feed_dict, app_list]
for element in d:
    element.clear()
You could also use list comprehension for that.

How can I rename a dictionary within a program?

I ask the user of my program to input the number of datasets he/she wants to investigate, e.g. three datasets. Accordingly, I should then create three dictionaries (dataset_1, dataset_2, and dataset_3) to hold the values for the various parameters. Since I do not know beforehand the number of datasets the user wants to investigate, I have to create and name the dictionaries within the program.
Apparently, Python does not let me do that. I could not rename the dictionary once it has been created.
I have tried using os.rename("oldname", "newname"), but that only works if I have a file stored on my computer hard disk. I could not get it to work with an object that lives only within my program.
number_sets = input('Input the number of datasets to investigate:')
for dataset in range(number_sets):
    init_dict = {}
    # create dictionary name for the particular dataset
    dict_name = ''.join(['dataset_', str(dataset+1)])
    # change the dictionary's name
    # HOW CAN I CHANGE THE DICTIONARY'S NAME FROM "INIT_DICT"
    # TO "DATASET_1", WHICH IS THE STRING RESULT FOR DICT_NAME?
I would like to have in the end
dataset_1 = {}
dataset_2 = {}
and so on.
You don't (need to). Keep a list of data sets.
datasets = []
for i in range(number_sets):
    init_dict = {}
    ...
    datasets.append(init_dict)
Then you have datasets[0], datasets[1], etc., rather than dataset_1, dataset_2, etc.
Inside the loop, init_dict is set to a brand new empty dictionary at the top of each iteration, without affecting the dicts added to datasets on previous iterations.
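That each iteration produces a distinct dictionary can be checked with a short sketch. The fixed number_sets value stands in for the user's input, and 'alpha' is an illustrative parameter name. The contrast with list multiplication is included because it is a common way to get the opposite behaviour by accident:

```python
number_sets = 3  # stand-in for the user's input

datasets = []
for i in range(number_sets):
    init_dict = {}            # a new dict object on every iteration
    datasets.append(init_dict)

datasets[0]['alpha'] = 0.5    # 'alpha' is an illustrative parameter name
print(datasets)               # [{'alpha': 0.5}, {}, {}]

# Contrast: list multiplication repeats ONE dict three times.
repeated = [{}] * number_sets
repeated[0]['alpha'] = 0.5
print(repeated)               # [{'alpha': 0.5}, {'alpha': 0.5}, {'alpha': 0.5}]
```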
If you want to create variables like that you could use globals():
number_sets = 2
for dataset in range(number_sets):
    dict_name = ''.join(['dataset_', str(dataset+1)])
    globals()[dict_name] = {}
print(dataset_1)
print(dataset_2)
However, this is not good practice and should be avoided. If you need to keep several similar variables, the best thing to do is to put them in a list.
You can use a single dict and then add all the data sets into it as a dictionary:
all_datasets = {}
for i in range(number_sets):
    all_datasets['dataset_'+str(i+1)] = {}
And then you can access the data by using:
all_datasets['dataset_1']
This question gets asked many times in many different variants (this is one of the more prominent ones, for example). The answer is always the same:
It is not easily possible, and most of the time not a good idea, to create Python variable names from strings.
The easier, more approachable, safer and more usable way is to just use another dictionary. One of the cool things about dictionaries: any hashable object can be a key and any object can be a value. So the possibilities are nearly endless. In your code, this can be done easily with a dict comprehension:
number_sets = int(input('Input the number of datasets to investigate:')) # also notice that you have to add int() here
data = {''.join(['dataset_', str(dataset + 1)]): {} for dataset in range(number_sets)}
print(data)
>>> 5
{'dataset_1': {}, 'dataset_2': {}, 'dataset_3': {}, 'dataset_4': {}, 'dataset_5': {}}
Afterwards, these dictionaries can be easily accessed via data[name_of_dataset]. That's how it should be done.
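A short sketch of working with such a dictionary once it exists. The fixed number_sets replaces the interactive prompt so the example runs unattended, and 'temperature' is an illustrative parameter name:

```python
number_sets = 3  # stand-in for int(input(...))

data = {'dataset_{}'.format(n): {} for n in range(1, number_sets + 1)}

# Fill one dataset by name; 'temperature' is an illustrative parameter.
data['dataset_2']['temperature'] = 21.5

# Walk all datasets without knowing in advance how many there are.
for name in sorted(data):
    print(name, data[name])
```

Looping over the keys like this is the practical advantage over separately named variables, which would each have to be spelled out in the code.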

Is it possible to use the set function on objects based on only one attribute?

I'm creating this type of object:
class start_url_mod ():
    link = ""
    id = 0
    data = ""
I'm creating a list of these objects and I want to know whether there is some way to delete one of them when I find the same link attribute.
I know the set() function can remove duplicates from a simple list, but is there something fast and computationally acceptable for this case?
Use a dict key-ed on the attribute. You can preserve order with collections.OrderedDict:
from collections import OrderedDict
# Keep the last copy with a given link
kept_last = OrderedDict((x.link, x) for x in nonuniquelist).values()
# Keep the first copy with a given link (still preserving input order)
kept_first = list(reversed(OrderedDict((x.link, x) for x in reversed(nonuniquelist)).viewvalues()))
If order is not important, a plain dict built with a dict comprehension is significantly faster in Python 2.7 (because OrderedDict is implemented in Python there, not C, and because dict comprehensions are optimized more than constructor calls; from Python 3.5 on, OrderedDict is implemented in C):
# Keep the last copy with a given link but order not preserved in result
kept_last = {x.link: x for x in nonuniquelist}.values()
# Keep the first copy with a given link but order not preserved in result
kept_first = {x.link: x for x in reversed(nonuniquelist)}.values()
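For readers on Python 3.7 or later, where plain dicts preserve insertion order, the same idea might be sketched as below. The StartUrl class and the example.com-style links are hypothetical stand-ins for the question's objects:

```python
class StartUrl:
    """Hypothetical stand-in for the question's start_url_mod class."""
    def __init__(self, link, id, data=""):
        self.link = link
        self.id = id
        self.data = data


items = [
    StartUrl("http://a.example", 1),
    StartUrl("http://b.example", 2),
    StartUrl("http://a.example", 3),   # duplicate link
]

# Keep the last object seen for each link (later entries overwrite earlier ones).
kept_last = list({x.link: x for x in items}.values())

# Keep the first object seen for each link (setdefault ignores repeats).
first_by_link = {}
for x in items:
    first_by_link.setdefault(x.link, x)
kept_first = list(first_by_link.values())

print([x.id for x in kept_last])   # [3, 2]
print([x.id for x in kept_first])  # [1, 2]
```

Both variants make a single pass over the list, so the cost grows linearly with the number of objects.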
You can use a dictionary with the attribute that you're interested in being the key ...

Pythonic way to get the index of element from a list of dicts depending on multiple keys

I am very new to python, and I have the following problem. I came up with the following solution. I am wondering whether it is "pythonic" or not. If not, what would be the best solution ?
The problem is :
I have a list of dict
each dict has at least three items
I want to find the position in the list of the dict with specific three values
This is my python example
import collections
import random
# lets build the list, for the example
dicts = []
dicts.append({'idName':'NA','idGroup':'GA','idFamily':'FA'})
dicts.append({'idName':'NA','idGroup':'GA','idFamily':'FB'})
dicts.append({'idName':'NA','idGroup':'GB','idFamily':'FA'})
dicts.append({'idName':'NA','idGroup':'GB','idFamily':'FB'})
dicts.append({'idName':'NB','idGroup':'GA','idFamily':'FA'})
dicts.append({'idName':'NB','idGroup':'GA','idFamily':'FB'})
dicts.append({'idName':'NB','idGroup':'GB','idFamily':'FA'})
dicts.append({'idName':'NB','idGroup':'GB','idFamily':'FB'})
# let's shuffle it, again for example
random.shuffle(dicts)
# now I want to have for each combination the index
# I use a recursive defaultdict definition
# because it permits creating a dict of dict
# even if it is not initialized
def tree(): return collections.defaultdict(tree)
# initiate mapping
mapping = tree()
# fill the mapping
for i,d in enumerate(dicts):
    idFamily = d['idFamily']
    idGroup = d['idGroup']
    idName = d['idName']
    mapping[idName][idGroup][idFamily] = i
# I end up with the mapping providing me with the index within
# list of dicts
Looks reasonable to me, but perhaps a little too much. You could instead do:
mapping = {
    (d['idName'], d['idGroup'], d['idFamily']) : i
    for i, d in enumerate(dicts)
}
Then access it with mapping['NA', 'GA', 'FA'] instead of mapping['NA']['GA']['FA']. But it really depends how you're planning to use the mapping. If you need to be able to take mapping['NA'] and use it as a dictionary then what you have is fine.
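The trade-off can be seen in a small sketch using a shortened, unshuffled version of the list, so the indices are predictable:

```python
dicts = [
    {'idName': 'NA', 'idGroup': 'GA', 'idFamily': 'FA'},
    {'idName': 'NA', 'idGroup': 'GB', 'idFamily': 'FA'},
    {'idName': 'NB', 'idGroup': 'GA', 'idFamily': 'FB'},
]

mapping = {
    (d['idName'], d['idGroup'], d['idFamily']): i
    for i, d in enumerate(dicts)
}

print(mapping['NA', 'GB', 'FA'])          # 1
print(mapping.get(('NB', 'GB', 'FA')))    # None: combination not present

# Sub-selection needs a scan with flat keys, which is where nesting is handier.
na_only = {key: i for key, i in mapping.items() if key[0] == 'NA'}
print(na_only)  # {('NA', 'GA', 'FA'): 0, ('NA', 'GB', 'FA'): 1}
```

Unlike the recursive defaultdict, a lookup of a missing combination here does not silently create new entries.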
