Can a python dictionary be created and updated on the go - python

I have to create a nested dictionary on the go inside a for loop. The parent dictionary data is initialized to empty. Inside the loop, I get the key to be added to the parent dictionary, and the value for each key is again a dictionary.
data = {}
for condition:
    Get x, y  # x is the new key
    if x not in data:
        data[x] = {}
    data[x].update({y: 1})  # or data[x][y] = 1
But I want to do the above in one line, as below:
data = {}
for condition:
    Get x, y  # x is the new key
    if x not in data:
        data.update({x: {}}.update({y: 1}))
Here I am getting TypeError: 'NoneType' object is not iterable. I guess this is because the inner update (i.e. update({y:1})) is executed first and tries to update x, which is not yet present, hence NoneType.
Is there any other way to achieve this in one line? Or is the only way to create an empty dictionary first and then update it, as shown in the first code piece?

Are you trying to do automatic nested dictionary insertion? If so, you could try using a defaultdict:
from collections import defaultdict
data = defaultdict(dict)
for i in range(10):
    x = "..."
    y = "..."
    data[x][y] = 1
print(data["..."]["..."])
This prints 1.

It can also be achieved with data[x] = {y: 1} or data.update({x: {y: 1}}), as mentioned by @karthikr. Note that both replace any dictionary already stored under x.
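The TypeError in the question arises because dict.update() modifies the dictionary in place and returns None, so {x:{}}.update({y:1}) evaluates to None and the outer data.update(None) fails. A minimal sketch of a one-line alternative using dict.setdefault, which keeps any entries already stored under x (the sample pairs are made up for illustration):

```python
# Hypothetical (x, y) pairs standing in for whatever the loop produces.
pairs = [("a", "p"), ("a", "q"), ("b", "p")]

data = {}
for x, y in pairs:
    # setdefault returns data[x], creating it as {} first if x is missing.
    data.setdefault(x, {})[y] = 1

# dict.update mutates in place and returns None, which is why chaining fails.
chained = {"a": {}}.update({"p": 1})
```

Compared with defaultdict, setdefault keeps data a plain dict, so looking up a missing key later still raises KeyError instead of silently creating an entry.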

Related

Optimize a dictionary key conditional

I would like to optimize this piece of code. I'm sure there is a way to write it in a single line:
if 'support' in dictionary:
    x = dictionary['support']
else:
    x = []
Use the dictionary's get() method:
x = dictionary.get('support', [])
If 'support' is not a key in the dictionary, get() returns its second argument, here an empty list.
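A short sketch comparing the two forms on a made-up dictionary. Note that get() only returns the default; it does not store it in the dictionary:

```python
# Hypothetical sample data for illustration.
dictionary = {"name": "server-1"}

# Explicit membership test.
if "support" in dictionary:
    x = dictionary["support"]
else:
    x = []

# Equivalent one-liner.
y = dictionary.get("support", [])
```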

How can I combine separate dictionary outputs from a function in one dictionary?

For our python project we have to solve multiple questions. We are however stuck at this one:
"Write a function that, given a FASTA file name, returns a dictionary with the sequence IDs as keys, and a tuple as value. The value denotes the minimum and maximum molecular weight for the sequence (sequences can be ambiguous)."
import collections
from Bio import Seq
from itertools import product
def ListMW(file_name):
    seq_records = SeqIO.parse(file_name, 'fasta',alphabet=generic_dna)
    for record in seq_records:
        dictionary = Seq.IUPAC.IUPACData.ambiguous_dna_values
        result = []
        for i in product(*[dictionary[j] for j in record]):
            result.append("".join(i))
        molw = []
        for sequence in result:
            molw.append(SeqUtils.molecular_weight(sequence))
        tuple= (min(molw),max(molw))
        if min(molw)==max(molw):
            dict={record.id:molw}
        else:
            dict={record.id:(min(molw), max(molw))}
        print(dict)
Using this code we manage to get this output:
{'seq_7009': (6236.9764, 6367.049999999999)}
{'seq_418': (3716.3642000000004, 3796.4124000000006)}
{'seq_9143_unamb': [4631.958999999999]}
{'seq_2888': (5219.3359, 5365.4089)}
{'seq_1101': (4287.7417, 4422.8254)}
{'seq_107': (5825.695099999999, 5972.8073)}
{'seq_6946': (5179.3118, 5364.420900000001)}
{'seq_6162': (5531.503199999999, 5645.577399999999)}
{'seq_504': (4556.920899999999, 4631.959)}
{'seq_3535': (3396.1715999999997, 3446.1969999999997)}
{'seq_4077': (4551.9108, 4754.0073)}
{'seq_1626_unamb': [3724.3894999999998]}
As you can see, this is not one dictionary but multiple dictionaries, one under the other. Is there any way we can change our code, or add an extra command, to get it in this format:
{'seq_7009': (6236.9764, 6367.049999999999),
'seq_418': (3716.3642000000004, 3796.4124000000006),
'seq_9143_unamb': (4631.958999999999),
'seq_2888': (5219.3359, 5365.4089),
'seq_1101': (4287.7417, 4422.8254),
'seq_107': (5825.695099999999, 5972.8073),
'seq_6946': (5179.3118, 5364.420900000001),
'seq_6162': (5531.503199999999, 5645.577399999999),
'seq_504': (4556.920899999999, 4631.959),
'seq_3535': (3396.1715999999997, 3446.1969999999997),
'seq_4077': (4551.9108, 4754.0073),
'seq_1626_unamb': (3724.3894999999998)}
Or in some way make clear that it should use the sequence ID as key and the molecular weight as value for one dictionary?
Create a dictionary right before your for loop, then update it inside the loop, such as:
import collections
from Bio import Seq
from itertools import product
def ListMW(file_name):
    seq_records = SeqIO.parse(file_name, 'fasta',alphabet=generic_dna)
    retDict = {}
    for record in seq_records:
        dictionary = Seq.IUPAC.IUPACData.ambiguous_dna_values
        result = []
        for i in product(*[dictionary[j] for j in record]):
            result.append("".join(i))
        molw = []
        for sequence in result:
            molw.append(SeqUtils.molecular_weight(sequence))
        tuple= (min(molw),max(molw))
        if min(molw)==max(molw):
            retDict[record.id] = molw
        else:
            retDict[record.id] = (min(molw), max(molw))
    # instead of printing inside the loop, print (or return) once at the end
    # print(retDict)
    return retDict
Right now, you're creating a new dict on each iteration of your loop and printing it, so it is normal for your code to print many separate dicts.
You're creating a dictionary with one entry at each iteration.
You want to:
- define a dict variable before your loop (better to call it dct to avoid shadowing the built-in type name)
- rewrite the assignment to dict in the loop so that it adds an entry to dct
So before the loop:
dct = {}
and in the loop (instead of your if + dict = code), use a conditional expression, with min and max computed only once:
minval = min(molw)
maxval = max(molw)
dct[record.id] = molw if minval == maxval else (minval,maxval)
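The same accumulate-then-return pattern, sketched without Biopython so that it runs on its own. The function name min_max_weights and the sample weights are hypothetical; in the real function the weights would come from SeqUtils.molecular_weight as in the question:

```python
def min_max_weights(weights_by_id):
    """Build one dictionary mapping each ID to its weight summary."""
    dct = {}  # created once, before the loop
    for seq_id, molw in weights_by_id.items():
        minval = min(molw)
        maxval = max(molw)
        # Same rule as above: keep the list when all weights are equal.
        dct[seq_id] = molw if minval == maxval else (minval, maxval)
    return dct  # return (or print) once, after the loop


# Made-up weights, not real molecular weights.
sample = {"seq_a": [10.0, 12.5, 11.0], "seq_b_unamb": [7.0]}
summary = min_max_weights(sample)
```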

How can I associate a dict key to an attribute of an object within a list?

class SpreadsheetRow(object):
    def __init__(self, Account1):
        self.Account1 = Account1
        self.Account2 = 0
I have a while loop that fills a list of objects, and another loop that fills a dictionary associating Var1:Account2. But I need to get that dictionary's value into each object if the key matches the object's Account1.
So basically, I have:
listofSpreadsheetRowObjects=[SpreadsheetRow1, SpreadsheetRow2, SpreadsheetRow3]
dict_var1_to_account2={1234:888, 1991:646, 90802:5443}
I've tried this:
for k, v in dict_var1_to_account2.iteritems():
    if k in listOfSpreadsheetRowObjects:
        if self.account1=k:
            self.account2=v
But, it's not working, and I suspect it's my first "if" statement, because listOfSpreadsheetRowObjects is just a list of those objects. How would I access account1 of each object, so I can match them as needed?
Eventually, I should have three objects with the following information:
SpreadsheetRow
self.Account1=Account1
self.Account2=(v from my dictionary, if account1 matches the key in my dictionary)
You can use a generator expression within any() to check whether the Account1 attribute of any of those objects equals k:
if any(k == item.Account1 for item in listofSpreadsheetRowObjects):
You can also use the next function to get the first matching object (pass a default such as None as a second argument to avoid StopIteration when nothing matches):
next(i for i in listofSpreadsheetRowObjects if k == i.Account1)
If you have a dictionary d and want to get the value associated to the key x then you look up that value like this:
v = d[x]
So if your dictionary is called dict_of_account1_to_account2 and the key is self.Account1 and you want to set that value to self.Account2 then you would do:
self.Account2 = dict_of_account1_to_account2[self.Account1]
The whole point of using a dictionary is that you don't have to iterate through the entire thing to look things up.
Otherwise, if you are doing this initialization of .Account2 after creating all the SpreadsheetRow objects, then using self doesn't make sense; you would need to iterate through each SpreadsheetRow item and do the assignment for each one, something like this:
for row in listofSpreadsheetRowObjects:
    for k, v in dict_of_account1_to_account2.items():
        if row.Account1 == k:
            row.Account2 = v
But again, you don't have to iterate over the dictionary to make the assignment, just look up row.Account1 from the dict:
for row in listofSpreadsheetRowObjects:
    row.Account2 = dict_of_account1_to_account2[row.Account1]
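Note that the direct lookup raises KeyError when a row's Account1 is missing from the dictionary. A minimal sketch that leaves Account2 unchanged in that case, using dict.get with the current value as the default (the sample rows and the account 5555 are made up for illustration):

```python
class SpreadsheetRow(object):
    def __init__(self, Account1):
        self.Account1 = Account1
        self.Account2 = 0


dict_of_account1_to_account2 = {1234: 888, 1991: 646, 90802: 5443}

# 5555 is deliberately absent from the dictionary.
rows = [SpreadsheetRow(1234), SpreadsheetRow(1991), SpreadsheetRow(5555)]

for row in rows:
    # One lookup per row; fall back to the existing value if there is no match.
    row.Account2 = dict_of_account1_to_account2.get(row.Account1, row.Account2)

result = [(row.Account1, row.Account2) for row in rows]
```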

How to make a list of dicts with nested dicts

I am not sure if that's what I really want, but basically I want this structure:
-document
    -pattern_key
        -start_position
        -end_position
Right now I have this:
dictionary[document] = {"pattern_key":key, "startPos":index_start, "endPos": index_end}
but I want to nest startPos and endPos under the pattern key.
Also, how can I update this to add a new entry of pattern_key, startPos, endPos under the same document?
With this you can update the values of the keys:
for x in dict.keys():
    if 'pattern_key' in x:
        dict[x]= 'new value'
    if 'endPost'...
    .....
print(dict)
>>> should have new values
To add new entries:
old = {}
old['keyxD'] = [9999]
new = {}
new['newnewnew'] = ['supernewvalue']
new.update(old)
print(new)
>>>{'newnewnew': ['supernewvalue'], 'keyxD': [9999]}
Not sure what you exactly want, but you can try this:
document = {'pattern_key' : {}}
This creates an object document, which is a dictionary whose key is a string (pattern_key) and whose value is another dictionary. To add new entries to that inner dictionary, just assign your values:
document['pattern_key']['start_position'] = 1
document['pattern_key']['end_position'] = 10
Afterwards, you can update either entry; for example, document['pattern_key']['start_position'] = 2 changes the start position to 2.
Hope this helps
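For the second part of the question (several pattern keys under the same document), one possible sketch nests one more level and uses setdefault to create the per-document dictionary on first use. The helper name add_match and the sample values are hypothetical:

```python
dictionary = {}


def add_match(document, key, index_start, index_end):
    """Record start/end positions for a pattern key under a document."""
    # Create the per-document dict on first use, then nest the positions.
    dictionary.setdefault(document, {})[key] = {
        "startPos": index_start,
        "endPos": index_end,
    }


add_match("doc1", "pattern_a", 3, 9)
add_match("doc1", "pattern_b", 12, 20)
add_match("doc1", "pattern_a", 4, 9)  # updates the existing entry
```

If a pattern can match more than once per document, the innermost value could instead be a list of such position dictionaries.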

Python - Updating value in one dictionary is updating value in all dictionaries

I have a list of dictionaries called lod. All dictionaries have the same keys but different values. I am trying to update one specific value in the list of values for the same key in all the dictionaries.
I am attempting to do it with the following for loop:
for i in range(len(lod)):
    a=lod[i][key][:]
    a[p]=a[p]+lov[i]
    lod[i][key]=a
What's happening is that each dictionary is getting updated len(lod) times, so lod[0][key][p], which is supposed to have lov[0] added to it, is instead getting lov[0]+lov[1]+... added to it.
What am I doing wrong?
Here is how I declared the list of dicts:
lod = [{} for _ in range(len(dataul))]
for j in range(len(dataul)):
    for i in datakl:
        rrdict[str.split(i,',')[0]]=list(str.split(i,',')[1:len(str.split(i,','))])
    lod[j]=rrdict
The problem is in how you created the list of dictionaries. You probably did something like this:
list_of_dicts = [{}] * 20
That's actually the same dict 20 times. Try doing something like this:
list_of_dicts = [{} for _ in range(20)]
Without seeing how you actually created it, this is only an example solution to an example problem.
To know for sure, print this:
[id(x) for x in list_of_dicts]
If you created it with the * 20 approach, the id is the same for every dict. With the list comprehension, each id is unique.
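A small sketch of that check, using id() on two short lists:

```python
shared = [{}] * 3                  # three references to one dict
separate = [{} for _ in range(3)]  # three distinct dicts

shared[0]["k"] = 1     # visible through every element
separate[0]["k"] = 1   # affects only the first dict

# Count the distinct objects behind each list.
shared_ids = len({id(d) for d in shared})
separate_ids = len({id(d) for d in separate})
```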
This is where the trouble starts: lod[j] = rrdict. lod itself is created properly with different dictionaries. Unfortunately, the references to those original dictionaries are afterwards overwritten with a reference to rrdict, so in the end the list contains only references to one single dictionary. Here is a more Pythonic and readable way to solve your problem:
lod = [{} for _ in range(len(dataul))]
for rrdict in lod:
    for line in datakl:
        splt = line.split(',')
        rrdict[splt[0]] = splt[1:]
You created the list of dictionaries correctly, as per the other answer.
However, when you update the individual dictionaries, you overwrite the entries of the list.
Removing noise from your code snippet:
lod = [{} for _ in range(whatever)]
for j in range(whatever):
    # rrdict = lod[j] # Uncomment this as a possible fix.
    for i in range(whatever):
        rrdict[somekey] = somevalue
    lod[j] = rrdict
The assignment on the last line throws away the empty dict that was in lod[j] and inserts a reference to the object that rrdict refers to.
Not sure what your code does, but see the commented-out line; it might be the fix you are looking for.
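A runnable sketch of the aliasing and of the fix, with made-up data standing in for dataul and datakl (the names buggy, fixed and rrdict_j are hypothetical):

```python
# Hypothetical stand-ins for the question's inputs.
dataul = [0, 1]
datakl = ["a,1,2", "b,3,4"]

# Buggy version: one rrdict is assigned to every slot of the list.
rrdict = {}
buggy = [{} for _ in range(len(dataul))]
for j in range(len(dataul)):
    for line in datakl:
        splt = line.split(",")
        rrdict[splt[0]] = splt[1:]
    buggy[j] = rrdict  # every slot now refers to the same dict

# Fixed version: fill the dict that already lives in each slot.
fixed = [{} for _ in range(len(dataul))]
for rrdict_j in fixed:
    for line in datakl:
        splt = line.split(",")
        rrdict_j[splt[0]] = splt[1:]

buggy[0]["a"].append("x")   # shows up in buggy[1] as well
fixed[0]["a"].append("x")   # fixed[1] is unaffected
```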
