Updating values in dictionary object - python

My program has a class Words whose instances hold a defaultdict(int) named t_e_f, and a function main() that passes that dictionary to another function for further calculations. t_e_f maps a tuple of words (the key) to a float (the value).
My program looks like this:
import collections

class Words:
    def __init__(self):
        self.t_e_f=Words.set_t_e_f(self)
    def set_t_e_f(self):
        raw_text_e=open_file('toyen')
        raw_text_f=open_file('toyde')
        tokens_e=raw_text_e.split()
        tokens_f=raw_text_f.split()+['NULL']
        tef_dict=collections.defaultdict(int)
        for word_e in tokens_e_set:
            for word_f in tokens_f_set:
                tef_dict[(word_e,word_f)]=1/len(tokens_e_set)
        return tef_dict
    def get_t_e_f(self):
        return self.t_e_f

def main():
    words=Words()
    t_e_f=words.get_t_e_f()
    s_total_e=normalization(t_e_f)
I then have a normalization function that takes t_e_f and uses its values to fill another dictionary, s_total_e, which is created inside the function.
def normalization(t_e_f):
    s_total_e=collections.defaultdict(int)
    words_sent_e=['the','big','book']
    words_sent_de=['das','grosse','buch']
    for item in words_sent_e:
        s_total_e[item]
    for item in words_sent_e:
        for item_2 in words_sent_de:
            s_total_e[item]+=t_e_f[(item,item_2)]
    return s_total_e  # main() assigns the result, so it has to be returned
The problem is that when t_e_f is passed to normalization, all the values are 0, so the initial values set when the Words object was created are lost. What is happening, and how can I solve this problem?
Thank you.

The tef_dict variable isn't being saved to the instance and is not being returned. Add a line to set_t_e_f():
return tef_dict
Also note that defaultdict will automatically add a zero entry even if you only look up or inspect a missing key.
You may be better off using collections.Counter() instead. Like defaultdict(int), it returns zero for missing keys, but it won't add them to the underlying dictionary.
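To illustrate the difference in lookup behaviour, here is a minimal sketch. The word pairs and values are made up for the example and are not taken from the question's data.

```python
from collections import Counter, defaultdict

# defaultdict(int): looking up a missing key inserts it with the value 0.
dd = defaultdict(int)
dd[("the", "das")] = 0.5
missing_dd = dd[("big", "buch")]      # returns 0 AND stores the key
dd_size = len(dd)                      # the dictionary grew to 2 entries

# Counter: looking up a missing key returns 0 but stores nothing.
counts = Counter()
counts[("the", "das")] = 0.5
missing_counter = counts[("big", "buch")]  # returns 0, nothing is stored
counter_size = len(counts)                  # still 1 entry

print(dd_size, counter_size)
```

Checking the size of the dictionary before and after a lookup, as above, is a quick way to see whether unexpected keys are being created.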

Related

Multi-level defaultdict with variable depth and with list and int type

I am trying to create a multi-level dict with variable depth whose leaf values can be either a list or an int.
The data structure looks like this:
A
--B1
-----C1=1
-----C2=[1]
--B2=[3]
D
--E
----F
------G=4
In the data structure above, a leaf value can be either an int or a list.
If the leaves were all ints, I could build the structure easily with the code below:
from collections import defaultdict
f = lambda: defaultdict(f)
d = f()
d['A']['B1']['C1'] = 1
But because a leaf can be either a list or an int, this becomes problematic.
There are two steps to putting data into a list:
d['A']['B1']['C2']= [1]
d['A']['B1']['C2'].append([2])
But when I use only the append method, without assigning the list first, it raises an error.
The error is:
AttributeError: 'collections.defaultdict' object has no attribute 'append'
Is there any way to use only the append method for a list?
There's no way you can use your current defaultdict-based structure to make d['A']['B1']['C2'].append(1) work properly if the 'C2' key doesn't already exist, since the data structure can't tell that the unknown key should correspond to a list rather than another layer of dictionary. It doesn't know what method you're going to call on the value it returns, so it can't know it shouldn't return a dictionary (like it did when it first looked up 'A' and 'B').
This isn't an issue for bare integers, since for those you're assigning directly to a new key (and all the earlier levels are dictionaries). When you're assigning, the data structure isn't creating the value, you are, so you can use any type you want.
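If changing the call site is acceptable, one alternative that is not part of the answer above is dict.setdefault, which defaultdict inherits. The sketch below reuses the question's recursive defaultdict; setdefault inserts the list itself, so the default factory is never asked to guess the type of the last level.

```python
from collections import defaultdict

f = lambda: defaultdict(f)
d = f()

# The intermediate levels 'A' and 'B1' are still created automatically.
# setdefault stores [] under 'C2' only if the key is missing, then returns it.
d['A']['B1'].setdefault('C2', []).append(1)
d['A']['B1'].setdefault('C2', []).append(2)

# Plain assignment still works for int leaves.
d['A']['B1']['C1'] = 1

print(d['A']['B1']['C2'])
```

The cost is that every append to a leaf list has to go through setdefault rather than plain indexing.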
Now, if your keys are distinctive in some way, so that given a key like 'C2' you can know for sure that it should correspond to a list, you may have a chance. You can write your own dict subclass, defining a __missing__ method to handle lookups of keys that don't exist yet in your own special way:
class Tree(dict):
    def __missing__(self, key):
        if key_corresponds_to_list(key): # magic from somewhere
            result = self[key] = []
        else:
            result = self[key] = Tree()
        return result

    # you might also want a custom __repr__
Here's an example run with a magic key function that makes any even-length key default to a list, while an odd-length key defaults to a dict:
> def key_corresponds_to_list(key):
      return len(key) % 2 == 0
> t = Tree()
> t["A"]["B"]["C2"].append(1) # the default value for C2 is a list because its length is even
> t
{'A': {'B': {'C2': [1]}}}
> t["A"]["B"]["C10"]["D"] = 2 # C10's another layer of dict, since its length is odd
> t
{'A': {'B': {'C10': {'D': 2}, 'C2': [1]}}} # it didn't matter what length D was though
You probably won't actually want to use a global function to control the class like this; I just did that as an example. If you go with this approach, I'd suggest putting the logic directly into the __missing__ method (or maybe passing a function as a parameter, like defaultdict does with its factory function).
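The last suggestion, passing the decision function in as a parameter, could look roughly like the sketch below. The class name PredicateTree and the parameter name is_list_key are hypothetical, chosen only for this illustration, and the even-length rule is the same toy rule used in the example run.

```python
class PredicateTree(dict):
    """A dict whose missing keys default to a list or to another subtree.

    is_list_key is a caller-supplied function (a hypothetical name) that
    returns True when a missing key should default to a list.
    """

    def __init__(self, is_list_key):
        super().__init__()
        self.is_list_key = is_list_key

    def __missing__(self, key):
        if self.is_list_key(key):
            value = []
        else:
            # Subtrees share the same predicate as their parent.
            value = PredicateTree(self.is_list_key)
        self[key] = value
        return value


# Toy rule: keys of even length default to lists.
t = PredicateTree(lambda key: len(key) % 2 == 0)
t["A"]["B"]["C2"].append(1)
t["A"]["B"]["C10"]["D"] = 2
print(t)
```

Each subtree carries the predicate with it, so no global function is needed and two trees with different rules can coexist.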

python call function in get method of dictionary instead default value

I need some help understanding how the dictionary get method works in Python.
Suppose we have a list of dictionaries and need to transform the data (e.g. get the names from all dictionaries by the key 'name'). I also want to call a specific function func(data) if the key 'name' is not found in a given dict.
def func(data):
    # do smth with data that doesn't contain 'name' key in dict
    return some_data

def retrieve_data(value):
    return ', '.join([v.get('name', func(v)) for v in value])
This approach works rather well, but as far as I can see, func (in retrieve_data) is called every time, even when the key 'name' is present in the dictionary.
If you want to avoid calling func if the dictionary contains the value, you can use this:
def retrieve_data(value):
    return ', '.join([v['name'] if 'name' in v else func(v) for v in value])
The reason func is called each time in your example is that func(v) is an argument to get, and Python evaluates all arguments before get itself is called.
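A small sketch makes the difference visible by recording every call to func. The sample records and the "unknown" fallback value are invented for the illustration.

```python
calls = []

def func(data):
    # Record each call so we can count how often func actually runs.
    calls.append(data)
    return "unknown"

records = [{"name": "Ann"}, {"id": 7}]

# Eager: func(v) is evaluated for every record before get() is called.
eager = [v.get("name", func(v)) for v in records]
eager_calls = len(calls)

calls.clear()

# Lazy: func(v) runs only when the key is actually missing.
lazy = [v["name"] if "name" in v else func(v) for v in records]
lazy_calls = len(calls)

print(eager_calls, lazy_calls)
```

Both versions produce the same result list; only the number of calls to func differs.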

Python: searching through set of objects

I have a set of objects W that have the attributes name and a score. The __hash__() function is based upon the name only, and the __eq__() function is not defined, so it is based upon the __hash__() function.
Now, I want to use the score of the object. Is there a quicker way to get a reference to an instance than the following script? Given the way a set works, there must be...
tmp_obj = W(name="myname", score=0)
for obj in w_set:
    if obj == tmp_obj:
        break
else:
    # do nothing with obj
    pass
# do something with obj.score
You can use the in operator to check for set membership. This is a constant-time operation (on average) in sets and dictionaries, since they are implemented as hash tables. For lists and tuples, in takes linear time.
obj = W("myname", 0)
if obj in w_set:
    # do something with obj
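One caveat worth checking before relying on this: set membership uses __eq__ as well as __hash__, so a freshly built object is found only if it compares equal to the stored one. The sketch below uses a hypothetical W class (the question does not show the real one) with and without a name-based __eq__.

```python
class W:
    """Hypothetical stand-in: hashed by name, default identity-based __eq__."""

    def __init__(self, name, score):
        self.name = name
        self.score = score

    def __hash__(self):
        return hash(self.name)


class WithEq(W):
    """Same as W, but two objects are equal when their names match."""

    def __eq__(self, other):
        return isinstance(other, W) and self.name == other.name

    # Defining __eq__ resets __hash__ in Python 3, so restore it explicitly.
    __hash__ = W.__hash__


found_without_eq = W("myname", 0) in {W("myname", 5)}
found_with_eq = WithEq("myname", 0) in {WithEq("myname", 5)}
print(found_without_eq, found_with_eq)
```

With only __hash__ defined, the lookup lands in the right bucket but the equality check fails, so the membership test reports False.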
You don't say how you set up your object, but why not just use if obj.score == 0?
for obj in w_set:
    if obj.score == 0:
        break
Or perhaps your question is about avoiding the linear search?
If you have a lot of objects and you'll be doing a lot of searches by score, you need to build an index mapping scores to objects. Presumably several objects could have the same score, so we'll build a list for each score (a set would also work):
from collections import defaultdict

score_index = defaultdict(list)
for obj in w_set:
    score_index[obj.score].append(obj)
You can now loop over the list of all objects with score zero without searching:
for obj in score_index[0]:
    # Do something
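Put together as a runnable sketch, assuming a hypothetical W class hashed by name (the question does not show the real class) and a few made-up objects:

```python
from collections import defaultdict


class W:
    """Hypothetical stand-in for the question's class, hashed by name."""

    def __init__(self, name, score):
        self.name = name
        self.score = score

    def __hash__(self):
        return hash(self.name)


w_set = {W("a", 0), W("b", 3), W("c", 0)}

# Build the index once: score -> list of objects with that score.
score_index = defaultdict(list)
for obj in w_set:
    score_index[obj.score].append(obj)

# Later lookups by score need no scan over w_set.
zero_names = sorted(obj.name for obj in score_index[0])
print(zero_names)
```

The index has to be rebuilt or updated whenever an object's score changes, so it pays off mainly when lookups are much more frequent than updates.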

How to use Python Decorator to change only one part of function?

I am practically repeating the same code with only one minor change in each function, but an essential change.
I have about 4 functions that look similar to this:
def list_expenses(self):
    # create a list from the dictionary, making a tuple of dictkey and object values
    explist = [(key,item.amount) for key, item in self.expensedict.iteritems()]
    # sort the list based on the value of the amount in the tuples of sorted list. Reverse to get high to low
    sortedlist = reversed(sorted(explist, key = lambda (k,a): (a)))
    for ka in sortedlist:
        k, a = ka
        print k , a

def list_income(self):
    # create a list from the dictionary, making a tuple of dictkey and object values
    inclist = [(key,item.amount) for key, item in self.incomedict.iteritems()]
    # sort the list based on the value of the amount in the tuples of sorted list. Reverse to get high to low
    sortedlist = reversed(sorted(inclist, key = lambda (k,a): (a)))
    for ka in sortedlist:
        k, a = ka
        print k , a
I believe this is what they refer to as violating "DRY"; however, I don't have any idea how to make this more DRY-like, as I have two separate dictionaries (expensedict and incomedict) that I need to work with.
I did some google searching and found something called decorators, and I have a very basic understanding of how they work, but no clue how I would apply it to this.
So my request/question:
Is this a candidate for a decorator, and if a decorator is necessary, could I get a hint as to what the decorator should do? Pseudocode is fine. I don't mind struggling. I just need something to start with.
What do you think about using a separate function (as a private method) for list processing? For example, you may do the following:
def __list_processing(self, list):
    #do the generic processing of your lists

def list_expenses(self):
    #invoke __list_processing with self.expensedict as a parameter

def list_income(self):
    #invoke __list_processing with self.incomedict as a parameter
This looks better because all the complicated processing is in a single place; list_expenses, list_income, and the other functions become thin wrappers around it.
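As a rough sketch of that outline, written for Python 3 (so items() and print() instead of the question's Python 2 forms): the Ledger class, the Entry type and the sample amounts are hypothetical stand-ins, since the question does not show the surrounding class.

```python
from collections import namedtuple

# Hypothetical stand-in for the objects stored in the dictionaries.
Entry = namedtuple("Entry", ["amount"])


class Ledger:
    def __init__(self, expensedict, incomedict):
        self.expensedict = expensedict
        self.incomedict = incomedict

    def _sorted_amounts(self, entries):
        # Shared logic: (key, amount) pairs sorted from high to low.
        pairs = [(key, item.amount) for key, item in entries.items()]
        return sorted(pairs, key=lambda pair: pair[1], reverse=True)

    def list_expenses(self):
        return self._sorted_amounts(self.expensedict)

    def list_income(self):
        return self._sorted_amounts(self.incomedict)


ledger = Ledger(
    expensedict={"food": Entry(250), "rent": Entry(900)},
    incomedict={"gift": Entry(50), "salary": Entry(2000)},
)
expenses = ledger.list_expenses()
income = ledger.list_income()
for key, amount in expenses:
    print(key, amount)
```

Returning the sorted pairs and printing at the call site is a design choice made here to keep the helper easy to test; printing inside the helper, as in the question, would work the same way. No decorator is needed, because the only thing that varies is which dictionary is passed in.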

How to save 'arrays' to variable from QueryDict

I have the following QueryDict structure:
QueryDict: {u'tab[1][val1]': [u'val1'], u'tab[1][val2]': [u'val2'], u'tab[0][val1]': [u'val1'], u'tab[1][val2]': [u'val2']}
I want to store it in an iterable variable so I can do something like this:
for x in xs:
    do_something(x.get('val1'))
where x is tab[0], tab[1], etc.
I tried:
dict(request.POST._iteritems())
but it returns tab[0][val1] as an element rather than tab[0].
Is it possible to store the entire tab[idx] in a variable?
Django's QueryDict has a few additional methods, compared to a plain dict, for dealing with multiple values per key. The ones useful for your purposes are:
QueryDict.iterlists()
Like QueryDict.iteritems(), except it includes all values, as a list, for each member of the dictionary.
QueryDict.getlist(key, default)
Returns the data with the requested key, as a Python list. Returns an empty list if the key doesn’t exist and no default value was provided. It’s guaranteed to return a list of some sort unless the default value provided is not a list.
QueryDict.lists()
Like items(), except it includes all values, as a list, for each member of the dictionary.
So you can do something like:
qd = QueryDict(...)
for key, values in qd.lists():  # lists() yields (key, list_of_values) pairs
    for value in values:
        do_something(value)
Also note that the "normal" dict methods like get always return only a single value (the last value for that key).
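To get one dictionary per tab[idx], as the question asks, the flat keys still have to be parsed. The sketch below is one possible approach and is not part of the answer above: it works on plain (key, value) pairs so that it runs without Django, and the function name group_tabs, the regular expression and the sample values are all assumptions made for the illustration. With a real request, the pairs would presumably come from request.POST.items().

```python
import re
from collections import defaultdict

# Matches keys shaped like tab[0][val1]; the key layout is assumed from the question.
KEY_PATTERN = re.compile(r"^tab\[(\d+)\]\[(\w+)\]$")


def group_tabs(pairs):
    """Group flat 'tab[idx][field]' pairs into a list of dicts ordered by idx."""
    grouped = defaultdict(dict)
    for key, value in pairs:
        match = KEY_PATTERN.match(key)
        if match is None:
            continue  # ignore keys that are not part of the tab structure
        index, field = match.groups()
        grouped[int(index)][field] = value
    return [grouped[index] for index in sorted(grouped)]


# A plain dict stands in for the QueryDict here.
post_items = {
    "tab[0][val1]": "a",
    "tab[0][val2]": "b",
    "tab[1][val1]": "c",
    "tab[1][val2]": "d",
}.items()

xs = group_tabs(post_items)
for x in xs:
    print(x.get("val1"))
```

Each element of xs is an ordinary dict, so the loop from the question, calling x.get('val1'), works on it unchanged.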
