Python: Finding a minimum value according to a dictionary - python

So this is what I'd "like" to be able to write:
cur_loc = min(open_set,key=lambda x:costs[x])
cur_loc is a tuple, and the goal is to set it equal to the tuple in open_set with the lowest cost. (and you find the cost of x with costs[x])
How could I do this? I tried Python.org's documentation on min(), but I didn't seem to find much help.
Thanks!
EDIT:
I resolved my own problem.
I was retarded and hadn't initialized the costs dictionary. I was actually copy and pasting someone else's python code in order to test what they were doing, but apparently the snippet they created didn't include the initialization part. Woops. If anyone is interested:
for row in range(self.rows):
for col in range(self.cols):
myloc = (row,col)
if (myloc) not in closed_set:
costs[myloc] = (abs(end_row-row)+abs(end_col - col))*10
if (myloc) not in open_set:
open_set.add(myloc)
parents[myloc] = cur_loc
cur_loc = min(open_set,key=lambda x:costs[x])

Worked for me. What's your question?
>>> costs = { '1': 1, '2': 2, '3': 3 }
>>> open_set = set( ['1','2'] )
>>> min(open_set,key=lambda x:costs[x])
'1'

The you supplied would be correct if cost[x] was a dictionary lookup with tuples as keys. I think you may have meant to extract a field for the tuple and lookup the cost of that field:
>>> costs = dict(red=10, green=20, blue=30)
>>> open_set = {('red', 'car'), ('green', 'boat'), ('blue', 'plane')}
>>> min(open_set, key=lambda x: costs[x[0]])
('red', 'car')

Related

How to structure dictionary to apply to function with enumerate

I am trying to re-build a simple function, that ask for a dictionary as an input. No matter what I try I cannot figure out a minimum working example of a dictionary to pass through this function. I've read upon dictionaries and there is not so much room to create it differently, hence I do not know what the problem is.
I've tried to apply following minimum dictionary examples:
import nltk
#Different dictionaries to try as minimum working examples:
comments1 = {1 : 'Rockies', 2: 'Red Sox'}
comments2 = {'key1' : 'Rockies', 'key2': 'Red Sox'}
comments3 = dict([(1, 3), (2, 3)])
#Function:
def tokenize_body(comments):
tokens = {}
for idx, com_id in enumerate(comments):
body = comments[com_id]['body']
tokenized = [x.lower() for x in nltk.word_tokenize(body)]
tokens[com_id] = tokenized
return tokens
tokens = tokenize_body(comments1)
I know that with enumerate I am basically calling the index and the key, I can not figure out how to call the 'body', i.e the strings that I want to tokenize.
For both comments1 and comments2 with strings as inputs I receive the error: TypeError: string indices must be integers.
If I apply integers instead of strings, comments3, I receive the error:
TypeError: 'int' object is not subscriptable.
This may seem trivial to you, but I can not figure out what I am doing wrong. If you could provide a minimum working example, that would be highly appreciated.
In order to loop through a dictionary in python, you need to use the items method to get both keys and values:
comments = {"key1": "word", "key2": "word2"}
def tokenize_body(comments):
tokens = {}
for key, value in comments.items():
# values - word, word2
# keys - key1, key2
tokens[key] = [x.lower() for x in nltk.word_tokenize(value)]
return tokens
enumerate is used for lists, in order to get the index of an element:
l = ['a', 'b']
for index, elm in enumerate(l):
print(index) # => 0, 1
You might be looking for .items(), e.g.:
for idx, item in enumerate(comments1.items()):
print(idx, item)
This will print
0 (1, 'Rockies')
1 (2, 'Red Sox')
See a demo on ideone.com.

How to reduce on a list of tuples in python

I have an array and I want to count the occurrence of each item in the array.
I have managed to use a map function to produce a list of tuples.
def mapper(a):
return (a, 1)
r = list(map(lambda a: mapper(a), arr));
//output example:
//(11817685, 1), (2014036792, 1), (2014047115, 1), (11817685, 1)
I'm expecting the reduce function can help me to group counts by the first number (id) in each tuple. For example:
(11817685, 2), (2014036792, 1), (2014047115, 1)
I tried
cnt = reduce(lambda a, b: a + b, r);
and some other ways but they all don't do the trick.
NOTE
Thanks for all the advice on other ways to solve the problems, but I'm just learning Python and how to implement a map-reduce here, and I have simplified my real business problem a lot to make it easy to understand, so please kindly show me a correct way of doing map-reduce.
You could use Counter:
from collections import Counter
arr = [11817685, 2014036792, 2014047115, 11817685]
counter = Counter(arr)
print zip(counter.keys(), counter.values())
EDIT:
As pointed by #ShadowRanger Counter has items() method:
from collections import Counter
arr = [11817685, 2014036792, 2014047115, 11817685]
print Counter(arr).items()
Instead of using any external module you can use some logic and do it without any module:
track={}
if intr not in track:
track[intr]=1
else:
track[intr]+=1
Example code :
For these types of list problems there is a pattern :
So suppose you have a list :
a=[(2006,1),(2007,4),(2008,9),(2006,5)]
And you want to convert this to a dict as the first element of the tuple as key and second element of the tuple. something like :
{2008: [9], 2006: [5], 2007: [4]}
But there is a catch you also want that those keys which have different values but keys are same like (2006,1) and (2006,5) keys are same but values are different. you want that those values append with only one key so expected output :
{2008: [9], 2006: [1, 5], 2007: [4]}
for this type of problem we do something like this:
first create a new dict then we follow this pattern:
if item[0] not in new_dict:
new_dict[item[0]]=[item[1]]
else:
new_dict[item[0]].append(item[1])
So we first check if key is in new dict and if it already then add the value of duplicate key to its value:
full code:
a=[(2006,1),(2007,4),(2008,9),(2006,5)]
new_dict={}
for item in a:
if item[0] not in new_dict:
new_dict[item[0]]=[item[1]]
else:
new_dict[item[0]].append(item[1])
print(new_dict)
output:
{2008: [9], 2006: [1, 5], 2007: [4]}
After writing my answer to a different question, I remembered this post and thought it would be helpful to write a similar answer here.
Here is a way to use reduce on your list to get the desired output.
arr = [11817685, 2014036792, 2014047115, 11817685]
def mapper(a):
return (a, 1)
def reducer(x, y):
if isinstance(x, dict):
ykey, yval = y
if ykey not in x:
x[ykey] = yval
else:
x[ykey] += yval
return x
else:
xkey, xval = x
ykey, yval = y
a = {xkey: xval}
if ykey in a:
a[ykey] += yval
else:
a[ykey] = yval
return a
mapred = reduce(reducer, map(mapper, arr))
print mapred.items()
Which prints:
[(2014036792, 1), (2014047115, 1), (11817685, 2)]
Please see the linked answer for a more detailed explanation.
If all you need is cnt, then a dict would probably be better than a list of tuples here (if you need this format, just use dict.items).
The collections module has a useful data structure for this, a defaultdict.
from collections import defaultdict
cnt = defaultdict(int) # create a default dict where the default value is
# the result of calling int
for key in arr:
cnt[key] += 1 # if key is not in cnt, it will put in the default
# cnt_list = list(cnt.items())

Extending python dictionary and changing key values

Assume I have a python dictionary with 2 keys.
dic = {0:'Hi!', 1:'Hello!'}
What I want to do is to extend this dictionary by duplicating itself, but change the key value.
For example, if I have a code
dic = {0:'Hi!', 1:'Hello'}
multiplier = 3
def DictionaryExtend(number_of_multiplier, dictionary):
"Function code"
then the result should look like
>>> DictionaryExtend(multiplier, dic)
>>> dic
>>> dic = {0:'Hi!', 1:'Hello', 2:'Hi!', 3:'Hello', 4:'Hi!', 5:'Hello'}
In this case, I changed the key values by adding the multipler at each duplication step. What's the efficient way of doing this?
Plus, I'm also planning to do the same job for list variable. I mean, extend a list by duplicating itself and change some values like above exmple. Any suggestion for this would be helpful, too!
You can try itertools to repeat the values and OrderedDict to maintain input order.
import itertools as it
import collections as ct
def extend_dict(multiplier, dict_):
"""Return a dictionary of repeated values."""
return dict(enumerate(it.chain(*it.repeat(dict_.values(), multiplier))))
d = ct.OrderedDict({0:'Hi!', 1:'Hello!'})
multiplier = 3
extend_dict(multiplier, d)
# {0: 'Hi!', 1: 'Hello!', 2: 'Hi!', 3: 'Hello!', 4: 'Hi!', 5: 'Hello!'}
Regarding handling other collection types, it is not clear what output is desired, but the following modification reproduces the latter and works for lists as well:
def extend_collection(multiplier, iterable):
"""Return a collection of repeated values."""
repeat_values = lambda x: it.chain(*it.repeat(x, multiplier))
try:
iterable = iterable.values()
except AttributeError:
result = list(repeat_values(iterable))
else:
result = dict(enumerate(repeat_values(iterable)))
return result
lst = ['Hi!', 'Hello!']
multiplier = 3
extend_collection(multiplier, lst)
# ['Hi!', 'Hello!', 'Hi!', 'Hello!', 'Hi!', 'Hello!']
It's not immediately clear why you might want to do this. If the keys are always consecutive integers then you probably just want a list.
Anyway, here's a snippet:
def dictExtender(multiplier, d):
return dict(zip(range(multiplier * len(d)), list(d.values()) * multiplier))
I don't think you need to use inheritance to achieve that. It's also unclear what the keys should be in the resulting dictionary.
If the keys are always consecutive integers, then why not use a list?
origin = ['Hi', 'Hello']
extended = origin * 3
extended
>> ['Hi', 'Hello', 'Hi', 'Hello', 'Hi', 'Hello']
extended[4]
>> 'Hi'
If you want to perform a different operation with the keys, then simply:
mult_key = lambda key: [key,key+2,key+4] # just an example, this can be any custom implementation but beware of duplicate keys
dic = {0:'Hi', 1:'Hello'}
extended = { mkey:dic[key] for key in dic for mkey in mult_key(key) }
extended
>> {0:'Hi', 1:'Hello', 2:'Hi', 3:'Hello', 4:'Hi', 5:'Hello'}
You don't need to extend anything, you need to pick a better input format or a more appropriate type.
As others have mentioned, you need a list, not an extended dict or OrderedDict. Here's an example with lines.txt:
1:Hello!
0: Hi.
2: pylang
And here's a way to parse the lines in the correct order:
def extract_number_and_text(line):
number, text = line.split(':')
return (int(number), text.strip())
with open('lines.txt') as f:
lines = f.readlines()
data = [extract_number_and_text(line) for line in lines]
print(data)
# [(1, 'Hello!'), (0, 'Hi.'), (2, 'pylang')]
sorted_text = [text for i,text in sorted(data)]
print(sorted_text)
# ['Hi.', 'Hello!', 'pylang']
print(sorted_text * 2)
# ['Hi.', 'Hello!', 'pylang', 'Hi.', 'Hello!', 'pylang']
print(list(enumerate(sorted_text * 2)))
# [(0, 'Hi.'), (1, 'Hello!'), (2, 'pylang'), (3, 'Hi.'), (4, 'Hello!'), (5, 'pylang')]

Append several variables to a list in Python

I want to append several variables to a list. The number of variables varies. All variables start with "volume". I was thinking maybe a wildcard or something would do it. But I couldn't find anything like this. Any ideas how to solve this? Note in this example it is three variables, but it could also be five or six or anything.
volumeA = 100
volumeB = 20
volumeC = 10
vol = []
vol.append(volume*)
You can use extend to append any iterable to a list:
vol.extend((volumeA, volumeB, volumeC))
Depending on the prefix of your variable names has a bad code smell to me, but you can do it. (The order in which values are appended is undefined.)
vol.extend(value for name, value in locals().items() if name.startswith('volume'))
If order is important (IMHO, still smells wrong):
vol.extend(value for name, value in sorted(locals().items(), key=lambda item: item[0]) if name.startswith('volume'))
Although you can do
vol = []
vol += [val for name, val in globals().items() if name.startswith('volume')]
# replace globals() with locals() if this is in a function
a much better approach would be to use a dictionary instead of similarly-named variables:
volume = {
'A': 100,
'B': 20,
'C': 10
}
vol = []
vol += volume.values()
Note that in the latter case the order of items is unspecified, that is you can get [100,10,20] or [10,20,100]. To add items in an order of keys, use:
vol += [volume[key] for key in sorted(volume)]
EDIT removed filter from list comprehension as it was highlighted that it was an appalling idea.
I've changed it so it's not too similar too all the other answers.
volumeA = 100
volumeB = 20
volumeC = 10
lst = map(lambda x : x[1], filter(lambda x : x[0].startswith('volume'), globals().items()))
print lst
Output
[100, 10, 20]
do you want to add the variables' names as well as their values?
output=[]
output.append([(k,v) for k,v in globals().items() if k.startswith('volume')])
or just the values:
output.append([v for k,v in globals().items() if k.startswith('volume')])
if I get the question appropriately, you are trying to append different values in different variables into a list. Let's see the example below.
Assuming :
email = 'example#gmail.com'
pwd='Mypwd'
list = []
list.append(email)
list.append (pwd)
for row in list:
print(row)
# the output is :
#example#gmail.com
#Mypwd
Hope this helps, thank you.

SQLite compare query Python

I've been trying to figure out the best way to write a query to compare the rows in two tables. My goal is to see if the two tuples in result Set A are in the larger result set B. I only want to see the tuples that are different in the query results.
'''SELECT table1.field_b, table1.field_c, table1.field_d
'''FROM table1
'''ORDER BY field_b
results_a = [(101010101, 111111111, 999999999), (121212121, 222222222, 999999999)]
'''SELECT table2.field_a, table2.fieldb, table3.field3
'''FROM table2
'''ORDER BY field_a
results_b =[(101010101, 111111111, 999999999), (121212121, 333333333, 999999999), (303030303, 444444444, 999999999)]
So what I want to do is take results_a and make sure that they have an exact match somewhere in results_b. So since the second record in the second tuple is different than what is in results_a, I would like to return the second tuple in results_a.
Ultimately I would like to return a set that also has the second tuple that did not match in the other set so I could reference both in my program. Ideally since the second tuples primary key (field_b in table1) didn't match the corresponding primary key (field_a) in table2 then I would want to display results_c ={(121212121, 222222222, 999999999):(121212121, 222222222, 999999999)}. This is complicated by the facts that the results in both tables will not be in the same order so I can't write code that says (compare tuple2 in results_a to tuple2 in results_b). It is more like (compare tuple2 in results_a and see if it matches any record in results_b. If the primary keys match and none of the tuples in results b completely match or no partial match is found return the records that don't match.)
I apologize that this is so wordy. I couldn't think of a better way to explain it. Any help would be much appreciated.
Thanks!
UPDATED EFFORT ON PARTIAL MATCHES
a = [(1, 2, 3),(4,5,7)]
b = [(1, 2, 3),(4,5,6)]
pmatch = dict([])
def partial_match(x,y):
return sum(ea == eb for (ea,eb) in zip(x,y))>=2
for el_a in a:
pmatch[el_a] = [el_b for el_b in b if partial_match(el_a,el_b)]
print(pmatch)
OUTPUT = {(4, 5, 7): [(4, 5, 6)], (1, 2, 3): [(1, 2, 3)]}. I would have expected it to be just {(4,5,7):(4,5,6)} because those are the only sets that are different. Any ideas?
Take results_a and make sure that they have an exact match somewhere in results_b:
for el in results_a:
if el in results_b:
...
Get partial matches:
pmatch = dict([])
def partial_match(a,b):
# for instance ...
return sum(ea == eb for (ea,eb) in zip(a,b)) >= 2
for el_a in results_a:
pmatch[el_a] = [el_b for el_b in results_b if partial_macth(el_a,el_b)]
Return the records that don't match:
no_match = [el for el in results_a if el not in results_b]
-- EDIT / Another possible partial_match
def partial_match(x,y):
nb_matches = sum(ea == eb for (ea,eb) in zip(x,y))
return 0.6 < float(nb_matches) / len(x) < 1

Categories