can't group by anagram correctly - python

I wrote a Python function to group a list of words by anagram:
def groupByAnagram(list):
    dic = {}
    for x in list:
        sort = ''.join(sorted(x))
        if sort in dic == True:
            dic[sort].append(x)
        else:
            dic[sort] = [x]
    for y in dic:
        for z in dic[y]:
            print z
groupByAnagram(['cat','tac','dog','god','aaa'])
but this only prints:
aaa
god
tac
What am I doing wrong?

if sort in dic == True:
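Because Python chains comparison operators (in and == are both comparisons), this line is equivalent to
if (sort in dic) and (dic == True):
But dic is a dictionary, so it will never compare equal to True. The condition is therefore always false, the else branch runs every time, and each list is overwritten with the latest word. Just drop the == True comparison entirely.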
if sort in dic:
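The chaining behaviour can be checked directly. The snippet below is a minimal sketch in Python 3; the sample dictionary and the variable names chained, expanded and fixed are made up for illustration.

```python
# Hypothetical sample data: one anagram key that is already present.
dic = {"act": ["cat"]}
sort = "act"

# Chained comparison: parsed as (sort in dic) and (dic == True).
chained = sort in dic == True

# The explicit expansion gives the same result.
expanded = (sort in dic) and (dic == True)

# The intended membership test.
fixed = sort in dic

print(chained, expanded, fixed)
```

Even though the key is present, the chained form is false, because a dictionary never equals True.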

Remove the "== True" from your if clause; sort in dic is already a boolean test. Change the if clause to:
if sort in dic:
and everything works as expected.
You can also drop the if clause altogether by using defaultdict from the collections module. That way you do not have to check, for each word, whether a new list needs to be created.
import collections
def groupByAnagram2(word_list):
    dic = collections.defaultdict(list)
    for x in word_list:
        sort = ''.join(sorted(x))
        dic[sort].append(x)
    for words in dic.values():
        for word in words:
            print(word)
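If you would rather get the groups back than print them, a variant along these lines may help. It is a sketch in Python 3; the name group_anagrams is an assumption, and the group order shown relies on dictionaries preserving insertion order (Python 3.7+).

```python
from collections import defaultdict

def group_anagrams(words):
    """Return a list of lists, one list per set of anagrams."""
    groups = defaultdict(list)
    for word in words:
        # Anagrams share the same sorted-letter key, e.g. 'cat' and 'tac' -> 'act'.
        key = "".join(sorted(word))
        groups[key].append(word)
    return list(groups.values())

print(group_anagrams(['cat', 'tac', 'dog', 'god', 'aaa']))
```

Returning the groups keeps the function reusable; the caller decides how to print them.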

Related

Return key of list value while searching for element in list Python

Here is an example. I have a dictionary:
word = 'mango'
my_dict = {'A': ['apple', 'banana', 'pear'],
           'B': ['mango', 'carrot', 'guava'],
           'C': ['orange', 'lemon', 'ginger']}
I want to get 'B' as the answer by searching through all the list values. How could I do this? Functions and comprehensions are both acceptable. Please help me out.

Something like:
word = 'mango'
my_dict = {'A': ['apple', 'banana', 'pear'],
           'B': ['mango', 'carrot', 'guava'],
           'C': ['orange', 'lemon', 'ginger']}
def search_value():
    for key, _list in my_dict.items():
        if word in _list:
            return key
print(search_value()) # B
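Since comprehensions were also asked about, here is a sketch that takes the dictionary and the word as parameters and uses a generator expression with next(). The name find_key and the choice of None as the "not found" result are assumptions made for illustration.

```python
def find_key(mapping, target):
    """Return the first key whose list contains target, or None if there is none."""
    return next((key for key, values in mapping.items() if target in values), None)

my_dict = {'A': ['apple', 'banana', 'pear'],
           'B': ['mango', 'carrot', 'guava'],
           'C': ['orange', 'lemon', 'ginger']}

print(find_key(my_dict, 'mango'))
print(find_key(my_dict, 'kiwi'))
```

Passing the data in as arguments avoids relying on module-level variables.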

matches key with value in dictionary

Compare every key with its value, letter by letter (like a spelling check between key and value). If only 2 letters mismatch, print the key.
input = {"their": "thuor", "diksha": "dijmno"}
output = ["their"]

def find_correct(words_dict):
    final_list = []
    for key, value in words_dict.items():
        count = 0  # reset the mismatch counter for every key
        for i in range(len(value)): # this may need adjusting for different length words
            if value[i] != key[i]:
                count += 1
        if count <= 2:
            final_list.append(key)
    return final_list
print(find_correct({"their":"thuor","diksha":"dijmno"}))
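
This can also be done with a list comprehension and sets, where d is the input dictionary. Note that this compares the sets of letters rather than letter positions, so it ignores order and repeated letters:
print([i for i in d if len(set(i) - set(d[i])) == 2])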
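A position-by-position comparison can also be written with zip. The sketch below is one possible reading of the requirement: the name find_close_keys, the max_mismatches parameter, and counting a length difference as extra mismatches are all assumptions, not something the question specifies.

```python
def find_close_keys(words, max_mismatches=2):
    """Return keys that differ from their value in at most max_mismatches positions."""
    result = []
    for key, value in words.items():
        # Count positions where the letters differ.
        mismatches = sum(a != b for a, b in zip(key, value))
        # Assumption: every extra letter in the longer word counts as a mismatch.
        mismatches += abs(len(key) - len(value))
        if mismatches <= max_mismatches:
            result.append(key)
    return result

print(find_close_keys({"their": "thuor", "diksha": "dijmno"}))
```

Because zip stops at the shorter word, this version does not raise an IndexError when key and value have different lengths.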

Python how to replace values in one list with values in a dictionary?

Example:
My list is ['tree','world','tre','worl']
My dict is {'tre':'good','worl':'nice'}
My script:
def replace(list, dictionary):
    for i in list:
        for k in dictionary:
            list = list.replace(k, dictionary[k])
    return list
print replace(input_file,curaw_dict)
But every time, the result I get looks like this:
goode
niced
good
nice
How can I make it more accurate, so that the output is:
tree
world
good
nice
Thanks a lot.

Let's make it a list comprehension instead.
replaced_list = [x if x not in my_dict else my_dict[x] for x in my_list]
I guess if you want a function you could do:
replace = lambda my_list, my_dict: [x if x not in my_dict else my_dict[x] for x in my_list]
or
def replace(my_list, my_dict):
    return [x if x not in my_dict else my_dict[x] for x in my_list]

input_file = ['tree', 'world', 'tre', 'worl']
curaw_dict = {'tre':'good','worl':'nice'}
def replace(list, dictionary):
    # Look each item up in the dictionary passed in; fall back to the item itself
    return [dictionary.get(item, item) for item in list]
print(replace(input_file, curaw_dict))
>>> li=['tree', 'world', 'tre', 'worl']
>>> di={'tre':'good','worl':'nice'}
>>> print('\n'.join(di.get(e,e) for e in li))
tree
world
good
nice

def replace(list, dictionary):
    for idx, val in enumerate(list):
        # Replace the item in place only when it is an exact key of the dictionary
        if val in dictionary:
            list[idx] = dictionary[val]
    return list
print(replace(input_file, curaw_dict))

You don't need to iterate over the dictionary. str.replace does partial (substring) replacements, which is why 'tree' became 'goode', whereas in checks whether an exact key exists in a dictionary.
'key' in dictionary is the way to check if a key exists in a dict:
def replace(list, dictionary):
    new_list = []
    for i in list:
        if i in dictionary:
            new_list.append(dictionary[i])
        else:
            new_list.append(i)
    return new_list

You do not need for k in dictionary: here; it loops over every key for every list item when a direct lookup is enough. Instead, use enumerate to iterate over the items in the list and look them up in the dictionary:
def replace(lst, dictionary):
    for k, v in enumerate(lst):
        if v in dictionary:
            lst[k] = dictionary[v]
    return lst
For each item in the list, k is the index of the item and v is the item itself. You then check whether the item is a key in the dictionary, and if it is, you replace the item in the list with the corresponding dictionary value.
You also should not name your variables list, since that shadows the built-in list type in Python.
You can alternatively use a list comprehension:
def replace(lst, dictionary):
    return [item if item not in dictionary else dictionary[item] for item in lst]
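To see why the original script produced goode and niced, here is a small sketch contrasting substring replacement with exact lookup. The function names replace_substrings and replace_exact are illustrative, and replace_substrings is only a guess at what the question's script effectively did to each word.

```python
my_list = ['tree', 'world', 'tre', 'worl']
my_dict = {'tre': 'good', 'worl': 'nice'}

def replace_substrings(items, mapping):
    """Apply str.replace for every key: matches anywhere inside a word."""
    result = []
    for item in items:
        for old, new in mapping.items():
            item = item.replace(old, new)  # 'tree' contains 'tre' -> 'goode'
        result.append(item)
    return result

def replace_exact(items, mapping):
    """Replace an item only when the whole item is a key of the mapping."""
    return [mapping.get(item, item) for item in items]

print(replace_substrings(my_list, my_dict))
print(replace_exact(my_list, my_dict))
```

dict.get(item, item) returns the mapped value when the key exists and the item itself otherwise, so no explicit membership test is needed.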

Finding if there are distinct elements in a python dictionary

I have a Python dictionary containing n key-value pairs, out of which n-1 values are identical and 1 is not. I need to find the key of the distinct element.
For example, consider the Python list [{'a':1},{'b':1},{'c':2},{'d':1}]. I need to get 'c' as the output.
I can use a for loop to compare consecutive elements and then use two more for loops to compare those elements with the other elements. But is there a more efficient way to go about it, or perhaps a built-in function which I am unaware of?

If you have a dictionary, you can cycle through its keys and return the first key whose value differs from the values of the next two keys (wrapping around at the end).
Here's an example:
def find_different(d):
    k = d.keys()
    for i in xrange(0, len(k)):
        if d[k[i]] != d[k[(i+1)%len(k)]] and d[k[i]] != d[k[(i+2)%len(k)]]:
            return k[i]
>>> mydict = {'a':1, 'b':1, 'c':2, 'd':1}
>>> find_different(mydict)
'c'
Otherwise, if what you have is a list of single-key dictionaries, you can first map the list through a function that extracts the value from each element, then check each one using the same logic.
Here's another working example:
def find_different(l):
    mask = map(lambda x: x[x.keys()[0]], l)
    for i in xrange(0, len(l)):
        if mask[i] != mask[(i+1)%len(l)] and mask[i] != mask[(i+2)%len(l)]:
            return l[i].keys()[0]
>>> mylist = [{'a':1},{'b':1},{'c':2},{'d':1}]
>>> find_different(mylist)
'c'
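NOTE: these solutions are written for Python 2. They do not work unchanged in Python 3, where xrange no longer exists, map returns an iterator rather than a list, and the .keys() method of dictionaries returns a view that cannot be indexed.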
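For Python 3, the same cyclic comparison could be sketched as follows. This is an adaptation rather than the answer's original code; it assumes the dictionary has at least three entries, and returning None when nothing stands out is an added convention.

```python
def find_different(d):
    """Return the key whose value differs from the values of the next two keys."""
    keys = list(d)  # materialise the keys so they can be indexed
    n = len(keys)
    for i, key in enumerate(keys):
        next_one = d[keys[(i + 1) % n]]
        next_two = d[keys[(i + 2) % n]]
        if d[key] != next_one and d[key] != next_two:
            return key
    return None  # assumption: no distinct value found

mydict = {'a': 1, 'b': 1, 'c': 2, 'd': 1}
print(find_different(mydict))
```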

Assuming that your "list of pairs" (actually a list of dictionaries, sigh) cannot be changed:
from collections import defaultdict
def get_pair(d):
    # Each dictionary holds exactly one key-value pair
    return next(iter(d.items()))
def extract_unique(l):
    d = defaultdict(list)
    for key, value in map(get_pair, l):
        d[value].append(key)
    # Return the list of keys whose value occurs only once, e.g. ['c']
    return [keys for keys in d.values() if len(keys) == 1][0]

If you already have your dictionary, make a list of all of its keys: key_list = list(yourDic.keys()). Using that list, you can then loop through your dictionary. This is easier if you know one of the values, but below I assume that you do not.
yourDic = {'a': 1, 'b': 4, 'c': 1, 'd': 1}
key_list = list(yourDic.keys())
previous_value = yourDic[key_list[0]] # Making it so loop gets past first test
count = 0
for key in key_list:
    test_value = yourDic[key]
    if (test_value != previous_value) and count == 1: # Checks first key
        # The first two values differ, so the third value decides which one is the odd one out
        if yourDic[key_list[2]] == test_value:
            print(key_list[0])
        else:
            print(key)
        break
    elif test_value != previous_value:
        print(key)
        break
    else:
        previous_value = test_value
    count += 1
So, once it finds the value that is different, it prints the key. If you also want the value, print the dictionary entry for that key as well.
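Another way to approach this, sketched here in Python 3, is to count how often each value occurs with collections.Counter and return the key whose value occurs once. The name find_odd_key is an assumption, and the approach requires the values to be hashable.

```python
from collections import Counter

def find_odd_key(d):
    """Return the key whose value occurs exactly once, or None if there is none."""
    counts = Counter(d.values())  # value -> number of occurrences
    for key, value in d.items():
        if counts[value] == 1:
            return key
    return None

# A list of single-key dictionaries can be merged into one dictionary first.
mylist = [{'a': 1}, {'b': 1}, {'c': 2}, {'d': 1}]
merged = {k: v for item in mylist for k, v in item.items()}

print(find_odd_key(merged))
```

This makes two linear passes over the data and does not depend on the order of the keys.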

How to turn a dictionary "inside-out"

Disclaimer: I am just getting started learning Python
I have a function that counts the number of times a word appears in a text file and sets the word as the key and the count as the value, and stores it in a dictionary "book_index". Here is my code:
alice = open('location of the file', 'r', encoding = "cp1252")
def book_index(alice):
    """Alice is a file reference"""
    """Alice is opened, nothing else is done"""
    worddict = {}
    line = 0
    for ln in alice:
        words = ln.split()
        for wd in words:
            if wd not in worddict:
                worddict[wd] = 1 #if wd is not in worddict, increase the count for that word to 1
            else:
                worddict[wd] = worddict[wd] + 1 #if wd IS in worddict, increase the count for that word BY 1
        line = line + 1
    return(worddict)
I need to turn that dictionary "inside out" and use the count as the key, and any word that appears x amount of times as the value. For instance: [2, 'hello', 'hi'] where 'hello' and 'hi' appear twice in the text file.
Do I need to loop through my existing dictionary or loop through the text file again?

Because a dictionary maps keys to values, you cannot efficiently look entries up by value. You will have to loop through all items in the dictionary to find the keys whose value matches a specific value.
This will print out all keys in the dictionary d whose value is equal to searchValue:
for k, v in d.items():
    if v == searchValue:
        print(k)
Regarding your book_index function, note that you can use the standard library's collections.Counter for counting things. Counter is essentially a dictionary that uses counts as its values and automatically takes care of nonexistent keys. Using a Counter, your code would look like this:
from collections import Counter
def book_index(alice):
    worddict = Counter()
    for ln in alice:
        worddict.update(ln.split())
    return worddict
Or, as roippi suggested in a comment on another answer, just worddict = Counter(word for line in alice for word in line.split()).

Personally I would suggest using a Counter object here, which is made specifically for this kind of application. For instance:
from collections import Counter
counter = Counter()
for ln in alice:
    counter.update(ln.split())
This gives you the relevant dictionary, and if you read the Counter docs you will see that you can retrieve the most common results directly with its most_common() method.
This might not cover every case in your problem, but it is slightly nicer than iterating manually, even for the initial count.
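As a small illustration of retrieving the most common results, the sketch below uses Counter.most_common. The lines list is a made-up stand-in for the opened file.

```python
from collections import Counter

# Hypothetical stand-in for the lines of the text file.
lines = ["hello hi world", "hello hi", "world world world"]

counter = Counter()
for ln in lines:
    counter.update(ln.split())

# most_common(n) returns the n most frequent (word, count) pairs.
top = counter.most_common(1)
print(top)
```

Calling most_common() without an argument returns all pairs, sorted from most to least frequent.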
If you really want to "flip" this dictionary, you could do something along these lines:
matching_values = lambda value: (word for word, freq in worddict.items() if freq == value)
{value: matching_values(value) for value in set(worddict.values())}
Because each value is a lazy generator, building this mapping does not search the dictionary for matching words; that only happens when a generator is consumed. This makes it faster in sparse cases where you look up only a few counts, or only want to discover which counts have entries at all.
That said, this solution will usually be worse than the plain iteration solution, since it iterates through the whole dictionary again for every count you look up.

Not radically different, but I didn't want to just copy the other answers here.
Loop through your existing dictionary, here is an example using dict.setdefault():
countdict = {}
for k, v in worddict.items():
    countdict.setdefault(v, []).append(k)
Or with collections.defaultdict:
import collections
countdict = collections.defaultdict(list)
for k, v in worddict.items():
    countdict[v].append(k)
Personally I prefer the setdefault() method because the result is a regular dictionary.
Example:
>>> worddict = {"hello": 2, "hi": 2, "world": 4}
>>> countdict = {}
>>> for k, v in worddict.items():
...     countdict.setdefault(v, []).append(k)
...
>>> countdict
{2: ['hi', 'hello'], 4: ['world']}
As noted in some of the other answers, you can significantly shorten your book_index function by using collections.Counter.

Without duplicates (each count belongs to only one word):
word_by_count_dict = {value: key for key, value in worddict.items()}
See PEP 274 to understand dictionary comprehension with Python: http://www.python.org/dev/peps/pep-0274/
With duplicates:
import collections
words_by_count_dict = collections.defaultdict(list)
for key, value in worddict.items():
    words_by_count_dict[value].append(key)
This way:
words_by_count_dict[2] == ["hello", "hi"]
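Putting the counting and the inversion together, a minimal end-to-end sketch in Python 3 might look like this. The function name invert_counts and the sample text_lines are assumptions standing in for the real file.

```python
from collections import Counter, defaultdict

def invert_counts(word_counts):
    """Map each count to the list of words that occur that many times."""
    inverted = defaultdict(list)
    for word, count in word_counts.items():
        inverted[count].append(word)
    return dict(inverted)  # hand back a regular dictionary

# Hypothetical stand-in for the lines of the text file.
text_lines = ["hello hi world", "hello hi world", "world world"]

word_counts = Counter(word for line in text_lines for word in line.split())
by_count = invert_counts(word_counts)
print(by_count)
```

The existing dictionary is looped over once; the text file does not need to be read again.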
