How to get occurrence of word length? Python3 - python

Ran into another problem. I'm trying to make a dictionary where the key is the word length and the value is the number of times a word of that length appears in a text file.
My Code:
words = new_text.split()
w_dict = {}
w_list = []
for c in words:
    if len(c) not in w_dict.fromkeys(range(0, 1000)):
        w_dict[len(c)] += 1
    else:
        w_dict[len(c)] = 1
w_list = sorted(w_dict.items(), key = lambda x: x[0])
w_final_dict = dict(w_list)
print(w_final_dict)
My Output:
{1: 1, 2: 1, 4: 1, 5: 1}
My sample text was "hello my name is Kate". Based on that, I know it does iterate and does check whether there is a word length that matches the text, since there is no word with len(3) in the output. But there are 2 words with len(4) and 2 with len(5), so I don't understand why it didn't increment. Any help would be great. Thanks in advance!

I think the problem in your code is this check:
if len(c) not in w_dict.fromkeys(range(0, 1000)):
You just want to check w_dict itself, not the result of fromkeys, which is a whole new dictionary containing every key from 0 to 999.
But you can do this whole thing in one line with collections.Counter:
>>> new_text = "hello my name is kate"
>>> from collections import Counter
>>> Counter(map(len, new_text.split()))
Counter({2: 2, 4: 2, 5: 1})
Counter takes an iterable for its constructor and it produces a dict where each item from the iterable is a key and the value is the number of times that value appeared. map(len, new_text.split()) gives us an iterable of the lengths of all the words in the string, so passing that to Counter gives us the dictionary of counts that we want.
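If the counts should also come out ordered by word length, as in the last step of the question's code, the Counter can be sorted before it is turned back into a plain dict. A minimal sketch, assuming Python 3.7+ (where dicts keep insertion order); the variable names are only illustrative:

```python
from collections import Counter

new_text = "hello my name is Kate"

# Count how many words have each length.
length_counts = Counter(map(len, new_text.split()))

# Sort the (length, count) pairs by length and rebuild a plain dict.
sorted_counts = dict(sorted(length_counts.items()))
print(sorted_counts)  # {2: 2, 4: 2, 5: 1}
```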

There is a problem with the logic of your if statement.
Your condition is if len(c) not in w_dict.fromkeys(range(0, 1000)):
The call w_dict.fromkeys(range(0, 1000)) returns a new dictionary like this:
{0: None, 1: None, 2: None, 3: None, 4: None, 5: None, .... .... 999: None}
So this check does not look at the keys that are actually stored in w_dict.
Every word length below 1000 is a key of that new dictionary, so the condition if len(c) not in w_dict.fromkeys(range(0, 1000)): always evaluates to False. The count is never incremented; the else branch just keeps overwriting it with 1.
That is why you get the output {1: 1, 2: 1, 4: 1, 5: 1}
Correct Solution
Change your condition to this
if w_dict.get(len(c)):
.get(key) is a built-in dictionary method that returns the value stored under key if the key exists. It does not raise a KeyError when the key is missing.
If the key exists => returns the value stored at that key => the condition is True (for any non-zero count)
If the key does not exist => returns None => the condition is False
So you get desired results
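Put together, the corrected loop might look like the sketch below (the function name is only for illustration). One caveat: .get(key) returns the stored value, so the condition is also false when a key exists but holds a falsy value such as 0. That cannot happen here because counts start at 1, but a plain "key in w_dict" test avoids the issue entirely.

```python
def count_word_lengths(text):
    """Map each word length to the number of words with that length."""
    w_dict = {}
    for word in text.split():
        if w_dict.get(len(word)):      # key exists and its count is non-zero
            w_dict[len(word)] += 1
        else:                          # first word of this length
            w_dict[len(word)] = 1
    # Return the counts ordered by word length.
    return dict(sorted(w_dict.items()))

print(count_word_lengths("hello my name is Kate"))  # {2: 2, 4: 2, 5: 1}
```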
Remember: .get(key) is handy whenever a key you look up might be missing from a dictionary
Refer to dictionary tutorials on the web to learn more about how to iterate over dictionaries and check conditions on them
Hope this helps and clears up your doubts :)

Keep it simple. Check if the len(word) exists as a key in the dict and add to the count.
If it does not exist, create the key, value pair.
string = 'hello my name is Kate'
data = {}
for word in string.split():
    if len(word) not in data:
        data[len(word)] = 1
    else:
        data[len(word)] += 1
Output:
>>> data
{5: 1, 2: 2, 4: 2}

The reason you don't see 2 len(4) and 2 len(5) key values is that Python doesn't allow duplicate keys in dictionaries.
Keys in Python are used as unique identifiers. Duplicates would create ambiguity. If you try to add a key/value pair but the key already exists, Python simply replaces the value stored under that key with the new value.
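A small sketch of that overwrite behaviour, next to the increment the question was actually after (variable names are illustrative):

```python
lengths = [5, 2, 4, 2, 4]  # word lengths of "hello my name is Kate"

overwritten = {}
for n in lengths:
    overwritten[n] = 1          # assigning to an existing key replaces its value

incremented = {}
for n in lengths:
    incremented[n] = incremented.get(n, 0) + 1   # add to the existing count

print(overwritten)   # {5: 1, 2: 1, 4: 1}
print(incremented)   # {5: 1, 2: 2, 4: 2}
```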

Related

Extract a list of keys by Sorting the dictionary in python

I have my program's output as a Python dictionary and I want a list of keys from dictn:
s = "cool_ice_wifi"
r = ["water_is_cool", "cold_ice_drink", "cool_wifi_speed"]
good_list=s.split("_")
dictn={}
for i in range(len(r)):
    split_review=r[i].split("_")
    counter=0
    for good_word in good_list:
        if good_word in split_review:
            counter=counter+1
    d1={i:counter}
    dictn.update(d1)
print(dictn)
The conditions the list of keys should satisfy:
Keys with the same value should keep their original relative order in the output list.
Keys with the highest values should come first, followed by those with lower values.
dictn = {0: 1, 1: 1, 2: 2}
Expected output = [2,0,1]
You can use a list comp:
[key for key in sorted(dictn, key=dictn.get, reverse=True)]
In Python 3 you can use the built-in sorted() function, as described here, to sort the dictionary in any way you choose.
Check out the documentation, but in the simplest case you can pass the dictionary's .get method as the sort key to sort by value, while for more complex orderings you'd define a key function yourself.
Dictionaries in Python 3.7+ are insertion-ordered, so another way to do things is to insert the items in sorted order when the dictionary is created, or you could use an OrderedDict.
Here's an example of the first option in action, which I think is the easiest:
>>> a = {}
>>> a[0] = 1
>>> a[1] = 1
>>> a[2] = 2
>>> print(a)
{0: 1, 1: 1, 2: 2}
>>>
>>> [(k) for k in sorted(a, key=a.get, reverse=True)]
[2, 0, 1]
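This satisfies the tie-breaking rule in the question because sorted() is stable, and it stays stable with reverse=True, so keys with equal values keep their original relative order. A sketch that compares this with an explicit tie-break (assuming the keys themselves are comparable, as the integer indices are here):

```python
dictn = {0: 1, 1: 1, 2: 2}

# Relies on sort stability: ties keep the dictionary's iteration order.
by_stability = sorted(dictn, key=dictn.get, reverse=True)

# Explicit version: highest value first, then smallest key first.
by_explicit_key = sorted(dictn, key=lambda k: (-dictn[k], k))

print(by_stability)     # [2, 0, 1]
print(by_explicit_key)  # [2, 0, 1]
```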

Unable to solve dictionary update value error while parsing financial statement

I'm parsing the below financial statement and trying to create dictionaries out of them. But I keep getting this error: ValueError: dictionary update sequence element #0 has length 1; 2 is required
Below is the cleaned financial statement:
[['XXX XXX LTD.'],
['Statement of Loss and Retained Earnings'],
['For the Year Ended May', 'XX,', 'XXXX'],
['Unaudited - See Notice To Reader'],
['XXXX', 'XXXX'],
['REVENUE', 'XXX,XXX,XXX', 'XXX,XXX,XXX']
]
Below is the code that I'm using to create dictionaries:
Python 3.6
for temp in cleaned_list:
    if len(temp) == 1:
        statement[temp[0]] = temp[0]
    elif len(temp) > 1:
        statement[temp[0]] = {}
        for temp_1 in temp[1:]:
            statement[temp[0]].update(temp_1)
If the list has a length of one, I want to make the entry of that list both its dictionary key and value. If the list entry has multiple items, I want to make the first entry the key, and the remaining entries the values. I'm not sure what the error that I'm getting is, and why it's occurring. Why do you think this is happening and how can I fix it?
As detailed here, the update() method updates a dictionary with elements from another dictionary or from an iterable of key/value pairs. You are getting the error because temp_1 is a plain string: update() iterates over it and finds single characters rather than key/value pairs, so there is no key to associate with each value.
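The error can be reproduced in isolation. A minimal sketch with placeholder values (the key names are made up), where update() is first given a plain string and then proper key/value pairs:

```python
entry = {}

try:
    # A string is an iterable of 1-character strings, not of (key, value) pairs.
    entry.update("XXX,XXX")
except ValueError as exc:
    error_message = str(exc)
    print(error_message)

# update() works once it is given key/value pairs or another dict.
entry.update([("current_year", "XXX,XXX")])
entry.update({"prior_year": "XXX,XXX"})
print(entry)
```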
This should do the trick:
statement={}
for temp in cleaned_list:
    key=temp[0]
    statement.update({key:None})
    if len(temp)==1:
        value=key
        statement.update({key:value})
    elif len(temp) > 1:
        values=temp[1:]
        statement.update({key:values})
statement = {}
for temp in cleaned_list:
    if len(temp) == 1:
        statement[temp[0]] = temp[0]
    elif len(temp) > 1:
        if temp[0] in statement:
            statement[temp[0]].extend(temp[1:])
        else:
            statement[temp[0]] = temp[1:]
Explanation (update): statement.update() replaces the value stored under a key, and you are already resetting that key with statement[temp[0]] = {}. So it doesn't seem like you want to update the value, but rather append the list items. I use extend() so that you don't end up with a nested value list like 'key': ['foo', 'bar', ['foo2', 'bar2']]; with extend() it becomes 'key': ['foo', 'bar', 'foo2', 'bar2']. I also added the if statement to check whether the key already exists.
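The difference between append() and extend() described above, as a quick sketch with made-up values:

```python
appended = ['foo', 'bar']
appended.append(['foo2', 'bar2'])    # adds the whole list as one nested item

extended = ['foo', 'bar']
extended.extend(['foo2', 'bar2'])    # adds each item individually

print(appended)   # ['foo', 'bar', ['foo2', 'bar2']]
print(extended)   # ['foo', 'bar', 'foo2', 'bar2']
```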

How to remove a dict objects(letter) that remain in another str?

Suppose I have this dictionary:
x = {'a':2, 'b':5, 'g':7, 'a':3, 'h':8}
And this input string:
y = 'agb'
I want to delete the keys of x that appear in y, such as, if my input is as above, output should be:
{'h':8, 'a':3}
My current code is here:
def x_remove(x,word):
x1 = x.copy() # copy the input dict
for i in word: # iterate all the letters in str
if i in x1.keys():
del x1[i]
return x1
But when the code runs, it removes every key that matches a letter in word. What I want is that, even if several keys match a letter in word, only one of them is deleted, not all of them.
Where am I going wrong? Please also explain how I can do this without using del.
You're close, but try this instead:
def x_remove(input_dict, word):
    output_dict = input_dict.copy()
    for letter in word:
        if letter in output_dict:
            del output_dict[letter]
    return output_dict
For example:
In [10]: x_remove({'a': 1, 'b': 2, 'c':3}, 'ac')
Out[10]: {'b': 2}
One problem was your indentation. Indentation matters in Python, and is used the way { and } and ; are in other languages. Another is the way you were checking whether each letter was in the dictionary; you want if letter in output_dict, since in on a dict() searches its keys.
It's also easier to see what's going on when you use descriptive variable names.
We can also skip the del entirely and make this more Pythonic, using a dict comprehension:
def x_remove(input_dict, word):
    return {key: value for key, value in input_dict.items() if key not in word}
This builds and returns a new dictionary (a shallow copy without the removed keys) and leaves the original untouched. It should be more performant as well.
As stated in the comments, all keys in dictionaries are unique. There can only ever be one key named 'a' or 'b'.
A dictionary must have unique keys. You may use a list of tuples for your data instead.
x = [('a',2), ('b',5), ('g',7), ('a',3), ('h',8)]
Following code then deletes the desired entries:
for letter in y:
    idx = 0
    for item in x.copy():
        if item[0] == letter:
            del x[idx]
            break
        idx += 1
Result:
>>> x
[('a', 3), ('h', 8)]
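The same idea can be written without del and without modifying the caller's list. A sketch, assuming the goal is to drop only the first pair matching each letter; remove_first_matches is a hypothetical helper name:

```python
def remove_first_matches(pairs, letters):
    """Return a new list with the first pair for each letter removed."""
    remaining = list(pairs)              # leave the caller's list untouched
    for letter in letters:
        for index, (key, _value) in enumerate(remaining):
            if key == letter:
                remaining.pop(index)     # safe: we stop iterating right after
                break
    return remaining

x = [('a', 2), ('b', 5), ('g', 7), ('a', 3), ('h', 8)]
print(remove_first_matches(x, 'agb'))    # [('a', 3), ('h', 8)]
```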
You can also implement it like this:
def remove_(x, y):
    for i in y:
        try:
            del x[i]
        except KeyError:
            pass
    return x
Inputs x = {'a': 1, 'b': 2, 'c':3} and y = 'ac'.
Output
{'b': 2}

How can I populate a dictionary with an enumerated list?

I have the following dictionary, where keys are integers and values are floats:
foo = {1:0.001,2:2.097,3:1.093,4:5.246}
This dictionary has keys 1, 2, 3 and 4.
Now, I remove the key '2':
foo = {1:0.001,3:1.093,4:5.246}
I only have the keys 1, 3 and 4 left. But I want these keys to be called 1, 2 and 3.
The function 'enumerate' allows me to get the list [1,2,3]:
some_list = []
for k,v in foo.items():
    some_list.append(k)
num_list = list(enumerate(some_list, start=1))
Next, I try to populate the dictionary with these new keys and the old values:
new_foo = {}
for i in num_list:
    for value in foo.values():
        new_foo[i[0]] = value
However, new_foo now contains the following values:
{1: 5.246, 2: 5.246, 3: 5.246}
So every value was replaced by the last value of 'foo'. I think the problem comes from the design of my for loop, but I don't know how to solve this. Any tips?
Using the list-comprehension-like style:
bar = dict( (k,v) for k,v in enumerate(foo.values(), start=1) )
But, as mentioned in the comments, the ordering is going to be arbitrary, since dicts in Python versions before 3.7 do not guarantee any order. To number the values in the order of their original keys, the following can be used:
bar = dict( ( i,foo[k] ) for i, k in enumerate(sorted(foo), start=1) )
Here, sorted(foo) returns the sorted list of foo's keys, and i is the new number assigned to each key in that order, which becomes the key in the new dict.
Like others have said, it would be best to use a list instead of dict. However, in case you prefer to stick with a dict, you can do
foo = {j+1:foo[k] for j,k in enumerate(sorted(foo))}
I agree with the other responses that a list implements the behavior you describe, and so is probably more appropriate, but I will suggest an answer anyway.
The problem with your code is the way you are using the data structures. Simply enumerate the items left in the dictionary:
new_foo = {}
for key, (old_key, value) in enumerate( sorted( foo.items() ) ):
    key = key+1 # adjust for 1-based
    new_foo[key] = value
A dictionary is the wrong structure here. Use a list; lists map contiguous integers to values, after all.
Either adjust your code to start at 0 rather than 1, or include a padding value at index 0:
foo = [None, 0.001, 2.097, 1.093, 5.246]
Deleting the 2 'key' is then as simple as:
del foo[2]
giving you automatic renumbering of the rest of your 'keys'.
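A quick sketch of that renumbering, using the values from the question:

```python
foo = [None, 0.001, 2.097, 1.093, 5.246]   # index 0 is padding

del foo[2]                                  # remove the value stored under "key" 2

# Later items shift down by one position automatically.
for key, value in enumerate(foo[1:], start=1):
    print(key, value)
```

After the deletion, foo[2] is 1.093 and foo[3] is 5.246, with no gap left behind.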
This looks suspiciously like Something You Should Not Do, but I'll assume for a moment that you're simplifying the process for an MCVE rather than actually trying to name your dict keys 1, 2, 3, 4, 5, ....
d = {1:0.001, 2:2.097, 3:1.093, 4:5.246}
del d[2]
# d == {1:0.001, 3:1.093, 4:5.246}
new_d = {idx:val for idx,val in zip(range(1,len(d)+1),
                                    (v for _,v in sorted(d.items())))}
# new_d == {1: 0.001, 2: 1.093, 3: 5.246}
You can convert the dict to a list, remove the specific element, then convert the list back to a dict. Sorry, it is not a one-liner.
In [1]: foo = {1:0.001,2:2.097,3:1.093,4:5.246}
In [2]: l=list(foo.values()) #[0.001, 2.097, 1.093, 5.246]
In [3]: l.pop(1) #returns 2.097, not the list
In [4]: dict(enumerate(l,1))
Out[4]: {1: 0.001, 2: 1.093, 3: 5.246}
Try:
foo = {1:0.001,2:2.097,3:1.093,4:5.246}
foo.pop(2)
new_foo = {i: value for i, (_, value) in enumerate(sorted(foo.items()), start=1)}
print(new_foo)
However, I'd advise you to use a normal list instead, which is designed exactly for fast lookup of gapless, numeric keys:
foo = [0.001, 2.097, 1.093, 5.246]
foo.pop(1) # list indices start at 0
print(foo)
One liner that filters a sequence, then re-enumerates and constructs a dict.
In [1]: foo = {1:0.001, 2:2.097, 3:1.093, 4:5.246}
In [2]: selected=1
In [3]: { k:v for k,v in enumerate((foo[i] for i in foo if i != selected), 1) }
Out[3]: {1: 2.097, 2: 1.093, 3: 5.246}
I have a more compact method.
I think it's more readable and easy to understand. You can refer as below:
foo = {1:0.001,2:2.097,3:1.093,4:5.246}
del foo[2]
foo.update({k:foo[4] for k in foo})
print(foo)
This prints:
{1: 5.246, 3: 5.246, 4: 5.246}

How do I create a dictionary from a string returning the number of characters [duplicate]

This question already has answers here:
Count the number of occurrences of a character in a string
(26 answers)
Closed 8 years ago.
I want a string such as 'ddxxx' to be returned as {'d': 2, 'x': 3}. So far I've attempted
result = {}
for i in s:
    if i in s:
        result[i] += 1
    else:
        result[i] = 1
return result
where s is the string, however I keep getting a KeyError. E.g. if I put s as 'hello' the error returned is:
result[i] += 1
KeyError: 'h'
The problem is with your if condition. if i in s checks for the character in the string itself, not in the dictionary. It should instead be if i in result.keys() or, as Neil mentioned, simply if i in result.
Example:
def fun(s):
    result = {}
    for i in s:
        if i in result:
            result[i] += 1
        else:
            result[i] = 1
    return result
print(fun('hello'))
This would print
{'h': 1, 'e': 1, 'l': 2, 'o': 1}
You can solve this easily by using collections.Counter. Counter is a subclass of the standard dict that is made to count things. It will automatically make sure that keys are created when you try to increment something that hasn’t been in the dictionary before, so you don’t need to check it yourself.
You can also pass any iterable to the constructor to make it automatically count the occurrences of the items in that iterable. Since a string is an iterable of characters, you can just pass your string to it, to count all characters:
>>> import collections
>>> s = 'ddxxx'
>>> result = collections.Counter(s)
>>> result
Counter({'x': 3, 'd': 2})
>>> result['x']
3
>>> result['d']
2
Of course, doing it the manual way is fine too, and your code almost works for that. Since you get a KeyError, you are trying to access a key in the dictionary that does not exist. This happens when you come across a new character that you haven’t counted before. You already tried to handle that with your if i in s check, but you are checking containment in the wrong thing. s is your string, and since you are iterating over the characters i of that string, i in s will always be true. What you want to check instead is whether i already exists as a key in the dictionary result. If it doesn’t, you add it as a new key with a count of 1:
if i in result:
    result[i] += 1
else:
    result[i] = 1
Using collections.Counter is the sensible solution. But if you do want to reinvent the wheel, you can use the dict.get() method, which allows you to supply a default value for missing keys:
s = 'hello'
result = {}
for c in s:
    result[c] = result.get(c, 0) + 1
print(result)
output
{'h': 1, 'e': 1, 'l': 2, 'o': 1}
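Another standard-library option, not mentioned in the answers above, is collections.defaultdict, which supplies a default for a missing key the first time it is accessed. A minimal sketch:

```python
from collections import defaultdict

s = 'hello'
result = defaultdict(int)    # missing keys start at int() == 0
for c in s:
    result[c] += 1

print(dict(result))  # {'h': 1, 'e': 1, 'l': 2, 'o': 1}
```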
Here is a simple way of doing this if you don't want to use collections module:
>>> st = 'ddxxx'
>>> {i:st.count(i) for i in set(st)}
{'x': 3, 'd': 2}
