I have been trying to build a function to get letter frequencies from a string and store them in a dictionary.
This is what I have so far:
s="today the weather was really nice"
def get_letter_freq(s):
    for letter in(s):
        x=letter.split()
    f=dict()
    for each_letter in x:
        if f.has_key(x):
            f[x]+=1
        else:
            f[x]=1
print f
Could you help me put things into order and find my mistakes?
Why do I get an error saying that 'f' is not defined?
In your code, the first for loop, which contains the letter.split() statement, seems useless. Why would you want to split the single character that you get from the loop?
Secondly, you have defined f = dict() inside your function and are using it outside. It will not be visible outside.
Third, you should not use f.has_key. Just use key in my_dict to check whether a key is present in the dict.
Finally, you can pass your dictionary as a parameter to your function, modify it there, and return it. (You can also do it without passing the dict to your function: just create a new one there and return it.)
So, in your code, almost everything is fine. You just need to remove the first for loop in the function, move f = dict() outside the function, before invoking it, and pass it as a parameter.
Way 1:
So you can try the following modified version of your code:
def get_letter_freq(my_dict, s):
    for letter in s:
        if letter in my_dict:
            my_dict[letter] += 1
        else:
            my_dict[letter] = 1
    return my_dict

my_dict = dict()
my_str = "today the weather was really nice"
print(get_letter_freq(my_dict, my_str))
Way 2:
Alternatively, you can use the ready-made Counter class from collections, which does exactly what you want.
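As a minimal sketch of that approach (the wrapper name get_letter_freq is reused from the question purely for illustration; Counter itself is part of the standard library):

```python
from collections import Counter

def get_letter_freq(s):
    # Counter accepts any iterable; iterating over a string yields its characters.
    return Counter(s)

freq = get_letter_freq("today the weather was really nice")
print(freq["e"])  # count for a letter that occurs in the string
print(freq["z"])  # a missing key reports 0 instead of raising KeyError
```

Counter is a dict subclass, so the result can be used wherever a plain dictionary is expected.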
Way 3:
As suggested by @thebjorn in the comments, you can also use defaultdict, which makes the task easier because you don't have to check whether a key is already in the dictionary before incrementing it. The count automatically defaults to 0:
from collections import defaultdict

def get_letter_freq(s):
    my_dict = defaultdict(int)
    for letter in s:
        my_dict[letter] += 1
    return my_dict

my_str = "today the weather was really nice"
print(list(get_letter_freq(my_str).items()))
Besides that indentation error, your program has several other problems. Here is a corrected version, with the fixes noted in the comments:
s = "today the weather was really nice"
def get_letter_freq(s):
    f = dict()
    for each_letter in s:  # you can iterate over a string directly, so there is no need for split()
        if each_letter in f:  # has_key() has been deprecated
            f[each_letter] += 1
        else:
            f[each_letter] = 1
    return f  # better to return the result from the function
print(get_letter_freq(s))
By the way, collections.Counter is well suited to this:
In [61]: from collections import Counter
In [62]: strs = "today the weather was really nice"
In [63]: Counter(strs)
Out[63]: Counter({' ': 5, 'e': 5, 'a': 4, 't': 3, 'h': 2, 'l': 2, 'r': 2, 'w': 2, 'y': 2, 'c': 1, 'd': 1, 'i': 1, 'o': 1, 'n': 1, 's': 1})
f is defined inside get_letter_freq, you can't access it from outside.
Your function should return the constructed dictionary.
You should actually call the function.
What do you expect from splitting a single letter? Just leave that part out, and you don't need the inner loop.
print f needs to be indented if it is meant to be part of get_letter_freq.
Also, f does not exist outside get_letter_freq, hence the error.
import string
s="today the weather was really nice"
# string.ascii_lowercase covers all 26 letters, 'a' through 'z'
print(dict((letter, s.count(letter)) for letter in string.ascii_lowercase))
If upper- and lower-case letters should be counted together, use s.lower().count(letter) instead.
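A minimal sketch of the case-insensitive variant, assuming only the ASCII letters a to z are of interest (the function name letter_counts is made up for illustration). It lower-cases the text once and then counts each letter:

```python
import string

def letter_counts(text):
    # Fold the text to lower case once, so 'T' and 't' are counted together.
    lowered = text.lower()
    # One entry per ASCII letter; letters that never occur get a count of 0.
    return {letter: lowered.count(letter) for letter in string.ascii_lowercase}

counts = letter_counts("Today the Weather was really nice")
print(counts["t"], counts["w"], counts["z"])
```

Each call to count() scans the whole string, so this makes 26 passes over the text. That is fine for short strings; for long inputs a single-pass approach such as collections.Counter may be preferable.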
So I was wondering if anybody wants to help me with this. I don't even understand where to begin? Any help would be appreciated.
Write a function called count_bases that counts the number of times each letter occurs in a given string. The results should be returned as a dictionary, with letters in upper case as keys and the number of occurrences as (integer) values
For example when the function is called with the string 'ATGATAGG', it should return {'A': 3, 'T': 2, 'G': 3, 'C': 0}. Please ensure your function uses return, not print(). The order of the keys in the dictionary does not need to follow this order (2 marks).
Make sure that your function works when passed any lower and/or uppercase DNA characters in the sequence string. (2 marks)
DNA sequences sometimes contain letters other than A, C, G and T to indicate degenerate nucleotides. For example, R can represent A or G (the purine bases). If the program encounters any letter other than A, C, G or T, it should also count the frequency of that letter and return it within the dictionary object. (2 marks).
Use the following code:
def count_bases(input_str):
    result = {}
    for s in input_str:
        try:
            result[s] += 1
        except KeyError:  # first time this letter is seen
            result[s] = 1
    return result
print(count_bases('ATGATAGG'))
Output:
{'A': 3, 'T': 2, 'G': 3}
Try it:
def f(input):
    d = {}
    for s in input:
        d[s] = d.get(s,0)+1
    return d
from collections import Counter
def count_bases(sequence):
    # Since you want to count lower- and upper-case letters together,
    # convert the input sequence to upper case first.
    sequence = sequence.upper()
    # Counter (from collections) does the counting for you. It accepts any
    # iterable, and iterating over a string yields its individual letters.
    # It returns a Counter object. Since you want a dictionary, convert it with dict().
    return dict(Counter(sequence))
count_bases('ATGATAGGaatdga')
{'A': 6, 'T': 3, 'G': 4, 'D': 1}
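The assignment's example output includes 'C': 0, which a plain count leaves out because C never occurs in 'ATGATAGG'. A minimal sketch that covers this, assuming the four standard bases should always be present in the result and any other letter should be added only when it occurs:

```python
from collections import Counter

def count_bases(sequence):
    # Start every standard base at 0 so that absent bases still appear.
    counts = dict.fromkeys("ACGT", 0)
    # Upper-case the input so 'a' and 'A' are counted together; update()
    # overwrites the zeros and adds any other letters (e.g. 'R') as new keys.
    counts.update(Counter(sequence.upper()))
    return counts

print(count_bases('ATGATAGG'))
```

The function returns the dictionary rather than printing it, as the assignment asks.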
I am trying to complete a single line of my code with a statement that does not call any function or method. While thinking about this, I wondered whether it is even possible in my case. I searched for information about it, but it seems to be rarely discussed, and in my current work I must keep the rest of the code intact and change only that one line.
Hope you could give me a hand. Any help is welcome.
This is my current progress.
def count_chars(s):
    '''(str) -> dict of {str: int}
    Return a dictionary where the keys are the characters in s and the values
    are how many times those characters appear in s.
    >>> count_chars('abracadabra')
    {'a': 5, 'r': 2, 'b': 2, 'c': 1, 'd': 1}
    '''
    d = {}
    for c in s:
        if not (c in d):
            # This is the line to be filled in without calling a function or method
        else:
            d[c] = d[c] + 1
    return d
How about this? As mentioned in the comments, it does implicitly use functions, but I think it may be the sort of thing you are looking for.
s='abcab'
chars={}
for char in s:
    if char not in chars:
        chars[char]=0
    chars[char]+=1
Result
{'a': 2, 'b': 2, 'c': 1}
In my homework, a question asks me to write a function that creates a dictionary counting how many symmetrical words in a long string start with each letter. Symmetrical means that the word starts and ends with the same letter. I do not need help with the algorithm; I am confident I have it right. I just need to fix a KeyError that I cannot figure out. I wrote d[word[0]] += 1, which is meant to add 1 to the count of words that start with that particular letter.
The output should look like this (using the string I provided below):
{'d': 1, 'i': 3, 't': 1}
t = '''The sun did not shine
it was too wet to play
so we sat in the house
all that cold cold wet day
I sat there with Sally
we sat there we two
and I said how I wish
we had something to do'''
def symmetry(text):
    from collections import defaultdict
    d = {}
    wordList = text.split()
    for word in wordList:
        if word[0] == word[-1]:
            d[word[0]] += 1
    print(d)
print(symmetry(t))
You're trying to increment the value of an entry that does not exist yet, which raises the KeyError. You could use get(), which returns a default (0 here, or any other value you choose) when there is no entry for the key yet. With this approach you do not need defaultdict (although it is very useful in certain cases).
def symmetry(text):
    d = {}
    wordList = text.split()
    for word in wordList:
        key = word[0]
        if key == word[-1]:
            d[key] = d.get(key, 0) + 1
    return d  # return the dict so that print(symmetry(t)) does not print None
print(symmetry(t))
Sample Output
{'I': 3, 'd': 1, 't': 1}
You never actually use collections.defaultdict, although you import it. Initialize d as defaultdict(int), instead of as {}, and you're good to go.
def symmetry(text):
    from collections import defaultdict
    d = defaultdict(int)
    wordList = text.split()
    for word in wordList:
        if word[0] == word[-1]:
            d[word[0]] += 1
    return d  # return the dict so that print(symmetry(t)) does not print None
print(symmetry(t))
Results in:
defaultdict(<class 'int'>, {'I': 3, 't': 1, 'd': 1})
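The expected output in the question uses a lower-case 'i', whereas the versions above count the word I under 'I'. One way to match it, assuming the comparison is meant to be case-insensitive, is to lower-case each word before testing it. This is only a sketch under that assumption, shown here with a short sample string:

```python
def symmetry(text):
    counts = {}
    for word in text.split():
        # Compare and store in lower case so that 'I' is counted under 'i'.
        word = word.lower()
        if word[0] == word[-1]:
            counts[word[0]] = counts.get(word[0], 0) + 1
    return counts

print(symmetry("I did that and I wish"))
```

Because str.split() without arguments never yields empty strings, word[0] and word[-1] are always safe to index here.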
Suppose I have this dictionary:
x = {'a':2, 'b':5, 'g':7, 'a':3, 'h':8}
And this input string:
y = 'agb'
I want to delete the keys of x that appear in y. For the input above, the output should be:
{'h':8, 'a':3}
My current code is here:
def x_remove(x,word):
    x1 = x.copy() # copy the input dict
    for i in word: # iterate all the letters in str
        if i in x1.keys():
            del x1[i]
    return x1
But when the code runs, it removes every key that matches a letter in word. What I want is that, even if several keys match a letter in word, only one of them is deleted, not all of them.
Where is my mistake? I may have an idea, but please explain how I can do this without using del.
You're close, but try this instead:
def x_remove(input_dict, word):
    output_dict = input_dict.copy()
    for letter in word:
        if letter in output_dict:
            del output_dict[letter]
    return output_dict
For example:
In [10]: x_remove({'a': 1, 'b': 2, 'c':3}, 'ac')
Out[10]: {'b': 2}
One problem was your indentation. Indentation matters in Python, and is used the way { and } and ; are in other languages. Another is the way you were checking whether each letter was in the dictionary; you want if letter in output_dict, since in on a dict searches its keys.
It's also easier to see what's going on when you use descriptive variable names.
We can also skip the del entirely and make this more Pythonic, using a dict comprehension:
def x_remove(input_dict, word):
    # .items() is required: iterating over a dict directly yields only its keys
    return {key: value for key, value in input_dict.items() if key not in word}
This still implicitly creates a shallow copy of the dictionary (without the removed entries) and returns it. It should also perform well, since the new dictionary is built in a single pass.
As stated in the comments, all keys in dictionaries are unique. There can only ever be one key named 'a' or 'b'.
A dictionary must have unique keys. You may use a list of tuples for your data instead.
x = [('a',2), ('b',5), ('g',7), ('a',3), ('h',8)]
The following code then deletes the desired entries:
for letter in y:
    idx = 0
    for item in x.copy():
        if item[0] == letter:
            del x[idx]
            break
        idx += 1
Result:
>>> x
[('a', 3), ('h', 8)]
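The same idea can be sketched as a function that leaves the original list untouched and avoids del. The name remove_first_matches is made up for illustration, and the sketch assumes that only the first entry matching each letter should be removed:

```python
def remove_first_matches(pairs, letters):
    remaining = list(pairs)  # work on a copy so the caller's list is not modified
    for letter in letters:
        for index, (key, _value) in enumerate(remaining):
            if key == letter:
                remaining.pop(index)
                break  # stop after the first match so later duplicates survive
    return remaining

x = [('a', 2), ('b', 5), ('g', 7), ('a', 3), ('h', 8)]
print(remove_first_matches(x, 'agb'))
```

Removing an element while iterating is safe here only because the inner loop breaks immediately after the removal.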
You can also implement it like this:
def remove_(x, y):
    for i in y:
        try:
            del x[i]
        except KeyError:  # the letter is not a key in x
            pass
    return x
Inputs x = {'a': 1, 'b': 2, 'c':3} and y = 'ac'.
Output
{'b': 2}
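Since the question asks for a way to do this without del, here is a minimal sketch using dict.pop() with a default value, which removes a key if it is present and does nothing otherwise. The function name remove_keys is made up for illustration, and working on a copy is an assumption about the desired behaviour:

```python
def remove_keys(mapping, letters):
    result = dict(mapping)  # shallow copy, so the input dict is left unchanged
    for letter in letters:
        # pop() with a default never raises KeyError for a missing key.
        result.pop(letter, None)
    return result

original = {'a': 1, 'b': 2, 'c': 3}
print(remove_keys(original, 'ac'))
```

Unlike the try/except version, this needs no exception handling and does not modify the dictionary passed in.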
I want a string such as 'ddxxx' to be returned as {'d': 2, 'x': 3}. So far I've attempted
result = {}
for i in s:
    if i in s:
        result[i] += 1
    else:
        result[i] = 1
return result
where s is the string, however I keep getting a KeyError. E.g. if I put s as 'hello' the error returned is:
result[i] += 1
KeyError: 'h'
The problem is with your second condition. if i in s checks for the character in the string itself and not in the dictionary. It should instead be if i in result.keys() or, as Neil mentioned, it can just be if i in result.
Example:
def fun(s):
    result = {}
    for i in s:
        if i in result:
            result[i] += 1
        else:
            result[i] = 1
    return result
print(fun('hello'))
This would print
{'h': 1, 'e': 1, 'l': 2, 'o': 1}
You can solve this easily by using collections.Counter. Counter is a subtype of the standard dict that is made to count things. It will automatically make sure that keys are created when you try to increment something that hasn’t been in the dictionary before, so you don’t need to check it yourself.
You can also pass any iterable to the constructor to make it automatically count the occurrences of the items in that iterable. Since a string is an iterable of characters, you can just pass your string to it, to count all characters:
>>> import collections
>>> s = 'ddxxx'
>>> result = collections.Counter(s)
>>> result
Counter({'x': 3, 'd': 2})
>>> result['x']
3
>>> result['d']
2
Of course, doing it the manual way is fine too, and your code almost works for that. Since you get a KeyError, you are trying to access a key in the dictionary that does not exist. This happens when you come across a new character that you haven’t counted before. You already tried to handle that with your if i in s check, but you are checking containment in the wrong thing. s is your string, and since you are iterating over the characters i of the string, i in s will always be true. What you want to check instead is whether i already exists as a key in the dictionary result. If it doesn’t, you add it as a new key with a count of 1:
if i in result:
    result[i] += 1
else:
    result[i] = 1
Using collections.Counter is the sensible solution. But if you do want to reinvent the wheel, you can use the dict.get() method, which allows you to supply a default value for missing keys:
s = 'hello'
result = {}
for c in s:
    result[c] = result.get(c, 0) + 1
print(result)
output
{'h': 1, 'e': 1, 'l': 2, 'o': 1}
Here is a simple way of doing this if you don't want to use the collections module:
>>> st = 'ddxxx'
>>> {i:st.count(i) for i in set(st)}
{'x': 3, 'd': 2}