Most elegant way to count integers in a list - python

I am looking for the most elegant way to do the following:
Let's say that I want to count the number of times each integer appears in a list; I could do it this way:
x = [1,2,3,2,4,1,2,5,7,2]
dicto = {}
for num in x:
    try:
        dicto[num] = dicto[num] + 1
    except KeyError:
        dicto[num] = 1
However, I think that
try:
    dicto[num] = dicto[num] + 1
except KeyError:
    dicto[num] = 1
is not the most elegant way to do it; I think that I saw the above code replaced by a single line. What is the most elegant way to do this?
I realized that this might be a repeat, but I looked around and couldn't find what I was looking for.
Thank You in advance.

Use the Counter class
>>> from collections import Counter
>>> x = [1,2,3,2,4,1,2,5,7,2]
>>> c = Counter(x)
Now you can use the Counter object c as a dictionary.
>>> c[1]
2
>>> c[10]
0
(This works for nonexistent values too.)
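As a small sketch of what else the Counter object offers (the variable names are only illustrative): most_common() returns the values ordered by count, and looking up a missing key returns 0 without adding that key to the counter.

```python
from collections import Counter

x = [1, 2, 3, 2, 4, 1, 2, 5, 7, 2]
c = Counter(x)

# The two most frequent values, as (value, count) pairs.
top_two = c.most_common(2)
print(top_two)  # [(2, 4), (1, 2)]

# Looking up a missing key returns 0 and does not insert the key.
missing = c[10]
print(missing, 10 in c)  # 0 False
```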

>>> from collections import defaultdict
>>> x = [1,2,3,2,4,1,2,5,7,2]
>>> d = defaultdict(int)
>>> for i in x:
...     d[i] += 1
...
>>> dict(d)
{1: 2, 2: 4, 3: 1, 4: 1, 5: 1, 7: 1}
Or just collections.Counter, if you are on Python 2.7+.

Bucket sort, as you're doing, is entirely algorithmically appropriate. This seems ideal when you don't need the additional overhead from Counter:
from collections import defaultdict
wdict = defaultdict(int)
for word in words:
    wdict[word] += 1
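The single line the question remembers may have been the dict.get idiom, which needs no import; that is only a guess at what was meant. A minimal sketch using the question's own list:

```python
x = [1, 2, 3, 2, 4, 1, 2, 5, 7, 2]

dicto = {}
for num in x:
    # get() returns 0 when num has not been seen yet, so no try/except is needed.
    dicto[num] = dicto.get(num, 0) + 1

print(dicto)  # {1: 2, 2: 4, 3: 1, 4: 1, 5: 1, 7: 1}
```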


How to check if elements in one array exist in another array if so print the count using Python

I have two arrays
A=[1,2,3,4,6,5,5,5,8,9,7,7,7]
B=[1,5,7]
If an element of B occurs in A, print its number of occurrences.
Output:
1:1
5:3
7:3
A Pythonic way:
>>> import collections
>>> a= [1,2,3,4,6,5,5,5,8,9,7,7,7]
>>> b = [1,5,7]
>>> counter = collections.Counter(a)
>>> {x:counter[x] for x in b}
{1: 1, 5: 3, 7: 3}
For a sorted array you can come up with better algorithms that run faster, but in general an easy and more native way, without using libraries, would be:
for i in B:
    ans = 0
    for j in A:
        if i == j:
            ans += 1
    print(i,':',ans)
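For the sorted case mentioned above, one possible faster approach (a sketch, not the only option) is binary search with the standard bisect module: the number of occurrences of a value is the gap between its leftmost and rightmost insertion points. The helper name count_in_sorted is made up for this example.

```python
from bisect import bisect_left, bisect_right

def count_in_sorted(sorted_values, target):
    """Count occurrences of target in an already sorted list in O(log n)."""
    return bisect_right(sorted_values, target) - bisect_left(sorted_values, target)

A = [1, 2, 3, 4, 6, 5, 5, 5, 8, 9, 7, 7, 7]
B = [1, 5, 7]

A_sorted = sorted(A)  # sorting costs O(n log n) once; skip it if A is already sorted
counts = {b: count_in_sorted(A_sorted, b) for b in B}
for value, count in counts.items():
    print("%d:%d" % (value, count))
```

Each lookup then costs O(log n) instead of a full pass over A, which only pays off when A is already sorted or when B is large.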
You tagged pandas so here is a simple solution:
Count everything, then return only the subset that overlaps with the items in B:
import pandas as pd
pd.Series(A).value_counts().reindex(B).to_dict()
#{1: 1, 5: 3, 7: 3}
The numpy_indexed package has a vectorized solution to this problem (disclaimer: I am its author):
import numpy_indexed as npi
keys, counts = npi.count(A)
counts = counts[npi.indices(keys, B)]
Both of the above answers worked.
I used Counter, as B had all elements of A:
import collections
c = collections.Counter(A)
print(c)

Python two dictionaries in dictionary, increase value in specific key

I want to increase a value in a dictionary nested inside another dictionary. There is a main dictionary 'a' that holds two separate dictionaries, 'be' and 'ce'. I want to increase the value of a specific key, determined by variables such as 'dist' and 'bec', but I cannot reach the keys of the inner dictionaries:
import collections
from collections import defaultdict
a={}
be = {}
ce = {}
for z in range(1,11):
    be["b_{0}".format(z)] = 0
be = collections.OrderedDict(sorted(be.items()))
for c in range(1,11):
    for b in range(1,11):
        ce["c_{0}_{1}".format(c,b)]= 0
ce = collections.OrderedDict(sorted(ce.items()))
for x in range(1,10):
    a["a_{0}".format(x)] = be,ce
a = collections.OrderedDict(sorted(a.items()))
dist = 3
bec = 10
a["a_"+str(dist)]["b_"+str(bec)] += 1
I tried to print "a["a_"+str(dist)]["b_"+str(bec)]" but it didn't work; it only works when I print just "a["a_"+str(dist)]".
Here's the simplest possible approach:
>>> from collections import Counter
>>> a = Counter()
>>> a[(3, 'b', 10)] += 1
>>> a[(3, 'b', 10)] += 1
>>> a[(3, 'b', 10)]
2
>>> a[(3, 'b', 8)]
0
Is there any way in which this doesn't work?
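For reference, the original lookup fails because a["a_3"] holds the tuple (be, ce), and a tuple cannot be indexed with a string key such as "b_10". If the nested layout is preferred over a flat Counter, one possible sketch is to store a small dictionary per entry. The inner keys "be" and "ce" are names chosen for this example, and each entry gets its own copies so that incrementing one does not change the others.

```python
a = {}
for x in range(1, 10):
    # Build fresh inner dictionaries for every entry so they are not shared.
    be = {"b_{0}".format(z): 0 for z in range(1, 11)}
    ce = {"c_{0}_{1}".format(c, b): 0
          for c in range(1, 11) for b in range(1, 11)}
    a["a_{0}".format(x)] = {"be": be, "ce": ce}

dist = 3
bec = 10
a["a_" + str(dist)]["be"]["b_" + str(bec)] += 1

print(a["a_3"]["be"]["b_10"])  # 1
print(a["a_4"]["be"]["b_10"])  # 0
```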

count characters frequency in a phrase frequency dict in python 3

In my experience, this is an unusual task. I searched in many different ways but still can't find an answer to it.
Here is the question.
I have a dict of Chinese phrase frequencies. It looks like:
{'中国':18950, '我们':16734, '我国':15400, ...}
What I need to do is count every single character's frequency, for example:
The character '国' appears in two phrases ('中国' and '我国'), so this character's frequency should be:
{'国':(18950+15400)}
How can I achieve this?
Simple example:
d = {'abd':2, 'afd':3}
f = {}
for key in d:
    strlen = len(key)
    for i in range(strlen):
        if key[i] in f:
            f[key[i]] += d[key]
        else:
            f[key[i]] = d[key]
print(f) #gives {'a': 5, 'b': 2, 'd': 5, 'f': 3}
My way:
from collections import Counter
c={'中国':18950, '我们':16734, '我国':15400}
print(Counter([j for k,v in c.items() for i in k for j in [i]*v]))
Output:
Counter({'国': 34350, '我': 32134, '中': 18950, '们': 16734})
Something like this should work:
from collections import defaultdict
char_dict = defaultdict(int)
for phrase, count in phrase_dict.items():
    for char in phrase:
        char_dict[char] += count
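A self-contained Python 3 sketch of the same idea, using Counter and the sample data from the question (the name phrase_dict is assumed to hold the phrase frequencies):

```python
from collections import Counter

phrase_dict = {'中国': 18950, '我们': 16734, '我国': 15400}

char_counts = Counter()
for phrase, count in phrase_dict.items():
    for char in phrase:
        # Each character inherits the frequency of every phrase it appears in.
        char_counts[char] += count

print(char_counts['国'])  # 34350
```

Unlike expanding every character into a list count times, this keeps memory use proportional to the number of distinct characters.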
d = {'中国':18950, '我们':16734, '我国':15400}  # ... plus the remaining entries
q = 0
for i in d:
    if '国' in i:
        a = (d[i])
        q += a
print(q)

Count occurrences in a list with time complexity of O(nlogn)

This is what I have so far:
alist=[1,1,1,2,2,3,4,2,2,3,2,2,1]
def icount(alist):
    adic={}
    for i in alist:
        adic[i]=alist.count(i)
    return adic
print(icount(alist))
I did some research and found that the time complexity of list.count() is O(n); thus, this code is O(n^2).
Is there a way to reduce this to O(nlogn)?
You can use Counter like this:
from collections import Counter
alist=[1,1,1,2,2,3,4,2,2,3,2,2,1]
print(Counter(alist))
If you want to use your solution, you can improve it like this:
def icount(alist):
    adic = {}
    for i in alist:
        adic[i] = adic.get(i, 0) + 1
    return adic
Even better, you can use defaultdict like this:
from collections import defaultdict
def icount(alist):
    adic = defaultdict(int)
    for i in alist:
        adic[i] += 1
    return adic
Also, you might want to look at the time complexity of various operations on Python's built-in types, for example on the TimeComplexity page of the Python wiki.
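If an O(n log n) bound is specifically what is wanted, one way to get it is to sort the list and count each run of equal values with itertools.groupby. This is only a sketch, and the dictionary-based versions above are already O(n) on average. The function name icount_sorted is made up here.

```python
from itertools import groupby

def icount_sorted(alist):
    """Count occurrences by sorting (O(n log n)) and measuring runs of equal items."""
    return {key: sum(1 for _ in group) for key, group in groupby(sorted(alist))}

alist = [1, 1, 1, 2, 2, 3, 4, 2, 2, 3, 2, 2, 1]
print(icount_sorted(alist))  # {1: 4, 2: 6, 3: 2, 4: 1}
```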
Counter is your helper:
>>> from collections import Counter
>>> a = [1,2,1,3,4]
>>> Counter(a)
Counter({1: 2, 2: 1, 3: 1, 4: 1})
>>> x = Counter(a)
>>> x[1]
2
>>> x[2]
1
Get the count of each element easily through this method

How to use dict in python?

10
5
-1
-1
-1
1
1
0
2
...
If I want to count the number of occurrences of each number in a file, how do I use python to do it?
This is almost exactly the algorithm described in Anurag Uniyal's answer, except that it uses the file as an iterator instead of readline():
from collections import defaultdict
try:
    from io import StringIO # 2.6+, 3.x
except ImportError:
    from StringIO import StringIO # 2.5
data = defaultdict(int)
#with open("filename", "r") as f: # if a real file
with StringIO(u"10\n5\n-1\n-1\n-1\n1\n1\n0\n2") as f:
    for line in f:
        data[int(line)] += 1
for number, count in data.items():
    print("%s was found %s times" % (number, count))
Counter is your best friend:)
http://docs.python.org/dev/library/collections.html#counter-objects
For Python 2.5 and 2.6: http://code.activestate.com/recipes/576611/
>>> from collections import Counter
>>> cnt = Counter()
>>> for word in ['red', 'blue', 'red', 'green', 'blue', 'blue']:
...     cnt[word] += 1
...
>>> cnt
Counter({'blue': 3, 'red': 2, 'green': 1})
# or just cnt = Counter(['red', 'blue', 'red', 'green', 'blue', 'blue'])
For this question:
print(Counter(int(line.strip()) for line in open("foo.txt", "rb")))
##output
Counter({-1: 3, 1: 2, 0: 1, 2: 1, 5: 1, 10: 1})
I think what you call map is, in python, a dictionary.
Here is a useful link on how to use it: http://docs.python.org/tutorial/datastructures.html#dictionaries
For a good solution, see the answer from Stephan or Matthew - but take also some time to understand what that code does :-)
Read the lines of the file into a list l, e.g.:
l = [int(line) for line in open('filename','r')]
Starting with a list of values l, you can create a dictionary d that gives you for each value in the list the number of occurrences like this:
>>> l = [10,5,-1,-1,-1,1,1,0,2]
>>> d = dict((x,l.count(x)) for x in l)
>>> d[1]
2
EDIT: as Matthew rightly points out, this is hardly optimal. Here is a version using defaultdict:
from collections import defaultdict
d = defaultdict(int)
for line in open('filename','r'):
    d[int(line)] += 1
New in Python 3.1:
from collections import Counter
with open("filename","r") as lines:
    print(Counter(lines))
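Note that Counter(lines) counts the raw line strings, trailing newline included. A small sketch that converts each line to an integer first; io.StringIO stands in for a real file here so that the example is self-contained.

```python
from collections import Counter
from io import StringIO

# StringIO is a stand-in for open("filename", "r").
with StringIO("10\n5\n-1\n-1\n-1\n1\n1\n0\n2\n") as lines:
    # Skip blank lines, then convert each remaining line to int.
    counts = Counter(int(line) for line in lines if line.strip())

print(counts[-1], counts[1], counts[10])  # 3 2 1
```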
Use collections.defaultdict so that by default the count for anything is zero.
After that, loop through the lines in the file using file.readline and convert each line to int.
Increment the counter for each value in your countDict.
At last, go through the dict using for intV, count in countDict.iteritems() and print the values.
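The steps above could be sketched like this in Python 3, where items() replaces iteritems(). The file is iterated directly rather than calling readline() in a loop, and a temporary file with sample data is created only so that the sketch runs on its own; count_dict and the file contents are illustrative.

```python
import os
import tempfile
from collections import defaultdict

# Write sample data to a temporary file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("10\n5\n-1\n-1\n-1\n1\n1\n0\n2\n")
    path = tmp.name

count_dict = defaultdict(int)  # missing keys start at 0
with open(path) as f:
    for line in f:
        count_dict[int(line)] += 1  # convert each line to int and increment
os.remove(path)

for value, count in count_dict.items():
    print(value, "was found", count, "times")
```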
Use a dictionary where every line is a key and its count is the value. Increment the count for every line, and if there is no dictionary entry for the line yet, initialize it with 1 in the except clause. This should work with older versions of Python.
def count_same_lines(fname):
    line_counts = {}
    for l in open(fname):
        l = l.rstrip()
        if l:
            try:
                line_counts[l] += 1
            except KeyError:
                line_counts[l] = 1
    print('cnt\ttxt')
    for k in line_counts.keys():
        print('%d\t%s' % (line_counts[k], k))
l = [10,5,-1,-1,-1,1,1,0,2]
d = {}
for x in l:
    d[x] = (d[x] + 1) if (x in d) else 1
There will be a key in d for every distinct value in the original list, and the values of d will be the number of occurrences.
counter.py
#!/usr/bin/env python
import fileinput
from collections import defaultdict
frequencies = defaultdict(int)
for line in fileinput.input():
    frequencies[line.strip()] += 1
print(frequencies)
Example:
$ perl -E'say 1*(rand() < 0.5) for (1..100)' | python counter.py
defaultdict(<type 'int'>, {'1': 52, '0': 48})
