Retrieve the first element from Counter in O(n) time - python

I have a list nums:
In [72]: nums
Out[72]: [4, 1, 2, 1, 2]
I am trying to get the unique number from the list:
In [73]: c = Counter(nums)
In [74]: c
Out[74]: Counter({4: 1, 1: 2, 2: 2})
I can see the result in the counter (it is 4: 1), but I cannot retrieve it in O(1) time:
In [79]: list(c)[0]
Out[79]: 4 #O(n) time
Is it possible to get 4 in O(1) time?

According to the comments on the question, you want to get the elements that have a count of 1. But it is still not clear what exactly you want, as the term "the first element" is ambiguous in the context of a Counter, which is a dict subclass: before Python 3.7 it had no defined order, and from 3.7 on it only remembers insertion order.
Here are a few options (I used str instead of int to make it clearer which are the values and which are their counts):
>>> import collections
>>> input_str = 'awlkjelqkdjlakd'
>>> c = collections.Counter(input_str)
>>> c
Counter({'l': 3, 'k': 3, 'a': 2, 'j': 2, 'd': 2, 'w': 1, 'e': 1, 'q': 1})
Get all elements that have count of 1 (takes O(k), where k is the number of different elements):
>>> [char for char, count in c.items() if count == 1]
['w', 'e', 'q']
Get one (random, not specified) element that has count of 1 (takes O(k), because the list has to be built):
>>> [char for char, count in c.items() if count == 1][0]
'w'
This can be improved by using a generator, so the full list will not be built; the generator will stop when the first element with count 1 is found, but there is no way to know if that will be first or last or in the middle ...
>>> g = (char for char, count in c.items() if count == 1)
>>> g
<generator object <genexpr> at 0x7fd520e82f68>
>>> next(g)
'w'
>>> next(char for char, count in c.items() if count == 1)
'w'
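If no element has a count of 1, next() raises StopIteration. A minimal sketch, assuming you would rather get a sentinel value back: pass a default as the second argument to next(). The helper name first_with_count_one is hypothetical and only used for illustration.

```python
import collections

def first_with_count_one(counter, default=None):
    """Return some key whose count is 1, or `default` if there is none.

    Hypothetical helper for illustration; it stops at the first match.
    """
    return next((key for key, count in counter.items() if count == 1), default)

c = collections.Counter('awlkjelqkdjlakd')
found = first_with_count_one(c)                               # one of 'w', 'e', 'q'
missing = first_with_count_one(collections.Counter('aabb'))   # no count of 1 here
print(found, missing)
```

The scan is still O(k) in the worst case, where k is the number of distinct elements; the default only changes what happens when nothing matches.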
Now, if you want to find the count of the first element of your input data (in my example input_str), that is done in O(1) because it is a list item access and then a dict lookup:
>>> elem = input_str[0]
>>> elem
'a'
>>> c[elem]
2
But I cannot give a more concrete answer without more information on what exactly you need.
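For the literal problem of avoiding list(c)[0], here is a sketch that assumes Python 3.7 or newer, where a Counter iterates over its keys in insertion order. next(iter(c)) fetches the first-inserted key without building a list. Note that this is the first distinct value seen in the input, which is the unique one only when it happens to come first, as in the question's data, and that building the Counter itself is still O(n).

```python
from collections import Counter

nums = [4, 1, 2, 1, 2]
c = Counter(nums)

# No intermediate list is built; only the first key is fetched.
first_key = next(iter(c))
print(first_key, c[first_key])
```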

Related

Do only the first and the last true condition,

I have a live process whose input condition can be true or false. When the condition becomes true, I start writing the timecode to a file. I need to write the timecode only the first time the condition is true, and also record the number of times the condition was true.
So, if the condition is true 5 times, I need to write the timecode the first time it is true, ignore the timecode for the following true conditions but count them, and write the number of times the condition was true to the file.
if process:
    writeTC(fName,
            TC_in,  # write only the first time the condition is true
            )
    writeDuration(fName,
                  Duration  # write duration only at the last true cond.
                  )
Output:
00:00:03:05
7 sec.
Old vague question.
Before answering, please read my question. I know how to use Counter; what I'm asking about is the if statement. Thanks. I have a condition inside a loop. I want to print the result once, the first time the condition is true, and add the number of times the condition was true.
arr = ['a', 1, 1, 1, 1, 1, 2, 2, 2, 2, 'a', 'a',
3, 3, 'a', 4]
for i in arr:
    if type(i) == int:
        print('Printed once {i}'.format(i=i))
I would like to end up with this result:
There is 5 times the number 1
There is 4 times the number 2
There is 2 times the number 3
There is 1 times the number 4
This will work. I think Counter is the best tool for counting occurrences in a list.
from collections import Counter
arr = ['a', 1, 1, 1, 1, 1, 2, 2, 2, 2, 'a', 'a',
3, 3, 'a', 4]
filteredArr = [x for x in arr if type(x)==int]
k = dict(Counter(filteredArr))
for i in k.keys():
    print('There is {} times the number {}'.format(k[i], i))
You can use Counter:
from collections import Counter
arr = ['a', 1, 1, 1, 1, 1, 2, 2, 2, 2, 'a', 'a',
3, 3, 'a', 4]
# the filter has to test the loop variable `item`
freq_dict = Counter([item for item in arr if type(item) == int])
for key, value in freq_dict.items():
    print(f'There is {value} the number {key}')
Output -
There is 5 the number 1
There is 4 the number 2
There is 2 the number 3
There is 1 the number 4
I'm still not sure when you want to do the print statement, but maybe a dictionary could be closer to what you want:
import time
arr = ['a', 1, 1, 1, 1, 1, 2, 2, 2, 2, 'a', 'a', 3, 3, 'a', 4]
numbers = {}
# extract all integers from the starting list
for element in arr:
    if type(element) == int:
        if element not in numbers.keys():
            numbers.update({element: [time.time_ns(), 1]})
        else:
            numbers[element][1] += 1
for key in numbers.keys():
    print(f"the number {key} occurs {numbers[key][1]} times. First occurrence at {numbers[key][0]} ")
The first time an integer is found, it is written to the dictionary as a key. The corresponding value holds the current time and a counter.
The next time this integer is found, only the counter is increased.
So if you need the print statement while looping, just add it to the inner if/else clause.
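For the timecode version of the question, one possible sketch is to remember the timecode of the first true sample, count the following ones, and emit both once the run of true samples ends. The input format (pairs of timecode and condition), the sample data, and the name summarize_true_runs are assumptions made for illustration; writeTC and writeDuration from the question are not used here.

```python
def summarize_true_runs(samples):
    """Yield (first_timecode, true_count) for each run of consecutive true samples.

    `samples` is an iterable of (timecode, condition) pairs (hypothetical format).
    """
    first_tc = None
    count = 0
    for timecode, condition in samples:
        if condition:
            if first_tc is None:
                first_tc = timecode   # only the first true sample sets the timecode
            count += 1
        elif first_tc is not None:
            yield first_tc, count     # the run of true samples just ended
            first_tc, count = None, 0
    if first_tc is not None:
        yield first_tc, count         # the input ended while still true

samples = [
    ("00:00:01:00", False),
    ("00:00:03:05", True),
    ("00:00:04:05", True),
    ("00:00:05:05", True),
    ("00:00:06:05", False),
    ("00:00:09:00", True),
]
runs = list(summarize_true_runs(samples))
for first_tc, count in runs:
    print(first_tc, count)
```

Keeping the state in two variables (first_tc and count) avoids storing every sample, which matters for a live process.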

How to efficiently count each element in a list in Python? [duplicate]

This question already has answers here:
Using a dictionary to count the items in a list
(8 answers)
Closed 7 months ago.
Given an unordered list of values like
a = [5, 1, 2, 2, 4, 3, 1, 2, 3, 1, 1, 5, 2]
How can I get the frequency of each value that appears in the list, like so?
# `a` has 4 instances of `1`, 4 of `2`, 2 of `3`, 1 of `4`, 2 of `5`
b = [4, 4, 2, 1, 2] # expected output
In Python 2.7 (or newer), you can use collections.Counter:
>>> import collections
>>> a = [5, 1, 2, 2, 4, 3, 1, 2, 3, 1, 1, 5, 2]
>>> counter = collections.Counter(a)
>>> counter
Counter({1: 4, 2: 4, 5: 2, 3: 2, 4: 1})
>>> counter.values()
dict_values([2, 4, 4, 1, 2])
>>> counter.keys()
dict_keys([5, 1, 2, 4, 3])
>>> counter.most_common(3)
[(1, 4), (2, 4), (5, 2)]
>>> dict(counter)
{5: 2, 1: 4, 2: 4, 4: 1, 3: 2}
>>> # Get the counts in order matching the original specification,
>>> # by iterating over keys in sorted order
>>> [counter[x] for x in sorted(counter.keys())]
[4, 4, 2, 1, 2]
If you are using Python 2.6 or older, you can download an implementation here.
If the list is sorted, you can use groupby from the itertools standard library (if it isn't, you can just sort it first, although this takes O(n lg n) time):
from itertools import groupby
a = [5, 1, 2, 2, 4, 3, 1, 2, 3, 1, 1, 5, 2]
[len(list(group)) for key, group in groupby(sorted(a))]
Output:
[4, 4, 2, 1, 2]
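Python 2.7+ introduces dictionary comprehensions. Building the dictionary from the list gives you the counts and gets rid of duplicates. Note that a.count(x) rescans the whole list for every element, so this takes quadratic time on long lists.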
>>> a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
>>> d = {x:a.count(x) for x in a}
>>> d
{1: 4, 2: 4, 3: 2, 4: 1, 5: 2}
>>> a, b = d.keys(), d.values()
>>> a
[1, 2, 3, 4, 5]
>>> b
[4, 4, 2, 1, 2]
Count the number of appearances manually by iterating through the list and counting them up, using a collections.defaultdict to track what has been seen so far:
from collections import defaultdict
appearances = defaultdict(int)
for curr in a:
    appearances[curr] += 1
In Python 2.7+, you could use collections.Counter to count items
>>> a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
>>>
>>> from collections import Counter
>>> c=Counter(a)
>>>
>>> c.values()
[4, 4, 2, 1, 2]
>>>
>>> c.keys()
[1, 2, 3, 4, 5]
Counting the frequency of elements is probably best done with a dictionary:
b = {}
for item in a:
    b[item] = b.get(item, 0) + 1
To remove the duplicates, use a set:
a = list(set(a))
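Note that a set does not keep the original order of the items. If the order matters, a sketch that assumes Python 3.7+ (where dicts keep insertion order) is to use dict.fromkeys instead:

```python
a = [5, 1, 2, 2, 4, 3, 1, 2, 3, 1, 1, 5, 2]

# dict keys are unique and, from Python 3.7 on, keep insertion order
unique_in_order = list(dict.fromkeys(a))
print(unique_in_order)  # [5, 1, 2, 4, 3]
```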
You can do this:
import numpy as np
a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
np.unique(a, return_counts=True)
Output:
(array([1, 2, 3, 4, 5]), array([4, 4, 2, 1, 2], dtype=int64))
The first array holds the values, and the second array holds the number of elements with each value.
So if you want just the array of counts, use this:
np.unique(a, return_counts=True)[1]
Here's another succinct alternative using itertools.groupby which also works for unordered input:
from itertools import groupby
items = [5, 1, 1, 2, 2, 1, 1, 2, 2, 3, 4, 3, 5]
results = {value: len(list(freq)) for value, freq in groupby(sorted(items))}
results
Format: {value: num_of_occurrences}
{1: 4, 2: 4, 3: 2, 4: 1, 5: 2}
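I would simply use scipy.stats.itemfreq in the following manner (note that itemfreq has been deprecated and removed in newer SciPy releases, where np.unique(a, return_counts=True) gives the same information):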
from scipy.stats import itemfreq
a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
freq = itemfreq(a)
a = freq[:,0]
b = freq[:,1]
you may check the documentation here: http://docs.scipy.org/doc/scipy-0.16.0/reference/generated/scipy.stats.itemfreq.html
import numpy as np
import pandas as pd
from collections import Counter
a=["E","D","C","G","B","A","B","F","D","D","C","A","G","A","C","B","F","C","B"]
counter=Counter(a)
# first row: the letters, second row: their counts
kk=[list(counter.keys()),list(counter.values())]
pd.DataFrame(np.array(kk).T, columns=['Letter','Count'])
seta = set(a)
b = [a.count(el) for el in seta]
a = list(seta) #Only if you really want it.
Suppose we have a list:
fruits = ['banana', 'banana', 'apple', 'banana']
We can find out how many of each fruit we have in the list like so:
import numpy as np
(unique, counts) = np.unique(fruits, return_counts=True)
{x:y for x,y in zip(unique, counts)}
Result:
{'banana': 3, 'apple': 1}
This answer is more explicit
a = [1,1,1,1,2,2,2,2,3,3,3,4,4]
d = {}
for item in a:
    if item in d:
        d[item] = d.get(item)+1
    else:
        d[item] = 1
for k,v in d.items():
    print(str(k)+':'+str(v))
# output
#1:4
#2:4
#3:3
#4:2
#remove dups
d = set(a)
print(d)
#{1, 2, 3, 4}
For your first question, iterate over the list and use a dictionary to keep track of each element's existence.
For your second question, just use the set operator.
def frequencyDistribution(data):
    return {i: data.count(i) for i in data}
print(frequencyDistribution([1,2,3,4]))
...
{1: 1, 2: 1, 3: 1, 4: 1} # originalNumber: count
I am quite late, but this will also work, and will help others:
a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
freq_list = []
a_l = list(set(a))
for x in a_l:
    freq_list.append(a.count(x))
print('Freq', freq_list)
print('number', a_l)
will produce this:
Freq [4, 4, 2, 1, 2]
number [1, 2, 3, 4, 5]
a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
counts = dict.fromkeys(a, 0)
for el in a: counts[el] += 1
print(counts)
# {1: 4, 2: 4, 3: 2, 4: 1, 5: 2}
a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
# 1. Get counts and store in another list
output = []
for i in set(a):
    output.append(a.count(i))
print(output)
# 2. Remove duplicates using set constructor
a = list(set(a))
print(a)
A set does not allow duplicates, so passing a list to the set() constructor gives a collection of unique objects. The count() method returns how many times the object passed to it occurs in the list. In this way each unique object is counted, and each count is appended to the initially empty list output.
The list() constructor converts set(a) into a list, which is assigned back to the same variable a.
Output
D:\MLrec\venv\Scripts\python.exe D:/MLrec/listgroup.py
[4, 4, 2, 1, 2]
[1, 2, 3, 4, 5]
Simple solution using a dictionary.
def frequency(l):
    d = {}
    for i in l:
        if i in d.keys():
            d[i] += 1
        else:
            d[i] = 1
    # return the first most frequent value together with the distinct values
    for k, v in d.items():
        if v == max(d.values()):
            return k, d.keys()
print(frequency([10,10,10,10,20,20,20,20,40,40,50,50,30]))
#!/usr/bin/python
def frq(words):
    freq = {}
    for w in words:
        if w in freq:
            freq[w] = freq.get(w)+1
        else:
            freq[w] = 1
    return freq
# read the whole file and split it into words
# (names changed so that the built-ins list and input are not shadowed)
fp = open("poem","r")
text = fp.read()
fp.close()
words = text.split()
print(words)
d = frq(words)
print("frequency of input\n: ")
print(d)
# write one "word:count" line per distinct word
fp1 = open("output.txt","w+")
for k,v in d.items():
    fp1.write(str(k)+':'+str(v)+"\n")
fp1.close()
from collections import OrderedDict
a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
def get_count(lists):
    dictionary = OrderedDict()
    for val in lists:
        dictionary.setdefault(val,[]).append(1)
    return [sum(val) for val in dictionary.values()]
print(get_count(a))
>>>[4, 4, 2, 1, 2]
To remove duplicates and Maintain order:
list(dict.fromkeys(get_count(a)))
>>>[4, 2, 1]
I'm using Counter to generate a frequency dict from the words of a text file in one line of code:
import re
from collections import Counter
def _fileIndex(fh):
    ''' create a dict using Counter of a
    flat list of words (re.findall(re.compile(r"[a-zA-Z]+"), lines)) in (lines in file->for lines in fh)
    '''
    return Counter(
        [wrd.lower() for wrdList in
         [words for words in
          [re.findall(re.compile(r'[a-zA-Z]+'), lines) for lines in fh]]
         for wrd in wrdList])
For the record, a functional answer:
>>> L = [1,1,1,1,2,2,2,2,3,3,4,5,5]
>>> import functools
>>> functools.reduce(lambda acc, e: [v+(i==e) for i, v in enumerate(acc,1)] if e<=len(acc) else acc+[0 for _ in range(e-len(acc)-1)]+[1], L, [])
[4, 4, 2, 1, 2]
It's cleaner if you count zeroes too:
>>> functools.reduce(lambda acc, e: [v+(i==e) for i, v in enumerate(acc)] if e<len(acc) else acc+[0 for _ in range(e-len(acc))]+[1], L, [])
[0, 4, 4, 2, 1, 2]
An explanation:
we start with an empty acc list;
if the next element e of L is lower than the size of acc, we just update this element: v+(i==e) means v+1 if the index i of acc is the current element e, otherwise the previous value v;
if the next element e of L is greater or equals to the size of acc, we have to expand acc to host the new 1.
The elements do not have to be sorted (itertools.groupby). You'll get weird results if you have negative numbers.
Another approach of doing this, albeit by using a heavier but powerful library - NLTK.
import nltk
fdist = nltk.FreqDist(a)
fdist.values()
fdist.most_common()
Found another way of doing this, using sets.
#ar is the list of elements
#convert ar to set to get unique elements
sock_set = set(ar)
#create dictionary of frequency of socks
sock_dict = {}
for sock in sock_set:
    sock_dict[sock] = ar.count(sock)
For an unordered list you should use:
[a.count(el) for el in set(a)]
The output is
[4, 4, 2, 1, 2]
Yet another solution with another algorithm without using collections:
def countFreq(A):
    if not A:
        return []
    # one slot per possible value; assumes non-negative integers
    count = [0] * (max(A) + 1)  # sizing by len(A) fails when a value >= len(A)
    for value in A:
        count[value] += 1  # increase occurrence for this value
    return [x for x in count if x]  # return non-zero counts
num=[3,2,3,5,5,3,7,6,4,6,7,2]
print ('\nelements are:\t',num)
count_dict={}
for elements in num:
    count_dict[elements]=num.count(elements)
print ('\nfrequency:\t',count_dict)
You can use the built-in list method count:
l.count(l[i])
d=[]
for i in range(len(l)):
    if l[i] not in d:
        d.append(l[i])
        print(l.count(l[i]))
The code above builds a list d without duplicates and prints the frequency of each distinct element of the original list l.
Two birds with one stone! XD
This approach can be tried if you don't want to use any library and keep it simple and short!
a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
marked = []
b = [(a.count(i), marked.append(i))[0] for i in a if i not in marked]
print(b)
o/p
[4, 4, 2, 1, 2]

Find the same values in array [duplicate]

Given a single item, how do I count occurrences of it in a list, in Python?
A related but different problem is counting occurrences of each different element in a collection, getting a dictionary or list as a histogram result instead of a single integer. For that problem, see Using a dictionary to count the items in a list.
If you only want a single item's count, use the count method:
>>> [1, 2, 3, 4, 1, 4, 1].count(1)
3
Important: this is very slow if you are counting multiple different items
Each count call goes over the entire list of n elements. Calling count in a loop n times means n * n total checks, which can be catastrophic for performance.
If you want to count multiple items, use Counter, which only does n total checks.
Use Counter if you are using Python 2.7 or 3.x and you want the number of occurrences for each element:
>>> from collections import Counter
>>> z = ['blue', 'red', 'blue', 'yellow', 'blue', 'red']
>>> Counter(z)
Counter({'blue': 3, 'red': 2, 'yellow': 1})
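To make the difference concrete, here is a small sketch that computes the same tally both ways. The variable names slow and fast are only for illustration, and the visit counts in the last line are a rough model of the work done, not a measurement.

```python
from collections import Counter

z = ['blue', 'red', 'blue', 'yellow', 'blue', 'red']

# Quadratic: one full pass over z for every element of z.
slow = {item: z.count(item) for item in z}

# Linear: a single pass over z.
fast = Counter(z)

print(slow == dict(fast))  # True
# Rough number of element visits for each approach:
print(len(z) * len(z), len(z))  # 36 6
```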
Counting the occurrences of one item in a list
For counting the occurrences of just one list item you can use count()
>>> l = ["a","b","b"]
>>> l.count("a")
1
>>> l.count("b")
2
Counting the occurrences of all items in a list is also known as "tallying" a list, or creating a tally counter.
Counting all items with count()
To count the occurrences of items in l one can simply use a list comprehension and the count() method
[[x,l.count(x)] for x in set(l)]
(or similarly with a dictionary dict((x,l.count(x)) for x in set(l)))
Example:
>>> l = ["a","b","b"]
>>> [[x,l.count(x)] for x in set(l)]
[['a', 1], ['b', 2]]
>>> dict((x,l.count(x)) for x in set(l))
{'a': 1, 'b': 2}
Counting all items with Counter()
Alternatively, there's the faster Counter class from the collections library
Counter(l)
Example:
>>> l = ["a","b","b"]
>>> from collections import Counter
>>> Counter(l)
Counter({'b': 2, 'a': 1})
How much faster is Counter?
I checked how much faster Counter is for tallying lists. I tried both methods out with a few values of n; in the run shown below (n = 1000), Counter is faster by a factor of roughly 20.
Here is the script I used:
from __future__ import print_function
import timeit
t1=timeit.Timer('Counter(l)',
                'import random;import string;from collections import Counter;n=1000;l=[random.choice(string.ascii_letters) for x in range(n)]'
                )
t2=timeit.Timer('[[x,l.count(x)] for x in set(l)]',
                'import random;import string;n=1000;l=[random.choice(string.ascii_letters) for x in range(n)]'
                )
print("Counter(): ", t1.repeat(repeat=3,number=10000))
print("count(): ", t2.repeat(repeat=3,number=10000))
And the output:
Counter(): [0.46062711701961234, 0.4022796869976446, 0.3974247490405105]
count(): [7.779430688009597, 7.962715800967999, 8.420845870045014]
Another way to get the number of occurrences of each item, in a dictionary:
dict((i, a.count(i)) for i in a)
Given an item, how can I count its occurrences in a list in Python?
Here's an example list:
>>> l = list('aaaaabbbbcccdde')
>>> l
['a', 'a', 'a', 'a', 'a', 'b', 'b', 'b', 'b', 'c', 'c', 'c', 'd', 'd', 'e']
list.count
There's the list.count method
>>> l.count('b')
4
This works fine for any list. Tuples have this method as well:
>>> t = tuple('aabbbffffff')
>>> t
('a', 'a', 'b', 'b', 'b', 'f', 'f', 'f', 'f', 'f', 'f')
>>> t.count('f')
6
collections.Counter
And then there's collections.Counter. You can dump any iterable into a Counter, not just a list, and the Counter will retain a data structure of the counts of the elements.
Usage:
>>> from collections import Counter
>>> c = Counter(l)
>>> c['b']
4
Counters are based on Python dictionaries, their keys are the elements, so the keys need to be hashable. They are basically like sets that allow redundant elements into them.
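Two consequences of the dictionary basis are worth seeing in a short sketch: a key that was never counted reads as 0 rather than raising KeyError, and unhashable elements cannot be counted directly. The tuple conversion at the end is just one possible workaround, chosen here for illustration.

```python
from collections import Counter

c = Counter(list('aaaaabbbbcccdde'))

# A key that was never counted returns 0 instead of raising KeyError.
print(c['z'])      # 0
print('z' in c)    # False: the lookup above did not add the key

# Unhashable elements such as lists cannot be counted directly.
error = None
try:
    Counter([[1, 2], [1, 2]])
except TypeError as exc:
    error = exc
    print('TypeError:', exc)

# One workaround is to convert them to a hashable type first.
pairs = Counter(tuple(item) for item in [[1, 2], [1, 2], [3, 4]])
print(pairs[(1, 2)])  # 2
```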
Further usage of collections.Counter
You can add or subtract with iterables from your counter:
>>> c.update(list('bbb'))
>>> c['b']
7
>>> c.subtract(list('bbb'))
>>> c['b']
4
And you can do multi-set operations with the counter as well:
>>> c2 = Counter(list('aabbxyz'))
>>> c - c2 # set difference
Counter({'a': 3, 'c': 3, 'b': 2, 'd': 2, 'e': 1})
>>> c + c2 # addition of all elements
Counter({'a': 7, 'b': 6, 'c': 3, 'd': 2, 'e': 1, 'y': 1, 'x': 1, 'z': 1})
>>> c | c2 # set union
Counter({'a': 5, 'b': 4, 'c': 3, 'd': 2, 'e': 1, 'y': 1, 'x': 1, 'z': 1})
>>> c & c2 # set intersection
Counter({'a': 2, 'b': 2})
Silly answer, sum
There are good builtin answers, but this example is slightly instructive. Here we sum all the occurrences where the character, c, is equal to 'b':
>>> sum(c == 'b' for c in l)
4
Not great for this use-case, but if you need a count of the items for which some condition is True, summing the boolean results works perfectly fine, since True is equivalent to 1.
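As a sketch of that more general case, here the condition is a length test rather than an equality check; the list words and the threshold are made up for illustration.

```python
words = ['apple', 'kiwi', 'banana', 'fig', 'cherry']

# Each comparison yields True or False; sum() adds them up as 1 and 0.
long_words = sum(len(word) > 4 for word in words)
print(long_words)  # 3
```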
Why not pandas?
Another answer suggests:
Why not use pandas?
Pandas is a common library, but it's not in the standard library. Adding it as a requirement is non-trivial.
There are builtin solutions for this use-case in the list object itself as well as in the standard library.
If your project does not already require pandas, it would be foolish to make it a requirement just for this functionality.
I've compared all suggested solutions (and a few new ones) with perfplot (a small project of mine).
Counting one item
For large enough arrays, it turns out that
numpy.sum(numpy.array(a) == 1)
is slightly faster than the other solutions.
Counting all items
As established before,
numpy.bincount(a)
is what you want.
Code to reproduce the plots:
from collections import Counter
from collections import defaultdict
import numpy
import operator
import pandas
import perfplot
def counter(a):
    return Counter(a)
def count(a):
    return dict((i, a.count(i)) for i in set(a))
def bincount(a):
    return numpy.bincount(a)
def pandas_value_counts(a):
    return pandas.Series(a).value_counts()
def occur_dict(a):
    d = {}
    for i in a:
        if i in d:
            d[i] = d[i]+1
        else:
            d[i] = 1
    return d
def count_unsorted_list_items(items):
    counts = defaultdict(int)
    for item in items:
        counts[item] += 1
    return dict(counts)
def operator_countof(a):
    return dict((i, operator.countOf(a, i)) for i in set(a))
perfplot.show(
    setup=lambda n: list(numpy.random.randint(0, 100, n)),
    n_range=[2**k for k in range(20)],
    kernels=[
        counter, count, bincount, pandas_value_counts, occur_dict,
        count_unsorted_list_items, operator_countof
    ],
    equality_check=None,
    logx=True,
    logy=True,
)
b = perfplot.bench(
    setup=lambda n: list(numpy.random.randint(0, 100, n)),
    n_range=[2 ** k for k in range(20)],
    kernels=[
        counter,
        count,
        bincount,
        pandas_value_counts,
        occur_dict,
        count_unsorted_list_items,
        operator_countof,
    ],
    equality_check=None,
)
b.save("out.png")
b.show()
list.count(x) returns the number of times x appears in a list
see:
http://docs.python.org/tutorial/datastructures.html#more-on-lists
If you want to count all values at once you can do it very fast using numpy arrays and bincount as follows
import numpy as np
a = np.array([1, 2, 3, 4, 1, 4, 1])
np.bincount(a)
which gives
>>> array([0, 3, 1, 1, 2])
Why not using Pandas?
import pandas as pd
my_list = ['a', 'b', 'c', 'd', 'a', 'd', 'a']
# converting the list to a Series and counting the values
my_count = pd.Series(my_list).value_counts()
my_count
Output:
a 3
d 2
b 1
c 1
dtype: int64
If you are looking for a count of a particular element, say a, try:
my_count['a']
Output:
3
If you can use pandas, then value_counts is there for rescue.
>>> import pandas as pd
>>> a = [1, 2, 3, 4, 1, 4, 1]
>>> pd.Series(a).value_counts()
1 3
4 2
3 1
2 1
dtype: int64
It automatically sorts the result based on frequency as well.
If you want the result to be in a list of list, do as below
>>> pd.Series(a).value_counts().reset_index().values.tolist()
[[1, 3], [4, 2], [3, 1], [2, 1]]
I had this problem today and rolled my own solution before I thought to check SO. This:
dict((i,a.count(i)) for i in a)
is really, really slow for large lists. My solution
def occurDict(items):
    d = {}
    for i in items:
        if i in d:
            d[i] = d[i]+1
        else:
            d[i] = 1
    return d
is actually a bit faster than the Counter solution, at least for Python 2.7.
Count of all elements with itertools.groupby()
Another possibility for getting the count of all elements in the list is itertools.groupby().
With "duplicate" counts
from itertools import groupby
L = ['a', 'a', 'a', 't', 'q', 'a', 'd', 'a', 'd', 'c'] # Input list
counts = [(i, len(list(c))) for i,c in groupby(L)] # Create value-count pairs as list of tuples
print(counts)
Returns
[('a', 3), ('t', 1), ('q', 1), ('a', 1), ('d', 1), ('a', 1), ('d', 1), ('c', 1)]
Notice how it combined the first three a's as the first group, while other groups of a are present further down the list. This happens because the input list L was not sorted. This can be a benefit sometimes if the groups should in fact be separate.
With unique counts
If unique group counts are desired, just sort the input list:
counts = [(i, len(list(c))) for i,c in groupby(sorted(L))]
print(counts)
Returns
[('a', 5), ('c', 1), ('d', 2), ('q', 1), ('t', 1)]
Note: For creating unique counts, many of the other answers provide easier and more readable code compared to the groupby solution. But it is shown here to draw a parallel to the duplicate count example.
Although it is a very old question, since I didn't find a one-liner, I made one.
# original numbers in list
l = [1, 2, 2, 3, 3, 3, 4]
# empty dictionary to hold pair of number and its count
d = {}
# loop through all elements and store count
[ d.update( {i:d.get(i, 0)+1} ) for i in l ]
print(d)
# {1: 1, 2: 2, 3: 3, 4: 1}
# Python >= 2.6 (defaultdict) && < 2.7 (Counter, OrderedDict)
from collections import defaultdict
def count_unsorted_list_items(items):
    """
    :param items: iterable of hashable items to count
    :type items: iterable
    :returns: dict of counts like Py2.7 Counter
    :rtype: dict
    """
    counts = defaultdict(int)
    for item in items:
        counts[item] += 1
    return dict(counts)
# Python >= 2.2 (generators)
def count_sorted_list_items(items):
    """
    :param items: sorted iterable of items to count
    :type items: sorted iterable
    :returns: generator of (item, count) tuples
    :rtype: generator
    """
    if not items:
        return
    elif len(items) == 1:
        yield (items[0], 1)
        return
    prev_item = items[0]
    count = 1
    for item in items[1:]:
        if prev_item == item:
            count += 1
        else:
            yield (prev_item, count)
            count = 1
            prev_item = item
    yield (item, count)
    return
import unittest
class TestListCounters(unittest.TestCase):
    def test_count_unsorted_list_items(self):
        D = (
            ([], []),
            ([2], [(2,1)]),
            ([2,2], [(2,2)]),
            ([2,2,2,2,3,3,5,5], [(2,4), (3,2), (5,2)]),
        )
        for inp, exp_outp in D:
            counts = count_unsorted_list_items(inp)
            print(inp, exp_outp, counts)
            self.assertEqual(counts, dict( exp_outp ))
        inp, exp_outp = UNSORTED_WIN = ([2,2,4,2], [(2,3), (4,1)])
        self.assertEqual(dict( exp_outp ), count_unsorted_list_items(inp) )
    def test_count_sorted_list_items(self):
        D = (
            ([], []),
            ([2], [(2,1)]),
            ([2,2], [(2,2)]),
            ([2,2,2,2,3,3,5,5], [(2,4), (3,2), (5,2)]),
        )
        for inp, exp_outp in D:
            counts = list( count_sorted_list_items(inp) )
            print(inp, exp_outp, counts)
            self.assertEqual(counts, exp_outp)
        # this assertion fails on purpose: unsorted input splits the groups
        inp, exp_outp = UNSORTED_FAIL = ([2,2,4,2], [(2,3), (4,1)])
        self.assertEqual(exp_outp, list( count_sorted_list_items(inp) ))
        # ... [(2,2), (4,1), (2,1)]
Below are the three solutions:
Fastest is using a for loop and storing it in a Dict.
import time
from collections import Counter
def countElement(a):
    g = {}
    for i in a:
        if i in g:
            g[i] +=1
        else:
            g[i] =1
    return g
z = [1,1,1,1,2,2,2,2,3,3,4,5,5,234,23,3,12,3,123,12,31,23,13,2,4,23,42,42,34,234,23,42,34,23,423,42,34,23,423,4,234,23,42,34,23,4,23,423,4,23,4]
#Solution 1 - Faster
st = time.monotonic()
for i in range(1000000):
    b = countElement(z)
et = time.monotonic()
print(b)
print('Simple for loop and storing it in dict - Duration: {}'.format(et - st))
#Solution 2 - Fast
st = time.monotonic()
for i in range(1000000):
    a = Counter(z)
et = time.monotonic()
print (a)
print('Using collections.Counter - Duration: {}'.format(et - st))
#Solution 3 - Slow
st = time.monotonic()
for i in range(1000000):
    g = dict([(i, z.count(i)) for i in set(z)])
et = time.monotonic()
print(g)
print('Using list comprehension - Duration: {}'.format(et - st))
Result
#Solution 1 - Faster
{1: 4, 2: 5, 3: 4, 4: 6, 5: 2, 234: 3, 23: 10, 12: 2, 123: 1, 31: 1, 13: 1, 42: 5, 34: 4, 423: 3}
Simple for loop and storing it in dict - Duration: 12.032000000000153
#Solution 2 - Fast
Counter({23: 10, 4: 6, 2: 5, 42: 5, 1: 4, 3: 4, 34: 4, 234: 3, 423: 3, 5: 2, 12: 2, 123: 1, 31: 1, 13: 1})
Using collections.Counter - Duration: 15.889999999999418
#Solution 3 - Slow
{1: 4, 2: 5, 3: 4, 4: 6, 5: 2, 34: 4, 423: 3, 234: 3, 42: 5, 12: 2, 13: 1, 23: 10, 123: 1, 31: 1}
Using list comprehension - Duration: 33.0
It was suggested to use numpy's bincount, however it works only for 1d arrays with non-negative integers. Also, the resulting array might be confusing (it contains the occurrences of the integers from min to max of the original list, and sets to 0 the missing integers).
A better way to do it with numpy is to use the unique function with the argument return_counts set to True. It returns a tuple with an array of the unique values and an array of the occurrences of each unique value.
import numpy as np
a = [1, 1, 0, 2, 1, 0, 3, 3]
a_uniq, counts = np.unique(a, return_counts=True) # array([0, 1, 2, 3]), array([2, 3, 1, 2])
and then we can pair them as
dict(zip(a_uniq, counts)) # {0: 2, 1: 3, 2: 1, 3: 2}
It also works with other data types and "2d lists", e.g.
>>> a = [['a', 'b', 'b', 'b'], ['a', 'c', 'c', 'a']]
>>> dict(zip(*np.unique(a, return_counts=True)))
{'a': 3, 'b': 3, 'c': 2}
To count the number of elements that satisfy a common condition:
li = ['A0','c5','A8','A2','A5','c2','A3','A9']
print(sum(1 for el in li if el[0]=='A' and el[1] in '01234'))
gives
3, not 6
You can also use countOf method of a built-in module operator.
>>> import operator
>>> operator.countOf([1, 2, 3, 4, 1, 4, 1], 1)
3
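I would use filter(), taking Lukasz's example (in Python 3, filter() returns an iterator, so it has to be wrapped in list() before calling len()):
>>> lst = [1, 2, 3, 4, 1, 4, 1]
>>> len(list(filter(lambda x: x==1, lst)))
3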
use %timeit to see which operation is more efficient. np.array counting operations should be faster.
from collections import Counter
mylist = [1,7,7,7,3,9,9,9,7,9,10,0]
types_counts=Counter(mylist)
print(types_counts)
May not be the most efficient, requires an extra pass to remove duplicates.
Functional implementation :
import numpy as np
arr = np.array(['a','a','b','b','b','c'])
print(set(map(lambda x : (x , list(arr).count(x)) , arr)))
returns :
{('c', 1), ('b', 3), ('a', 2)}
or return as dict :
print(dict(map(lambda x : (x , list(arr).count(x)) , arr)))
returns :
{'b': 3, 'c': 1, 'a': 2}
Given a list X
import numpy as np
X = [1, -1, 1, -1, 1]
The dictionary which shows i: frequency(i) for elements of this list is:
{i:X.count(i) for i in np.unique(X)}
Output:
{-1: 2, 1: 3}
Alternatively, you can also implement the counter by yourself. This is the way I do:
item_list = ['me', 'me', 'you', 'you', 'you', 'they']
occ_dict = {}
for item in item_list:
    if item not in occ_dict:
        occ_dict[item] = 1
    else:
        occ_dict[item] +=1
print(occ_dict)
Output: {'me': 2, 'you': 3, 'they': 1}
mot = ["compte", "france", "zied"]
lst = ["compte", "france", "france", "france", "france"]
dict((x, lst.count(x)) for x in set(mot))
this gives
{'compte': 1, 'france': 4, 'zied': 0}
sum([1 for elem in <yourlist> if elem==<your_value>])
This will return the number of occurrences of your_value.
test = [409.1, 479.0, 340.0, 282.4, 406.0, 300.0, 374.0, 253.3, 195.1, 269.0, 329.3, 250.7, 250.7, 345.3, 379.3, 275.0, 215.2, 300.0]
for i in test:
    print('{} numbers {}'.format(i, test.count(i)))
import pandas as pd
test = [409.1, 479.0, 340.0, 282.4, 406.0, 300.0, 374.0, 253.3, 195.1, 269.0, 329.3, 250.7, 250.7, 345.3, 379.3, 275.0, 215.2, 300.0]
#turning the list into a temporary dataframe
test = pd.DataFrame(test)
#using the very convenient value_counts() function
df_counts = test.value_counts()
df_counts
then you can use df_counts.index and df_counts.values to get the data.
x = ['Jess', 'Jack', 'Mary', 'Sophia', 'Karen',
'Addison', 'Joseph','Jack', 'Jack', 'Eric', 'Ilona', 'Jason']
the_item = input('Enter the item that you wish to find : ')
how_many_times = 0
for occurrence in x:
    if occurrence == the_item :
        how_many_times += 1
print('The occurrence of', the_item, 'in', x,'is',how_many_times)
I created a list of names in which the name 'Jack' is repeated.
To count its occurrences, I run a for loop over the list named x.
On each iteration, if the loop variable equals the value received from the user and stored in the variable the_item, the variable how_many_times is incremented by 1.
After the loop, we print how_many_times, which holds the number of occurrences of the entered name (3 for 'Jack').
def countfrequncyinarray(arr1):
    r=len(arr1)
    return {i:arr1.count(i) for i in range(1,r+1)}
arr1=[4,4,4,4]
a=countfrequncyinarray(arr1)
print(a)

In Python, I have a table with columns and lines; how do I count how many repetitions there are? [duplicate]

Given a single item, how do I count occurrences of it in a list, in Python?
A related but different problem is counting occurrences of each different element in a collection, getting a dictionary or list as a histogram result instead of a single integer. For that problem, see Using a dictionary to count the items in a list.
If you only want a single item's count, use the count method:
>>> [1, 2, 3, 4, 1, 4, 1].count(1)
3
Important: this is very slow if you are counting multiple different items
Each count call goes over the entire list of n elements. Calling count in a loop n times means n * n total checks, which can be catastrophic for performance.
If you want to count multiple items, use Counter, which only does n total checks.
Use Counter if you are using Python 2.7 or 3.x and you want the number of occurrences for each element:
>>> from collections import Counter
>>> z = ['blue', 'red', 'blue', 'yellow', 'blue', 'red']
>>> Counter(z)
Counter({'blue': 3, 'red': 2, 'yellow': 1})
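To make the cost difference described above concrete, here is a minimal sketch that counts several items with one Counter pass instead of one count call per item. The list z is the one from the example; the wanted tuple is an illustrative assumption, not part of the original answer.

```python
from collections import Counter

z = ['blue', 'red', 'blue', 'yellow', 'blue', 'red']

# One pass over the list builds every count at once.
counts = Counter(z)

# Hypothetical items of interest; each lookup is a plain dict access.
# A key that never appeared yields 0 instead of raising KeyError.
wanted = ('blue', 'yellow', 'green')
per_item = {item: counts[item] for item in wanted}
print(per_item)

# most_common(1) returns a list with the most frequent (item, count) pair.
top = counts.most_common(1)
print(top)
```

With this approach the list is traversed once, no matter how many different items are looked up afterwards.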
Counting the occurrences of one item in a list
For counting the occurrences of just one list item you can use count()
>>> l = ["a","b","b"]
>>> l.count("a")
1
>>> l.count("b")
2
Counting the occurrences of all items in a list is also known as "tallying" a list, or creating a tally counter.
Counting all items with count()
To count the occurrences of items in l one can simply use a list comprehension and the count() method
[[x,l.count(x)] for x in set(l)]
(or similarly with a dictionary dict((x,l.count(x)) for x in set(l)))
Example:
>>> l = ["a","b","b"]
>>> [[x,l.count(x)] for x in set(l)]
[['a', 1], ['b', 2]]
>>> dict((x,l.count(x)) for x in set(l))
{'a': 1, 'b': 2}
Counting all items with Counter()
Alternatively, there's the faster Counter class from the collections library
Counter(l)
Example:
>>> l = ["a","b","b"]
>>> from collections import Counter
>>> Counter(l)
Counter({'b': 2, 'a': 1})
How much faster is Counter?
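I checked how much faster Counter is for tallying lists. I tried both methods with a few values of n; in the run shown below (n = 1000, letters drawn from string.ascii_letters) Counter is roughly 17 to 21 times faster. The gap depends on the number of distinct elements, because the count() approach makes one full pass over the list per distinct element.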
Here is the script I used:
from __future__ import print_function
import timeit
t1 = timeit.Timer('Counter(l)',
    'import random;import string;from collections import Counter;n=1000;l=[random.choice(string.ascii_letters) for x in range(n)]'
)
t2 = timeit.Timer('[[x,l.count(x)] for x in set(l)]',
    'import random;import string;n=1000;l=[random.choice(string.ascii_letters) for x in range(n)]'
)
print("Counter(): ", t1.repeat(repeat=3, number=10000))
print("count(): ", t2.repeat(repeat=3, number=10000))
And the output:
Counter(): [0.46062711701961234, 0.4022796869976446, 0.3974247490405105]
count(): [7.779430688009597, 7.962715800967999, 8.420845870045014]
Another way to get the number of occurrences of each item, in a dictionary:
dict((i, a.count(i)) for i in a)
Given an item, how can I count its occurrences in a list in Python?
Here's an example list:
>>> l = list('aaaaabbbbcccdde')
>>> l
['a', 'a', 'a', 'a', 'a', 'b', 'b', 'b', 'b', 'c', 'c', 'c', 'd', 'd', 'e']
list.count
There's the list.count method
>>> l.count('b')
4
This works fine for any list. Tuples have this method as well:
>>> t = tuple('aabbbffffff')
>>> t
('a', 'a', 'b', 'b', 'b', 'f', 'f', 'f', 'f', 'f', 'f')
>>> t.count('f')
6
collections.Counter
And then there's collections.Counter. You can dump any iterable into a Counter, not just a list, and the Counter will retain a data structure of the counts of the elements.
Usage:
>>> from collections import Counter
>>> c = Counter(l)
>>> c['b']
4
Counters are based on Python dictionaries: their keys are the elements, so the elements need to be hashable. They are basically like sets that allow redundant elements.
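As a small illustration of the hashability requirement, the sketch below tries to count unhashable elements and then works around the problem. The rows data is made up for the example.

```python
from collections import Counter

rows = [[1, 2], [3, 4], [1, 2]]  # illustrative data: lists are not hashable

try:
    Counter(rows)
    failed = False
except TypeError:
    # A list cannot be used as a dictionary key, so counting fails.
    failed = True

# Converting each row to a tuple makes it hashable and therefore countable.
row_counts = Counter(tuple(row) for row in rows)
print(failed, row_counts[(1, 2)])
```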
Further usage of collections.Counter
You can add or subtract with iterables from your counter:
>>> c.update(list('bbb'))
>>> c['b']
7
>>> c.subtract(list('bbb'))
>>> c['b']
4
And you can do multi-set operations with the counter as well:
>>> c2 = Counter(list('aabbxyz'))
>>> c - c2 # set difference
Counter({'a': 3, 'c': 3, 'b': 2, 'd': 2, 'e': 1})
>>> c + c2 # addition of all elements
Counter({'a': 7, 'b': 6, 'c': 3, 'd': 2, 'e': 1, 'y': 1, 'x': 1, 'z': 1})
>>> c | c2 # set union
Counter({'a': 5, 'b': 4, 'c': 3, 'd': 2, 'e': 1, 'y': 1, 'x': 1, 'z': 1})
>>> c & c2 # set intersection
Counter({'a': 2, 'b': 2})
Silly answer, sum
There are good builtin answers, but this example is slightly instructive. Here we sum all the occurrences where the character, c, is equal to 'b':
>>> sum(c == 'b' for c in l)
4
Not great for this use case, but if you need to count the items for which some condition is True, summing the boolean results works perfectly well, since True is equivalent to 1.
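A short sketch of that more general use, counting items that satisfy a condition rather than items equal to one value. The two conditions are chosen only for illustration.

```python
l = list('aaaaabbbbcccdde')

# Each test yields True or False; sum() adds them up as 1 and 0.
vowel_count = sum(ch in 'aeiou' for ch in l)  # counts the 'a' and 'e' items
not_a_count = sum(ch != 'a' for ch in l)      # counts everything except 'a'
print(vowel_count, not_a_count)
```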
Why not pandas?
Another answer suggests:
Why not use pandas?
Pandas is a common library, but it's not in the standard library. Adding it as a requirement is non-trivial.
There are builtin solutions for this use-case in the list object itself as well as in the standard library.
If your project does not already require pandas, it would be foolish to make it a requirement just for this functionality.
I've compared all suggested solutions (and a few new ones) with perfplot (a small project of mine).
Counting one item
For large enough arrays, it turns out that
numpy.sum(numpy.array(a) == 1)
is slightly faster than the other solutions.
Counting all items
As established before,
numpy.bincount(a)
is what you want.
Code to reproduce the plots:
from collections import Counter
from collections import defaultdict
import numpy
import operator
import pandas
import perfplot
def counter(a):
    return Counter(a)
def count(a):
    return dict((i, a.count(i)) for i in set(a))
def bincount(a):
    return numpy.bincount(a)
def pandas_value_counts(a):
    return pandas.Series(a).value_counts()
def occur_dict(a):
    d = {}
    for i in a:
        if i in d:
            d[i] = d[i]+1
        else:
            d[i] = 1
    return d
def count_unsorted_list_items(items):
    counts = defaultdict(int)
    for item in items:
        counts[item] += 1
    return dict(counts)
def operator_countof(a):
    return dict((i, operator.countOf(a, i)) for i in set(a))
perfplot.show(
    setup=lambda n: list(numpy.random.randint(0, 100, n)),
    n_range=[2**k for k in range(20)],
    kernels=[
        counter, count, bincount, pandas_value_counts, occur_dict,
        count_unsorted_list_items, operator_countof
    ],
    equality_check=None,
    logx=True,
    logy=True,
)
The same benchmark written with perfplot.bench, which also saves the plot to a file:
from collections import Counter
from collections import defaultdict
import numpy
import operator
import pandas
import perfplot
def counter(a):
    return Counter(a)
def count(a):
    return dict((i, a.count(i)) for i in set(a))
def bincount(a):
    return numpy.bincount(a)
def pandas_value_counts(a):
    return pandas.Series(a).value_counts()
def occur_dict(a):
    d = {}
    for i in a:
        if i in d:
            d[i] = d[i] + 1
        else:
            d[i] = 1
    return d
def count_unsorted_list_items(items):
    counts = defaultdict(int)
    for item in items:
        counts[item] += 1
    return dict(counts)
def operator_countof(a):
    return dict((i, operator.countOf(a, i)) for i in set(a))
b = perfplot.bench(
    setup=lambda n: list(numpy.random.randint(0, 100, n)),
    n_range=[2 ** k for k in range(20)],
    kernels=[
        counter,
        count,
        bincount,
        pandas_value_counts,
        occur_dict,
        count_unsorted_list_items,
        operator_countof,
    ],
    equality_check=None,
)
b.save("out.png")
b.show()
list.count(x) returns the number of times x appears in a list
see:
http://docs.python.org/tutorial/datastructures.html#more-on-lists
If you want to count all values at once you can do it very fast using numpy arrays and bincount as follows
import numpy as np
a = np.array([1, 2, 3, 4, 1, 4, 1])
np.bincount(a)
which gives
array([0, 3, 1, 1, 2])
Why not use pandas?
import pandas as pd
my_list = ['a', 'b', 'c', 'd', 'a', 'd', 'a']
# converting the list to a Series and counting the values
my_count = pd.Series(my_list).value_counts()
my_count
Output:
a 3
d 2
b 1
c 1
dtype: int64
If you are looking for a count of a particular element, say a, try:
my_count['a']
Output:
3
If you can use pandas, then value_counts comes to the rescue.
>>> import pandas as pd
>>> a = [1, 2, 3, 4, 1, 4, 1]
>>> pd.Series(a).value_counts()
1 3
4 2
3 1
2 1
dtype: int64
It automatically sorts the result based on frequency as well.
If you want the result as a list of lists, do as below:
>>> pd.Series(a).value_counts().reset_index().values.tolist()
[[1, 3], [4, 2], [3, 1], [2, 1]]
I had this problem today and rolled my own solution before I thought to check SO. This:
dict((i,a.count(i)) for i in a)
is really, really slow for large lists. My solution
def occurDict(items):
    d = {}
    for i in items:
        if i in d:
            d[i] = d[i]+1
        else:
            d[i] = 1
    return d
is actually a bit faster than the Counter solution, at least for Python 2.7.
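The relative speed can be checked on your own interpreter with a small timing sketch like the one below. The test data and repetition count are arbitrary assumptions, and no particular outcome is implied: the Python 2.7 observation above may not carry over, since recent CPython versions implement Counter's counting loop in C.

```python
import timeit
from collections import Counter

def occur_dict(items):
    # Same hand-rolled tally as occurDict above.
    d = {}
    for i in items:
        if i in d:
            d[i] = d[i] + 1
        else:
            d[i] = 1
    return d

data = [i % 100 for i in range(10000)]  # assumed test data: 100 distinct values

# Both approaches must agree on the counts before timing them.
same_result = occur_dict(data) == dict(Counter(data))

t_loop = timeit.timeit(lambda: occur_dict(data), number=20)
t_counter = timeit.timeit(lambda: Counter(data), number=20)
print('plain loop: %.4f s, Counter: %.4f s' % (t_loop, t_counter))
```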
Count of all elements with itertools.groupby()
Another possibility for getting the count of all elements in the list is itertools.groupby().
With "duplicate" counts
from itertools import groupby
L = ['a', 'a', 'a', 't', 'q', 'a', 'd', 'a', 'd', 'c'] # Input list
counts = [(i, len(list(c))) for i,c in groupby(L)] # Create value-count pairs as list of tuples
print(counts)
Returns
[('a', 3), ('t', 1), ('q', 1), ('a', 1), ('d', 1), ('a', 1), ('d', 1), ('c', 1)]
Notice how it combined the first three a's as the first group, while other groups of a are present further down the list. This happens because the input list L was not sorted. This can be a benefit sometimes if the groups should in fact be separate.
With unique counts
If unique group counts are desired, just sort the input list:
counts = [(i, len(list(c))) for i,c in groupby(sorted(L))]
print(counts)
Returns
[('a', 5), ('c', 1), ('d', 2), ('q', 1), ('t', 1)]
Note: For creating unique counts, many of the other answers provide easier and more readable code compared to the groupby solution. But it is shown here to draw a parallel to the duplicate count example.
Although this is a very old question, I didn't find a one-liner, so I made one.
# original numbers in list
l = [1, 2, 2, 3, 3, 3, 4]
# empty dictionary to hold pair of number and its count
d = {}
# loop through all elements and store count
[ d.update( {i:d.get(i, 0)+1} ) for i in l ]
print(d)
# {1: 1, 2: 2, 3: 3, 4: 1}
# Python >= 2.6 (defaultdict) && < 2.7 (Counter, OrderedDict)
from collections import defaultdict
def count_unsorted_list_items(items):
    """
    :param items: iterable of hashable items to count
    :type items: iterable
    :returns: dict of counts like Py2.7 Counter
    :rtype: dict
    """
    counts = defaultdict(int)
    for item in items:
        counts[item] += 1
    return dict(counts)
# Python >= 2.2 (generators)
def count_sorted_list_items(items):
    """
    :param items: sorted iterable of items to count
    :type items: sorted iterable
    :returns: generator of (item, count) tuples
    :rtype: generator
    """
    if not items:
        return
    elif len(items) == 1:
        yield (items[0], 1)
        return
    prev_item = items[0]
    count = 1
    for item in items[1:]:
        if prev_item == item:
            count += 1
        else:
            yield (prev_item, count)
            count = 1
            prev_item = item
    yield (item, count)
    return
import unittest
class TestListCounters(unittest.TestCase):
    def test_count_unsorted_list_items(self):
        D = (
            ([], []),
            ([2], [(2,1)]),
            ([2,2], [(2,2)]),
            ([2,2,2,2,3,3,5,5], [(2,4), (3,2), (5,2)]),
        )
        for inp, exp_outp in D:
            counts = count_unsorted_list_items(inp)
            print(inp, exp_outp, counts)
            self.assertEqual(counts, dict( exp_outp ))
        inp, exp_outp = UNSORTED_WIN = ([2,2,4,2], [(2,3), (4,1)])
        self.assertEqual(dict( exp_outp ), count_unsorted_list_items(inp) )
    def test_count_sorted_list_items(self):
        D = (
            ([], []),
            ([2], [(2,1)]),
            ([2,2], [(2,2)]),
            ([2,2,2,2,3,3,5,5], [(2,4), (3,2), (5,2)]),
        )
        for inp, exp_outp in D:
            counts = list( count_sorted_list_items(inp) )
            print(inp, exp_outp, counts)
            self.assertEqual(counts, exp_outp)
        inp, exp_outp = UNSORTED_FAIL = ([2,2,4,2], [(2,3), (4,1)])
        # fails by design: for unsorted input the actual result is [(2,2), (4,1), (2,1)]
        self.assertEqual(exp_outp, list( count_sorted_list_items(inp) ))
Below are the three solutions:
Fastest is using a for loop and storing it in a Dict.
import time
from collections import Counter
def countElement(a):
    g = {}
    for i in a:
        if i in g:
            g[i] += 1
        else:
            g[i] = 1
    return g
z = [1,1,1,1,2,2,2,2,3,3,4,5,5,234,23,3,12,3,123,12,31,23,13,2,4,23,42,42,34,234,23,42,34,23,423,42,34,23,423,4,234,23,42,34,23,4,23,423,4,23,4]
#Solution 1 - Faster
st = time.monotonic()
for i in range(1000000):
    b = countElement(z)
et = time.monotonic()
print(b)
print('Simple for loop and storing it in dict - Duration: {}'.format(et - st))
#Solution 2 - Fast
st = time.monotonic()
for i in range(1000000):
    a = Counter(z)
et = time.monotonic()
print(a)
print('Using collections.Counter - Duration: {}'.format(et - st))
#Solution 3 - Slow
st = time.monotonic()
for i in range(1000000):
    g = dict([(i, z.count(i)) for i in set(z)])
et = time.monotonic()
print(g)
print('Using list comprehension - Duration: {}'.format(et - st))
Result
#Solution 1 - Faster
{1: 4, 2: 5, 3: 4, 4: 6, 5: 2, 234: 3, 23: 10, 12: 2, 123: 1, 31: 1, 13: 1, 42: 5, 34: 4, 423: 3}
Simple for loop and storing it in dict - Duration: 12.032000000000153
#Solution 2 - Fast
Counter({23: 10, 4: 6, 2: 5, 42: 5, 1: 4, 3: 4, 34: 4, 234: 3, 423: 3, 5: 2, 12: 2, 123: 1, 31: 1, 13: 1})
Using collections.Counter - Duration: 15.889999999999418
#Solution 3 - Slow
{1: 4, 2: 5, 3: 4, 4: 6, 5: 2, 34: 4, 423: 3, 234: 3, 42: 5, 12: 2, 13: 1, 23: 10, 123: 1, 31: 1}
Using list comprehension - Duration: 33.0
It was suggested to use numpy's bincount; however, it works only for 1d arrays of non-negative integers. Also, the resulting array might be confusing: it contains the occurrences of every integer from 0 up to the maximum of the original list, with 0 for the integers that are missing.
A better way to do it with numpy is to use the unique function with the argument return_counts set to True. It returns a tuple with an array of the unique values and an array of the occurrences of each unique value.
import numpy as np
a = [1, 1, 0, 2, 1, 0, 3, 3]
a_uniq, counts = np.unique(a, return_counts=True) # array([0, 1, 2, 3]), array([2, 3, 1, 2])
and then we can pair them as
dict(zip(a_uniq, counts)) # {0: 2, 1: 3, 2: 1, 3: 2}
It also works with other data types and "2d lists", e.g.
>>> a = [['a', 'b', 'b', 'b'], ['a', 'c', 'c', 'a']]
>>> dict(zip(*np.unique(a, return_counts=True)))
{'a': 3, 'b': 3, 'c': 2}
To count the elements that match a condition (here, strings that start with 'A' and whose second character is a digit from 0 to 4):
li = ['A0','c5','A8','A2','A5','c2','A3','A9']
print(sum(1 for el in li if el[0]=='A' and el[1] in '01234'))
gives
3, not 6
You can also use countOf method of a built-in module operator.
>>> import operator
>>> operator.countOf([1, 2, 3, 4, 1, 4, 1], 1)
3
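I would use filter(); taking Lukasz's example:
>>> lst = [1, 2, 3, 4, 1, 4, 1]
>>> len(list(filter(lambda x: x==1, lst)))
3
In Python 3, filter() returns an iterator, so it has to be wrapped in list() before calling len().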
use %timeit to see which operation is more efficient. np.array counting operations should be faster.
from collections import Counter
mylist = [1,7,7,7,3,9,9,9,7,9,10,0]
types_counts=Counter(mylist)
print(types_counts)
This may not be the most efficient approach, since it requires an extra pass to remove duplicates.
Functional implementation:
import numpy as np
arr = np.array(['a','a','b','b','b','c'])
print(set(map(lambda x : (x , list(arr).count(x)) , arr)))
returns :
{('c', 1), ('b', 3), ('a', 2)}
or return as dict :
print(dict(map(lambda x : (x , list(arr).count(x)) , arr)))
returns :
{'b': 3, 'c': 1, 'a': 2}
Given a list X
import numpy as np
X = [1, -1, 1, -1, 1]
The dictionary which shows i: frequency(i) for elements of this list is:
{i:X.count(i) for i in np.unique(X)}
Output:
{-1: 2, 1: 3}

How do you calculate the greatest number of repetitions in a list?

If I have a list in Python like
[1, 2, 2, 2, 2, 1, 1, 1, 2, 2, 1, 1]
How do I calculate the greatest number of repeats for any element? In this case 2 is repeated a maximum of 4 times and 1 is repeated a maximum of 3 times.
Is there a way to do this but also record the index at which the longest run began?
Use groupby; it groups consecutive equal elements:
from itertools import groupby
group = groupby([1, 2, 2, 2, 2, 1, 1, 1, 2, 2, 1, 1])
print(max(group, key=lambda k: len(list(k[1]))))
And here is the code in action:
>>> group = groupby([1, 2, 2, 2, 2, 1, 1, 1, 2, 2, 1, 1])
>>> print(max(group, key=lambda k: len(list(k[1]))))
(2, <itertools._grouper object at 0xb779f1cc>)
>>> group = groupby([1, 2, 2, 2, 2, 1, 1, 1, 2, 2, 1, 1, 3, 3, 3, 3, 3])
>>> print(max(group, key=lambda k: len(list(k[1]))))
(3, <itertools._grouper object at 0xb7df95ec>)
From the Python documentation:
The operation of groupby() is similar to the uniq filter in Unix. It generates a break or new group every time the value of the key function changes
# [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
# [list(g) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
If you also want the index of the longest run you can do the following:
group = groupby([1, 2, 2, 2, 2, 1, 1, 1, 2, 2, 1, 1, 3, 3, 3, 3, 3])
result = []
index = 0
for k, g in group:
    length = len(list(g))
    # (value, run length, index at which the run starts)
    result.append((k, length, index))
    index += length
print(max(result, key=lambda a:a[1]))
Loop through the list, keep track of the current number and how many times it has been repeated, and compare that to the most times you've seen that number repeated.
Counts={}
Current=0
Current_Count=0
LIST = [1, 2, 2, 2, 2, 1, 1, 1, 2, 2, 1, 1]
for i in LIST:
    if Current == i:
        Current_Count += 1
    else:
        Current_Count=1
        Current=i
    # .get avoids a KeyError the first time a value is seen
    if Current_Count>Counts.get(i, 0):
        Counts[i]=Current_Count
print(Counts)
If you want it for just any element (i.e. the element with the most repetitions), you could use:
from functools import reduce  # reduce is not a builtin in Python 3
def f(state, x):
    v, l, m = state  # last value, current run length, longest run so far
    nl = l+1 if x==v else 1
    return (x, nl, max(m,nl))
maxrep = reduce(f, l, (0,0,0))[2]
This only counts continuous repetitions (the result for [1,2,2,2,1,2] would be 3) and only records the result for the element with the maximum number of repetitions.
Edit: Made definition of f a bit shorter ...
This is my solution:
def longest_repetition(l):
    if l == []:
        return None
    element = l[0]
    new = []
    lar = []
    for e in l:
        if e == element:
            new.append(e)
        else:
            if len(new) > len(lar):
                lar = new
            new = []
            new.append(e)
            element = e
    if len(new) > len(lar):
        lar = new
    return lar[0]
- You can make a new copy of the list that holds only the unique values, together with a corresponding list of hits.
- Then take the max of the hits list and use its index to get the most repeated item.
oldlist = ["A", "B", "E", "C","A", "C","D","A", "E"]
newlist=[]
hits=[]
for i in range(len(oldlist)):
    if oldlist[i] in newlist:
        hits[newlist.index(oldlist[i])] += 1
    else:
        newlist.append(oldlist[i])
        hits.append(1)
#find the most repeated item
temp_max_hits=max(hits)
temp_max_hits_index=hits.index(temp_max_hits)
print(newlist[temp_max_hits_index])
print(temp_max_hits)
But I don't know whether this is the fastest way to do it or whether there are faster solutions.
If you think there is a faster or more efficient solution, kindly let us know.
I'd use a hashmap of item to counter.
Every time you see a 'key' succession, increment its counter value. If you hit a new element, set the counter to 1 and keep going. At the end of this linear search, you should have the maximum succession count for each number.
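The idea described above can be sketched as follows. The function name longest_runs and the sentinel object are illustrative choices, not taken from the answer.

```python
def longest_runs(items):
    """Return {value: length of its longest consecutive run}."""
    best = {}
    previous = object()  # sentinel that compares unequal to every list element
    run = 0
    for item in items:
        # Extend the current run, or start a new run of length 1.
        run = run + 1 if item == previous else 1
        previous = item
        # Keep the longest run seen so far for this value.
        if run > best.get(item, 0):
            best[item] = run
    return best

print(longest_runs([1, 2, 2, 2, 2, 1, 1, 1, 2, 2, 1, 1]))  # {1: 3, 2: 4}
```

This is a single linear pass, and max(best.values()) then gives the greatest number of consecutive repeats for any element.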
This code seems to work:
l = [1, 2, 2, 2, 2, 1, 1, 1, 2, 2, 1, 1]
previous = None
# value/repetition pair
greatest = (-1, -1)
# start at 0 so the initial None placeholder never ties with a real run of length 1
reps = 0
for e in l:
    if e == previous:
        reps += 1
    else:
        if reps > greatest[1]:
            greatest = (previous, reps)
        previous = e
        reps = 1
if reps > greatest[1]:
    greatest = (previous, reps)
print(greatest)
I wrote this code and it works easily:
lst = [4,7,2,7,7,7,3,12,57]
maximum=0
for i in lst:
    count = lst.count(i)
    if count>maximum:
        maximum=count
        indexx = lst.index(i)
print(lst[indexx])
