Python count list items through a function [duplicate]

In short, I am making a function that takes two arguments, 'sequence' and 'item'. The 'sequence' can be anything: an integer, a string, or a list containing integers or strings. The function should count the number of times 'item' occurs in 'sequence'. Please take into account that I am still a newbie at Python, so a simple answer would be very much appreciated.
This is what I have so far:
def count(sequence, item):
    found = 0
    if sequence == item:
        found += 1
        return found
    else:
        for num in sequence:
            if num == sequence:
                found += 1
                return found
            else:
                return False
print count([4,'foo',5,'hi'], 5)
The else part of the code is meant to be enabled if the sequence is something like a list. I was thinking I should loop through the list using for and do the same thing - but it's not working as it keeps returning False which follows the second 'else' statement. Any idea how I can do this? For clarification, the output in the example above should be 1 because '5' occurs once in the list.

len([i for i in sequence if i == item])

EDIT:
Changed return to return the number of occurrences instead of True/False
Your loop returns on its very first iteration: the first element is compared, and because it doesn't match, the function returns False right away. That is why you always get False.
Let the loop increment found, and return found only once the loop is done. A simple example:
def count(sequence, item):
    found = 0
    if sequence == item:
        return 1
    for num in sequence:
        if num == item:
            found += 1
    return found
If you have already learned comprehensions, you can use a generator expression with sum as follows:
def count(sequence, item):
    if sequence == item:
        return 1
    return sum(1 for x in sequence if x == item)
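To see why the placement of return matters, here is a minimal sketch that contrasts returning inside the loop with returning after it. The helper names count_early_return and count_fixed are made up for this illustration, and the integer case (sequence == item) is left out to keep it short.

```python
def count_early_return(sequence, item):
    """Buggy pattern: both return statements sit inside the loop."""
    found = 0
    for num in sequence:
        if num == item:
            found += 1
            return found
        else:
            return False  # reached on the first non-matching element
    return found


def count_fixed(sequence, item):
    """Return only after every element has been inspected."""
    found = 0
    for num in sequence:
        if num == item:
            found += 1
    return found


data = [4, 'foo', 5, 'hi']
print(count_early_return(data, 5))  # False: 4 != 5 ends the loop at once
print(count_fixed(data, 5))         # 1
```

The only difference between the two functions is where the return statements sit.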

If we use your code as a basis, we get the following working version.
You told it to loop over sequence and check whether num is equal to sequence, but one element of a list is never equal to the whole list. To fix this, we use an index: if item is equal to sequence[num], where num is now an index, we do found += 1. You also had the return statement inside the for loop, where it ends the loop on the first iteration; it belongs after the loop. I hope this helps.
def count(sequence, item):
    found = 0
    if sequence == item:
        found += 1
        return found
    else:
        for num in range(len(sequence)):
            if item == sequence[num]:
                found += 1
        if found > 0:
            return found
        else:
            return False

You can use a recursive function to repeat calls on the count function when the first argument is a list, or use a simple == when it is not. This can equally handle nested lists by walking through the nesting recursively:
def count(sequence, item):
    found = 0
    if isinstance(sequence, list):
        for x in sequence:
            found += count(x, item)
    else:
        return int(item == sequence)
    return found
print(count([4,'foo',5,'hi', [5]], 5))
# 2
print(count(5, 5))
# 1

I'd use collections.Sequence (or collections.abc.Sequence in Python 3) to determine if the items are sequences (lists, tuples, etc). If you just want to know if the item is in the list, which is what your code seems to be doing in spite of the name count:
from collections import Sequence
# In Python 3 it would be
# from collections.abc import Sequence
def count(sequence, item):
    for i in sequence:
        if item == i or (isinstance(i, Sequence) and item in i):
            return True
    return False
If you want to actually count the number of appearances of item in sequence:
from collections import Sequence
# In Python 3 it would be
# from collections.abc import Sequence
def count(sequence, item):
    found = 0
    for i in sequence:
        if item == i or (isinstance(i, Sequence) and item in i):
            found += 1
    return found
There could be the possibility that a list within sequence contains in turn deeper sublists, and so on, in which case you may want a recursive algorithm. For finding if the element is contained you could go with:
from collections import Sequence
# In Python 3 it would be
# from collections.abc import Sequence
def count(sequence, item):
    for i in sequence:
        if item == i or (isinstance(i, Sequence) and count(i, item)):
            return True
    return False
For actually counting:
from collections import Sequence
# In Python 3 it would be
# from collections.abc import Sequence
def count(sequence, item):
    found = 0
    for i in sequence:
        if item == i:
            found += 1
        elif isinstance(i, Sequence) and item in i:
            found += count(i, item)
    return found
P.S.: Note that strings are considered sequences, and therefore "b" in "abc" and "b" in ["a", "b", "c"] are both true. If you don't want "b" to be considered an element of "abc" (i.e. you want to treat strings atomically, like numbers), you'd have to tune the code a bit. You could replace isinstance(i, Sequence) with isinstance(i, (list, tuple)), or with isinstance(i, Sequence) and not isinstance(i, basestring) (or, in Python 3, isinstance(i, Sequence) and not isinstance(i, (str, bytes))).
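As a sketch of that last variant, assuming Python 3, the function below recurses into every nested sequence but treats str and bytes as atomic values. The name count_nested is made up for this example. Excluding strings also keeps the recursion finite, because iterating over a one-character string yields that same string again.

```python
from collections.abc import Sequence


def count_nested(sequence, item):
    """Count item at any nesting depth, treating str/bytes as atomic values."""
    found = 0
    for element in sequence:
        if element == item:
            found += 1
        elif isinstance(element, Sequence) and not isinstance(element, (str, bytes)):
            # Recurse into lists, tuples, etc., but never into strings.
            found += count_nested(element, item)
    return found


print(count_nested([4, 'foo', 5, ['b', [5, 'abc']]], 5))  # 2
print(count_nested(['abc', ['b', 'abc']], 'b'))           # 1
```

In the second call, 'b' is counted once as a list element and not as a character of 'abc'.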

Related

python, printing longest length of string in a list

My question is to write a function which returns the longest string and ignores any non-strings, and if there are no strings in the input list, then it should return None.
my answer:
def longest_string(x):
    for i in max(x, key=len):
        if not type(i)==str:
            continue
        if
    return max
longest_string(['cat', 'dog', 'horse'])
I'm a beginner so I have no idea where to start. Apologies if this is quite simple.
This is how I would do it:
def longest_string(x):
    strings = [i for i in x if isinstance(i, str)]
    return max(strings, key=len) if strings else None
Based on your code:
def longest_string(x):
    l = 0
    r = None
    for s in x:
        if isinstance(s, str) and len(s) > l:
            l = len(s)
            r = s
    return r
print(longest_string([None, 'cat', 1, 'dog', 'horse']))
# horse
def longest_string(items):
    try:
        return max([x for x in items if isinstance(x, str)], key=len)
    except ValueError:
        return None
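def longest_string(items):
    # Build a list, not a generator: a generator object is always truthy,
    # so "if strings" would never detect the case of no strings.
    strings = [s for s in items if isinstance(s, str)]
    longest = max(strings, key=len) if strings else None
    return longest
print(longest_string(['cat', 'dog', 'horse']))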
Your syntax is wrong (second-to-last line: if with no condition) and you are returning max which you did not define manually. In actuality, max is a built-in Python function which you called a few lines above.
In addition, you are not looping through all strings, you are looping through the longest string. Your code should instead be
def longest_string(l):
    strings = [item for item in l if type(item) == str]
    if len(strings):
        return max(strings, key=len)
    return None
You're on the right track. You could iterate over the list and check whether each item is the longest so far:
def longest_string(x):
    # Handle the case of an empty list
    if len(x) == 0:
        return None
    current_longest = ""
    # Iterate over the items
    for i in x:
        # Skip non-strings
        if type(i) != str:
            continue
        # If the current string is longer than the longest so far, replace it.
        if len(i) > len(current_longest):
            current_longest = i
    # This condition handles lists in which no element is a string and returns None.
    if len(current_longest) > 0:
        return current_longest
    else:
        return None
Since you are a beginner, I recommend that you start by using Python's built-in functions to filter and manage lists. They are the best option when it comes to logic and leave less room for bugs.
def longest_string(x):
    x = filter(lambda obj: isinstance(obj, str), x)
    longest = max(list(x), key=lambda obj: len(obj), default=None)
    return longest
Nonetheless, you were on the right track. Just avoid using the names of Python's built-ins (such as max, type, list, etc.) as variable names.
EDIT: I see a lot of answers using one-liner conditionals, list comprehension, etc. I think those are fantastic solutions, but for the level of programming the OP is at, my answer attempts to document each step of the process and be as readable as possible.
First of all, I would highly suggest defining the type of the x argument in your function.
For example; since I see you are passing a list, you can define the type like so:
def longest_string(x: list):
    ....
This not only makes it more readable for potential collaborators but helps enormously when creating docstrings and/or combined with using an IDE that shows type hints when writing functions.
Next, I highly suggest you break down your "specs" into some pseudocode, which is enormously helpful for taking things one step at a time:
returns the longest string
ignores any non-strings
if there are no strings in the input list, then it should return None.
So to elaborate on those "specifications" further, we can write:
Return the longest string from a list.
Ignore any element from the input arg x that is not of type str
if no string is present in the list, return None
From here we can proceed to writing the function.
def longest_string(x: list):
    # Immediately verify the input is the expected type. if not, return None (or raise Exception)
    if type(x) != list:
        return None  # input should always be a list
    # create an empty list to add all strings to
    str_list = []
    # Loop through list
    for element in x:
        # check type. if not string, skip to the next element
        # (continue, not pass: pass would fall through and append the non-string)
        if type(element) != str:
            continue
        # at this point in our loop the element has passed our type check, and is a string.
        # add the element to our str_list
        str_list.append(element)
    # we should now have a list of strings
    # however we should handle an edge case where a list is passed to the function that contains no strings at all, which would mean we now have an empty str_list. let's check that
    if not str_list:  # an empty list evaluates to False. if not str_list is basically saying "if str_list is empty"
        return None
    # if the program has not hit one of the return statements yet, we should now have a list of strings (or at least 1 string). you can check with a simple print statement (eg. print(str_list), print(len(str_list)) )
    # now we can check for the longest string
    # we can use the max() function for this operation
    longest_string = max(str_list, key=len)
    # return the longest string!
    return longest_string
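The three specifications listed earlier can also be turned into quick checks. The sketch below is a compact variant, not the step-by-step version above. It assumes Python 3.4 or later, where max() accepts a default argument, and the sample inputs are invented for illustration.

```python
def longest_string(items):
    """Return the longest str in items, or None if there is none."""
    strings = [element for element in items if isinstance(element, str)]
    # default=None covers the "no strings at all" case without a separate check.
    return max(strings, key=len, default=None)


# Spec 1: return the longest string
assert longest_string(['cat', 'dog', 'horse']) == 'horse'
# Spec 2: ignore any non-strings, even if they are "longer"
assert longest_string([None, 'cat', 12345, ['a', 'b', 'c', 'd']]) == 'cat'
# Spec 3: return None when there are no strings
assert longest_string([1, 2.5, None]) is None
assert longest_string([]) is None
```

Writing one assertion per specification makes it easy to see which requirement a change breaks.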

Loop through int array, return True if following int is equal to current

Given a list of ints, return True if the array contains a 3 next to a 3 somewhere.
has_33([1, 3, 3]) → True
has_33([1, 3, 1, 3]) → False
has_33([3, 1, 3]) → False
First approach:
def has_33(nums):
    for i in range(0,len(nums)):
        return nums[i] == nums[i+1] ==3
Could someone explain to me what's wrong with this approach? I see that this code returns True only if all the elements in the list are true.
Second Approach:
def has_33(nums):
    for i in range(0,len(nums)):
        if(nums[i] == nums[i+1] ==3):
            return True
The second approach satisfies my question.
What is the difference between these two approaches?
Well, the difference is rather obvious. In the first case, you unconditionally return the result of the expression nums[i] == nums[i+1] ==3, whatever its value is. This means that you always return on the very first iteration, so your code could just as well be written as
def has_33(nums):
    if len(nums):
        return nums[0] == nums[1] ==3
In the second case, you only return if the expression is true, so the iteration goes on until either you explicitly return (a match was found) or the iteration terminates naturally without finding anything (in which case the function implicitly returns None).
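As a sketch of that point, the loop below keeps the conditional return of the second approach and adds two things the question's code does not have. The range stops one short of the end, so nums[i + 1] is always a valid index, and an explicit return False replaces the implicit None. Both changes are additions for illustration, not part of the original code.

```python
def has_33(nums):
    # Stop at len(nums) - 1 so that nums[i + 1] is always a valid index.
    for i in range(len(nums) - 1):
        if nums[i] == nums[i + 1] == 3:
            return True   # leave the function as soon as a pair is found
    return False          # the loop finished without finding a pair


print(has_33([1, 3, 3]))     # True
print(has_33([1, 3, 1, 3]))  # False
print(has_33([3, 1, 3]))     # False
```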
Unrelated, but your code (second version) can be improved in quite a few ways. First point: Python "for" loops are of the "foreach" kind: you iterate over the sequence's elements, not its indices. If you don't need the index, the proper way is
for item in iterable:
    do_something_with(item)
There is no need for range(len(xxx)) and indexed access here.
If you do need both the item and the index, then enumerate() is your friend; it yields (index, item) tuples:
for index, item in enumerate(sequence):
    print("item at {} is {}".format(index, item))
Now for your current need, getting (item, nextitem) pairs, there's yet another solution: zip(seq1, seq2) plus slicing:
for item, nextitem in zip(sequence, sequence[1:]):
    print("item: {} - nextitem : {}".format(item, nextitem))
And finally, if what you want is to check whether at least one item in a sequence satisfies a condition, you can use any() with a predicate:
def has_33(nums):
    return any((item == nextitem == 3) for item, nextitem in zip(nums, nums[1:]))
Another solution could be to turn nums into a string and look for the literal string "33" in it:
def has_33(nums):
    return "33" in "".join(str(x) for x in nums)
but I'm not sure this will be more efficient (you can use timeit to find out by yourself).
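A rough way to compare the two variants is sketched below. The function names has_33_zip and has_33_str and the sample list are made up for this example, and the timings depend on the machine and Python version. The sketch also shows that the string version is only equivalent for single-digit numbers, since the joined digits of two different numbers can form "33".

```python
import timeit


def has_33_zip(nums):
    return any(item == nextitem == 3 for item, nextitem in zip(nums, nums[1:]))


def has_33_str(nums):
    return "33" in "".join(str(x) for x in nums)


# The two versions agree on single-digit input ...
print(has_33_zip([1, 3, 3]), has_33_str([1, 3, 3]))  # True True
# ... but the string version reports a match that is not there once
# multi-digit numbers are involved ("13" + "31" == "1331").
print(has_33_zip([13, 31]), has_33_str([13, 31]))    # False True

# Time both on a list whose only 3, 3 pair sits at the very end.
sample = [1, 2, 3, 4] * 250 + [3, 3]
for func in (has_33_zip, has_33_str):
    seconds = timeit.timeit(lambda: func(sample), number=1000)
    print(func.__name__, seconds)
```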
In your first approach, you return the value of
return nums[i] == nums[i+1] == 3  # where i = 0, since it returns on the first iteration
return nums[0] == nums[1] == 3  # if nums = [0, 3, 3]
return False  # would be your result, but it would never check the next pair of values
In your second approach, you return
return True  # if the if-statement is satisfied
The return statement ends the function call when it is executed. Therefore, if it is reached in a for loop without an if-statement, it is executed on the first iteration. If there is an if-statement and its condition is satisfied on some iteration, the function returns and the loop ends at that iteration. Basically, return ends the function call and hands back the given value.

How to fix this function please?

I need to code a function that takes as input a list of tuples, the number of tuples, and two numbers. This function should return True if the two given numbers exist in one of the tuples in our list.
For example : ma_fonction(l,n,i,j)
l: list of tuples
i and j two numbers between 0 and n-1 and i != j
I tried this code:
def ma_fonction(l,n,i,j):
    checker = False
    for item in l :
        if ( i in item and j in item ):
            cheker = True
            return True
        else:
            return False
ma_fonction([(0,1), (5,2), (4,3)], 6, 2, 5)
But it doesn't work. What should I add or change?
I tried this (somehow I didn't copy all my work into my question).
This is my work:
def ma_fonction(l,n,i,j):
    checker = False
    if((i!=j )and (0<=i<=n-1 )and( 0<=j<=n-1) ):
        for item in l :
            if ( i in item and j in item ):
                cheker=True
                return True
            else:
                return False
Change your function to this:
def foo(l,n,i,j):
    for item in l:
        if (i in item and j in item):
            return True
    return False
You go over all tuples, and if i and j are in the tuple, you return True.
If it went over all tuples and didn't find a match, we can be sure that we can return False.
And this implementation really doesn't need the parameter n
The logic is: for each tuple in the list, check whether it contains both i and j. If yes, return True; if not, continue checking the next tuple. Once every tuple has been checked and none of them satisfies the condition, return False.
def ma_fonction(l,n,i,j):
    for item in l :
        if ( i in item and j in item ):
            return True
    return False
This should work:
def ma_fonction(l,n,i,j):
    for item in l :
        if ( i in item and j in item ):
            return True
    return False
The reason your solution is not working is because you are returning False at the first element of the list that doesn't match with i and j. What you need to do is return False if and only if you looked in all the elements of the list and you couldn't find a match.
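The same "return False only after looking at everything" logic can be written with any(). The sketch below also checks the constraints stated in the question (i != j, both between 0 and n-1). Raising ValueError for invalid arguments is an assumption made for this example; the question does not say what should happen in that case.

```python
def ma_fonction(l, n, i, j):
    """Return True if some tuple in l contains both i and j."""
    # Optional validation of the constraints from the question.
    # Raising ValueError here is an assumption, not a stated requirement.
    if i == j or not (0 <= i < n and 0 <= j < n):
        raise ValueError("i and j must be distinct and lie in 0..n-1")
    # any() stops at the first matching tuple and yields False if none matches.
    return any(i in item and j in item for item in l)


pairs = [(0, 1), (5, 2), (4, 3)]
print(ma_fonction(pairs, 6, 2, 5))  # True: (5, 2) contains both
print(ma_fonction(pairs, 6, 0, 3))  # False: 0 and 3 never share a tuple
```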

Seeking elements in nested lists

I am trying to write a function that is able to find an element in a nested list.
This is what I've got so far:
def in_list(ele, lst, place):
    if ele == lst[place]:
        return True
    else:
        for i in range(len(lst)):
            in_list(ele, lst[i], place)
This is what I input:
a=[[1,2],[3,4]]
if in_list(2,a,1)==True:
    print("True")
The variable "place" is the position in the list where the element should be found...
Now somehow it doesn't understand the line if ele == lst[place].
This is the error message: TypeError: 'int' object is not subscriptable
Thanks in advance
There are two issues in the last line:
def in_list(ele, lst, place):
    if ele == lst[place]:
        return True
    else:
        for i in range(len(lst)):
            in_list(ele, lst[i], place)
lst[i] is an integer (assuming lst is a list of integers), which is why you get your error.
The other issue is that you're not returning anything from the else branch.
Something like this might work better in case of arbitrary, but uniform, nesting:
from collections.abc import Iterable  # Python 3; on Python 2 use: from collections import Iterable
def recursive_contains(item, lst):
    if len(lst) == 0:
        return False
    elif isinstance(lst[0], Iterable):
        return any(recursive_contains(item, sublist) for sublist in lst)
    else:
        return item in lst
For arbitrary non-uniform nesting, perhaps something like this:
def recursive_contains(item, lst):
    if not isinstance(lst, Iterable):
        return item == lst
    for val in lst:
        if item == val:
            return True
        elif isinstance(val, Iterable):
            if recursive_contains(item, val):
                return True
    return False
Of course, if you only have two levels (all elements of lst are lists of int), you could simply say:
if ele in sum(lst, []):
    ...
which uses sum to flatten the list first.
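Note that sum(lst, []) creates a new, longer list for every sublist it adds, which gets slow when there are many sublists. A possible alternative for the same two-level case, sketched here with itertools.chain.from_iterable from the standard library (the function name in_nested is made up for this example):

```python
from itertools import chain


def in_nested(ele, lst):
    """Membership test for a list of lists (exactly two levels)."""
    # chain.from_iterable yields the inner elements lazily, one after another,
    # so no intermediate flattened list is built.
    return ele in chain.from_iterable(lst)


a = [[1, 2], [3, 4]]
print(in_nested(2, a))  # True
print(in_nested(5, a))  # False
```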
The other answers explain the mistake in your code well.
Just to reiterate: you were assuming that each element in the list is itself a nested list and subscripting it, as in elem[place].
You can't subscript a primitive type such as an integer, hence the error.
Refer to the code below to handle nesting.
Note: you don't need the third parameter, place; if you are just searching for an element, you wouldn't need the place.
def fetch(data, l):
    for element in l:
        if type(element) == list:
            if fetch(data, element):
                return True
        else:
            if element == data:
                return True
    return False
On further thought you are looking for an element that should be only at "place" index of any of the nested lists.
Refer to the snippet below for that-
def fetch(data, l,place):
    if data == l[place]:
        return True
    else:
        for element in l:
            if type(element) == list and fetch(data,element,place):
                return True
    return False
Note- Only call fetch again if the element is a list.
a = [[1, 2], [3, 4]]
def inlist(e, l, p):
    for lists in range(len(l)):
        print("array:", l[lists])
        print("Looking for", e)
        print("In position", p)
        print(l[lists][p])
        if l[lists][p] == e:
            print(True, ": The element ", e, " is in the list n.", lists, " at the place ", p)
            return True
inlist(2, a, 1)
Output
array: [1, 2]
Looking for 2
In position 1
2
True : The element 2 is in the list n. 0 at the place 1

Count the number of times an item occurs in a sequence using recursion Python

I'm trying to count the number of times an item occurs in a sequence, whether it's a list of numbers or a string. It works fine for numbers, but I get an error when trying to find a letter like "i" in a string:
def Count(f,s):
    if s == []:
        return 0
    while len(s) != 0:
        if f == s[0]:
            return 1 + Count(f,s[1:])
        else:
            return 0 + Count(f,s[1:])
TypeError: unsupported operand type(s) for +: 'int' and 'NoneType'
There's a far more idiomatic way to do it than using recursion: use the built-in count method to count occurrences.
def count(str, item):
    return str.count(item)
>>> count("122333444455555", "4")
4
However, if you want to do it with iteration, you can apply a similar principle. Convert it to a list, then iterate over the list.
def count(str, item):
    count = 0
    for character in list(str):
        if character == item:
            count += 1
    return count
The problem is your first if, which explicitly checks if the input is an empty list:
if s == []:
    return 0
If you want it to work with strs and lists you should simply use:
if not s:
    return 0
In short any empty sequence is considered false according to the truth value testing in Python and any not-empty sequence is considered true. If you want to know more about it I added a link to the relevant documentation.
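A small sketch of that truth-value behaviour, with a hypothetical helper is_empty and a handful of sample values chosen for illustration:

```python
def is_empty(sequence):
    """True for any empty sequence, whatever its type."""
    return not sequence


# Print each value, whether "not value" treats it as empty,
# and whether it compares equal to the empty list.
for value in ([], '', (), [0], 'a', (None,)):
    print(repr(value), is_empty(value), value == [])
```

The last column shows why the original check missed strings: '' == [] is False, even though both are empty.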
You can also omit the while loop, because it always returns in the first iteration and therefore never actually loops.
So the result would be something along these lines:
def count(f, s):
    if not s:
        return 0
    elif f == s[0]:
        return 1 + count(f, s[1:])
    else:
        return 0 + count(f, s[1:])
Example:
>>> count('i', 'what is it')
2
In case you're not only interested in making it work but also interested in making it better there are several possibilities.
Booleans subclass from integers
In Python booleans are just integers, so they behave like integers when you do arithmetic:
>>> True + 0
1
>>> True + 1
2
>>> False + 0
0
>>> False + 1
1
So you can easily inline the if else:
def count(f, s):
    if not s:
        return 0
    return (f == s[0]) + count(f, s[1:])
Because f == s[0] returns True (which behaves like a 1) if they are equal or False (which behaves like a 0) if they aren't. The parentheses are not necessary, but I added them for clarity. And because the base case always returns an integer, this function itself will always return an integer.
Avoiding copies in the recursive approach
Your approach will create a lot of copies of the input because of the:
s[1:]
This creates a shallow copy of the whole list (or string, ...) except for the first element. That means you actually have an operation that uses O(n) (where n is the number of elements) time and memory in every function call and because you do this recursively the time and memory complexity will be O(n**2).
You can avoid these copies, for example, by passing the index in:
def _count_internal(needle, haystack, current_index):
    length = len(haystack)
    if current_index >= length:
        return 0
    found = haystack[current_index] == needle
    return found + _count_internal(needle, haystack, current_index + 1)
def count(needle, haystack):
    return _count_internal(needle, haystack, 0)
Because I needed to pass in the current index I added another function that takes the index (I assume you probably don't want the index to be passed in in your public function) but if you wanted you could make it an optional argument:
def count(needle, haystack, current_index=0):
    length = len(haystack)
    if current_index >= length:
        return 0
    return (haystack[current_index] == needle) + count(needle, haystack, current_index + 1)
However there is probably an even better way. You could convert the sequence to an iterator and use that internally, at the start of the function you pop the next element from the iterator and if there is no element you end the recursion, otherwise you compare the element and then recurse into the remaining iterator:
def count(needle, haystack):
    # Convert it to an iterator, if it already
    # is an (well-behaved) iterator this is a no-op.
    haystack = iter(haystack)
    # Try to get the next item from the iterator
    try:
        item = next(haystack)
    except StopIteration:
        # No element remained
        return 0
    return (item == needle) + count(needle, haystack)
Of course you could also use an internal method if you want to avoid the iter call overhead that is only necessary the first time the function is called. However that's a micro-optimization that may not result in noticeably faster execution:
def _count_internal(needle, haystack):
    try:
        item = next(haystack)
    except StopIteration:
        return 0
    return (item == needle) + _count_internal(needle, haystack)
def count(needle, haystack):
    return _count_internal(needle, iter(haystack))
Both of these approaches have the advantage that they don't use (much) additional memory and can avoid the copies. So it should be faster and take less memory.
However, for long sequences you will run into problems because of the recursion. Python has a recursion limit (which is adjustable, but only to some extent):
>>> count('a', 'a'*10000)
---------------------------------------------------------------------------
RecursionError Traceback (most recent call last)
<ipython-input-9-098dac093433> in <module>()
----> 1 count('a', 'a'*10000)
<ipython-input-5-5eb7a3fe48e8> in count(needle, haystack)
11 else:
12 add = 0
---> 13 return add + count(needle, haystack)
... last 1 frames repeated, from the frame below ...
<ipython-input-5-5eb7a3fe48e8> in count(needle, haystack)
11 else:
12 add = 0
---> 13 return add + count(needle, haystack)
RecursionError: maximum recursion depth exceeded in comparison
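The current limit can be inspected with sys.getrecursionlimit() and changed with sys.setrecursionlimit(). The sketch below leaves the limit alone and instead catches the RecursionError. The wrapper safe_count and its fallback to a plain sum are assumptions made for this illustration, and the fallback only works because the haystack here is a string that can be iterated a second time.

```python
import sys


def count(needle, haystack):
    haystack = iter(haystack)
    try:
        item = next(haystack)
    except StopIteration:
        return 0
    return (item == needle) + count(needle, haystack)


def safe_count(needle, haystack):
    """Hypothetical wrapper: fall back to a loop if recursion runs too deep."""
    try:
        return count(needle, haystack)
    except RecursionError:
        # Re-iterates haystack, so it must not be a one-shot iterator.
        return sum(needle == item for item in haystack)


print(sys.getrecursionlimit())       # the limit in effect on this interpreter
print(safe_count('a', 'a' * 10))     # 10, recursion is shallow enough
print(safe_count('a', 'a' * 5000))   # 5000, assuming the limit is below 5000 so the fallback runs
```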
Recursion using divide-and-conquer
There are ways to mitigate that problem (you cannot fully solve the recursion depth problem as long as you use recursion). An approach used regularly is divide-and-conquer. It basically means you divide whatever sequence you have into two (sometimes more) parts and call the function with each of these parts. The recursion still ends when only one item remains:
def count(needle, haystack):
    length = len(haystack)
    # No item
    if length == 0:
        return 0
    # Only one item remained
    if length == 1:
        # I used the long version here to avoid returning True/False for
        # length-1 sequences
        if needle == haystack[0]:
            return 1
        else:
            return 0
    # More than one item, split the sequence in
    # two parts and recurse on each of them
    mid = length // 2
    return count(needle, haystack[:mid]) + count(needle, haystack[mid:])
The recursion depth has now changed from n to log(n), which makes it possible to make the call that previously failed:
>>> count('a', 'a'*10000)
10000
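To make the logarithmic depth visible, the sketch below returns the deepest recursion level alongside the count. The function count_with_depth is a variation written for this illustration, not part of the approach above.

```python
def count_with_depth(needle, haystack):
    """Return (occurrences, deepest recursion level) for divide-and-conquer."""
    length = len(haystack)
    if length == 0:
        return 0, 1
    if length == 1:
        return int(needle == haystack[0]), 1
    mid = length // 2
    left_count, left_depth = count_with_depth(needle, haystack[:mid])
    right_count, right_depth = count_with_depth(needle, haystack[mid:])
    # This call adds one level on top of the deeper of the two halves.
    return left_count + right_count, 1 + max(left_depth, right_depth)


occurrences, depth = count_with_depth('a', 'a' * 10000)
print(occurrences, depth)  # 10000 15
```

A depth of 15 for 10000 items is far below the default recursion limit, whereas the one-element-at-a-time version needs roughly one level per item.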
However because I used slicing it will again create lots of copies. Using iterators will be complicated (or impossible) because iterators don't have a size (generally) but it's easy to use indices:
def _count_internal(needle, haystack, start_index, end_index):
    length = end_index - start_index
    if length == 0:
        return 0
    if length == 1:
        if needle == haystack[start_index]:
            return 1
        else:
            return 0
    mid = start_index + length // 2
    res1 = _count_internal(needle, haystack, start_index, mid)
    res2 = _count_internal(needle, haystack, mid, end_index)
    return res1 + res2
def count(needle, haystack):
    return _count_internal(needle, haystack, 0, len(haystack))
Using built-in methods with recursion
It may seem stupid to use built-in methods (or functions) in this case because there is already a built-in method to solve the problem without recursion but here it is and it uses the index method that both strings and lists have:
def count(needle, haystack):
    try:
        next_index = haystack.index(needle)
    except ValueError:  # the needle isn't present
        return 0
    return 1 + count(needle, haystack[next_index+1:])
Using iteration instead of recursion
Recursion is really powerful, but in Python you have to fight against the recursion limit, and because there is no tail call optimization in Python it is often rather slow. This can be solved by using iteration instead of recursion:
def count(needle, haystack):
    found = 0
    for item in haystack:
        if needle == item:
            found += 1
    return found
Iterative approaches using built-ins
If you're more adventurous, you can also use a generator expression together with sum:
def count(needle, haystack):
    return sum(needle == item for item in haystack)
Again this relies on the fact that booleans behave like integers and so sum adds all the occurrences (ones) with all non-occurrences (zeros) and thus gives the number of total counts.
But if one is already using built-ins it would be a shame not to mention the built-in method (that both strings and lists have): count:
def count(needle, haystack):
    return haystack.count(needle)
At that point you probably don't need to wrap it inside a function anymore and could simply use just the method directly.
In case you even want to go further and count all elements you can use the Counter in the built-in collections module:
>>> from collections import Counter
>>> Counter('abcdab')
Counter({'a': 2, 'b': 2, 'c': 1, 'd': 1})
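One detail worth knowing when reading counts back out of a Counter: indexing a missing element gives 0 instead of raising KeyError, while .get() behaves like the plain dict method. A short sketch:

```python
from collections import Counter

counts = Counter('abcdab')
print(counts['a'])      # 2
print(counts['z'])      # 0, a missing key does not raise KeyError
print(counts.get('z'))  # None, dict.get bypasses Counter's default of 0
```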
Performance
I often mentioned copies and their effect on memory and performance and I actually wanted to present some quantitative results to show that it actually makes a difference.
I used a fun project of mine, simple_benchmark, here (it's a third-party package, so if you want to run it you have to install it):
def count_original(f, s):
    if not s:
        return 0
    elif f == s[0]:
        return 1 + count_original(f, s[1:])
    else:
        return 0 + count_original(f, s[1:])
def _count_index_internal(needle, haystack, current_index):
    length = len(haystack)
    if current_index >= length:
        return 0
    found = haystack[current_index] == needle
    return found + _count_index_internal(needle, haystack, current_index + 1)
def count_index(needle, haystack):
    return _count_index_internal(needle, haystack, 0)
def _count_iterator_internal(needle, haystack):
    try:
        item = next(haystack)
    except StopIteration:
        return 0
    return (item == needle) + _count_iterator_internal(needle, haystack)
def count_iterator(needle, haystack):
    return _count_iterator_internal(needle, iter(haystack))
def count_divide_conquer(needle, haystack):
    length = len(haystack)
    if length == 0:
        return 0
    if length == 1:
        if needle == haystack[0]:
            return 1
        else:
            return 0
    mid = length // 2
    return count_divide_conquer(needle, haystack[:mid]) + count_divide_conquer(needle, haystack[mid:])
def _count_divide_conquer_index_internal(needle, haystack, start_index, end_index):
    length = end_index - start_index
    if length == 0:
        return 0
    if length == 1:
        if needle == haystack[start_index]:
            return 1
        else:
            return 0
    mid = start_index + length // 2
    res1 = _count_divide_conquer_index_internal(needle, haystack, start_index, mid)
    res2 = _count_divide_conquer_index_internal(needle, haystack, mid, end_index)
    return res1 + res2
def count_divide_conquer_index(needle, haystack):
    return _count_divide_conquer_index_internal(needle, haystack, 0, len(haystack))
def count_index_method(needle, haystack):
    try:
        next_index = haystack.index(needle)
    except ValueError:  # the needle isn't present
        return 0
    return 1 + count_index_method(needle, haystack[next_index+1:])
def count_loop(needle, haystack):
    found = 0
    for item in haystack:
        if needle == item:
            found += 1
    return found
def count_sum(needle, haystack):
    return sum(needle == item for item in haystack)
def count_method(needle, haystack):
    return haystack.count(needle)
import random
import string
from functools import partial
from simple_benchmark import benchmark, MultiArgument
funcs = [count_divide_conquer, count_divide_conquer_index, count_index, count_index_method, count_iterator, count_loop,
         count_method, count_original, count_sum]
# Only recursive approaches without builtins
# funcs = [count_divide_conquer, count_divide_conquer_index, count_index, count_iterator, count_original]
arguments = {
    2**i: MultiArgument(('a', [random.choice(string.ascii_lowercase) for _ in range(2**i)]))
    for i in range(1, 12)
}
b = benchmark(funcs, arguments, 'size')
b.plot()
It's log-log scaled to display the range of values in a meaningful way and lower means faster.
One can clearly see that the original approach gets very slow for long inputs (because it copies the list it performs in O(n**2)) while the other approaches behave linearly. What may seem weird is that the divide-and-conquer approaches perform slower, but that is because these need more function calls (and function calls are expensive in Python). However they can process much longer inputs than the iterator and index variants before they hit the recursion limit.
It would be easy to change the divide-and-conquer approach so that it runs faster, a few possibilities that come to mind:
Switch to non-divide-and-conquer when the sequence is short.
Always process one element per function call and only divide the rest of the sequence.
But given that this is probably just an exercise in recursion that goes a bit beyond the scope.
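For completeness, the first of those possibilities could look roughly like the sketch below. The threshold parameter and its default of 64 are arbitrary choices for illustration, not tuned or benchmarked values.

```python
def count(needle, haystack, threshold=64):
    """Divide-and-conquer count that stops recursing on short sequences.

    threshold is an arbitrary cut-off chosen for illustration.
    """
    if len(haystack) <= threshold:
        # Short enough: count directly without further function calls.
        return sum(needle == item for item in haystack)
    mid = len(haystack) // 2
    return (count(needle, haystack[:mid], threshold)
            + count(needle, haystack[mid:], threshold))


print(count('a', 'banana' * 1000))  # 3000
```

The recursion depth stays logarithmic, but most of the work now happens in the direct count over short chunks.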
However they all perform much worse than using iterative approaches:
Especially using the count method of lists (but also the one of strings) and the manual iteration are much faster.
The error occurs because on some paths your function has no return value. Adding return 0 at the end of your function fixes this error. There are much better ways to do this in Python, but I assume this is just for practicing recursive programming.
You are doing things the hard way in my opinion.
You can use Counter from collections to do the same thing.
from collections import Counter
def count(f, s):
    if s is None:
        return 0
    return Counter(s)[f]
Counter returns a dict subclass that holds the counts of everything in your s object. Indexing it with [f] returns the count for the specific item you are searching for, or 0 if the item does not occur. This works on lists of numbers or a string.
If you're bound and determined to do it with recursion, whenever possible I strongly recommend halving the problem rather than whittling it down one-by-one. Halving allows you to deal with much larger cases without running into stack overflow.
def count(f, s):
    l = len(s)
    if l > 1:
        mid = l // 2  # integer division, so mid is a valid slice index in Python 3
        return count(f, s[:mid]) + count(f, s[mid:])
    elif l == 1 and s[0] == f:
        return 1
    return 0
