I'm trying to time several sorting algorithms to see which is fastest, but for each one I have to rewrite the bottom half of the code (under #####) with different variable names (and with bubblesort(mylist) instead of selectionsort(mylist), etc.). It's not the end of the world, but I can't help thinking it could be written much better. I know there are other timing options that may be better, but I've been told I have to use perf_counter.
import random
import time

def selectionsort(mylist):
    sortedlist=[]
    while len(mylist) > 0:
        lowest = mylist[0]
        for i in mylist:
            if i < lowest:
                lowest=i
        sortedlist.append(lowest)
        mylist.remove(lowest)
    return sortedlist

ivalues = [2,4,8,16,32,64,128,256,512,1024]
#####
sorttimelist = []
for i in range(1,11):
    mylist=[]
    for j in range(2**i):
        mylist.append(random.random())
    start_time=time.perf_counter()
    selectionsort(mylist)
    end_time=time.perf_counter()
    sorttime=end_time-start_time
    sorttimelist.append(sorttime)
You can use a for loop to go through the different functions. Functions are objects like any other value, so you can bind one to func in the for loop and call it as func(mylist):
#####
for func in [selectionsort, bubblesort]:
    for i in range(1,11):
        mylist=[]
        for j in range(2**i):
            mylist.append(random.random())
        start_time = time.perf_counter()
        func(mylist) # use func instead of selectionsort
        end_time = time.perf_counter()
        sorttime = end_time - start_time
        sorttimelist.append(sorttime)
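If you also want to keep the timings of each algorithm apart, one option is to wrap the loop in a helper that returns a dict keyed by function name. This is only a sketch: time_sorts and its exponents parameter are names made up for illustration, and the built-in sorted stands in for bubblesort, which the question does not show.

```python
import random
import time

def selectionsort(mylist):
    # Same idea as the question's function; note that it empties its input list.
    sortedlist = []
    while mylist:
        lowest = min(mylist)
        sortedlist.append(lowest)
        mylist.remove(lowest)
    return sortedlist

def time_sorts(funcs, exponents=range(1, 11)):
    """Return {function name: [seconds taken for each input size 2**i]}."""
    results = {}
    for func in funcs:
        times = []
        for i in exponents:
            # Fresh random data for every run, built outside the timed region.
            data = [random.random() for _ in range(2 ** i)]
            start = time.perf_counter()
            func(data)
            times.append(time.perf_counter() - start)
        results[func.__name__] = times
    return results

# sorted is only a placeholder for a second algorithm such as bubblesort.
timings = time_sorts([selectionsort, sorted])
```

Each entry of timings is then a list with one measurement per input size, which is convenient if you later want to plot one curve per algorithm.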
You can iterate over a collection of the functions you want to benchmark. For example:
def selection_sort(my_list):
    pass
def bubble_sort(my_list):
    pass
functions = [selection_sort, bubble_sort]
for func in functions:
    func(list_to_sort)
Use a decorator: put @timit above your function.
import time
def timit(func):
    '''
    A decorator that times how long the function takes to return. time.sleep(1) is added because some functions finish in well under a second and would otherwise be reported as 0 seconds.
    '''
    def inner(*args, **kwargs):
        start = float(time.time())
        time.sleep(1)
        test = func(*args, **kwargs)
        end = float(time.time())
        print(f'Function {func.__name__} took {end-start-1} seconds to complete')
        return test
    return inner
@timit
def bubble_sort(array):
    for last_idx in range(1,len(array)):
        is_sorted = True
        # after each pass the largest remaining element has reached its place
        for idx in range(len(array)-last_idx):
            if array[idx] > array[idx+1]:
                is_sorted = False
                # swap the two elements in place
                array[idx], array[idx+1] = array[idx+1], array[idx]
        if is_sorted is True:
            break
    return array
@timit
def selection_sort(array):
    for first_idx in range(len(array)-1):
        smallest = array[first_idx]
        for idx in range(first_idx+1, len(array)):
            if array[idx] < smallest:
                smallest = array[idx]
                # move the new smallest element to the front of the unsorted part
                array[idx], array[first_idx] = array[first_idx], array[idx]
    return array
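Since the question asks for perf_counter specifically, here is a sketch of the same decorator idea built on time.perf_counter, which has enough resolution that the sleep workaround should not be needed. The names timed, last_elapsed and sort_copy are made up for this example.

```python
import functools
import time

def timed(func):
    """Print how long each call to func takes, measured with perf_counter."""
    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def inner(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        # Hypothetical convenience attribute: remember the last measurement.
        inner.last_elapsed = elapsed
        print(f"{func.__name__} took {elapsed:.6f} seconds")
        return result
    return inner

@timed
def sort_copy(values):
    # Placeholder workload; any sorting function could be decorated instead.
    return sorted(values)

result = sort_copy([3, 1, 2])
```

One thing to keep in mind: a decorated recursive function would print once per recursive call, so this fits iterative sorts better than recursive ones.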
I'm taking a programming course and there's a question about Fibonacci sums and recursion.
The rules are as follows:
Write a function fibsum(N) that returns the sum of all even-valued Fibonacci terms that are less than N.
I think I've gotten close, but my summation isn't working properly. I'd also like the function to work for fairly large inputs (N = 10**6 at least). Here's my code so far:
def fibsum(n, memo = {}):
    added = 0
    if n<0:
        return 0
    if n== 1 or n == 0:
        return 1
    else:
        if (n-1) in memo.keys():
            f1 = memo[n-1]
        else:
            memo[n-1] = fibsum(n-1)
            f1 = memo[n-1]
        if (n-2) in memo.keys():
            f2 = memo[n-2]
        else:
            memo[n-2] = fibsum(n-2)
            f2 = memo[n-2]
        if f1+f2 < 44:
            if (f1+f2) % 2 == 0:
                added += f1+f2
                print ("look here",added)
                return added
        print (f1+f2)
        return f1 + f2
I've left some print statements in because I was trying to debug the problem, but I've had no luck.
Edit: I've been linked to another question, but it solves the problem iteratively; I would like to do it recursively if possible.
Memoization won't help you with large values for fib,
but as an aside, separate your logic:
def fib(n):
    """
    simple recursive fibonacci function
    """
    if n < 2:
        return 1
    return fib(n-1) + fib(n-2)
Then make a generic memoization decorator:
def memoize(fn):
    cache = {}
    def _tokenize(*args,**kwargs):
        return str(args)+str(kwargs)
    def __inner(*args,**kwargs):
        token = _tokenize(*args,**kwargs)
        if token not in cache:
            cache[token] = fn(*args,**kwargs)
        return cache[token]
    return __inner
Now just decorate your simple recursive function:
@memoize
def fib(n):
    """
    simple recursive fibonacci function
    """
    if n < 2:
        return 1
    return fib(n-1) + fib(n-2)
Now you can make your fibsum method (and also memoize it):
@memoize
def get_fib_nums(n):
    # list of fib(0) .. fib(n)
    if n == 0:
        return [fib(0)]
    return get_fib_nums(n-1) + [fib(n)]
@memoize
def fibevensum(n):
    # sums the even values among fib(0) .. fib(n)
    return sum(x for x in get_fib_nums(n) if x%2 == 0)
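For the exercise as stated (even terms whose value is less than N, rather than the first n terms), a recursive sketch could carry the current pair of Fibonacci terms along as arguments. The extra parameters a and b and the choice to start the sequence at 1, 2 are assumptions made for this example, not part of the original assignment.

```python
def fibsum(limit, a=1, b=2):
    """Sum of the even Fibonacci terms that are strictly less than limit."""
    if a >= limit:
        # Every later term is larger still, so the recursion can stop here.
        return 0
    current = a if a % 2 == 0 else 0
    # Slide the window one term forward: (a, b) -> (b, a + b).
    return current + fibsum(limit, b, a + b)

small = fibsum(44)        # even terms below 44 are 2, 8 and 34
large = fibsum(10 ** 6)
```

Because Fibonacci terms grow exponentially, only about 30 terms lie below 10**6, so the recursion stays very shallow even for large limits.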
I'm trying to count the number of times an item occurs in a sequence, whether it's a list of numbers or a string. It works fine for numbers, but I get an error when trying to find a letter like "i" in a string:
def Count(f,s):
    if s == []:
        return 0
    while len(s) != 0:
        if f == s[0]:
            return 1 + Count(f,s[1:])
        else:
            return 0 + Count(f,s[1:])
TypeError: unsupported operand type(s) for +: 'int' and 'NoneType'
There's a far more idiomatic way to do it than using recursion: use the built-in count method to count occurrences.
def count(str, item):
    return str.count(item)
>>> count("122333444455555", "4")
4
However, if you want to do it with iteration, you can apply a similar principle. Convert it to a list, then iterate over the list.
def count(str, item):
    count = 0
    for character in list(str):
        if character == item:
            count += 1
    return count
The problem is your first if, which explicitly checks if the input is an empty list:
if s == []:
    return 0
If you want it to work with strs and lists you should simply use:
if not s:
    return 0
In short, any empty sequence is considered false under Python's truth value testing, and any non-empty sequence is considered true. The Python documentation on truth value testing covers the details.
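A quick way to convince yourself of the difference between the two checks (is_empty is just a helper name invented for this illustration):

```python
def is_empty(seq):
    # Truth value testing: an empty sequence is falsy, so `not seq` is True.
    return not seq

# Works the same way for lists, strings and tuples.
results = [is_empty([]), is_empty(""), is_empty(()), is_empty([0]), is_empty("i")]

# The equality check from the question only matches an empty *list*.
empty_string_equals_empty_list = ("" == [])
```

Here results comes out as three True values followed by two False values, while the last comparison is False, which is why the string case slipped past the original base case.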
You can also omit the while loop: it always returns during its first iteration, so it never actually loops.
So the result would be something along these lines:
def count(f, s):
    if not s:
        return 0
    elif f == s[0]:
        return 1 + count(f, s[1:])
    else:
        return 0 + count(f, s[1:])
Example:
>>> count('i', 'what is it')
2
In case you're not only interested in making it work but also interested in making it better there are several possibilities.
Booleans subclass from integers
In Python booleans are just integers, so they behave like integers when you do arithmetic:
>>> True + 0
1
>>> True + 1
2
>>> False + 0
0
>>> False + 1
1
So you can easily inline the if else:
def count(f, s):
    if not s:
        return 0
    return (f == s[0]) + count(f, s[1:])
Because f == s[0] returns True (which behaves like a 1) if they are equal or False (which behaves like a 0) if they aren't. The parentheses are not necessary, but I added them for clarity. And because the base case always returns an integer, this function itself will always return an integer.
Avoiding copies in the recursive approach
Your approach will create a lot of copies of the input because of the:
s[1:]
This creates a shallow copy of the whole list (or string, ...) except for the first element. That means you actually have an operation that uses O(n) (where n is the number of elements) time and memory in every function call and because you do this recursively the time and memory complexity will be O(n**2).
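To make the quadratic cost visible, here is a small instrumented sketch of the slicing version that also adds up how many elements the slices copy. The function name and the extra copied parameter are invented for this illustration.

```python
def count_tracking_copies(f, s, copied=0):
    """Return (occurrences, total number of elements copied by slicing)."""
    if not s:
        return 0, copied
    rest = s[1:]  # copies len(s) - 1 elements
    found, copied = count_tracking_copies(f, rest, copied + len(rest))
    return (f == s[0]) + found, copied

occurrences, copied = count_tracking_copies('a', 'a' * 100)
```

For an input of length n the slices copy (n-1) + (n-2) + ... + 0 = n*(n-1)/2 elements in total, so 100 characters already cause 4950 element copies.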
You can avoid these copies, for example, by passing the index in:
def _count_internal(needle, haystack, current_index):
    length = len(haystack)
    if current_index >= length:
        return 0
    found = haystack[current_index] == needle
    return found + _count_internal(needle, haystack, current_index + 1)
def count(needle, haystack):
    return _count_internal(needle, haystack, 0)
Because I needed to pass in the current index I added another function that takes the index (I assume you probably don't want the index to be passed in in your public function) but if you wanted you could make it an optional argument:
def count(needle, haystack, current_index=0):
    length = len(haystack)
    if current_index >= length:
        return 0
    return (haystack[current_index] == needle) + count(needle, haystack, current_index + 1)
However, there is probably an even better way. You could convert the sequence to an iterator and use that internally: at the start of the function you pop the next element from the iterator, and if there is no element you end the recursion; otherwise you compare the element and then recurse into the remaining iterator:
def count(needle, haystack):
    # Convert it to an iterator, if it already
    # is a (well-behaved) iterator this is a no-op.
    haystack = iter(haystack)
    # Try to get the next item from the iterator
    try:
        item = next(haystack)
    except StopIteration:
        # No element remained
        return 0
    return (item == needle) + count(needle, haystack)
Of course you could also use an internal method if you want to avoid the iter call overhead that is only necessary the first time the function is called. However that's a micro-optimization that may not result in noticeably faster execution:
def _count_internal(needle, haystack):
    try:
        item = next(haystack)
    except StopIteration:
        return 0
    return (item == needle) + _count_internal(needle, haystack)
def count(needle, haystack):
    return _count_internal(needle, iter(haystack))
Both of these approaches have the advantage that they don't use (much) additional memory and can avoid the copies. So it should be faster and take less memory.
However, for long sequences you will run into problems because of the recursion. Python has a recursion limit (which is adjustable, but only to some extent):
>>> count('a', 'a'*10000)
---------------------------------------------------------------------------
RecursionError Traceback (most recent call last)
<ipython-input-9-098dac093433> in <module>()
----> 1 count('a', 'a'*10000)
<ipython-input-5-5eb7a3fe48e8> in count(needle, haystack)
11 else:
12 add = 0
---> 13 return add + count(needle, haystack)
... last 1 frames repeated, from the frame below ...
<ipython-input-5-5eb7a3fe48e8> in count(needle, haystack)
11 else:
12 add = 0
---> 13 return add + count(needle, haystack)
RecursionError: maximum recursion depth exceeded in comparison
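If you want to see where the limit sits on your own interpreter, the sys module exposes it. The sketch below only reads the limit and deliberately provokes the error with a trivial recursive helper (depth_probe is a made-up name); sys.setrecursionlimit also exists, but raising the limit too far can crash the interpreter instead of raising an exception.

```python
import sys

def depth_probe(n):
    # Recurse n levels deep without doing any other work.
    return 0 if n == 0 else 1 + depth_probe(n - 1)

limit = sys.getrecursionlimit()

try:
    # Ask for more frames than the interpreter currently allows.
    depth_probe(limit + 100)
    hit_limit = False
except RecursionError:
    hit_limit = True
```

Catching RecursionError like this is fine for a demonstration, but in real code it is usually better to restructure the algorithm than to rely on the exception.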
Recursion using divide-and-conquer
There are ways to mitigate that problem (you cannot fully solve the recursion depth problem as long as you use recursion). A commonly used approach is divide-and-conquer: you divide the sequence into 2 (sometimes more) parts and call the function on each of them. The recursion ends when only one item remains:
def count(needle, haystack):
    length = len(haystack)
    # No item
    if length == 0:
        return 0
    # Only one item remained
    if length == 1:
        # I used the long version here to avoid returning True/False for
        # length-1 sequences
        if needle == haystack[0]:
            return 1
        else:
            return 0
    # More than one item, split the sequence in
    # two parts and recurse on each of them
    mid = length // 2
    return count(needle, haystack[:mid]) + count(needle, haystack[mid:])
The recursion depth has now changed from n to log(n), which makes it possible to run the call that previously failed:
>>> count('a', 'a'*10000)
10000
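To check the depth claim rather than take it on faith, one can thread a depth counter through the same divide-and-conquer scheme. This is an instrumented sketch; count_with_depth and its depth parameter are invented for the purpose.

```python
def count_with_depth(needle, haystack, depth=1):
    """Return (count, deepest recursion level reached)."""
    length = len(haystack)
    if length == 0:
        return 0, depth
    if length == 1:
        return (1 if needle == haystack[0] else 0), depth
    mid = length // 2
    left_count, left_depth = count_with_depth(needle, haystack[:mid], depth + 1)
    right_count, right_depth = count_with_depth(needle, haystack[mid:], depth + 1)
    # The overall depth is that of the deeper of the two halves.
    return left_count + right_count, max(left_depth, right_depth)

total, deepest = count_with_depth('a', 'a' * 10000)
```

For 10000 items the deepest level stays in the region of log2(10000), far below the default recursion limit.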
However because I used slicing it will again create lots of copies. Using iterators will be complicated (or impossible) because iterators don't have a size (generally) but it's easy to use indices:
def _count_internal(needle, haystack, start_index, end_index):
    length = end_index - start_index
    if length == 0:
        return 0
    if length == 1:
        if needle == haystack[start_index]:
            return 1
        else:
            return 0
    mid = start_index + length // 2
    res1 = _count_internal(needle, haystack, start_index, mid)
    res2 = _count_internal(needle, haystack, mid, end_index)
    return res1 + res2
def count(needle, haystack):
    return _count_internal(needle, haystack, 0, len(haystack))
Using built-in methods with recursion
It may seem stupid to use built-in methods (or functions) in this case because there is already a built-in method to solve the problem without recursion but here it is and it uses the index method that both strings and lists have:
def count(needle, haystack):
    try:
        next_index = haystack.index(needle)
    except ValueError: # the needle isn't present
        return 0
    return 1 + count(needle, haystack[next_index+1:])
Using iteration instead of recursion
Recursion is really powerful, but in Python you have to fight against the recursion limit, and because Python has no tail call optimization it is often rather slow. This can be solved by using iteration instead of recursion:
def count(needle, haystack):
    found = 0
    for item in haystack:
        if needle == item:
            found += 1
    return found
Iterative approaches using built-ins
If you're more adventurous, you can also use a generator expression together with sum:
def count(needle, haystack):
    return sum(needle == item for item in haystack)
Again this relies on the fact that booleans behave like integers and so sum adds all the occurrences (ones) with all non-occurrences (zeros) and thus gives the number of total counts.
But if one is already using built-ins it would be a shame not to mention the built-in method (that both strings and lists have): count:
def count(needle, haystack):
    return haystack.count(needle)
At that point you probably don't need to wrap it inside a function anymore and could simply use just the method directly.
In case you even want to go further and count all elements you can use the Counter in the built-in collections module:
>>> from collections import Counter
>>> Counter('abcdab')
Counter({'a': 2, 'b': 2, 'c': 1, 'd': 1})
Performance
I often mentioned copies and their effect on memory and performance and I actually wanted to present some quantitative results to show that it actually makes a difference.
I used a fun-project of mine simple_benchmarks here (it's a third-party package so if you want to run it you have to install it):
def count_original(f, s):
    if not s:
        return 0
    elif f == s[0]:
        return 1 + count_original(f, s[1:])
    else:
        return 0 + count_original(f, s[1:])

def _count_index_internal(needle, haystack, current_index):
    length = len(haystack)
    if current_index >= length:
        return 0
    found = haystack[current_index] == needle
    return found + _count_index_internal(needle, haystack, current_index + 1)

def count_index(needle, haystack):
    return _count_index_internal(needle, haystack, 0)

def _count_iterator_internal(needle, haystack):
    try:
        item = next(haystack)
    except StopIteration:
        return 0
    return (item == needle) + _count_iterator_internal(needle, haystack)

def count_iterator(needle, haystack):
    return _count_iterator_internal(needle, iter(haystack))

def count_divide_conquer(needle, haystack):
    length = len(haystack)
    if length == 0:
        return 0
    if length == 1:
        if needle == haystack[0]:
            return 1
        else:
            return 0
    mid = length // 2
    return count_divide_conquer(needle, haystack[:mid]) + count_divide_conquer(needle, haystack[mid:])

def _count_divide_conquer_index_internal(needle, haystack, start_index, end_index):
    length = end_index - start_index
    if length == 0:
        return 0
    if length == 1:
        if needle == haystack[start_index]:
            return 1
        else:
            return 0
    mid = start_index + length // 2
    res1 = _count_divide_conquer_index_internal(needle, haystack, start_index, mid)
    res2 = _count_divide_conquer_index_internal(needle, haystack, mid, end_index)
    return res1 + res2

def count_divide_conquer_index(needle, haystack):
    return _count_divide_conquer_index_internal(needle, haystack, 0, len(haystack))

def count_index_method(needle, haystack):
    try:
        next_index = haystack.index(needle)
    except ValueError: # the needle isn't present
        return 0
    return 1 + count_index_method(needle, haystack[next_index+1:])

def count_loop(needle, haystack):
    found = 0
    for item in haystack:
        if needle == item:
            found += 1
    return found

def count_sum(needle, haystack):
    return sum(needle == item for item in haystack)

def count_method(needle, haystack):
    return haystack.count(needle)

import random
import string
from functools import partial
from simple_benchmark import benchmark, MultiArgument

funcs = [count_divide_conquer, count_divide_conquer_index, count_index, count_index_method, count_iterator, count_loop,
         count_method, count_original, count_sum]
# Only recursive approaches without builtins
# funcs = [count_divide_conquer, count_divide_conquer_index, count_index, count_iterator, count_original]
arguments = {
    2**i: MultiArgument(('a', [random.choice(string.ascii_lowercase) for _ in range(2**i)]))
    for i in range(1, 12)
}
b = benchmark(funcs, arguments, 'size')
b.plot()
The resulting plot is log-log scaled to display the range of values in a meaningful way; lower means faster.
One can clearly see that the original approach gets very slow for long inputs (because it copies the list it performs in O(n**2)) while the other approaches behave linearly. What may seem weird is that the divide-and-conquer approaches perform slower, but that is because these need more function calls (and function calls are expensive in Python). However they can process much longer inputs than the iterator and index variants before they hit the recursion limit.
It would be easy to change the divide-and-conquer approach so that it runs faster. A few possibilities that come to mind:
Switch to non-divide-and-conquer when the sequence is short.
Always process one element per function call and only divide the rest of the sequence.
But given that this is probably just an exercise in recursion that goes a bit beyond the scope.
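For the curious, the first of those two ideas could look roughly like the sketch below: recurse on index ranges, and fall back to a plain loop once a range is short enough. The name count_hybrid and the threshold value of 64 are arbitrary choices for this illustration and have not been benchmarked.

```python
def count_hybrid(needle, haystack, start=0, end=None, threshold=64):
    """Divide-and-conquer count that switches to a loop for short ranges."""
    if end is None:
        end = len(haystack)
    if end - start <= threshold:
        # Short range: count directly, with no further function calls.
        found = 0
        for i in range(start, end):
            if haystack[i] == needle:
                found += 1
        return found
    # Long range: split it in two and recurse on each half.
    mid = start + (end - start) // 2
    return (count_hybrid(needle, haystack, start, mid, threshold)
            + count_hybrid(needle, haystack, mid, end, threshold))

total = count_hybrid('a', 'a' * 10000)
```

The recursion depth is still logarithmic, but the number of function calls drops by roughly a factor of the threshold.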
However, they all perform much worse than the iterative approaches.
The count method of lists (and also the one of strings) and the manual loop are especially fast.
The error occurs because some paths through your function never reach a return statement, so the function returns None. Adding return 0 at the end of your function fixes the error. There are much better ways to do this in Python, but I assume this is just an exercise in recursive programming.
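Here is the failing path spelled out, as a sketch using renamed copies of the question's function (count_broken and count_fixed are names chosen for this example):

```python
def count_broken(f, s):
    # Same logic as the question's Count.
    if s == []:
        return 0
    while len(s) != 0:
        if f == s[0]:
            return 1 + count_broken(f, s[1:])
        else:
            return 0 + count_broken(f, s[1:])
    # An empty string skips both branches and falls off the end: returns None.

def count_fixed(f, s):
    while len(s) != 0:
        if f == s[0]:
            return 1 + count_fixed(f, s[1:])
        else:
            return 0 + count_fixed(f, s[1:])
    return 0  # reached for any empty sequence, list or string

empty_result = count_broken('i', '')
try:
    count_broken('i', 'hi')
    raised = False
except TypeError:
    raised = True
fixed_result = count_fixed('i', 'what is it')
```

With a list the recursion eventually hits s == [] and returns 0, but with a string the final call receives '' instead, returns None, and the caller then tries to add an int to None.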
You are doing things the hard way, in my opinion.
You can use Counter from collections to do the same thing.
from collections import Counter
def count(f, s):
    if s is None:
        return 0
    return Counter(s)[f]
Counter returns a dict subclass that holds the counts of everything in your s object. Indexing it with [f] returns the count for the specific item you are searching for, or 0 if the item does not occur. This works on lists of numbers or a string.
If you're bound and determined to do it with recursion, I strongly recommend halving the problem whenever possible rather than whittling it down one element at a time. Halving allows you to deal with much larger cases without running into stack overflow.
def count(f, s):
    l = len(s)
    if l > 1:
        mid = l // 2  # integer division, so mid is usable as a slice index
        return count(f, s[:mid]) + count(f, s[mid:])
    elif l == 1 and s[0] == f:
        return 1
    return 0
My question comes from a variant of the Tower of Hanoi that has four towers.
I know this article, which says that in C++ you can convert any recursive function to a loop, but I am only familiar with Python. I tried to read the ten rules, but what do the keywords struct and stack mean in Python?
So any article or discussion for Python that is similar to the C++ one above would also be much appreciated. Thanks.
The raw recursive function is fmove (which relies on another recursive function, tmove). It receives an integer and returns a tuple of pairs. It is elegant but useless (try tmove(100) and be careful of your memory).
I want to convert it to a pure loop version that uses yield, so that even when n becomes large, like 100 or 1000, I can still find out what the first 10 or 100 pairs of the tuple are.
def memory(function):
    """
    This is a decorator to help raw recursion
    functions to avoid repetitive calculation.
    """
    cache = {}
    def memofunc(*nkw,**kw):
        key=str(nkw)+str(kw)
        if key not in cache:
            cache[key] = function(*nkw,**kw)
        return cache[key]
    return memofunc
@memory
def tmove(n, a=0, b=1, c=2):
    "int n -> a tuple of pairs"
    if n==1:
        return ((a,c),)
    return tmove(n-1,a,c,b)+\
           ((a,c),)+\
           tmove(n-1,b,a,c)
@memory
def fmove(n,a=0,b=1,c=2,d=3):
    "int n -> a tuple of pairs"
    if n==1:
        return ((a,d),)
    return min(
        (
            fmove(n-i,a,d,b,c) +
            tmove(i,a,b,d) +
            fmove(n-i,c,b,a,d)
            for i in range(1,n)
        ),
        key=len,)
With the help of user2357112 in this question, I know how to convert recursive functions like tmove, which have the shape return recur(...) + CONS or another call + recur(...). But when the situation gets more complicated, as in fmove, I don't know how to design the structure: i depends on n, which differs from one stack frame to the next, and in the end you have to use min to pick the minimum-size tuple as the correct output for the current frame.
This is my attempt (the core algorithm best(n) is still a recursive function):
@memory
def _best(n):
    if n==1:
        return 1,1
    return min(
        (
            (i, 2*(_best(n-i)[1])+2**i-1)
            for i in range(1,n)
        ),
        key=lambda x:x[1],
    )
def best(n):
    return _best(n)[0]
def xtmove(n,a=0,b=1,c=2):
    stack = [(True,n,a,b,c)]
    while stack:
        tag,n,a,b,c = stack.pop()
        if n==1:
            yield a,c
        elif tag:
            stack.append((False,n,a,b,c))
            stack.append((True,n-1,a,c,b))
        else:
            yield a,c
            stack.append((True,n-1,b,a,c))
def xfmove(n,a=0,b=1,c=2,d=3):
    stack = [(True,n,a,b,c,d)]
    while stack:
        is_four,n,a,b,c,d = stack.pop()
        if n==1 and is_four:
            yield a,d
        elif is_four:
            # here I use a non-tail-recursive function 'best'
            # to get the best i, so the core is still not an explicit stack.
            i = best(n)
            stack.append((True,n-i,c,b,a,d))
            stack.append((False,i,a,b,d,None))
            stack.append((True,n-i,a,d,b,c))
        else:
            for t in xtmove(n,a,b,c):
                yield t
This is the test code. Make sure you can pass it.
if __name__=='__main__':
    MAX_TEST_NUM = 20
    is_passed = all((
        fmove(test_num) == tuple(xfmove(test_num))
        for test_num in range(1,MAX_TEST_NUM)
    ))
    assert is_passed, "Doesn't pass the test."
    print("Pass the test!")
fmove performs a min over all the values of its recursive calls and the call to tmove, so there can be no streaming of results in this case. You need 100% of the calls to finish to get the result of min.
Regarding the stack approach, it is creating a minimal interpreter with 2 opcodes, True and False. :)
Look how tmove can stream results without resorting to the archaic techniques needed in languages without generators.
from itertools import chain
def xtmove(n, a=0, b=1, c=2):
    "int n -> a generator of pairs"
    if n==1:
        yield (a,c)
    else:
        for i in chain(xtmove(n-1,a,c,b), [(a,c)], xtmove(n-1,b,a,c)):
            yield i
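On Python 3.3 or later the same generator can be written with yield from, and itertools.islice then lets you look at just the first few moves of a very large instance without ever building the full tuple. This is a sketch of that usage; the variable first_moves is only for illustration.

```python
from itertools import islice

def xtmove(n, a=0, b=1, c=2):
    """Lazily yield the moves that take n disks from peg a to peg c."""
    if n == 1:
        yield (a, c)
    else:
        yield from xtmove(n - 1, a, c, b)   # move n-1 disks out of the way
        yield (a, c)                        # move the largest disk
        yield from xtmove(n - 1, b, a, c)   # move the n-1 disks back on top

# Only the first 10 moves are ever computed, even though n is 100.
first_moves = list(islice(xtmove(100), 10))
```

The generators nest about n levels deep, so this stays within the default recursion limit for n = 100, but very large n would still need the explicit-stack version.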
After days of study, and with the help of the C++ article, I finally got the pure loop version by myself. And I think @Javier is right: it is impossible to yield.
def best(n):
    """
    n -> best_cut_number
    four-towers Hanoi best cut number for n disks.
    """
    stacks = [(0,n,[],None,)] #(stg,n,possible,choice)
    cache={1:(1,0)}
    while stacks:
        stg,n,possible,choice=stacks.pop()
        if n in cache:
            res = cache[n]
        elif stg==0:
            stacks.append((1,n,possible,n-1))
            stacks.append((0,1,[],None))
        else:
            value = 2*res[0] + 2**choice-1
            possible.append((value,choice))
            if choice > 1:
                stacks.append((1,n,possible,choice-1))
                stacks.append((0,n-choice+1,[],None))
            else:
                res = min(possible,key=lambda x:x[0])
                cache[n] = res
    best_cut_number = res[1]
    return best_cut_number
I want to do something analogous to the following:
def normal(start):
    term = start
    while True:
        yield term
        term = term + 1
def iterate(start, inc):
    if inc == 1:
        return normal(start)
    else:
        term = start
        while True:
            yield term
            term = term + inc
Now this gives the following error.
SyntaxError: 'return' with argument inside generator
How can I return a generator from one function through another?
(Note: The example shown here doesn't require this kind of functionality, but it shows what I need to do.)
Thanks in advance.
Starting in Python 3.3, you can use yield from normal(start) as described in PEP 380. In earlier versions, you must manually iterate over the other generator and yield what it yields:
if inc == 1:
    for item in normal(start):
        yield item
Note that this isn't "returning the generator", it's one generator yielding the values yielded by another generator. You could use yield normal(start) to yield the "normal" generator object itself, but judging from your example that isn't what you're looking for.
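For completeness, here is what the yield from variant might look like in full, with itertools.islice used to take a few values from the otherwise endless generators. The variable names at the bottom are just for this example.

```python
from itertools import islice

def normal(start):
    term = start
    while True:
        yield term
        term = term + 1

def iterate(start, inc):
    if inc == 1:
        # Delegate to the other generator (PEP 380, Python 3.3+).
        yield from normal(start)
    else:
        term = start
        while True:
            yield term
            term = term + inc

# Both generators are infinite, so only take the first three values.
first_by_one = list(islice(iterate(5, 1), 3))
first_by_two = list(islice(iterate(0, 2), 3))
```

Callers cannot tell which branch produced the values; in both cases iterate is an ordinary generator function.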
In Python 2.x you can't use return with a value inside a generator function, so you have to chain the generators manually:
def normal(start):
    term = start
    while True:
        yield term
        term = term + 1
def iterate(start, inc):
    if inc == 1:
        for item in normal(start):
            yield item
    else:
        term = start
        while True:
            yield term
            term = term + inc
I'm assuming you know both of these examples will run forever :)
FWIW, in your original example you could clean it up by making two generators (say, 'normal' and 'abnormal') and then returning one or the other from the iterate function. As long as you're not mixing yields and returns in the same function, you can return generators... so to speak...
#using original 'normal'
def abnormal(start, inc):
    term = start
    while True:
        yield term
        term = term + inc
def iterate (start, inc):
    if inc == 1:
        return normal(start)
    return abnormal(start, inc)
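The same "plain function that returns an iterator" pattern also works with iterators from the standard library. As a side note (this swaps in itertools.count rather than the hand-written generators above), the arithmetic sequence from the example could be produced like this:

```python
from itertools import count, islice

def iterate(start, inc):
    # A plain function that returns an iterator; there is no yield here,
    # so returning a value is perfectly legal.
    return count(start, inc)

# count() is infinite as well, so slice off the first few terms.
first_terms = list(islice(iterate(3, 4), 4))
```

This only covers the toy example, of course; for generators with real logic in them the two-function approach above is the general answer.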