Python generator with external break condition - python

I need to iterate over ascending sequences x of n (= 5, for instance) integers, finding all sequences for which a function f_n(*x) returns True.
Assume that if f_n(*y) is False for a particular y, then f_n(*z) is False for any z with z_i >= y_i. So f_n is monotonic in all its arguments.
This kind of generator function could be used in the following way to determine all ascending sequences of integers that have a sum of squares < 100:
for sequence in generate_sequences(5):
    if sum_squares_is_at_least(sequence, 100):
        # some code to trigger the breaking of the generator loop
    else:
        print sequence
Clarification:
The problem here is that we need to iterate over n elements individually. Initially, we iterate from [1,1,1,1,1] to [1,1,1,1,x], and then we have to continue with [1,1,1,2,2] to [1,1,1,2,y], eventually ending with [a,b,c,d,e]. It seems that the generator should look something like this, but it needs some code to break out of the for and/or while loops when necessary (as determined externally):
def generate_sequences(length, minimum = 1):
    if length == 0:
        yield []
    else:
        element = minimum
        while True:
            for sequence in generate_sequences(length - 1, element):
                yield [element] + sequence
            element += 1
Example:
For n = 3, and sum of squares no larger than 20, the following sequences would be generated:
[1, 1, 1], [1, 1, 2], [1, 1, 3], [1, 1, 4], [1, 2, 2], [1, 2, 3], [1, 3, 3], [2, 2, 2], [2, 2, 3]
Note that in the general case, I cannot use the information that 4 is the upper bound for each element. This would also seriously impact the running time for larger examples.

Are you looking for itertools.takewhile?
>>> from itertools import takewhile
>>> def gen(): #infinite generator
...     i=0
...     while True:
...         yield range(i,i+5)
...         i = i+1
...
>>> [ x for x in takewhile( lambda x:sum(x)<20, gen() ) ]
[[0, 1, 2, 3, 4], [1, 2, 3, 4, 5]]
>>>

import itertools as it
it.takewhile(lambda x: not sum_squares_is_at_least(x, 100), generate_sequences(5))
If you are not sure about the 5 in generate_sequences, then just let it yield the numbers for as long as it is called:
def generate_sequences():
    i = 0 # or anything
    while True:
        yield [i, i] # or anything
        i = i + 1 # or anything
Then use it this way:
it.takewhile(lambda x: not sum_squares_is_at_least(x, 100), generate_sequences())
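As a minimal sketch of the takewhile usage above: takewhile keeps yielding while the predicate is true and stops at the first element for which it is false, so an "at least" test has to be negated. The helper sum_squares_is_at_least and the generator pairs are hypothetical stand-ins, since the question does not define them.

```python
from itertools import takewhile

def sum_squares_is_at_least(sequence, bound):
    # Hypothetical helper: True once the sum of squares reaches the bound.
    return sum(x * x for x in sequence) >= bound

def pairs():
    # Hypothetical infinite generator, mirroring the [i, i] example above.
    i = 0
    while True:
        yield [i, i]
        i += 1

# Keep taking sequences while the sum of squares is still below 100.
result = list(takewhile(lambda s: not sum_squares_is_at_least(s, 100), pairs()))
print(result[-1])
```

Because takewhile stops at the first rejected element, this only fits generators whose elements fail the test from some point onwards.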

I would solve it with recursion by starting with a given list then appending another number (with logic to prevent going over sum of squares target)
def makegen(N): #make a generator with max sumSquares: N
    def gen(l=[]): #empty list is valid with sum == 0
        yield l
        if l:
            i = l[-1] #keep it sorted to only include combinations not permutations
        else:
            i = 1 #only first iteration
        sumsquare = sum(x*x for x in l) #find out how much more we can add
        while sumsquare + i*i < N: #increase the appended number until we exceed target
            for x in gen(l+[i]): #recurse with appended list
                yield x
            i += 1
    return gen
Calling our generator generator (tee hee :D) in the following fashion allows us to have any maximum sum of squares we desire:
for x in makegen(26)():
    print x
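For the fixed-length case described in the question, one possible sketch uses the monotonicity of the test to prune: before descending into a prefix, it checks the smallest ascending completion of that prefix, and if even that one is rejected, every larger completion must be rejected too. The names generate_valid_sequences, accept and fits are hypothetical, and the sketch assumes that accept is monotonic and eventually returns False as elements grow (otherwise the loop never ends).

```python
def generate_valid_sequences(length, accept, prefix=(), minimum=1):
    """Yield ascending sequences that extend prefix by `length` elements
    and satisfy accept. Assumes accept is monotonic in every argument."""
    if length == 0:
        yield list(prefix)
        return
    element = minimum
    while True:
        # Smallest ascending completion: repeat `element` in all open slots.
        candidate = prefix + (element,) * length
        if not accept(candidate):
            break  # by monotonicity, no larger element can work either
        yield from generate_valid_sequences(
            length - 1, accept, prefix + (element,), element)
        element += 1

def fits(sequence):
    # Hypothetical test: sum of squares no larger than 20.
    return sum(x * x for x in sequence) <= 20

result = list(generate_valid_sequences(3, fits))
print(result)
```

With n = 3 and a sum of squares no larger than 20, this reproduces the nine sequences listed in the question without knowing an upper bound for any element in advance.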

Related

I created a function that groups numbers from a list to a list based on their frequency and I need help optimizing it

UPDATE - I have managed to 'fix' my problem, but I'm 100% sure that this is not the optimal solution.
Also, I forgot to mention that each frequency group must be ordered from greatest number to least.
input:
[-1,1,-6,4,5,-6,1,4,1]
output:
[[5, -1], [4, 4, -6, -6], [1, 1, 1]]
This is the updated code that I wrote:
import numpy as np

def freqSort(nums):
    #This section makes an ordered hash map that is ordered from least to greatest frequency
    hash_map = {x:0 for x in nums}
    hash_lst = []
    for num in nums:
        hash_map[num] += 1
    hash_lst = sorted(hash_map.items(), key=lambda x: x[1], reverse=True)
    hash_lst.reverse()
    #This section creates a list of frequencies for each type of number in the list
    key_lst = []
    for key in hash_lst:
        key_lst.append(key[1])
    interval_map = {x:0 for x in key_lst}
    for num in key_lst:
        interval_map[num] += 1
    #This section initializes an array based on the number of frequencies
    array_lst = []
    for num in interval_map:
        array_lst.append([])
    #This section appends numbers into their corresponding frequencies
    i = 0
    j = 0
    for tup in hash_lst:
        array_lst[i].append(tup[0])
        if j+1 != len(key_lst):
            if key_lst[j] != key_lst[j+1]:
                i += 1
        j += 1
    k = 0
    #array_lst at this point looks like [[5, -1],[4, -6],[1]]
    #This section multiplies each number in each array based on the frequency it is grouped in
    for interval in interval_map:
        array_lst[k] = np.repeat(array_lst[k], interval)
        k+=1
    result_lst = []
    for array in array_lst:
        result_lst.append(sorted(list(array), reverse=True))
    return result_lst
I've got a function called frequency, which stores the occurrence of each element in a dictionary such that element:occurrence. I use get_output to format it properly by finding the length of the individual elements of the list and grouping them together (by using the + operator).
lst = [-1,1,-6,4,5,-6,1,4,1]
def f(r): #to check if elements in the list are same
    if r.count(r[0])==len(r):
        return True
d={}
megalist=[]
def frequency(x):
    for i in x:
        d[i]=x.count(i)
    for j in d:
        n=d[j]
        if n in d.values():
            megalist.append([j for k in range(d[j])])
frequency(lst)
def get_output(y):
    for a in y:
        for b in y:
            if a!=b:
                if len(a)==len(b)and f(a) ==True and f(b)==True:
                    y[y.index(a)]=a+b
                    y.remove(b)
    print(y)
get_output(megalist)
output:
[[-1, 5], [1, 1, 1], [-6, -6, 4, 4]]
UPDATE:
For the requirement that "the order of each frequency group must be ordered from greatest number to least", you could run a for loop over the groups and sort each one in place with its sort method (group.sort(reverse=True)).
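As a shorter alternative to both versions above, here is a sketch built on collections.Counter. The name freq_sort is hypothetical; it assumes the desired output is one list per distinct frequency, ordered from least to greatest frequency, with each group sorted from greatest number to least as in the question's example.

```python
from collections import Counter, defaultdict

def freq_sort(nums):
    """Group numbers by how often they occur (hypothetical helper)."""
    counts = Counter(nums)
    groups = defaultdict(list)
    for value, count in counts.items():
        # Repeat each value as many times as it occurs in the input.
        groups[count].extend([value] * count)
    # Least frequent group first; each group from greatest to least.
    return [sorted(groups[count], reverse=True) for count in sorted(groups)]

result = freq_sort([-1, 1, -6, 4, 5, -6, 1, 4, 1])
print(result)
```

Counting with Counter takes a single pass over the input, whereas calling x.count(i) for every element scans the list once per element.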

Permutations with repetition without two consecutive equal elements

I need a function that generates all the permutations with repetition of an iterable, with the constraint that two consecutive elements must be different; for example:
sorted(f([0,1],3))==[(0,1,0),(1,0,1)]
#or
sorted(f([0,1],3))==[[0,1,0],[1,0,1]]
#I don't need the elements in the list to be sorted.
#the elements of the return can be tuples or lists, it doesn't change anything
Unfortunately, itertools.permutations doesn't work for what I need (each element in the iterable is present once or no times in the return).
I've tried a bunch of definitions; first, filtering the elements of itertools.product(iterable,repeat=r), but it is too slow for what I need.
from itertools import product
def crp0(iterable,r):
    l=[]
    for f in product(iterable,repeat=r):
        #print(f)
        b=True
        last=None #supposing no element of the iterable is None, which is fine for me
        for element in f:
            if element==last:
                b=False
                break
            last=element
        if b: l.append(f)
    return l
Second, I tried to build r nested for loops, one inside the other (where r is the length of each permutation, usually written k in math).
def crp2(iterable,r):
    a=list(range(0,r))
    s="\n"
    tab="    " #4 spaces
    l=[]
    for i in a:
        s+=(2*i*tab+"for a["+str(i)+"] in iterable:\n"+
            (2*i+1)*tab+"if "+str(i)+"==0 or a["+str(i)+"]!=a["+str(i-1)+"]:\n")
    s+=(2*i+2)*tab+"l.append(a.copy())"
    exec(s)
    return l
I know, there's no need to remind me: exec is ugly, exec can be dangerous, exec isn't easy to read... I know.
To understand the function better, I suggest you replace exec(s) with print(s).
Here is an example of the string that is passed to exec for crp2([0,1],2):
for a[0] in iterable:
    if 0==0 or a[0]!=a[-1]:
        for a[1] in iterable:
            if 1==0 or a[1]!=a[0]:
                l.append(a.copy())
But, apart from using exec, I need a better function because crp2 is still too slow (even if faster than crp0). Is there any way to recreate the code with r nested for loops without using exec? Is there any other way to do what I need?
You could prepare the sequences in two halves, then preprocess the second halves to find the compatible choices.
def crp2(I,r):
    r0=r//2
    r1=r-r0
    A=crp0(I,r0) # Prepare first half sequences
    B=crp0(I,r1) # Prepare second half sequences
    D = {} # Dictionary showing compatible second half sequences for each token
    for i in I:
        D[i] = [b for b in B if b[0]!=i]
    return [a+b for a in A for b in D[a[-1]]]
In a test with iterable=[0,1,2] and r=15, I found this method to be over a hundred times faster than just using crp0.
You could try to return a generator instead of a list. With large values of r, your method will take a very long time to process product(iterable,repeat=r) and will return a huge list.
With this variant, you should get the first element very fast:
from itertools import product
def crp0(iterable, r):
    for f in product(iterable, repeat=r):
        last = f[0]
        b = True
        for element in f[1:]:
            if element == last:
                b = False
                break
            last = element
        if b:
            yield f
for no_repetition in crp0([0, 1], 12):
    print(no_repetition)
# (0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1)
# (1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0)
Instead of filtering the elements, you could generate a list directly with only the correct elements. This method uses recursion to create the cartesian product:
def product_no_repetition(iterable, r, last_element=None):
    if r == 0:
        return [[]]
    else:
        return [p + [x] for x in iterable
                for p in product_no_repetition(iterable, r - 1, x)
                if x != last_element]
for no_repetition in product_no_repetition([0, 1], 12):
    print(no_repetition)
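The two ideas above (return a generator, and never build a sequence with a repeated neighbour in the first place) can be combined. The following is only a sketch; the name no_adjacent_repeats is hypothetical, and it assumes that items is a re-iterable collection such as a list or tuple.

```python
def no_adjacent_repeats(items, r, prefix=()):
    """Lazily yield tuples of length r over items in which no two
    consecutive elements are equal (hypothetical helper)."""
    if len(prefix) == r:
        yield prefix
        return
    for x in items:
        # Only extend the prefix with an element that differs from its last one.
        if not prefix or x != prefix[-1]:
            yield from no_adjacent_repeats(items, r, prefix + (x,))

result = sorted(no_adjacent_repeats([0, 1], 3))
print(result)
```

Since invalid prefixes are never extended, the work is proportional to the number of valid sequences rather than to len(items) ** r.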
I agree with @EricDuminil's comment that you do not want "Permutations with repetition." You want a significant subset of the product of the iterable with itself multiple times. I don't know what name is best: I'll just call them products.
Here is an approach that builds each product line without building all the products and then filtering out the ones you want. My approach is to work primarily with the indices of the iterable rather than the iterable itself, and not all the indices, but ignoring the last one. So instead of working directly with [2, 3, 5, 7] I work with [0, 1, 2]. Then I work with the products of those indices. I can transform a product such as [1, 2, 2] where r=3 by comparing each index with the previous one. If an index is greater than or equal to the previous one, I increment the current index by one. This prevents two indices from being equal, and this also gets me back to using all the indices. So [1, 2, 2] is transformed to [1, 2, 3], where the final 2 was changed to a 3. I now use those indices to select the appropriate items from the iterable, so the iterable [2, 3, 5, 7] with r=3 gets the line [3, 5, 7]. The first index is treated differently, since it has no previous index. My code is:
from itertools import product
def crp3(iterable, r):
    L = []
    for k in range(len(iterable)):
        for f in product(range(len(iterable)-1), repeat=r-1):
            ndx = k
            a = [iterable[ndx]]
            for j in range(r-1):
                ndx = f[j] if f[j] < ndx else f[j] + 1
                a.append(iterable[ndx])
            L.append(a)
    return L
Using %timeit in my Spyder/IPython configuration on crp3([0,1], 3) shows 8.54 µs per loop while your crp2([0,1], 3) shows 133 µs per loop. That shows a sizeable speed improvement! My routine works best where iterable is short and r is large--your routine finds len ** r lines (where len is the length of the iterable) and filters them while mine finds len * (len-1) ** (r-1) lines without filtering.
By the way, your crp2() does do filtering, as shown by the if lines in your code that is execed. The sole if in my code does not filter a line, it modifies an item in the line. My code does return surprising results if the items in the iterable are not unique: if that is a problem, just change the iterable to a set to remove the duplicates. Note that I replaced your l name with L: I think l is too easy to confuse with 1 or I and should be avoided. My code could easily be changed to a generator: replace L.append(a) with yield a and remove the lines L = [] and return L.
How about:
from itertools import product
result = [ x for x in product(iterable,repeat=r) if all(x[i-1] != x[i] for i in range(1,len(x))) ]
Elaborating on @peter-de-rivaz's idea (divide and conquer). When you divide the sequence to create into two subsequences, those subsequences are the same or very close. If r = 2*k is even, store the result of crp(k) in a list and merge it with itself. If r = 2*k+1, store the result of crp(k) in a list and merge it with itself and with L.
def large(L, r):
    if r <= 4: # stop dividing here: going further is too slow
        # small() is a generator; materialize it so it can be iterated twice below
        return list(small(L, r))
    n = r//2
    M = large(L, n)
    if r%2 == 0:
        return [x + y for x in M for y in M if x[-1] != y[0]]
    else:
        return [x + y + (e,) for x in M for y in M for e in L if x[-1] != y[0] and y[-1] != e]
small is an adaptation from @eric-duminil's answer using the famous for...else loop of Python:
from itertools import product
def small(iterable, r):
    for seq in product(iterable, repeat=r):
        prev, *tail = seq
        for e in tail:
            if e == prev:
                break
            prev = e
        else:
            yield seq
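A small benchmark (timings as reported; note that a complete result for [0, 1, 2, 3] with r=15 contains 4 * 3**14, about 19 million, sequences, so the last timing cannot reflect building the full list):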
print(timeit.timeit(lambda: crp2( [0, 1, 2], 10), number=1000))
#0.16290732200013736
print(timeit.timeit(lambda: crp2( [0, 1, 2, 3], 15), number=10))
#24.798989593000442
print(timeit.timeit(lambda: large( [0, 1, 2], 10), number=1000))
#0.0071403849997295765
print(timeit.timeit(lambda: large( [0, 1, 2, 3], 15), number=10))
#0.03471425700081454

longest decreasing sublist inside a given list

I wanted to find the longest decreasing sublist inside a given list. For example, for L = [1, 2, 1, 2, 1, 2, 1, 2, 1] the result should be [2,1]; however, I can't seem to produce that result. Can someone tell me why it doesn't work? The output is something like [0,2,1,2,1,2,1,2,1]. Never mind the first zero, but the result should be [2,1].
Here's the code I tried:
L = [1, 2, 1, 2, 1, 2, 1, 2, 1]
current = [0]
smallest = []
for i in range(len(L)):
    if i < (len(L)-1):
        if L[i] >= L[i+1]:
            current.append(L[i])
        else :
            if L[i] < current[-1]:
                current.append(L[i])
    elif i>= (len(L)-1):
        if L[-1]<L[i-1]:
            current.append(L[i])
        else:
            current = [i]
    if len(current) > len(smallest):
        smallest = current
Result : [0,2,1,2,1,2,1,2,1]
Desired Result : [2,1]
There are so many ways to solve this. In Py3 - using itertools.accumulate for dynamic programming:
>>> import operator as op
>>> import itertools as it
>>> L = [1, 2, 1, 2, 1, 2, 1, 2, 1]
>>> dyn = it.accumulate(it.chain([0], zip(L, L[1:])), lambda x, y: (x+1)*(y[0]>y[1]))
>>> i, l = max(enumerate(dyn), key=op.itemgetter(1))
>>> L[i-l:i+1]
[2, 1]
When you say current = [0], it actually puts 0 into the list; maybe you want current = [L[0]].
See this:
def longest_decreasing_sublist(a):
    lds, current = [], [a[0]]
    for val in a[1:]:
        if val < current[-1]: current.append(val)
        else:
            lds = current[:] if len(current) > len(lds) else lds
            current = [val]
    lds = current[:] if len(current) > len(lds) else lds
    return lds
L = [1, 2, 1, 2, 1, 2, 1, 2, 1]
print (longest_decreasing_sublist(L))
# [2, 1]
You would be better off engineering your code for maintainability and readability up front, so that you can easily test components of your software individually rather than trying to solve the entire thing at once. This is known as functional decomposition.
Think about what you need at the topmost level.
First, you need a way to get the longest decreasing sequence starting at a given point in the array, so that you can intelligently compare the various sequences.
You can do that with code such as:
def getNextSequence(myList, index):
    # Beyond list end, return empty list.
    if index >= len(myList):
        return []
    # Construct initial list, keep adding until either
    # an increase is found or no elements left.
    sequence = [ myList[index] ]
    addIdx = index + 1
    while addIdx < len(myList) and myList[addIdx] <= myList[addIdx - 1]:
        sequence.append(myList[addIdx])
        addIdx += 1
    # And there you have it, the sequence at a given point.
    return sequence
Second, you need to be able to store the current longest one and check the lengths to see whether the current one is greater than the longest to date.
That breaks down to something like:
def getLongestSequence(myList):
    # Initially no sequence, even a sequence of one
    # will beat this.
    longestSequence = []
    # This index is where we are checking. Keep checking
    # until all possibilities exhausted.
    checkIndex = 0
    while checkIndex < len(myList):
        # Get current sequence, save it if it's longer
        # than the currently longest one.
        currentSequence = getNextSequence(myList, checkIndex)
        if len(currentSequence) > len(longestSequence):
            longestSequence = currentSequence
        # Adjust index to next checking point.
        checkIndex += len(currentSequence)
    # All done, just return the longest sequence.
    return longestSequence
The important thing there (other than readability, of course) is the changing of the index. Once you've established a decreasing sequence, you never need to look anywhere inside it since any partial sequence within it will naturally be shorter than the whole.
In other words, if you have 8, 7, 6, 5, 9, a sequence starting at 7 cannot be (by definition) longer than the sequence starting at 8. Hence you can skip straight to the 9.
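The skip-ahead idea can also be written as a single function that tracks indices instead of building intermediate lists. This is only a sketch: the name longest_non_increasing_run is hypothetical, and like getNextSequence above it treats equal neighbours as part of the same run (use < instead of <= for a strictly decreasing run).

```python
def longest_non_increasing_run(values):
    """Return the first longest run in which no element exceeds the one
    before it (hypothetical helper)."""
    best_start, best_len = 0, 0
    start = 0
    while start < len(values):
        end = start + 1
        # Extend the run until an increase is found or the list ends.
        while end < len(values) and values[end] <= values[end - 1]:
            end += 1
        if end - start > best_len:
            best_start, best_len = start, end - start
        # Skip the whole run: no run starting inside it can be longer.
        start = end
    return values[best_start:best_start + best_len]

print(longest_non_increasing_run([8, 7, 6, 5, 9]))
print(longest_non_increasing_run([1, 2, 1, 2, 1, 2, 1, 2, 1]))
```

Each element is visited once, so the running time is linear in the length of the list.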

Creating a recursive function that gets the sum of all possible subsets in a list of integers

If I were to get the sums of all possible subsets of the list [1,2,3], I would use the code below:
def f():
    for i in range(2):
        for j in range(2):
            for k in range(2):
                x = i*1 + j*2 + k*3
                print x
f()
How can I make a recursive function that does this for any list?
I can solve this using itertools.combinations, but I would like to learn the recursive way.
Thanks
Let's write a recursive function to output all combinations of all subsets of a list.
For a given list, the combinations are the list itself, plus all combinations of the list minus each member. That's easy to translate straight to Python:
def combinations(seq):
    yield seq
    for i in range(len(seq)):
        for combination in combinations(seq[:i] + seq[i+1:]):
            yield combination
However, this will obviously yield duplicates. For example, the list [1, 2, 3] contains both [1, 2] and [1, 3], and they both contain [1]. So, how do you eliminate those duplicates? Simple, just tell each sub-list how many elements to skip:
def combinations(seq, toskip=0):
    yield seq
    for i in range(toskip, len(seq)):
        for combination in combinations(seq[:i] + seq[i+1:], i):
            yield combination
Now, you want to sum all combinations? That's easy:
>>> a = [1, 2, 3]
>>> map(sum, combinations(a))
[6, 5, 3, 0, 2, 4, 1, 3]
def allsums(a):
    x = a[0]
    if len(a) > 1:
        yy = allsums(a[1:])
        return set(x + y for y in yy).union(yy)
    else:
        return set([0, x])
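A variant that keeps duplicate sums, as the nested loops in the question do, follows the same include-or-exclude recursion: each subset either leaves out the first element or contains it. The name subset_sums is hypothetical; this is a sketch rather than a definitive implementation.

```python
def subset_sums(numbers):
    """Return the sum of every subset of numbers, duplicates included."""
    if not numbers:
        return [0]  # the empty subset sums to 0
    rest = subset_sums(numbers[1:])
    # Subsets without the first element, then subsets that include it.
    return rest + [numbers[0] + s for s in rest]

result = subset_sums([1, 2, 3])
print(result)
```

For [1, 2, 3] the sums come out in the same order as the three nested loops produce them, and the empty list is handled without a special case.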

Using functions on lists

I have a function that determines if a number is less than 0 or if there isn't a number at all:
def numberfunction(s) :
    if s == "":
        return 0
    if s < 0 :
        return -1
    if s > 0:
        return s
I also have a list of lists:
numbers = [[]]
Now, let's say I filled the list of lists with numbers like:
[[1,2,3,4],[1,1,1,1],[2,2,2,2] ..etc ]
How would I go about applying the function above to the numbers I have in the lists?
Would I require a loop where I use the function on every number of every list, or is it simpler than that?
You can use map and a list comprehension to apply your function to all of your elements. Please note that I have modified your example list to show all of the return cases.
def numberfunction(s) :
    if s == "":
        return 0
    if s < 0 :
        return -1
    if s > 0:
        return s
# Define some example input data.
a = [[1,2,3,""],[-1,1,-1,1],[0,-2,-2,2]]
# Apply your function to each element.
# list() is needed on Python 3, where map returns an iterator.
b = [list(map(numberfunction, i)) for i in a]
print(b)
# [[1, 2, 3, 0], [-1, 1, -1, 1], [None, -1, -1, 2]]
Note that, with the way your numberfunction works at the moment, it will return None for an element equal to zero (thanks to @thefourtheye for pointing this out).
You can also call nested map():
>>> a = [[1,2,3,""],[-1,1,-1,1],[2,-2,-2,2]]
>>> map(lambda i: map(numberfunction, i), a)
[[1, 2, 3, 0], [-1, 1, -1, 1], [2, -1, -1, 2]]
>>>
I have Python < 3 in which map returns list.
You could do it as:
result = [[numberfunction(item) for item in row] for row in numbers]
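On Python 3, map returns a lazy iterator rather than a list, so the map-based variants need an explicit list() to produce the same nested lists. The following sketch copies numberfunction from the question unchanged (so it still returns None for 0) and uses example data of its own.

```python
def numberfunction(s):
    # Copied from the question; note that it returns None for 0.
    if s == "":
        return 0
    if s < 0:
        return -1
    if s > 0:
        return s

numbers = [[1, 2, 3, ""], [-1, 1, -1, 1], [0, -2, -2, 2]]

# One list(map(...)) per row keeps the nested list structure.
result = [list(map(numberfunction, row)) for row in numbers]
print(result)
```

The nested list comprehension shown above behaves the same way on Python 2 and Python 3.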
