I want to take a given group of elements and partition them the way the function set_partitions does, but with an added condition on the resulting lists. I will try to explain my problem in an understandable and simple way. An example of how this function works:
from more_itertools import set_partitions
a=[1,2,3]
print(list(set_partitions(a)))
And it returns the possible groupings:
[[[1, 2, 3]],
[[1], [2, 3]],
[[1, 2], [3]],
[[2], [1, 3]],
[[1], [2], [3]]]
I want to put a constraint into this function (for example: the groups created need to sum to an even number). I can simply take the generated partitions and apply the constraint afterwards, but my problem is that I need to do this for a huge number of list elements, and my computer dies before finishing. If I apply the constraint directly inside set_partitions the problem is reduced, because the constraint is very strong; for example, it reduces a total of 4,213,597 million possible combinations to roughly 12k valid ones (but I cannot go further because of memory problems), so applying the condition inside the function would solve the problem.
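For reference, the after-the-fact filtering I mean looks roughly like this (with the even-sum example as the constraint); it enumerates every partition before discarding the invalid ones, which is exactly what becomes infeasible for large inputs:
from more_itertools import set_partitions

a = [1, 2, 3, 4]
# keep only partitions in which every group sums to an even number
valid = [p for p in set_partitions(a)
         if all(sum(group) % 2 == 0 for group in p)]
print(valid)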
For this I need to understand the function set_partitions. It can be used by copying the following code on its own, without any other auxiliary functions or imports:
def set_partitions(iterable, k=None):
    L = list(iterable)
    n = len(L)

    def set_partitions_helper(L, k):
        n = len(L)
        if k == 1:
            yield [L]
        elif n == k:
            yield [[s] for s in L]
        else:
            e, *M = L
            for p in set_partitions_helper(M, k - 1):
                yield [[e], *p]
            for p in set_partitions_helper(M, k):
                for i in range(len(p)):
                    yield p[:i] + [[e] + p[i]] + p[i + 1 :]

    for k in range(1, n + 1):
        yield from set_partitions_helper(L, k)
I have programming experience and it is part of my work, but I am not at this level, so there are some things in the function that I do not understand, for example what
e, *M = L
does, or the lines that follow it in the helper function.
I need some help understanding how this function works so I can find where the condition can be placed.
Thanks a lot!
Addition explaining my specific case:
The elements I'm trying to group into partitions are matrices in a tensor product, i.e. chains of fixed length.
An element will be a chain of this style, which we can represent as a string:
'xz11y'
I want the elements within each subgroup to commute: 1 commutes with everything, while x/y/z only commute with themselves (and the identity), so we only need to compare the chains position by position. For example, if we have the 3 matrices
['1z'],['zz'],['x1']
a valid grouping will be
'1z','zz' | 'x1'
because '1z' commutes with 'zz'.
An invalid grouping will be
'1z' | 'zz', 'x1'
because 'zz' and 'x1' have an x and a z in the same position, and these operators do not commute.
This is a simplified version: I have a Hamiltonian of about 19 or more separate operators of this type and I need to find the groupings that preserve this commutation.
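A minimal sketch of the check I described, under my simplified rule that two chains are compatible when, at every position, the letters are equal or at least one of them is the identity '1' (the function names are only for illustration):
def commute(p, q):
    # simplified rule: at every position the letters are equal or one is '1'
    return all(a == b or a == '1' or b == '1' for a, b in zip(p, q))

def valid_partition(partition):
    # every pair of chains inside every group must commute
    return all(commute(p, q)
               for group in partition
               for i, p in enumerate(group)
               for q in group[i + 1:])

print(commute('1z', 'zz'))  # True
print(commute('zz', 'x1'))  # False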
I'm trying to wrap my head around this whole thing and I can't seem to figure it out. Basically, I have a list of ints whose values add up to 15. I want to split the list into 2 parts while making the two parts as close as possible to each other in total sum. Sorry if I'm not explaining this well.
Example:
list = [4,1,8,6]
I want to achieve something like this:
list = [[8, 1], [6, 4]]
Adding up the first list gives 9 and the other gives 10. That's perfect for what I want, as they are as close as possible.
What I have now:
my_list = [4,1,8,6]
total_list_sum = 15
def divide_chunks(l, n):
    # looping till length l
    for i in range(0, len(l), n):
        yield l[i:i + n]
n = 2
x = list(divide_chunks(my_list, n))
print (x)
But, that just splits it up into 2 parts.
Any help would be appreciated!
You could use a recursive algorithm and "brute force" partitioning of the list. Starting with a target difference of zero and progressively increasing your tolerance to the difference between the two lists:
def sumSplit(left, right=[], difference=0):
    sumLeft, sumRight = sum(left), sum(right)
    # stop recursion if left is smaller than right
    if sumLeft < sumRight or len(left) < len(right): return
    # return a solution if sums match the tolerance target
    if sumLeft - sumRight == difference:
        return left, right, difference
    # recurse, brutally attempting to move each item to the right
    for i, value in enumerate(left):
        solution = sumSplit(left[:i] + left[i+1:], right + [value], difference)
        if solution: return solution
    if right or difference > 0: return
    # allow for imperfect split (i.e. larger difference) ...
    for targetDiff in range(1, sumLeft - min(left) + 1):
        solution = sumSplit(left, right, targetDiff)
        if solution: return solution
# sumSplit returns the two lists and the difference between their sums
print(sumSplit([4,1,8,6])) # ([1, 8], [4, 6], 1)
print(sumSplit([5,3,2,2,2,1])) # ([2, 2, 2, 1], [5, 3], 1)
print(sumSplit([1,2,3,4,6])) # ([1, 3, 4], [2, 6], 0)
Use itertools.combinations (see the itertools documentation for details). First let's define some functions:
def difference(sublist1, sublist2):
    return abs(sum(sublist1) - sum(sublist2))

def complement(sublist, my_list):
    complement = my_list[:]
    for x in sublist:
        complement.remove(x)
    return complement
The function difference calculates the "distance" between lists, i.e., how similar the sums of the two lists are. complement returns the elements of my_list that are not in sublist.
Finally, what you are looking for:
from itertools import combinations

def divide(my_list):
    lower_difference = sum(my_list) + 1
    for i in range(1, int(len(my_list)/2) + 1):
        for partition in combinations(my_list, i):
            partition = list(partition)
            remainder = complement(partition, my_list)
            diff = difference(partition, remainder)
            if diff < lower_difference:
                lower_difference = diff
                solution = [partition, remainder]
    return solution
test1 = [4,1,8,6]
print(divide(test1)) #[[4, 6], [1, 8]]
test2 = [5,3,2,2,2,1]
print(divide(test2)) #[[5, 3], [2, 2, 2, 1]]
Basically, it tries every possible division into sublists and returns the one with the minimum "distance".
If you want to make it a little bit faster you could return the first combination whose difference is 0.
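A possible variant with that early exit, reusing the difference and complement helpers from above (the name divide_early_exit is only for illustration):
def divide_early_exit(my_list):
    lower_difference = sum(my_list) + 1
    solution = None
    for i in range(1, int(len(my_list)/2) + 1):
        for partition in combinations(my_list, i):
            partition = list(partition)
            remainder = complement(partition, my_list)
            diff = difference(partition, remainder)
            if diff == 0:
                # a perfect split cannot be improved upon, so stop searching
                return [partition, remainder]
            if diff < lower_difference:
                lower_difference = diff
                solution = [partition, remainder]
    return solution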
I think what you're looking for is a hill climbing algorithm. I'm not sure this covers all cases, but it at least works for your example. I'll update this if I think of a counterexample or something.
Let's call your list of numbers vals.
vals.sort(reverse=True)
a, b = [], []
for v in vals:
    if sum(a) < sum(b):
        a.append(v)
    else:
        b.append(v)
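For instance, with the list from the question this greedy pass produces the same split as the desired output (only with the two halves swapped):
vals = [4, 1, 8, 6]
vals.sort(reverse=True)
a, b = [], []
for v in vals:
    if sum(a) < sum(b):
        a.append(v)
    else:
        b.append(v)
print(a, b)  # [6, 4] [8, 1] -> sums 10 and 9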
A friend of mine passed along an interview question he recently got, and I wasn't very happy with my approach to the solution. The question is as follows:
You have two lists.
Each list will contain lists of length 2, which represent a range (ie. [3,5] means a range from 3 to 5, inclusive).
You need to return the intersection of all ranges between the sets. If I give you [1,5] and [0,2], the result would be [1,2].
Within each list, the ranges will always increase and never overlap (i.e. it will be [[0, 2], [5, 10] ... ] never [[0,2], [2,5] ... ])
In general there are no "gotchas" in terms of the ordering or overlapping of the lists.
Example:
a = [[0, 2], [5, 10], [13, 23], [24, 25]]
b = [[1, 5], [8, 12], [15, 18], [20, 24]]
Expected output:
[[1, 2], [5, 5], [8, 10], [15, 18], [20, 24]]
My lazy solution involved spreading the list of ranges into a list of integers then doing a set intersection, like this:
def get_intersection(x, y):
    x_spread = [item for sublist in [list(range(l[0], l[1]+1)) for l in x] for item in sublist]
    y_spread = [item for sublist in [list(range(l[0], l[1]+1)) for l in y] for item in sublist]
    flat_intersect_list = list(set(x_spread).intersection(y_spread))
...
But I imagine there's a solution that's both readable and more efficient.
Please explain how you would mentally tackle this problem, if you don't mind. A time/space complexity analysis would also be helpful.
Thanks
[[max(first[0], second[0]), min(first[1], second[1])]
for first in a for second in b
if max(first[0], second[0]) <= min(first[1], second[1])]
This list comprehension gives the answer:
[[1, 2], [5, 5], [8, 10], [15, 18], [20, 23], [24, 24]]
Breaking it down:
[[max(first[0], second[0]), min(first[1], second[1])]
Take the maximum of the two start points and the minimum of the two end points.
for first in a for second in b
For every combination of a range from a and a range from b:
if max(first[0], second[0]) <= min(first[1], second[1])]
Keep the pair only if the maximum of the start points does not exceed the minimum of the end points, i.e. the two ranges actually overlap.
If you need the output compacted, then the following function does that (In O(n^2) time because deletion from a list is O(n), a step we perform O(n) times):
def reverse_compact(lst):
    for index in range(len(lst) - 2, -1, -1):
        if lst[index][1] + 1 >= lst[index + 1][0]:
            lst[index][1] = lst[index + 1][1]
            del lst[index + 1]  # remove compacted entry O(n)*
    return lst
It joins ranges which touch, given they are in-order. It does it in reverse because then we can do this operation in place and delete the compacted entries as we go. If we didn't do it in reverse, deleting other entries would muck with our index.
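Here comp is the result of running the comprehension above on the example lists a and b from the question:
comp = [[max(first[0], second[0]), min(first[1], second[1])]
        for first in a for second in b
        if max(first[0], second[0]) <= min(first[1], second[1])]
# [[1, 2], [5, 5], [8, 10], [15, 18], [20, 23], [24, 24]]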
>>> reverse_compact(comp)
[[1, 2], [5, 5], [8, 10], [15, 18], [20, 24]]
The compacting function can be reduced further to O(n) by doing a forward in-place compaction and copying back the elements, since each inner step is then O(1) (get/set instead of del), but this is less readable. This version runs in O(n) time and space:
def compact(lst):
    next_index = 0  # Keeps track of the last used index in our result
    for index in range(len(lst) - 1):
        if lst[next_index][1] + 1 >= lst[index + 1][0]:
            lst[next_index][1] = lst[index + 1][1]
        else:
            next_index += 1
            lst[next_index] = lst[index + 1]
    return lst[:next_index + 1]
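Usage mirrors reverse_compact; on the same comprehension output it gives:
>>> compact([[1, 2], [5, 5], [8, 10], [15, 18], [20, 23], [24, 24]])
[[1, 2], [5, 5], [8, 10], [15, 18], [20, 24]]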
Using either compactor, the list comprehension is the dominating term here, with time = O(n*m) and space = O(m+n), as it compares all possible combinations of the two lists with no early outs. This does not take advantage of the ordered structure of the lists given in the prompt: you could exploit that structure to reduce the time complexity to O(n + m), as they always increase and never overlap, meaning you can do all comparisons in a single pass.
Note there is more than one solution and hopefully you can solve the problem and then iteratively improve upon it.
A 100% correct answer which satisfies all possible inputs is not the goal of an interview question. It is to see how a person thinks and handles challenges, and whether they can reason about a solution.
In fact, if you give me a 100% correct, textbook answer, it's probably because you've seen the question before and you already know the solution... and therefore that question isn't helpful to me as an interviewer. 'Check, can regurgitate solutions found on StackOverflow.' The idea is to watch you solve a problem, not regurgitate a solution.
Too many candidates miss the forest for the trees: acknowledging shortcomings and suggesting solutions is the right way to go about answering an interview question. You don't have to have a solution, you have to show how you would approach the problem.
Your solution is fine if you can explain it and detail potential issues with using it.
I got my current job by failing to answer an interview question: After spending the majority of my time trying, I explained why my approach didn't work and the second approach I would try given more time, along with potential pitfalls I saw in that approach (and why I opted for my first strategy initially).
OP, I believe this solution works, and it runs in O(m+n) time where m and n are the lengths of the lists. (To be sure, make ranges a linked list so that changing its length runs in constant time.)
def intersections(a, b):
    ranges = []
    i = j = 0
    while i < len(a) and j < len(b):
        a_left, a_right = a[i]
        b_left, b_right = b[j]

        if a_right < b_right:
            i += 1
        else:
            j += 1

        if a_right >= b_left and b_right >= a_left:
            end_pts = sorted([a_left, a_right, b_left, b_right])
            middle = [end_pts[1], end_pts[2]]
            ranges.append(middle)

    ri = 0
    while ri < len(ranges) - 1:
        if ranges[ri][1] == ranges[ri + 1][0]:
            ranges[ri:ri + 2] = [[ranges[ri][0], ranges[ri + 1][1]]]
        ri += 1

    return ranges
a = [[0,2], [5,10], [13,23], [24,25]]
b = [[1,5], [8,12], [15,18], [20,24]]
print(intersections(a, b))
# [[1, 2], [5, 5], [8, 10], [15, 18], [20, 24]]
Algorithm
Given two intervals, if they overlap, then the intersection's starting point is the maximum of the two starting points, and its stopping point is the minimum of the two stopping points.
To find all the pairs of intervals that might intersect, start with the first pair and keep incrementing whichever interval has the lower stopping point.
At most m + n pairs of intervals are considered, where m is length of the first list, and n is the length of the second list. Calculating the intersection of a pair of intervals is done in constant time, so this algorithm's time-complexity is O(m+n).
Implementation
To keep the code simple, I'm using Python's built-in range object for the intervals. This is a slight deviation from the problem description in that ranges are half-open intervals rather than closed. That is,
(x in range(a, b)) == (a <= x < b)
Given two range objects x and y, their intersection is range(start, stop), where start = max(x.start, y.start) and stop = min(x.stop, y.stop). If the two ranges don't overlap, then start >= stop and you just get an empty range:
>>> len(range(1, 0))
0
So given two lists of ranges, xs and ys, each increasing in start value, the intersection can be computed as follows:
def intersect_ranges(xs, ys):
    # Merge any abutting ranges (implementation below):
    xs, ys = merge_ranges(xs), merge_ranges(ys)

    # Try to get the first range in each iterator:
    try:
        x, y = next(xs), next(ys)
    except StopIteration:
        return

    while True:
        # Yield the intersection of the two ranges, if it's not empty:
        intersection = range(
            max(x.start, y.start),
            min(x.stop, y.stop)
        )
        if intersection:
            yield intersection

        # Try to increment the range with the earlier stopping value:
        try:
            if x.stop <= y.stop:
                x = next(xs)
            else:
                y = next(ys)
        except StopIteration:
            return
It seems from your example that the ranges can abut. So any abutting ranges have to be merged first:
def merge_ranges(xs):
    start, stop = None, None
    for x in xs:
        if stop is None:
            start, stop = x.start, x.stop
        elif stop < x.start:
            yield range(start, stop)
            start, stop = x.start, x.stop
        else:
            stop = x.stop
    yield range(start, stop)
Applying this to your example:
>>> a = [[0, 2], [5, 10], [13, 23], [24, 25]]
>>> b = [[1, 5], [8, 12], [15, 18], [20, 24]]
>>> list(intersect_ranges(
... (range(i, j+1) for (i, j) in a),
... (range(i, j+1) for (i, j) in b)
... ))
[range(1, 3), range(5, 6), range(8, 11), range(15, 19), range(20, 25)]
I know this question already has a correct answer. For completeness, I would like to mention that some time ago I developed a Python library, namely portion (https://github.com/AlexandreDecan/portion), that supports this kind of operation (intersections between lists of atomic intervals).
You can have a look at the implementation; it's quite close to some of the answers provided here: https://github.com/AlexandreDecan/portion/blob/master/portion/interval.py#L406
To illustrate its usage, let's consider your example:
a = [[0, 2], [5, 10], [13, 23], [24, 25]]
b = [[1, 5], [8, 12], [15, 18], [20, 24]]
We need to convert these "items" to closed (atomic) intervals first:
import portion as P
a = [P.closed(x, y) for x, y in a]
b = [P.closed(x, y) for x, y in b]
print(a)
... displays [[0,2], [5,10], [13,23], [24,25]] (each [x,y] is an Interval object).
Then we can create an interval that represents the union of these atomic intervals:
a = P.Interval(*a)
b = P.Interval(*b)
print(b)
... displays [0,2] | [5,10] | [13,23] | [24,25] (a single Interval object, representing the union of all the atomic ones).
And now we can easily compute the intersection:
c = a & b
print(c)
... displays [1,2] | [5] | [8,10] | [15,18] | [20,23] | [24].
Notice that our answer differs from yours ([20,23] | [24] instead of [20,24]) since the library expects continuous domains for values. We can quite easily convert the results to discrete intervals following the approach proposed in https://github.com/AlexandreDecan/portion/issues/24#issuecomment-604456362 as follows:
def discretize(i, incr=1):
    first_step = lambda s: (P.OPEN, (s.lower - incr if s.left is P.CLOSED else s.lower), (s.upper + incr if s.right is P.CLOSED else s.upper), P.OPEN)
    second_step = lambda s: (P.CLOSED, (s.lower + incr if s.left is P.OPEN and s.lower != -P.inf else s.lower), (s.upper - incr if s.right is P.OPEN and s.upper != P.inf else s.upper), P.CLOSED)
    return i.apply(first_step).apply(second_step)
print(discretize(c))
... displays [1,2] | [5] | [8,10] | [15,18] | [20,24].
I'm not much of a Python programmer, but I don't think this problem is amenable to slick, Python-esque short solutions that are also efficient.
Mine treats the interval boundaries as "events" labeled 1 and 2, processing them in order. Each event toggles the respective bit in a parity word. When we toggle to or from 3, it's time to emit the beginning or end of an intersection interval.
The tricky part is that e.g. [13, 23], [24, 25] is being treated as [13, 25]; adjacent intervals must be concatenated. The nested if below takes care of this case by continuing the current interval rather than starting a new one. Also, for equal event values, interval starts must be processed before ends so that e.g. [1, 5] and [5, 10] will be emitted as [5, 5] rather than nothing. That's handled with the middle field of the event tuples.
This implementation is O(n log n) due to the sorting, where n is the total length of both inputs. By merging the two event lists pairwise, it could be O(n), but this article suggests that the lists must be huge before the library merge will beat the library sort.
def get_isect(a, b):
    # Build (value, start/end flag, list id) events; starts sort before ends.
    events = ([(x[0], 0, 1) for x in a] + [(x[1], 1, 1) for x in a]
              + [(x[0], 0, 2) for x in b] + [(x[1], 1, 2) for x in b])
    events.sort()
    prevParity = 0
    isect = []
    for event in events:
        parity = prevParity ^ event[2]
        if parity == 3:
            # Maybe start a new intersection interval.
            if len(isect) == 0 or isect[-1][1] < event[0] - 1:
                isect.append([event[0], 0])
        elif prevParity == 3:
            # End the current intersection interval.
            isect[-1][1] = event[0]
        prevParity = parity
    return isect
Here is an O(n) version that's a bit more complex because it finds the next event on the fly by merging the input lists. It also requires only constant storage beyond inputs and output:
def get_isect2(a, b):
    ia = ib = prevParity = 0
    isect = []
    while True:
        aVal = a[ia // 2][ia % 2] if ia < 2 * len(a) else None
        bVal = b[ib // 2][ib % 2] if ib < 2 * len(b) else None
        if aVal is None and bVal is None:
            break
        # Take the next event from a if b is exhausted, a's value is smaller,
        # or the values tie and a's event is an interval start.
        if bVal is None or (aVal is not None
                            and (aVal < bVal or (aVal == bVal and ia % 2 == 0))):
            parity = prevParity ^ 1
            val = aVal
            ia += 1
        else:
            parity = prevParity ^ 2
            val = bVal
            ib += 1
        if parity == 3:
            if len(isect) == 0 or isect[-1][1] < val - 1:
                isect.append([val, 0])
        elif prevParity == 3:
            isect[-1][1] = val
        prevParity = parity
    return isect
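On the example lists from the question, both versions should produce the same compacted result:
a = [[0, 2], [5, 10], [13, 23], [24, 25]]
b = [[1, 5], [8, 12], [15, 18], [20, 24]]
print(get_isect(a, b))   # [[1, 2], [5, 5], [8, 10], [15, 18], [20, 24]]
print(get_isect2(a, b))  # [[1, 2], [5, 5], [8, 10], [15, 18], [20, 24]]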
I'm answering this the way I would personally answer an interview question, and the way I would most appreciate it being answered: the interviewee's goal is usually to demonstrate a range of skills, not limited strictly to Python. So this answer is admittedly more abstract than the others here.
It might be helpful to ask for information about any constraints I'm operating under. Operation time and space complexity are common constraints, as is development time, all of which are mentioned in previous answers here; but other constraints might also arise. As common as any of those is maintenance and integration with existing code.
Within each list, the ranges will always increase and never overlap
When I see this, it probably means there is some pre-existing code to normalize the list of ranges, that sorts ranges and merges overlap. That's a pretty common union operation. When joining an existing team or ongoing project, one of the most important factors for success is integrating with existing patterns.
An intersection can also be performed via a union operation: invert the sorted ranges, union them, and invert the result.
To me, that answer demonstrates experience with algorithms generally and "range" problems specifically, an appreciation that the most readable and maintainable code approach is typically reusing existing code, and a desire to help a team succeed over simply puzzling on my own.
Another approach is to sort both lists together into one iterable list. Iterate the list, reference counting each start/end as increment/decrement steps. Ranges are emitted on transitions between reference counts of 1 and 2. This approach is inherently extensible to support more than two lists, if the sort operation meets our needs (and they usually do).
Unless instructed otherwise, I would offer the general approaches and discuss reasons I might use each before writing code.
So, there's no code here. But you did ask for general approaches and thinking :D
I have a little game I've been making for a school project, and it has worked up until now.
I used a very messy nested list system for multiple screens, each with a 2D array for objects on screen. These 2D "level" arrays are also arranged in their own 2D array, which makes up the "world". The strings correspond to an object tile, which is drawn using pygame.
My problem is that every level array is the same in the world array, and I can't understand why that is.
def generate_world(load):
    # This bit not important
    if load is True:
        in_array()
    # This is
    else:
        for world_y in Game_world.game_array:
            for world_x in world_y:
                generate_clutter(world_x)
        print(Game_world.game_array)
        out_array()
        # Current_level.array = Level.new_level_array
def generate_clutter(world_x):
    for level_y in world_x:
        for level_x, _ in enumerate(level_y):
            ### GENERATE CLUTTER ###
            i = randrange(1, 24)
            if i == 19 or i == 20:
                level_y[level_x] = "g1"
            elif i == 21 or i == 22:
                level_y[level_x] = "g2"
            elif i == 23:
                level_y[level_x] = "c1"
            else:
                level_y[level_x] = "-"
I'm sure it's something simple I'm overlooking, but to me it seems the random generation should be carried out for every single list item individually, so I can't understand the duplication.
I know quadruple nested lists aren't pretty, but I think I'm in too deep to make any serious changes now.
EDIT:
This is the gist of how the lists/arrays are initially created. Their size doesn't ever change, existing strings are just replaced.
class World:
    def __init__(self, name, load):
        if load is False:
            n = [["-" for x in range(20)] for x in range(15)]
            self.game_array = [[n, n, n, n, n, n, n],
                               [n, n, n, n, n, n, n],
                               [n, n, n, n, n, n, n]]
In Python, everything is an object - even integer values. How you initialize an 'empty' array can have some surprising results.
Consider this initialization:
>>> l=[[1]*2]*2
>>> l
[[1, 1], [1, 1]]
You appear to have created a 2x2 matrix with each cell containing the value 1. In fact, you have created a list of two lists (each containing [1,1]). Deeper still, you have created a list of two references to a single list [1,1].
The results of this can be seen if you now modify one of the cells
>>> l[0][0]=2
>>> l
[[2, 1], [2, 1]]
>>>
Notice that both l[0][0] and l[1][0] were modified.
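One way to see the aliasing directly is to check object identity:
>>> l[0] is l[1]
True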
To avoid this effect, you need to jump through some hoops
>>> l2 = [[1 for _ in range(2)] for _ in range(2)]
>>> l2
[[1, 1], [1, 1]]
>>> l2[0][0]=2
>>> l2
[[2, 1], [1, 1]]
>>>
If you used the former approach to initialize Game_world.game_array, every assignment to level_y[level_x] will modify multiple cells in your array.
Just as an additional comment, your generate_clutter function can be simplified slightly using a dict
def generate_clutter(world_x):
    clutter_map = {19: "g1", 20: "g1", 21: "g2", 22: "g2", 23: "c1"}
    for level_y in world_x:
        for level_x, _ in enumerate(level_y):
            level_y[level_x] = clutter_map.get(randrange(1, 24), '-')
This separates the logic of selecting the clutter representation from the actual mapping of values and will be much easier to expand and maintain.
Looking at your edit, the initialization needs to be something like:
self.game_array = [
    [
        [
            ["-" for x in range(20)]
            for x in range(15)
        ]
        for x in range(7)
    ]
    for x in range(3)
]
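A quick sanity check (the variable names here are only for illustration) that the nested-comprehension version creates independent levels, while the shared-reference version does not:
n = [["-" for x in range(20)] for x in range(15)]
aliased = [[n] * 7 for _ in range(3)]
print(aliased[0][0] is aliased[2][6])  # True: every "level" is the same list object

fixed = [[[["-" for x in range(20)] for x in range(15)]
          for x in range(7)] for x in range(3)]
print(fixed[0][0] is fixed[2][6])      # False: each level is its own list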