How to calculate all interleavings of two lists? - python

I want to create a function that takes in two lists, the lists are not guaranteed to be of equal length, and returns all the interleavings between the two lists.
Input: Two lists that do not have to be equal in size.
Output: All possible interleavings between the two lists that preserve the original list's order.
Example: AllInter([1,2],[3,4]) -> [[1,2,3,4], [1,3,2,4], [1,3,4,2], [3,1,2,4], [3,1,4,2], [3,4,1,2]]
I do not want a solution. I want a hint.

Itertools would not be capable enough to handle this problem and would require some bit of understanding of the pegs and holes problem
Consider your example list
A = [1, 2 ]
B = [3, 4 ]
There are four holes (len(A) + len(B)) where you can place the elements (pegs)
| || || || |
|___||___||___||___|
In Python you can represent empty slots as a list of None
slots = [None]*(len(A) + len(B))
The number of ways you can place the elements (pegs) from the second list into the pegs is simply, the number of ways you can select slots from the holes which is
len(A) + len(B)Clen(B)
= 4C2
= itertools.combinations(range(0, len(len(A) + len(B)))
which can be depicted as
| || || || | Slots
|___||___||___||___| Selected
3 4 (0,1)
3 4 (0,2)
3 4 (0,3)
3 4 (1,2)
3 4 (1,3)
3 4 (2,3)
Now for each of these slot position fill it with elements (pegs) from the second list
for splice in combinations(range(0,len(slots)),len(B)):
it_B = iter(B)
for s in splice:
slots[s] = next(it_B)
Once you are done, you just have to fill the remaining empty slots with the elements (pegs) from the first list
it_A = iter(A)
slots = [e if e else next(it_A) for e in slots]
Before you start the next iteration with another slot position, flush your holes
slots = [None]*(len(slots))
A Python implementation for the above approach
def slot_combinations(A,B):
slots = [None]*(len(A) + len(B))
for splice in combinations(range(0,len(slots)),len(B)):
it_B = iter(B)
for s in splice:
slots[s] = next(it_B)
it_A = iter(A)
slots = [e if e else next(it_A) for e in slots]
yield slots
slots = [None]*(len(slots))
And the O/P from the above implementation
list(slot_combinations(A,B))
[[3, 4, 1, 2], [3, 1, 4, 2], [3, 1, 2, 4], [1, 3, 4, 2], [1, 3, 2, 4], [1, 2, 3, 4]]

Hint: Suppose each list had the same elements (but different between the list), i.e. one list was completely Red (say r of them), and the other was Blue (say b of them).
Each element of the output contains r+b or them, r of which are red.
Seems like others have spoilt it for you, even though you didn't ask for a solution (but they have a very good explanation)
So here it the code I wrote up quickly.
import itertools
def interleave(lst1, lst2):
r,b = len(lst1), len(lst2)
for s in itertools.combinations(xrange(0,r+b), r):
lst = [0]*(r+b)
tuple_idx,idx1,idx2 = 0,0,0
for i in xrange(0, r+b):
if tuple_idx < r and i == s[tuple_idx]:
lst[i] = lst1[idx1]
idx1 += 1
tuple_idx += 1
else:
lst[i] = lst2[idx2]
idx2 += 1
yield lst
def main():
for s in interleave([1,2,3], ['a','b']):
print s
if __name__ == "__main__":
main()
The basic idea is to map the output to (r+b) choose r combinations.

As suggested by #airza, the itertools module is your friend.
If you want to avoid using encapsulated magical goodness, my hint is to use recursion.
Start playing the process of generating the lists in your mind, and when you notice you're doing the same thing again, try to find the pattern. For example:
Take the first element from the first list
Either take the 2nd, or the first from the other list
Either take the 3rd, or the 2nd if you didn't, or another one from the other list
...
Okay, that is starting to look like there's some greater logic we're not using. I'm just incrementing the numbers. Surely I can find a base case that works while changing the "first element, instead of naming higher elements?
Play with it. :)

You can try something a little closer to the metal and more elegant(in my opinion) iterating through different possible slices. Basically step through and iterate through all three arguments to the standard slice operation, removing anything added to the final list. Can post code snippet if you're interested.

Related

Splitting list into 2 parts, as equal to sum as possible

I'm trying to wrap my head around this whole thing and I can't seem to figure it out. Basically, I have a list of ints. Adding up those int values equals 15. I want to split up a list into 2 parts, but at the same time, making each list as close as possible to each other in total sum. Sorry if I'm not explaining this good.
Example:
list = [4,1,8,6]
I want to achieve something like this:
list = [[8, 1][6,4]]
adding the first list up equals 9, and the other equals 10. That's perfect for what I want as they are as close as possible.
What I have now:
my_list = [4,1,8,6]
total_list_sum = 15
def divide_chunks(l, n):
# looping till length l
for i in range(0, len(l), n):
yield l[i:i + n]
n = 2
x = list(divide_chunks(my_list, n))
print (x)
But, that just splits it up into 2 parts.
Any help would be appreciated!
You could use a recursive algorithm and "brute force" partitioning of the list. Starting with a target difference of zero and progressively increasing your tolerance to the difference between the two lists:
def sumSplit(left,right=[],difference=0):
sumLeft,sumRight = sum(left),sum(right)
# stop recursion if left is smaller than right
if sumLeft<sumRight or len(left)<len(right): return
# return a solution if sums match the tolerance target
if sumLeft-sumRight == difference:
return left, right, difference
# recurse, brutally attempting to move each item to the right
for i,value in enumerate(left):
solution = sumSplit(left[:i]+left[i+1:],right+[value], difference)
if solution: return solution
if right or difference > 0: return
# allow for imperfect split (i.e. larger difference) ...
for targetDiff in range(1, sumLeft-min(left)+1):
solution = sumSplit(left, right, targetDiff)
if solution: return solution
# sumSplit returns the two lists and the difference between their sums
print(sumSplit([4,1,8,6])) # ([1, 8], [4, 6], 1)
print(sumSplit([5,3,2,2,2,1])) # ([2, 2, 2, 1], [5, 3], 1)
print(sumSplit([1,2,3,4,6])) # ([1, 3, 4], [2, 6], 0)
Use itertools.combinations (details here). First let's define some functions:
def difference(sublist1, sublist2):
return abs(sum(sublist1) - sum(sublist2))
def complement(sublist, my_list):
complement = my_list[:]
for x in sublist:
complement.remove(x)
return complement
The function difference calculates the "distance" between lists, i.e, how similar the sums of the two lists are. complement returns the elements of my_list that are not in sublist.
Finally, what you are looking for:
def divide(my_list):
lower_difference = sum(my_list) + 1
for i in range(1, int(len(my_list)/2)+1):
for partition in combinations(my_list, i):
partition = list(partition)
remainder = complement(partition, my_list)
diff = difference(partition, remainder)
if diff < lower_difference:
lower_difference = diff
solution = [partition, remainder]
return solution
test1 = [4,1,8,6]
print(divide(test1)) #[[4, 6], [1, 8]]
test2 = [5,3,2,2,2,1]
print(divide(test2)) #[[5, 3], [2, 2, 2, 1]]
Basically, it tries with every possible division of sublists and returns the one with the minimum "distance".
If you want to make it a a little bit faster you could return the first combination whose difference is 0.
I think what you're looking for is a hill climbing algorithm. I'm not sure this will cover all cases but at least works for your example. I'll update this if I think of a counter example or something.
Let's call your list of numbers vals.
vals.sort(reverse=true)
a,b=[],[]
for v in vals:
if sum(a)<sum(b):
a.append(v)
else:
b.append(v)

multiple list generation from a list based on condition in python

I have a Pandas series (which could be a list, this is not very important) of lists which contains (to simplify, but that could also be letters of words) positive and negative number,
such as
0 [12,-13,0,6]
1 [2,-3,8,233]
2 [0,6,8,3]
for each of these, i want to fill a row in a three columns data frame, with a list of all positive values, a list of all negative values, and a list of all values comprised in some interval. Such as:
[[12,6],[-13],[0,6]]
[[2,8,233],[-3],[2,8]]
[[6,8,3],[],[6,8,3]]
What I first thought was using a list comprehension to generate a list of triadic lists of lists, which would be converted using pd.DataFrame to the right form.
This was because i don't want to loop over the list of lists 3 times to apply each time a new choice heuristics, feels slow and dull.
But the problem is that I can't actually generate well the lists of the triad [[positive],[negative], [interval]].
I was using a syntax like
[[[positivelist.extend(number)],[negativelist], [intervalist.extend(number)]]\
for listofnumbers in listoflists for number in listofnumbers\
if number>0 else [positivelist],[negativelist.extend(number)], [intervalist.extend(number)]]
but let be honest, this is unreadable, and anyway it doesn't do what I want since extend yields none.
So how could I go about that without looping three times (I could have many millions elements in the list of lists, and in the sublists, and I might want to apply more complexe formulae to these numbers, too, it is a first approach)?
I thought about using functional programming, map/lambda; but it is unpythonic. The catch is: what in python may help to do it right?
My guess would be something as:
newlistoflist=[]
for list in lists:
positive=[]
negative=[]
interval=[]
for element in list:
positive.extend(element) if element>0
negative.extend(element) if element<0
interval.extend(element) if n<element<m
triad=[positive, negative,interval]
newlistoflist.append(triad)
what do you think?
You can do:
import numpy
l = [[12,-13,0,6], [2,-3,8,233], [0,6,8,3]]
l = numpy.array([x for e in l for x in e])
positive = l[l>0]
negative = l[l<0]
n,m = 1,5
interval = l[((l>n) & (l<m))]
print positive, negative, interval
Output: [ 12 6 2 8 233 6 8 3] [-13 -3] [2 3]
Edit: Triad version:
import numpy
l = numpy.array([[12,-13,0,6], [2,-3,8,233], [0,6,8,3]])
n,m = 1,5
triad = numpy.array([[e[e>0], e[e<0], e[((e>n) & (e<m))]] for e in l])
print triad
Output:
[[array([12, 6]) array([-13]) array([], dtype=int64)]
[array([ 2, 8, 233]) array([-3]) array([2])]
[array([6, 8, 3]) array([], dtype=int64) array([3])]]

having trouble understanding this code

I just started learning recursion and I have an assignment to write a program that tells the nesting depth of a list. Well, I browsed around and found working code to do this, but I'm still having trouble understanding how it works. Here's the code:
def depth(L) :
nesting = []
for c in L:
if type(c) == type(nesting) :
nesting.append(depth(c))
if len(nesting) > 0:
return 1 + max(nesting)
return 1
So naturally, I start to get confused at the line with the append that calls recursion. Does anyone have a simple way of explaining what's going on here? I'm not sure what is actually being appended, and going through it with test cases in my head isn't helping. Thanks!
edit: sorry if the formatting is poor, I typed this from my phone
Let me show it to you the easy way, change the code like this:
(### are the new lines I added to your code so you can watch what is happening there)
def depth(L) :
nesting = []
for c in L:
if type(c) == type(nesting) :
print 'nesting before append', nesting ###
nesting.append(depth(c))
print 'nesting after append', nesting ###
if len(nesting) > 0:
return 1 + max(nesting)
return 1
Now lets make a list with the depth of three:
l=[[1,2,3],[1,2,[4]],'asdfg']
You can see our list has 3 element. one of them is a list, the other is a list which has another list in itself and the last one is a string. You can clearly see the depth of this list is 3 (i.e there are 2 lists nested together in the second element of the main list)
Lets run this code:
>>> depth(l)
nesting before append []
nesting after append [1]
nesting before append [1]
nesting before append []
nesting after append [1]
nesting after append [1, 2]
3
Piece of cake! this function appends 1 to the nesting. then if the element has also another list it appends 1 + maximum number in nesting which is the number of time function has been called itself. and if the element is a string, it skips it.
At the end, it returns the maximum number in the nesting which is the maximum number of times recursion happened, which is the number of time there is a list inside list in the main list, aka depth. In our case recursion happened twice for the second element + 1=3 as we expected.
If you still have problem getting it, try to add more print statements or other variables to the function and watch them carefully and eventually you'll get it.
So what this seems to be is a function that takes a list and calculates, as you put it, the nesting depth of it. nesting is a list, so what if type(c) == type(nesting) is saying is: if the item in list L is a list, run the function again and append it and when it runs the function again, it will do the same test until there are no more nested lists in list L and then return 1 + the max amount of nested lists because every list has a depth of 1.
Please tell me if any of this is unclear
Let's start with a couple of examples.
First, let's consider a list with only one level of depth. For Example, [1, 2, 3].
In the above list, the code starts with a call to depth() with L = [1, 2, 3]. It makes an empty list nesting. Iterates over all the elements of L i.e 1, 2, 3 and does not find a single element which passes the test type(c) == type(nesting). The check that len(nesting) > 0 fails and the code returns a 1, which is the depth of the list.
Next, let's take an example with a depth of 2, i.e [[1, 2], 3]. The function depth() is called with L = [[1, 2], 3] and an empty list nesting is created. The loop iterates over the 2 elements of L i.e [1, 2] , 3 and since type([1, 2]) == type(nesting), nesting.append(depth(c)) is called. Similar to the previous example, depth(c) i.e depth([1, 2]) returns a 1 and nesting now becomes [1]. After the execution of the loop, the code evaluates the test len(nesting) > 0 which results in True and 1 + max(nesting) which is 1 + 1 = 2 is returned.
Similarly, the code follows for the depth 3 and so on.
Hope this was helpful.
This algorithm visits the nested lists and adds one for each level of recursion. The call chain is like this:
depth([1, 2, [3, [4, 5], 6], 7]) =
1 + depth([3, [4, 5], 6]) = 3
1 + depth([4, 5]) = 2
1
Since depth([4,5]) never enters the if type(c) == type(nesting) condition because no element is a list, it returns 1 from the outer return, which is the base case.
In the case where, for a given depth, you have more than one nested list, e.g. [1, [2, 3], [4, [5, 6]], both the max depth of [2,3]and [4, [5, 6]] are appended on a depth call, of which the max is returned by the inside return.

Python: Change the parameter of the loop while the loop is running

I want to change a in the for-loop to [4,5,6].
This code just print: 1, 2, 3
a = [1,2,3]
for i in a:
global a
a = [4,5,6]
print i
I want the ouput 1, 4, 5, 6.
You'll need to clarify the question because there is no explanation of how you should derive the desired output 1, 4, 5, 6 when your input is [1, 2, 3]. The following produces the desired output, but it's completely ad-hoc and makes no sense:
i = 0
a = [1, 2, 3]
while i < len(a):
print(a[i])
if a[i] == 1:
a = [4, 5, 6]
i = 0 # edit - good catch larsmans
else:
i += 1
The main point is that you can't modify the parameters of a for loop while the loop is executing. From the python documentation:
It is not safe to modify the sequence being iterated over in the loop
(this can only happen for mutable sequence types, such as lists). If
you need to modify the list you are iterating over (for example, to
duplicate selected items) you must iterate over a copy.
Edit: if based on the comments you are trying to walk URLs, you need more complicated logic to do a depth-first or breadth-first walk than just replacing one list (the top-level links) with another list (links in the first page). In your example you completely lose track of pages 2 and 3 after diving into page 1.
The issue is that the assignment
a = [4,5,6]
just changes the variable a, not the underlying object. There are various ways you could deal with this; one would be to use a while loop like
a = [1,2,3]
i = 0
while i<len(a):
print a[i]
a = [4,5,6]
i += 1
prints
1
5
6
If you print id(a) at useful points in your code you'll realise why this doesn't work.
Even something like this does not work:
a = [1,2,3]
def change_a(new_val):
a = new_val
for i in a:
change_a([4,5,6])
print i
I don't think it is possible to do what you want. Break out of the current loop and start a new one with your new value of a.

how to match two numpy array of unequal length?

i have two 1D numpy arrays. The lengths are unequal. I want to make pairs (array1_elemnt,array2_element) of the elements which are close to each other. Lets consider following example
a = [1,2,3,8,20,23]
b = [1,2,3,5,7,21,35]
The expected result is
[(1,1),
(2,2),
(3,3),
(8,7),
(20,21),
(23,25)]
It is important to note that 5 is left alone. It could easily be done by loops but I have very large arrays. I considered using nearest neighbor. But felt like killing a sparrow with a canon.
Can anybody please suggest any elegant solution.
Thanks a lot.
How about using the Needleman-Wunsch algorithm? :)
The scoring matrix would be trivial, as the "distance" between two numbers is just their difference.
But that will probably feel like killing a sparrow with a tank ...
You could use the built in map function to vectorize a function that does this. For example:
ar1 = np.array([1,2,3,8,20,23])
ar2 = np.array([1,2,3,5,7,21,35])
def closest(ar1, ar2, iter):
x = np.abs(ar1[iter] - ar2)
index = np.where(x==x.min())
value = ar2[index]
return value
def find(x):
return closest(ar1, ar2, x)
c = np.array(map(find, range(ar1.shape[0])))
In the example above, it looked like you wanted to exclude values once they had been paired. In that case, you could include a removal process in the first function like this, but be very careful about how array 1 is sorted:
def closest(ar1, ar2, iter):
x = np.abs(ar1[iter] - ar2)
index = np.where(x==x.min())
value = ar2[index]
ar2[ar2==value] = -10000000
return value
The best method I can think of is use a loop. If loop in python is slow, you can use Cython to speedup you code.
I think one can do it like this:
create two new structured arrays, such that there is a second index which is 0 or 1 indicating to which array the value belongs, i.e. the key
concatenate both arrays
sort the united array along the first field (the values)
use 2 stacks: go through the array putting elements with key 1 on the left stack, and when you cross an element with key 0, put them in the right stack. When you reach the second element with key 0, for the first with key 0 check the top and bottom of the left and right stacks and take the closest value (maybe with a maximum distance), switch stacks and continue.
sort should be slowest step and max total space for the stacks is n or m.
You can do the following:
a = np.array([1,2,3,8,20,23])
b = np.array([1,2,3,5,7,21,25])
def find_closest(a, sorted_b):
j = np.searchsorted(.5*(sorted_b[1:] + sorted_b[:-1]), a, side='right')
return b[j]
b.sort() # or, b = np.sort(b), if you don't want to modify b in-place
print np.c_[a, find_closest(a, b)]
# ->
# array([[ 1, 1],
# [ 2, 2],
# [ 3, 3],
# [ 8, 7],
# [20, 21],
# [23, 25]])
This should be pretty fast. How it works is that searchsorted will find for each number a the index into the b past the midpoint between two numbers, i.e., the closest number.

Categories