Splitting arrays in Python - python

I have the following problem: I would like to find different "cuts" of the array into two different arrays by adding one element each time, for example:
If I have an array
a = [0,1,2,3]
The following splits are desired:
[0] [1,2,3]
[0,1] [2,3]
[0,1,2] [3]
In the past I had easier tasks so np.split() function was quite enough for me. How should I act in this particular case?
Many thanks in advance and apologies if this question was asked before.

Use slicing. For more details, see "Understanding slicing".
a = [0,1,2,3]
for i in range(len(a)-1):
    print(a[:i+1], a[i+1:])
Output:
[0] [1, 2, 3]
[0, 1] [2, 3]
[0, 1, 2] [3]

Check this out:
a = [0,1,2,3]
result = [(a[:x], a[x:]) for x in range(1, len(a))]
print(result)
# [([0], [1, 2, 3]), ([0, 1], [2, 3]), ([0, 1, 2], [3])]
# you can access result like normal list
print(result[0])
# ([0], [1, 2, 3])
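For reuse, the same slicing idea can be wrapped in a small generator. This is only a sketch: the name two_way_splits is made up for illustration, and it assumes the input supports len() and slicing (lists, tuples and strings all do).

```python
def two_way_splits(seq):
    """Yield every (head, tail) cut of seq where both parts are non-empty.

    two_way_splits is a hypothetical helper name, not a library function.
    """
    for cut in range(1, len(seq)):
        yield seq[:cut], seq[cut:]


for head, tail in two_way_splits([0, 1, 2, 3]):
    print(head, tail)
```

A sequence with fewer than two elements yields nothing, because no cut leaves both sides non-empty.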

Related

Removing `n` elements from each row in `2D` numpy where `n` varies row by row

How would one remove n elements from 2D numpy array where n varies in each row?
For example:
# input array
[[1,2,3],
[3,1,2],
[1,2,3],
[4,5,2],
[5,6,7]]
with
# n elements to remove from each row
[0, 2, 1, 2, 1]
would result in:
[[1,2,3],
[3],
[1, 2],
[4],
[5,6]]
Do note that the result does not need to be a numpy array (and won't be, as Michael noticed in the comments), just an arbitrary Python list of lists.
As has been pointed out in the comments, the result wouldn't be a numpy array.
Given the example data from your question:
>>> a = [[1,2,3],
... [3,1,2],
... [1,2,3],
... [4,5,2],
... [5,6,7]]
>>> remove_n_tail_list = [0, 2, 1, 2, 1]
You could for example use a list comprehension to get the desired result:
>>> [row[:len(row) - remove_n_tail] for row, remove_n_tail in zip(a, remove_n_tail_list)]
[[1, 2, 3], [3], [1, 2], [4], [5, 6]]
In that solution, row[:len(row) - remove_n_tail] is selecting the values up to the length of the row (len(row)), minus the number of values you want to remove from the end (remove_n_tail).
There are various methods to achieve similar results. You might find the Most efficient way to map function over numpy array question interesting.
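One reason to write the slice as row[:len(row) - n] rather than the shorter row[:-n] is the n == 0 case: -0 is just 0, so row[:-0] is row[:0], which is empty. A minimal sketch of the difference follows. The name trim_tails is hypothetical, and the code assumes every count satisfies 0 <= n <= len(row).

```python
def trim_tails(rows, counts):
    """Drop counts[k] trailing items from rows[k]; returns a list of lists.

    trim_tails is an illustrative helper, not part of numpy or the standard library.
    Assumes 0 <= counts[k] <= len(rows[k]) for every k.
    """
    return [list(row[:len(row) - n]) for row, n in zip(rows, counts)]


rows = [[1, 2, 3], [3, 1, 2], [1, 2, 3], [4, 5, 2], [5, 6, 7]]
counts = [0, 2, 1, 2, 1]

print(trim_tails(rows, counts))

# The shorter spelling silently empties rows whose count is 0:
naive = [row[:-n] for row, n in zip(rows, counts)]
print(naive[0])  # [] instead of [1, 2, 3]
```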
Here is a solution I came up with. Give it a try:
arr = [[1,2,3],
       [3,1,2],
       [1,2,3],
       [4,5,2],
       [5,6,7]]
new_arr = []
lst = [0, 2, 1, 2, 1]
count = 0
for i in arr:
    remove = lst[count]
    count = count+1
    temp = i[:len(i)-remove]
    new_arr.append(temp)
print(new_arr)
If arr is your numpy array and rem is a list of removal counts, a simple solution could be (arr.shape[1] is the row length, 3 in this example):
res = [arr[i][:arr.shape[1]-rem[i]].tolist() for i in range(len(rem))]
Output:
[[1, 2, 3], [3], [1, 2], [4], [5, 6]]

Permuting characters in a string

Warning: this question is not what you think!
Suppose I have a string like this (Python):
'[[1, 2], [2, 3], [0, 3]]'
Now suppose further that I have the permutation of the characters 0, 1, 2, 3 which swaps 0 and 1, as well as (separately) 2 and 3. Then I would wish to obtain
'[[0, 3], [3, 2], [1, 2]]'
from this. As another example, suppose I want to use the more complicated permutation where 1 goes to 2, 2 goes to 3, and 3 goes to 1? Then I would desire the output
'[[2, 3], [3, 1], [0, 1]]'
Question: Given a permutation (encoded however one likes) of characters/integers 0 to n-1 and a string containing (some of) them, I would like a function which takes such a string and gives the appropriate resulting string where these characters/integers have been permuted - and nothing else.
I have been having a lot of difficulty seeing whether there is some obvious use of re or even just indexing that will help me, because usually these replacements are applied sequentially, which would obviously be bad in this case (a later replacement would hit characters produced by an earlier one). Any help will be much appreciated, even if it makes me look dumb.
(If someone has an idea for the original list [[1, 2], [2, 3], [0, 3]], that is fine too, but that is a list of lists and presumably more annoying than a string, and the string would suffice for my purposes.)
Here's a simple solution using a regular expression with callback:
import re
s = '[[1, 2], [2, 3], [0, 3]]'
mapping = [3, 2, 1, 0]  # mapping[i] is the replacement for i
print(re.sub(r'\d+',  # substitute all numbers
             lambda m: str(mapping[int(m.group(0))]),  # ... with the mapping
             s))  # ... for string s
# output: [[2, 1], [1, 0], [3, 0]]
In general, I think you'll need to build the result in a separate working copy to avoid the sequential-replacement issue you mention. Converting to a structured data format such as an array also makes things much easier (you don't say so, but your target string is clearly a stringified array, so I'm taking that for granted). Here is one idea using eval and numpy (for untrusted input, ast.literal_eval is the safer choice, since eval executes arbitrary code):
import numpy as np
s = '[[2, 3], [3, 1], [0, 1]]'
a = np.array(eval(s))
print('before\n', a)
mymap = [1,2,3,0]
a = np.array([mymap[i] for i in a.flatten()]).reshape(a.shape)
print('after\n', a)
Gives:
before
[[2 3]
[3 1]
[0 1]]
after
[[3 0]
[0 2]
[1 2]]
permutation = {'0':'1', '1':'0', '2':'3', '3':'2'}
s = '[[1, 2], [2, 3], [0, 3]]'
rv = ''
for c in s:
    rv += permutation.get(c, c)
print(rv)
?
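When every symbol being permuted is a single character (so n is at most 10 for decimal digits), the character-by-character lookup above can also be expressed with the built-in str.maketrans and str.translate. They apply all substitutions in one pass, so no replacement can feed into another. A sketch, where permute_digits is a name made up for this example:

```python
def permute_digits(text, permutation):
    """Apply a single-character permutation to text in one pass.

    permutation maps each source character to its target, e.g. {'0': '1'}.
    Characters not in the mapping (brackets, commas, spaces) are left alone.
    permute_digits is a hypothetical helper name.
    """
    table = str.maketrans(permutation)
    return text.translate(table)


s = '[[1, 2], [2, 3], [0, 3]]'
swap = {'0': '1', '1': '0', '2': '3', '3': '2'}
cycle = {'1': '2', '2': '3', '3': '1'}

print(permute_digits(s, swap))   # [[0, 3], [3, 2], [1, 2]]
print(permute_digits(s, cycle))  # [[2, 3], [3, 1], [0, 1]]
```

This stops working for multi-digit numbers, since '12' would be treated as '1' followed by '2'. A regular expression that matches whole numbers is the better fit there.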
You can build a mapping of your desired transformations:
import ast
d = ast.literal_eval('[[1, 2], [2, 3], [0, 3]]')
m = {1: 2, 2: 3, 3: 1}
new_d = [[m.get(i) if i in m else
(lambda x:i if not x else x[0])([a for a, b in m.items() if b == i]) for i in b] for b in d]
Output:
[[2, 3], [3, 1], [0, 1]]
For the first desired swap:
m = {0:1, 2:3}
d = ast.literal_eval('[[1, 2], [2, 3], [0, 3]]')
new_d = [[m.get(i) if i in m else
(lambda x:i if not x else x[0])([a for a, b in m.items() if b == i]) for i in b] for b in d]
Output:
[[0, 3], [3, 2], [1, 2]]
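If the permutation is written out in full (every value that moves appears as a key), the same result for the list of lists can be had with a plain lookup and no inverse search. This sketch assumes arbitrarily nested lists of integers. The name apply_permutation is hypothetical.

```python
def apply_permutation(data, mapping):
    """Return a new nested list with each integer x replaced by mapping.get(x, x).

    Values absent from mapping are treated as fixed points.
    apply_permutation is an illustrative helper, not a library function.
    """
    if isinstance(data, list):
        return [apply_permutation(item, mapping) for item in data]
    return mapping.get(data, data)


d = [[1, 2], [2, 3], [0, 3]]
print(apply_permutation(d, {1: 2, 2: 3, 3: 1}))        # [[2, 3], [3, 1], [0, 1]]
print(apply_permutation(d, {0: 1, 1: 0, 2: 3, 3: 2}))  # [[0, 3], [3, 2], [1, 2]]
```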
I admit this is quite inelegant, but here is my suggestion, in case it helps:
string = '[[1, 2], [2, 3], [0, 3]]'
numbers = dict(zero=0, one=1, two=2, three=3, four=4, five=5, six=6, seven=7, eight=8, nine=9)
# First pass: replace each digit with the name of its target, so no digit is replaced twice
string = string.replace('0', 'one').replace('1', 'zero').replace('2', 'three').replace('3', 'two')
# Second pass: turn the names back into digits
for x in numbers.keys():
    string = string.replace(x, str(numbers[x]))
print(string)
Output:
[[0, 3], [3, 2], [1, 2]]

Why is this list changing value?

I have a list called ones that changes value after a block of code that shouldn't affect it. Why?
s = 3
ones = []
terms = []
for i in range(0, s):
    ones.append(1)
terms.append(ones)
print(terms)
twos = []
if len(ones) > 1:
    twos.append(ones)
    twos[-1].pop()
    twos[-1][-1] = 2
print(twos)
print(terms)
Output:
[[1, 1, 1]] # terms
[[1, 2]] # twos
[[1, 2]] # terms
For context, I'm trying to use this to begin to solve the problem on page 5 of this British Informatics Olympiad past exam: http://www.olympiad.org.uk/papers/2009/bio/bio09-exam.pdf.
Here:
twos.append(ones)
You are appending a reference to ones, not its values. See the difference:
In [1]: l1 = [1, 2, 3]
In [2]: l2 = []
In [3]: l2.append(l1)
In [4]: l2, l1
Out[4]: ([[1, 2, 3]], [1, 2, 3])
In [5]: l2[0][1] = 'test'
In [6]: l2, l1
Out[6]: ([[1, 'test', 3]], [1, 'test', 3])
To avoid this, append a copy instead, made with the slice [:]:
In [7]: l1 = [1, 2, 3]
In [8]: l2 = []
In [9]: l2.append(l1[:])
In [10]: l2, l1
Out[10]: ([[1, 2, 3]], [1, 2, 3])
In [11]: l2[0][1] = 'test'
In [12]: l2, l1
Out[12]: ([[1, 'test', 3]], [1, 2, 3])
twos.append(ones) does not copy ones.
There is only ever one list ones in memory, which also goes by the following references:
terms[0]
twos[0]
and also terms[-1] and twos[-1] because terms and twos only have one element each, so the first is the last.
Now, when you mutate ones/terms[0]/terms[-1]/twos[0]/twos[-1] you are mutating the same list in memory.
I highly recommend watching Facts and Myths about Python names and values.
When you do twos.append(ones), you're passing the reference to the ones list, not the value itself. Therefore, when you do twos[-1][-1] = 2, it'll modify the value in the ones list itself, not a copy in the twos list.
To pass the value instead of the reference to the ones list, you can do:
twos.append(ones[:])
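A quick way to check whether two names refer to the same list is the is operator. The sketch below reuses the names from the question. Note that [:] (like list.copy()) makes a shallow copy, so for lists that themselves contain lists, copy.deepcopy is the tool to reach for.

```python
import copy

ones = [1, 1, 1]

alias = []
alias.append(ones)        # stores a reference to the same list object
print(alias[0] is ones)   # True

twos = []
twos.append(ones[:])      # stores a new list with the same elements
print(twos[0] is ones)    # False

twos[-1][-1] = 2
print(ones)               # [1, 1, 1] (unchanged)
print(twos)               # [[1, 1, 2]]

# A shallow copy still shares any inner lists; deepcopy does not.
nested = [[1, 1], [1]]
shallow = nested[:]
deep = copy.deepcopy(nested)
print(shallow[0] is nested[0])  # True
print(deep[0] is nested[0])     # False
```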

Python permutations recursive

I'm trying to solve a problem using backtracking and I need the permutations of numbers for it. I have this basic algorithm that does it but the problem is... the results don't come in the normal order.
def perm(a,k=0):
    if(k==len(a)):
        print(a)
    else:
        for i in range(k,len(a)):
            a[k],a[i] = a[i],a[k]
            perm(a, k+1)
            a[k],a[i] = a[i],a[k]
Example: for [1,2,3] the normal results would be: [1,2,3] [1,3,2] [2,1,3] [2,3,1] [3,1,2] [3,2,1]
Whereas this algorithm will interchange the last 2 elements. I understand why. I just don't know how to correct this.
I don't want to use the permutations from itertools. Can the code above be easily fixed to work properly? What would be the complexity for this algorithm from above?
A recursive generator function that yields permutations in the expected order with regard to the original list:
def perm(a):
    if len(a) <= 1:
        yield a
    else:
        for i in range(len(a)):  # use xrange on Python 2
            for p in perm(a[:i]+a[i+1:]):
                yield [a[i]]+p
a = [1, 2, 3]
for p in perm(a):
    print(p)
[1, 2, 3]
[1, 3, 2]
[2, 1, 3]
[2, 3, 1]
[3, 1, 2]
[3, 2, 1]
Here's one solution (suboptimal, because it copies lists all the time):
def perm(a, prev=[]):
    if not a:
        print(prev)
    for index, element in enumerate(a):
        perm(a[:index] + a[index+1:], prev + [element])
The order in which the permutations are printed:
>>> perm([1,2,3])
[1, 2, 3]
[1, 3, 2]
[2, 1, 3]
[2, 3, 1]
[3, 1, 2]
[3, 2, 1]
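For a backtracking formulation closer to the question's, one option is to build each permutation left to right and track which positions are already used, instead of swapping in place. This is a sketch rather than a fix of the original swap-based code, and perm_in_order is a made-up name. On complexity: there are n! permutations of n items and each takes O(n) work to copy out, so any algorithm that outputs all of them needs time at least proportional to n * n!.

```python
def perm_in_order(items):
    """Yield permutations of items in the order of the original positions.

    perm_in_order is a hypothetical name; the technique is plain backtracking
    with a 'used' marker per position.
    """
    n = len(items)
    used = [False] * n
    current = []

    def backtrack():
        if len(current) == n:
            yield current[:]  # copy, because current keeps being mutated
            return
        for i in range(n):
            if not used[i]:
                used[i] = True
                current.append(items[i])
                yield from backtrack()
                current.pop()      # undo the choice before trying the next one
                used[i] = False

    yield from backtrack()


for p in perm_in_order([1, 2, 3]):
    print(p)
```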

Confounding recursive list append in Python

I'm trying to create a pair of functions that, given a list of "starting" numbers, will recursively add to each index position up to a defined maximum value (much in the same way that an odometer works in a car--each counter wheel increasing to 9 before resetting to 1 and carrying over onto the next wheel).
The code looks like this:
number_list = []
def counter(start, i, max_count):
    if start[len(start)-1-i] < max_count:
        start[len(start)-1-i] += 1
        return(start, i, max_count)
    else:
        for j in range (len(start)):
            if start[len(start)-1-i-j] == max_count:
                start[len(start)-1-i-j] = 1
            else:
                start[len(start)-1-i-j] += 1
                return(start, i, max_count)
def all_values(fresh_start, i, max_count):
    number_list.append(fresh_start)
    new_values = counter(fresh_start,i,max_count)
    if new_values != None:
        all_values(*new_values)
When I run all_values([1,1,1],0,3) and print number_list, though, I get:
[[1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 1],
[1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 1],
[1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 1],
[1, 1, 1], [1, 1, 1], [1, 1, 1]]
Which is unfortunate. Doubly so knowing that if I replace the first line of all_values with
print(fresh_start)
I get exactly what I'm after:
[1, 1, 1]
[1, 1, 2]
[1, 1, 3]
[1, 2, 1]
[1, 2, 2]
[1, 2, 3]
[1, 3, 1]
[1, 3, 2]
[1, 3, 3]
[2, 1, 1]
[2, 1, 2]
[2, 1, 3]
[2, 2, 1]
[2, 2, 2]
[2, 2, 3]
[2, 3, 1]
[2, 3, 2]
[2, 3, 3]
[3, 1, 1]
[3, 1, 2]
[3, 1, 3]
[3, 2, 1]
[3, 2, 2]
[3, 2, 3]
[3, 3, 1]
[3, 3, 2]
[3, 3, 3]
I have already tried making a copy of fresh_start (by way of temp = fresh_start) and appending that instead, but with no change in the output.
Can anyone offer any insight as to what I might do to fix my code? Feedback on how the problem could be simplified would be welcome as well.
Thanks a lot!
temp = fresh_start
does not make a copy. Appending doesn't make copies, assignment doesn't make copies, and pretty much anything that doesn't say it makes a copy doesn't make a copy. If you want a copy, slice it:
fresh_start[:]
is a copy.
Try the following in the Python interpreter:
>>> a = [1,1,1]
>>> b = []
>>> b.append(a)
>>> b.append(a)
>>> b.append(a)
>>> b
[[1, 1, 1], [1, 1, 1], [1, 1, 1]]
>>> b[2][2] = 2
>>> b
[[1, 1, 2], [1, 1, 2], [1, 1, 2]]
This is a simplified version of what's happening in your code. But why is it happening?
b.append(a) isn't actually making a copy of a and stuffing it into the array at b. It's making a reference to a. It's like a bookmark in a web browser: when you open a webpage using a bookmark, you expect to see the webpage as it is now, not as it was when you bookmarked it. But that also means that if you have multiple bookmarks to the same page, and that page changes, you'll see the changed version no matter which bookmark you follow.
It's the same story with temp = a, and for that matter, a = [1,1,1]. temp and a are "bookmarks" to a particular array which happens to contain three ones. And b in the example above, is a bookmark to an array... which contains three bookmarks to that same array that contains three ones.
So what you do is create a new array and copy in the elements of the old array. The quickest way to do that is to take an array slice containing the whole array, as user2357112 demonstrated:
>>> a = [1,1,1]
>>> b = []
>>> b.append(a[:])
>>> b.append(a[:])
>>> b.append(a[:])
>>> b
[[1, 1, 1], [1, 1, 1], [1, 1, 1]]
>>> b[2][2] = 2
>>> b
[[1, 1, 1], [1, 1, 1], [1, 1, 2]]
Much better.
When I look at the desired output, I can't help but think about using one of numpy's grid-generation functions.
import numpy
first_column, second_column, third_column = numpy.mgrid[1:4,1:4,1:4]
numpy.dstack((first_column.flatten(),second_column.flatten(),third_column.flatten()))
Out[23]:
array([[[1, 1, 1],
[1, 1, 2],
[1, 1, 3],
[1, 2, 1],
[1, 2, 2],
[1, 2, 3],
[1, 3, 1],
[1, 3, 2],
[1, 3, 3],
[2, 1, 1],
[2, 1, 2],
[2, 1, 3],
[2, 2, 1],
[2, 2, 2],
[2, 2, 3],
[2, 3, 1],
[2, 3, 2],
[2, 3, 3],
[3, 1, 1],
[3, 1, 2],
[3, 1, 3],
[3, 2, 1],
[3, 2, 2],
[3, 2, 3],
[3, 3, 1],
[3, 3, 2],
[3, 3, 3]]])
Of course, the utility of this particular approach might depend on the variety of input you need to deal with, but I suspect this could be an interesting way to build the data and numpy is pretty fast for this kind of thing. Presumably if your input list has more elements you could have more min:max arguments fed into mgrid[] and then unpack / stack in a similar fashion.
Here is a simplified version of your program, which works. Comments will follow.
number_list = []

def _adjust_counter_value(counter, n, max_count):
    """
    We want the counter to go from 1 to max_count, then start over at 1.
    This function adds n to the counter and then returns a tuple:
    (new_counter_value, carry_to_next_counter)
    """
    assert max_count >= 1
    assert 1 <= counter <= max_count
    # Counter is in closed range: [1, max_count]
    # Subtract 1 so expected value is in closed range [0, max_count - 1]
    x = counter - 1 + n
    carry, x = divmod(x, max_count)
    # Add 1 so expected value is in closed range [1, max_count]
    counter = x + 1
    return (counter, carry)

def increment_counter(start, i, max_count):
    last = len(start) - 1 - i
    copy = start[:]  # make a copy of the start
    add = 1  # start by adding 1 to index
    for i_cur in range(last, -1, -1):
        copy[i_cur], add = _adjust_counter_value(copy[i_cur], add, max_count)
        if 0 == add:
            return (copy, i, max_count)
    else:
        # if we have a carry out of the 0th position, we are done with the sequence
        return None

def all_values(fresh_start, i, max_count):
    number_list.append(fresh_start)
    new_values = increment_counter(fresh_start,i,max_count)
    if new_values != None:
        all_values(*new_values)

all_values([1,1,1],0,3)

import itertools as it
correct = [list(tup) for tup in it.product(range(1,4), range(1,4), range(1,4))]
assert number_list == correct
Since you want the counters to go from 1 through max_count inclusive, it's a little bit tricky to update each counter. Your original solution was to use several if statements, but here I have made a helper function that uses divmod() to compute each new digit. This lets us add any increment to any digit and will find the correct carry out of the digit.
Your original program never changed the value of i so my revised one doesn't either. You could simplify the program further by getting rid of i and just having increment_counter() always go to the last position.
If a for loop runs to the end without hitting break (or leaving the function via return), its else: clause then runs, if there is one present. Here I added an else: clause to handle a carry out of the 0th place in the list. If there is a carry out of the 0th place, that means we have reached the end of the counter sequence. In this case we return None.
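As a small standalone illustration of for ... else, here is a sketch of a single odometer step that uses break for the normal case and the else clause for overflow. It is a simplified variant, not the code above: the names step and all_steps are hypothetical, and it always starts from the last position.

```python
def step(values, max_count):
    """Return the next odometer reading after values, or None on overflow."""
    result = values[:]  # work on a copy so the caller's list is untouched
    for pos in range(len(result) - 1, -1, -1):
        if result[pos] < max_count:
            result[pos] += 1
            break            # no carry needed: the else clause is skipped
        result[pos] = 1      # wheel wraps to 1 and carries to the left
    else:
        return None          # loop never hit break: every wheel wrapped
    return result


def all_steps(start, max_count):
    """Collect every reading from start onward with a plain loop (no recursion)."""
    readings = []
    current = start
    while current is not None:
        readings.append(current)
        current = step(current, max_count)
    return readings


print(step([1, 1, 3], 3))            # [1, 2, 1]
print(step([3, 3, 3], 3))            # None
print(len(all_steps([1, 1, 1], 3)))  # 27
```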
Your original program is kind of tricky. It has two explicit return statements in counter() and an implicit return at the end of the sequence. It does return None to signal that the recursion can stop, but the way it does it is too tricky for my taste. I recommend using an explicit return None as I showed.
Note that Python has a module itertools that includes a way to generate a counter series like this. I used it to check that the result is correct.
I'm sure you are writing this to learn about recursion, but be advised that Python isn't the best language for recursive solutions like this one. Python has a relatively shallow recursion limit (1000 frames by default in CPython) and does not automatically turn tail recursion into an iterative loop, so this will raise a RecursionError if your recursive calls nest deeply enough. The best solution in Python would be to use itertools.product() as I did to just directly generate the desired counter sequence.
Since your generated sequence is a list of lists, and itertools.product() produces tuples, I used a list comprehension to convert each tuple into a list, so the end result is a list of lists, and we can simply use the Python == operator to compare them.
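For a list of arbitrary length, itertools.product accepts a repeat argument, so the ranges need not be spelled out one by one. A sketch under the assumption that every position runs from 1 to max_count and the sequence starts at all ones; odometer_values is a hypothetical name:

```python
import itertools


def odometer_values(n_positions, max_count):
    """Return all readings of an odometer whose wheels each run 1..max_count."""
    wheel = range(1, max_count + 1)
    return [list(tup) for tup in itertools.product(wheel, repeat=n_positions)]


values = odometer_values(3, 3)
print(len(values))   # 27
print(values[0])     # [1, 1, 1]
print(values[-1])    # [3, 3, 3]
```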
