Python matrix with increasing numbers

My problem here is that I need a 2-dimensional array that holds the value 1 in the center of the matrix, with the other numbers continuing outward from that point. For example, when the input is 3, the matrix should look like this:
[[3, 3, 3, 3, 3],
[3, 2, 2, 2, 3],
[3, 2, 1, 2, 3],
[3, 2, 2, 2, 3],
[3, 3, 3, 3, 3]]
This is my code so far:
import pprint

def func(number):
    d = [[0] for i in range(number + (number - 1))]
    #? Create the matrix with 0 value inside each index and determine the amount of columns
    for i in d:
        d[d.index(i)] = [number] * (number + (number - 1))
        #? Add values to the indexes
        #? [value inside] * [length of a row]
    centre = len(d) // 2
    #? centre is the index value of the list which contains the number 1
    pprint.pprint(d)

func(3)
The result is this:
[[3, 3, 3, 3, 3],
[3, 3, 3, 3, 3],
[3, 3, 3, 3, 3],
[3, 3, 3, 3, 3],
[3, 3, 3, 3, 3]]
My approach was to fill the whole table with the given number, then work on the interior of the table (everything except the first and last rows), since those never change: they are always just the number repeated across the row.
I even know the index of the row of the 2D array that contains the center. However, I am stuck here and not sure how to continue.

The value at each index is given by the maximum of the distance from the row index to the center and the distance from the column index to the center (the Chebyshev distance), plus one. So you can do:
n = 3
dim = 2 * n - 1
result = [
    [
        max(abs(row - dim // 2), abs(col - dim // 2)) + 1
        for col in range(dim)
    ]
    for row in range(dim)
]
print(result)
This outputs:
[[3, 3, 3, 3, 3],
[3, 2, 2, 2, 3],
[3, 2, 1, 2, 3],
[3, 2, 2, 2, 3],
[3, 3, 3, 3, 3]]
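If you want this in the shape of the question's func, the comprehension above can be wrapped into a function of the input number (a sketch along the same lines):

```python
def func(number):
    dim = 2 * number - 1  # a centre value of 1 needs a (2n-1) x (2n-1) grid
    centre = dim // 2
    # each cell holds its Chebyshev distance to the centre, plus 1
    return [[max(abs(row - centre), abs(col - centre)) + 1
             for col in range(dim)]
            for row in range(dim)]

print(func(3))
```

Calling func(3) reproduces the 5 x 5 matrix from the question, and func(1) degenerates to [[1]].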

Fun question!
Here's what I came up with. It's quite a bit shorter than your implementation:
import numpy as np

mdim = (5, 5)  # specify your matrix dimensions
value = 3  # specify the outer value
matrix = np.zeros(mdim)  # initiate the matrix
for i, v in enumerate(range(value, 0, -1)):  # let's fill it!
    matrix[i:(mdim[0]-i), i:(mdim[1]-i)] = v
print(matrix)
The output is as desired (as floats, since np.zeros defaults to float64; pass dtype=int to np.zeros if you want integers). It can probably be tweaked a little more.
Explanations:
It first creates a matrix of zeros with the dimensions of mdim.
Then, it fills the matrix with the values of v=3 from rows 0:5 and columns 0:5.
It successively fills the matrix with the values of v-1=2 from rows 1:4 and columns 1:4, until the centre gets the final value of 1.

Related

python replace value in array based on previous and following value in column

Given the following array, I want to replace each zero with its previous value column-wise, as long as it is surrounded by two values greater than zero.
I am aware of np.where, but it would consider the whole array instead of its columns.
I am not sure how to do it, and help would be appreciated.
This is the array:
a = np.array([[4, 3, 3, 2],
              [0, 0, 1, 2],
              [0, 4, 2, 4],
              [2, 4, 3, 0]])
and since the only zero that meets this condition is the one in the second row, second column,
the new array should be the following:
new_a = np.array([[4, 3, 3, 2],
                  [0, 3, 1, 2],
                  [0, 4, 2, 4],
                  [2, 4, 3, 0]])
How do I accomplish this?
And what if I wanted to extend the gap surrounded by nonzeros? For instance, the first column contains two 0s and the second column contains one 0, so the new array would be
new_a = np.array([[4, 3, 3, 2],
                  [4, 3, 1, 2],
                  [4, 4, 2, 4],
                  [2, 4, 3, 0]])
In short, how do I solve this if the column-wise condition were having N consecutive zeros or fewer?
As a generic method, I would approach this using a convolution:
import numpy as np
from scipy.signal import convolve2d

# kernel for top/bottom neighbors
kernel = np.array([[1],
                   [0],
                   [1]])
# is the value a zero?
m1 = a == 0
# count non-zero neighbors
m2 = convolve2d(~m1, kernel, mode='same') > 1
mask = m1 & m2
# replace matching values with the previous row's value
a[mask] = np.roll(a, 1, axis=0)[mask]
output:
array([[4, 3, 3, 2],
[0, 3, 1, 2],
[0, 4, 2, 4],
[2, 4, 3, 0]])
filling from surrounding values
Using pandas to benefit from ffill/bfill (you can forward-fill in pure numpy, but it's more complex):
import pandas as pd

df = pd.DataFrame(a)
# limit for neighbors
N = 2
# identify non-zeros
m = df.ne(0)
# mask zeros
m2 = m.where(m)
# mask for values with non-zero neighbors within the limit on both sides
mask = m2.ffill(limit=N) & m2.bfill(limit=N)
df.mask(mask & ~m).ffill().astype(int).to_numpy()
array([[4, 3, 3, 2],
[4, 3, 1, 2],
[4, 4, 2, 4],
[2, 4, 3, 0]])
That's one solution I found. I know it's basic but I think it works.
a = np.array([[4, 3, 3, 2],
              [0, 0, 1, 2],
              [0, 4, 2, 4],
              [2, 4, 3, 0]])
a_t = a.T
for i in range(len(a_t)):
    ar = a_t[i]
    for j in range(len(ar) - 1):
        if (j > 0) and (ar[j] == 0) and (ar[j+1] > 0):
            a_t[i][j] = a_t[i][j-1]
a = a_t.T

2D Vectorization of unique values per row with condition

Consider the array and function definition shown:
import numpy as np

a = np.array([[2, 2, 5, 6, 2, 5],
              [1, 5, 8, 9, 9, 1],
              [0, 4, 2, 3, 7, 9],
              [1, 4, 1, 1, 5, 1],
              [6, 5, 4, 3, 2, 1],
              [3, 6, 3, 6, 3, 6],
              [0, 2, 7, 6, 3, 4],
              [3, 3, 7, 7, 3, 3]])

def grpCountSize(arr, grpCount, grpSize):
    count = [np.unique(row, return_counts=True) for row in arr]
    valid = [np.any(np.count_nonzero(row[1] == grpSize) == grpCount) for row in count]
    return valid
The point of the function is to return the rows of array a that have exactly grpCount groups of elements that each hold exactly grpSize identical elements.
For example:
# which rows have exactly 1 group that holds exactly 2 identical elements?
out = a[grpCountSize(a, 1, 2)]
As expected, the code outputs out = [[2, 2, 5, 6, 2, 5], [3, 3, 7, 7, 3, 3]].
The 1st output row has exactly 1 group of 2 (ie: 5,5), while the 2nd output row also has exactly 1 group of 2 (ie: 7,7).
Similarly:
# which rows have exactly 2 groups that each hold exactly 3 identical elements?
out = a[grpCountSize(a, 2, 3)]
This produces out = [[3, 6, 3, 6, 3, 6]], because only this row has exactly 2 groups each holding exactly 3 elements (ie: 3,3,3 and 6,6,6)
PROBLEM: My actual arrays have just 6 columns, but they can have many millions of rows. The code works perfectly as intended, but it is VERY SLOW for long arrays. Is there a way to speed this up?
np.unique sorts the array, which makes it less efficient for your purpose. Use np.bincount instead; that way you will most likely save some time (depending on your array's shape and values). You also won't need np.any anymore:
def grpCountSize(arr, grpCount, grpSize):
    count = [np.bincount(row) for row in arr]
    valid = [np.count_nonzero(row == grpSize) == grpCount for row in count]
    return valid
Another way that might save even more time is to use the same number of bins for all rows and create a single array:
def grpCountSize(arr, grpCount, grpSize):
    m = arr.max()
    count = np.stack([np.bincount(row, minlength=m+1) for row in arr])
    return (count == grpSize).sum(1) == grpCount
Yet another upgrade is to use a vectorized 2D bin count, as in this post. For example (note that the Numba solutions tested in the linked post are faster; I just provided the NumPy solution as an example, and you can replace the function with any of the ones suggested there):
def grpCountSize(arr, grpCount, grpSize):
    count = bincount2D_vectorized(arr)
    return (count == grpSize).sum(1) == grpCount

# from the post above
def bincount2D_vectorized(a):
    N = a.max() + 1
    a_offs = a + np.arange(a.shape[0])[:, None] * N
    return np.bincount(a_offs.ravel(), minlength=a.shape[0]*N).reshape(-1, N)
Output of all the solutions above:
a[grpCountSize(a, 1, 2)]
# array([[2, 2, 5, 6, 2, 5],
#        [3, 3, 7, 7, 3, 3]])

How would you reshuffle this array efficiently?

I have an array arr_val which stores the values of a certain function at a large number of locations (for illustration, let's just take a small case with 4 locations). I also have another array loc_array which stores the locations of the function; assume there are again 4 of them. However, loc_array is a multidimensional array such that each location index has the same 4 sub-location indices, and each sub-location index is a pair of coordinates. To illustrate clearly:
arr_val = np.array([1, 2, 3, 4])
loc_array = np.array([[[1,1],[2,3],[3,1],[3,2]],
                      [[1,2],[2,4],[3,4],[4,1]],
                      [[2,1],[1,4],[1,3],[3,3]],
                      [[4,2],[4,3],[2,2],[4,4]]])
The meaning of the two arrays above is that the value of some parameter of interest at, for example, locations [1,1],[2,3],[3,1],[3,2] is 1, and so on. However, I am interested in re-expressing the same thing in a different form: instead of having random points, I would like the coordinates in the following tractable form
coord = [[[1,1],[1,2],[1,3],[1,4]],
         [[2,1],[2,2],[2,3],[2,4]],
         [[3,1],[3,2],[3,3],[3,4]],
         [[4,1],[4,2],[4,3],[4,4]]]
and the values at respective coordinates given as
val = [[1, 2, 3, 3],[3, 4, 1, 2],[1, 1, 3, 2], [2, 4, 4, 4]]
What would be a very efficient way to achieve the above for large numpy arrays?
You can use lexsort like so:
>>> order = np.lexsort(loc_array.reshape(-1, 2).T[::-1])
>>> arr_val.repeat(4)[order].reshape(4, 4)
array([[1, 2, 3, 3],
[3, 4, 1, 2],
[1, 1, 3, 2],
[2, 4, 4, 4]])
If you know for sure that loc_array is a permutation of all possible locations then you can avoid the sort:
>>> out = np.empty((4, 4), arr_val.dtype)
>>> out.ravel()[np.ravel_multi_index((loc_array-1).reshape(-1, 2).T, (4, 4))] = arr_val.repeat(4)
>>> out
array([[1, 2, 3, 3],
[3, 4, 1, 2],
[1, 1, 3, 2],
[2, 4, 4, 4]])
It may not be the answer you want, but it works anyway.
val = [[1, 2, 3, 3], [3, 4, 1, 2], [1, 1, 3, 2], [2, 4, 4, 4]]
temp = ""
int_list = []
for element in val:
    temp_int = temp.join(map(str, element))
    int_list.append(int(temp_int))
int_list.sort()
print(int_list)
## result ##
[1132, 1233, 2444, 3412]
Change each element array into an int and construct int_list.
Sort int_list.
Construct a 2D np.array from int_list.
I skipped the last part; you may find the way on the web.
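For completeness, here is a sketch of the skipped reconstruction step; note that this string round-trip only works while every element is a single digit (multi-digit values would be split incorrectly):

```python
import numpy as np

int_list = [1132, 1233, 2444, 3412]
# split each integer back into its digits, one row per integer
new_val = np.array([[int(d) for d in str(n)] for n in int_list])
print(new_val)
```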

Efficiently create numpy array with decreasing values moving away from center

Let N be an odd positive integer. I want to create a square 2D grid of shape N x N where the center element is N, the surrounding 8-neighborhood has value N-1, the elements around that neighborhood have value N-2, etc., so that finally the "outer shell" of the array has value N//2 + 1. I can accomplish this using np.pad in a for loop that iteratively adds "shells" to the array:
def pad_decreasing(N):
    assert N % 2 == 1
    a = np.array([[N]])
    for n in range(N-1, N//2, -1):
        a = np.pad(a, 1, mode='constant', constant_values=n)
    return a
Example output:
In: pad_decreasing(5)
Out:
array([[3, 3, 3, 3, 3],
[3, 4, 4, 4, 3],
[3, 4, 5, 4, 3],
[3, 4, 4, 4, 3],
[3, 3, 3, 3, 3]])
Question: Can I accomplish this in a "vectorized" form that doesn't resort to for loops? This code is fairly slow for large N.
The distance between the center and any point in the grid can be written as np.maximum(np.abs(row - center), np.abs(col - center)); then N minus this distance gives the onion you need:
N = 5
# the x and y coordinate of the center point (integer division keeps the result integer)
center = (N - 1) // 2
# the coordinates of the square array
row, col = np.ogrid[:N, :N]
# N - distance gives the result
N - np.maximum(np.abs(row - center), np.abs(col - center))
# array([[3, 3, 3, 3, 3],
# [3, 4, 4, 4, 3],
# [3, 4, 5, 4, 3],
# [3, 4, 4, 4, 3],
# [3, 3, 3, 3, 3]])

Split a list into increasing sequences using itertools

I have a list with mixed sequences like
[1,2,3,4,5,2,3,4,1,2]
I want to know how I can use itertools to split the list into increasing sequences, cutting the list at the decreasing points. For instance, the above would output
[[1, 2, 3, 4, 5], [2, 3, 4], [1, 2]]
this has been obtained by noting that the sequence decreases at 2, so we cut the first bit there; another decrease is at 1, so we cut again there.
Another example is with the sequence
[3,2,1]
the output should be
[[3], [2], [1]]
In the event that the given sequence is increasing we return the same sequence. For example
[1,2,3]
returns the same result, i.e.
[[1, 2, 3]]
For a repeating list like
[ 1, 2,2,2, 1, 2, 3, 3, 1,1,1, 2, 3, 4, 1, 2, 3, 4, 5, 6]
the output should be
[[1, 2, 2, 2], [1, 2, 3, 3], [1, 1, 1, 2, 3, 4], [1, 2, 3, 4, 5, 6]]
What I did to achieve this is define the following function
def splitter(L):
    result = []
    tmp = 0
    initialPoint = 0
    for i in range(len(L)):
        if L[i] < tmp:
            tmpp = L[initialPoint:i]
            result.append(tmpp)
            initialPoint = i
        tmp = L[i]
    result.append(L[initialPoint:])
    return result
The function works 100%, but what I need is to do the same with itertools so that I can improve the efficiency of my code. Is there a way to do this with the itertools package and avoid the explicit looping?
With numpy, you can use numpy.split; it requires the indices of the split positions. Since you want to split where the value decreases, you can use numpy.diff to calculate the differences, check where the difference is smaller than zero, and use numpy.where to retrieve the corresponding indices. An example with the last case in the question:
import numpy as np
lst = [1, 2, 2, 2, 1, 2, 3, 3, 1, 1, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6]
np.split(lst, np.where(np.diff(lst) < 0)[0] + 1)
# [array([1, 2, 2, 2]),
# array([1, 2, 3, 3]),
# array([1, 1, 1, 2, 3, 4]),
# array([1, 2, 3, 4, 5, 6])]
Psidom already has you covered with a good answer, but another NumPy solution would be to use scipy.signal.argrelmax to acquire the local maxima, then np.split.
from scipy.signal import argrelmax
arr = np.random.randint(1000, size=10**6)
splits = np.split(arr, argrelmax(arr)[0]+1)
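Applied to the first example from the question, this gives the expected split. One caveat: argrelmax looks for strict local maxima, so plateaus such as [1, 2, 2, 2, 1, ...] produce no split point there:

```python
import numpy as np
from scipy.signal import argrelmax

a = np.array([1, 2, 3, 4, 5, 2, 3, 4, 1, 2])
# split just after each strict local maximum
parts = np.split(a, argrelmax(a)[0] + 1)
print([p.tolist() for p in parts])  # [[1, 2, 3, 4, 5], [2, 3, 4], [1, 2]]
```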
Assume your original input array:
a = [1, 2, 3, 4, 5, 2, 3, 4, 1, 2]
First find the places where the splits shall occur:
p = [ i+1 for i, (x, y) in enumerate(zip(a, a[1:])) if x > y ]
Then create slices for each such split:
print([a[m:n] for m, n in zip([0] + p, p + [None])])
This will print this:
[[1, 2, 3, 4, 5], [2, 3, 4], [1, 2]]
I propose using more descriptive names than p, n, m, etc. ;-)
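Since the question asks for itertools specifically, here is one sketch that actually uses it: accumulate labels each element with a running count of the decrease points before it, and groupby then collects the runs sharing a label:

```python
from itertools import accumulate, chain, groupby

def splitter(lst):
    # 1 at every decrease point, 0 elsewhere; the running sum labels each run
    labels = accumulate(chain([0], (int(b < a) for a, b in zip(lst, lst[1:]))))
    return [[x for _, x in grp]
            for _, grp in groupby(zip(labels, lst), key=lambda t: t[0])]

print(splitter([1, 2, 3, 4, 5, 2, 3, 4, 1, 2]))  # [[1, 2, 3, 4, 5], [2, 3, 4], [1, 2]]
```

This handles the repeating-list case as well, since equal neighbors do not count as a decrease.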
