Given the following array, I want to replace each zero with the previous value in its column, as long as the zero is surrounded by two values greater than zero.
I am aware of np.where, but it would consider the whole array rather than its individual columns.
I am not sure how to do this, and any help would be appreciated.
This is the array:
a=np.array([[4, 3, 3, 2],
[0, 0, 1, 2],
[0, 4, 2, 4],
[2, 4, 3, 0]])
and since the only zero that meets this condition is the second row/second column one,
the new array should be the following
new_a=np.array([[4, 3, 3, 2],
[0, 3, 1, 2],
[0, 4, 2, 4],
[2, 4, 3, 0]])
How do I accomplish this?
And what if I want to extend this to longer gaps surrounded by nonzero values? For instance, the first column contains two consecutive zeros and the second column contains one, so the new array would be
new_a=np.array([[4, 3, 3, 2],
[4, 3, 1, 2],
[4, 4, 2, 4],
[2, 4, 3, 0]])
In short, how do I solve this when the columnwise condition is a run of N or fewer consecutive zeros?
As a generic method, I would approach this using a convolution:
import numpy as np
from scipy.signal import convolve2d
# kernel for top/down neighbors
kernel = np.array([[1],
[0],
[1]])
# is the value a zero?
m1 = a==0
# count non-zero neighbors and keep cells where both are non-zero
m2 = convolve2d(~m1, kernel, mode='same') > 1
mask = m1&m2
# replace matching values with previous row value
a[mask] = np.roll(a, 1, axis=0)[mask]
output:
array([[4, 3, 3, 2],
[0, 3, 1, 2],
[0, 4, 2, 4],
[2, 4, 3, 0]])
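For the single-zero case, the same neighbor test can also be sketched with plain slicing, without SciPy. This is only a minimal sketch and not part of the answer above: the names `out`, `above`, `inner` and `below` are illustrative, and it assumes the array has at least three rows.

```python
import numpy as np

a = np.array([[4, 3, 3, 2],
              [0, 0, 1, 2],
              [0, 4, 2, 4],
              [2, 4, 3, 0]])

out = a.copy()
# Views of the rows above, at, and below every interior row
above, inner, below = a[:-2], a[1:-1], a[2:]
# A zero qualifies only if both vertical neighbors are greater than zero
mask = (inner == 0) & (above > 0) & (below > 0)
# out[1:-1] is a view, so this assignment writes into out
out[1:-1][mask] = above[mask]
print(out)
```

Because the first and last rows are never part of `inner`, zeros on the border are left untouched.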
filling from surrounding values
Using pandas to benefit from ffill/bfill (you can forward-fill in pure NumPy, but it's more complex):
import pandas as pd
df = pd.DataFrame(a)
# limit for neighbors
N = 2
# identify non-zeros
m = df.ne(0)
# mask zeros
m2 = m.where(m)
# mask for values with 2 neighbors within limits
mask = m2.ffill(limit=N) & m2.bfill(limit=N)
df.mask(mask&~m).ffill()
array([[4, 3, 3, 2],
[4, 3, 1, 2],
[4, 4, 2, 4],
[2, 4, 3, 0]])
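If pandas is not available, the gap rule can also be sketched with an explicit scan over each column. This is a hedged illustration rather than the method used above: `fill_short_gaps` and `max_gap` are hypothetical names, and "surrounded" is taken to mean a nonzero value directly above and below the run of zeros.

```python
import numpy as np

def fill_short_gaps(a, max_gap):
    """Fill runs of at most max_gap zeros that are bounded above and below
    by nonzero values, using the value just above the run."""
    out = a.copy()
    n_rows, n_cols = a.shape
    for col in range(n_cols):
        row = 0
        while row < n_rows:
            if a[row, col] != 0:
                row += 1
                continue
            start = row
            # Advance to the end of this run of zeros
            while row < n_rows and a[row, col] == 0:
                row += 1
            bounded = start > 0 and row < n_rows
            if bounded and row - start <= max_gap:
                out[start:row, col] = a[start - 1, col]
    return out

a = np.array([[4, 3, 3, 2],
              [0, 0, 1, 2],
              [0, 4, 2, 4],
              [2, 4, 3, 0]])
print(fill_short_gaps(a, max_gap=2))
```

With `max_gap=1` only the zero in the second column is filled, which reproduces the first expected array from the question.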
That's one solution I found. I know it's basic but I think it works.
a=np.array([[4, 3, 3, 2],
[0, 0, 1, 2],
[0, 4, 2, 4],
[2, 4, 3, 0]])
a_t = a.T
for i in range(len(a_t)):
    ar = a_t[i]
    for j in range(len(ar)-1):
        if (j>0) and (ar[j] == 0) and (ar[j+1] > 0):
            a_t[i][j] = a_t[i][j-1]
a = a_t.T
I am saving the edge weights of an undirected graph in a row vector. For instance, if I have a graph as pictured below
The vector that I create is [5, 3, 4, 1, 2, 7], ordered by node number in ascending order. Now, if I swap the labels of nodes 1 and 4, I obtain the following graph:
In this scenario, the vector that I should have is [2, 7, 4, 1, 5, 3]. My question is: if I have an n by m NumPy array, where n is the number of graphs and m is the number of edges, how can I shuffle the node labels for each row and get the updated array efficiently?
Suppose I have a set of graphs consisting of four nodes as shown below. My intention is to randomly shuffle the node labels in each network and then get the correspondingly updated weights in an array of the same size.
np.random.seed(2)
arr = np.random.randint(10, size=(5, 6))
arr
array([[8, 8, 6, 2, 8, 7],
[2, 1, 5, 4, 4, 5],
[7, 3, 6, 4, 3, 7],
[6, 1, 3, 5, 8, 4],
[6, 3, 9, 2, 0, 4]])
You can do it like this:
import numpy as np
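def get_arr_from_edges(a):
    # number of nodes for a complete graph with len(a) edges
    n = int(np.sqrt(len(a) * 2)) + 1
    # strict upper triangle, filled in row-major order
    mask = np.tri(n, dtype=bool, k=-1).T
    # keep the input dtype so integer weights stay integers
    out = np.zeros((n, n), dtype=a.dtype)
    out[mask] = a
    out += out.T
    return out
def get_edges_from_arr(a):
    mask = np.tri(a.shape[0], dtype=bool, k=-1).T
    out = a[mask]
    return out
def swap_nodes(a, nodes):
    # swap the two rows, then the two columns, in every matrix of the stack
    a[:, [nodes[0] - 1, nodes[1] - 1], :] = a[:, [nodes[1] - 1, nodes[0] - 1], :]
    a[:, :, [nodes[0] - 1, nodes[1] - 1]] = a[:, :, [nodes[1] - 1, nodes[0] - 1]]
    return a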
arr = np.array([
[8, 8, 6, 2, 8, 7],
[2, 1, 5, 4, 4, 5],
[7, 3, 6, 4, 3, 7],
[6, 1, 3, 5, 8, 4],
[6, 3, 9, 2, 0, 4],
])
nodes_to_swap = (1, 4)
# initialize node-arr
node_arrs = np.apply_along_axis(get_arr_from_edges, axis=1, arr=arr)
# swap nodes
node_arrs = swap_nodes(node_arrs, nodes_to_swap)
# return remapped edges
edges = np.array([get_edges_from_arr(node_arr) for node_arr in node_arrs])
print(edges)
Gives the following result:
[[8 7 6 2 8 8]
[4 5 5 4 2 1]
[3 7 6 4 7 3]
[8 4 3 5 6 1]
[0 4 9 2 6 3]]
The idea is to build a connection matrix from the edges, where each edge weight is stored at the indices of its two nodes.
Then you just swap the rows and columns of the nodes you want to swap. If you want this process to be random, you could create random node pairs and call the function once per pair. Swaps do not commute, so if you swap multiple node pairs, the order matters!
After that you read the remapped edges back out of the array with the swapped rows and columns (this is basically the inverse of the first step).
I am sure there is room for further optimization using NumPy's vast functionality.
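One possible optimization, sketched here under stated assumptions, is to skip the dense matrices and apply a full random relabeling per graph through an index lookup. The names `relabel_edges` and `random_perms` are hypothetical helpers, permutations are 0-based, and edges are assumed to be stored in the upper-triangle row-major order used above.

```python
import numpy as np

def relabel_edges(weights, perms):
    """weights: (n_graphs, n_edges); perms: (n_graphs, n_nodes), 0-based.
    New edge (i, j) takes the weight of old edge (perm[i], perm[j])."""
    n_nodes = perms.shape[1]
    rows, cols = np.triu_indices(n_nodes, k=1)
    # edge_id[i, j] is the position of edge (i, j) in the flat weight vector
    edge_id = np.zeros((n_nodes, n_nodes), dtype=int)
    edge_id[rows, cols] = np.arange(rows.size)
    edge_id[cols, rows] = edge_id[rows, cols]
    source = edge_id[perms[:, rows], perms[:, cols]]
    return np.take_along_axis(weights, source, axis=1)

def random_perms(n_graphs, n_nodes, rng):
    # argsort of random keys yields one independent permutation per row
    return np.argsort(rng.random((n_graphs, n_nodes)), axis=1)

weights = np.array([[5, 3, 4, 1, 2, 7]])
swap_1_and_4 = np.array([[3, 1, 2, 0]])  # nodes 1 and 4, written 0-based
print(relabel_edges(weights, swap_1_and_4))  # [[2 7 4 1 5 3]]
```

For random shuffling, something like `relabel_edges(arr, random_perms(len(arr), 4, np.random.default_rng(0)))` would relabel every graph independently in one call.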
Suppose we have a NumPy 3D tensor D of dimensions r x c x d, such as:
r = 2
c = 3
d = 3
D = np.array([[[1, 5, 3], [1, 2, 5], [1, 4, 3]], [[1, 1, 6], [3, 1, 7], [5, 1, 3]]])
array([[[1, 5, 3],
[1, 2, 5],
[1, 4, 3]],
[[1, 1, 6],
[3, 1, 7],
[5, 1, 3]]])
and a 2D integer matrix Q of dimensions r x c, such as:
Q = np.array([[1, 1, 2], [2, 1, 2]])
array([[1, 1, 2],
[2, 1, 2]])
where every element in Q is less than d.
I need to sum the first Q[r_i][c_i] + 1 elements along the third dimension of D for every 0 <= r_i < r and 0 <= c_i < c.
The expected result (Res) for the example above is a 2D matrix of shape r x c (2x3):
Res = np.array([[6, 3, 8], [8, 4, 9]])
array([[6, 3, 8],
[8, 4, 9]])
My current solution uses a list comprehension looping over r_i and c_i:
r = 2
c = 3
res = np.array([[np.sum(D[r_i, c_i, :Q[r_i, c_i]+1]) for c_i in range(c)] for r_i in range(r)])
Is there a more efficient or elegant solution to this problem?
Let us try:
# this is equivalent to double loop on r_i, c_i
x,y = np.ogrid[:r, :c]
# we take the cumsum on the last axis,
# then extract the Q[r_i, c_i]'th sum at r_i, c_i
out = D.cumsum(axis=-1)[x,y, Q]
Output:
array([[6, 3, 8],
[8, 4, 9]])
Cross check
np.allclose(out, res)
# True
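An alternative that avoids the cumulative sum, sketched here only for comparison, is to build a boolean mask of the positions to keep and sum the masked values. The names `keep` and `out_masked` are illustrative and not from the answer above.

```python
import numpy as np

D = np.array([[[1, 5, 3], [1, 2, 5], [1, 4, 3]],
              [[1, 1, 6], [3, 1, 7], [5, 1, 3]]])
Q = np.array([[1, 1, 2], [2, 1, 2]])

# keep[r, c, k] is True for k <= Q[r, c], i.e. the first Q[r, c] + 1 entries
keep = np.arange(D.shape[-1]) <= Q[..., None]
out_masked = np.where(keep, D, 0).sum(axis=-1)
print(out_masked)
```

Both approaches do work proportional to the size of D; the mask version makes the "first Q + 1 elements" rule explicit at the cost of a temporary array of the same shape as D.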
I need a 2-dimensional array that holds the value 1 in its center, with the values increasing by one for each ring outward from that point. For example, when the input is 3, the matrix should look like this:
[[3, 3, 3, 3, 3],
[3, 2, 2, 2, 3],
[3, 2, 1, 2, 3],
[3, 2, 2, 2, 3],
[3, 3, 3, 3, 3]]
This is my code so far:
import pprint
def func(number):
    d = [[0] for i in range(number + (number - 1))]
    #? Create the matrix with 0 value inside each index and determine the amount of columns
    for i in d:
        d[d.index(i)] = [number] * (number + (number - 1))
    #? Add values to the indexes
    #? [value inside] * [length of a row]
    centre = len(d) // 2
    #? centre is the index value of the list which contains the number 1
    pprint.pprint(d)
func(3)
The result is this:
[[3, 3, 3, 3, 3],
[3, 3, 3, 3, 3],
[3, 3, 3, 3, 3],
[3, 3, 3, 3, 3],
[3, 3, 3, 3, 3]]
My approach was to fill the whole table with the given number, then work on the rest of the table without the first and last rows, because they won't change: every entry in them is always the given number.
I even know at what index of the 2D array the center is. However, I am stuck here. I'm not sure how to continue from here.
The value at each index is one more than the maximum of (distance from the row index to the center, distance from the column index to the center). So, you can do:
n = 3
dim = 2 * n - 1
result = [
    [
        max(abs(row - dim // 2), abs(col - dim // 2)) + 1
        for col in range(dim)
    ]
    for row in range(dim)
]
print(result)
This outputs:
[[3, 3, 3, 3, 3],
[3, 2, 2, 2, 3],
[3, 2, 1, 2, 3],
[3, 2, 2, 2, 3],
[3, 3, 3, 3, 3]]
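The same rule can be sketched with NumPy broadcasting instead of nested comprehensions. This is a minimal illustration, assuming NumPy is acceptable here; `concentric` is a hypothetical helper name.

```python
import numpy as np

def concentric(n):
    """Return a (2n-1) x (2n-1) array with 1 at the center and n on the border."""
    # Distance of each index from the center index n - 1
    offsets = np.abs(np.arange(2 * n - 1) - (n - 1))
    # Larger of the row distance and the column distance, plus one
    return np.maximum(offsets[:, None], offsets[None, :]) + 1

print(concentric(3))
```

Calling `concentric(3).tolist()` gives a plain list of lists if the NumPy array type is not wanted.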
Fun question!
Here's what I came up with. It's quite a bit shorter than your implementation:
import numpy as np
mdim = (5, 5) # specify your matrix dimensions
value = 3 # specify the outer value
matrix = np.zeros(mdim) # initiate the matrix
for i, v in enumerate(range(value, 0, -1)): # let's fill it!
    matrix[i:(mdim[0]-i), i:(mdim[1]-i)] = v
print(matrix)
The output is as desired! It's probably possible to tweak this a little bit more.
Explanations:
It first creates a matrix of zeros with the dimensions of mdim.
Then, it fills the matrix with the values of v=3 from rows 0:5 and columns 0:5.
It successively fills the matrix with the values of v-1=2 from rows 1:4 and columns 1:4, until the centre gets the final value of 1.
I have a list with mixed sequences like
[1,2,3,4,5,2,3,4,1,2]
I want to know how I can use itertools to split the list into increasing sequences cutting the list at decreasing points. For instance the above would output
[[1, 2, 3, 4, 5], [2, 3, 4], [1, 2]]
This is obtained by noting that the sequence first decreases at the 2 following the 5, so we cut there; the next decrease is at the 1 following the 4, so we cut again there.
Another example is with the sequence
[3,2,1]
the output should be
[[3], [2], [1]]
In the event that the given sequence is increasing we return the same sequence. For example
[1,2,3]
returns the same sequence wrapped in a list, i.e.
[[1, 2, 3]]
For a repeating list like
[ 1, 2,2,2, 1, 2, 3, 3, 1,1,1, 2, 3, 4, 1, 2, 3, 4, 5, 6]
the output should be
[[1, 2, 2, 2], [1, 2, 3, 3], [1, 1, 1, 2, 3, 4], [1, 2, 3, 4, 5, 6]]
What I did to achieve this is define the following function
def splitter(L):
    result = []
    tmp = 0
    initialPoint = 0
    for i in range(len(L)):
        if L[i] < tmp:
            tmpp = L[initialPoint:i]
            result.append(tmpp)
            initialPoint = i
        tmp = L[i]
    result.append(L[initialPoint:])
    return result
The function works, but I would like to do the same with itertools to improve the efficiency of my code. Is there a way to do this with the itertools package and avoid the explicit loop?
With NumPy, you can use numpy.split, which takes the indices at which to split. Since you want to split wherever the value decreases, use numpy.diff to compute the differences, check where they are smaller than zero, and use numpy.where to retrieve the corresponding indices. An example with the last case in the question:
import numpy as np
lst = [ 1, 2,2,2, 1, 2, 3, 3, 1,1,1, 2, 3, 4, 1, 2, 3, 4, 5, 6]
np.split(lst, np.where(np.diff(lst) < 0)[0] + 1)
# [array([1, 2, 2, 2]),
# array([1, 2, 3, 3]),
# array([1, 1, 1, 2, 3, 4]),
# array([1, 2, 3, 4, 5, 6])]
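Psidom already has you covered with a good answer, but another option is to use scipy.signal.argrelmax to find the local maxima and then np.split. Note that argrelmax only reports strict interior maxima, so it will not cut inside a plateau such as 2, 2, 1 or inside a strictly decreasing run such as 3, 2, 1.
import numpy as np
from scipy.signal import argrelmax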
arr = np.random.randint(1000, size=10**6)
splits = np.split(arr, argrelmax(arr)[0]+1)
Assume your original input array:
a = [1, 2, 3, 4, 5, 2, 3, 4, 1, 2]
First find the places where the splits shall occur:
p = [ i+1 for i, (x, y) in enumerate(zip(a, a[1:])) if x > y ]
Then create slices for each such split:
print([ a[m:n] for m, n in zip([ 0 ] + p, p + [ None ]) ])
This will print this:
[[1, 2, 3, 4, 5], [2, 3, 4], [1, 2]]
I suggest using more descriptive names than p, n, m, etc. ;-)
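Since the question asks for itertools specifically, the same split points can be sketched with accumulate and groupby. This is one possible approach rather than an established recipe: `split_increasing` is a hypothetical name, and equal neighbors are treated as non-decreasing, as in the question's examples.

```python
from itertools import accumulate, groupby
from operator import itemgetter

def split_increasing(seq):
    """Split seq into non-decreasing runs, cutting wherever a value drops."""
    seq = list(seq)
    # 1 at every position whose value is smaller than its predecessor
    drops = [0] + [int(b < a) for a, b in zip(seq, seq[1:])]
    # The running sum of drops gives each element the id of its run
    run_ids = accumulate(drops)
    return [[value for _, value in group]
            for _, group in groupby(zip(run_ids, seq), key=itemgetter(0))]

print(split_increasing([1, 2, 3, 4, 5, 2, 3, 4, 1, 2]))
# [[1, 2, 3, 4, 5], [2, 3, 4], [1, 2]]
```

This still visits every element once, so any speed difference from the explicit loop is likely to be modest; the main gain is that the splitting rule is stated in a single comparison.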