How to pre-pad tensorflow ragged tensor with beginning values - python

This question is similar to this question:
pad last dimension of tensor
with the exception that I would like to pre-pad this tensor with the beginning values. Given the ragged tensor:
[[1],
[4, 2],
[1, 2, 3]]
I would expect the output to be:
[[1 1 1],
[4 4 2],
[1 2 3]]
I would like to be able to apply the solution to a larger ragged tensor.

Just use the properties of a ragged tensor:
import tensorflow as tf
x = tf.ragged.constant([[1],
[4, 2],
[1, 2, 3]])
rows_to_pad = tf.abs(x.row_lengths() - tf.reduce_max(x.row_lengths()))
padded_x = tf.concat([tf.RaggedTensor.from_row_lengths(
values=tf.repeat(tf.gather(x.merge_dims(0, -1), x.row_starts()), rows_to_pad, axis=0),
row_lengths=rows_to_pad), x], axis=-1).to_tensor()
[[1 1 1]
[4 4 2]
[1 2 3]]
A different ragged tensor:
x = tf.ragged.constant([[1, 4, 5, 6, 7, 8, 3],
[4, 2],
[1, 2, 3],
[1, 2, 3, 4, 5],
[1]])
Pre-padded:
[[1 4 5 6 7 8 3]
[4 4 4 4 4 4 2]
[1 1 1 1 1 2 3]
[1 1 1 2 3 4 5]
[1 1 1 1 1 1 1]]
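To sanity-check the ragged-tensor result, it can help to compare it against a plain-Python reference. The sketch below is not part of the answer above: the helper name pre_pad_with_first is hypothetical, and it assumes every row has at least one element.

```python
def pre_pad_with_first(rows):
    """Left-pad each row with its own first element up to the longest row's length.

    Hypothetical helper for illustration; assumes no row is empty.
    """
    width = max(len(row) for row in rows)
    return [[row[0]] * (width - len(row)) + list(row) for row in rows]


rows = [[1], [4, 2], [1, 2, 3]]
padded = pre_pad_with_first(rows)
print(padded)
```

The list comprehension mirrors the tensor version: the number of repeats per row is the maximum row length minus that row's length, and the repeated value is the row's first element.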

Related

Sorting/Cluster a 2D numpy array in ordered sequence based on multiple columns

I have a 2D numpy array like this:
[[4 5 2]
[5 5 1]
[5 4 5]
[5 3 4]
[5 4 4]
[4 3 2]]
I would like to sort/cluster the rows by the ordering pattern of their elements, for example row[0]>=row[1]>=row[2], row[0]>=row[2]>row[1], and so on, so that rows sharing the same pattern end up next to each other.
I tried the code lexdf = df[np.lexsort((df[:,2], df[:,1],df[:,0]))][::-1]; however, it does not give what I want.
The output of lexsort:
[[5 5 1]
[5 4 5]
[5 4 4]
[5 3 4]
[4 5 2]
[4 3 2]]
The output I would like to have:
[[5 5 1]
[5 4 4]
[4 3 2]
[5 4 5]
[5 3 4]
[4 5 2]]
or cluster it into three parts:
[[5 5 1]
[5 4 4]
[4 3 2]]
[[5 4 5]
[5 3 4]]
[[4 5 2]]
And I would like to apply this to an array with more columns, so it would be better to do it without iteration. Any ideas to generate this kind of output?
I don't know how to do it in numpy, except maybe with some weird hacks of function numpy.split.
Here is a way to get your groups with python lists:
from itertools import groupby, pairwise
def f(sublist):
return [x <= y for x,y in pairwise(sublist)]
# NOTE: itertools.pairwise requires python>=3.10
# For python<=3.9, use one of those alternatives:
# * more_itertools.pairwise(sublist)
# * zip(sublist, sublist[1:])
a = [[4, 5, 2],
[5, 5, 1],
[5, 4, 5],
[5, 3, 4],
[5, 4, 4],
[4, 3, 2]]
b = [list(g) for _,g in groupby(sorted(a, key=f), key=f)]
print(b)
# [[[4, 3, 2]],
# [[5, 4, 5], [5, 3, 4], [5, 4, 4]],
# [[4, 5, 2], [5, 5, 1]]]
Note: The combination groupby+sorted is slightly inefficient, because sorted takes O(n log n) time. A linear alternative is to group using a dictionary of lists. See for instance the function groupby from module toolz.itertoolz.
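The dictionary-of-lists alternative mentioned in the note could look roughly like the sketch below. The name pattern is hypothetical; it computes the same key as f, but as a tuple so that it can be used as a dictionary key.

```python
from collections import defaultdict


def pattern(row):
    # Same comparison pattern as f above, returned as a hashable tuple.
    return tuple(x <= y for x, y in zip(row, row[1:]))


a = [[4, 5, 2],
     [5, 5, 1],
     [5, 4, 5],
     [5, 3, 4],
     [5, 4, 4],
     [4, 3, 2]]

groups = defaultdict(list)
for row in a:
    groups[pattern(row)].append(row)  # one pass, no sorting

clusters = list(groups.values())
print(clusters)
```

Groups come out in order of first appearance rather than sorted by key, so sort the (few) keys afterwards if a specific group order matters.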

How to change content of numpy array when indexing with a list?

Could anyone explain why indexing an array with a list and indexing it with a slice [x:y] lead to very different results when modifying numpy arrays?
Example:
a = np.array([[1,2,3,4],[3,4,5,5],[4,5,6,3], [1,2,5,5], [1, 2, 3, 4]])
print(a, '\n')
print(a[[3, 4]][:1][:, 1])
a[[3, 4]][:1][:, 1] = 99
print(a, '\n')
print(a[3:4][:1][:, 1])
a[3:4][:1][:, 1] = 99
print(a, '\n')
Output:
[[1 2 3 4]
[3 4 5 5]
[4 5 6 3]
[1 2 5 5]
[1 2 3 4]]
[2]
[[1 2 3 4]
[3 4 5 5]
[4 5 6 3]
[1 2 5 5]
[1 2 3 4]]
[2]
[[ 1 2 3 4]
[ 3 4 5 5]
[ 4 5 6 3]
[ 1 99 5 5]
[ 1 2 3 4]]
Is there a way to modify the array when indexing with a list?
Create an index that selects the desired elements without chaining:
In [114]: a[[3,4],1]=90
In [115]: a
Out[115]:
array([[ 1, 2, 3, 4],
[ 3, 4, 5, 5],
[ 4, 5, 6, 3],
[ 1, 90, 5, 5],
[ 1, 90, 3, 4]])
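As for why the chained version in the question silently does nothing: indexing with a list (advanced indexing) returns a new array, while slicing returns a view onto the original data. A small sketch to illustrate, using np.shares_memory to make the difference visible (the variable names are only for illustration):

```python
import numpy as np

a = np.arange(20).reshape(5, 4)

by_list = a[[3, 4]]   # advanced (list) indexing: a new array with copies of rows 3 and 4
by_slice = a[3:5]     # basic slicing: a view onto the same memory as a

print(np.shares_memory(a, by_list))   # False
print(np.shares_memory(a, by_slice))  # True

by_list[0, 1] = 99    # only changes the temporary copy
by_slice[0, 1] = 77   # writes through to a

print(a[3, 1])        # 77
```

So a[[3, 4]][:1][:, 1] = 99 assigns into a temporary copy that is discarded right away, whereas a single indexing expression such as a[[3, 4], 1] = 90 assigns directly into a.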

reshaping 2-d array using specific block shape [duplicate]

I have a problem reshaping a simple 2-d array into another one.
Let's assume this matrix:
[[4 1 2 1 2 4 1 2 4]
[2 3 0 3 0 2 3 0 2]
[5 5 1 5 1 5 5 1 5]
[6 6 6 6 6 6 6 6 6]]
I want to reshape it into a (12, 3) matrix built from (4, 3) blocks, i.e. to get a matrix like:
[[4 1 2
2 3 0
5 5 1
6 6 6
1 2 4
3 0 2
5 1 5
6 6 6
1 2 4
3 0 2
5 1 5
6 6 6]]
I have highlighted the "edge" of each cut with an additional newline.
I've tried numpy reshape (with every available value of the order parameter), but I still get an array with "mixed" values.
You can always do this manually for custom reshapes:
import numpy as np
data = [[4, 1, 2, 1, 2, 4, 1, 2, 4],
[2, 3, 0, 3, 0, 2, 3, 0, 2],
[5, 5, 1, 5, 1, 5, 5, 1, 5],
[6, 6, 6, 6, 6, 6, 6, 6, 6]]
X = np.array(data)
Z = np.r_[X[:, 0:3], X[:, 3:6], X[:, 6:9]]
print(Z)
yields
array([[4, 1, 2],
[2, 3, 0],
[5, 5, 1],
[6, 6, 6],
[1, 2, 4],
[3, 0, 2],
[5, 1, 5],
[6, 6, 6],
[1, 2, 4],
[3, 0, 2],
[5, 1, 5],
[6, 6, 6]])
Note the special np.r_ object, which concatenates the arrays it is indexed with along rows (the first axis). Here it is a handy shorthand for np.concatenate.
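For an arbitrary number of blocks, the manual slicing can be replaced by a reshape/transpose combination. This is a sketch rather than the answer's method, and it assumes the number of columns is an exact multiple of the block width:

```python
import numpy as np

X = np.array([[4, 1, 2, 1, 2, 4, 1, 2, 4],
              [2, 3, 0, 3, 0, 2, 3, 0, 2],
              [5, 5, 1, 5, 1, 5, 5, 1, 5],
              [6, 6, 6, 6, 6, 6, 6, 6, 6]])

block_width = 3
n_rows, n_cols = X.shape
n_blocks = n_cols // block_width  # assumes n_cols % block_width == 0

# (rows, blocks, width) -> (blocks, rows, width) -> stack the blocks vertically
Z = X.reshape(n_rows, n_blocks, block_width).transpose(1, 0, 2).reshape(-1, block_width)
print(Z.shape)  # (12, 3)
```

The first reshape splits every row into its column blocks, the transpose moves the block axis to the front, and the final reshape stacks the blocks on top of each other.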

Swap two rows in a numpy array in python [duplicate]

How to swap xth and yth rows of the 2-D NumPy array? x & y are inputs provided by the user.
Let's say x = 0 and y = 2, and the input array is as below:
a = [[4 3 1]
[5 7 0]
[9 9 3]
[8 2 4]]
Expected Output :
[[9 9 3]
[5 7 0]
[4 3 1]
[8 2 4]]
I tried multiple things, but did not get the expected result. This is what I tried:
a[x],a[y]= a[y],a[x]
The output I got is:
[[9 9 3]
[5 7 0]
[9 9 3]
[8 2 4]]
Please suggest what is wrong in my solution.
Put the index as a whole:
a[[x, y]] = a[[y, x]]
With your example:
a = np.array([[4,3,1], [5,7,0], [9,9,3], [8,2,4]])
a
# array([[4, 3, 1],
# [5, 7, 0],
# [9, 9, 3],
# [8, 2, 4]])
a[[0, 2]] = a[[2, 0]]
a
# array([[9, 9, 3],
# [5, 7, 0],
# [4, 3, 1],
# [8, 2, 4]])
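As for what goes wrong in the tuple swap: a[y] and a[x] on the right-hand side are views into a, not copies, so once row x has been overwritten, the "saved" old row x has changed too. A short sketch contrasting the two forms (the names broken and fixed are only for illustration):

```python
import numpy as np

a = np.array([[4, 3, 1], [5, 7, 0], [9, 9, 3], [8, 2, 4]])
x, y = 0, 2

broken = a.copy()
# The right-hand side is a tuple of views; after row x is overwritten,
# the view of row x already holds row y's values.
broken[x], broken[y] = broken[y], broken[x]

fixed = a.copy()
# List indexing on the right-hand side makes a copy before anything is assigned.
fixed[[x, y]] = fixed[[y, x]]

print(broken)
print(fixed)
```

Taking an explicit copy of one row first (for example tmp = a[x].copy()) would be another way to avoid the aliasing.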

Tensorflow: stack all row pairs from a tensor

Given a tensor t=[[1,2], [3,4]], I need to produce ts=[[1,2,1,2], [1,2,3,4], [3,4,1,2], [3,4,3,4]]. That is, I need to stack together all row pairs.
Important: the tensor has dimension [None, 2], ie. the first dimension is variable.
I have tried:
Using a tf.while_loop to generate a list of indices idx=[[0, 0], [0, 1], [1, 0], [1, 1]], then tf.gather(t, idx). This works but is messy and I don't know what to do about gradients.
Two for loops iterating over tf.unstack(t), adding stacked rows to a buffer, then tf.stack(buffer). This does not work if the first dimension is variable.
Looking for inspiration in broadcasting. For instance, given x=tf.expand_dims(t, 0), y=tf.expand_dims(t, 1), s=tf.reshape(tf.add(x, y), [-1, 2]), s will be [[2, 4], [4, 6], [4, 6], [6, 8]], i.e. the sum of every row combination. But how can I do stacking instead of summing? I've been failing for 2 days :)
Solution with tf.meshgrid() and some reshaping:
import tensorflow as tf
import numpy as np
t = tf.placeholder(tf.int32, [None, 2])
num_rows, size_row = tf.shape(t)[0], tf.shape(t)[1] # actual dynamic dimensions
# Getting pair indices using tf.meshgrid:
idx_range = tf.range(num_rows)
pair_indices = tf.stack(tf.meshgrid(*[idx_range, idx_range]))
pair_indices = tf.transpose(pair_indices, perm=[1, 2, 0])
# Finally gathering the rows accordingly:
res = tf.reshape(tf.gather(t, pair_indices), (-1, size_row * 2))
with tf.Session() as sess:
print(sess.run(res, feed_dict={t: np.array([[1,2], [3,4], [5,6]])}))
# [[1 2 1 2]
# [3 4 1 2]
# [5 6 1 2]
# [1 2 3 4]
# [3 4 3 4]
# [5 6 3 4]
# [1 2 5 6]
# [3 4 5 6]
# [5 6 5 6]]
Solution using cartesian product:
import tensorflow as tf
import numpy as np
t = tf.placeholder(tf.int32, [None, 2])
num_rows, size_row = tf.shape(t)[0], tf.shape(t)[1] # actual dynamic dimensions
# Getting pair indices by computing the indices cartesian product:
row_idx = tf.range(num_rows)
row_idx_a = tf.expand_dims(tf.tile(tf.expand_dims(row_idx, 1), [1, num_rows]), 2)
row_idx_b = tf.expand_dims(tf.tile(tf.expand_dims(row_idx, 0), [num_rows, 1]), 2)
pair_indices = tf.concat([row_idx_a, row_idx_b], axis=2)
# Finally gathering the rows accordingly:
res = tf.reshape(tf.gather(t, pair_indices), (-1, size_row * 2))
with tf.Session() as sess:
print(sess.run(res, feed_dict={t: np.array([[1,2], [3,4], [5,6]])}))
# [[1 2 1 2]
# [1 2 3 4]
# [1 2 5 6]
# [3 4 1 2]
# [3 4 3 4]
# [3 4 5 6]
# [5 6 1 2]
# [5 6 3 4]
# [5 6 5 6]]
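The two solutions above list the same pairs in different orders. This comes from the default "xy" indexing of meshgrid. The effect is easy to see with np.meshgrid; tf.meshgrid also takes an indexing argument with the same default, so the NumPy sketch below is assumed to carry over, but treat it as an illustration rather than a drop-in replacement:

```python
import numpy as np

idx = np.arange(3)

# Default indexing="xy": the first index of each pair varies fastest after flattening.
xy = np.stack(np.meshgrid(idx, idx), axis=-1).reshape(-1, 2)

# indexing="ij": the second index varies fastest (the usual Cartesian-product order).
ij = np.stack(np.meshgrid(idx, idx, indexing="ij"), axis=-1).reshape(-1, 2)

print(xy[:4].tolist())  # [[0, 0], [1, 0], [2, 0], [0, 1]]
print(ij[:4].tolist())  # [[0, 0], [0, 1], [0, 2], [1, 0]]
```

Passing indexing="ij" in the meshgrid solution should therefore produce the same row order as the Cartesian-product solution.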
Can be achieved by:
tf.concat([tf.tile(tf.expand_dims(t,1), [1, tf.shape(t)[0], 1]), tf.tile(tf.expand_dims(t,0), [tf.shape(t)[0], 1, 1])], axis=2)
Detailed steps:
import numpy as np
import tensorflow as tf
t = tf.placeholder(tf.int32, shape=[None, 2])
#repeat each row of t
d = tf.tile(tf.expand_dims(t,1), [1, tf.shape(t)[0], 1])
#Output:
#[[[1 2] [1 2]]
# [[3 4] [3 4]]]
#repeat the entire input t
e = tf.tile(tf.expand_dims(t,0), [tf.shape(t)[0], 1, 1])
#Output:
#[[[1 2] [3 4]]
# [[1 2] [3 4]]]
#concat
f = tf.concat([d, e], axis=2)
with tf.Session() as sess:
print(sess.run(f, {t:np.asarray([[1,2],[3,4]])}))
#Output
#[[[1 2 1 2]
#[1 2 3 4]]
#[[3 4 1 2]
#[3 4 3 4]]]
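The result f above has shape [n, n, 4]; reshaping it to [-1, 4] gives the flat list of pairs asked for in the question. The same repeat-and-tile idea can be checked quickly in NumPy. This is a NumPy analogue, not TensorFlow code, and the function name stack_row_pairs is hypothetical:

```python
import numpy as np


def stack_row_pairs(t):
    """Return every ordered pair of rows of t, concatenated side by side."""
    n = t.shape[0]
    left = np.repeat(t, n, axis=0)   # each row repeated n times in a row
    right = np.tile(t, (n, 1))       # the whole array repeated n times
    return np.hstack([left, right])


t = np.array([[1, 2], [3, 4]])
ts = stack_row_pairs(t)
print(ts)
```

Here np.repeat plays the role of d (each row repeated) and np.tile the role of e (the entire input repeated), already flattened to two dimensions.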
