I have a dataset of n x m tensors that I need to pad to 13 x m tensors (n <= 13); however, I cannot figure out how to do this without TensorFlow losing the shape of my tensors.
I am trying to apply the map function to these, but tf.constant cannot accept a tensor as part of the padding specification, and because of map's requirement I cannot just use the numpy method.
def pad_tensor(x):
    current_length = tf.shape(x)[0]
    additional_length = 13 - current_length
    padding = tf.constant([[0, additional_length], [0, 0]])
    return tf.pad(x, padding, "CONSTANT")
I know I can use py_func, but when I do that, TensorFlow loses the shape of the data in the dataset.
Any help would be appreciated.
PS: I'm not sure exactly what you mean by "apply the map function to these" and "because of map's requirement I cannot just use the numpy method". If you still have a problem after following the example below, please define the problem more clearly.
FWIW, adding .numpy() makes the code run without any error (this relies on eager execution):
import tensorflow as tf
import numpy as np
def pad_tensor(x):
    current_length = tf.shape(x)[0]
    additional_length = 13 - current_length.numpy()
    padding = tf.constant([[0, additional_length], [0, 0]])
    return tf.pad(x, padding, "CONSTANT")
n = 3
m = 5
x = tf.constant(np.random.rand(n, m))
pad_tensor(x)
Outputs:
<tf.Tensor: shape=(13, 5), dtype=float64, numpy=
array([[0.35710346, 0.49611589, 0.18744049, 0.91046784, 0.19934265],
[0.51464596, 0.96416921, 0.87008494, 0.52756893, 0.23010099],
[0.05335277, 0.88451633, 0.25949178, 0.91156944, 0.03638372],
[0. , 0. , 0. , 0. , 0. ],
[0. , 0. , 0. , 0. , 0. ],
[0. , 0. , 0. , 0. , 0. ],
[0. , 0. , 0. , 0. , 0. ],
[0. , 0. , 0. , 0. , 0. ],
[0. , 0. , 0. , 0. , 0. ],
[0. , 0. , 0. , 0. , 0. ],
[0. , 0. , 0. , 0. , 0. ],
[0. , 0. , 0. , 0. , 0. ],
[0. , 0. , 0. , 0. , 0. ]])>
I have a matrix like this:
profile=np.array([[0,0,0.5,0.1],
[0.3,0,0,0],
[0,0,0.1,0.9],
[0,0,0,0.1],
[0,0.5,0,0]])
And I want to add a row before and after filled with zeros. How can I do that?
I thought of using np.pad but not sure how.
Output should be:
np.array([[0,0,0,0],
[0,0,0.5,0.1],
[0.3,0,0,0],
[0,0,0.1,0.9],
[0,0,0,0.1],
[0,0.5,0,0],
[0,0,0,0]])
The np.pad function allows you to specify the axes you want to pad:
In [3]: np.pad(profile, ((1, 1), (0, 0)))
Out[3]:
array([[0. , 0. , 0. , 0. ],
[0. , 0. , 0.5, 0.1],
[0.3, 0. , 0. , 0. ],
[0. , 0. , 0.1, 0.9],
[0. , 0. , 0. , 0.1],
[0. , 0.5, 0. , 0. ],
[0. , 0. , 0. , 0. ]])
The nested tuple can be read as: pad 1 array "above", and 1 array "below" axis 0, and pad 0 arrays "above" and 0 arrays "below" axis 1.
Another example, which pads five columns "after" on axis 1:
In [4]: np.pad(profile, ((0, 0), (0, 5)))
Out[4]:
array([[0. , 0. , 0.5, 0.1, 0. , 0. , 0. , 0. , 0. ],
[0.3, 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ],
[0. , 0. , 0.1, 0.9, 0. , 0. , 0. , 0. , 0. ],
[0. , 0. , 0. , 0.1, 0. , 0. , 0. , 0. , 0. ],
[0. , 0.5, 0. , 0. , 0. , 0. , 0. , 0. , 0. ]])
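The same pad_width tuple can be computed from a target shape instead of being written by hand. The following is a minimal sketch, assuming a 2-D input; the helper name pad_rows_to and the target of 13 rows are made up for illustration:

```python
import numpy as np

def pad_rows_to(a, target_rows):
    """Zero-pad a 2-D array at the bottom up to target_rows rows (hypothetical helper)."""
    extra = target_rows - a.shape[0]
    if extra < 0:
        raise ValueError("array already has more rows than target_rows")
    # (0, extra) pads only "after" on axis 0; (0, 0) leaves axis 1 untouched
    return np.pad(a, ((0, extra), (0, 0)), mode="constant")

profile = np.array([[0, 0, 0.5, 0.1],
                    [0.3, 0, 0, 0],
                    [0, 0, 0.1, 0.9],
                    [0, 0, 0, 0.1],
                    [0, 0.5, 0, 0]])
padded = pad_rows_to(profile, 13)
print(padded.shape)
```

The original rows stay at the top and every appended row is zero.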
You can use np.pad:
out = np.pad(profile, 1)[:, 1:-1]
Output:
>>> out
array([[0. , 0. , 0. , 0. ],
[0. , 0. , 0.5, 0.1],
[0.3, 0. , 0. , 0. ],
[0. , 0. , 0.1, 0.9],
[0. , 0. , 0. , 0.1],
[0. , 0.5, 0. , 0. ],
[0. , 0. , 0. , 0. ]])
Because np.pad pads it on all sides (left and right, in addition to top and bottom), [:, 1:-1] slices off the first and last columns.
How can I compute the gradient of a log-likelihood layer if I have:
logP = [[-5.8971105e+00 -1.3536860e-01 -2.3225722e+00 -3.6559267e+00]
[-7.1035299e+00 -7.1037712e+00 -8.0828800e+00 -1.9549085e-03]]
oneHotTruth = [[0. 0. 0. 1.]
[0. 0. 0. 1.]]
gradInput should be equal to:
[[ 0. 0. 0. -0.5]
[ 0. 0. 0. -0.5]]
I need to implement this without using a library such as PyTorch or TensorFlow.
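One way to get the expected gradInput with NumPy only is sketched below. It assumes the layer computes the negative log-likelihood averaged over the batch, loss = -(1/N) * sum(oneHotTruth * logP), which is consistent with the expected value of -0.5 for a batch of 2. The function names are made up for illustration:

```python
import numpy as np

def nll_loss(log_p, one_hot_truth):
    """Batch-averaged negative log-likelihood (assumed definition of the layer)."""
    batch_size = log_p.shape[0]
    return -(one_hot_truth * log_p).sum() / batch_size

def nll_grad_input(log_p, one_hot_truth):
    """Gradient of nll_loss with respect to log_p."""
    batch_size = log_p.shape[0]
    # d(loss)/d(log_p[i, j]) = -one_hot_truth[i, j] / N
    return -one_hot_truth / batch_size

log_p = np.array([[-5.8971105e+00, -1.3536860e-01, -2.3225722e+00, -3.6559267e+00],
                  [-7.1035299e+00, -7.1037712e+00, -8.0828800e+00, -1.9549085e-03]])
one_hot_truth = np.array([[0., 0., 0., 1.],
                          [0., 0., 0., 1.]])

grad_input = nll_grad_input(log_p, one_hot_truth)
print(grad_input)
```

Under this assumption the gradient does not depend on the values in logP, only on the one-hot targets and the batch size.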
I am trying to create anti-aliased (weighted and not boolean) circular masks for making circular kernels for use in convolution.
import numpy as np

radius = 3  # number of pixels set to 1 on either side of the center pixel;
            # may also be fractional, and is not the true radius
kernel_size = 9
kernel_radius = (kernel_size - 1) // 2
x, y = np.ogrid[-kernel_radius:kernel_radius+1, -kernel_radius:kernel_radius+1]
dist = (x**2 + y**2)**0.5  # distance of each pixel from the center
mask = (dist - radius).clip(0, 1)
print(mask)
and the output is
array([[1. , 1. , 1. , 1. , 1. , 1. , 1. , 1. , 1. ],
[1. , 1. , 0.61, 0.16, 0. , 0.16, 0.61, 1. , 1. ],
[1. , 0.61, 0. , 0. , 0. , 0. , 0. , 0.61, 1. ],
[1. , 0.16, 0. , 0. , 0. , 0. , 0. , 0.16, 1. ],
[1. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 1. ],
[1. , 0.16, 0. , 0. , 0. , 0. , 0. , 0.16, 1. ],
[1. , 0.61, 0. , 0. , 0. , 0. , 0. , 0.61, 1. ],
[1. , 1. , 0.61, 0.16, 0. , 0.16, 0.61, 1. , 1. ],
[1. , 1. , 1. , 1. , 1. , 1. , 1. , 1. , 1. ]])
Then we can do
mask = 1 - mask
print(mask)
to get
array([[0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ],
[0. , 0. , 0.39, 0.84, 1. , 0.84, 0.39, 0. , 0. ],
[0. , 0.39, 1. , 1. , 1. , 1. , 1. , 0.39, 0. ],
[0. , 0.84, 1. , 1. , 1. , 1. , 1. , 0.84, 0. ],
[0. , 1. , 1. , 1. , 1. , 1. , 1. , 1. , 0. ],
[0. , 0.84, 1. , 1. , 1. , 1. , 1. , 0.84, 0. ],
[0. , 0.39, 1. , 1. , 1. , 1. , 1. , 0.39, 0. ],
[0. , 0. , 0.39, 0.84, 1. , 0.84, 0.39, 0. , 0. ],
[0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ]])
I can now normalize and use this as my circular filter (kernel) in convolution operations.
Note: the radius can be fractional, e.g. get_circular_kernel(0.5, (5, 5)) should give
array([[0. , 0. , 0. , 0. , 0. ],
[0. , 0.08578644, 0.5 , 0.08578644, 0. ],
[0. , 0.5 , 1. , 0.5 , 0. ],
[0. , 0.08578644, 0.5 , 0.08578644, 0. ],
[0. , 0. , 0. , 0. , 0. ]])
I want to generate at least a million of these, with kernel_size fixed and the radius changing, so is there a better or more efficient way to do this? (Perhaps one that avoids costly operations like sqrt while staying accurate enough relative to the arc integrals, i.e. the area covered by the curve within each pixel?)
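For reference, the steps in the question can be wrapped into a function with the get_circular_kernel signature used in the note. This is only a sketch of the question's own approach, assuming the second argument is the kernel shape and that both sides are odd:

```python
import numpy as np

def get_circular_kernel(radius, kernel_shape):
    """Anti-aliased circular mask, following the clip-based approach above."""
    rows, cols = kernel_shape
    ry, rx = (rows - 1) // 2, (cols - 1) // 2
    y, x = np.ogrid[-ry:ry + 1, -rx:rx + 1]
    dist = np.sqrt(x**2 + y**2)  # distance of each pixel from the center
    # 1 inside the radius, linearly falling to 0 over one pixel outside it
    return 1 - (dist - radius).clip(0, 1)

kernel = get_circular_kernel(0.5, (5, 5))
print(kernel)
```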
Since you want to generate a large number of kernels with the same size, you can greatly improve performance by constructing all the kernels in one step rather than one after the other in a loop. Given num_radii radius values, one per kernel, you can create a single array of shape (num_radii, kernel_size, kernel_size). The price of this vectorization is memory: you have to fit all these values in RAM. If you can't, split your millions of radii into a handful of smaller batches and generate each batch separately.
The only thing you need to change is to take an array of radii (rather than a scalar radius), and inject two trailing singleton dimensions so that your mask creation triggers broadcasting:
import numpy as np
kernel_size = 9
kernel_radius = (kernel_size - 1) // 2
x, y = np.ogrid[-kernel_radius:kernel_radius+1, -kernel_radius:kernel_radius+1]
dist = (x**2 + y**2)**0.5 # shape (kernel_size, kernel_size)
# let's create three kernels for the sake of example
radii = np.array([3, 3.5, 4])[...,None,None] # shape (num_radii, 1, 1)
# using ... allows compatibility with arbitrarily-shaped radius arrays
masks = 1 - (dist - radii).clip(0,1) # shape (num_radii, kernel_size, kernel_size)
Now masks[0,...] (or masks[0] for short, but I prefer the explicit version) contains the example mask in your question, and masks[1,...] and masks[2,...] contain the kernels for radii 3.5 and 4, respectively.
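If the full array does not fit in memory, the batching mentioned above could look roughly like this. It is a sketch only; the generator name iter_mask_batches and the batch size are arbitrary choices for illustration:

```python
import numpy as np

def iter_mask_batches(radii, kernel_size, batch_size):
    """Yield arrays of shape (<=batch_size, kernel_size, kernel_size), one mask per radius."""
    kernel_radius = (kernel_size - 1) // 2
    x, y = np.ogrid[-kernel_radius:kernel_radius + 1, -kernel_radius:kernel_radius + 1]
    dist = np.sqrt(x**2 + y**2)  # computed once, shared by every batch
    radii = np.asarray(radii, dtype=float)
    for start in range(0, len(radii), batch_size):
        # two trailing singleton axes so the radii broadcast against dist
        chunk = radii[start:start + batch_size, None, None]
        yield 1 - (dist - chunk).clip(0, 1)

radii = np.linspace(1, 4, 10)
batches = list(iter_mask_batches(radii, kernel_size=9, batch_size=4))
print([b.shape for b in batches])
```

Each batch can be consumed (normalized, written to disk, used in a convolution) before the next one is generated, so only one batch needs to be in RAM at a time.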
If you want to build millions of masks, you should precompute once whatever never changes, and compute only what is strictly necessary for each radius.
You can try something like this:
import numpy as np

class Circle:
    def __init__(self, kernel_size):
        self._kernel_size = kernel_size
        self._kernel_radius = (self._kernel_size - 1) // 2
        x, y = np.ogrid[
            -self._kernel_radius:self._kernel_radius+1,
            -self._kernel_radius:self._kernel_radius+1]
        # the distance grid is computed once and reused for every radius
        self._dist = np.sqrt(x**2 + y**2)

    def __call__(self, radius):
        mask = self._dist - radius  # new array; self._dist is left untouched
        mask = np.clip(mask, 0, 1, out=mask)
        mask *= -1
        mask += 1
        return mask

circle = Circle(kernel_size=9)
# range() does not accept a float step, so use np.arange instead
for radius in np.arange(1, 4, 0.2):
    mask = circle(radius)
    print(mask)
I did the operations in place as much as possible to optimize for speed and memory, but for small arrays it won't matter much.
I have a numpy array:
arr=np.array([0,1,0,0.5])
I need to form a new array from it in which every element is expanded to three values: a zero becomes three zeros, and a non-zero element becomes two zeros followed by the number itself. It is as follows:
[0,1,0,0.5] ->
0,0,0 [for index 0]
0,0,1 [for index 1]
0,0,0 [for index 2, which again has a zero] and
0,0,0.5 [for index 3]
final output should be:
new_arr=[0,0,0,0,0,1,0,0,0,0,0,0.5]
np.repeat() repeats every array element n times, but that is not exactly what I want. How should this be done? Thanks for the help.
A quick reshape followed by a call to np.pad will do it:
np.pad(arr.reshape(-1, 1), ((0, 0), (2, 0)), 'constant')
Output:
array([[ 0. , 0. , 0. ],
[ 0. , 0. , 1. ],
[ 0. , 0. , 0. ],
[ 0. , 0. , 0.5]])
You'll want to flatten it back again. That's simply done by calling .reshape(-1, ).
>>> np.pad(arr.reshape(-1, 1), ((0, 0), (2, 0)), 'constant').reshape(-1, )
array([ 0. , 0. , 0. , 0. , 0. , 1. , 0. , 0. , 0. , 0. , 0. ,
0.5])
A variant on the pad idea is to concatenate a 2d array of zeros:
In [477]: arr=np.array([0,1,0,0.5])
In [478]: np.column_stack([np.zeros((len(arr),2)),arr])
Out[478]:
array([[ 0. , 0. , 0. ],
[ 0. , 0. , 1. ],
[ 0. , 0. , 0. ],
[ 0. , 0. , 0.5]])
In [479]: _.ravel()
Out[479]:
array([ 0. , 0. , 0. , 0. , 0. , 1. , 0. , 0. , 0. , 0. , 0. ,
0.5])
or padding in the other direction:
In [481]: np.vstack([np.zeros((2,len(arr))),arr])
Out[481]:
array([[ 0. , 0. , 0. , 0. ],
[ 0. , 0. , 0. , 0. ],
[ 0. , 1. , 0. , 0.5]])
In [482]: _.T.ravel()
Out[482]:
array([ 0. , 0. , 0. , 0. , 0. , 1. , 0. , 0. , 0. , 0. , 0. ,
0.5])
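Another option, not taken from the answers above, is to preallocate the output with zeros and write the original values into every third slot with a strided slice. A minimal sketch:

```python
import numpy as np

arr = np.array([0, 1, 0, 0.5])

# three output slots per input element, all zero to begin with
new_arr = np.zeros(3 * arr.size, dtype=arr.dtype)
# positions 2, 5, 8, ... are the last slot of each group of three
new_arr[2::3] = arr
print(new_arr)
```

This avoids building the intermediate 2-D array, at the cost of being a little less self-explanatory than the pad version.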
Let W be some matrix of dimension (x, nP) [see end of question]
Right now, I'm using the following code:
uUpperDraw = np.zeros(W.shape)
for p in np.arange(0, nP):
    uUpperDraw[s, p] = (W[s+1,:(p+1)]).sum()
I want to vectorize this for efficiency gains. Given a pGrid = [0, 1, ...], how can I reproduce the following?
uUpperDraw = np.array([sum(W[x, 0]), sum(W[x,0] + W[x, 1]), sum(W[x,0] + W[x, 1] + W[x, 2]) ...
Here is some reproducible example.
>>> s, nP
(3, 10)
>>> W
array([[ 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ],
[ 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ],
[ 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ],
[ 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ],
[ 2. , 1.63636364, 1.38461538, 1.2 , 1.05882353,
0.94736842, 0.85714286, 0.7826087 , 0.72 , 0.66666667]])
>>> uUpperDraw
array([[ 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. ,
0. , 0. ],
[ 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. ,
0. , 0. ],
[ 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. ,
0. , 0. ],
[ 2. , 3.63636364, 5.02097902, 6.22097902,
7.27980255, 8.22717097, 9.08431383, 9.86692252,
10.58692252, 11.25358919],
[ 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. ,
0. , 0. ]])
This looks like a cumulative sum. If you want the cumulative sum of each row separately, this works:
uUpperDraw = np.cumsum(W,axis=1)
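Note that the loop in the question writes the running sums of row s+1 into row s, whereas np.cumsum(W, axis=1) keeps each row's sums in that same row. To reproduce the example output exactly, the cumulative sum of the one source row can be assigned to the target row. A sketch using the values from the question:

```python
import numpy as np

s, nP = 3, 10
W = np.zeros((5, nP))
W[4] = [2., 1.63636364, 1.38461538, 1.2, 1.05882353,
        0.94736842, 0.85714286, 0.7826087, 0.72, 0.66666667]

# vectorized replacement for the loop over p:
# row s receives the running sums of row s + 1
uUpperDraw = np.zeros(W.shape)
uUpperDraw[s] = np.cumsum(W[s + 1])
print(uUpperDraw[s])
```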