How to efficiently draw a plot of a torch.nn model? - python

I'm exploring neural networks, and I want to model some pictures with a neural network. A picture is a function that maps pixel coordinates to a color, so my network has 2 input variables (x, y) and 1 (shade) to 3 (R, G, B) outputs. For example, like this:
import torch
import torch.nn as nn
net = nn.Sequential(
    nn.Linear(2, 2),
    nn.Sigmoid(),
    nn.Linear(2, 1),
)
Now, I plot it like this:
import matplotlib.pyplot as plt
import numpy as np
def draw_image1(f):
    image = []
    y = 1
    delta = 0.005
    while y > 0:
        x = 0
        row = []
        while x < 1:
            row.append(f(x, y))
            x += delta
        image.append(row)
        y -= delta
    plt.imshow(image, extent=[0, 1, 0, 1], cmap='winter')
    plt.draw()
draw_image1(lambda x, y: net(torch.Tensor([x, y])).item())
But it looks ugly and is slow because it uses Python lists instead of numpy arrays or tensors.
I have another version of code that draws images from functions, which looks better and is 100x faster:
def draw_image2(f):
    x = np.linspace(0, 1, num = 200)
    y = np.linspace(0, 1, num = 200)
    X, Y = np.meshgrid(x, y)
    image = f(X, Y)
    plt.imshow(image, extent=[0, 1, 0, 1], cmap='winter')
    plt.draw()
It works for functions that use numpy operations (like lambda x, y: x + y), but when I plug in my net in the same way as for the previous function (draw_image2(lambda x, y: net(torch.Tensor([x, y])).item())), I get RuntimeError: mat1 and mat2 shapes cannot be multiplied (400x200 and 2x2), which I understand as my neural net complaining that it wants to be fed data in smaller pieces.
Is there any proper way to plot pytorch neural network output?

To feed a whole batch into nn.Linear(i, o), the input typically has the shape (b, i), where b is the size of the batch. According to the documentation, you can also have additional "batch" dimensions in between, as long as the last dimension has size i. Since PyTorch was primarily made for deep learning based on stochastic gradient descent, pretty much all of its modules expect at least one batch dimension.
So you could easily modify your second plotting function to something like:
import torch
import torch.nn as nn
import matplotlib.pyplot as plt
net = nn.Sequential(
    nn.Linear(2, 2),
    nn.Sigmoid(),
    nn.Linear(2, 1),
)
def draw_image2(f):
    device = torch.device('cpu') # or use your gpu alternatively
    with torch.no_grad(): # disable building evaluation graph if you don't need it
        x = torch.linspace(0, 1, 200)
        y = torch.linspace(0, 1, 200)
        X, Y = torch.meshgrid(x, y)
        # the data dimension should be the last (2), as per documentation
        inp = torch.stack([X, Y], dim=2).to(device) # shape = (200, 200, 2)
        image = f(inp) # shape = (200, 200, 1)
        image = image[..., 0].detach().cpu() # shape (200, 200)
    plt.imshow(image, extent=[0, 1, 0, 1], cmap='winter')
    plt.show()
    return image
draw_image2(net)
Note that the with torch.no_grad() block is not necessary for it to work, but it will save you some time. Depending on your network architecture, it might also be worth setting your network to eval mode (net.eval()) first. Finally, the .to(device)/.cpu() calls are also unnecessary if you're not using your GPU.
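The batching idea above can also be written by flattening the grid into an ordinary (N, 2) batch. This is only a minimal sketch: the helper name evaluate_on_grid is made up for illustration, and it assumes a PyTorch version where torch.meshgrid accepts the indexing argument (1.10 or later). With indexing="xy", rows follow y and columns follow x, as with np.meshgrid; torch.meshgrid's default "ij" indexing gives the transposed layout.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(2, 2), nn.Sigmoid(), nn.Linear(2, 1))

def evaluate_on_grid(model, n=200):
    """Evaluate `model` on an n x n grid over [0, 1]^2; returns an (n, n) array.

    Hypothetical helper for illustration: image[i, j] is the model output
    at x = xs[j], y = ys[i].
    """
    xs = torch.linspace(0, 1, n)
    ys = torch.linspace(0, 1, n)
    # indexing="xy": X[i, j] = xs[j] and Y[i, j] = ys[i], like np.meshgrid
    X, Y = torch.meshgrid(xs, ys, indexing="xy")
    # Flatten the grid into a plain batch of n*n points with 2 features each
    points = torch.stack([X.reshape(-1), Y.reshape(-1)], dim=1)
    with torch.no_grad():
        values = model(points)  # shape (n*n, 1)
    return values.reshape(n, n).numpy()

image = evaluate_on_grid(net)
```

The result could then be shown with plt.imshow(image, extent=[0, 1, 0, 1], origin="lower", cmap="winter"), where origin="lower" puts y = 0 at the bottom of the plot.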

Related

Tensorflow custom filter layer definition like glcm or gabor

I want to apply various filters, such as GLCM or a Gabor filter bank, as a custom layer in TensorFlow, but I could not find enough custom layer examples. How can I apply these types of filters as a layer?
The process of generating GLCM is defined in the scikit-image library as follows:
from skimage.feature import greycomatrix, greycoprops
from skimage import data
#load image
img = data.brick()
#result glcm
glcm = greycomatrix(img, distances=[5], angles=[0], levels=256, symmetric=True, normed=True)
The use of Gabor filter bank is as follows:
import matplotlib.pyplot as plt
import numpy as np
from scipy import ndimage as ndi
from skimage import data
from skimage.util import img_as_float
from skimage.filters import gabor_kernel
shrink = (slice(0, None, 3), slice(0, None, 3))
brick = img_as_float(data.brick())[shrink]
grass = img_as_float(data.grass())[shrink]
gravel = img_as_float(data.gravel())[shrink]
image_names = ('brick', 'grass', 'gravel')
images = (brick, grass, gravel)
def power(image, kernel):
    # Normalize images for better comparison.
    image = (image - image.mean()) / image.std()
    return np.sqrt(ndi.convolve(image, np.real(kernel), mode='wrap')**2 +
                   ndi.convolve(image, np.imag(kernel), mode='wrap')**2)
# Plot a selection of the filter bank kernels and their responses.
results = []
kernel_params = []
for theta in (0, 1):
    theta = theta / 4. * np.pi
    for sigmax in (1, 3):
        for sigmay in (1, 3):
            for frequency in (0.1, 0.4):
                kernel = gabor_kernel(frequency, theta=theta,sigma_x=sigmax, sigma_y=sigmay)
                params = 'theta=%d,f=%.2f\nsx=%.2f sy=%.2f' % (theta * 180 / np.pi, frequency,sigmax, sigmay)
                kernel_params.append(params)
                # Save kernel and the power image for each image
                results.append((kernel, [power(img, kernel) for img in images]))
fig, axes = plt.subplots(nrows=6, ncols=4, figsize=(5, 6))
plt.gray()
fig.suptitle('Image responses for Gabor filter kernels', fontsize=12)
axes[0][0].axis('off')
# Plot original images
for label, img, ax in zip(image_names, images, axes[0][1:]):
    ax.imshow(img)
    ax.set_title(label, fontsize=9)
    ax.axis('off')
for label, (kernel, powers), ax_row in zip(kernel_params, results, axes[1:]):
    # Plot Gabor kernel
    ax = ax_row[0]
    ax.imshow(np.real(kernel))
    ax.set_ylabel(label, fontsize=7)
    ax.set_xticks([])
    ax.set_yticks([])
    # Plot Gabor responses with the contrast normalized for each filter
    vmin = np.min(powers)
    vmax = np.max(powers)
    for patch, ax in zip(powers, ax_row[1:]):
        ax.imshow(patch, vmin=vmin, vmax=vmax)
        ax.axis('off')
plt.show()
How do I define these and similar filters in TensorFlow?
I tried the code above, but it didn't give the same results as https://scikit-image.org/docs/dev/auto_examples/features_detection/plot_gabor.html
I got this:
import numpy as np
import matplotlib.pyplot as plt
import tensorflow.keras.backend as K
from tensorflow.keras import Input, layers
from tensorflow.keras.models import Model
from scipy import ndimage as ndi
from skimage import data
from skimage.util import img_as_float
from skimage.filters import gabor_kernel
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
def gfb_filter(shape,size=3, tlist=[1,2,3], slist=[2,5],flist=[0.01,0.25],dtype=None):
    print(shape)
    fsize=np.ones([size,size])
    kernels = []
    for theta in tlist:
        theta = theta / 4. * np.pi
        for sigma in slist:
            for frequency in flist:
                kernel = np.real(gabor_kernel(frequency, theta=theta,sigma_x=sigma, sigma_y=sigma))
                kernels.append(kernel)
    gfblist = []
    for k, kernel in enumerate(kernels):
        ck=ndi.convolve(fsize, kernel, mode='wrap')
        gfblist.append(ck)
    gfblist=np.asarray(gfblist).reshape(size,size,1,len(gfblist))
    print(gfblist.shape)
    return K.variable(gfblist, dtype='float32')
dimg=img_as_float(data.brick())
input_mat = dimg.reshape((1, 512, 512, 1))
def build_model():
    input_tensor = Input(shape=(512,512,1))
    x = layers.Conv2D(filters=12,
                      kernel_size = 3,
                      kernel_initializer=gfb_filter,
                      strides=1,
                      padding='valid') (input_tensor)
    model = Model(inputs=input_tensor, outputs=x)
    return model
model = build_model()
out = model.predict(input_mat)
print(out)
o1=out.reshape(12,510,510)
plt.subplot(2,2,1)
plt.imshow(dimg)
plt.subplot(2,2,2)
plt.imshow(o1[0,:,:])
plt.subplot(2,2,3)
plt.imshow(o1[6,:,:])
plt.subplot(2,2,4)
plt.imshow(o1[10,:,:])
You can read the documentation about writing a custom layer, and about Making new Layers and Models via subclassing.
Here is a simple implementation of the Gabor filter bank based on your code:
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers
from skimage.filters import gabor_kernel
class GaborFilterBank(layers.Layer):
    def __init__(self):
        super().__init__()
    def build(self, input_shape):
        # assumption: shape is NHWC
        self.n_channel = input_shape[-1]
        self.kernels = []
        for theta in range(4):
            theta = theta / 4.0 * np.pi
            for sigma in (1, 3):
                for frequency in (0.05, 0.25):
                    kernel = np.real(
                        gabor_kernel(
                            frequency, theta=theta, sigma_x=sigma, sigma_y=sigma
                        )
                    ).astype(np.float32)
                    # tf.nn.conv2d does crosscorrelation, not convolution, so flipping
                    # the kernel is needed
                    kernel = np.flip(kernel)
                    # we stack the kernel on itself to match the number of channel of
                    # the input
                    kernel = np.stack((kernel,)*self.n_channel, axis=-1)
                    # print(kernel.shape)
                    # adding the number of out channel, here 1.
                    kernel = kernel[:, :, : , np.newaxis]
                    # because the kernel shapes are different, we can't do the conv op
                    # in one go, so we stack the kernels in a list
                    self.kernels.append(tf.Variable(kernel, trainable=False))
    def call(self, x):
        out_list = []
        for kernel in self.kernels:
            out_list.append(tf.nn.conv2d(x, kernel, strides=1, padding="SAME"))
        # output is [batch_size, H, W, 16] where 16 is the number of filters
        # 16 = n_theta * n_sigma * n_freq = 4 * 2 * 2
        return tf.concat(out_list,axis=-1)
There are some differences, though:
tensorflow does not have a "wrap" mode for convolution. I used "SAME", which is akin to "constant" with a padding value of 0 in scipy. It is possible to provide your own padding, so you can mimic the "wrap" mode; I leave that as an exercise to the reader.
tf.nn.conv2d expects a 4D input, so I add a batch dimension and a channel dimension to the image before passing it in.
The filters for tf.nn.conv2d must have the shape [filter_height, filter_width, in_channels, out_channels]. Here I use the number of channels of the input as in_channels. out_channels could be equal to the number of filters in the filter bank, but because the kernel shapes differ, it is easier to concatenate the outputs afterwards, so I set it to 1. This means the output of the layer is [N,H,W,C], where C is the number of filters in the bank (in this example, 16).
tf.nn.conv2d is not a true convolution but a cross-correlation (see the doc), so the filters need to be flipped beforehand to get an actual convolution.
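Two of the points above, the kernel flip and the "wrap" border mode, can be checked on small arrays. The following is a NumPy-only sketch, not TensorFlow code: the helper names cross_correlate_valid and convolve_wrap are invented for illustration, and the kernel is assumed to have odd height and width. It pads the image periodically, runs a plain "valid" cross-correlation with the flipped kernel, and compares the result with a circular convolution written directly from its definition.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def cross_correlate_valid(image, kernel):
    """Plain 'valid' cross-correlation of a 2D image with a 2D kernel."""
    windows = sliding_window_view(image, kernel.shape)
    return np.einsum("ijkl,kl->ij", windows, kernel)

def convolve_wrap(image, kernel):
    """True convolution with periodic ('wrap') borders, odd-sized kernel assumed."""
    ph, pw = kernel.shape[0] // 2, kernel.shape[1] // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)), mode="wrap")
    # Flipping the kernel turns the cross-correlation into a convolution
    return cross_correlate_valid(padded, np.flip(kernel))

rng = np.random.default_rng(0)
image = rng.random((8, 8))
kernel = rng.random((3, 3))
result = convolve_wrap(image, kernel)

# Reference: circular convolution written directly from its definition
reference = np.zeros_like(image)
for a in range(3):
    for b in range(3):
        reference += kernel[a, b] * np.roll(image, (a - 1, b - 1), axis=(0, 1))
```

The same padding idea should carry over to the layer: pad the input periodically first, then call the convolution with padding="VALID" instead of "SAME".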
I'm adding a quick example on how to use it:
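# imports needed for the example image
from skimage import data
from skimage.util import img_as_float
# defining the model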
inp = tf.keras.Input(shape=(512,512,1))
conv = tf.keras.layers.Conv2D(4, (3,3), padding="SAME")(inp)
g = GaborFilterBank()(conv)
model = tf.keras.Model(inputs=inp, outputs=g)
# calling the model with an example Image
img = img_as_float(data.brick())
img_nhwc = img[np.newaxis, :, :, np.newaxis]
out = model(img_nhwc)
# out shape is [1,512,512,16]

How do I rotate a PyTorch image tensor around its center in a way that supports autograd?

I'd like to randomly rotate an image tensor (B, C, H, W) around its center (2D rotation, I think?). I would like to avoid using NumPy and Kornia, so that I basically only need to import from the torch module. I'm also not using torchvision.transforms, because I need it to be autograd compatible. Essentially I'm trying to create an autograd-compatible version of torchvision.transforms.RandomRotation() for visualization techniques like DeepDream (so I need to avoid artifacts as much as possible).
import torch
import math
import random
import torchvision.transforms as transforms
from PIL import Image
# Load image
def preprocess_simple(image_name, image_size):
    Loader = transforms.Compose([transforms.Resize(image_size), transforms.ToTensor()])
    image = Image.open(image_name).convert('RGB')
    return Loader(image).unsqueeze(0)
# Save image
def deprocess_simple(output_tensor, output_name):
    output_tensor.clamp_(0, 1)
    Image2PIL = transforms.ToPILImage()
    image = Image2PIL(output_tensor.squeeze(0))
    image.save(output_name)
# Somehow rotate tensor around its center
def rotate_tensor(tensor, radians):
    ...
    return rotated_tensor
# Get a random angle within a specified range
r_degrees = 5
angle_range = list(range(-r_degrees, r_degrees))
angle = random.randint(angle_range[0], angle_range[len(angle_range)-1])
# Convert angle from degrees to radians
ang_rad = angle * math.pi / 180
# test_tensor = preprocess_simple('path/to/file', (512,512))
test_tensor = torch.randn(1,3,512,512)
# Rotate input tensor somehow
output_tensor = rotate_tensor(test_tensor, ang_rad)
# Optionally use this to check rotated image
# deprocess_simple(output_tensor, 'rotated_image.jpg')
Some example outputs of what I'm trying to accomplish:
So the grid generator and the sampler are sub-modules of the Spatial Transformer (JADERBERG, Max, et al.). These sub-modules are not trainable, they let you apply a learnable, as well as non-learnable, spatial transformation.
Here I take these two submodules and use them to rotate an image by theta using PyTorch's functions torch.nn.functional.affine_grid and torch.nn.functional.grid_sample (these functions are implementations of the generator and the sampler, respectively):
import torch
import torch.nn.functional as F
import numpy as np
import matplotlib.pyplot as plt
def get_rot_mat(theta):
    theta = torch.tensor(theta)
    return torch.tensor([[torch.cos(theta), -torch.sin(theta), 0],
                         [torch.sin(theta), torch.cos(theta), 0]])
def rot_img(x, theta, dtype):
    rot_mat = get_rot_mat(theta)[None, ...].type(dtype).repeat(x.shape[0],1,1)
    grid = F.affine_grid(rot_mat, x.size()).type(dtype)
    x = F.grid_sample(x, grid)
    return x
#Test:
dtype = torch.cuda.FloatTensor if torch.cuda.is_available() else torch.FloatTensor
#im should be a 4D tensor of shape B x C x H x W with type dtype, range [0,255]:
plt.imshow(im.squeeze(0).permute(1,2,0)/255) #To plot it im should be 1 x C x H x W
plt.figure()
#Rotation by np.pi/2 with autograd support:
rotated_im = rot_img(im, np.pi/2, dtype) # Rotate image by 90 degrees.
plt.imshow(rotated_im.squeeze(0).permute(1,2,0)/255)
In the example above, assume we take our image, im, to be a dancing cat in a skirt:
rotated_im will be a 90-degrees CCW rotated dancing cat in a skirt:
And this is what we get if we call rot_img with theta equal to np.pi/4:
And the best part is that it's differentiable w.r.t. the input and has autograd support! Hooray!
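A small check of the autograd claim can look like the sketch below. It is a variant written for illustration, not the code above: the function name rotate is made up, the rotation matrix is assembled with torch.stack so that the angle itself stays in the graph too (which, as far as I can tell, the torch.tensor(...) construction above does not do), and align_corners=False is passed explicitly to both calls as an assumption about the desired convention.

```python
import math
import torch
import torch.nn.functional as F

def rotate(x, angle):
    """Rotate a (B, C, H, W) batch by `angle` radians (a 0-dim tensor) about the center."""
    cos, sin = torch.cos(angle), torch.sin(angle)
    zero = torch.zeros_like(angle)
    # Build the 2x3 affine matrix with torch.stack so the angle stays in the graph
    mat = torch.stack([torch.stack([cos, -sin, zero]),
                       torch.stack([sin, cos, zero])])
    mat = mat.unsqueeze(0).repeat(x.shape[0], 1, 1)  # one matrix per batch item
    grid = F.affine_grid(mat, x.size(), align_corners=False)
    return F.grid_sample(x, grid, align_corners=False)

torch.manual_seed(0)
x = torch.randn(1, 3, 32, 32, requires_grad=True)
angle = torch.tensor(math.pi / 6, requires_grad=True)

out = rotate(x, angle)
out.sum().backward()  # gradients reach both the image and the angle
```

After the backward pass, x.grad and angle.grad are both populated, which is what a DeepDream-style optimization loop needs.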
With torchvision it should be simple:
import torchvision.transforms.functional as TF
angle = 30
x = torch.randn(1,3,512,512)
out = TF.rotate(x, angle)
For example if x is:
out with a 30 degree rotation is (NOTE: counterclockwise):
There is a pytorch function for that:
x = torch.tensor([[0, 1],
[2, 3]])
x = torch.rot90(x, 1, [0, 1])
>> tensor([[1, 3],
[0, 2]])
Here are the docs: https://pytorch.org/docs/stable/generated/torch.rot90.html

torch.rfft - fft-based convolution creating different output than spatial convolution

I implemented FFT-based convolution in Pytorch and compared the result with spatial convolution via conv2d() function. The convolution filter used is an average filter. The conv2d() function produced smoothened output due to average filtering as expected but the fft-based convolution returned a more blurry output.
I have attached the code and outputs here -
spatial convolution -
from PIL import Image, ImageOps
import torch
from matplotlib import pyplot as plt
from torchvision.transforms import ToTensor
import torch.nn.functional as F
import numpy as np
im = Image.open("/kaggle/input/tiger.jpg")
im = im.resize((256,256))
gray_im = im.convert('L')
gray_im = ToTensor()(gray_im)
gray_im = gray_im.squeeze()
fil = torch.tensor([[1/9,1/9,1/9],[1/9,1/9,1/9],[1/9,1/9,1/9]])
conv_gray_im = gray_im.unsqueeze(0).unsqueeze(0)
conv_fil = fil.unsqueeze(0).unsqueeze(0)
conv_op = F.conv2d(conv_gray_im,conv_fil)
conv_op = conv_op.squeeze()
plt.figure()
plt.imshow(conv_op, cmap='gray')
FFT-based convolution -
def fftshift(image):
    sh = image.shape
    x = np.arange(0, sh[2], 1)
    y = np.arange(0, sh[3], 1)
    xm, ym = np.meshgrid(x,y)
    shifter = (-1)**(xm + ym)
    shifter = torch.from_numpy(shifter)
    return image*shifter
shift_im = fftshift(conv_gray_im)
padded_fil = F.pad(conv_fil, (0, gray_im.shape[0]-fil.shape[0], 0, gray_im.shape[1]-fil.shape[1]))
shift_fil = fftshift(padded_fil)
fft_shift_im = torch.rfft(shift_im, 2, onesided=False)
fft_shift_fil = torch.rfft(shift_fil, 2, onesided=False)
shift_prod = fft_shift_im*fft_shift_fil
shift_fft_conv = fftshift(torch.irfft(shift_prod, 2, onesided=False))
fft_op = shift_fft_conv.squeeze()
plt.figure('shifted fft')
plt.imshow(fft_op, cmap='gray')
original image -
spatial convolution output -
fft-based convolution output -
Could someone kindly explain the issue?
The main problem with your code is that torch.rfft does not return complex numbers. Its output is a real array with an extra trailing dimension of size two, one value for the real component and one for the imaginary. Consequently, the multiplication is element-wise and does not do a complex multiplication.
There is currently no complex multiplication defined for this representation in Torch (see this issue), so we'll have to define our own.
A minor issue, but also important if you want to compare the two convolution operations, is the following:
The FFT takes the origin of its input in the first element (top-left pixel for an image). To avoid a shifted output, you need to generate a padded kernel where the origin of the kernel is the top-left pixel. This is quite tricky, actually...
Your current code:
fil = torch.tensor([[1/9,1/9,1/9],[1/9,1/9,1/9],[1/9,1/9,1/9]])
conv_fil = fil.unsqueeze(0).unsqueeze(0)
padded_fil = F.pad(conv_fil, (0, gray_im.shape[0]-fil.shape[0], 0, gray_im.shape[1]-fil.shape[1]))
generates a padded kernel where the origin is in pixel (1,1), rather than (0,0). It needs to be shifted by one pixel in each direction. NumPy has a function roll that is useful for this, I don't know the Torch equivalent (I'm not at all familiar with Torch). This should work:
fil = torch.tensor([[1/9,1/9,1/9],[1/9,1/9,1/9],[1/9,1/9,1/9]])
padded_fil = fil.numpy()  # keep the kernel 2D so it matches the 2D gray_im
padded_fil = np.pad(padded_fil, ((0, gray_im.shape[0]-fil.shape[0]), (0, gray_im.shape[1]-fil.shape[1])))
padded_fil = np.roll(padded_fil, -1, axis=(0, 1))
padded_fil = torch.from_numpy(padded_fil)
Finally, your fftshift function, applied to the spatial-domain image, causes the frequency-domain image (the result of the FFT applied to the image) to be shifted such that the origin is in the middle of the image, rather than the top-left. This shift is useful when looking at the output of the FFT, but is pointless when computing the convolution.
Putting these things together, the convolution is now:
def complex_multiplication(t1, t2):
    real1, imag1 = t1[:,:,0], t1[:,:,1]
    real2, imag2 = t2[:,:,0], t2[:,:,1]
    return torch.stack([real1 * real2 - imag1 * imag2, real1 * imag2 + imag1 * real2], dim = -1)
fft_im = torch.rfft(gray_im, 2, onesided=False)
fft_fil = torch.rfft(padded_fil, 2, onesided=False)
fft_conv = torch.irfft(complex_multiplication(fft_im, fft_fil), 2, onesided=False)
Note that you can do one-sided FFTs to save a bit of computation time:
fft_im = torch.rfft(gray_im, 2, onesided=True)
fft_fil = torch.rfft(padded_fil, 2, onesided=True)
fft_conv = torch.irfft(complex_multiplication(fft_im, fft_fil), 2, onesided=True, signal_sizes=gray_im.shape)
Here the frequency domain is about half the size of the one in the full FFT, but only redundant parts are left out. The result of the convolution is unchanged.
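For readers on PyTorch 1.8 or later, where torch.rfft and torch.irfft no longer exist, the same steps can be sketched with the torch.fft module, which returns complex tensors so that * is already a complex multiplication. The example below is an illustration on a random 16x16 image rather than the tiger picture, and it uses torch.roll to move the kernel origin to the top-left pixel. The spatial reference uses circular padding, because an FFT-based convolution is circular by construction.

```python
import torch
import torch.fft
import torch.nn.functional as F

torch.manual_seed(0)
image = torch.rand(16, 16)          # stand-in for the grayscale image
kernel = torch.full((3, 3), 1 / 9)  # 3x3 average filter

# Zero-pad the kernel to the image size, then roll it so that the kernel
# center (1, 1) lands on the origin (0, 0)
padded = torch.zeros_like(image)
padded[:3, :3] = kernel
padded = torch.roll(padded, shifts=(-1, -1), dims=(0, 1))

# One-sided FFTs; complex tensors multiply as complex numbers
spectrum = torch.fft.rfft2(image) * torch.fft.rfft2(padded)
fft_conv = torch.fft.irfft2(spectrum, s=image.shape)

# Spatial reference with circular borders. F.conv2d is a cross-correlation,
# which equals convolution here only because the kernel is symmetric.
wrapped = F.pad(image[None, None], (1, 1, 1, 1), mode="circular")
spatial = F.conv2d(wrapped, kernel[None, None])[0, 0]
```

Away from the borders, fft_conv also matches the unpadded F.conv2d output from the question, which covers only the interior 14x14 region.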

Pixel-wise loss weight for image segmentation in Keras

I am currently using a modified version of the U-Net (https://arxiv.org/pdf/1505.04597.pdf) to segment cell organelles in microscopy images. Since I am using Keras, I took the code from https://github.com/zhixuhao/unet. However, in this version no weight map is implemented to force the network to learn the border pixels.
The results that I have obtained so far are quite good, but the network fails to separate objects that are close to each other. So I want to try and make use of the weight map mentioned in the paper. I have been able to generate the weight map (based on the given formula) for each label image, but I was unable to find out how to use this weight map to train my network and thus solve the above mentioned problem.
Do weight maps and label images have to be combined somehow, or is there a Keras function that will allow me to make use of the weight maps? I am a biologist who only recently started to work with neural networks, so my understanding is still limited. Any help or advice would be greatly appreciated.
In case it is still relevant: I needed to solve this recently. You can paste the code below into a Jupyter notebook to see how it works.
%matplotlib inline
import numpy as np
from skimage.io import imshow
from skimage.measure import label
from scipy.ndimage import distance_transform_edt
def generate_random_circles(n = 100, d = 256):
    circles = np.random.randint(0, d, (n, 3))
    x = np.zeros((d, d), dtype=int)
    f = lambda x, y: ((x - x0)**2 + (y - y0)**2) <= (r/d*10)**2
    for x0, y0, r in circles:
        x += np.fromfunction(f, x.shape)
    x = np.clip(x, 0, 1)
    return x
def unet_weight_map(y, wc=None, w0 = 10, sigma = 5):
    """
    Generate weight maps as specified in the U-Net paper
    for boolean mask.
    "U-Net: Convolutional Networks for Biomedical Image Segmentation"
    https://arxiv.org/pdf/1505.04597.pdf
    Parameters
    ----------
    y: Numpy array
        2D array of shape (image_height, image_width) representing binary mask
        of objects.
    wc: dict
        Dictionary of weight classes.
    w0: int
        Border weight parameter.
    sigma: int
        Border width parameter.
    Returns
    -------
    Numpy array
        Training weights. A 2D array of shape (image_height, image_width).
    """
    labels = label(y)
    no_labels = labels == 0
    label_ids = sorted(np.unique(labels))[1:]
    if len(label_ids) > 1:
        distances = np.zeros((y.shape[0], y.shape[1], len(label_ids)))
        for i, label_id in enumerate(label_ids):
            distances[:,:,i] = distance_transform_edt(labels != label_id)
        distances = np.sort(distances, axis=2)
        d1 = distances[:,:,0]
        d2 = distances[:,:,1]
        w = w0 * np.exp(-1/2*((d1 + d2) / sigma)**2) * no_labels
    else:
        w = np.zeros_like(y)
    if wc:
        class_weights = np.zeros_like(y)
        for k, v in wc.items():
            class_weights[y == k] = v
        w = w + class_weights
    return w
y = generate_random_circles()
wc = {
0: 1, # background
1: 5 # objects
}
w = unet_weight_map(y, wc)
imshow(w)
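The code above only builds the weight map. One way to use such a map during training is to multiply the per-pixel loss by it before averaging. The sketch below shows that arithmetic in plain NumPy on a toy 2x2 example; the function name weighted_binary_crossentropy and the normalization by the sum of the weights are assumptions made for illustration, not something prescribed by the paper or by Keras. In a Keras model, the same computation would have to be written with backend or TensorFlow ops inside a custom loss.

```python
import numpy as np

def weighted_binary_crossentropy(y_true, y_pred, weights, eps=1e-7):
    """Pixel-wise binary cross-entropy, weighted by a map of the same shape.

    Hypothetical helper: normalizing by the weight sum is one possible choice.
    """
    y_pred = np.clip(y_pred, eps, 1 - eps)  # avoid log(0)
    bce = -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
    return np.sum(weights * bce) / np.sum(weights)

y_true = np.array([[0.0, 1.0], [1.0, 0.0]])
y_pred = np.array([[0.1, 0.9], [0.2, 0.1]])  # pixel (1, 0) is predicted poorly

uniform = np.ones_like(y_true)
border = np.array([[1.0, 1.0], [10.0, 1.0]])  # emphasize the poorly predicted pixel

loss_uniform = weighted_binary_crossentropy(y_true, y_pred, uniform)
loss_border = weighted_binary_crossentropy(y_true, y_pred, border)
```

With uniform weights this reduces to the ordinary mean cross-entropy; raising the weight of a badly predicted pixel raises the loss, which is what pushes the network toward getting border pixels right.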
I think you want to use class_weight in Keras. This is actually simple to introduce in your model if you have already calculated the class weights.
Create a dictionary with your class labels and their associated weights. For example
class_weight = {0: 10.9,
1: 20.8,
2: 1.0,
3: 50.5}
Or create a 1D Numpy array of the same length as your number of classes. For example
class_weight = [10.9, 20.8, 1.0, 50.5]
Pass this parameter during training in your model.fit or model.fit_generator
model.fit(x, y, batch_size=batch_size, epochs=num_epochs, verbose=1, class_weight=class_weight)
You can look up the Keras documentation for more details here.

Need Tensorflow/Keras equivalent for scipy signal.fftconvolve

I want to use scipy.signal.fftconvolve in Tensorflow/Keras, is there any way to do that?
Right now I am using the following code :
window = np.tile(window, (1, 1, 1, 3))
tf.nn.conv2d(img1, window, strides=[1,1,1,1], padding='VALID')
Are these lines equivalent to :
signal.fftconvolve(img1, window, mode='valid')
Implementation
FFT convolution is relatively easy to implement in TensorFlow. The code below follows scipy.signal.fftconvolve quite closely:
import tensorflow as tf
def _centered(arr, newshape):
    # Return the center newshape portion of the array.
    currshape = tf.shape(arr)[-2:]
    startind = (currshape - newshape) // 2
    endind = startind + newshape
    return arr[..., startind[0]:endind[0], startind[1]:endind[1]]
def fftconv(in1, in2, mode="full"):
    # Reorder channels to come second (needed for fft)
    in1 = tf.transpose(in1, perm=[0, 3, 1, 2])
    in2 = tf.transpose(in2, perm=[0, 3, 1, 2])
    # Extract shapes
    s1 = tf.convert_to_tensor(tf.shape(in1)[-2:])
    s2 = tf.convert_to_tensor(tf.shape(in2)[-2:])
    shape = s1 + s2 - 1
    # Compute convolution in fourier space
    sp1 = tf.spectral.rfft2d(in1, shape)
    sp2 = tf.spectral.rfft2d(in2, shape)
    ret = tf.spectral.irfft2d(sp1 * sp2, shape)
    # Crop according to mode
    if mode == "full":
        cropped = ret
    elif mode == "same":
        cropped = _centered(ret, s1)
    elif mode == "valid":
        cropped = _centered(ret, s1 - s2 + 1)
    else:
        raise ValueError("Acceptable mode flags are 'valid',"
                         " 'same', or 'full'.")
    # Reorder channels to last
    result = tf.transpose(cropped, perm=[0, 2, 3, 1])
    return result
Example
A quick example of applying a gaussian smoothing with width 20 pixels to the standard "face" image is as follows:
if __name__ == '__main__':
    from scipy import misc
    import matplotlib.pyplot as plt
    from tensorflow.python.ops import array_ops, math_ops
    session = tf.InteractiveSession()
    # Create gaussian
    std = 20
    grid_x, grid_y = array_ops.meshgrid(math_ops.range(3 * std),
                                        math_ops.range(3 * std))
    grid_x = tf.cast(grid_x[None, ..., None], 'float32')
    grid_y = tf.cast(grid_y[None, ..., None], 'float32')
    gaussian = tf.exp(-((grid_x - 1.5 * std) ** 2 + (grid_y - 1.5 * std) ** 2) / std ** 2)
    gaussian = gaussian / tf.reduce_sum(gaussian)
    face = misc.face(gray=False)[None, ...].astype('float32')
    # Apply convolution
    result = fftconv(face, gaussian, 'same')
    result_r = session.run(result)
    # Show results
    plt.figure('face')
    plt.imshow(face[0, ...] / 256.0)
    plt.figure('convolved')
    plt.imshow(result_r[0, ...] / 256.0)
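The pad, multiply and crop logic above can be checked without TensorFlow. The following NumPy sketch mirrors the same three modes for a single 2D channel; the function name fft_convolve2d is invented for illustration. It also bears on the original question: the "valid" result equals a sliding-window cross-correlation only when the kernel is flipped first, so a cross-correlation with an unflipped, non-symmetric window gives a different answer.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def fft_convolve2d(image, kernel, mode="full"):
    """2D convolution via FFT with 'full', 'same' or 'valid' cropping."""
    s1 = np.array(image.shape)
    s2 = np.array(kernel.shape)
    full_shape = tuple(int(n) for n in s1 + s2 - 1)
    # Zero-padding to the full size avoids circular wrap-around
    spectrum = np.fft.rfft2(image, full_shape) * np.fft.rfft2(kernel, full_shape)
    full = np.fft.irfft2(spectrum, full_shape)
    if mode == "full":
        return full
    if mode == "same":
        newshape = s1
    elif mode == "valid":
        newshape = s1 - s2 + 1
    else:
        raise ValueError("Acceptable mode flags are 'valid', 'same', or 'full'.")
    # Keep the centered portion, as _centered does above
    start = (np.array(full_shape) - newshape) // 2
    end = start + newshape
    return full[start[0]:end[0], start[1]:end[1]]

rng = np.random.default_rng(0)
image = rng.random((10, 12))
kernel = rng.random((3, 5))  # deliberately not symmetric

valid = fft_convolve2d(image, kernel, "valid")

# Direct reference: sliding-window cross-correlation with the flipped kernel
windows = sliding_window_view(image, kernel.shape)
direct = np.einsum("ijkl,kl->ij", windows, kernel[::-1, ::-1])
unflipped = np.einsum("ijkl,kl->ij", windows, kernel)
```

Here valid agrees with direct up to floating-point error, while unflipped differs because the kernel is not symmetric.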
You want just a regular conv2d then...
If you want it somewhere in the model, add a Conv2D(...,name='myLayer') layer, and in the model use model.get_layer('myLayer').set_weights([filters,biases])
If you want it in a loss function, just create a loss function:
import keras.backend as K
def myLoss(y_true, y_pred):
    #where y_true is the true training data and y_pred is the model's output
    convResult = K.conv2d(y_pred, kernel = window, padding = 'same')
    anotherResult = K.depthwise_conv2d(y_pred, kernel = window, padding='same')
The regular conv2D will assume each output channel in the filter will process and sum all input channels.
The depthwise convolution will keep input channels separate.
Beware of the window, though. I don't know the format in tensorflow or scipy, but the kernel in keras should have this shape: (height, width, numberOfInputChannels, numberOfOutputChannels)
I believe, if I understand it right, it should be window = np.reshape(_FSpecialGauss(size, sigma), (size, size, 1, 1)), considering that "size" is the size of the kernel and you have only 1 input and output channels.
I used padding='same' to get the result image the same size of the input. If you use padding='valid', you will lose the borders (although in your case, your filter seems to have size (1,1), which won't remove borders).
You can use any tensorflow function inside the loss function as well:
def customLoss(yTrue,yPred):
tf.anyFunction(yTrue)
tf.anyFunction(yPred)
Using keras backend will let your code be portable to other backends later.
When compiling the model, give it your loss function:
model.compile(loss=myLoss, optimizer =....)
