How to feed a trained Python TensorFlow model with batches?

I have created my own model and trained it with ImageDataGenerator from Keras, using flow_from_directory (as in: how to train model with batches). Everything works fine: I checked the generated batches and the pictures are as they should be.
My problem is that I want to use this trained model for online face detection. I crop the faces to the desired width and height and convert them into an array, but the predictions are terrible.
I think the live-streamed image has to be in the same form as what the ImageDataGenerator creates (batches). Any idea how I can convert a cv2.imread(path) image into a batch to predict the class?

You just have to add the batch dimension to convert it into a batch with 1 sample: np.expand_dims(img, axis=0).
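For example, a minimal sketch of turning a cv2.imread image into a one-sample batch; the 224x224 target size, the BGR-to-RGB conversion, and the 1/255 rescaling are assumptions that must match whatever preprocessing your ImageDataGenerator applied during training:

import cv2
import numpy as np

img = cv2.imread(path)                      # OpenCV loads BGR, shape (H, W, 3)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # Keras generators yield RGB images
img = cv2.resize(img, (224, 224))           # match the target_size used in training
img = img.astype("float32") / 255.0         # only if you trained with rescale=1./255
batch = np.expand_dims(img, axis=0)         # shape (1, 224, 224, 3)

prediction = model.predict(batch)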

Related

How to build pre-processing and post-processing inside TFlite model using Python?

I have trained a TensorFlow model and converted it to a TFLite model.
I want to build a TensorFlow Lite (.tflite) model that does pre-processing, model execution, and post-processing. Pre-processing mainly consists of reading a single image, resizing it with padding, and converting it to an array. This array is the input to the TFLite model, and the output of the model is several arrays. These arrays need to be processed to get meaningful information out of them.
Is it possible to create a TFLite model that can do the pre-processing and post-processing? I only want to give an image as input and get the desired output.
For example:
pre-processing.py --> import image, resize image, normalize image, convert to numpy array (float32)
post-processing.py --> read model output arrays, extract segmentation masks, plot on image
What I want:
input_image.jpg --> model --> output_image.jpg with segmentation mask plotted
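One possible approach, sketched below under several assumptions: wrap the pre-processing, the trained Keras model, and a simple post-processing step inside a single tf.function and convert that function with TFLiteConverter.from_concrete_functions. The model path, the 256x256 padded size, and the argmax-based mask extraction are placeholders for illustration; image file decoding and the final plotting would still happen outside the .tflite file, and a fully dynamic input shape may require resizing the input tensor at runtime in the TFLite interpreter.

import tensorflow as tf

trained_model = tf.keras.models.load_model("my_model")  # hypothetical saved model

class WrappedModel(tf.Module):
    def __init__(self, model):
        super().__init__()
        self.model = model

    @tf.function(input_signature=[tf.TensorSpec([None, None, 3], tf.uint8)])
    def serve(self, image):
        # Pre-processing: uint8 image of any size -> normalized float32 batch
        x = tf.image.convert_image_dtype(image, tf.float32)  # 0-255 -> 0-1
        x = tf.image.resize_with_pad(x, 256, 256)            # resize with padding
        x = tf.expand_dims(x, axis=0)                         # add batch dimension
        # Model execution
        logits = self.model(x)
        # Post-processing (assumed): per-pixel argmax -> segmentation mask
        mask = tf.argmax(logits, axis=-1)
        return {"mask": tf.squeeze(mask, axis=0)}

wrapped = WrappedModel(trained_model)
concrete_fn = wrapped.serve.get_concrete_function()
converter = tf.lite.TFLiteConverter.from_concrete_functions([concrete_fn], wrapped)
tflite_model = converter.convert()
with open("model_with_pre_post.tflite", "wb") as f:
    f.write(tflite_model)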

Convert apparent resistivity image into "true" model output image

For my project, I have to feed an image, which is an apparent resistivity model, into a convolutional neural network and output a corresponding "true" model image. The idea is like this: apparent resistivity model -> true model
Both sets of images are saved as TIFF files in different folders. From what I understand, I need to convert these to floating-point tensors to feed them into the CNN. However, I'm confused about the overall big picture: how do I feed both the apparent resistivity model and the true model into the CNN for training? Can I simply create a function that extracts the images from their directories (it is very important this is done in the correct corresponding order), converts them to arrays, normalises them, and stores them in lists? And then pass both sets of lists into the CNN as X_train and y_train?
After training, I will feed more apparent resistivity pictures into the CNN to see the image output and compare it to the "real" true model, in order to check the performance of the model. How will I get the CNN to output an image?
Sorry if these questions are too vague, broad, or basic. The tutorials I've seen online all have to do with training a neural net for classification, which is not helpful to me. Thanks in advance.
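That general approach can work. A minimal sketch under several assumptions: the folder names apparent/ and true/ are placeholders, the paired TIFF files share filenames so sorting keeps them aligned, the images are treated as single-channel, and the tiny encoder-decoder only stands in for whatever CNN is actually used. The network's final layer outputs an image-shaped array, which is how the CNN "outputs an image":

import glob
import numpy as np
from PIL import Image
import tensorflow as tf

def load_folder(folder, size=(128, 128)):
    # Sorting keeps apparent/true pairs aligned by filename.
    paths = sorted(glob.glob(folder + "/*.tif*"))
    imgs = [np.asarray(Image.open(p).convert("L").resize(size), dtype="float32") / 255.0
            for p in paths]
    return np.stack(imgs)[..., np.newaxis]  # shape (N, H, W, 1)

X_train = load_folder("apparent")  # apparent resistivity images
y_train = load_folder("true")      # corresponding "true" model images

# A small encoder-decoder that outputs an image of the same size as its input.
inputs = tf.keras.Input(shape=(128, 128, 1))
x = tf.keras.layers.Conv2D(32, 3, activation="relu", padding="same")(inputs)
x = tf.keras.layers.MaxPooling2D()(x)
x = tf.keras.layers.Conv2D(64, 3, activation="relu", padding="same")(x)
x = tf.keras.layers.UpSampling2D()(x)
outputs = tf.keras.layers.Conv2D(1, 3, activation="sigmoid", padding="same")(x)

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="mse")  # pixel-wise regression loss
model.fit(X_train, y_train, batch_size=8, epochs=20)

# Predicting on a new apparent resistivity image returns an image-shaped array.
predicted_image = model.predict(X_train[:1])[0, ..., 0]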

Pytorch classification gives wrong prediction

After training my classification model I get an accuracy of 94% on my Test data. I am working with TIFF images. To load the data and feed it into the classification model I am using the Dataloader from Pytorch.
My Dataloader function looks like this:
def dataload(self, train_path, batch_train, test_path, batch_train_val):
    # Transforms
    transformer = transforms.Compose([
        # transforms.Resize((450, 450)),
        transforms.Resize((150, 150)),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),                 # 0-255 to 0-1, numpy to tensors
        transforms.Normalize([0.5, 0.5, 0.5],  # 0-1 to [-1, 1]
                             [0.5, 0.5, 0.5])])
    train_loader = DataLoader(
        torchvision.datasets.ImageFolder(train_path, transform=transformer),
        # torchvision.datasets.ImageFolder(train_path, transform=get_transformer()),
        batch_size=batch_train, shuffle=True
    )
    test_loader = DataLoader(
        torchvision.datasets.ImageFolder(test_path, transform=transformer),
        # torchvision.datasets.ImageFolder(train_path, transform=get_transformer()),
        batch_size=batch_train_val, shuffle=True
    )
    return [train_loader, test_loader]
The DataLoader handles the TIFF images and somehow automatically converts them into three-channel images, because a TIFF image has 4 channels but my model needs a three-channel image as input.
When I finally tried to use my saved model I ran into several problems. Since I am loading each image separately to predict its label, I don't use the DataLoader from PyTorch anymore.
My code looks like this:
from os import listdir
from os.path import isfile, join

from PIL import Image
from torch.autograd import Variable
from torchvision import transforms

all_images = [f for f in listdir(pred_path) if isfile(join(pred_path, f))]
for i in all_images:
    transformer = transforms.Compose([
        transforms.Resize((150, 150)),
        transforms.ToTensor(),
        transforms.Normalize([0.5, 0.5, 0.5],  # 0-1 to [-1, 1]
                             [0.5, 0.5, 0.5])])
    image = Image.open(pred_path + "/" + i).convert('RGB')
    image_tensor = transformer(image).float()
    image_tensor = image_tensor.unsqueeze_(0)  # add batch dimension
    input = Variable(image_tensor)
    output = model(input)
    index = output.data.numpy().argmax()
Since TIFF images have 4 channels and my model expects a three-channel image, I get an error. However, when I manually convert the TIFF to a JPG image, or convert it directly to RGB in my code, the model always predicts the same label for every image.
The strange part is that I only get all these problems when using the EfficientNet B7 model. When I use a small custom model, everything works fine and I get neither of the above problems.
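For reference, a minimal single-image inference sketch that mirrors the deterministic training transforms above (150x150 resize, same normalization, RGB conversion) and adds the standard PyTorch inference steps model.eval() and torch.no_grad(); the checkpoint path and image path are hypothetical:

import torch
from PIL import Image
from torchvision import transforms

# Same deterministic transforms as training (the random flip is left out at inference).
transformer = transforms.Compose([
    transforms.Resize((150, 150)),
    transforms.ToTensor(),
    transforms.Normalize([0.5, 0.5, 0.5],
                         [0.5, 0.5, 0.5])])

model = torch.load("classifier.pth", map_location="cpu")  # hypothetical checkpoint
model.eval()                     # disable dropout, use running batch-norm statistics

image = Image.open("some_image.tif").convert("RGB")       # drop the TIFF alpha channel
batch = transformer(image).unsqueeze(0)                   # shape (1, 3, 150, 150)

with torch.no_grad():
    output = model(batch)
index = output.argmax(dim=1).item()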

Pytorch - Use a UNet to perform Image Deblurring/Image Reconstruction

Currently, I'm working with a dataset where I have two kinds of images: a "sharp" version of each image and a "blurry" version of the same image, where the blur was added synthetically. My goal is to train a model that takes the blurry version as input and deblurs it as much as it can, so that the "deblurred" image is closer to the sharp version. In the literature, the U-Net architecture seemed to be a model with good results. Additionally, I can use a pre-trained U-Net via PyTorch (https://pytorch.org/hub/mateuszbuda_brain-segmentation-pytorch_unet/).
My problem is now: when I train this pre-trained U-Net with my images and then try it on my test set, I get the output shown below.
[model output and original image omitted]
I know that this pre-trained model is usually used for biomedical image segmentation but I'm rather confused about how I have to modify the model to use it for an Image Deblurring/Reconstruction task. Does anyone have any advice on how to do this?
I would appreciate any feedback :)
The U-Net you're using is for segmentation (classification of each pixel of the image), whereas you're trying to denoise the image (make it "sharper" / remove the blur). That explains the results you got.
To get what you want, as DerekG said, you first need to modify the number of output channels. Once you modify it, you can't load the whole pretrained model; you will have to copy the parameters one by one up to the last layer.
Since the last layer is then initialized randomly, you retrain the model on your training set. You can freeze the pretrained parts or leave them trainable.
Also, I'm not sure what your new dataset is, but if it's really not related to biomedical images you should retrain your network from scratch (transfer learning shouldn't be used in that case), and maybe even change the encoder-decoder network.
From the included link:
import torch
model = torch.hub.load('mateuszbuda/brain-segmentation-pytorch', 'unet',
                       in_channels=3, out_channels=1, init_features=32, pretrained=True)
The model you define has a single output channel, resulting in a grayscale image output. You need 3 output channels for an RGB image.
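Building on both answers, a hedged sketch of one way to reuse the pretrained weights with 3 output channels: create a second U-Net with out_channels=3 (assuming the hub entry point accepts that argument) and copy over only the parameters whose shapes match, so the mismatched final convolution keeps its random initialization and can be retrained:

import torch

# Pretrained network with its original single output channel.
pretrained = torch.hub.load('mateuszbuda/brain-segmentation-pytorch', 'unet',
                            in_channels=3, out_channels=1, init_features=32,
                            pretrained=True)

# New network with 3 output channels for RGB output, randomly initialized.
model = torch.hub.load('mateuszbuda/brain-segmentation-pytorch', 'unet',
                       in_channels=3, out_channels=3, init_features=32,
                       pretrained=False)

pre_sd = pretrained.state_dict()
new_sd = model.state_dict()
# Copy every tensor whose shape matches; the final layer is skipped automatically.
compatible = {k: v for k, v in pre_sd.items()
              if k in new_sd and v.shape == new_sd[k].shape}
new_sd.update(compatible)
model.load_state_dict(new_sd)

# Optionally freeze everything that came from the pretrained weights.
for name, param in model.named_parameters():
    if name in compatible:
        param.requires_grad = False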

Correct way to take advantage of Resizing layer in Tensorflow 2.3?

Tensorflow 2.3 introduced new preprocessing layers, such as tf.keras.layers.experimental.preprocessing.Resizing.
However, the typical flow to train on images with Keras uses tf.keras.preprocessing.image.ImageDataGenerator, which can only take a fixed target_size parameter. As far as I understand, the root cause is that keras is handling images as a numpy array in the background, where all images have to be the same size (is that true?).
While I could then use a model with a resizing layer that was trained on a fixed size to then predict images of arbitrary size, this seems to be risky since the training data and inference data would have systematic differences. One workaround could be to use ImageDataGenerator with a target_size and interpolation method that match the ones of the resizing layer, so that during training the resizing layer basically does nothing, but then it seems that the resizing layer is not really of any benefit.
So the question is, is there a way to train directly on mixed size images to fully take advantage of the resizing layer?
Models need to operate on images of a FIXED size. If you train a model with a fixed size, for example (224 x 224), then when you want to use the trained model to make predictions on images you need to resize those images to 224 x 224. More generally, whatever pre-processing you did on the training images you should also do on the images that you wish to predict. For example, if your model was trained on RGB images but the images you want to predict are, say, BGR images (like images read in with cv2), the results will be incorrect; you would need to convert them to RGB. Similarly, if you rescaled your training images by dividing by 255, you should also rescale the images you want to predict.
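To address the original question of training directly on mixed-size images, a hedged sketch of one option: do the resizing in a tf.data pipeline with the very same Resizing layer instance that you would use at inference time, so variable-size inputs are brought to a fixed size by an identical operation before batching. The file_paths/labels lists, the 224x224 size, and the 10-class head are placeholders; if your TF version's experimental Resizing layer does not accept unbatched 3-D images, tf.image.resize with the same interpolation method is an equivalent fallback.

import tensorflow as tf

IMG_SIZE = 224  # assumed target size

# One Resizing layer instance shared by the input pipeline and inference code,
# so training and prediction use exactly the same resize operation.
resize = tf.keras.layers.experimental.preprocessing.Resizing(IMG_SIZE, IMG_SIZE)

def load(path, label):
    img = tf.io.decode_image(tf.io.read_file(path), channels=3, expand_animations=False)
    return tf.cast(img, tf.float32) / 255.0, label

ds = (tf.data.Dataset.from_tensor_slices((file_paths, labels))  # hypothetical lists
      .map(load, num_parallel_calls=tf.data.experimental.AUTOTUNE)
      .map(lambda x, y: (resize(x), y))  # mixed-size images become IMG_SIZE x IMG_SIZE
      .batch(32)
      .prefetch(tf.data.experimental.AUTOTUNE))

inputs = tf.keras.Input(shape=(IMG_SIZE, IMG_SIZE, 3))
x = tf.keras.layers.Conv2D(32, 3, activation="relu")(inputs)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(10, activation="softmax")(x)
model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.fit(ds, epochs=5)

# At prediction time, resize an arbitrary-size image with the same layer:
# prediction = model.predict(resize(new_image)[tf.newaxis, ...])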
