Reversing an ML model - python

I am currently teaching myself the basics of machine learning by creating a simple image classifier using Keras (with a TensorFlow backend). The model classifies a (greyscale) image as either a cat or not a cat.
My model is relatively good at this task, so I now want to see if it can generate images that it would classify as a cat.
I have attempted to start this in a simple way, by creating a random array of the same shape as the images, with random numbers in each index:
from random import randint

import numpy as np
from keras.models import model_from_json

# load the trained model architecture and weights
json_file = open('model.json', 'r')
loaded_model_json = json_file.read()
json_file.close()
model = model_from_json(loaded_model_json)
model.load_weights("model_weights.h5")

confidence = 0.0
thresholdConfidence = 0.6
while confidence < thresholdConfidence:
    # random 64x64 single-channel image with values in [0, 255]
    img_array = np.array([[[randint(0, 255) for z in range(1)] for y in range(64)] for x in range(64)])
    img_array = img_array.reshape((1,) + img_array.shape)  # add the batch dimension
    confidence = model.predict(img_array)[0][0]  # scalar confidence for the "cat" class
This method is obviously not good at all, since it just creates random noise and could potentially run forever. Could the model somehow run in reverse, by telling it that an array is 100% cat and having it predict what the array representation of the image is?
Thank you for reading.
[This is my first post on StackOverflow, so please let me know if I've done something wrong!]

If you wish to generate a particular type of image, you can use Generative Adversarial Networks (GANs). These are made of two parts which are trained in alternation. The two parts are:
Generator: turns random noise into candidate images.
Discriminator: judges the images and gives feedback to the generator.
You can refer here.
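For concreteness, here is a rough sketch of how the two parts might look in Keras for 64x64 greyscale images like yours. The layer sizes, the latent dimension, and the training loop are illustrative assumptions, not a tuned or complete implementation:

# Minimal GAN skeleton in Keras (assumed 64x64 greyscale images, untuned layer sizes).
import numpy as np
from keras import layers, models

latent_dim = 100

# Generator: maps random noise vectors to 64x64x1 images.
generator = models.Sequential([
    layers.Dense(16 * 16 * 64, activation="relu", input_dim=latent_dim),
    layers.Reshape((16, 16, 64)),
    layers.UpSampling2D(),                       # 32x32
    layers.Conv2D(32, 3, padding="same", activation="relu"),
    layers.UpSampling2D(),                       # 64x64
    layers.Conv2D(1, 3, padding="same", activation="sigmoid"),
])

# Discriminator: judges whether a 64x64x1 image is real or generated.
discriminator = models.Sequential([
    layers.Conv2D(32, 3, strides=2, padding="same", activation="relu", input_shape=(64, 64, 1)),
    layers.Conv2D(64, 3, strides=2, padding="same", activation="relu"),
    layers.Flatten(),
    layers.Dense(1, activation="sigmoid"),
])
discriminator.compile(optimizer="adam", loss="binary_crossentropy")

# Combined model used to train the generator: the discriminator is frozen here.
discriminator.trainable = False
gan = models.Sequential([generator, discriminator])
gan.compile(optimizer="adam", loss="binary_crossentropy")

# One (simplified) training step: real cat images get label 1, fakes get label 0,
# then the generator is trained to make the discriminator output 1 for its fakes.
def train_step(real_images, batch_size=32):
    noise = np.random.normal(size=(batch_size, latent_dim))
    fake_images = generator.predict(noise)
    discriminator.train_on_batch(real_images, np.ones((batch_size, 1)))
    discriminator.train_on_batch(fake_images, np.zeros((batch_size, 1)))
    gan.train_on_batch(noise, np.ones((batch_size, 1)))

After enough alternating updates, calling generator.predict on fresh noise should produce arrays your classifier rates as cat-like, which is effectively the "run in reverse" behaviour you are after.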

Related

How to save an OCR model from the Keras example by A_K_Nain

I'm studying the TensorFlow OCR model from the Keras example authored by A_K_Nain. This model uses a custom object (a CTC layer). It is described here: https://keras.io/examples/vision/captcha_ocr/
I trained the model on my own dataset, and the predictions of the trained model are perfect.
I then wanted to save and load this model. I got some errors when I tried, so I appended this code to the CTCLayer class:
def get_config(self):
    config = super(CTCLayer, self).get_config()
    config.update({"name": self.name})
    return config
After that I tried to save the whole model and its weights, but nothing worked, so I tried two ways of saving.
First way:
history = model.fit(
    train_dataset,
    validation_data=validation_dataset,
    epochs=70,
    callbacks=[early_stopping],
)
model.save('./model/my_model')
---------------------------------------
new_model = load_model('./model/my_model', custom_objects={'CTCLayer': CTCLayer})
prediction_model = keras.models.Model(
    new_model.get_layer(name='image').input, new_model.get_layer(name='dense2').output
)
And the second way:
prediction_model = keras.models.Model(
    model.get_layer(name='image').input, model.get_layer(name='dense2').output
)
prediction_model.save('./model/my_model')
Neither of these worked. They didn't raise errors, but the prediction results are terrible.
Accurate results are only obtained when training, saving, and loading are performed in the same session.
If I load the same model without training in that session, the results are very bad.
How can I use this model without retraining it every time? Please help me.
The problem does not come from TensorFlow. In the captcha_ocr tutorial, characters is a set, and sets are unordered. So the mapping from characters to integers built with StringLookup depends on the current run of the notebook. That is why you get rubbish when using the model in another notebook without retraining: the mapping is not the same!
A solution is to use an ordered list instead of the set for characters:
characters = sorted(list(set([char for label in labels for char in label])))
Note that the set here removes duplicate characters, and the result is then converted back to a list and sorted. The mapping will then be identical in any script/notebook without retraining (as long as the same formula is used).
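As a small illustration (reusing the StringLookup calls from the tutorial), building the lookup layers from the sorted list gives the same character-to-integer mapping on every run:

characters = sorted(set(char for label in labels for char in label))

# Deterministic vocabulary -> identical mapping in any later script or notebook
char_to_num = StringLookup(vocabulary=list(characters), mask_token=None)
num_to_char = StringLookup(vocabulary=char_to_num.get_vocabulary(), mask_token=None, invert=True)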
The problem is not in the saved model but in the character list that you are using to map numbers back to strings. Every time you restart the notebook, the character list is rebuilt in a different order, so when you load your model it can no longer map the numbers back to the correct strings. To resolve this issue you need to save the character list. Please follow the code below.
train_labels_cleaned = []
characters = set()
max_len = 0
for label in train_labels:
    label = label.split(" ")[-1].strip()
    for char in label:
        characters.add(char)
    max_len = max(max_len, len(label))
    train_labels_cleaned.append(label)

print("Maximum length: ", max_len)
print("Vocab size: ", len(characters))

# Check some label samples
train_labels_cleaned[:10]

ff = list(characters)

# Save the character list as a pickle file
import pickle
with open("/content/drive/MyDrive/Colab Notebooks/OCR_course/characters", "wb") as fp:  # Pickling
    pickle.dump(ff, fp)

# Load the character list again
with open("/content/drive/MyDrive/Colab Notebooks/OCR_course/characters", "rb") as fp:  # Unpickling
    b = pickle.load(fp)
print(b)

AUTOTUNE = tf.data.AUTOTUNE

# Mapping characters to integers
char_to_num = StringLookup(vocabulary=b, mask_token=None)

# Mapping integers back to original characters
num_to_chars = StringLookup(vocabulary=char_to_num.get_vocabulary(), mask_token=None, invert=True)
Now when you map the numbers back to strings after prediction, the mapping will retain the original order and the predictions will be decoded accurately.
If you still don't understand the logic, you can watch my video, in which I explain this project from scratch and resolve all the issues you are facing.
https://youtu.be/ZiUEdS_5Byc

How do I use the given XML annotation files in my CNN to classify images

I have been learning about Convolutional Neural Networks over the last month and am finally trying to understand how to use annotated images when doing some sort of categorical classification. I am currently using the images/annotations found here:
http://web.mit.edu/torralba/www/indoor.html
After downloading the tar file linked for the annotations, I don't understand how I'm supposed to use the extracted XML files to help my CNN classify images. I don't know if they need to be formatted another way or just combined somehow with the normal images I have. I have been looking for references on how this is supposed to be done, but haven't found anything as far as I can tell.
This is my current code that I am using to build my original image set without the annotations.
I would appreciate any guidance on what I need to do.
import matplotlib.pyplot as plt
import numpy as np
from sklearn.preprocessing import OneHotEncoder
import os
import cv2
import pickle
import random

DATADIR = "C:/Users/cadan/OneDrive/Desktop/IndoorImages/Images"
CATEGORIES = os.listdir(DATADIR)
#CATEGORIES = ["airport_inside","artstudio","auditorium","bakery","bar","bathroom","bedroom","bookstore","bowling","buffet"]
new_shape = len(CATEGORIES)
IMG_SIZE = 100
enc = OneHotEncoder(handle_unknown='ignore', categories='auto')
NEW_CATEGORIES = np.array(CATEGORIES).reshape(new_shape, 1)
transformed = enc.fit_transform(NEW_CATEGORIES[:]).toarray()
training_data = []

def create_training_data():
    for category in CATEGORIES:
        path = os.path.join(DATADIR, category)
        class_num = CATEGORIES.index(category)
        for img in os.listdir(path):
            try:
                img_array = cv2.imread(os.path.join(path, img))
                new_array = cv2.resize(img_array, (IMG_SIZE, IMG_SIZE))
                training_data.append([new_array, transformed[class_num]])
            except Exception as e:
                pass

create_training_data()
random.shuffle(training_data)
X = []
y = []
for features, label in training_data:
    X.append(features)
    y.append(label)
X = np.array(X).reshape(-1, IMG_SIZE, IMG_SIZE, 3)
y = np.array(y)

pickle_out = open("images", "wb")
pickle.dump(X, pickle_out)
pickle_out.close()

pickle_out = open("categories", "wb")
pickle.dump(y, pickle_out)
pickle_out.close()
It really depends on the task that you want to solve and your description is not completely clear.
Since you are starting to get into DL, I would suggest you start with a simple classification task where you have the set of images as input and a set of single labels as output (in this case, you can use the categories provided by the given dataset). To solve this, you can start with a CNN architecture, for example ResNet. In Keras, you can just import the model architecture and change the top layers to match your desired output shape (that is two lines of code!). I really like the examples given by the Keras community; here you can find a very good entry point for a simple classification task from scratch.
For your specific dataset, I would go the following way (oversimplified; a sketch follows the list):
Build an XML parser for the image class and pass it to a Pandas Dataframe. One column for the filename and another for the label.
Build the CNN as in the previous links.
Use a Keras ImageDataGenerator from the created Pandas Dataframe.
Train the model using .fit()
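A minimal sketch of those steps is below. It assumes the annotation XMLs hold the image filename and the scene class in top-level filename and folder tags and that images are stored per-class; the tag names, column names, and directory layout are assumptions for illustration, not the exact schema of that dataset:

# Hypothetical sketch: XML annotations -> DataFrame -> ImageDataGenerator -> ResNet50 head.
import glob, os
import xml.etree.ElementTree as ET
import pandas as pd
import tensorflow as tf

ANNOT_DIR = "Annotations"   # assumed location of the extracted XML files
IMAGE_DIR = "Images"

rows = []
for xml_path in glob.glob(os.path.join(ANNOT_DIR, "**", "*.xml"), recursive=True):
    root = ET.parse(xml_path).getroot()
    filename = root.findtext("filename")          # assumed tag name
    label = root.findtext("folder")               # assumed: scene category stored here
    if filename and label:
        rows.append({"filename": os.path.join(label, filename), "class": label})

df = pd.DataFrame(rows)

datagen = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1.0 / 255, validation_split=0.2)
train_gen = datagen.flow_from_dataframe(
    df, directory=IMAGE_DIR, x_col="filename", y_col="class",
    target_size=(224, 224), class_mode="categorical", subset="training")
val_gen = datagen.flow_from_dataframe(
    df, directory=IMAGE_DIR, x_col="filename", y_col="class",
    target_size=(224, 224), class_mode="categorical", subset="validation")

# ResNet backbone with a new classification head sized to the number of scene classes.
base = tf.keras.applications.ResNet50(include_top=False, pooling="avg", input_shape=(224, 224, 3))
outputs = tf.keras.layers.Dense(train_gen.num_classes, activation="softmax")(base.output)
model = tf.keras.Model(base.input, outputs)

model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.fit(train_gen, validation_data=val_gen, epochs=5)

The generator replaces the manual cv2/pickle loading from your current code, and the labels come straight from the DataFrame built out of the XML files, so no separate one-hot encoding step is needed.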

How to add testing category in Keras model?

I have a CNN trained on classes = [dog, cat, frog] and in the testing-phase-only, I want to include several pictures of horses to see which known classes those images get classified as. Any idea how to implement this in a Keras model?
One thing I've tried, but don't like, is to distribute the horse pictures equally and randomly across the training images for the known classes (dog, cat, and frog) and then see what happens with the testing images. I'm worried the number of horse images (though relatively small) would negatively impact the model's knowledge of each class. Here is the corresponding code:
<x_train, x_test, y_train, and y_test have already been created prior to this step>
import random

random.seed(1)

clsLst = [0, 1, 2]   # one-hot column indices for dog, cat, frog (assumed layout)
clsRemove = 3        # one-hot column index for horse (assumed layout)
newClsLst = [0, 0, 0]
for i in range(0, len(y_train)):
    if y_train[i][clsRemove] == 1.0:
        y_train[i][clsRemove] = 0.0
        randIndex = random.randint(0, len(clsLst) - 1)  # stay in range of clsLst
        newCls = clsLst[randIndex]
        newClsLst[newCls] = newClsLst[newCls] + 1
        y_train[i][newCls] = 1.0
This is only my second time using Keras and I don't have a programming background, so all tips and over-explaining are appreciated.
As you have correctly noted yourself, adding the horse images to your training data is a bad idea - unless, of course, you want to expand your model's classification capabilities so that it learns to identify horses.
That said, you could simply add the horse images to x_test or set up a separate testing dataset (say, x_test_horses) for this specific testing purpose, i.e. what horses are (mis)classified as.
As has been pointed out in Saankhya Mondal's comment below your original post, with both options you can simply use model.predict() to make predictions (y_pred = model.predict(x_test) or y_pred = model.predict(x_test_horses), respectively).
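A minimal sketch of the second option, assuming x_test_horses is an array of preprocessed horse images with the same shape as your training inputs and that the model's output columns are ordered [dog, cat, frog]:

import numpy as np

# Predict known-class probabilities for the horse images and see where they land.
y_pred = model.predict(x_test_horses)              # shape: (num_horses, 3)
predicted_classes = np.argmax(y_pred, axis=1)      # index of the most likely known class

class_names = ["dog", "cat", "frog"]               # assumed output order
for idx, count in zip(*np.unique(predicted_classes, return_counts=True)):
    print(f"{count} horse images classified as {class_names[idx]}")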

How to make the Data Generator more efficient?

To train a neural network, I modified some code I found on YouTube. It looks as follows:
def data_generator(samples, batch_size, shuffle_data=True, resize=224):
    num_samples = len(samples)
    while True:
        random.shuffle(samples)
        for offset in range(0, num_samples, batch_size):
            batch_samples = samples[offset: offset + batch_size]
            X_train = []
            y_train = []
            for batch_sample in batch_samples:
                img_name = batch_sample[0]
                label = batch_sample[1]
                img = cv2.imread(os.path.join(root_dir, img_name))
                #img, label = preprocessing(img, label, new_height=224, new_width=224, num_classes=37)
                img = preprocessing(img, new_height=224, new_width=224)
                label = my_onehot_encoded(label)
                X_train.append(img)
                y_train.append(label)
            X_train = np.array(X_train)
            y_train = np.array(y_train)
            yield X_train, y_train
Now, I tried to train a neural network using this code. The training sample size is 105,000 (image files which contain 8 characters out of 37 possibilities: A-Z, 0-9, and blank space).
I used a relatively small batch size (32, which I think is already too small) to make it more efficient, but it nevertheless took forever to train even a quarter of the first epoch (I had 826 steps per epoch, and it took 90 minutes for 199 steps... steps_per_epoch = num_train_samples // batch_size).
The following functions are included in the data generator:
def shuffle_data(data):
    random.shuffle(data)  # shuffles in place; random.shuffle returns None
    return data
I don't think this function can be made any more efficient, or be excluded from the generator.
def preprocessing(img, new_height, new_width):
    img = cv2.resize(img, (new_height, new_width))
    img = img / 255
    return img
For preprocessing/resizing the data I use this code to get the images to a uniform size of e.g. (224, 224, 3). I think this part of the generator takes the most time, but I don't see a way to exclude it from the generator (since my memory would be full if all images were resized outside the batches).
# One-hot encoding of the labels
from numpy import argmax
# define input string
def my_onehot_encoded(label):
    # define universe of possible input values
    characters = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ '
    # define a mapping of chars to integers
    char_to_int = dict((c, i) for i, c in enumerate(characters))
    int_to_char = dict((i, c) for i, c in enumerate(characters))
    # integer encode input data
    integer_encoded = [char_to_int[char] for char in label]
    # one hot encode
    onehot_encoded = list()
    for value in integer_encoded:
        character = [0 for _ in range(len(characters))]
        character[value] = 1
        onehot_encoded.append(character)
    return onehot_encoded
I think this part offers one opportunity to make things more efficient. I am thinking about excluding this code from the generator and producing the y_train array outside of it, so that the generator does not have to one-hot encode the labels every time.
What do you think? Or should I maybe go for a completely different approach?
I found your question very intriguing because you give only clues. So here is my investigation.
Using your snippets, I found the GitHub repository and a 3-part video tutorial on YouTube that mainly focuses on the benefits of using generator functions in Python.
The data is based on this Kaggle competition (I would recommend checking out different kernels for that problem, to compare the approach you already tried with other CNN networks and to review the APIs in use).
You do not need to write a data generator from scratch. Though it is not hard, reinventing the wheel is not productive.
Keras has the ImageDataGenerator class.
Plus here is a more generic example for DataGenerator.
TensorFlow offers very neat pipelines with its tf.data.Dataset.
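As a rough illustration of the tf.data route (not code from the linked repository; image_paths, onehot_labels, and the PNG format are assumptions), a pipeline like this reads, resizes, and scales images on parallel workers and prefetches the next batch while the current one trains:

import tensorflow as tf

IMG_SIZE = 224
AUTOTUNE = tf.data.AUTOTUNE

def load_and_preprocess(path, label):
    # Read, decode, resize, and scale one image entirely inside the tf.data pipeline.
    img = tf.io.read_file(path)
    img = tf.image.decode_png(img, channels=3)   # assumed PNG input
    img = tf.image.resize(img, (IMG_SIZE, IMG_SIZE)) / 255.0
    return img, label

# image_paths: list of file paths; onehot_labels: labels encoded once, outside the
# pipeline, as proposed in the question.
dataset = (
    tf.data.Dataset.from_tensor_slices((image_paths, onehot_labels))
    .shuffle(buffer_size=10_000)
    .map(load_and_preprocess, num_parallel_calls=AUTOTUNE)  # parallel decode/resize
    .batch(32)
    .prefetch(AUTOTUNE)  # prepare the next batch while the current one trains
)

model.fit(dataset, epochs=10)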
Nevertheless, to solve the Kaggle task, the model only needs to perceive single images (one class per image), hence a simple deep CNN is enough. But as I understand it, you are combining 8 random characters (classes) into one image to recognize multiple classes at once. For that task, you need an R-CNN or YOLO as your model. I only recently discovered YOLO v4 myself, and it is possible to make it work for a specific task really quickly.
General advice about your design and code:
Make sure the library uses the GPU. It saves a lot of time. (Even though I repeated the flowers experiment from the repository quite quickly on a CPU - about 10 minutes - the resulting predictions were no better than a random guess, so full training requires a lot of time on a CPU.)
Compare different versions to find the bottleneck. Try a dataset with 48 images (1 per class), increase the number of images per class, and compare. Reduce the image size, change the model structure, etc.
Test brand new models on small, artificial data to prove the idea, or use an iterative process: start from projects that can be converted to your task (handwriting recognition?).

Trying to print class names for dog breed but it keeps saying list index out of range

I am using a ResNet model to classify dog breeds, but when I try to print out an image with the label of the dog breed, it says list index out of range.
Here is my code:
import torchvision.models as models
import torch.nn as nn
model_transfer = models.resnet18(pretrained=True)
if use_cuda:
model_transfer = model_transfer.cuda()
model_transfer.fc.out_features = 133
Then I train the model and get over 70% accuracy on the dog breeds.
Then here is my code to classify dog and print the dog breed:
data_transfer = {'train':
    datasets.ImageFolder('/data/dog_images/train',
                         transform=transforms.Compose([transforms.RandomResizedCrop(224), transforms.ToTensor()]))}
class_names = [item[4:].replace("_", " ") for item in data_transfer['train'].classes]
class_names[0]
def predict_breed_transfer(img_path):
    image = Image.open(img_path)
    # large images will slow down processing
    in_transform = transforms.Compose([
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225])])
    # discard the transparent, alpha channel (that's the :3) and add the batch dimension
    image = in_transform(image)[:3, :, :].unsqueeze(0)
    output = model_transfer(image)
    pred = torch.argmax(output)
    return class_names[pred]

predict_breed_transfer('images/Labrador_retriever_06455.jpg')
The code always predicts the dog breed wrong for some reason.
Then when I try to print out the image and the label:
import matplotlib.pyplot as plt

def run_app(img_path):
    img = Image.open(img_path)
    dog = dog_detector(img_path)
    if not dog:
        print('hello, human!')
        plt.imshow(img)
        print('You look like a ... ')
        print(predict_breed_transfer(img_path))
    if dog:
        print('hello, dog!')
        print('Your predicted breed is ....')
        print(predict_breed_transfer(img_path))
        plt.imshow(img)
    else:
        print('Neither human nor dog')
When I run a for loop that calls it on some dog images, it prints some of the breeds out, then says list index out of range and doesn't show any of the images.
The length of class_names is 133.
And when I print out the ResNet model, the output layer only has 133 nodes. Does anyone know why it is saying list index out of range, or why it is so inaccurate?
IndexError                                Traceback (most recent call last)
<ipython-input-26-473a9ba884b5> in <module>()
      5 ## suggested code, below
      6 for file in np.hstack((human_files[:3], dog_files[:3])):
----> 7     run_app(file)
      8

<ipython-input-25-1d44200e44cc> in run_app(img_path)
     10         plt.show(img)
     11         print('You look like a ... ')
---> 12         print(predict_breed_transfer(img_path))
     13     if dog:
     14         print('hello, dog!')

<ipython-input-20-a51fb205659e> in predict_breed_transfer(img_path)
     26     pred = torch.argmax(output)
     27
---> 28     return class_names[pred]
     29
     30 predict_breed_transfer('images/Labrador_retriever_06455.jpg')

IndexError: list index out of range
That is the full error I get.
I suppose you have several issues that can be fixed using 13 chars.
First, I suggest doing what @Alekhya Vemavarapu suggested - run your code with a debugger to isolate each line and inspect the output. This is one of the greatest benefits of dynamic graphs with PyTorch.
Secondly, the most probable cause of your issue is the argmax statement, which you use incorrectly. You do not specify the dimension to perform the argmax over, and therefore PyTorch automatically flattens the tensor and performs the operation over the full-length vector. Thus, you get a number between 0 and MB_Size x num_classes - 1. See the official doc on this method.
So, due to your fully connected layer, I assume your output is of shape (MB_Size, num_classes). If so, you need to change your code to the following line:
pred = torch.argmax(output,dim=1)
and that's it. Otherwise, just choose the dimension of the logits.
The third thing you want to consider is dropout and other effects that a training configuration may have on inference. For instance, dropout in some frameworks may require multiplying the output by 1/(1-p) at inference time (or not, since it can be done during training), batch normalization should switch to its running statistics, and so on. Additionally, to reduce memory consumption, no gradients should be computed. Luckily, the PyTorch developers are very thoughtful and provided us with torch.no_grad() and model.eval() for that.
I strongly suggest making this a habit; as a quick fix, you can change your code with a few characters:
output = model_transfer.eval()(image)
and you're done!
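For a fuller illustration, a typical inference block would look something like the sketch below (just the standard PyTorch pattern, not code from the original project):

model_transfer.eval()               # switch dropout/batch-norm layers to inference mode
with torch.no_grad():               # disable gradient tracking to save memory
    output = model_transfer(image)
    pred = torch.argmax(output, dim=1)
print(class_names[pred.item()])     # .item() turns the 1-element tensor into a Python int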
Edit:
This is a simple case of misusing the PyTorch framework, of not reading the docs and not debugging the code. The following code is simply incorrect:
model_transfer.fc.out_features = 133
This line does not actually create a new fully connected layer. It just changes a property of the existing layer; the underlying weight and bias tensors are untouched. Try it in your console:
import torch
a = torch.nn.Linear(1,2)
a.out_features = 3
print(a.bias.data.shape, a.weight.data.shape)
Output:
torch.Size([2]) torch.Size([2, 1])
which indicates that the actual weight matrix and bias vector remain in their original dimensions.
A correct way to perform transfer learning is to keep the backbone (usually the convolutional layers up to the fully connected ones in these types of models) and overwrite the head (the FC layer in this case) with your own. If only one fully connected layer exists in the original model, you do not have to change the forward pass of your model and you're good to go.
Since this answer is already long enough, just visit the Transfer learning tutorial in PyTorch docs to see how it can be done.
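As a minimal sketch of that idea for the model in the question (replacing the whole head rather than mutating out_features; the 133 is the number of breeds from the question):

import torchvision.models as models
import torch.nn as nn

model_transfer = models.resnet18(pretrained=True)
# Replace the fully connected head so the weight matrix really has 133 output rows.
model_transfer.fc = nn.Linear(model_transfer.fc.in_features, 133)
if use_cuda:
    model_transfer = model_transfer.cuda()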
Good luck to ya'll.
