how to interpret this MNIST tensor? - python

I have found code that converts the data from the MNIST dataset into tensors. The code is the following:
import torch
import torchvision
import matplotlib.pyplot as plt

batch_size_test = 1000
test_loader = torch.utils.data.DataLoader(
    torchvision.datasets.MNIST('/files/', train=False, download=True,
                               transform=torchvision.transforms.Compose([
                                   torchvision.transforms.ToTensor(),
                                   torchvision.transforms.Normalize(
                                       (0.1307,), (0.3081,))
                               ])),
    batch_size=batch_size_test, shuffle=True
)

examples = enumerate(test_loader)
# pull one batch of images and labels from the loader
batch_idx, (example_data, example_targets) = next(examples)
print(example_data.shape)
When I print the shape of example_data, I get the following:
torch.Size([1000, 1, 28, 28])
From what I know, this is a tensor of 1000 samples, 1 channel, and images 28 pixels high by 28 pixels wide. Graphically, I imagine it as a sort of 4D array in which I have cubes, 1000 stacked one over another, each of them formed by 28 x 28 x 1 values.
I have also tried the following instruction:
print (example_data[2][0])
but the output is something like:
tensor([[-0.4242, -0.4242, -0.4242, -0.4242, -0.4242, -0.4242, -0.4242, -0.4242,
-0.4242, -0.4242, -0.4242, -0.4242, -0.4242, -0.4242, -0.4242, -0.4242,
-0.4242, -0.4242, -0.4242, -0.4242, -0.4242, -0.4242, -0.4242, -0.4242,
-0.4242, -0.4242, -0.4242, -0.4242],
[-0.4242, -0.4242, -0.4242, -0.4242, -0.4242, -0.4242, -0.4242, -0.4242,
-0.4242, -0.4242, -0.4242, -0.4242, -0.4242, -0.4242, -0.4242, -0.4242,
-0.4242, -0.4242, -0.4242, -0.4242, -0.4242, -0.4242, -0.4242, -0.4242,
-0.4242, -0.4242, -0.4242, -0.4242],
I see that each part between brackets, like a sort of one-dimensional array, contains 28 numbers laid out horizontally, but why do I also have 28 of these [] stacked vertically?
Also, in print(example_data[2][0]), the 2 selects one of the samples, but why do I have to put the [0]?
Sorry if it seems like two questions in one post, but I believe they are closely related.

As you said, the MNIST batch is a [1000, 1, 28, 28] tensor, so each image is a 28x28 matrix. That matrix comprises 28 row vectors of length 28, which is why you see 28 bracketed groups stacked vertically (your first question).
For your second question: although MNIST has a single channel, images in PyTorch can generally have three or even more channels, so every image tensor carries an explicit channel dimension. The [0] indexes into that dimension; for MNIST it is a dummy dimension of size 1, kept so the tensor has the same general form as any other image type.
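To make this concrete, here is a minimal sketch (reusing the example_data batch from the question) showing how each indexing step peels off one dimension:
import matplotlib.pyplot as plt

print(example_data.shape)        # torch.Size([1000, 1, 28, 28]) -> (batch, channel, height, width)
print(example_data[2].shape)     # torch.Size([1, 28, 28]) -> one sample, channel dim still present
print(example_data[2][0].shape)  # torch.Size([28, 28]) -> the actual image: 28 rows of 28 pixels

# example_data[2][0] and example_data[2, 0] are equivalent ways to index
plt.imshow(example_data[2][0], cmap='gray')
plt.show()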

Related

How would I increase my accuracy in the cifar-100 dataset? I have a 10% accuracy at the moment

I am doing a small project for fun with the CIFAR-100 dataset. I'm not sure why I have a low 10% accuracy. Here is my code, could anyone help? Thanks all. (FYI, CIFAR-100 is a Keras dataset with 100 classes of images.)
import tensorflow as tf
from tensorflow import keras
import numpy as np
import matplotlib.pyplot as plt
#print(train_images[0])
#print("Network Accuracy: " + str(test_acc))
# plt.imshow(train_images[0], cmap=plt.cm.binary) #greyscale
# plt.imshow(train_images[0]) #neon
# plt.show()
cifar100_mnist = keras.datasets.cifar100
(train_images, train_labels), (test_images, test_labels) = cifar100_mnist.load_data()
print(train_images)
print("-")
print(train_labels)
train_images = train_images/255
test_images = test_images/255
classes = [
'apple', 'aquarium_fish', 'baby', 'bear', 'beaver', 'bed', 'bee', 'beetle',
'bicycle', 'bottle', 'bowl', 'boy', 'bridge', 'bus', 'butterfly', 'camel',
'can', 'castle', 'caterpillar', 'cattle', 'chair', 'chimpanzee', 'clock',
'cloud', 'cockroach', 'couch', 'crab', 'crocodile', 'cup', 'dinosaur',
'dolphin', 'elephant', 'flatfish', 'forest', 'fox', 'girl', 'hamster',
'house', 'kangaroo', 'keyboard', 'lamp', 'lawn_mower', 'leopard', 'lion',
'lizard', 'lobster', 'man', 'maple_tree', 'motorcycle', 'mountain', 'mouse',
'mushroom', 'oak_tree', 'orange', 'orchid', 'otter', 'palm_tree', 'pear',
'pickup_truck', 'pine_tree', 'plain', 'plate', 'poppy', 'porcupine',
'possum', 'rabbit', 'raccoon', 'ray', 'road', 'rocket', 'rose',
'sea', 'seal', 'shark', 'shrew', 'skunk', 'skyscraper', 'snail', 'snake',
'spider', 'squirrel', 'streetcar', 'sunflower', 'sweet_pepper', 'table',
'tank', 'telephone', 'television', 'tiger', 'tractor', 'train', 'trout',
'tulip', 'turtle', 'wardrobe', 'whale', 'willow_tree', 'wolf', 'woman',
'worm'
]
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(32, 32, 3)),
    keras.layers.Dense(500, activation="relu"),
    keras.layers.Dense(100, activation="softmax")
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(train_images, train_labels, epochs=5)
test_loss, test_acc = model.evaluate(test_images, test_labels)
print(test_acc)
# print(test_images)
prediction = model.predict(test_images)
answer = np.argmax(prediction[0])
print(classes[answer])
# print(train_images[0])
plt.imshow(train_images[0])
plt.show()
Hope that I can get an answer, thanks guys, I really appreciate the help.
An example of implementing convolutions for your task:
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu', input_shape=(32, 32, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(100, activation='softmax')
])
model.summary()
Try more epochs (e.g. 25):
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
history = model.fit(train_images, train_labels, epochs=25)
You get high accuracy on the training set:
Epoch 25/25
50000/50000 [==============================] - 12s 248us/sample - loss: 0.2362 - acc: 0.9243
Let's have a look at the test accuracy:
test_loss, test_acc = model.evaluate(test_images, test_labels)
print(test_acc)
prediction = model.predict(test_images)
10000/10000 [==============================] - 1s 139us/sample - loss: 7.7701 - acc: 0.3272
0.3272
This means your model overfits the training set.
Try it yourself:
image_number = 5
answer = np.argmax(prediction[image_number])
print(classes[answer])
plt.imshow(train_images[image_number])
lizard
obviously no lizard :)
Here begins the real task of machine learning: improving your model to reduce bias and variance and gain higher accuracy. I am still learning too, so I hope this points you in the right direction.
But be aware of this dataset. You can read on the CIFAR-100 page:
There are 100 classes containing 600 images each. There are 500
training images and 100 testing images per class.
500 images per class is far too few to train a CNN on and will lead to overfitting. I think CIFAR-10 is more suitable for beginners. They mention on the same page:
[...] 18% test error without data augmentation [...]
Which would be much more fun to experiment with :)
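If you want to fight the overfitting before switching datasets, two standard levers are dropout and data augmentation. A sketch, untested on this exact setup, using the same data variables as above:
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator

model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu', input_shape=(32, 32, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dropout(0.5),  # randomly silence half the units during training
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(100, activation='softmax')
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# generate shifted/flipped variants so the network never memorizes exact pixels
datagen = ImageDataGenerator(width_shift_range=0.1, height_shift_range=0.1, horizontal_flip=True)
model.fit(datagen.flow(train_images, train_labels, batch_size=64),
          epochs=25,
          validation_data=(test_images, test_labels))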
You mentioned in the comments that you want to see an example; here is one on GitHub. To be clear, I have not personally tested that code repository.
I think it is your model that is wrong. Try doing Conv2D, then MaxPooling, several times. As for the second-to-last dense layer, whether it is worth keeping depends on how many epochs you train for. Analyze your model with TensorBoard. Also, when dividing train_images by 255, it is best to write 255.0 to make the float division explicit (in Python 3 the result is the same either way). I do not think there are any other issues with your code. However, I would recommend using the same order of classes as given on the CIFAR-100 website.

vgg16: TypeError: __call__() missing 1 required positional argument: 'inputs'

I work with medical images. I use VGG16 for the classification of two classes: I remove the last layer (predictions (Dense)) and add two layers, and I got an accuracy of 71% over 200 epochs. I want to use my pre-trained model to locate the relevant areas in an image with Grad-CAM++, but when I call my model, I get this error!
How can I solve this problem? Is my method correct or not? Please help.
vgg16_model = keras.applications.vgg16.VGG16()
vgg16_model.layers.pop()

model = Sequential()
for layer in vgg16_model.layers:
    model.add(layer)
for layer in model.layers:
    layer.trainable = False
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax', name='predic'))

from keras.optimizers import SGD
#model.compile(Adam(lr=0.0001), loss='categorical_crossentropy', metrics=['accuracy'])

import time
start = time.time()
history = model.fit_generator(generator=train_batches,
                              epochs=epochs,
                              steps_per_epoch=steps_train,
                              #callbacks=callbacks_list,
                              validation_data=valid_batches,
                              validation_steps=steps_valid,
                              shuffle=True)
end = time.time()
model = model(include_top=True, weights='imagenet',input_shape=(224,224,3))
TypeError                                 Traceback (most recent call last)
<ipython-input-...> in <module>
----> 1 model = model(include_top=True, weights='imagenet', input_shape=(224,224,3))

TypeError: __call__() missing 1 required positional argument: 'inputs'
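For what it's worth, the traceback points at the likely cause: by the time the last line runs, model is the fine-tuned Sequential instance built above, not the VGG16 constructor, so model(...) invokes Keras's Model.__call__, which expects an inputs tensor. If the intent was to build a fresh VGG16, a sketch of calling the constructor instead:
from keras.applications.vgg16 import VGG16

# Call the VGG16 constructor, not the already-built Sequential `model`
base_model = VGG16(include_top=True, weights='imagenet', input_shape=(224, 224, 3))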

'LSTM' object is not subscriptable

Here is my model code:
encoder = Embedding(input_dim=dataset.shape[0],output_dim=300, mask_zero=True, input_length=12,embeddings_initializer='uniform')
encoder = LSTM(epochs, input_shape=(train_X.shape[1], train_X.shape[2]), return_sequences=True, unroll=True)
encoder_last = encoder[:,-1,:]
and I got the following error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-88-3967dfedaa44> in <module>
1 encoder = Embedding(input_dim=dataset.shape[0],output_dim=300, mask_zero=True, input_length=12,embeddings_initializer='uniform')
2 encoder = LSTM(epochs, input_shape=(train_X.shape[1], train_X.shape[2]), return_sequences=True, unroll=True)
----> 3 encoder_last = encoder[:,-1,:]
TypeError: 'LSTM' object is not subscriptable
How should I fix it?
I guess you want to apply the LSTM layer on the output of the Embedding layer and then take the last output of the LSTM. Therefore, you first need to call (i.e. apply) the layers you have defined on some tensors (i.e. the output of a previous layer), like this:
inp = Input(shape=...)
encoder = Embedding(...)(inp) # call embedding layer on inputs
encoder = LSTM(...)(encoder) # call lstm layer on the output of embedding layer
This way the layers are connected to each other. Then you need to use a Lambda layer to slice the LSTM layer output:
encoder_last = Lambda(lambda x: x[:,-1,:])(encoder)
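Putting the pieces together, a self-contained sketch. The vocabulary size and LSTM width here are placeholder values, not taken from the question, and mask_zero is dropped because a plain Lambda layer does not support Keras masks:
from keras.layers import Input, Embedding, LSTM, Lambda
from keras.models import Model

inp = Input(shape=(12,))  # 12 = input_length of the Embedding
x = Embedding(input_dim=10000, output_dim=300, input_length=12,
              embeddings_initializer='uniform')(inp)
x = LSTM(128, return_sequences=True)(x)          # output shape: (batch, 12, 128)
encoder_last = Lambda(lambda t: t[:, -1, :])(x)  # slice the last timestep: (batch, 128)

model = Model(inputs=inp, outputs=encoder_last)
model.summary()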

Keras model giving constantly same output class?

I've been trying to build a Keras model following the "cats vs dogs" tutorials, but unfortunately I always get the same output class, "cat". I know there have been a few posts where people had the same struggle, and I've tried every approach, but I still couldn't figure out what I'm doing wrong. A friend of mine told me I'm not labeling the classes correctly, since my accuracy ratio changes based on how many images I have per class, but I read in the tutorials that if I have sub-directories, the flow_from_directory method already labels my classes based on the names of my folders. If someone could enlighten me on what I'm doing wrong here, that'd be quite helpful. Here's a small code sample of my prototype:
# MODEL CONSTRUCTION -----------------------------------------
model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=(128, 128, 3)))  #(3, 150, 150)
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())  # this converts all our 3D feature maps to 1D feature vectors
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(1))
model.add(Activation('sigmoid'))  # sigmoid for binary outcome, softmax for more than two outcomes
model.compile(loss='binary_crossentropy',  # since it's a binary classification
              optimizer='rmsprop',
              metrics=['accuracy'])
#-------------------------------------------------------------
# augmentation configuration for training
train_datagen = ImageDataGenerator(
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    #rescale=1. / 255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest')
# augmentation configuration for validating
valid_datagen = ImageDataGenerator(
    rotation_range=60,
    width_shift_range=0.4,
    height_shift_range=0.1,
    zoom_range=0.1,
    vertical_flip=True)
# augmentation configuration for testing
test_datagen = ImageDataGenerator(rescale=1. / 255)
train_generator = train_datagen.flow_from_directory(
    directory='data/train',  # this is the target directory
    target_size=(img_width, img_height),  # all images will be resized to the given dimensions
    color_mode="rgb",
    #classes = ['dog', 'cat'],
    batch_size=batch_size,
    class_mode='binary')  # since we use binary_crossentropy loss, we need binary labels
# this is a similar generator, for validation data
validation_generator = valid_datagen.flow_from_directory(
    directory='data/validation',
    target_size=(img_width, img_height),
    color_mode="rgb",
    #classes = ['dog', 'cat'],
    batch_size=batch_size,
    class_mode='binary',
    seed=42)
test_generator = test_datagen.flow_from_directory(
    directory='data/test',
    target_size=(img_width, img_height),
    color_mode="rgb",
    batch_size=1,
    class_mode=None,
    #shuffle=False,
)
model.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples / batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=nb_validation_samples / batch_size)
model.evaluate_generator(generator=validation_generator)
test_generator.reset()
pred = model.predict_generator(test_generator, verbose=1)
predicted_class_indices = np.argmax(pred, axis=1)
labels = (train_generator.class_indices)
labels = dict((v, k) for k, v in labels.items())
predictions = [labels[k] for k in predicted_class_indices]
Here's an image of the result when I test with some random images: [image omitted]
The output layer in your model has a single node activated with the sigmoid activation function, so the model's output is one-dimensional, with each value between 0 and 1.
You are making predictions like this:
pred=model.predict_generator(test_generator,verbose=1)
predicted_class_indices=np.argmax(pred,axis=1)
That is, you make predictions with the model and then take the argmax over them. But the output for each image is a single value, like 0.99 or 0.001, so taking argmax along axis 1 of a single-column array always returns 0. Hence the output you always get is 0, which corresponds to "cat".
If you want your model to make predictions properly, you must take the prediction made by the model and map it to a class using a threshold, like this (keeping the threshold at 0.5):
pred = model.predict_generator(test_generator, verbose=1)
predicted_class_indices = [1 if x >= 0.5 else 0 for x in pred]
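Equivalently, vectorized with NumPy (same 0.5 threshold; pred has shape (num_samples, 1)):
import numpy as np

# flatten the (num_samples, 1) predictions and threshold them in one step
predicted_class_indices = (pred.ravel() >= 0.5).astype(int)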
Why don't you use the exact code and the exact training data from the tutorial? That is the recommended way to solve this kind of problem when you have messed things up and have no idea what to fix.

model.fit_generator() shape error

import os
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Convolution2D, MaxPooling2D
from keras.layers import Activation, Dropout, Flatten, Dense

img_width, img_height = 64, 64
train_data_dir = 'data/train'
validation_data_dir = 'data/validation'
nb_train_samples = sum([len(files) for files in os.walk(train_data_dir)])
nb_validation_samples = sum([len(files) for files in os.walk(validation_data_dir)])
nb_epoch = 10

model = Sequential()
model.add(Dense(4096, input_dim=4096, init='normal', activation='relu'))
model.add(Dense(4, init='normal', activation='softmax'))
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

train_datagen = ImageDataGenerator(
    rescale=1./255,
)
test_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    color_mode="grayscale",
    target_size=(img_width, img_height),
    batch_size=1,
    class_mode=None)
validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,
    color_mode="grayscale",
    target_size=(img_width, img_height),
    batch_size=1,
    class_mode=None)
model.fit_generator(
    train_generator,
    samples_per_epoch=nb_train_samples,
    nb_epoch=nb_epoch,
    validation_data=validation_generator,
    nb_val_samples=nb_validation_samples)
Everything runs fine until the model.fit_generator() call in the code above. Then it throws errors like the following.
Traceback (most recent call last):
File "C:/Users/Sam/PycharmProjects/MLP/Testing Code without CNN.py", line 55, in <module>
nb_val_samples=nb_validation_samples)
File "C:\Python27\lib\site-packages\keras\models.py", line 874, in fit_generator
pickle_safe=pickle_safe)
File "C:\Python27\lib\site-packages\keras\engine\training.py", line 1427, in fit_generator
'or (x, y). Found: ' + str(generator_output))
Exception: output of generator should be a tuple (x, y, sample_weight) or (x, y). Found: [[[[ 0.19215688]
The problem is caused by a data dimension mismatch. ImageDataGenerator loads the image files and puts them into numpy arrays of shape (num_image_channel, image_height, image_width). But your first layer is a densely connected layer, which expects input in the shape of a 1D array (or a 2D array with a number of samples). So essentially you are missing an input layer that takes the input in the right shape.
Change the following line of code
model.add(Dense(4096, input_dim = 4096, init='normal', activation='relu'))
to
model.add(Reshape((img_width*img_height*img_channel,), input_shape=(img_channel, img_height, img_width)))
model.add(Dense(4096, init='normal', activation='relu'))
You have to define img_channel, the number of channels in your images (note the trailing comma: Reshape's target shape must be a tuple). The above code also assumes you are using the th dim_ordering (channels first). If you are using the tf input dimension ordering (channels last), you would have to change the reshape layer to
model.add(Reshape((img_width*img_height*img_channel,), input_shape=(img_height, img_width, img_channel)))
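Alternatively, a Flatten layer achieves the same thing without computing the product by hand. A sketch in the same old-style Keras API as the question:
from keras.models import Sequential
from keras.layers import Flatten, Dense

model = Sequential()
# Flatten collapses (channels, height, width) into one long vector automatically
model.add(Flatten(input_shape=(img_channel, img_height, img_width)))  # th ordering; swap for tf
model.add(Dense(4096, init='normal', activation='relu'))
model.add(Dense(4, init='normal', activation='softmax'))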
--- Old answer ---
You have probably put the training data and validation data into subfolders under train and validation, which isn't supported by Keras. All training data should be in one single folder; the same goes for the validation data.
Please refer to this Keras tutorial for more details.
I am not 100% sure what you are trying to achieve, but if you are trying a binary classification of pictures, try setting class_mode to binary. From the documentation:
class_mode: one of "categorical", "binary", "sparse" or None. Default:
"categorical". Determines the type of label arrays that are returned:
"categorical" will be 2D one-hot encoded labels, "binary" will be 1D
binary labels, "sparse" will be 1D integer labels.
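For instance, the training generator from the question would become the following sketch. (Note: the question's model ends in Dense(4, ...) with softmax and categorical_crossentropy, so 'categorical' rather than 'binary' would be the matching choice there; the original class_mode=None is what makes the generator yield only images with no labels, which is exactly why the generator output is not an (x, y) tuple.)
train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    color_mode="grayscale",
    target_size=(img_width, img_height),
    batch_size=1,
    class_mode='categorical')  # now yields (x, y) tuples with one-hot labels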
The error message is a bit confusing but if you look at the source code, it becomes clearer:
if not hasattr(generator_output, '__len__'):
    _stop.set()
    raise Exception('output of generator should be a tuple '
                    '(x, y, sample_weight) '
                    'or (x, y). Found: ' + str(generator_output))
