I am trying to implement a transfer learning approach in PyTorch. This is the dataset that I am using: Dog-Breed
Here are the steps I am following.
1. Load the data and read csv using pandas.
2. Resize the train images to (60, 60) and store them as a numpy array.
3. Apply stratification and split the train data 7:1:2 (train:validation:test).
4. Use the resnet18 model and train.
Location of the dataset:
LABELS_LOCATION = './dataset/labels.csv'
TRAIN_LOCATION = './dataset/train/'
TEST_LOCATION = './dataset/test/'
ROOT_PATH = './dataset/'
Reading CSV (labels.csv)
def read_csv(csvf):
    # print(pandas.read_csv(csvf).values)
    data = pandas.read_csv(csvf).values
    labels_dict = dict(data)
    idz = list(labels_dict.keys())
    clazz = list(labels_dict.values())
    return labels_dict, idz, clazz
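(For context, a hedged sketch of what read_csv returns, assuming the usual labels.csv layout with id and breed columns:)

labels_dict, idz, clazz = read_csv(LABELS_LOCATION)
# labels_dict: {image_id: breed_name, ...}
# idz: list of image ids; clazz: list of breed names, aligned by index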
I did this because of a constraint that I will mention below, when loading the data using DataLoader.
def class_hashmap(class_arr):
    uniq_clazz = Counter(class_arr)
    class_dict = {}
    for i, j in enumerate(uniq_clazz):
        class_dict[j] = i
    return class_dict
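(A quick illustrative call with made-up breed names, to show the mapping this produces; the exact numbering depends on dict iteration order:)

class_hashmap(['pug', 'boxer', 'pug'])  # e.g. {'pug': 0, 'boxer': 1}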
labels, ids, class_names = read_csv(LABELS_LOCATION)
train_images = os.listdir(TRAIN_LOCATION)
class_numbers = class_hashmap(class_names)
Next, I resize the images to (60, 60) using OpenCV and store the result as a numpy array.
resize = []
indexed_labels = []
for t_i in train_images:
    # resize.append(transform.resize(io.imread(TRAIN_LOCATION + t_i), (60, 60, 3)))  # (60, 60) is the height and width; 3 is the number of channels
    resize.append(cv2.resize(cv2.imread(TRAIN_LOCATION + t_i), (60, 60)).reshape(3, 60, 60))
    indexed_labels.append(class_numbers[labels[t_i.split('.')[0]]])
resize = np.asarray(resize)
print(resize.shape)
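(One caveat worth flagging: reshape(3, 60, 60) reinterprets the flat HWC bytes rather than moving the channel axis, so it scrambles the pixels. A hedged sketch of the usual HWC-to-CHW conversion:)

img = cv2.resize(cv2.imread(TRAIN_LOCATION + t_i), (60, 60))  # shape (60, 60, 3), HWC
img = img.transpose(2, 0, 1)                                  # shape (3, 60, 60), CHW as PyTorch expects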
Here, in indexed_labels, I give each label a number.
Next, I split the data into 7:1:2 parts.
X = resize  # numpy array of images [training data]
y = np.array(indexed_labels)  # indexed labels for images [training labels]

# First split: hold out 20% as the test set
sss = StratifiedShuffleSplit(n_splits=3, test_size=0.2, random_state=0)
sss.get_n_splits(X, y)
for train_index, test_index in sss.split(X, y):
    X_temp, X_test = X[train_index], X[test_index]  # split into train+val and test [data]
    y_temp, y_test = y[train_index], y[test_index]  # labels

# Second split: 0.125 of the remaining 80% gives the 10% validation set
sss = StratifiedShuffleSplit(n_splits=3, test_size=0.125, random_state=0)
sss.get_n_splits(X_temp, y_temp)
for train_index, test_index in sss.split(X_temp, y_temp):
    print("TRAIN:", train_index, "VAL:", test_index)
    X_train, X_val = X_temp[train_index], X_temp[test_index]  # training and validation data
    y_train, y_val = y_temp[train_index], y_temp[test_index]  # training and validation labels
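(Sanity check on the ratios: holding out 20% for test and then 12.5% of the remainder for validation recovers the 7:1:2 target:)

test_frac = 0.2
val_frac = (1 - test_frac) * 0.125    # = 0.10 of the full set
train_frac = (1 - test_frac) * 0.875  # = 0.70 of the full set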
Next, I load the data from the previous step into torch DataLoaders:
batch_size = 500
learning_rate = 0.001

train = torch.utils.data.TensorDataset(torch.from_numpy(X_train), torch.from_numpy(y_train))
train_loader = torch.utils.data.DataLoader(train, batch_size=batch_size, shuffle=False)

val = torch.utils.data.TensorDataset(torch.from_numpy(X_val), torch.from_numpy(y_val))
val_loader = torch.utils.data.DataLoader(val, batch_size=batch_size, shuffle=False)

test = torch.utils.data.TensorDataset(torch.from_numpy(X_test), torch.from_numpy(y_test))
test_loader = torch.utils.data.DataLoader(test, batch_size=batch_size, shuffle=False)

dataloaders = {
    'train': train_loader,
    'val': val_loader
}
Next, I load the pretrained resnet model.
model_ft = models.resnet18(pretrained=True)

# freeze all model parameters
# for param in model_ft.parameters():
#     param.requires_grad = False

num_ftrs = model_ft.fc.in_features
model_ft.fc = nn.Linear(num_ftrs, len(class_numbers))

if use_gpu:
    model_ft = model_ft.cuda()
    model_ft.fc = model_ft.fc.cuda()

criterion = nn.CrossEntropyLoss()

# Observe that only the parameters of the new final layer are being optimized
optimizer_ft = optim.SGD(model_ft.fc.parameters(), lr=0.001, momentum=0.9)

# Decay LR by a factor of 0.1 every 7 epochs
exp_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=7, gamma=0.1)

model_ft = train_model(model_ft, criterion, optimizer_ft, exp_lr_scheduler,
                       num_epochs=25)
And then I use train_model, a method described here in PyTorch's docs.
However, when I run this I get an error:
Traceback (most recent call last):
  File "/Users/nirvair/Sites/pyTorch/TL.py", line 244, in <module>
    num_epochs=25)
  File "/Users/nirvair/Sites/pyTorch/TL.py", line 176, in train_model
    outputs = model(inputs)
  File "/Library/Python/2.7/site-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "/Library/Python/2.7/site-packages/torchvision/models/resnet.py", line 149, in forward
    x = self.avgpool(x)
  File "/Library/Python/2.7/site-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "/Library/Python/2.7/site-packages/torch/nn/modules/pooling.py", line 505, in forward
    self.padding, self.ceil_mode, self.count_include_pad)
  File "/Library/Python/2.7/site-packages/torch/nn/functional.py", line 264, in avg_pool2d
    ceil_mode, count_include_pad)
  File "/Library/Python/2.7/site-packages/torch/nn/_functions/thnn/pooling.py", line 360, in forward
    ctx.ceil_mode, ctx.count_include_pad)
RuntimeError: Given input size: (512x2x2). Calculated output size: (512x0x0). Output size is too small at /Users/soumith/code/builder/wheel/pytorch-src/torch/lib/THNN/generic/SpatialAveragePooling.c:64
I can't seem to figure out what's going wrong here.
Your network is too deep for the size of the images you are using (60x60). As you know, CNN layers produce smaller and smaller feature maps as the input image propagates through the layers; this happens here because you are not using padding.
The error simply says that the average-pooling layer received 512 feature maps of size 2 pixels by 2 pixels, and the output it would compute from them is 512 maps of size 0x0, which is too small. This mismatch is what triggered the error.
Generally, all stock networks, such as ResNet-18, Inception, etc., require the input images to be of size 224x224 (at least). You can achieve this easily using the torchvision transforms [1]. You can also use larger image sizes, with one exception for AlexNet, which has the size of its feature vector hardcoded, as explained in my answer at [2].
Bonus tip: if you are using the network in pretrained mode, you will need to normalize the data using the parameters listed in the PyTorch documentation at [3].
Links
[1] http://pytorch.org/docs/master/torchvision/transforms.html
[2] https://stackoverflow.com/a/46865203/7387369
[3] http://pytorch.org/docs/master/torchvision/models.html
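(To make the fix concrete, here is a minimal sketch of a torchvision preprocessing pipeline, assuming a torchvision version that provides transforms.Resize; it resizes to 224x224 and applies the standard pretrained-model normalization from [3]:)

from torchvision import transforms

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),  # HWC PIL image -> CHW float tensor in [0, 1]
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])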
Related
I am very new to machine learning and Python in general. I'm working on a project that requires making an image classification model. I've read the data from my local disk using tf.keras.preprocessing.image_dataset_from_directory and now I'm trying to extract x_val and y_val to generate an sklearn.metrics.classification_report.
The issue is that whenever I call:
y_val = np.concatenate([y_val, np.argmax(y.numpy(), axis=-1)])
I get the following error, and I have no idea why or how to fix it:
  y_val = np.concatenate([y_val, np.argmax(y.numpy(), axis=-1)])
  File "<__array_function__ internals>", line 5, in concatenate
ValueError: all the input arrays must have same number of dimensions, but the array at index 0 has 1 dimension(s) and the array at index 1 has 0 dimension(s)
Here's my code:
# data is split into train and validation folders, with 6 folders in each representing a class, like this:
# data/train/hamburger/<hamburger train images in here>
# data/train/pizza/<pizza train images in here>
# data/validation/hamburger/<hamburger test images in here>
# data/validation/pizza/<pizza test images in here>
#training_dir = ......
validation_dir = pathlib.Path('path to data dir on local disk')
#hyperparams
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    training_dir,
    seed=123,
    image_size=(img_height, img_width),
    batch_size=batch_size)

val_ds = tf.keras.preprocessing.image_dataset_from_directory(
    validation_dir,
    seed=123,
    image_size=(img_height, img_width),
    batch_size=batch_size)
class_names = train_ds.class_names
print(class_names)
print(val_ds.class_names)
AUTOTUNE = tf.data.AUTOTUNE
train_ds = train_ds.cache().shuffle(1000).prefetch(buffer_size=AUTOTUNE)
val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE)
normalization_layer = layers.experimental.preprocessing.Rescaling(1./255)
resize_and_rescale = tf.keras.Sequential([
    layers.experimental.preprocessing.Resizing(img_height, img_width),
    layers.experimental.preprocessing.Rescaling(1./255)
])
#normalization, augmentation, model layers
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['accuracy'])
model.summary()
start_time = time.monotonic()
epochs = 1
history = model.fit(
    train_ds,
    validation_data=val_ds,
    epochs=epochs
)
#plot
#testing random image
test_path = pathlib.Path('C:/Users/adi/Desktop/New folder/downloads/hamburger/images(91).jpg')
img = keras.preprocessing.image.load_img(
    test_path, target_size=(img_height, img_width)
)
img_array = keras.preprocessing.image.img_to_array(img)
img_array = tf.expand_dims(img_array, 0) # Create a batch
predictions = model.predict(img_array)
score = tf.nn.softmax(predictions[0])
print(
    "This image most likely belongs to {} with a {:.2f} percent confidence."
    .format(class_names[np.argmax(score)], 100 * np.max(score))
)
x_val = np.array([])
y_val = np.array([])
for x, y in val_ds:
    x_val = np.concatenate([x_val, np.argmax(model.predict(x), axis=-1)])
    y_val = np.concatenate([y_val, np.argmax(y.numpy(), axis=-1)])  # <----- crashes here
print(classification_report(y_val, x_val, target_names=['doughnuts (Class 0)', 'french_fries (Class 1)', 'hamburger (Class 2)', 'hot_dog (Class 3)', 'ice_cream (Class 4)', 'pizza (Class 5)']))
Any ideas why I'm getting this error and how I can fix it? Or alternatively, how can I get what I need to make classification_report work? Thank you.
You don't need the argmax operation when getting the true classes.
Since you did not specify label_mode in tf.keras.preprocessing.image_dataset_from_directory, the labels default to sparse integers, which means they are not one-hot encoded.
If you had one-hot-encoded vector labels, your code above would be correct.
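(To see why the concatenate call fails, a tiny hedged illustration: with sparse labels, the y of a batch is already a 1-D vector of class indices, and argmax over its last axis collapses it to a 0-d scalar:)

import numpy as np

y_batch = np.array([2, 0, 5, 1])   # sparse labels: one class index per sample
np.argmax(y_batch, axis=-1)        # -> 2, a 0-d result (index of the max), not a vector
# np.concatenate([np.array([]), np.argmax(y_batch, axis=-1)])  # raises the ValueError from the traceback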
Another point: renaming your arrays as below makes things clearer, and when predicting one batch at a time you can use model(x), which is more efficient. The corrected code would be:
predicted_classes = np.array([])
labels = np.array([])
for x, y in val_ds:
    predicted_classes = np.concatenate([predicted_classes, np.argmax(model(x), axis=-1)])
    labels = np.concatenate([labels, y.numpy()])
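(With the arrays collected this way, the report call from the question works unchanged; for instance, reusing the class_names list defined earlier, assuming all six classes appear in the validation set:)

from sklearn.metrics import classification_report

print(classification_report(labels, predicted_classes, target_names=class_names))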
I am slightly losing my mind over a simple task. I want to implement a simple RandomForestClassifier on images using tf.estimator.BoostedTreesClassifier (a gradient boosted tree is good enough, although the difference is clear to me). I'm following the https://www.tensorflow.org/tutorials/estimator/boosted_trees_model_understanding guide. I swapped this input function from the guide:
# Use entire batch since this is such a small dataset.
NUM_EXAMPLES = len(y_train)

def make_input_fn(X, y, n_epochs=None, shuffle=True):
    def input_fn():
        dataset = tf.data.Dataset.from_tensor_slices((X.to_dict(orient='list'), y))
        if shuffle:
            dataset = dataset.shuffle(NUM_EXAMPLES)
        # For training, cycle through the dataset as many times as needed (n_epochs=None).
        dataset = (dataset
                   .repeat(n_epochs)
                   .batch(NUM_EXAMPLES))
        return dataset
    return input_fn
with my own function, which looks like this:
# LOADING IMAGES USING TENSORFLOW
labels = ['some', 'fancy', 'labels']
batch_size = 32

datagen = ImageDataGenerator(
    rescale=1. / 255,
    data_format='channels_last',
    validation_split=0.1,
    dtype=tf.float32
)

train_generator = datagen.flow_from_directory(
    './images',
    classes=labels,
    target_size=(128, 128),
    batch_size=batch_size,
    class_mode='categorical',
    shuffle=True,
    subset='training',
    seed=42
)

valid_generator = datagen.flow_from_directory(
    './images',
    classes=labels,
    target_size=(128, 128),
    batch_size=batch_size,
    class_mode='categorical',
    shuffle=False,
    subset='validation',
    seed=42
)
# THE SWAPPED FUNCTION:
NUM_FEATURES = 128 * 128
NUM_EXAMPLES = len(train_generator)

def make_input_fn(gen, n_epochs=None, shuffle=True):
    def input_fn():
        dataset = tf.data.Dataset.from_generator(gen, (tf.float32, tf.int32))
        if shuffle:
            dataset = dataset.shuffle(NUM_EXAMPLES)
        # For training, cycle through the dataset as many times as needed (n_epochs=None).
        dataset = (dataset
                   .repeat(n_epochs)
                   .batch(NUM_EXAMPLES))
        return dataset
    return input_fn

def _generator_(tf_gen):
    print(len(tf_gen))
    def arg_free():
        for _ in range(len(tf_gen)):
            X, y = next(iter(tf_gen))
            X = X.reshape((len(X), -1))
            print(X.shape)
            yield X, y
    return arg_free()

_gen = _generator_(train_generator)
print(callable(_gen))  # returns False. WHY?!
I don't understand why this is not working, and why on earth nobody ever thought about making a simple enough tutorial (or why I am not able to find one :D). If you are asking yourself why I want to use a random forest and not regular deep learning approaches: the RF is mandated by the supervising authority, and it has to be TF (and not e.g. sklearn).
Anyway, any help will be appreciated.
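(One hedged observation on the snippet above: _generator_ returns arg_free(), i.e. an already-created generator object, rather than the arg_free function itself, which is why callable(_gen) prints False. tf.data.Dataset.from_generator expects the callable; a minimal sketch of that contract:)

import tensorflow as tf

# A callable that, when invoked by tf.data, produces a fresh generator.
def gen():
    for i in range(3):
        yield ([0.0, 0.0, 0.0, 0.0], i)

# Pass gen itself, not gen() -- from_generator calls it internally.
dataset = tf.data.Dataset.from_generator(gen, (tf.float32, tf.int32))
for X, y in dataset:
    print(X.shape, y.numpy())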
I'm trying to run a simple MNIST neural net on multiple cluster nodes (3 nodes with 1 GPU each), but it keeps stopping before the first epoch prints. I'm able to get all the nodes to sync, but right before it starts (maybe in the model.fit function) it just stops and doesn't do anything.
Any help is appreciated!
My TF_CONFIG looks like this (with "index" set to 0, 1, or 2 on the respective worker):
TF_CONFIG='{"cluster": {"worker": ["ip1:88888", "ip2:88888", "ip3:88888"]}, "task": {"index": 0, "type": "worker"}}' python fileName.py
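(For reference, a hedged sketch of the equivalent in-process setup for worker 0, keeping the question's placeholder addresses; workers 1 and 2 would set "index" accordingly before any TensorFlow code runs:)

import json
import os

# Hypothetical: each worker exports its own TF_CONFIG with a unique task index.
os.environ["TF_CONFIG"] = json.dumps({
    "cluster": {"worker": ["ip1:88888", "ip2:88888", "ip3:88888"]},
    "task": {"type": "worker", "index": 0},
})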
And my code looks like this:
import tensorflow as tf
from tensorflow import keras
import time
def get_compiled_model():
    # Make a simple 2-layer densely-connected neural network.
    inputs = keras.Input(shape=(784,))
    x = keras.layers.Dense(256, activation="relu")(inputs)
    x = keras.layers.Dense(256, activation="relu")(x)
    outputs = keras.layers.Dense(10)(x)
    model = keras.Model(inputs, outputs)
    model.compile(
        optimizer=keras.optimizers.Adam(),
        loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=[keras.metrics.SparseCategoricalAccuracy()],
    )
    return model
def get_dataset():
    batch_size = 32
    num_val_samples = 10000

    # Return the MNIST dataset in the form of a `tf.data.Dataset`.
    (x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

    # Preprocess the data (these are Numpy arrays)
    x_train = x_train.reshape(-1, 784).astype("float32") / 255
    x_test = x_test.reshape(-1, 784).astype("float32") / 255
    y_train = y_train.astype("float32")
    y_test = y_test.astype("float32")

    # Reserve num_val_samples samples for validation
    x_val = x_train[-num_val_samples:]
    y_val = y_train[-num_val_samples:]
    x_train = x_train[:-num_val_samples]
    y_train = y_train[:-num_val_samples]

    return (
        tf.data.Dataset.from_tensor_slices((x_train, y_train)).batch(batch_size),
        tf.data.Dataset.from_tensor_slices((x_val, y_val)).batch(batch_size),
        tf.data.Dataset.from_tensor_slices((x_test, y_test)).batch(batch_size),
    )
# Create a MultiWorkerMirroredStrategy.
strategy = tf.distribute.experimental.MultiWorkerMirroredStrategy()
print("Number of devices: {}".format(strategy.num_replicas_in_sync))

# Open a strategy scope.
with strategy.scope():
    # Everything that creates variables should be under the strategy scope.
    # In general this is only model construction & `compile()`.
    model = get_compiled_model()
# Train the model on all available devices.
train_dataset, val_dataset, test_dataset = get_dataset()
start = time.time()
print("Fit")
model.fit(train_dataset, epochs=1, verbose=1, validation_data=val_dataset, steps_per_epoch=25)
end = time.time()
print("Time:", end - start)
# Test the model on all available devices.
model.evaluate(test_dataset)
I want to use TFRecords in order to train a model on ImageNet (code below). All is well, and training goes just fine, except that validation seems to be performed on the training set and not the validation set.
I tested this by dumbing the training set down and doing nothing to the validation set, which led to both training and validation accuracy being 100%.
When I use fit_generator and flow_from_directory there is no such issue.
This is how I create the TFRecordDatasets:
class ImageNetTFRecordDataset:
    def __init__(self, dir, batch_size):
        self._dir = dir
        self._batch_size = batch_size

    @staticmethod
    def _parse_function(proto):
        # read image
        keys_to_features = {'image/encoded': tf.FixedLenFeature([], tf.string),
                            'image/class/label': tf.FixedLenFeature([], tf.int64)}
        # Load one example
        parsed_features = tf.parse_single_example(proto, keys_to_features)
        # Turn your saved image string into an array
        parsed_features['image/encoded'] = tf.image.decode_jpeg(parsed_features['image/encoded'])
        parsed_features['image/encoded'] = tf.image.convert_image_dtype(parsed_features['image/encoded'], tf.float32)
        parsed_features['image/encoded'] = tf.image.per_image_standardization(parsed_features['image/encoded'])
        return parsed_features['image/encoded'], parsed_features['image/class/label']
    def create_dataset(self):
        dir_files = os.listdir(self._dir)
        dataset = tf.data.TFRecordDataset([os.path.join(self._dir, f) for f in dir_files])

        # This dataset will go on forever
        dataset = dataset.repeat()

        # Set the number of datapoints you want to load and shuffle
        # dataset = dataset.shuffle(self._batch_size * 10)

        # Maps the parser on every filepath in the array. You can set the number of parallel loaders here
        dataset = dataset.map(self._parse_function, num_parallel_calls=8)

        # Set the batch size
        dataset = dataset.batch(self._batch_size)
        dataset = dataset.prefetch(4)

        # Create an iterator
        iterator = dataset.make_one_shot_iterator()

        # Create your tf representation of the iterator
        image, label = iterator.get_next()

        # Bring your picture back in shape
        image = tf.reshape(image, [-1, 224, 224, 3])

        # Create a one-hot array for your labels
        label = tf.one_hot(label, 1000)

        return image, label
And this is how I train the model:
train_dataset = ImageNetTFRecordDataset(train_dir, BATCH_SIZE)
val_dataset = ImageNetTFRecordDataset(val_dir, BATCH_SIZE)

x, y = train_dataset.create_dataset()
x_val, y_val = val_dataset.create_dataset()

model = get_model(x)  # this model's first layer is Input(tensor=x)
model.compile(optimizer=OPTIMIZER, loss=categorical_crossentropy,
              metrics=get_metrics(), target_tensors=[y])
model.fit(epochs=EPOCHS, verbose=1, validation_data=(x_val, y_val),
          steps_per_epoch=150, callbacks=get_callbacks(),
          validation_steps=VAL_STEPS)
Any ideas where things go wrong? Why does the validation part feed off the training set?
Right now I'm using Keras with the TensorFlow backend.
The dataset is stored in the tfrecords format. Training without any validation set works, but how do I integrate my validation tfrecords as well?
Let's assume this code as a coarse skeleton:
def _ds_parser(proto):
    features = {
        'X': tf.FixedLenFeature([], tf.string),
        'Y': tf.FixedLenFeature([], tf.string)
    }
    parsed_features = tf.parse_single_example(proto, features)

    # get the data back as float32
    parsed_features['X'] = tf.decode_raw(parsed_features['X'], tf.float32)
    parsed_features['Y'] = tf.decode_raw(parsed_features['Y'], tf.float32)

    return parsed_features['X'], parsed_features['Y']

def datasetLoader(dataSetPath, batchSize):
    dataset = tf.data.TFRecordDataset(dataSetPath)

    # Maps the parser on every filepath in the array. You can set the number of parallel loaders here
    dataset = dataset.map(_ds_parser, num_parallel_calls=8)

    # This dataset will go on forever
    dataset = dataset.repeat()

    # Set the batch size
    dataset = dataset.batch(batchSize)

    # Create an iterator
    iterator = dataset.make_one_shot_iterator()

    # Create your tf representation of the iterator
    X, Y = iterator.get_next()

    # Bring the data back in shape
    X = tf.reshape(X, [-1, 66, 198, 3])
    Y = tf.reshape(Y, [-1, 1])

    return X, Y
X, Y = datasetLoader('PATH-TO-DATASET', 264)

model_X = keras.layers.Input(tensor=X)
model_output = keras.layers.Conv2D(filters=16, kernel_size=3, strides=1, padding='valid', activation='relu',
                                   input_shape=(-1, 66, 198, 3))(model_X)
model_output = keras.layers.Dense(units=1, activation='linear')(model_output)

model = keras.models.Model(inputs=model_X, outputs=model_output)
model.compile(
    optimizer=optimizer,
    loss='mean_squared_error',
    target_tensors=[Y]
)
parallel_model.fit(
    epochs=epochs,
    steps_per_epoch=stepPerEpoch,
    shuffle=False,
    validation_data=????
)
The question is, how to pass the validation set?
I have found something related here: gcloud-ml-engine-with-keras, but I'm not sure how to fit this into my problem.
First, you don't need to use an iterator. A Keras model will accept a dataset object instead of separate data/label parameters and will handle the iteration. You only need to specify steps_per_epoch, hence you need to know the dataset size. If you have separate tfrecords files for train/validation, then you can just create a dataset object and pass it to validation_data. If you have one file and you'd like to split it, you can do:
dataset = tf.data.TFRecordDataset('file.tfrecords')
dataset_train = dataset.take(size)
dataset_val = dataset.skip(size)
...
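(A hedged sketch of the full wiring under this approach, with hypothetical file names and a hypothetical validationSteps count; the parser from the question is reused, and the iterator/target_tensors plumbing disappears:)

# Hypothetical paths; a per-example reshape to (66, 198, 3) would be mapped in as well.
train_ds = tf.data.TFRecordDataset('train.tfrecords').map(_ds_parser).repeat().batch(264)
val_ds = tf.data.TFRecordDataset('val.tfrecords').map(_ds_parser).repeat().batch(264)

model.fit(train_ds,
          epochs=epochs,
          steps_per_epoch=stepPerEpoch,
          validation_data=val_ds,
          validation_steps=validationSteps)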
OK, I found the answer myself: basically it's done by simply changing import keras to import tensorflow.keras as keras. tf.keras also allows you to pass the validation set as tensors:
X, Y = datasetLoader('PATH-TO-DATASET', 264)
X_val, Y_val = datasetLoader('PATH-TO-VALIDATION-DATASET', 264)

# ... define and compile the model like above

parallel_model.fit(
    epochs=epochs,
    steps_per_epoch=STEPS_PER_EPOCH,
    shuffle=False,
    validation_data=(X_val, Y_val),
    validation_steps=STEPS_PER_VALIDATION_EPOCH
)