Error : Could not read image in Google Colab


I am following this tutorial to build a custom object detection model with Detecto:
https://www.analyticsvidhya.com/blog/2021/06/simplest-way-to-do-object-detection-on-custom-datasets/
I have collected and labelled my images, put them on my Drive and I am running the following code snippet to train the model which is part of a Python Notebook on Google Colab:
from detecto import core
import matplotlib.pyplot as plt

# custom_transforms is defined earlier in the notebook (not shown here)
Train_dataset = core.Dataset('/content/drive/My Drive/training model/Training', transform=custom_transforms)
Test_dataset = core.Dataset('/content/drive/My Drive/training model/Test')
loader = core.DataLoader(Train_dataset, batch_size=2, shuffle=True)
model = core.Model(['black car', 'grey car', 'white truck'])
losses = model.fit(loader, Test_dataset, epochs=25, lr_step_size=5, learning_rate=0.001, verbose=True)
plt.plot(losses)
plt.show()
However, I keep getting the following error shortly after the first training epoch starts:
ValueError: Could not read image /content/drive/My Drive/training model/Training/frame22.jpg
The error appears at random, not only for frame22 but also for other frames that are not present in this directory. I tried remounting my Drive with force_remount enabled at the beginning of the script, but the error persists.

I checked the core.Dataset implementation in Detecto, and it confirms what I said in my comments.
The index is built by collecting all the .xml annotation files and mapping each one to its image. It does not check that the image actually exists.
For the image filename, it uses the name stored inside the XML file, not the name of the XML file itself. In an annotation XML file, this is the filename element. If you rename an image, you also need to change the name inside the XML file.
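The check that the dataset skips can be done by hand before training. The following is a minimal sketch, not Detecto code: it assumes Pascal VOC-style annotations with a filename element and that images sit in the same folder as the XML files unless image_dir is given. The function name find_missing_images is hypothetical.

```python
import os
import xml.etree.ElementTree as ET


def find_missing_images(annotation_dir, image_dir=None):
    """Return (xml_file, referenced_image) pairs whose image file is absent."""
    image_dir = image_dir or annotation_dir
    missing = []
    for name in sorted(os.listdir(annotation_dir)):
        if not name.lower().endswith(".xml"):
            continue
        root = ET.parse(os.path.join(annotation_dir, name)).getroot()
        # Assumption: the image name is stored in a <filename> element.
        referenced = (root.findtext("filename") or "").strip()
        if not referenced or not os.path.isfile(os.path.join(image_dir, referenced)):
            missing.append((name, referenced))
    return missing


# Example call with the folder from the question:
# for xml_file, image in find_missing_images('/content/drive/My Drive/training model/Training'):
#     print(xml_file, '->', image)
```

Any pair it reports points to an annotation whose image was renamed, moved, or never uploaded.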

Related

Is there a good way to write 2d arrays or tensors to TFRecords in Tensorflow?

I am currently working on a project using audio data. The first step of the project is to use another model to produce features for the audio example that are about [400 x 10_000] for each wav file and each wav file will have a label that I'm trying to predict. I will then build another model on top of this to produce my final result.
I don't want to run preprocessing every time I run the model, so my plan was to have a preprocessing pipeline that runs the feature extraction model and saves it into a new folder and then I can just have the second model use the saved features directly. I was looking at using TFRecords, but the documentation is quite unhelpful.
References: tf.io.serialize_tensor, tfrecord
This is what I've come up with to test it so far:
serialized_features = tf.io.serialize_tensor(features)
feature_of_bytes = tf.train.Feature(
    bytes_list=tf.train.BytesList(value=[serialized_features.numpy()]))
features_for_example = {
    'feature0': feature_of_bytes
}
example_proto = tf.train.Example(
    features=tf.train.Features(feature=features_for_example))
filename = 'test.tfrecord'
writer = tf.io.TFRecordWriter(filename)
writer.write(example_proto.SerializeToString())
filenames = [filename]
raw_dataset = tf.data.TFRecordDataset(filenames)
for raw_record in raw_dataset.take(1):
    example = tf.train.Example()
    example.ParseFromString(raw_record.numpy())
    print(example)
But I'm getting this error:
tensorflow.python.framework.errors_impl.DataLossError: truncated record at 0' failed with Read less bytes than requested
tl;dr:
Getting the above error with TFRecords. Any recommendations to get this example working or another solution not using TFRecords?
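One possible cause, offered as an assumption rather than a confirmed diagnosis: the writer in the snippet is never closed, so the record may not have been flushed to disk when TFRecordDataset reads the file. A minimal sketch under that assumption, using TensorFlow 2.x in eager mode and a small random array as a stand-in for the real features; the helper names write_example and read_first are hypothetical.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

# Small stand-in for the [400 x 10_000] feature array.
features = np.random.rand(4, 10).astype(np.float32)


def write_example(path, array):
    serialized = tf.io.serialize_tensor(tf.constant(array))
    example = tf.train.Example(features=tf.train.Features(feature={
        "feature0": tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[serialized.numpy()])),
    }))
    # Leaving the with-block closes the writer, which flushes the record to disk.
    with tf.io.TFRecordWriter(path) as writer:
        writer.write(example.SerializeToString())


def read_first(path):
    for raw_record in tf.data.TFRecordDataset([path]).take(1):
        example = tf.train.Example()
        example.ParseFromString(raw_record.numpy())
        blob = example.features.feature["feature0"].bytes_list.value[0]
        # parse_tensor reverses serialize_tensor; the dtype has to be given.
        return tf.io.parse_tensor(blob, out_type=tf.float32).numpy()


path = os.path.join(tempfile.mkdtemp(), "test.tfrecord")
write_example(path, features)
restored = read_first(path)
```

Calling writer.close() explicitly after the last write should have the same effect as the with-block.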

Not able to save/write sequence of images into specific folder using OpenCV2

I am trying to save a sequence of predicted images into a specific folder, but it's not working. The code below runs, but it does not write the images into the "results" folder.
save_image_path = f"results/{image_name}"
cv2.imwrite(save_image_path, cat_image)
I also tried the full folder path, but that does not work either and raises an error.
save_image_path = r'D:\Medical Imaging\Code\segmentation\results'
cv2.imwrite(save_image_path/{image_name}, cat_image)
I also tried it without an extension, and an error occurs.
save_image_path = r'D:\Medical Imaging\Code\segmentation\results'
cv2.imwrite(save_image_path, cat_image)
I am using the PyCharm IDE. Please suggest a fix or guide me if possible.

You need to specify the file extension (".jpg", ".png", ...), for example:
save_image_path = f"results/{image_name}.jpg"
cv2.imwrite(save_image_path, cat_image)
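A related pitfall, as far as I know: cv2.imwrite returns False instead of raising when it cannot write the file, for example when the target folder does not exist. The sketch below is one way to build a safe output path. The helper build_output_path and its list of known extensions are illustrative, not part of OpenCV.

```python
from pathlib import Path

# Extensions treated as "already has an image extension" (illustrative list).
KNOWN_EXTS = {".png", ".jpg", ".jpeg", ".bmp", ".tif", ".tiff"}


def build_output_path(folder, image_name, default_ext=".png"):
    """Create `folder` if needed and return a path ending in an image extension."""
    out_dir = Path(folder)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / image_name
    if path.suffix.lower() not in KNOWN_EXTS:
        # Append rather than replace, so names like "frame.1" keep their dot part.
        path = path.with_name(path.name + default_ext)
    return path


# Usage with OpenCV (cat_image and image_name come from the question):
# save_image_path = build_output_path("results", image_name)
# if not cv2.imwrite(str(save_image_path), cat_image):
#     raise IOError(f"Could not write {save_image_path}")
```

Checking the return value makes a silent failure visible, which matches the "runs but saves nothing" symptom in the question.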

vocab_model.docvecs.doctag_syn0.npy not generated after saving doc2vec model

The code is as follows:
modelDoc = Doc2Vec(size=300, window=5, dm=0, dbow_words=1, hs=0, negative=10, alpha=0.05, min_count=20,
                   workers=cores, sample=1e-5, seed=0, iter=10)
modelDoc.build_vocab(finalSent)
modelDoc.save(save_model)
My versions:
gensim==3.8.1
numpy==1.16.2
After saving the model, only the vocab_model file is generated; vocab_model.docvecs.doctag_syn0.npy is not.
What is this file used for, and is it necessary to generate it?

Were any errors shown during the .save()?
Does the saved file load and work as expected? (In this case, since the original model wasn't trained, does it train as if the save-then-load hadn't happened?)
If there's no error and it works, it's fine.
(What's the reason that a file of this name was expected, and why was its absence a concern?)

Keras flow_from_dataframe gives 0 images

I am trying to use the flow_from_dataframe method of Keras to read training and testing images.
Both my training and testing images are in same directory, and I read the paths from two different csv files.
My code for reading test images looks like,
# Read test file
testdf = pd.read_csv("test.csv")
# load images
test_datagen = ImageDataGenerator(rescale=1./255)
test_generator = test_datagen.flow_from_dataframe(
    dataframe=testdf, directory=IMAGE_PATH,
    x_col='image_name', y_col=None,
    has_ext=True, target_size=(10, 10),
    batch_size=32, color_mode='rgb', shuffle=False, class_mode=None)
I get output like this
Found 0 images.
Similar code for reading the training data works properly. I checked that the images exist at the given path, and they do. What are some possible reasons for this error? How can I debug the issue?
EDIT: This is a regression task, so all images are in a single directory, and not in subdirectories, as would be expected for a classification task.
EDIT 2: I added usecols=[0] to read_csv, and now test_datagen finds all the images in the directory, not just the ones that are mentioned in the test.csv file.

The issue is caused by NaNs in the dataframe. Ignoring those columns doesn't work. The solution is to replace the NaNs with something else. For example:
testdf = pd.read_csv("test.csv")
testdf.fillna(0, inplace=True)
This replaces the NaNs with 0. After that, using ImageDataGenerator as usual works.
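To see whether NaNs are present at all, it can help to count them per column first. A minimal sketch with a made-up frame standing in for test.csv; the column names and values are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for test.csv: a filename column plus a partly empty column.
testdf = pd.DataFrame({
    "image_name": ["a.png", "b.png", "c.png"],
    "score": [0.5, np.nan, 0.7],
})

# Number of missing values in each column.
nan_counts = testdf.isna().sum()
print(nan_counts)

# Replace the missing values before handing the frame to flow_from_dataframe.
cleaned = testdf.fillna(0)
```

If the counts are all zero, NaNs are not the cause and the filename or path checks are the next thing to try.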

I was also facing the same error and found a solution for this.
I was using an absolute path and the correct DataFrame, and everything looked fine, yet the code still threw an "image not found" error.
On inspection, I found that my DataFrame contained image names without an extension, while the image files in the folder had one.
E.g., the image name in the DataFrame was 'abc', but the file in the folder was named 'abc.png'.
Just add .png to the image names in the DataFrame and it will solve your problem.
The code below worked for me:
def append_ext(fn):
    return fn + ".png"
train_valid_data["id_code"] = train_valid_data["id_code"].apply(append_ext)
test_data["id_code"] = test_data["id_code"].apply(append_ext)
Let me know if it solves your problem or if you need any further explanation.

I have the same problem. First, make sure the absolute path passed to the directory parameter is correct.
The filename in my df has value image.pgm.png and the actual image file in the folder has the format image.pgm.
I tried to change the filename in df to image.pgm => Still not working
I renamed the image file from image.pgm to image.pgm.png which matches exactly the format in the df => Worked!
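These answers come down to the same requirement: each value in the filename column, joined to directory, has to name a file that actually exists. A minimal sketch of that check using only the standard library; split_by_existence is a hypothetical helper, not a Keras function.

```python
import os


def split_by_existence(filenames, directory):
    """Split names into those that exist as files under `directory` and those that do not."""
    found, missing = [], []
    for name in filenames:
        full_path = os.path.join(directory, str(name))
        (found if os.path.isfile(full_path) else missing).append(name)
    return found, missing


# Example call with the names from the question:
# found, missing = split_by_existence(testdf["image_name"], IMAGE_PATH)
# print(len(found), "found;", "first missing:", missing[:5])
```

Looking at a few entries of missing usually shows whether the mismatch is a missing extension, a doubled extension, or a wrong directory.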

I had the same error.
What I found is that the directory path was wrong and that the image extension was missing from the data frame.
So make sure that your directory path is correct and that your image names include an extension. You can do the following:
def extension_train_data(x):
    return x + ".jpg"
Change the .jpg extension if yours is different.
Then apply this to your data frame:
train_data['image'] = train_data['image_id'].apply(extension_train_data)
Once the image column contains the image names with their extension:
train_generator = datagen.flow_from_dataframe(
    train_data,
    directory="/kaggle/input/plant-pathology-2020-fgvc7/images/",
    x_col="image",
    y_col="label",
    target_size=size,
    class_mode="binary",
    batch_size=batch_size,
    subset="training",
    shuffle=True,
    seed=42,
)

Okay, so I have been having the same issue. My data labels were in a CSV file and the image data in a separate folder. I thought the issue was caused by the labels and the images in the folder not aligning properly. I did a whole bunch of stuff to rectify and process the data, but that was not the problem.
So, for anyone who's having issues:
I tried @Oussama Ouardini's answer and it worked. Thank you!
I would also add that if you are doing a train and validation split, make sure the initial ImageDataGenerator object you create has the validation split specified.
def extension_train_data(x):
    return "xc" + str(x) + ".png"
train_df['file_id'] = train_df['file_id'].apply(extension_train_data)
Here is my code:
datagen = ImageDataGenerator(rescale=1./255, validation_split=0.2)
# Rescale all pixel values from the 0-255 range, so that after this step
# all pixel values are in the range (0, 1)
train_generator = datagen.flow_from_dataframe(
    dataframe=train_df, directory='./img_data/', x_col="file_id", y_col="english_cname",
    class_mode="categorical", save_to_dir='./new folder/',
    target_size=(64, 64), subset="training",
    seed=42, batch_size=32, shuffle=False)
val_generator = datagen.flow_from_dataframe(
    dataframe=train_df, directory='./img_data/', x_col="file_id", y_col="english_cname",
    class_mode="categorical",
    target_size=(64, 64), subset="validation",
    seed=42, batch_size=32, shuffle=False)
print("\n Sanity check Line.--------")
My output showed successfully validated image files. :)
Found 212 validated image filenames belonging to 88 classes.
Found 52 validated image filenames belonging to 88 classes.
Sanity check Line.----------
I hope someone will find this useful. Cheers!

Export Tensorflow Estimator

I'm trying to build a CNN with TensorFlow (r1.4) based on the tf.estimator API. It's a canned model. The idea is to train and evaluate the network with the estimator in Python and to run prediction in C++ without the estimator, by loading a .pb file generated after training.
My first question is: is this possible?
If yes: the training part works, and the prediction part works too with a .pb file generated without the estimator, but it doesn't work when I load a .pb file exported from the estimator.
I get this error: "Data loss: Can't parse saved_model.pb as binary proto"
My Python code to export the model:
feature_spec = {'input_image': parsing_ops.FixedLenFeature(dtype=dtypes.float32, shape=[1, 48 * 48])}
export_input_fn = tf.estimator.export.build_parsing_serving_input_receiver_fn(feature_spec)
input_fn = tf.estimator.inputs.numpy_input_fn(self.eval_features,
                                              self.eval_label,
                                              shuffle=False,
                                              num_epochs=1)
eval_result = self.model.evaluate(input_fn=input_fn, name='eval')
exporter = tf.estimator.FinalExporter('save_model', export_input_fn)
exporter.export(estimator=self.model, export_path=MODEL_DIR,
                checkpoint_path=self.model.latest_checkpoint(),
                eval_result=eval_result,
                is_the_final_export=True)
It doesn't work with tf.estimator.Estimator.export_savedmodel() either.
If anyone knows a clear tutorial on estimators with canned models and how to export them, I'm interested.

Please look at this issue on GitHub; it looks like you have the same problem. Apparently (at least when using estimator.export_savedmodel), you should load the graph with LoadSavedModel instead of ReadBinaryProto, because it is not saved as a GraphDef file.
Here are some more instructions on how to use it:
const string export_dir = ...
SavedModelBundle bundle;
...
LoadSavedModel(session_options, run_options, export_dir, {kSavedModelTagTrain},
               &bundle);
I can't seem to find the SavedModelBundle documentation for C++ to use it afterwards, but it's likely close to the same class in Java, in which case it basically contains the session and the graph you'll be using.
