Memory error while reading lots of images - python

I am reading 10,000 images with OpenCV in Python:
import os
import cv2
for filename in os.listdir(directory):
    img = cv2.imread(os.path.join(directory, filename))
    # Various image preprocessing functions are applied after reading
    # After about 700 images have been preprocessed, it raises a memory error
How can I resolve this error?

We overcame this problem by reading each image into a single global variable and emptying that variable once we had extracted the information we needed from it.
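A minimal sketch of that idea, assuming only a small piece of information is needed from each image. The names extract_from_images, load, extract and collect_every are hypothetical; load stands in for something like cv2.imread, and the gc.collect() call is an optional extra rather than part of the original answer.

```python
import gc
from typing import Callable, Iterable, List, TypeVar

Image = TypeVar("Image")
Info = TypeVar("Info")


def extract_from_images(
    paths: Iterable[str],
    load: Callable[[str], Image],
    extract: Callable[[Image], Info],
    collect_every: int = 100,
) -> List[Info]:
    """Load one image at a time and keep only the extracted information."""
    results: List[Info] = []
    for index, path in enumerate(paths, start=1):
        img = load(path)              # assumed loader, e.g. cv2.imread
        results.append(extract(img))  # keep the small result, not the image
        img = None                    # drop the reference to the pixel data
        if index % collect_every == 0:
            gc.collect()              # optional: reclaim cyclic garbage early
    return results
```

The point of the design is that at most one decoded image is alive at any time, so memory use should not grow with the number of files as long as extract returns something small.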

Related

tf.image.decode_jpeg often taking forever to load file

The following code is part of my code for a tf graph to read images. When I use it to iterate through the data, the program hangs forever in tf.io.read_file(path) after a few hundred images and does nothing. The code cannot even be interrupted, so I had to restart the session every time.
#tf.function()
def read_image(path):
    image = tf.io.read_file(path)
    image = tf.image.decode_jpeg(image)
    return image
...
div8k_list=[os.path.join(div8k_save_path, x) for x in os.listdir(div8k_save_path)]
train_path = tf.data.Dataset.from_tensor_slices(div8k_list)
train_images = train_path.map(read_image, num_parallel_calls=tf.data.AUTOTUNE)
I first suspected that there were a few corrupted images or wrong paths in the data that were causing this problem and tested the following code.
for path in train_path:
    print(path)
    image = tf.io.read_file(path)
    image = tf.image.decode_jpeg(image)
Surprisingly, the paths at which the loop got stuck had nothing in common. The image itself was not the problem either: the loop once got stuck at 1056.png, but when I loaded 1056.png explicitly, there was no problem.
What could be the cause of this problem?
edit: to summarize, the program is stuck at read_image forever, while I couldn't find a problem in the dataset.
My dataset is the DIV8K dataset and I am running in COLAB.
EDIT The function that is slowing my code is decode_jpeg, because the following definition of read_image worked multiple times.
#tf.function()
def read_image(path):
    image = tf.io.read_file(path)
    image = tf.image.decode_jpeg(image)
    return image
As mentioned in the comments, try the following function to decode the image file, since it can handle mixed file formats (jpg, png, etc.):
tf.io.decode_image(image, expand_animations = False)
However, decode_jpeg should now be able to handle .png files as well. Without the image file, it's hard to pin down what is causing this. Most likely the file is corrupted in some way, or it is not actually in a format that decode_jpeg accepts, even though its name suggests so.
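To check the "wrong extension" possibility without TensorFlow, one option is to compare each file's leading bytes with the well-known signatures of common image formats. This is only a sketch: the helper names sniff_image_format and extension_matches_content are hypothetical, and a matching signature does not prove that the rest of the file is intact.

```python
import os
from typing import Optional

# Leading "magic" bytes of a few common image formats.
SIGNATURES = {
    "jpeg": b"\xff\xd8\xff",
    "png": b"\x89PNG\r\n\x1a\n",
    "gif": b"GIF8",
    "bmp": b"BM",
}


def sniff_image_format(header: bytes) -> Optional[str]:
    """Return the format name whose signature starts the header, if any."""
    for name, magic in SIGNATURES.items():
        if header.startswith(magic):
            return name
    return None


def extension_matches_content(path: str) -> bool:
    """True if the file extension agrees with the detected signature."""
    with open(path, "rb") as f:
        detected = sniff_image_format(f.read(16))
    ext = os.path.splitext(path)[1].lower().lstrip(".")
    if ext == "jpg":
        ext = "jpeg"
    return detected is not None and detected == ext
```

Running extension_matches_content over the dataset folder would list files whose name and content disagree, which narrows down where to look.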

update data set after extracting features from face in python using pickle

I am new to Python and am following this article https://www.mygreatlearning.com/blog/face-recognition/#whatsopencv to extract features from faces. After trying it, I realised that it loops over every image in the Images directory and then saves the result in face_enc:
datas = {"encodings": knownEncodings, "names": knownNames}
f = open("face_enc", "wb")
f.write(pickle.dumps(datas))
f.close()
What confuses me is this: say I have 50 images in the Images directory and then add another 100 (just as an example). I would have to loop over all of them again from the start (images 1-150) and then save the result in face_enc. Is there a way to update the data in face_enc without recomputing everything from the start, to save time?
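One possible approach, sketched under assumptions: load the existing pickle, skip the files that were already processed, encode only the new ones, and write the file back. The "files" key, the function name update_encodings, and the encode and name_for callables are all hypothetical; encode stands in for whatever the article uses to turn an image path into a list of face encodings.

```python
import os
import pickle
from typing import Callable, List


def update_encodings(
    image_dir: str,
    store_path: str,
    encode: Callable[[str], List[object]],
    name_for: Callable[[str], str],
) -> int:
    """Encode only images not yet recorded in the store; return how many were new."""
    data = None
    if os.path.exists(store_path):
        with open(store_path, "rb") as f:
            data = pickle.load(f)
    # "files" is an assumed extra key. A store without it cannot tell which
    # images were already processed, so it is rebuilt once from scratch.
    if data is None or "files" not in data:
        data = {"encodings": [], "names": [], "files": []}

    seen = set(data["files"])
    added = 0
    for filename in sorted(os.listdir(image_dir)):
        if filename in seen:
            continue
        path = os.path.join(image_dir, filename)
        for encoding in encode(path):  # one entry per detected face
            data["encodings"].append(encoding)
            data["names"].append(name_for(path))
        data["files"].append(filename)
        added += 1

    with open(store_path, "wb") as f:
        pickle.dump(data, f)
    return added
```

With this layout, the second run over 150 images would only call encode for the 100 new files. The store file should live outside image_dir so that it is not picked up as an image.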

Augmenting images in a dataset - encountering ValueError: Could not find a format to read the specified file in mode 'i'

I'm in a beginner neural networks class and am really struggling.
I have a dataset of images that isn't big enough to train my network with, so I'm trying to augment them (rotate/noise addition etc.) and add the augmented images onto the original set. I'm following the code found on Medium: https://medium.com/#thimblot/data-augmentation-boost-your-image-dataset-with-few-lines-of-python-155c2dc1baec
However, I'm encountering ValueError: Could not find a format to read the specified file in mode 'i'
Not sure what this error means or how to go about solving it. Any help would be greatly appreciated.
import os
import random
from scipy import ndarray
import skimage as sk
from skimage import transform
from skimage import util
path1 = "/Users/.../"
path2 = "/Users/.../"
listing = os.listdir(path1)
num_files_desired = 1000
image = [os.path.join(path2, f) for f in os.listdir(path2) if os.path.isfile(os.path.join(path2, f))]
num_generated_files = 0
while num_generated_files <= num_files_desired:
    image_path = random.choice(image)
    image_to_transform = sk.io.imread(image_path)
137 if format is None:
138 raise ValueError(
--> 139 "Could not find a format to read the specified file " "in mode %r" % mode
140 )
141
ValueError: Could not find a format to read the specified file in mode 'i'
I can see a few possibilities. Before going through them, let me explain the error: it basically indicates that one of your files cannot be read by sk.io.imread(). Here are the things you can try:
The [os.path.join(path2, f) for f in os.listdir(path2) if os.path.isfile(os.path.join(path2, f))] part may not be producing the image paths correctly. Check its output by hand. If it is wrong, you can point to the exact folder directly instead of building the list that way: simply use os.listdir() and read the files yourself.
You can also use glob to read only the files that share an extension, such as .jpg.
Your files may be corrupted. You can weed those out with PIL: open each image with image = Image.open() first, then call the image.verify() method.
Read up on the plugin argument of sk.io.imread(filename, plugin=...); choosing a plugin may resolve your issue.
Hope it helps.
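As a sketch of the first two suggestions, the directory listing can be filtered so that only files with a known image extension are ever passed to the reader. The name list_image_paths and the extension list are assumptions. A folder listing can include hidden, non-image files (on macOS, for example, .DS_Store), and whether that is what triggers the error here is a guess.

```python
import os
from typing import List, Tuple

# Assumed set of extensions; adjust to whatever the dataset contains.
IMAGE_EXTENSIONS: Tuple[str, ...] = (".jpg", ".jpeg", ".png", ".bmp", ".tif", ".tiff")


def list_image_paths(
    folder: str, extensions: Tuple[str, ...] = IMAGE_EXTENSIONS
) -> List[str]:
    """Return sorted paths of regular, non-hidden files with an image extension."""
    paths = []
    for name in sorted(os.listdir(folder)):
        if name.startswith("."):  # skip hidden files such as .DS_Store
            continue
        path = os.path.join(folder, name)
        if os.path.isfile(path) and name.lower().endswith(extensions):
            paths.append(path)
    return paths
```

The result could replace the list comprehension that builds image in the question, so that random.choice only ever picks a file that is expected to be readable.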

What is the best way to load a CCITT T.3 compressed tiff using python?

I am trying to load a CCITT T.3 compressed tiff into python, and get the pixel matrix from it. It should just be a logical matrix.
I have tried using pylibtiff and PIL, but when I load it with them, the matrix it returns is empty. I have read in a lot of places that these two tools support loading CCITT but not accessing the pixels.
I am open to converting the image, as long as I can get the logical matrix from it and do it in Python code. The crazy thing is that if I open one of my images in Paint, save it without altering it, and then try to load it with pylibtiff, it works. Paint re-compresses it with LZW compression.
So I guess my real question is: is there a way to either natively load CCITT images into matrices or convert the images to LZW using Python?
Thanks,
tylerthemiler
It seems the best way is to not use Python entirely but lean on netpbm:
import Image
import ImageFile
import subprocess

tiff = 'test.tiff'
im = Image.open(tiff)
print 'size', im.size
try:
    print 'extrema', im.getextrema()
except IOError as e:
    print 'help!', e, '\n'

print 'I Get by with a Little Help from my Friends'
pbm_proc = subprocess.Popen(['tifftopnm', tiff],
                            stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)
(pbm_data, pbm_error) = pbm_proc.communicate()
ifp = ImageFile.Parser()
ifp.feed(pbm_data)
im = ifp.close()
print 'conversion message', pbm_error,
print 'extrema', im.getextrema()
print 'size', im.size
# houston: we have an image
im.show()
Seems to do the trick:
$ python g3fax.py
size (1728, 2156)
extrema help! decoder group3 not available
I Get by with a Little Help from my Friends
conversion message tifftopnm: writing PBM file
extrema (0, 255)
size (1728, 2156)
How about running tiffcp with subprocess to convert to LZW (the -c lzw switch), then processing normally with pylibtiff? There are Windows builds of tiffcp lying around on the web. Not exactly a Python-native solution, but still...
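A minimal sketch of that suggestion, assuming the tiffcp tool from libtiff is installed and on the PATH. The function names and the file names in the usage comment are hypothetical.

```python
import subprocess
from typing import List


def build_tiffcp_command(src: str, dst: str) -> List[str]:
    """Build the tiffcp call that rewrites src with LZW compression into dst."""
    return ["tiffcp", "-c", "lzw", src, dst]


def convert_to_lzw(src: str, dst: str) -> None:
    """Run tiffcp; raises CalledProcessError if the tool reports a failure."""
    subprocess.run(build_tiffcp_command(src, dst), check=True)


# Hypothetical usage, once tiffcp is available:
# convert_to_lzw("fax_g3.tiff", "fax_lzw.tiff")
```

The converted file could then be opened with pylibtiff in the usual way, as the answer proposes.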

Load image from string

Given a string containing jpeg image data, is it possible to load this directly in pygame?
I've tried using StringIO but failed and I don't completely understand the 'file-like' object concept.
Currently, as a workaround, I'm saving to disk and then loading an image the standard way:
# imagestring contains a jpeg
f=open('test.jpg','wb')
f.write(imagestring)
f.close()
image=pygame.image.load('test.jpg')
Any suggestions on improving this so that we avoid creating a temp file?
import cStringIO
# simage is the string containing the jpeg data (imagestring in the question)
fstr = cStringIO.StringIO(simage)
pygame.image.load(fstr, namehint="somethinguseful")
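On the "file-like object" concept: it is simply an object that offers the same methods as an open file (read, seek and so on) while the data lives in memory. The sketch below uses io.BytesIO, the Python 3 counterpart for byte data. The byte string is a placeholder rather than a real jpeg, and the pygame call is left as a comment because it assumes pygame is installed.

```python
import io

# Placeholder bytes standing in for the jpeg data held in memory.
image_bytes = b"\xff\xd8\xff\xe0 placeholder, not a real jpeg"

# Wrap the bytes so they can be read like an open binary file.
buffer = io.BytesIO(image_bytes)

first_two = buffer.read(2)  # read works exactly as on a file opened with "rb"
buffer.seek(0)              # rewind so the next reader starts at the beginning

# With pygame installed, the buffer could be passed where a file is expected
# (the second argument is a name hint for the format):
# surface = pygame.image.load(buffer, "test.jpg")
```

Because nothing is written to disk, this avoids the temporary file used in the workaround.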
