Processing Multiple Images - python

I am writing an image processing program, which works well, but I need to process multiple images.
First, I made an array of images:
images = ((image1.tif),
(image2.tif),
(image3.tif))
Then, I created a for loop:
for image in images:
    dna = cv2.imread(image)
    {code}
The problem is, whenever I run the code, the console returns an error of
TypeError: expected string or Unicode object, tuple found
At this line:
dna = cv2.imread(image)
It seems that the program is trying to process the whole array at once. I thought that the loop worked by processing one image in the array at a time? Can anybody help me with this?

You should wrap the filenames using single or double quotes:
images = (('image1.tif'),
('image2.tif'),
('image3.tif'))
You can also use a list instead of a tuple:
images = ['image1.tif', 'image2.tif', 'image3.tif']
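A minimal sketch of why the quotes (and not the parentheses) are what matter here. The `load` argument is a hypothetical stand-in for `cv2.imread`, so the snippet runs without OpenCV installed; `process_all` is an illustrative name, not part of any library.

```python
# Parentheses around a single string do not make a tuple; the comma does.
not_a_tuple = ('image1.tif')
one_tuple = ('image1.tif',)

images = ['image1.tif', 'image2.tif', 'image3.tif']


def process_all(paths, load):
    """Call load(path) once per filename; `load` stands in for cv2.imread."""
    results = []
    for path in paths:
        if not isinstance(path, str):
            # A nested tuple ends up here, which is the situation behind
            # an "expected string ..., tuple found" error.
            raise TypeError(
                f"expected a filename string, got {type(path).__name__}"
            )
        results.append(load(path))
    return results


# A fake loader that just echoes the name, for demonstration only.
loaded = process_all(images, load=lambda p: f"loaded {p}")
```

In real code you would pass `cv2.imread` as `load`, and each iteration would receive exactly one filename string.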

Use:
images = (("image1.tif"),
("image2.tif"),
("image3.tif"))

Related

pdf2image conversion of multi page PDFs to images returns the last page on all images

When I use the pdf2image Python package and pass a multi-page PDF to convert_from_bytes() or convert_from_path(), the output list does contain multiple images, but all of them show the last PDF page (I would have expected each image to represent one of the PDF pages).
The output looks something like this:
Any idea on why this would occur? I can't find any solution to this online. I've found some vague suggestion that the use_cropbox argument might be used, but modifying it has no effect.
def convert(self, opened_file):
    # Read PDF and convert pages to PPM image objects
    try:
        _ppm_pages = self.pdf2image.convert_from_bytes(
            opened_file.read(),
            grayscale=True
        )
    except Exception as e:
        print(f"[CreateJPEG] Could not convert PDF pages to JPEG image due to error: \n '{e}'")
        return
    # Do stuff with _ppm_pages
    for img in _ppm_pages:
        img.show()  # ...all images in that list are of the last page
Sometimes the output is an empty 1x1 image, instead, which I also haven't found a reason for. So if you have any idea what that is about, please do let me know!
Thanks in advance,
Simon
EDIT: Added code.
EDIT: So, when I try this in a random notebook, it actually works fine.
I've removed a few detours I used in my original code, and now it works. Still not sure what the underlying reason was though...
All the same, thanks for your help, everyone!
I'm using this right now....
from pdf2image import convert_from_path
imgSet = convert_from_path(pathToPDF, 500)
That gives me a list of images within imgSet
I guess you have to do something like this as described in the unit tests of the package.
with open("./tests/test.pdf", "rb") as pdf_file:
    images_from_bytes = convert_from_bytes(pdf_file.read(), fmt="jpg")
    self.assertTrue(images_from_bytes[0].format == "JPEG")

Memory issue with cv.imread

I am trying to read a large number (54K) of 512x512x3 .png images into an array, to create a dataset afterwards and feed it to a Keras model. I am using the code below, but I get the cv2.OutofMemory error (at around image 50K...) pointing to the fourth line of my code. I have been reading a bit about it: I am using the 64-bit version, and the images cannot be resized because it is a fixed input representation. Is there anything that can be done from a memory management side of things to make it work?
# Images (512x512x3)
X_data = []
files = glob.glob('C:\\Users\\77901677\\Projects\\images1\\*.png')
for myFile in files:
    image = cv2.imread(myFile)
    X_data.append(image)
dataset_image = np.array(X_data)
# Annotations (multilabel) 512x512x2
Y_data = []
files = glob.glob('C:\\Users\\77901677\\Projects\\annotations1\\*.png')
for myFile in files:
    mask = cv2.imread(myFile)
    # Gets rid of first channel which is empty
    mask = mask[:, :, 1:]
    Y_data.append(mask)
dataset_mask = np.array(Y_data)
Any ideas or advice are welcome.
You can reduce the memory by cutting one of your variables, because you have 2x the array at the moment.
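A rough back-of-the-envelope estimate makes the size of the problem concrete. This sketch assumes one byte per value, which matches the 8-bit arrays `cv2.imread` returns with its default flag; `array_bytes` is an illustrative helper, not a library function.

```python
def array_bytes(n_images, height, width, channels, bytes_per_value=1):
    """Size in bytes of a dense array of shape (n, height, width, channels)."""
    return n_images * height * width * channels * bytes_per_value


# 54K images of 512x512x3, one byte per value (assumption: uint8).
images_bytes = array_bytes(54_000, 512, 512, 3)
images_gib = images_bytes / 2**30

# Holding both the Python list and the np.array copy doubles the peak.
peak_bytes = 2 * images_bytes

print(f"{images_bytes:,} bytes, about {images_gib:.1f} GiB")
```

Under that assumption the image array alone is roughly 39.6 GiB, and briefly keeping the list and the NumPy copy side by side needs about twice that.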
You could use yield for this, thus creating a generator, which will only load your files one at a time instead of storing them all in an auxiliary variable.
def myGenerator():
    files = glob.glob('C:\\Users\\77901677\\Projects\\annotations1\\*.png')
    for myFile in files:
        mask = cv2.imread(myFile)
        # Gets rid of first channel which is empty
        yield mask[:, :, 1:]
# initialise your numpy array here; N, H, W, C are the number of masks,
# height, width and channels (e.g. H = W = 512, C = 2).
# uint8 matches what cv2.imread returns and avoids the 8x larger float64 default
yData = np.zeros((N, H, W, C), dtype=np.uint8)
# initialise the generator
mygenerator = myGenerator()  # create a generator
for i, data in enumerate(mygenerator):
    yData[i] = data  # load the data
But this is not optimal for you. If you plan to train a model in the next step, you will have memory issues for sure. In Keras, you can additionally implement a Keras Sequence generator, which will load your files in batches (similarly to this yield generator) and feed them to your model in the training stage. I recommend this article here, which demonstrates an easy implementation of it; it's what I use for my keras/tf model pipelines.
It's good practice to use generators when feeding our models large amounts of data.
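The batching idea can be sketched in plain Python, independent of Keras. This is not the Keras Sequence API itself, only an illustration of loading a few files at a time; `iter_batches` and its `load` parameter are hypothetical names, with `load` standing in for something like `cv2.imread`.

```python
def iter_batches(paths, batch_size, load):
    """Yield lists of loaded items, holding at most batch_size in memory."""
    batch = []
    for path in paths:
        batch.append(load(path))
        if len(batch) == batch_size:
            yield batch
            batch = []  # start a fresh list so the old batch can be freed
    if batch:
        # Final, possibly smaller, batch.
        yield batch


# Demonstration with a fake loader; real code would pass cv2.imread.
demo_paths = ['a.png', 'b.png', 'c.png', 'd.png', 'e.png']
demo_batches = list(iter_batches(demo_paths, 2, load=str.upper))
```

Each yielded batch could then be stacked into an array and handed to the model, so only one batch of images needs to be resident at a time.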

OpenCV Not Writing File

I have been trying to create a video from still images that are all in one folder. The file that is written is 0KB in size. I have checked that all files are retrieved correctly by the glob.glob part (that's what the commented print(filename) line is about). I have tried multiple fourcc options and none of them work. Does anyone see an issue that would be causing this? Also, this is running on Python 3 in a Jupyter Notebook.
fold_file = fold + '/*jpg'  # fold is just the path to folder containing the images
img_array = []
for filename in glob.glob(fold_file):
    # print(filename)
    img = cv2.imread(filename)
    height, width, layer = img.shape
    size = (width, height)
    img_array.append(img)
out = cv2.VideoWriter('pleasework.avi', cv2.VideoWriter.fourcc('X','V','I','D'), 15, size)
for image in range(len(img_array)):
    out.write(image)
cv2.destroyAllWindows()
out.release()
This line of code is probably your issue:
for image in range(len(img_array)):
    out.write(image)
The len() function counts the number of items in a sequence. Let's presume, for the sake of argument, that you have five images in img_array. Then len() would return 5. We then feed that length into the range() function to produce a sequence of numbers from 0 to 4 (i.e. up to but not including 5).
The for loop then iterates over that range, so we are passing the numbers 0 through 4 to the out.write() method instead of passing images.
What you probably want is this:
for image in img_array:
    out.write(image)
img_array is a Python list and, as such, can be iterated over directly by a for loop without any sort of length calculation.
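The difference between the two loops can be seen with a tiny stand-in list. The strings below are placeholders for image arrays and are purely illustrative.

```python
frames = ['frame_a', 'frame_b', 'frame_c']  # placeholders for image arrays

# Looping over range(len(...)) yields integer indices, not the items.
by_index = [i for i in range(len(frames))]

# Looping over the list itself yields the items.
by_item = [frame for frame in frames]

# If the index is really needed, it has to be used to look the item up.
via_index = [frames[i] for i in range(len(frames))]
```

Here `by_index` holds only integers, while `by_item` and `via_index` both hold the actual frames, which is what a writer expects to receive.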

What is the best way to load a CCITT T.3 compressed tiff using python?

I am trying to load a CCITT T.3 compressed tiff into python, and get the pixel matrix from it. It should just be a logical matrix.
I have tried using pylibtiff and PIL, but when I load it with them, the matrix it returns is empty. I have read in a lot of places that these two tools support loading CCITT but not accessing the pixels.
I am open to converting the image, as long as I can get the logical matrix from it and do it in Python code. The crazy thing is that if I open one of my images in Paint, save it without altering it, and then try to load it with pylibtiff, it works. Paint re-compresses it with LZW compression.
So I guess my real question is: is there a way to either natively load CCITT images into matrices or convert the images to LZW using Python?
Thanks,
tylerthemiler
It seems the best way is not to rely on Python alone but to lean on netpbm:
import Image
import ImageFile
import subprocess
tiff = 'test.tiff'
im = Image.open(tiff)
print 'size', im.size
try:
    print 'extrema', im.getextrema()
except IOError as e:
    print 'help!', e, '\n'
print 'I Get by with a Little Help from my Friends'
pbm_proc = subprocess.Popen(['tifftopnm', tiff],
                            stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)
(pbm_data, pbm_error) = pbm_proc.communicate()
ifp = ImageFile.Parser()
ifp.feed(pbm_data)
im = ifp.close()
print 'conversion message', pbm_error,
print 'extrema', im.getextrema()
print 'size', im.size
# houston: we have an image
im.show()
Seems to do the trick:
$ python g3fax.py
size (1728, 2156)
extrema help! decoder group3 not available
I Get by with a Little Help from my Friends
conversion message tifftopnm: writing PBM file
extrema (0, 255)
size (1728, 2156)
How about running tiffcp with subprocess to convert to LZW (-c lzw switch), then process normally with pylibtiff? There are Windows builds of tiffcp lying around on the web. Not exactly Python-native solution, but still...
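A minimal sketch of that suggestion, assuming the `tiffcp` tool from libtiff is installed and on the PATH. The helper names and the file names in the usage comment are hypothetical.

```python
import subprocess


def tiffcp_lzw_command(src, dst):
    """Build the tiffcp command that rewrites src as an LZW-compressed dst."""
    return ['tiffcp', '-c', 'lzw', src, dst]


def convert_to_lzw(src, dst):
    """Run tiffcp; raises CalledProcessError if the tool reports failure."""
    subprocess.run(tiffcp_lzw_command(src, dst), check=True)


# Usage (requires tiffcp to be installed):
# convert_to_lzw('fax_g3.tif', 'fax_lzw.tif')
```

After the conversion, the LZW file can be opened with the usual TIFF-reading library. Building the argument list separately keeps the command easy to inspect and avoids shell quoting issues.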

Python/OpenCV: Converting images taken from capture

I'm trying to convert images taken from a capture (webcam) and do some processing on them with OpenCV, but I'm having a difficult time..
When trying to convert the image to grayscale, the program crashes. (Python.exe has stopped working)
Here is the main snippet of my code:
newFrameImageGS = cv.CreateImage((320, 240), cv.IPL_DEPTH_8U, 1)
for i in range(0, 5):
    newFrameImage = cv.QueryFrame(ps3eye)
    cv.CvtColor(newFrameImage, newFrameImageGS, cv.CV_BGR2GRAY)
    golfSwing.append(newFrameImageGS)
When I try using cvConvertScale I get the assertion error:
src.size() == dst.size() && src.channels() == dst.channels()
which makes sense, but I'm pretty confused on how to go about converting the input images of my web cam into images that can be used by functions like cvUpdateMotionHistory() and cvCalcOpticalFlowLK()
Any ideas? Thanks.
UPDATE:
I converted the image to grayscale manually with this:
for row in range(0, newFrameImage.height):
    for col in range(0, newFrameImage.width):
        newFrameImageGS[row, col] = (newFrameImage8U[row, col][0] * 0.114 +  # B
                                     newFrameImage8U[row, col][1] * 0.587 +  # G
                                     newFrameImage8U[row, col][2] * 0.299)   # R
But this takes quite a while, and I still can't figure out why cvCvtColor is causing the program to crash.
For some reason, CvtColor caused the program to crash when the image depths were 8 bit. When I converted them to 32 bit, the program no longer crashed and everything seemed to work OK. I have no idea why this is, but at least it works now.
newFrameImage = cv.QueryFrame(ps3eye)
newFrameImage32F = cv.CreateImage((320, 240), cv.IPL_DEPTH_32F, 3)
cv.ConvertScale(newFrameImage,newFrameImage32F)
newFrameImageGS_32F = cv.CreateImage ((320,240), cv.IPL_DEPTH_32F, 1)
cv.CvtColor(newFrameImage32F,newFrameImageGS_32F,cv.CV_RGB2GRAY)
newFrameImageGS = cv.CreateImage ((320,240), cv.IPL_DEPTH_8U, 1)
cv.ConvertScale(newFrameImageGS_32F,newFrameImageGS)
There is a common mistake here:
You're creating a single image in the newFrameImageGS variable before the loop, then overwriting its contents in the loop and appending it to a list each time. The result will not be what you expect. At the end, the list will contain five references to the same image instance, because appending stores only the object reference and makes no copy of the object. That image will hold the very last frame, so you get that frame five times, which is presumably not what you want. Please review the Python tutorial if this is not clear to you. You can solve this by moving the first line of the above code into the body of the for loop.
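The reference behaviour is plain Python and can be reproduced without OpenCV. In this sketch a one-element list stands in for the image buffer; the variable names are illustrative only.

```python
# One buffer created before the loop and reused: every list entry is
# the same object, so all entries show the last value written.
shared = [0]
same_object = []
for i in range(5):
    shared[0] = i
    same_object.append(shared)

# A new buffer created inside the loop: every entry is its own object.
fresh_objects = []
for i in range(5):
    frame = [0]
    frame[0] = i
    fresh_objects.append(frame)
```

After the first loop every entry of `same_object` is the very same list holding 4, while `fresh_objects` keeps the five distinct values. Creating the destination image inside the loop has the same effect for frames.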
Other possibilities, if fixing the above does not help:
The CvtColor function seems to be the correct one for conversion to grayscale, since it can convert to a different number of channels.
According to this manual, the CvtColor function requires a destination image of the same data type as the source. Please double-check that newFrameImage is an IPL_DEPTH_8U image.
