Errno 20: Not a directory when saving into zip file - python

When I try to save a pyplot figure as a jpg, I keep getting a directory error saying that the given file name is not a directory. I am working in Colab. I have a numpy array called z_img and have opened a zip file.
import matplotlib.pyplot as plt
from zipfile import ZipFile
zipObj = ZipFile('slices.zip', 'w') # opening zip file
plt.imshow(z_img, cmap='binary')
The plotting works fine. I did a test of saving the image into Colab's regular memory like so:
plt.savefig(str(ii)+'um_slice.jpg')
And this works perfectly, except I am intending to use this code in a for loop. ii is an index to differentiate between each image, and several hundred images would be created so I want them going in the zipfile. Now when I try adding the path to the zipfile:
plt.savefig('/content/slices.zip/'+str(ii)+'um_slice.jpg')
I get: NotADirectoryError: [Errno 20] Not a directory: '/content/slices.zip/150500um_slice.jpg'
I assume it's because the {}.jpg string is a filename, and not a directory per se. But I am quite new to Python, and don't know how to get the plot into the zip file. That's all I want. Would love any advice!

First off, for anything that's not photographic content (i.e. nice and soft), JPEG is the wrong format. You'll have a better time using a different file format: PNG is nice for pixels, and SVG or PDF for vector graphics (SVG in case you embed this in a website later!).
The error message is quite on point: you cannot just save to a zip file as if it was a directory.
Multiple ways around:
use the tempfile module's mkdtemp to make a temporary directory, save into that, and zip the result
save not into a filename, but into a buffer (BytesIO I guess) and append that to the compressed stream (I'm not too familiar with ZipFile)
use PDF as output and simply generate a multipage PDF; it's not hard, and probably much nicer in the long term. You can still convert that vector graphic result to PNG (or any other pixel format) as desired, but for the time being, it's space efficient, arbitrarily scalable and keeps all your pages in one place. It's easy to import selected pages into LaTeX (matter of fact, \includegraphics does it directly) or into websites (pdf.js).
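The first option above (temporary directory, then zip) could be sketched roughly as follows. This is a minimal sketch rather than a tested Colab recipe: zip_directory is a hypothetical helper name, and the write_bytes call is only a stand-in for plt.savefig(target) so that the example runs without matplotlib.

```python
import tempfile
import zipfile
from pathlib import Path


def zip_directory(src_dir, zip_path):
    """Add every regular file in src_dir to a new zip archive at zip_path."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(Path(src_dir).iterdir()):
            if path.is_file():
                # arcname drops the temporary directory prefix inside the archive
                zf.write(path, arcname=path.name)
    return zip_path


out_dir = Path(tempfile.mkdtemp())  # where the finished archive ends up
with tempfile.TemporaryDirectory() as work_dir:
    for ii in (150500, 150600):
        target = Path(work_dir) / f"{ii}um_slice.png"
        # Stand-in (assumption) for: plt.savefig(target)
        target.write_bytes(b"placeholder image data")
    archive_path = zip_directory(work_dir, out_dir / "slices.zip")

with zipfile.ZipFile(archive_path) as zf:
    names = zf.namelist()
```

The temporary working directory is removed when the with block ends, so only the archive remains.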

From the docs, matplotlib.pyplot.savefig accepts a binary file-like object, and ZipFile.open creates binary file-like objects. These two have to get together!
with zipObj.open(str(ii)+'um_slice.jpg', 'w') as fp:
    # a file object has no extension to infer from, so state the format explicitly
    plt.savefig(fp, format='jpg')
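Put into a loop, the idea might look like the sketch below. To keep it runnable on its own it does not import matplotlib: save_all and fake_savefig are hypothetical names, and fake_savefig stands in for a call such as plt.savefig(fp, format='png'). Note that the archive is opened in a with block, so it is closed (and its index written) once the loop finishes.

```python
import io
import zipfile


def save_all(archive, indices, writer):
    """Write one archive member per index; writer(fp, ii) fills the member."""
    with zipfile.ZipFile(archive, "w") as zf:  # closing finalizes the archive
        for ii in indices:
            with zf.open(f"{ii}um_slice.png", "w") as fp:
                writer(fp, ii)


def fake_savefig(fp, ii):
    # Stand-in (assumption) for: plt.imshow(...); plt.savefig(fp, format="png")
    fp.write(f"slice {ii}".encode())


buffer = io.BytesIO()  # a path such as "slices.zip" can be passed instead
save_all(buffer, [1, 2, 3], fake_savefig)

with zipfile.ZipFile(buffer) as zf:
    members = zf.namelist()
    first = zf.read("1um_slice.png")
```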

Related

How to return base64Encoded Matplotlib Chart?

In my matplotlib python code, I have a plt.savefig('test_file.png') and that, of course, works as expected.
But, since this code will be used in a REST service, I need to return the file contents as a Base64Encoded string to the caller (so they can Decode it and view it as png graphics file).
I tried the approach below, but it doesn't generate a large enough file (the actual png created by plt.savefig() is about 100 KB, while this approach yields only 4 KB):
my_stringIObytes = io.BytesIO()
plt.savefig(my_stringIObytes, format='png')
my_stringIObytes.seek(0)
my_base64_jpgData = base64.b64encode(my_stringIObytes.read())
And using the above "my_stringIObytes", I also tried saving this directly to disk (with no encoding), like this:
with open('base64_decode_test.png', 'wb') as fl:
    fl.write(my_stringIObytes.getbuffer())
    fl.close()
But that did not work either...
It looks like my initial attempt in the first code snippet above is probably the culprit, but I can't find any good examples of how to do what I need.
Would appreciate suggestions.
Thanks very much,
M
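One way to narrow this down is a round-trip check on the buffer, independent of matplotlib. The sketch below is only an illustration: encode_buffer is a hypothetical helper, and the bytes written to the buffer stand in for what plt.savefig(buf, format='png') would produce. If the round trip returns identical bytes, the encoding step is not what shrinks the file.

```python
import base64
import io


def encode_buffer(buf):
    """Return the whole buffer as an ASCII base64 string."""
    # getvalue() returns everything, regardless of the current stream position
    return base64.b64encode(buf.getvalue()).decode("ascii")


buf = io.BytesIO()
# Stand-in (assumption) for: plt.savefig(buf, format="png")
buf.write(b"\x89PNG\r\n\x1a\n" + bytes(range(256)))

encoded = encode_buffer(buf)         # what the REST service would return
decoded = base64.b64decode(encoded)  # what the caller would reconstruct
```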

How to convert numpy array to bytes object without save audio file on disk?

I am now learning to build a TTS project based on Tacotron-2.
Here, the original code in save_wav(wav, path, sr) function has a step to save a numpy array to .wav file by using
wav *= 32767 / max(0.01, np.max(np.abs(wav)))
scipy.io.wavfile.write(path, hparams.sample_rate, wav.astype(np.int16))
However, after obtaining a numpy array using wav *= 32767 / max(0.01, np.max(np.abs(wav))), I want to convert it to a .mp3 file so that it will be easier to send it back as a streaming response.
Right now, I can convert .wav bytes object to a .mp3 file, but the problem is that I don't know how to convert the numpy array to a .wav bytes object.
I searched for this and found that I apparently need to add a header to the numpy array, but almost all the posts I looked at use modules like scipy.io.wave and audioop, which first save the numpy array to a .wav file and then read it back using with open('filename.wav', 'rb').
(This is the link for the scipy.io.wavfile.write function, whose filename parameter should be a string or an open file handle; from my understanding, this means the generated .wav file will be saved on disk.)
Could anyone give any suggestion on how to achieve this?
Use io.BytesIO
There is a much simpler and more convenient solution: a little hack that wraps bytes in a file-like I/O interface, which we can then write to and read from like a file:
import io
from scipy.io.wavfile import write
byte_io = io.BytesIO()  # in-memory binary file
write(byte_io, <audio_sr>, <audio_numpy_array>)
result_bytes = byte_io.getvalue()  # whole buffer, regardless of stream position
Use your data sample rate and values array instead of <audio_sr> and <audio_numpy_array>.
You can operate with result_bytes as bytes of .wav file (as required).
P.S. Also check this simple gist of how to perform values array -> bytes -> values array for wav file.
I finally solved this problem by modifying and creating new modules based on scipy.io.wavfile.write and audio_segment.py of pydub.
Besides, when you want to operate on wave/mp3 bytes without saving them as a .wav/.mp3 file (normally done through some handy API or Python package), you have to add the header manually. It is not too tough a task if you look into the source code of those excellent packages.
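As an illustration of the header point, here is a minimal sketch that uses the standard library's wave module (not scipy or pydub, which the answers above use) to build a complete WAV file, header included, in memory. pcm16_to_wav_bytes is a hypothetical helper name and the sine tone is made-up test data. For a NumPy int16 array, something like wav.astype('<i2').tobytes() would presumably replace the struct.pack call.

```python
import io
import math
import struct
import wave


def pcm16_to_wav_bytes(samples, sample_rate):
    """Pack 16-bit integer samples into a mono WAV file held in memory."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)  # 2 bytes per sample = int16
        wf.setframerate(sample_rate)
        # little-endian signed 16-bit samples
        wf.writeframes(struct.pack(f"<{len(samples)}h", *samples))
    return buf.getvalue()  # RIFF/WAVE header followed by the sample data


sample_rate = 8000
# 0.1 s of a 440 Hz tone at half amplitude (illustrative data only)
tone = [
    int(32767 * 0.5 * math.sin(2 * math.pi * 440 * n / sample_rate))
    for n in range(sample_rate // 10)
]
wav_bytes = pcm16_to_wav_bytes(tone, sample_rate)

# Reading the bytes back confirms that the header is valid
with wave.open(io.BytesIO(wav_bytes), "rb") as rf:
    frames_read = rf.getnframes()
```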

How to get Pillow to make identical copies (edit EXIF in-line)

You would think it's quite simple, but the following code doesn't work as I would expect:
from hashlib import md5
from PIL import Image
im = Image.open("/tmp/original.jpg")
im.save("/tmp/new.jpg", quality="keep")
original = Image.open("/tmp/original.jpg")
new = Image.open("/tmp/new.jpg")
assert md5(original.tobytes()).hexdigest() == md5(new.tobytes()).hexdigest()
Why is it that when I'm simply saving an image as a new file, and keeping the quality settings the same, that the image data isn't identical? What am I missing?
Update (Explanation):
My problem is that I have a Pillow JpegImage object being handed to my code as part of a pipeline and I don't have control over the step at which the file is saved to disk:
<magic> → <my code> → <magic that saves to disk>
All I want my code to do is add/update/replace (any of these) the EXIF data for the to-be-saved jpeg image. As this info doesn't appear to be editable on an image object, the only way that I can figure to do this is to save the image to a temporary place (like BytesIO) and then re-open it with Image.open() before passing it to the next function in the chain.
Please tell me that there's a smarter, more efficient way to do this?

imread_collection There is no item

I am trying to read several images from an archive with skimage.io.imread_collection, but for some reason it throws an error:
"There is no item named '00071198d059ba7f5914a526d124d28e6d010c92466da21d4a04cd5413362552/masks/*.png' in the archive".
I checked several times: this directory exists in the archive, and with *.png I just specify that I want all images in my collection. imread_collection works well when I load images not from the archive but from the extracted folder.
#specify folder name
each_img_idx = '00071198d059ba7f5914a526d124d28e6d010c92466da21d4a04cd5413362552'
with zipfile.ZipFile('stage1_train.zip') as archive:
    mask_ = skimage.io.imread_collection(archive.open(str(each_img_idx) + '/masks/*.png')).concatenate()
Could someone explain to me what's going on?
Not all scikit-image plugins support reading from bytes, so I recommend using imageio. You'll also have to tell ImageCollection how to access the images inside the archive, which is done using a customized load_func:
import zipfile
from skimage import io
import imageio
archive = zipfile.ZipFile('foo.zip')
# names of all members stored in the archive
images = [f.filename for f in archive.filelist]
def zip_imread(fn):
    # read the member's raw bytes and let imageio decode them
    return imageio.imread(archive.read(fn))
ic = io.ImageCollection(images, load_func=zip_imread)
ImageCollection has some benefits like not loading all images into memory at the same time. But if you simply want a long list of NumPy arrays, you can do:
collection = [imageio.imread(archive.read(f)) for f in archive.filelist]
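As for the original error: ZipFile.open expects the exact name of one member and does no wildcard expansion, so '*.png' is looked up literally. A sketch of selecting members by pattern with the standard library is shown below; matching_members is a hypothetical helper, and the in-memory archive with its 'abc/...' names is made up for the demonstration.

```python
import fnmatch
import io
import zipfile


def matching_members(archive, pattern):
    """Return the archive member names that match a shell-style pattern."""
    # fnmatchcase is case-sensitive on every platform; note that '*' here
    # also matches '/' characters, unlike a shell glob
    return sorted(
        name for name in archive.namelist()
        if fnmatch.fnmatchcase(name, pattern)
    )


# Build a small in-memory archive for the demonstration
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("abc/masks/m1.png", b"1")
    zf.writestr("abc/masks/m2.png", b"2")
    zf.writestr("abc/images/abc.png", b"3")

with zipfile.ZipFile(buf) as archive:
    mask_names = matching_members(archive, "abc/masks/*.png")
    mask_bytes = [archive.read(name) for name in mask_names]
```

The resulting list of names could then be passed to ImageCollection together with a load_func like the one above.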

Reading an image into pygame with incorrect file extension

I want to load a JPS file into pygame. A JPS file is simply a jpeg file with the image consisting of two side-by-side stereo pictures. While pygame will load it if I change the extension to jpg and use pygame.image.load(file_name), what I want to do is load the file into memory and then tell pygame to load it from a buffer, and that the buffer contains a jpeg file.
I would like to do it this way because later I want to extend things so that I can load in an MPO file which is a file that contains two jpeg files and I suspect that the same techniques will be involved.
I have tried the pygame.image.frombuffer and pygame.image.fromstring but get the error message that the "String length does not equal format and resolution size". I think this is because I am not telling it that the buffer contains a jpeg.
Does anyone have any idea how this can be done?
Perhaps something along these lines (untested):
from io import BytesIO
import pygame
with open('image.jps', 'rb') as imgfile:
    # binary data needs BytesIO on Python 3 (StringIO only accepts text there)
    imgbuf = BytesIO(imgfile.read())
image1 = pygame.image.load(imgbuf)
Since you say it works, you could probably shorten things as shown below since there's no reason to give the image buffer a name and keep it around:
with open('image.jps', 'rb') as imgfile:
    image1 = pygame.image.load(BytesIO(imgfile.read()))
