I have a file which contains a single image of a specific format at a
specific offset. I can already get a file-like for the embedded image
which supports read(), seek(), and tell(). I want to take advantage
of an existing PIL decoder to handle the embedded image, but be able to
treat the entire file as an "image file" in its own right.
I have not been able to figure out how to do this from the available
documentation, and was wondering if anyone had any insight into how I
could do it.
The relevant chapter of the docs is this one and I think it's fairly clear: if for example you want to decode image files in the new .zap format, you write a ZapImagePlugin.py module which must do a couple of things:
define a class ZapImageFile(ImageFile.ImageFile): with string attributes format and format_description, and a hook method def _open(self) (of which more later);
at module level, call Image.register_open('ZAP', ZapImageFile) and Image.register_extension('ZAP', '.zap').
The specs for the _open method are laid out clearly in the chapter. It must read image data and metadata from the open binary file-like object self.fp, and raise SyntaxError (or another exception) as soon as it detects that the file is not actually in the right format. It must set at least the self.size and self.mode attributes and, to allow reading the image, also self.tile, a list of tile descriptors in the format specified in that chapter (including the file offset, which you say you know, and a decoder). If the raw or bit decoders documented in the chapter don't meet your needs, the chapter recommends studying the sources of some of the many supplied decoders, such as JPEG, PNG, etc.
What I did to solve this was to derive from the ImageFile.ImageFile subclass belonging to the embedded format instead of from ImageFile.ImageFile directly. Then in _open() I replaced self.fp with the file-like object for the embedded image, and called the parent's _open(). I can't say that I'm particularly happy doing it this way, but it seems to have worked.
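The file-like object that stands in for self.fp can be quite small. Here is a minimal sketch using only the standard library; WindowedFile, its offset and length parameters, and the sample bytes are all made up for illustration, and it implements only the read(), seek(), and tell() methods mentioned in the question.

```python
import io


class WindowedFile:
    """Expose bytes [offset, offset + length) of a larger file as their own file.

    Hypothetical helper: the name and constructor arguments are assumptions.
    """

    def __init__(self, fp, offset, length):
        self._fp = fp
        self._offset = offset
        self._length = length
        self._pos = 0  # position relative to the start of the window

    def read(self, size=-1):
        remaining = self._length - self._pos
        if size is None or size < 0 or size > remaining:
            size = remaining  # never read past the end of the window
        if size <= 0:
            return b""
        self._fp.seek(self._offset + self._pos)
        data = self._fp.read(size)
        self._pos += len(data)
        return data

    def seek(self, pos, whence=io.SEEK_SET):
        if whence == io.SEEK_CUR:
            pos += self._pos
        elif whence == io.SEEK_END:
            pos += self._length
        self._pos = max(0, pos)
        return self._pos

    def tell(self):
        return self._pos


# A fake container: 8 header bytes, a 12-byte "embedded image", a trailer.
container = io.BytesIO(b"HEADER--" + b"\x89PNG payload" + b"--TRAILER")
embedded = WindowedFile(container, offset=8, length=12)
first_four = embedded.read(4)  # the first bytes of the embedded data only
```

In the subclass approach described above, the assignment would presumably look like self.fp = WindowedFile(self.fp, offset, length) at the top of _open(), followed by the call to the parent's _open().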
Related
I have a Python function (using the Pythonista app) to show an image in the console. I have the image saved in a BytesIO object but the function requires a file path.
Is there any way to give it a path to the BytesIO or somehow give it the image without needing to save it as a file?
The specific function is console.show_image(image_path)
The general answer is that if the function you call expects a filesystem path and cannot handle a file-like object instead, then your only solution is to write your data to a file (and ask the function's author to add support for file-like objects, or, if it's open source, implement it yourself and send a merge request).
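A minimal sketch of that workaround, using the standard tempfile module: write_to_temp_path is a made-up helper name, the bytes are a stand-in rather than a real image, and the Pythonista call is left as a comment because it only exists inside that app.

```python
import io
import os
import tempfile


def write_to_temp_path(buffer, suffix=".png"):
    """Copy a BytesIO's contents to a temporary file and return its path."""
    # delete=False keeps the file on disk after the handle is closed,
    # so the path can be handed to code that reopens it by name.
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as tmp:
        tmp.write(buffer.getvalue())
        return tmp.name


image_buffer = io.BytesIO(b"\x89PNG\r\n\x1a\n...")  # stand-in bytes, not a real image
path = write_to_temp_path(image_buffer)
try:
    # console.show_image(path)  # Pythonista-only call from the question
    with open(path, "rb") as f:
        round_trip = f.read()
finally:
    os.remove(path)  # the caller is responsible for cleaning up
```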
Can TensorFlow read a file containing normal images, for example JPG, or can it only read a .bin file containing images?
What is the difference between a .mat file and a .bin file?
Also, when I rename a .bin file to .mat, does the data in the file change?
Sorry if my language is not clear; I don't speak English very well.
A file-name suffix is just a suffix (which sometimes helps to get info about that file; e.g. Windows uses it to decide which tool to launch on a double-click). A suffix does not need to be correct. And of course, changing the suffix will not change the content.
Every format (JPG, PNG, MAT and co.) needs its own decoder.
To some extent, the right decoder is picked automatically by reading metadata from the file itself (under some assumptions!). Many image tools have some imread function which works for jpg and png even if there is no suffix (because it checks for common and supported image formats).
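That kind of content check can be sketched in a few lines. This is a simplified illustration, not what any particular imread does: sniff_format is a made-up name, and only a handful of well-known leading byte signatures are listed.

```python
def sniff_format(data):
    """Guess a format from the leading bytes; return None if nothing matches."""
    signatures = [
        (b"\x89PNG\r\n\x1a\n", "png"),
        (b"\xff\xd8\xff", "jpeg"),
        (b"BM", "bmp"),
        # Level 5 MAT-files begin with a human-readable header text.
        (b"MATLAB 5.0 MAT-file", "mat"),
    ]
    for magic, name in signatures:
        if data.startswith(magic):
            return name
    return None  # e.g. an arbitrary .bin file with no known signature


guess = sniff_format(b"\xff\xd8\xff\xe0\x00\x10JFIF")
```

The file name never enters into it, which is why a wrong or missing suffix does not stop such tools from decoding the file.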
I'm not sure what tensorflow does automatically, but:
jpg, png, bmp should be no problem
worst-case: use scipy to read and convert
mat is usually a matrix (with countless different encodings) and often matlab-based
scipy can read many matlab-based formats
bin can be anything (usually stands for binary; no clear mapping like the above)
Don't get me wrong, but I would expect someone trying to use tensorflow (not a small, not a simple tool) to know that changing a suffix will never magically transform the content into the new format (especially in the lossless/lossy case like png, jpg). I hope you evaluated this decision and you are not running blindly into using a popular tool.
A '.mat' file contains Matlab-formatted data (not Matlab code like you would expect from a '.m' file). I'm not sure if you're even using Matlab since you didn't include the tag in your question. '.mat' files are associated with the Matlab workspace; if you wanted to save your current workspace in Matlab, you would save it as a '.mat' file.
A '.bin' file is a binary file read by the computer. In general, executable (ready-to-run) programs are often identified as binary files. I think this is what you would want to use. I am unsure what you really want though because the wording of the question is difficult to understand and it seems like you have two questions here.
Changing the suffix of a file just changes which program will open the file. For example, if I were to change test.txt to test.py, the data inside the text file remains the same, but the way the file is opened has changed. In this case, the file was a text file usually opened using Notepad (or some variation); once renamed, it is opened by Python. If you were to change a .jpg file to a .txt file, you wouldn't be able to view it as a picture again; instead, you would open a text file with a bunch of seemingly random characters which describe the picture. The picture data never changed, but the way you see it and are able to use it did.
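This is easy to check for yourself. A small sketch, with made-up file names and contents: hash a file, rename it from .bin to .mat, and hash it again.

```python
import hashlib
import os
import tempfile


def sha256_of(path):
    """Return the SHA-256 hex digest of a file's contents."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


with tempfile.TemporaryDirectory() as folder:
    old_path = os.path.join(folder, "data.bin")
    new_path = os.path.join(folder, "data.mat")
    with open(old_path, "wb") as f:
        f.write(bytes(range(256)))  # arbitrary binary content
    before = sha256_of(old_path)
    os.rename(old_path, new_path)  # only the name changes
    after = sha256_of(new_path)

same_content = before == after
```

The two digests match, so the bytes are untouched; the renamed file is still not a valid MAT-file.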
Take a look at this website which describes the .bin extension pretty well. Also, a quick Google search goes a long way especially with questions like this.
Question
What is a clean way to create a file object from raw binary information in Python?
More Info
The reason I need to do this is that I have the raw binary data of a JPEG image stored in RAM. I need to put it inside some kind of file object so that I can resize the image using Python's Pillow library.
According to the pillow documentation, the file object needs to implement the read(), seek(), and tell() methods.
The file object must implement read(), seek(), and tell() methods, and be opened in binary mode.
I was able to find a mention of how to handle this situation under the documentation for PIL.Image.frombytes:
...If you have an entire image in a string, wrap it in a BytesIO object,
and use open() to load it.
This is what I ended up with that worked using BytesIO:
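import io
from PIL import Image

# Raw bytes of the JPEG already held in memory (placeholder value).
file_body = b"<binary image data>"
# BytesIO provides the read(), seek(), and tell() methods Pillow needs.
t_file = io.BytesIO(file_body)
img = Image.open(t_file)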
Note: The comments mention tempfile.SpooledTemporaryFile. This seems like it should have worked, but it did not for some reason.
I'm trying to read a field from an Active Directory entry which contains raw jpeg binary data. I'd like to read that data and convert it to an image file for use in my django-based application. I cannot for the life of me figure out how to handle this data in a nice way. Any ideas?
Edit:
To anyone who might come across this in the future: there's a function in Python 2's os module:
os.tmpfile()
It creates a file and destroys it once the file descriptor is closed. Very useful for this situation. Note that os.tmpfile() was removed in Python 3; tempfile.TemporaryFile() is the equivalent there.
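For Python 3, a minimal sketch of the same idea with tempfile.TemporaryFile(); the bytes here are a stand-in for the value read from Active Directory, not a real JPEG.

```python
import tempfile

# Placeholder for the raw JPEG bytes read from the directory entry.
jpeg_bytes = b"\xff\xd8\xff\xe0 stand-in payload \xff\xd9"

with tempfile.TemporaryFile() as tmp:  # opened in binary mode ("w+b") by default
    tmp.write(jpeg_bytes)
    tmp.seek(0)  # rewind before handing the file to a reader
    restored = tmp.read()
# Leaving the with-block closes the file, and it is removed automatically.
```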
Here is somebody who was having the same problem -- check out the latest post at the bottom.
http://groups.google.com/group/django-users/browse_thread/thread/4214db6699863ded/5d816b02daca3186
Looks like passing raw data to SimpleUploadedFile is what you are looking for.
request._raw_post_data
The raw HTTP POST data as a byte string. This is useful for processing data in different formats than of conventional HTML forms: binary images, XML payload etc.
http://docs.djangoproject.com/en/dev/ref/request-response/#httprequest-objects
I know this isn't part of the question, but this looks pretty awesome! "HttpRequest.read() file-like interface"
http://docs.djangoproject.com/en/dev/ref/request-response/#django.http.HttpRequest.read
If you Save as > jpg in Adobe Photoshop a path (selection) is stored in the file.
Is it possible to read that path in python, for example to create a composition with PIL?
EDIT
Imagemagick seems to help, example
This code (by /F AKA the effbot, author of PIL and generally wondrous Python contributor) shows how to walk through the 8BIM resource blocks (but it's looking for 0x0404, the IPTC/NAA data, so of course you'll need to edit it).
Per Tom Ruark's post to this thread, paths will have IDs of 2000 to 2999 (the latter gives the name of the clipping path, so it's different from the others) and the data is a series of 26-byte "point records" (so the resource length is always a multiple of 26).
Read the rest in Tom's post in all the gory details -- it's a pesky and very detailed binary format that will take substantial experimentation (and skill with struct, bitwise manipulation, etc) to read and interpret just right (not helped by the fact that the fields can be big-endian or little-endian -- little-endian in Windows, if I read the post correctly).
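As a starting point, here is a rough sketch of walking the resource blocks and pulling out the raw path records. It is not the effbot code referred to above. It assumes you have already extracted the Photoshop resource data from the JPEG (not shown), that each block is laid out as the '8BIM' signature, a 2-byte ID, a Pascal-string name padded to an even length, a 4-byte size, and the data padded to an even length, and that the integers are big-endian. The function names are made up, and interpreting the 26-byte records themselves is left out.

```python
import struct


def iter_resource_blocks(blob):
    """Yield (resource_id, name, data) for each 8BIM block in blob."""
    pos = 0
    while pos + 12 <= len(blob):  # 12 bytes is the smallest possible block
        if blob[pos:pos + 4] != b"8BIM":
            break  # unknown signature: stop rather than guess
        (resource_id,) = struct.unpack_from(">H", blob, pos + 4)
        pos += 6
        name_len = blob[pos]
        name = blob[pos + 1:pos + 1 + name_len]
        pos += 1 + name_len
        if (1 + name_len) % 2:
            pos += 1  # the name field is padded to an even size
        (size,) = struct.unpack_from(">I", blob, pos)
        pos += 4
        data = blob[pos:pos + size]
        pos += size + (size % 2)  # the data is padded to an even size too
        yield resource_id, name, data


def path_records(blob):
    """Map each path resource ID to its list of raw 26-byte point records."""
    paths = {}
    for resource_id, _name, data in iter_resource_blocks(blob):
        # 2999 holds the clipping path's name, so it is skipped here.
        if 2000 <= resource_id < 2999 and len(data) % 26 == 0:
            paths[resource_id] = [data[i:i + 26] for i in range(0, len(data), 26)]
    return paths
```

Truncated or malformed input may raise struct.error in this sketch, so real code would want bounds checks around each read.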
Are you sure the path is stored in the jpg? That seems unlikely. Paths would be stored in the native Photoshop format, but not in the jpg.
Do you know of any other tools that can read the path? Can you try saving the item as a jpg, close photoshop, reopen only the jpg and see if you still have the path? I doubt it'd be there.