How to decode bitmap images using python's email class - python

Hello I have python script that takes apart an email from a string. I am using the get_payload(decode=True) function from the email class and it works great for pdf's and jpg's but it does not decode bmp files. The file is still encoded base64 when I write it to disk.
Has anyone come across this issue themselves?

OK so I finally found the problem and it was not related to the python mail class at all. I was reading from a named pipe using the .read() function and it was not reading the entire email from the pipe. I had to pass the read function a size argument and then it was able to read the entire email. So ultimately the reason why my bmp file was not decoded is because I had invalid base64 data causing the get_payload() function to not be able to decode the attatchment.

Related

Base64 Decoding with Python

I am receiving json data that is base64 encoded and I am supposed to decode it using Python. For most base64 requests, everything works fine, but some requests contains some strange chars after decoding.
The thing is, in Python I am getting an error (Invalid padding), even though the padding is right, but if I try to decode the base64 with the unix command or on https://www.base64decode.org/ website, I get all the data from the base64.
Do you know any work around for this problem, beside calling the unix command from python code?
Thank you!

.sendmail function trying to convert to ASCII, do I have to convert?

So I'm trying to send an email using python but I can't as long as it converts it to ASCII, is there a way around this or do I need to find another function?
File "/usr/lib/python3.6/smtplib.py", line 855, in sendmail
msg = _fix_eols(msg).encode('ascii')
UnicodeEncodeError: 'ascii' codec can't encode character '\u2019' in position 1562: ordinal not in range(128)
Can I get around this or do I have convert? and how would i convert?
Traditionally, SMTP requires the message you submit to be in ASCII. The problem is earlier in the process: The message you are trying to pass in should already have been converted into a proper MIME message when you created it.
Generally, any binary data should be converted to have a Content-Transfer-Encoding: base64 and any non-ASCII text should have Content-Transfer-Encoding: quoted-printable. Then you can safely use non-ASCII bytes and the transfer encoding takes care of transparently converting the payload to ASCII for transport, and the recipient's email software takes care of displaying it as you intended.
Python's email library already knows how to take care of these things. Perhaps you are trying to construct a message manually without actually checking what the specs say? But using the standard library is obviously easier and saves you from a fair bit of learning curve.
For concrete details, see e.g. how to send a email body part through MIMEMultipart
There are now provisions for extending SMTP to handle UTF-8 everywhere, but the error message suggests that your Sendmail is not yet up to the task. (Or perhaps there is an option you can add to its configuration, but that's far outside the scope of this question, and of Stack Overflow.)

How do I access both binary and text data for email processing with Python 3?

I am converting a Python 2 program to Python 3 and I'm not sure about the approach to take.
The program reads in either a single email from STDIN, or file(s) are specified containing emails. The program then parses the emails and does some processing on them.
SO we need to work with the raw data of the email input, to store it on disk and do an MD5 hash on it. We also need to work with the text of the email input in order to run it through the Python email parser and extract fields etc.
With Python 3 it is unclear to me how we should be reading in the data. I believe we need the raw binary data in order to do an md5 on it, and also to be able to write it to disk. I understand we also need it in text form to be able to parse it with the email library. Python 3 has made significant changes to the IO handling and text handling and I can't see the "correct" approach to read the email raw data and also use the same data in text form.
Can anyone offer general guidance on this?
The general guidance is convert everything to unicode ASAP and keep it that way until the last possible minute.
Remember that str is the old unicode and bytes is the old str.
See http://docs.python.org/dev/howto/unicode.html for a start.
With Python 3 it is unclear to me how we should be reading in the data.
Specify the encoding when you open the file it and it will automatically give you unicode. If you're reading from stdin, you'll get unicode. You can read from stdin.buffer to get binary data.
I believe we need the raw binary data in order to do an md5 on it
Yes, you do. encode it when you need to hash it.
and also to be able to write it to disk.
You specify the encoding when you open the file you're writing it to, and the file object encodes it for you.
I understand we also need it in text form to be able to parse it with the email library.
Yep, but since it'll get decoded when you open the file, that's what you'll have.
That said, this question is really too open ended for Stack Overflow. When you have a specific problem / question, come back and we'll help.

Converting raw binary data into an image file?

I'm trying to read a field from an Active Directory entry which contains raw jpeg binary data. I'd like to read that data and convert it to an image file for use in my django-based application. I cannot for the life of me figure out how to handle this data in a nice way. Any ideas?
Edit:
To anyone who might come across this in the future: there's a method in python's OS library:
os.tmpfile()
it creates a file and destroys it once the file descriptor is closed. Very useful for this situation.
Here is somebody who was having the same problem -- check out the latest post at the bottom.
http://groups.google.com/group/django-users/browse_thread/thread/4214db6699863ded/5d816b02daca3186
Looks like passing raw data to SimpleUploadedFile is what you are looking for.
request._raw_post_data
The raw HTTP POST data as a byte
string. This is useful for processing
data in different formats than of
conventional HTML forms: binary
images, XML payload etc.
http://docs.djangoproject.com/en/dev/ref/request-response/#httprequest-objects
I know this isn't part of the question, but this looks pretty awesome! "HttpRequest.read() file-like interface"
http://docs.djangoproject.com/en/dev/ref/request-response/#django.http.HttpRequest.read

Is it possible to post binaries to usenet with Python?

I'm trying to use the nntplib that comes with python to make some posts to usenet. However I can't figure out how to post binary files using the .post method.
I can post plain text files just fine, but not binary files. any ideas?
-- EDIT--
So thanks to Adrian's comment below I've managed to make one step towards my goal.
I now use the email library to make a multipart message and attach the binary files to the message. However I can't seem to figure out how to pass that message directly to the nttplib post method.
I have to first write a temporary file, then read it back in to the nttplib method. There has to be a way to do this all in memory....any suggestions?
you have to MIME-encode your post: a binary post in an NNTP newsgroup is like a mail with an attachment.
the file has to be encoded in ASCII, generally using the base64 encoding, then the encoded file is packaged iton a multipart MIME message and posted...
have a look at the email module: it implements all that you want.
i encourage you to read RFC3977 which is the official standard defining the NNTP protocol.
for the second part of your question:
use StringIO to build a fake file object from a string (the post() method of nntplib accepts open file objects).
email.Message objects have a as_string() method to retrieve the content of the message as a plain string.

Categories