I'm trying to send an email using Python, but it fails because the message gets encoded as ASCII. Is there a way around this, or do I need to find another function?
File "/usr/lib/python3.6/smtplib.py", line 855, in sendmail
msg = _fix_eols(msg).encode('ascii')
UnicodeEncodeError: 'ascii' codec can't encode character '\u2019' in position 1562: ordinal not in range(128)
Can I get around this, or do I have to convert the message? And how would I convert it?
Traditionally, SMTP requires the message you submit to be in ASCII. The problem is earlier in the process: The message you are trying to pass in should already have been converted into a proper MIME message when you created it.
Generally, any binary data should be converted to have a Content-Transfer-Encoding: base64 and any non-ASCII text should have Content-Transfer-Encoding: quoted-printable. Then you can safely use non-ASCII bytes and the transfer encoding takes care of transparently converting the payload to ASCII for transport, and the recipient's email software takes care of displaying it as you intended.
Python's email library already knows how to take care of these things. Perhaps you are trying to construct a message manually without actually checking what the specs say? But using the standard library is obviously easier and saves you from a fair bit of learning curve.
For concrete details, see e.g. how to send a email body part through MIMEMultipart
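As a minimal sketch of what "let the email library do it" can look like: the addresses and text below are placeholders, and forcing `cte="quoted-printable"` is an assumption made here so that the serialized message is guaranteed to be pure ASCII regardless of the policy in use.

```python
from email.message import EmailMessage

# Placeholder addresses and text, purely for illustration.
msg = EmailMessage()
msg["From"] = "sender@example.com"
msg["To"] = "recipient@example.com"
msg["Subject"] = "Encoding test"

body = "The body contains a curly apostrophe: \u2019"
# Ask for quoted-printable explicitly so the wire form is 7-bit clean.
msg.set_content(body, cte="quoted-printable")

# The serialized message now contains only ASCII bytes.
wire = msg.as_bytes()
wire.decode("ascii")  # would raise UnicodeDecodeError if any byte were non-ASCII
```

A message built this way can be handed to `smtplib.SMTP.send_message(msg)`, or its serialized form to `sendmail()`, without hitting the `'ascii' codec` error.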
There are now provisions for extending SMTP to handle UTF-8 everywhere, but the error message suggests that your Sendmail is not yet up to the task. (Or perhaps there is an option you can add to its configuration, but that's far outside the scope of this question, and of Stack Overflow.)
I tried to decode this highlighted segment, but I ran into some issues.
I used this code to decipher the content:
import binascii
from scapy.all import Ether
hexed ="01000c0000000040000040400000803f0000003f2af0ce4004040000404000008040cdcc4c3ecdcccc3d305b1a3e2903fa42240000484400006144000048430000c8424ddc4143200000484400006144000048430000c84218380b440000000000000000000000000000000000000000000000000b010001deddf7420b0100016666e6400201000102000000000000000000000000305b1a3e4ddc414318380b4400010000000101000100010002000300121204000200010000050006000600ffffffff00000000deddf742"
ether_pkt = Ether(binascii.unhexlify(hexed))
ether_pkt.show()
And the result I got is:
How do I further decipher this content?
'\x80?\x00\x00\x00?*\xf0\xce#\x04\x04\x00\x00##\x00\x00\x80#\xcd\xccL>\xcd\xcc\xcc=0[\x1a>)\x03\xfaB$\x00\x00HD\x00\x00aD\x00\x00HC\x00\x00\xc8BM\xdcAC \x00\x00HD\x00\x00aD\x00\x00HC\x00\x00\xc8B\x188\x0bD\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x0b\x01\x00\x01\xde\xdd\xf7B\x0b\x01\x00\x01ff\xe6#\x02\x01\x00\x01\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x000[\x1a>M\xdcAC\x188\x0bD\x00\x01\x00\x00\x00\x01\x01\x00\x01\x00\x01\x00\x02\x00\x03\x00\x12\x12\x04\x00\x02\x00\x01\x00\x00\x05\x00\x06\x00\x06\x00\xff\xff\xff\xff\x00\x00\x00\x00\xde\xdd\xf7B'
I've tried .decode() and hex() to turn it into a string, but the output is not human-readable.
Have a look at pycomm3. Especially its CIP reference.
According to the reference, 0x4c is the "read_tag" custom service for Rockwell devices, whatever that means.
The data you highlighted is listed as "command specific data". That suggests that it is not defined in the CIP, but is custom to the device that sent it. If it had been part of the CIP, Wireshark could probably have decoded it further. So you will have to find and read the documentation for the device in question.
There is no magic: you need to download the specs and write a parser to decode it. As you can see in your Wireshark screenshot, the protocol isn't string/ASCII.
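To give an idea of what such a parser looks like, here is a rough sketch using the standard `struct` module on the first 20 bytes of the hex string from the question. The field layout (two little-endian 16-bit integers followed by 32-bit floats) is only a guess made for illustration; the device documentation is the authority on what the fields really are.

```python
import binascii
import struct

# First 20 bytes of the payload posted in the question.
prefix = binascii.unhexlify("01000c0000000040000040400000803f0000003f")

# Guessed layout, NOT taken from any spec:
# two little-endian uint16 values, then four little-endian float32 values.
header = struct.unpack_from("<2H", prefix, 0)
floats = struct.unpack_from("<4f", prefix, 4)

print(header)  # (1, 12)
print(floats)  # (2.0, 3.0, 1.0, 0.5)
```

The fact that the bytes decode to round float values hints that the payload is packed binary numbers rather than text, which is why .decode() and hex() produce nothing readable.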
I am receiving JSON data that is base64-encoded and I am supposed to decode it using Python. For most base64 requests, everything works fine, but some requests contain strange characters after decoding.
The thing is, in Python I am getting an error (Invalid padding), even though the padding is right, but if I try to decode the base64 with the unix command or on the https://www.base64decode.org/ website, I get all the data from the base64.
Do you know any workaround for this problem, besides calling the unix command from Python code?
Thank you!
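One thing that may be worth trying, sketched below under the assumption that the failing inputs merely contain stray whitespace or have lost their trailing "=" characters (more lenient decoders tend to accept both): normalize the string before passing it to `base64.b64decode`. The helper name `decode_lenient` is made up for this example.

```python
import base64


def decode_lenient(data: str) -> bytes:
    """Decode base64 text after removing whitespace and restoring padding."""
    # Drop newlines, spaces and tabs that may have crept in during transport.
    cleaned = "".join(data.split())
    # Pad with '=' up to the next multiple of four characters.
    cleaned += "=" * (-len(cleaned) % 4)
    return base64.b64decode(cleaned)


print(decode_lenient("aGVsbG8"))      # padding was stripped
print(decode_lenient("aGVs\nbG8="))   # line break in the middle
```

If the input is damaged in some other way, for example truncated in the middle of the data, this will not help and the decoded result may still be incomplete.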
Hello, I have a Python script that takes apart an email from a string. I am using the get_payload(decode=True) method from the email package and it works great for PDFs and JPGs, but it does not decode BMP files. The file is still base64-encoded when I write it to disk.
Has anyone come across this issue themselves?
OK, so I finally found the problem, and it was not related to the Python email package at all. I was reading from a named pipe using the .read() function and it was not reading the entire email from the pipe. I had to pass the read function a size argument, and then it was able to read the entire email. So ultimately the reason my BMP file was not decoded is that I had invalid (truncated) base64 data, which prevented get_payload() from decoding the attachment.
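For anyone hitting the same thing, here is a small sketch of the idea, not my exact code: keep reading until the stream reports end of file, then parse. The helper `read_all`, the fake BMP bytes and the file name are all invented for the example, and an `io.BytesIO` stands in for the named pipe.

```python
import io
from email import message_from_bytes
from email.message import EmailMessage


def read_all(stream, chunk_size=4096):
    """Read from a binary stream until EOF, one chunk at a time."""
    chunks = []
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # an empty result means end of file
            break
        chunks.append(chunk)
    return b"".join(chunks)


# Build a test email with a fake BMP attachment (placeholder content).
fake_bmp = b"BM" + bytes(range(256))
original = EmailMessage()
original["Subject"] = "bitmap attached"
original.set_content("see attachment")
original.add_attachment(fake_bmp, maintype="image", subtype="bmp",
                        filename="a.bmp")

# Simulate the pipe with an in-memory stream and read it completely.
raw = read_all(io.BytesIO(original.as_bytes()), chunk_size=64)

parsed = message_from_bytes(raw)
attachment = [p for p in parsed.walk() if p.get_filename() == "a.bmp"][0]
decoded = attachment.get_payload(decode=True)  # base64 is undone here
```

With the complete message in hand, get_payload(decode=True) returns the original bytes for the BMP just as it does for other attachment types.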
Among all the encodings available here http://docs.python.org/library/codecs.html
which one should I use for decoding binary data into unicode without it becoming corrupted when I encode it back to a string?
I've used raw_unicode_escape and it doesn't work.
Example: I upload a picture in a POST (but not as a file attachment). Django converts POST data to unicode using utf-8. However, when converting back from unicode to a string (again using utf-8), the data becomes corrupted. I used raw_unicode_escape and the same happened (though only a few bytes this time). Which encoding should I use so that the decode and encode steps don't corrupt the data?
If you want to post binary data, use the base64 encoding.
http://docs.python.org/library/base64.html
"Binary data" is not text, therefore converting it to a unicode is meaningless. If there is text embedded in the binary data then extract it first and decode using the encoding given in the specification for the data format.
As others have already stated, your question isn't particularly clear. If you want to funnel binary data through a text channel (such as POST), then base64 is the right format to use, with appropriate data transformation operations in the client and the server (binary data -> base64 text -> pass over text channel -> base64 text -> binary data).
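The round trip described above can be sketched with the standard `base64` module. The sample bytes are arbitrary, and the "text channel" is simply a Python string here.

```python
import base64

# Arbitrary binary data, including bytes that are not valid UTF-8.
binary = bytes(range(256))

# Client side: binary data -> base64 text that is safe for a text channel.
text = base64.b64encode(binary).decode("ascii")

# Server side: base64 text -> the original binary data.
restored = base64.b64decode(text)

print(restored == binary)  # True
```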
Alternatively, if you want to tolerate improperly encoded text (e.g. as Python 3 tries to do for some interfaces such as file paths and environment variables), then Python 3.1 and later offer the surrogateescape error handler, which will convert invalid values into a format that isn't valid readable text, but allows the original binary data to be faithfully recreated when encoding back to bytes.
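A short sketch of that error handler in action; the sample bytes are made up and deliberately contain sequences that are not valid UTF-8.

```python
# Bytes that are NOT valid UTF-8 (0xE9, 0xFF and 0xFE are invalid here).
raw = b"caf\xe9 \xff\xfe"

# Invalid bytes become lone surrogate code points instead of raising.
text = raw.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
back = text.encode("utf-8", errors="surrogateescape")

print(back == raw)  # True
```

Note that the intermediate string is not meaningful text and cannot be encoded with a strict error handler; it is only a lossless carrier for the original bytes.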
I'm trying to use the nntplib module that comes with Python to make some posts to Usenet. However, I can't figure out how to post binary files using the .post method.
I can post plain text files just fine, but not binary files. Any ideas?
-- EDIT--
So thanks to Adrian's comment below I've managed to make one step towards my goal.
I now use the email library to make a multipart message and attach the binary files to the message. However, I can't seem to figure out how to pass that message directly to the nntplib post method.
I have to first write a temporary file, then read it back in to the nntplib method. There has to be a way to do this all in memory... any suggestions?
You have to MIME-encode your post: a binary post in an NNTP newsgroup is like a mail with an attachment.
The file has to be encoded in ASCII, generally using the base64 encoding, then the encoded file is packaged into a multipart MIME message and posted.
Have a look at the email module: it implements all that you want.
I encourage you to read RFC 3977, which is the official standard defining the NNTP protocol.
For the second part of your question:
Use StringIO to build a fake file object from a string (the post() method of nntplib accepts open file objects).
email.Message objects have an as_string() method to retrieve the content of the message as a plain string.
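A rough sketch of the in-memory approach, written for Python 3, where the analogous pieces are `as_bytes()` and `io.BytesIO` (the assumption being that Python 3's nntplib wants a binary file object; note that nntplib was removed from the standard library in Python 3.13). The addresses, newsgroup, payload and the commented-out `server` object are placeholders.

```python
import io
from email.message import EmailMessage

# Placeholder binary payload standing in for the file to post.
payload = bytes(range(256))

msg = EmailMessage()
msg["From"] = "poster@example.com"
msg["Newsgroups"] = "alt.test"
msg["Subject"] = "binary post"
msg.set_content("File attached.")
# The attachment is base64-encoded by the email package.
msg.add_attachment(payload, maintype="application",
                   subtype="octet-stream", filename="data.bin")

# Wrap the serialized message in a file-like object; no temp file needed.
article = io.BytesIO(msg.as_bytes())

# With an already connected nntplib.NNTP instance (hypothetical `server`):
# server.post(article)
```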