IMAPClient - How to get subject and sender? - python

I'm trying to make a simple program to check and show unread messages, but I have a problem getting the subject and the sender address.
For sender I've tried this method:
import email
m = server.fetch([a], ['RFC822'])
#a is variable with email id
msg = email.message_from_string(m[a], ['RFC822'])
print msg['from']
from email.utils import parseaddr
print parseaddr(msg['from'])
But it didn't work. I was getting this error:
Traceback (most recent call last):
File "C:/Users/ExampleUser/AppData/Local/Programs/Python/Python35-32/myprogram.py", line 20, in <module>
msg = email.message_from_string(m[a], ['RFC822'])
File "C:\Users\ExampleUser\AppData\Local\Programs\Python\Python35-32\lib\email\__init__.py", line 38, in message_from_string
return Parser(*args, **kws).parsestr(s)
File "C:\Users\ExampleUser\AppData\Local\Programs\Python\Python35-32\lib\email\parser.py", line 68, in parsestr
return self.parse(StringIO(text), headersonly=headersonly)
TypeError: initial_value must be str or None, not dict
I also used this:
print(server.fetch([a], ['BODY[HEADER.FIELDS (FROM)]']))
but the result was like:
defaultdict(<class 'dict'>, {410: {b'BODY[HEADER.FIELDS ("FROM")]': b'From: "=?utf-8?q?senderexample?=" <sender#example.com>\r\n\r\n', b'SEQ': 357}, 357: {b'SEQ': 357, b'FLAGS': (b'\\Seen',)}})
Is there a way to repair the first method, or make the result of second look like:
Sender Example <sender#example.com>
?
I also don't know how to get the email subject. I guess it works the same way as the sender, just with different arguments, so all I really need are those arguments.

You should start by reviewing various IMAP libraries which are available for Python and use one which fits your needs. There are multiple ways of fetching the data you need in IMAP (the protocol), and by extension also in Python (and its libraries).
For example, the most straightforward way of getting the data you need in IMAP the protocol is through fetching the ENVELOPE object. You will still have to perform decoding of RFC2047 encoding of the non-ASCII data (that's that =?utf-8?q?... bit that you're seeing), but at least it would save you from parsing RFC5322 header structure with multiple decades of compatibility syntax rules.
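If you stay with the header fetch from the question rather than ENVELOPE, the standard library can do both the header parsing and the RFC2047 decoding. This is only a minimal sketch: the raw bytes are assumed to look like the BODY[HEADER.FIELDS ...] value in the question, the Subject line and the @ in the address are made up for illustration, and decode_mime_words is a hypothetical helper name.

```python
from email import message_from_bytes
from email.header import decode_header, make_header
from email.utils import parseaddr

# Assumed sample: header bytes as returned for a request such as
# BODY[HEADER.FIELDS (FROM SUBJECT)]; the Subject value is invented.
raw = (b'From: "=?utf-8?q?senderexample?=" <sender@example.com>\r\n'
       b'Subject: =?utf-8?q?Hello_world?=\r\n\r\n')

def decode_mime_words(value):
    """Decode RFC2047 encoded words (=?charset?q?...?=) into a plain str."""
    return str(make_header(decode_header(value)))

msg = message_from_bytes(raw)           # bytes in, so no str/dict mix-up
name, addr = parseaddr(msg['From'])     # split display name and address
sender = "{} <{}>".format(decode_mime_words(name), addr)
subject = decode_mime_words(msg['Subject'])

print(sender)
print(subject)
```

Note that the fetch result is a dict keyed by message id and then by item name, so the bytes to parse would be something like m[a][b'BODY[HEADER.FIELDS ("FROM")]'], going by the output shown in the question.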

Parsing DNS RDATA using python

I'm trying to use Python to parse hex-formatted DNS RDATA values (which should be RFC1035-compliant) that are generated in audit logs from Windows DNS Server when a record is created or deleted. I've tried a couple of Python DNS modules and think I'm getting close with dnslib; however, all the documentation I find is for parsing a complete DNS packet captured from the network, including the question and answer headers and so on.
The audit log only provides the class type and the RDATA it stores in AD (Active Directory-integrated zone), so I figured I might be able to use the parse(buffer,length) method of the individual record type-classes to parse it, but so far all my attempts are failing.
Sample data:
Type = MX
RDATA = 0A0011737276312E636F6E746F736F2E636F6D2E
Which should be parsed to:
preference = 10
mx = srv1.contoso.com.
Latest attempt:
import dnslib
import binascii
mxrdata = binascii.unhexlify(b'0A0011737276312E636F6E746F736F2E636F6D2E')
b = dnslib.DNSBuffer(mxrdata)
mx = dnslib.MX.parse(b,len(b))
this fails with:
Traceback (most recent call last):
File "C:\Python37-32\lib\site-packages\dnslib\dns.py", line 1250, in parse
mx = buffer.decode_name()
File "C:\Python37-32\lib\site-packages\dnslib\label.py", line 235, in decode_name
(length,) = self.unpack("!B")
File "C:\Python37-32\lib\site-packages\dnslib\buffer.py", line 103, in unpack
data = self.get(struct.calcsize(fmt))
File "C:\Python37-32\lib\site-packages\dnslib\buffer.py", line 64, in get
(self.offset,self.remaining(),length))
dnslib.buffer.BufferError: Not enough bytes [offset=20,remaining=0,requested=1]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Python37-32\lib\site-packages\dnslib\dns.py", line 1254, in parse
(buffer.offset,e))
dnslib.dns.DNSError: Error unpacking MX [offset=20]: Not enough bytes [offset=20,remaining=0,requested=1]
Can anyone help me? Is it even possible using this module?
Your RDATA is not encoded quite correctly:
First, you specify the preference:
0A00
However, this is not 10 but 2560, because integers are encoded MSB first, not LSB first. So this should be
000A
Then, you try to encode the hostname here:
11737276312E636F6E746F736F2E636F6D2E
0x11 should be the length byte and the rest is the domain name srv1.contoso.com.. But this is not how encoding a hostname works. You have to encode each label separately with a length byte and terminate the hostname with a 0-length label. So this should be:
04 73727631 07 636F6E746F736F 03 636F6D 00
s r v 1 . c o n t o s o . c o m .
This adds up to:
mxrdata = binascii.unhexlify(b'000A047372763107636F6E746F736F03636F6D00')
Then the parser should succeed. So if you really get the RDATA in such an invalid format, you have to convert it first to make it RFC1035-compliant.
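The conversion described above can be sketched with the standard library alone. The input layout is an assumption inferred purely from the single sample in the question (little-endian 16-bit preference, one length byte, then the dotted name as text); other records in the audit log may be laid out differently. windows_mx_to_wire is a hypothetical helper name.

```python
import binascii
import struct

def windows_mx_to_wire(hex_rdata):
    """Convert the audit-log MX RDATA (layout assumed from one sample)
    into RFC1035 wire format: big-endian preference + length-prefixed labels."""
    raw = binascii.unhexlify(hex_rdata)
    (preference,) = struct.unpack('<H', raw[:2])   # assumed LSB-first in the log
    name_len = raw[2]                              # assumed single length byte
    name = raw[3:3 + name_len].decode('ascii')     # e.g. 'srv1.contoso.com.'

    wire = struct.pack('!H', preference)           # MSB first, as RFC1035 requires
    for label in name.rstrip('.').split('.'):
        encoded = label.encode('ascii')
        wire += bytes([len(encoded)]) + encoded    # one length byte per label
    wire += b'\x00'                                # zero-length root label ends the name
    return preference, name, wire

preference, name, wire = windows_mx_to_wire('0A0011737276312E636F6E746F736F2E636F6D2E')
print(preference, name, binascii.hexlify(wire).upper())
```

The resulting bytes are the same as the corrected mxrdata value above, so they can be wrapped in a dnslib.DNSBuffer and handed to the parser. If all you need is the preference and host name, the first half of the function already gives you those without dnslib.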

AttributeError: 'str' object has no attribute 'copy' when parsing Multipart email message

Python 3.6 email module crashes with this error:
Traceback (most recent call last):
File "empty-eml.py", line 9, in <module>
for part in msg.iter_attachments():
File "/usr/lib/python3.6/email/message.py", line 1055, in iter_attachments
parts = self.get_payload().copy()
AttributeError: 'str' object has no attribute 'copy'
The crash can be reproduced with this EML file,
From: "xxx#xxx.xx" <xxx#xxx.xx>
To: <xx#xxx.xx>
Subject: COURRIER EMIS PAR PACIFICA
MIME-Version: 1.0
Content-Type: multipart/mixed;
boundary="----=_Part_3181_1274694650.1556805728023"
Date: Thu, 2 May 2019 16:02:08 +0200
and this piece of minimal code:
from email import policy
from email.parser import Parser
from sys import argv
with open(argv[1]) as eml_file:
    msg = Parser(policy=policy.default).parse(eml_file)
for part in msg.iter_attachments():
    pass
I believe it has something to do with the Content-Type being multipart/mixed together with the email content being empty, which causes get_payload to return str. However, I am not sure whether such an EML is forbidden by the standard (I have many such samples), whether it is a bug in the email module, or whether I am using the code wrong.
If you change the policy to strict:
Parser(policy=policy.strict).parse(eml_file)
the parser raises email.errors.StartBoundaryNotFoundDefect, described in the docs as:
StartBoundaryNotFoundDefect – The start boundary claimed in the Content-Type header was never found.
If you parse the message with policy.default and inspect its defects afterwards, it contains two defects:
[StartBoundaryNotFoundDefect(), MultipartInvariantViolationDefect()]
MultipartInvariantViolationDefect – A message claimed to be a multipart, but no subparts were found. Note that when a message has this defect, its is_multipart() method may return false even though its content type claims to be multipart.
A consequence of the StartBoundaryNotFoundDefect is that the parser terminates parsing and sets the message payload to the body that has been captured so far - in this case, nothing, so the payload is an empty string, causing the exception that you are seeing when you run your code.
Arguably the fact that Python doesn't check whether payload is a list before calling copy() on it is a bug.
In practice, you have to handle these messages either by wrapping the iteration of attachments in a try/except, conditioning iteration on the contents of msg.defects, or parsing with policy.strict and discarding all messages that report defects.
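As a rough illustration of conditioning the iteration on msg.defects, here is a minimal sketch. The EML text is a shortened stand-in for the sample in the question (the address is made up), and safe_attachments is a hypothetical helper name, not part of the email package.

```python
from email import policy
from email.parser import Parser

# Shortened stand-in for the problematic message: a multipart/mixed
# Content-Type with a boundary, but no body at all.
EML = (
    'From: <a@example.com>\n'
    'MIME-Version: 1.0\n'
    'Content-Type: multipart/mixed;\n'
    ' boundary="----=_Part_1"\n'
    '\n'
)

def safe_attachments(msg):
    """Return the attachments, or an empty list for defective or
    non-multipart messages instead of raising AttributeError."""
    if msg.defects or not msg.is_multipart():
        return []
    return list(msg.iter_attachments())

msg = Parser(policy=policy.default).parsestr(EML)
print(msg.defects)
attachments = safe_attachments(msg)
print(attachments)
```

Skipping every message that reports any defect is deliberately conservative. If you still want attachments from messages with harmless defects, checking only is_multipart() may be enough.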

Soap Server raised fault: 'java.lang.NullPointerException'. How to debug?

I'm trying to call a SOAP webservice from the Dutch land register (WSDL here). I first tried doing that using the pysimplesoap library. Although I do get relevant xml back, pysimplesoap gives a TypeError: Tag: IMKAD_Perceel invalid (type not found) (I created a SO question about that here). Since I suspect this to be a bug in pysimplesoap I'm now trying to use the suds library.
In pysimplesoap the following returned correct xml (but as I said pysimplesoap gave a TypeError):
from pysimplesoap.client import SoapClient
client = SoapClient(wsdl='http://www1.kadaster.nl/1/schemas/kik-inzage/20141101/verzoekTotInformatie-2.1.wsdl', username=xxx, password=xxx, trace=True)
response = client.VerzoekTotInformatie(
Aanvraag={
'berichtversie': '4.7', # Refers to the schema version: http://www.kadaster.nl/web/show?id=150593&op=/1/schemas/homepage.html
'klantReferentie': 'MyReference1', # Refers to something we can set ourselves.
'productAanduiding': '1185', # a four-digit code referring to whether the response should be in "XML" (1185), "PDF" (1191) or "XML and PDF" (1057).
'Ingang': {
'Object': {
'IMKAD_KadastraleAanduiding': {
'gemeente': 'ARNHEM',
'sectie': 'AC',
'perceelnummer': '1234'
}
}
}
}
)
This produced the xml below:
<soap:Body>
<VerzoekTotInformatieRequest xmlns="http://www.kadaster.nl/schemas/kik-inzage/20141101">
<Aanvraag xmlns="http://www.kadaster.nl/schemas/kik-inzage/ip-aanvraag/v20141101">
<berichtversie xmlns="http://www.kadaster.nl/schemas/kik-inzage/ip-aanvraag/v20141101">4.7</berichtversie>
<klantReferentie xmlns="http://www.kadaster.nl/schemas/kik-inzage/ip-aanvraag/v20141101">ARNHEM-AC-1234</klantReferentie>
<productAanduiding xmlns="http://www.kadaster.nl/schemas/kik-inzage/ip-aanvraag/v20141101">1185</productAanduiding>
<Ingang xmlns="http://www.kadaster.nl/schemas/kik-inzage/ip-aanvraag/v20141101">
<Object xmlns="http://www.kadaster.nl/schemas/kik-inzage/ip-aanvraag/v20141101">
<IMKAD_KadastraleAanduiding xmlns="http://www.kadaster.nl/schemas/kik-inzage/ip-aanvraag/v20141101">
<gemeente xmlns="http://www.kadaster.nl/schemas/kik-inzage/ip-aanvraag/v20141101">ARNHEM AC</gemeente>
<sectie xmlns="http://www.kadaster.nl/schemas/kik-inzage/ip-aanvraag/v20141101">AC</sectie>
<perceelnummer xmlns="http://www.kadaster.nl/schemas/kik-inzage/ip-aanvraag/v20141101">5569</perceelnummer>
</IMKAD_KadastraleAanduiding>
</Object>
</Ingang>
</Aanvraag>
</VerzoekTotInformatieRequest>
</soap:Body>
So now I tried changing this code to use suds instead. So far I came up with this:
from suds.client import Client
client = Client(url='http://www1.kadaster.nl/1/schemas/kik-inzage/20141101/verzoekTotInformatie-2.1.wsdl', username='xxx', password='xxx')
Aanvraag = client.factory.create('ns3:Aanvraag')
Aanvraag.berichtversie = '4.7'
Aanvraag.klantReferentie = 'MyReference1'
Aanvraag.productAanduiding = '1185'
IMKAD_KadastraleAanduiding = client.factory.create('ns3:IMKAD_KadastraleAanduiding')
IMKAD_KadastraleAanduiding.gemeente = 'ARNHEM'
IMKAD_KadastraleAanduiding.sectie = 'AC'
IMKAD_KadastraleAanduiding.perceelnummer = '1234'
Object = client.factory.create('ns3:Object')
Object.IMKAD_KadastraleAanduiding = IMKAD_KadastraleAanduiding
Ingang = client.factory.create('ns3:Ingang')
Ingang.Object = Object
Aanvraag.Ingang = Ingang
result = client.service.VerzoekTotInformatie(Aanvraag)
which produces the following xml:
<ns2:Body>
<ns0:VerzoekTotInformatieRequest>
<ns0:Aanvraag>
<ns1:berichtversie>4.7</ns1:berichtversie>
<ns1:klantReferentie>MyReference1</ns1:klantReferentie>
<ns1:productAanduiding>1185</ns1:productAanduiding>
<ns1:Ingang>
<ns1:Object>
<ns1:IMKAD_KadastraleAanduiding>
<ns1:gemeente>ARNHEM</ns1:gemeente>
<ns1:sectie>AC</ns1:sectie>
<ns1:perceelnummer>1234</ns1:perceelnummer>
</ns1:IMKAD_KadastraleAanduiding>
</ns1:Object>
</ns1:Ingang>
</ns0:Aanvraag>
</ns0:VerzoekTotInformatieRequest>
</ns2:Body>
Unfortunately, this results in the server giving back a NullPointerException:
Traceback (most recent call last):
File "<input>", line 1, in <module>
result = client.service.VerzoekTotInformatie(Aanvraag)
File "/Library/Python/2.7/site-packages/suds/client.py", line 542, in __call__
return client.invoke(args, kwargs)
File "/Library/Python/2.7/site-packages/suds/client.py", line 602, in invoke
result = self.send(soapenv)
File "/Library/Python/2.7/site-packages/suds/client.py", line 649, in send
result = self.failed(binding, e)
File "/Library/Python/2.7/site-packages/suds/client.py", line 702, in failed
r, p = binding.get_fault(reply)
File "/Library/Python/2.7/site-packages/suds/bindings/binding.py", line 265, in get_fault
raise WebFault(p, faultroot)
WebFault: Server raised fault: 'java.lang.NullPointerException'
This error is of course terribly unhelpful. The error gives no hint whatsoever on what causes the NullPointer.
If I look at the differences between the xml which pysimplesoap and suds send over the wire, the xml by suds is missing a lot of xmlns definitions (although I don't know whether they are needed) and the names of the tags include prefixes with for example ns0:. I don't know if these differences are relevant, and I also don't know how I would make suds create the same xml as pysimplesoap.
Although the wsdl file of the service is public, the service itself is paid (€60 yearly + €3 for every successful request). So I guess it is hard/impossible for people reading this to reproduce the issue, and I can't really give out my user credentials here.
But since I'm really stuck on this issue, maybe someone can give me some tips on how to debug this? For example; how can I make suds create the same xml as pysimplesoap? Or how I can get more information on the nullpointer?
Any help is welcome!
This is not so much an answer as some advice from prior experience with Python and SOAP.
1. Find a good, established Java tool (a reference implementation for SOAP) for making SOAP queries from a given WSDL.
2. Make some typical queries that are interesting to you, and record what is being sent and received as templates.
3. Forget Python SOAP libraries and just use the templates to query the SOAP endpoint (there are many templating languages for Python).
4. If step 2 fails with the established Java tool, contact tech support of the service you are paying for.
Have you checked whether all those nice XSDs are really downloaded by Python SOAP clients?
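To illustrate the templating suggestion, here is a minimal sketch using only the standard library. The element names and namespaces are copied from the pysimplesoap request shown in the question; the SOAP 1.1 envelope wrapper is an assumption, so compare it against a recorded request before relying on it. render_request is a hypothetical helper name.

```python
from string import Template
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

# Namespaces copied from the pysimplesoap request in the question.
REQUEST_NS = "http://www.kadaster.nl/schemas/kik-inzage/20141101"
AANVRAAG_NS = "http://www.kadaster.nl/schemas/kik-inzage/ip-aanvraag/v20141101"

# Assumption: a SOAP 1.1 envelope; verify against a recorded request.
ENVELOPE = Template("""\
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <VerzoekTotInformatieRequest xmlns="$request_ns">
      <Aanvraag xmlns="$aanvraag_ns">
        <berichtversie>$berichtversie</berichtversie>
        <klantReferentie>$klant_referentie</klantReferentie>
        <productAanduiding>$product_aanduiding</productAanduiding>
        <Ingang>
          <Object>
            <IMKAD_KadastraleAanduiding>
              <gemeente>$gemeente</gemeente>
              <sectie>$sectie</sectie>
              <perceelnummer>$perceelnummer</perceelnummer>
            </IMKAD_KadastraleAanduiding>
          </Object>
        </Ingang>
      </Aanvraag>
    </VerzoekTotInformatieRequest>
  </soap:Body>
</soap:Envelope>
""")

def render_request(**fields):
    """Fill the template, XML-escaping every supplied value."""
    escaped = {key: escape(str(value)) for key, value in fields.items()}
    return ENVELOPE.substitute(request_ns=REQUEST_NS,
                               aanvraag_ns=AANVRAAG_NS, **escaped)

xml_body = render_request(
    berichtversie="4.7", klant_referentie="MyReference1",
    product_aanduiding="1185",
    gemeente="ARNHEM", sectie="AC", perceelnummer="1234",
)
root = ET.fromstring(xml_body)   # sanity check: the result is well-formed XML
print(xml_body)
```

The rendered string can then be POSTed to the service endpoint with whatever HTTP client you prefer. The child elements inherit the default namespace from Aanvraag, which is equivalent to the repeated xmlns attributes that pysimplesoap emitted.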

CGI with Python

I'm beginning to use CGI with Python.
After running the following piece of code:
#!c:\python34\python.exe
import cgi
print("Content-type: text/html\n\n") #important
def getData():
    formData = cgi.FieldStorage()
    InputUN = formData.getvalue('username')
    InputPC = formData.getvalue('passcode')
    TF = open("TempFile.txt", "w")
    TF.write(InputUN)
    TF.write(InputPC)
    TF.close()
if __name__ =="__main__":
    LoginInput = getData()
    print("cgi worked")
The following error occurs:
Traceback (most recent call last):
File "C:\xampp\htdocs\actual\loginvalues.cgi", line 21, in <module>
LoginInput = getData()
File "C:\xampp\htdocs\actual\loginvalues.cgi", line 16, in getData
TF.write(InputUN)
TypeError: must be str, not None
>>>
I'm trying to write the values, inputted in html, to a text file.
Any help would be appreciated :)
Your calls to getvalue() are returning None, meaning the form either didn't contain those fields, had them set to an empty string, or had them set by name only. By default, Python's cgi module ignores inputs that aren't set to a non-empty string.
Works for Python CGI:
mysite.com/loginvalues.cgi?username=myname&passcode=mypass
Doesn't work for Python CGI:
mysite.com/loginvalues.cgi?username=&passcode= (empty value(s))
mysite.com/loginvalues.cgi?username&passcode (Python requires the = part.)
To account for this, introduce a default value for when a form element is missing, or handle the None case manually:
TF.write('anonymous' if InputUN is None else InputUN)
TF.write('password' if InputPC is None else InputPC)
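The blank-value behaviour can be checked without a web server, because cgi.FieldStorage parses query strings with the same rules as urllib.parse. A small sketch follows; first_value is a hypothetical helper, and the query strings mirror the URLs above.

```python
from urllib.parse import parse_qs

def first_value(query, key, default):
    """Return the first value for key, or default if it is missing or blank."""
    values = parse_qs(query).get(key)
    return values[0] if values else default

print(first_value("username=myname&passcode=mypass", "username", "anonymous"))  # myname
print(first_value("username=&passcode=", "username", "anonymous"))              # anonymous
print(first_value("username&passcode", "username", "anonymous"))                # anonymous

# Blank values are only kept when explicitly requested.
print(parse_qs("username=&passcode=", keep_blank_values=True))
```

In the CGI script itself, formData.getvalue('username', 'anonymous') gives the same kind of fallback, since getvalue accepts a default as its second argument.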
As a note, passwords and other private login credentials should never be put in a URL. Over plain HTTP the URL is sent unencrypted, so anyone on the network(s) between you and your users can read it.
HTTPS does encrypt the path and query string in transit, but URLs still end up in server logs, browser history, and Referer headers, so send credentials in a POST body over HTTPS instead.

Python suds error creating object

Trying to work with the echosign SOAP API.
The wsdl is here: https://secure.echosign.com/services/EchoSignDocumentService14?wsdl
When I try to create certain objects, suds appears unable to find the type, even though it is listed in the output of print client:
import suds
url = "https://secure.echosign.com/services/EchoSignDocumentService14?wsdl"
client = suds.client.Client(url)
print client
Service ( EchoSignDocumentService14 ) tns="http://api.echosign"
Prefixes (10)
ns0 = "http://api.echosign"
ns1 = "http://dto.api.echosign"
ns2 = "http://dto10.api.echosign"
ns3 = "http://dto11.api.echosign"
ns4 = "http://dto12.api.echosign"
ns5 = "http://dto13.api.echosign"
ns15 = "http://dto14.api.echosign"
ns16 = "http://dto7.api.echosign"
ns17 = "http://dto8.api.echosign"
ns18 = "http://dto9.api.echosign"
Ports (1):
(EchoSignDocumentService14HttpPort)
Methods (45):
...
Types (146):
ns1:CallbackInfo
ns17:WidgetCreationInfo
Trimmed for brevity, but showing the namespaces and the 2 types I'm concerned with right now.
Trying to run WCI = client.factory.create("ns17:WidgetCreationInfo") generates this error:
client.factory.create("ns17:WidgetCreationInfo")
Traceback (most recent call last):
File "", line 1, in
File "build/bdist.macosx-10.7-intel/egg/suds/client.py", line 244, in create
suds.BuildError:
An error occured while building a instance of (ns17:WidgetCreationInfo). As a result
the object you requested could not be constructed. It is recommended
that you construct the type manually using a Suds object.
Please open a ticket with a description of this error.
Reason: Type not found: '(CallbackInfo, http://dto.api.echosign, )'
So it doesn't appear to be able to find the CallbackInfo type. Maybe it's because the ns prefix is missing there?
Again, figured it out 15 min after posting here.
suds has an option to cross-pollinate all the namespaces so they all import each others schemas. autoblend can be set in the constructor or using the set_options method.
suds.client.Client(url, autoblend=True)
Take a look at the WSDL: it seems that many definitions live under http://*.api.echosign, which suds cannot fetch.
Either update your /etc/hosts so that these malformed domain names can be resolved, or save the WSDL locally, modify it, and then use Client('file://...', ...) to create your suds client.
