I am currently using Python and the exchangelib module to build a macro that shows emails that haven't been replied to.
Background for the macro:
A support group of 3 people receives a lot of emails from customers every day at "support#abc.com".
One of the 3 people replies using the same address, "support#abc.com", as the sender.
Because of the high daily volume and the fact that 3 people share the same address to respond, human error happens from time to time, and some emails stay unanswered.
What I would like to do is use the following symbol as the sign that an email has been replied to.
I could not figure out what the attribute for it is called.
I have compared all attributes for the second and the third emails side by side. I was expecting that the second email has a certain boolean attribute X with value "True" while the third email "False" (or vice versa):
Does such a boolean attribute exist? If not, how can my web browser show the symbol in my first screenshot?
If it does not exist, how would you solve this?
An alternative would be to require that every reply from "support#abc.com" is sent not only to the customer but also to "support#abc.com" itself, as CC or as a normal recipient.
After that I would just need to read the attribute "conversation_id" and compare it to earlier emails.
I don't like this alternative because the CC adds a new step to "the solution" that is itself prone to human error.
Any inputs would be welcome.
Thank you in advance.
I don't know of any field on the Message object in EWS that tells you directly whether that message has a reply.
I think your best bet is to use the conversation_id of the message and check your Sent folder for that conversation_id. I believe that's what OWA does - messages where only one message is known with that conversation ID will not get the "replied" icon.
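The conversation_id comparison described above could be sketched roughly as follows. This is only a client-side sketch: find_unanswered is a hypothetical helper, and it assumes each item exposes a conversation_id object with an id attribute, as exchangelib messages normally do. How the items are fetched (shown in the comments) is also an assumption about your setup.

```python
def find_unanswered(inbox_items, sent_items):
    """Return inbox items whose conversation has no message in the Sent folder."""
    # Collect the conversation IDs of everything that was sent.
    replied_ids = {
        item.conversation_id.id
        for item in sent_items
        if item.conversation_id is not None
    }
    # An inbox item counts as unanswered when its conversation never shows up in Sent.
    return [
        item
        for item in inbox_items
        if item.conversation_id is None
        or item.conversation_id.id not in replied_ids
    ]


# Possible usage with exchangelib (assumes `account` is an already configured Account):
# inbox_items = account.inbox.all().only('subject', 'conversation_id')
# sent_items = account.sent.all().only('conversation_id')
# for item in find_unanswered(inbox_items, sent_items):
#     print(item.subject)
```

Fetching only the fields that are needed keeps the comparison cheap, but for a large mailbox you would probably still want to restrict both queries to a recent date range.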
Related
I'm currently working on a project and I have chosen to use Gmail for sending and receiving emails. I want to be able to send an email, have a user reply to it, and parse their response. The response can be any number of lines (so something like response.split('\n')[0] won't work). It should then be able to reply directly to that email thread.
I've been following the googleapiclient tutorials, but they leave a lot to be desired. However, I've managed to read email threads using:
service.users().threads().get(userId='me', id=thread_id).execute()
where thread_id is (predictably) the ID of the email thread (which I find elsewhere). In the large dict returned by this, there is a section of base64 data which contains the content of the email. This was the only place I could find the actual data for the response. Unfortunately, I get this when it is decoded:
b'This is my response from my phone\r\n\r\nOn Sat, 28 Nov 2020, 8:40 PM , <myemail#gmail.com>\r\nwrote:\r\n\r\n> This is sent from the python script\r\n>\r\n'
This is all the data in the thread, but I only want the response, and there is no obvious way to split this to get only the data I need. The best I can think of is to parse out anything of the form On <date>, <time>, but that could lead to problems. There must be another way to extract only This is my response from my phone and no other data.
Once I get the response, I want to parse it and reply with an appropriate response based on the contents of the message. I would prefer to reply directly to the thread, rather than starting a new one. Unfortunately, all the Google documentation says is:
If you're trying to send a reply and want the email to thread, make sure that:
The Subject headers match
The References and In-Reply-To headers follow the RFC 2822 standard.
The documentation provides this code (with some minor modifications by me) for sending an email:
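import base64
from email.mime.text import MIMEText

def create_message(sender, to, subject, message_text):
    # Build a plain-text MIME message and encode it the way the Gmail API expects.
    message = MIMEText(message_text)
    message['to'] = to
    message['from'] = sender
    message['subject'] = subject
    return {'raw': base64.urlsafe_b64encode(message.as_bytes()).decode()}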
Sending a reply with the same subject line is pretty straightforward (message['subject'] = same_subject_as_before), but I don't even know where to start with the References and In-Reply-To headers. How do I set these?
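As a minimal sketch of how those two headers could be set: both normally carry the Message-ID of the message being answered, and the Gmail API additionally accepts a threadId next to raw. The function name create_reply and its parameters are hypothetical. original_message_id is assumed to be the value of the Message-ID header of the email you are replying to, and references the value of its References header, if it has one.

```python
import base64
from email.mime.text import MIMEText


def create_reply(sender, to, subject, message_text,
                 original_message_id, thread_id, references=''):
    message = MIMEText(message_text)
    message['to'] = to
    message['from'] = sender
    message['subject'] = subject
    # In-Reply-To names the message being answered.
    message['In-Reply-To'] = original_message_id
    # References lists the earlier messages of the thread, ending with that one.
    message['References'] = f'{references} {original_message_id}'.strip()
    return {
        'raw': base64.urlsafe_b64encode(message.as_bytes()).decode(),
        'threadId': thread_id,  # asks Gmail to file the reply in this thread
    }
```

The returned dict would be passed as the body of the send call, in the same way as the result of create_message.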
Why is this hard?
You are trying to use e-mail for something it simply wasn't originally designed for. My impression is you want the e-mail response to contain structured data, but e-mail text lacks any well-defined structure. It also depends on which e-mail client the other user has, and whether they send HTML e-mail or not.
This is usually easy for a human to see, but difficult for a computer, which suggests that machine learning might be the best strategy if you want higher reliability. Whatever solution you choose, it's not going to be 100% reliable.
E-mail can be plain text or HTML, or both.
There is no well-defined structure to separate replies from the original text. Wikipedia lists a few different "posting styles".
In the old days when "Netiquette" was still cool, putting your reply on top ("top-posting") was considered bad practice, and new Internet users were told by old folks to avoid top-posting. Some users still reply below or interleaved with the original text.
The reply line (e.g. "On DATE, EMAIL wrote:" or "-------- Original Message --------") will be different, depending on which e-mail client is used, what language that client is set to, and the user's own preferences.
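To illustrate why this is fragile, here is a heuristic sketch that handles only the English "On ... wrote:" reply line and ">"-quoted lines. The pattern and the function name strip_quoted_reply are assumptions made for illustration; other clients, other languages, and bottom-posted or interleaved replies will defeat it.

```python
import re

# Matches a reply line such as "On Sat, 28 Nov 2020, 8:40 PM, <someone> wrote:",
# including the case where the client wraps it over two lines.
REPLY_HEADER = re.compile(r"^On\b.{0,200}?\bwrote:\s*$", re.MULTILINE | re.DOTALL)


def strip_quoted_reply(body: str) -> str:
    text = body.replace("\r\n", "\n")
    match = REPLY_HEADER.search(text)
    if match:
        # Keep only what was written above the reply line (top-posting).
        text = text[:match.start()]
    # Drop any remaining quoted lines.
    lines = [line for line in text.split("\n") if not line.lstrip().startswith(">")]
    return "\n".join(lines).strip()
```

A body whose own text contains a line starting with "On" followed shortly by "wrote:" would be cut too early, which is one reason none of these approaches is fully reliable.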
Using a text delimiter
One class of software that faces a problem similar to the one you describe is customer service applications, which allow operators to use e-mail for communication. A common strategy is to inject some unique text into your templates for outgoing e-mail. For example, Zendesk uses a text "delimiter" such as:
##- Please type your reply above this line -##
This serves two purposes: it tells users to top-post, and it provides a separator to cut out most of the irrelevant text.
If you first handle any HTML encoding, you should be able to split the message by such a text delimiter. It's not perfect, but it usually works.
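A minimal sketch of that split, assuming the body has already been converted to plain text. The helper name text_above_delimiter is hypothetical, and the delimiter is the Zendesk-style example quoted above.

```python
DELIMITER = "##- Please type your reply above this line -##"


def text_above_delimiter(body: str, delimiter: str = DELIMITER) -> str:
    # partition() splits at the first occurrence of the delimiter;
    # `found` is empty when the delimiter is missing.
    head, found, _ = body.partition(delimiter)
    return head.strip() if found else body.strip()
```

Anything the mail client inserts above the delimiter, such as its own "On ... wrote:" line or a ">" quote prefix, is still part of the result, so some extra cleanup is usually needed.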
Use products made by others
There are some open source options, such as:
https://github.com/zapier/email-reply-parser
And I found a commercial product, SigParser, which seems to use a machine learning model that they've trained very carefully:
https://sigparser.com/developers/extract-reply-chains-from-emails/
They also explain some of the challenges of parsing e-mail text into structured data.
I want to retrieve the body (text only) of emails using Python's imaplib and email packages.
As per this SO thread, I'm using the following code:
mail = email.message_from_string(email_body)
bodytext = mail.get_payload()[0].get_payload()
It works fine in some cases, but sometimes I get a response similar to the following:
[<email.message.Message instance at 0x0206DCD8>, <email.message.Message instance at 0x0206D508>]
You are assuming that messages have a uniform structure, with one well-defined "main part". That is not the case; there can be messages with a single part which is not a text part (just an "attachment" of a binary file, and nothing else) or it can be a multipart with multiple textual parts (or, again, none at all) and even if there is only one, it need not be the first part. Furthermore, there are nested multiparts (one or more parts is another MIME message, recursively).
In so many words, you must inspect the MIME structure, then decide which part(s) are relevant for your application. If you only receive messages from a fairly static, small set of clients, you may be able to cut some corners (at least until the next upgrade of Microsoft Plague hits) but in general, there simply isn't a hierarchy of any kind, just a collection of (not necessarily always directly related) equally important parts.
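Inspecting the MIME structure could look something like the sketch below, which walks every part and keeps the text/plain ones that are not attachments. The function name extract_plain_text is hypothetical, and deciding that "all non-attachment text/plain parts" is the body is an assumption that may not fit your application.

```python
import email
from email.message import Message


def extract_plain_text(msg: Message) -> list:
    """Collect the decoded text of every text/plain part that is not an attachment."""
    texts = []
    # walk() visits the message itself and all nested parts, depth first.
    for part in msg.walk():
        if part.get_content_type() != "text/plain":
            continue  # skips multipart containers, HTML and binary parts
        if part.get_content_disposition() == "attachment":
            continue
        payload = part.get_payload(decode=True)  # undoes base64 / quoted-printable
        if payload is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        texts.append(payload.decode(charset, errors="replace"))
    return texts


# Possible usage:
# mail = email.message_from_string(email_body)
# bodytext = "\n".join(extract_plain_text(mail))
```

For a message with no text/plain part at all the list is empty, so the caller still has to decide what to do in that case (for example, fall back to an HTML part).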
The main problem in my case is that a replied or forwarded message shows up as a message instance in the bodytext.
Solved my problem using the following code:
bodytext = mail.get_payload()[0].get_payload()
if isinstance(bodytext, list):
    # Nested parts (e.g. replied or forwarded messages) come back as a list of Message objects.
    bodytext = ','.join(str(v) for v in bodytext)
My external lib: https://github.com/ikvk/imap_tools
from imap_tools import MailBox

# get list of email bodies from INBOX folder
with MailBox('imap.mail.com').login('test#mail.com', 'pwd', 'INBOX') as mailbox:
    bodies = [msg.text or msg.html for msg in mailbox.fetch()]
Maybe this post (of mine) can be of help. I receive a newsletter with prices of different kinds of oil in the US. I fetch emails in Gmail whose title matches a given pattern, then I extract the prices from the mail body using a regex. So I have to access the mail body for the last n emails whose title matches that pattern.
I am also using email.message_from_string(): msg = email.message_from_string(response_part[1])
so maybe it gives you a concrete example of how to use the methods in this Python library.
I'm writing a monitoring solution using python3 with exchangelib and trying to count messages in our team's mailbox. One of the criteria: recipient list must contain specific email address.
When I use filter() with the author or subject arguments, the script works fine and returns correct results.
But when I try to filter by to_recipients or to_recipients__contains (which is a list-type field), the script throws an exception:
ValueError: EWS does not support filtering on field 'to_recipients'
Is there a way to filter the mailbox by recipient email_address without fetching all messages and then filtering them on the client side?
[exchangelib maintainer here]
I don't think there is. You could try to flip the is_searchable flag on that field and search anyway, but I never could get filtering to work in my tests. I can't remember if it throws server errors, returns all items anyway, or returns an empty list.
I'm happy to accept patches if you do find a solution.
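Until a server-side filter is known to work, the fallback is the client-side check the question hoped to avoid. A rough sketch follows, with hypothetical helper names; it assumes each item has a to_recipients list (possibly None) whose entries carry an email_address attribute, as exchangelib Mailbox objects do.

```python
def has_recipient(item, address):
    """True if `address` is among the item's To recipients (case-insensitive)."""
    wanted = address.lower()
    return any(
        (mailbox.email_address or "").lower() == wanted
        for mailbox in (item.to_recipients or [])
    )


def count_with_recipient(items, address):
    return sum(1 for item in items if has_recipient(item, address))


# Possible usage with exchangelib (assumes `account` is a configured Account):
# items = account.inbox.all().only('to_recipients')
# print(count_with_recipient(items, 'team@example.com'))
```

Restricting the query to the to_recipients field with only() at least keeps the amount of data transferred per message small.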
I'd like to implement a way to check an outgoing email in django if it has a high spam score by way of clicking a button to read the email contents. This way, I could modify the email to be less than the spamassassin score of 5 or 10 (something like aweber's implementation) prior to sending it to my contacts.
Any help or pointers are greatly appreciated.
I'm doing a bit of an experiment in Python. I'm making a script which checks an RSS feed for new items and then sends the title and link of the items via email. I've got the script to work to a certain level: when it runs, it takes the link and title of the newest item and emails them, regardless of whether it has already emailed that item or not. I need to add 2 things: a way to get multiple items at once (and email those, one by one), and a way to check whether they have been sent already. How would I do this? I'm using feedparser, and this is what I've got so far:
d = feedparser.parse('http://feedparser.org/docs/examples/rss20.xml')
link = d.entries[0].link
title = d.entries[0].title
And then a couple of lines which send an email with "link" and "title" in there. I know I'd need to use the ETag, but I haven't been able to work out how. And how would I send the emails one by one?
For the feed parsing part, you could consider following the advice given in this question: How to detect changed and new items in an RSS feed?. Basically, you could hash the contents of each entry and use that as an ID.
For instance, on the first run of your program it will calculate the hash of each entry, store that hash, and send these new entries by mail. On its next run, it will rehash each entry's content and compare those hashes with the ones found before (you should use some sort of database for this, or at least an in-memory dictionary or list, while developing, holding the entries already parsed and sent). If your program finds hashes that were not generated on previous runs, it will assemble a new email and send it with the "new" entries.
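The hashing idea could be sketched like this. The helper names are hypothetical, and hashing only the title and link is an assumption (you may want to include the summary as well). Entries are assumed to behave like dicts, which feedparser entries do.

```python
import hashlib


def entry_hash(entry):
    # Hash the fields that identify an entry; title and link are assumed to be enough.
    content = "\n".join([entry.get("title", ""), entry.get("link", "")])
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def select_new_entries(entries, seen_hashes):
    """Return entries not seen before and record their hashes in `seen_hashes`."""
    new_entries = []
    for entry in entries:
        digest = entry_hash(entry)
        if digest not in seen_hashes:
            seen_hashes.add(digest)
            new_entries.append(entry)
    return new_entries


# Possible usage with feedparser:
# d = feedparser.parse('http://feedparser.org/docs/examples/rss20.xml')
# for entry in select_new_entries(d.entries, seen_hashes):
#     ...  # send one email per entry
```

Here seen_hashes is a plain in-memory set; to survive between runs it would have to be saved to and loaded from a file or database.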
As for your email assembling part, the question Sending HTML email in Python could help. Just make sure to send a text only and a html version.
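A sketch of building one such message per feed item with the standard library, containing both a text-only and an HTML version. The function name build_item_email and the addresses in the usage comment are placeholders, not part of the original code.

```python
import html
from email.message import EmailMessage


def build_item_email(sender, recipient, title, link):
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = title
    # Plain-text version first ...
    msg.set_content(f"{title}\n{link}\n")
    # ... then the HTML alternative; the message becomes multipart/alternative.
    safe_title = html.escape(title)
    safe_link = html.escape(link, quote=True)
    msg.add_alternative(f'<p><a href="{safe_link}">{safe_title}</a></p>', subtype="html")
    return msg


# Possible usage, one email per entry (assumes an SMTP server on localhost):
# import smtplib
# with smtplib.SMTP("localhost") as smtp:
#     for entry in d.entries:
#         smtp.send_message(build_item_email("me@example.com", "you@example.com",
#                                            entry.title, entry.link))
```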
For the simplest method see the python smtplib documentation example. (I won't repeat the code here.) It's all you need for basic email sending.
For nicer/more complicated email content also look into python's email module, of course.