How hard is it to build an Email client? - Python

How hard is it to build an Email client? - Python - python

I'm venturing in unknown territory here...
I am trying to work out how hard it could be to implement an Email client using Python:
Email retrieval
Email sending
Email formatting
Email rendering
Also I'm wondering if all protocols are easy/hard to support e.g. SMTP, IMAP, POP3, ...
Hopefully someone could point me in the right direction :)

The Python language does offer raw support for the needed protocols in its standard library. Properly using then, and, properly parsing and assembling a "modern day" e-mail message, however can be tough to do.
Also, you didn't say if you want to create a graphical interface for your e-mail client -- if you want to have a proper graphical interface -- up to the point of being usable, it is quite a lot of work.
Local e-mail storage would be the easier part - unless you want to properly implement an mbox file format RFC-4155 so that other software can easily read/write the messgaes you have fetched, you can store them in as Python Objects using an ORM or an Object Oriented database, such as ZODB, or MongoDB.
If you want more than a toy e-mail app, you will have a lot of work - properly encoding e-mail headers, for example, server authentication and secure authentication and transport layers, decoding of the e-mail text body itself for non ASCII messages. Although the modules on the Python standard library do implement a lot of that, their documentation falls short on examples - and a complete e-mail client would have to use all of then.
Certainly the place to start an e-mail client, even a toy one, would be taking a look on the most recent RFC's for e-mail (and you will have to pick then from here http://www.ietf.org/rfc/rfc-index since just looking for "email rfc" on google gives a poor result).

I think you will find much of the clients important parts prepackaged:
Email retrieval - I think that is covered by many of the Python libraries.
Email sending - This would not be hard and it is most likely covered as well.
Email formatting - I know this is covered because I just used it to parse single and multipart emails for a client.
Email rendering - I would shoot for an HTML renderer of some sort. There is a Python interface to the renderer from the Mozilla project. I would guess there are other rendering engines that have python interfaces as well. I know wxWidgets has some simple HTML facilities and would be a lot lighter weight. Come to think about it the Mozilla engine may have a bunch of the other functions you would need as well. You would have to research each of the parts.
There is lot more to it than what is listed above. Like anything worth while it won't be built in a day. I would lay out precisely what you want it to do. Then start putting together a prototype. Just build a simple framework that does basic things. Like only have it support the text part of a message with no html. Then build on that.
I am amazed at the wealth of coding modules available with Python. I needed to filter html email messages, parse stylesheets, embed styles, and whole host of other things. I found just about every function I needed in a Python library somewhere. I was especially happy when I found out that some css sheets are gzipped that there was a module for that!
So if you are serious about then dig in. You will learn a LOT. :)

I have made two libraries that solve some of those problems easily:
Sending emails: Red Mail (SMTP)
Receiving emails: Red Box (IMAP)
Here is a short example of both:
from redbox import EmailBox
from redmail import EmailSender
USERNAME = "me#example.com"
PASSWORD = "<PASSWORD>"
box = EmailBox(
host="imap.example.com",
port=993,
username=USERNAME,
password=PASSWORD
)
sender = EmailSender(
host="smtp.example.com",
port=587,
username=USERNAME,
password=PASSWORD
)
Then you can send emails:
email.send(
subject='email subject',
sender="me#example.com",
receivers=['you#example.com'],
text="Hi, this is an email.",
html="""
<h1>Hi,</h1>
<p>this is an email.</p>
""",
attachments={
'data.csv': Path('path/to/file.csv'),
'raw_file.html': '<h1>Just some HTML</h1>',
}
)
Or read emails:
from redbox.query import UNSEEN, FROM
# Select an email folder
inbox = box["INBOX"]
# Search and process messages
for msg in inbox.search(UNSEEN & FROM('they#example.com')):
# Process the message
print(msg.headers)
print(msg.from_)
print(msg.to)
print(msg.subject)
print(msg.text_body)
print(msg.html_body)
# Set the message as read/seen
msg.read()
Red Box fully supports logical operations using the query language if you need complex logical operations. You can also easily access various parts of the messages.
Links, Red Mail:
Source code
Documentation
Links, Red Box:
Source code
Documentation

If I were you, I'd check out the source code of existing email-clients to get an idea: thunderbird, sylpheed-claws, mutt...
Depending on the set of features you want to support, it is a big project.

Depends to what level you want to build the client. You can quickly whip something up with libraries like smtplib for handling conection/data. And tk for a GUI. But again it all depends on the level of finish your after.
A quick basic tool for yourself: Easy. (With libraries)
Writing a full-feutured email client: Hard.
Instead of using a library, you can also find an open source project you can contribute to. I'd recommend having a look at Mailpile

Related

Trigger an email when some a certain file becomes available in some location

newbie here.
I have been learning Python for 3 months now because my team uses it to handle big data. I am almost done fully automating the end-to-end process of producing our monthly financials, except for one crucial step:
I need to trigger an email when the extract becomes available.
Basically, I want to do this:
if file F exists in folder Downstream:
Then send email to self.
I will then confirm running the entire process by replying some code word to this email.
TIA!

You may use Python's email library but usually, the examples are not really straightforward and simple especially if you are new. For this reason, I actually made a package that most likely is enough for your needs regarding sending emails.
Just pip install it:
pip install redmail
Then to solve your problem:
from pathlib import Path
from redmail import EmailSender
email = EmailSender(
host="<YOUR SMTP SERVER>",
port=1,
user_name="you#example.com", # If not needed, don't pass this
password="<YOUR PASSWORD>" # If not needed, don't pass this
)
# Check if file exists
if Path("path/to/file.csv").is_file():
# File does exists, send the email
email.send(
receivers=["you#example.com"],
subject="File is now found",
)
If you want to include a message, pass text and/or html parameters and write the email message as a string. You may also set the receivers as a default (attribute of the email instance) if repeatedly send you an email in different sections of your code.
Red Mail is well tested and well documented open-source library with features such as including attachments, embedding images, templating text or HTML and preformatted errors.
Documentation: https://red-mail.readthedocs.io/en/latest/
Source code: https://github.com/Miksus/red-mail
Note that pathlib is from standard library. It is a handy library for manipulating file paths. You could also use os.path.exists if you prefer that.

Parse email and send reply using googleapiclient in Python

I'm currently working on a project and I have chosen to use Gmail for sending and receiving emails. I want to be able to send an email, have a user reply to it, and parse their response. The response can be any number of lines (so something like response.split('\n')[0] won't work). It should then be able to reply directly to that email thread.
I've been following the googleapiclient tutorials, but they leave a lot to be desired. However, I've managed to read email threads using:
service.users.threads().get(userId='me', id=thread_id).execute()
where thread_id is (predictably) the ID of the email thread (which I find elsewhere). In the large dict returned by this, there is a section of base64 data which contains the content of the email. This was the only place I could find the actual data for the response. Unfortunately, I get this when it is decoded:
b'This is my response from my phone\r\n\r\nOn Sat, 28 Nov 2020, 8:40 PM , <myemail#gmail.com>\r\nwrote:\r\n\r\n> This is sent from the python script\r\n>\r\n'
This is all the data in the thread, however, I only want the response as there is clearly no way to split this to get only the data I need. The best I can think of is to parse out anything of the form On <date>, <time>, but that could lead to problems. There must be another way to extract only This is my response from my phone and no other data.
Once I get the response, I want to parse it and reply with an appropriate response based on the contents of the message. I would prefer to reply directly to the thread, rather than starting a new one. Unfortunately, all the Google documentation says is:
If you're trying to send a reply and want the email to thread, make sure that:
The Subject headers match
The References and In-Reply-To headers follow the RFC 2822 standard.
The documentation provides this code (with some minor modifications by me) for sending an email:
def create_message(sender, to, subject, message_text):
message = MIMEText(message_text)
message['to'] = to
message['from'] = sender
message['subject'] = subject
return {'raw': base64.urlsafe_b64encode(message.as_bytes()).decode()}
Sending a reply with the same subject line is pretty straight forward (message['subject'] = same_subject_as_before), but I don't even know where to start with the References and In-Reply-To headers. How do I set these?

Why is this hard?
You are trying to use e-mail for something it simply wasn't originally designed for. My impression is you want the e-mail response to contain structured data, but e-mail text lacks any well-defined structure. It also depends on which e-mail client the other user has, and whether they send HTML e-mail or not.
This is usually easy for a human to see, but difficult for a computer. Which suggests that Machine Learning might be the best strategy if you want higher reliability. Whatever solution you choose, it's not going to be 100% reliable.
E-mail can be plain text or HTML, or both.
There is no well-defined structure to separate replies from the original text. Wikipedia lists a few different "posting styles".
In the old days when "Netiquette" was still cool, putting your reply on top ("top-posting") was considered bad practice, and new Internet users were told by old folks to avoid top-posting. Some users still reply below or interleaved with the original text.
The reply line (e.g. "On DATE, EMAIL wrote:" or "-------- Original Message --------") will be different, depending on which e-mail client is used, what language that client is set to, and the user's own preferences.
Using a text delimiter
A class of software which faces a similar problem as the one you describe is customer service applications, which allow operators to use e-mail for communication. A common strategy is to inject some unique text in your templates for outgoing e-mail. For example, Zendesk uses a text "delimiter" such as:
##- Please type your reply above this line -##
This serves two purposes; it tells users to top-post, and it provides a separator to cut out most of the irrelevant text.
If you first handle any HTML encoding, you should be able to split the message by such a text delimiter. It's not perfect, but it usually works.
Use products made by others
There are some open source options, such as:
https://github.com/zapier/email-reply-parser
And I found a commercial product, SigParser, which seems to use a machine learning model that they've trained very carefully:
https://sigparser.com/developers/extract-reply-chains-from-emails/
They also explain some of the challenges of parsing e-mail text into structured data.

Is there a rules based IMAP email processing engine?

In the past, I have written a script using the Python IMAP library to move certain emails in my Gmail Inbox that matched a certain pattern into the SPAM folder.
I would like to set up more rules like, "archive mails from newsletter.com after two weeks".
Since this seems to be a common use case, I was wondering whether anyone had written a more generic tool to implement rules based email processing. I'm not looking for natural English rules but something a bit easier to configure than writing code.

I've used http://fdm.sourceforge.net/ to do this with local mboxes. Section 6.4 of the page I linked says that you can also have an action be on a remote IMAP(S) server.
Mutt also supports connecting directly to an IMAP server and has powerful regex-based tagging and actions based on tags.

I'm searching for a messaging platform (like XMPP) that allows tight integration with a web application

At the company I work for, we are building a cluster of web applications for collaboration. Things like accounting, billing, CRM etc.
We are using a RESTfull technique:
For database we use CouchDB
Different applications communicate with one another and with the database via http.
Besides, we have a single sign on solution, so that when you login in one application, you are automatically logged to the other.
For all apps we use Python (Pylons).
Now we need to add instant messaging to the stack.
We need to support both web and desktop clients. But just being able to chat is not enough.
We need to be able to achieve all of the following (and more similar things).
When somebody gets assigned to a task, they must receive a message. I guess this is possible with some system daemon.
There must be an option to automatically group people in groups by lots of different properties. For example, there must be groups divided both by geographical location, by company division, by job type (all the programers from different cities and different company divisions must form a group), so that one can send mass messages to a group of choice.
Rooms should be automatically created and destroyed. For example when several people visit the same invoice, a room for them must be automatically created (and they must auto-join). And when all leave the invoice, the room must be destroyed.
Authentication and authorization from our applications.
I can implement this using custom solutions like hookbox http://hookbox.org/docs/intro.html
but then I'll have lots of problems in supporting desktop clients.
I have no former experience with instant messaging. I've been reading about this lately. I've been looking mostly at things like ejabberd. But it has been a hard time and I can't find whether what I want is possible at all.
So I'd be happy if people with experience in this field could help me with some advice, articles, tales of what is possible etc.

Like frx suggested above, the StropheJS folks have an excellent book about web+xmpp coding but since you mentioned you have no experience in this type of coding I would suggest talking to some folks who have :) It will save you time in the long run - not that I'm saying don't try to implement what frx outlines, it could be a fun project :)
I know of one group who has implemented something similar and chatting with them would help solidify what you have in mind: http://andyet.net/ (I'm not affiliated with them at all except for the fact that the XMPP dev community is small and we tend to know each other :)

All goals could be achieved with ejabberd, strophe and little server side scripting
When someone gets assigned to task, server side script could easily authenticate to xmpp server and send message stanza to assigned JID. That its trivial task.
To group different people in groups, it is easily can be done from web chat app if those user properties are stored somewhere. Just join them in particular multi user chat room after authentication.
Ejabberd has option to automatically create and destroy rooms.
Ejabberd has various authorization methods including database and script auth
You could take look at StropheJS library, they have great book (paperback) released. Really recommend to read this book http://professionalxmpp.com/

How to implement Google Suggest in your own web application (e.g. using Python)

In my website, users have the possibility to store links.
During typing the internet address into the designated field I would like to display a suggest/autocomplete box similar to Google Suggest or the Chrome Omnibar.
Example:
User is typing as URL:
http://www.sta
Suggestions which would be displayed:
http://www.staples.com
http://www.starbucks.com
http://www.stackoverflow.com
How can I achieve this while not reinventing the wheel? :)

You could try with
http://google.com/complete/search?output=toolbar&q=keyword
and then parse the xml result.

I did this once before in a Django server. There's two parts - client-side and server-side.
Client side you will have to send out XmlHttpRequests to the server as the user is typing, and then when the information comes back, display it. This part will require a decent amount of javascript, including some tricky parts like callbacks and keypress handlers.
Server side you will have to handle the XmlHttpRequests which will be something that contains what the user has typed so far. Like a url of
www.yoursite.com/suggest?typed=www.sta
and then respond with the suggestions encoded in some way. (I'd recommend JSON-encoding the suggestions.) You also have to actually get the suggestions from your database, this could be just a simple SQL call or something else depending on your framework.
But the server-side part is pretty simple. The client-side part is trickier, I think. I found this article helpful
He's writing things in php, but the client side work is pretty much the same. In particular you might find his CSS helpful.

Yahoo has a good autocomplete control.
They have a sample here..
Obviously this does nothing to help you out in getting the data - but it looks like you have your own source and arent actually looking to get data from Google.

If you want the auto-complete to use date from your own database, you'll need to do the search yourself and update the suggestions using AJAX as users type. For the search part, you might want to look at Lucene.

That control is often called a word wheel. MSDN has a recent walkthrough on writing one with LINQ. There are two critical aspects: deferred execution and lazy evaluation. The article has source code too.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.