Gmail IMAP is not returning uids

Gmail IMAP is not returning uids - python

I am connecting to my gmail account via IMAP to sync some of my emails and parse them. Sometimes I need to download again some emails because I did some kind of fix and now gmail is not returning me the uids of those emails in any way, here is some code to explain myself better:
typ, data = self.connection.uid('search', None, '(SINCE 14-Dec-2012 BEFORE 20-Dec-2012)')
17:05.55 > HJBM3 UID SEARCH (SINCE 14-Dec-2012 BEFORE 20-Dec-2012)
17:05.69 < * SEARCH
17:05.69 < HJBM3 OK SEARCH completed (Success)
('OK', [''])
I have a good bunch of emails on those dates including the ones I want to parse and it doesn't return anything, depending on the date it does return some uids so is not completely broken.
I decided to try if thunderbird synced correctly those emails and it got them no problem.
I am using the python 2.6 imaplib (version 2.58)

Maybe this will help someone so I'll answer it here:
I had in gmail this setting on:
When I changed it to "Do not limit" It worked like a charm.

Related

python win32com: Delete multiple emails in Outlook

I need to delete multiple email messages in Outlook from python via win32com module.
I understand there is a VBA method MailItem.Delete() available to win32com via COM and it works; but it is VERY VERY slow when deleting more than one email since one would have to delete emails sequentially ie loop over the MailItem collection of emails.
Is there any way to delete a selected collection of mailItems at once, something like MailItemCollection.DeleteAll()?
Also, if above is not possible; is it at all possible to delete many emails via multi-threaded approach ie divide the collection of mailItems into, let's say, 4 subsets; have 4 threads operate on those?
I figure since I can delete multiple emails in outlook via its GUI very fast, there has to be a way where I can do the same thing via COM API.

Not in OOM - MailItem.Delete or Items.Remove(Index) is all you get.
On the Extended MAPI level (C++ or Delphi, but not Python), you can delete multiple messages using IMAPIFolder.DeleteMessages (which takes a list of entry ids). Or you can use IMAPIFolder.EmptyFolder (deletes all messages in a folder).
If using Redemption (any language; I am its author) is an option, you can use RDOFolder2.EmptyFolder or RDOFolder.Items.RemoveMultiple. RDOFolder can be retrieved from RDOSession.GetRDOObjectFromOutlookObject if you pass Outlook's MAPIFolder object as a parameter.

On top of a great answer by #Dimitry I'll add a remark which may be important for you: if you start deleting from Items as you iterate over it, strange things may happen.
For example on my system the following Python code:
for mail in folder.Items:
mail.Delete()
as well as
for index, mail in enumerate(folder.Items, 1):
folder.Remove(index)
both remove only half of the items in the folder! The reason seems to be that Items uses a range of indices internally to provide an iterator so each time an element is deleted, the tail of the list is shifted by one...
To remove all items in the folder try:
for i in range(len(folder.Items)):
folder.Remove(1)
If you need to filter by a certain criterion consider first gathering EntryIDs and then deleting searching for ID:
ids = []
for i in range(len(folder.Items), 1):
if to_be_deleted(folder.Items[index]):
ids.append(index)
for id in ids:
outlook.GetEntryByID(id).Delete()
I imagine performance of that is even worse, though :c

Great answer from Dedalus above. Wanted to make a more concise version of the code:
import win32com.client
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
# Select main Inbox
inbox = outlook.GetDefaultFolder(6)
messages = inbox.Items
# Delete all messages from a specific sender
sender = 'myname#abc.com'
try:
for message in messages:
try:
s = message.sender
s = str(s)
if s == sender:
message.Delete()
except:
pass
except:
pass
You may not need two "trys" but I found it was more stable when applying the script to a long and heavily used inbox. Usually I combine this with a script that limits the message = inbox.Items to within a week so it doesn't do the entire inbox.

For me it worked by iterating the items in reverse.
Old:
for mail in folder.Items:
if 'whatever' in mail.Subject: # just a condition (optional)
mail.Delete()
New code:
for mail in reversed(folder.Items): # just tried deleting Items in reverse order
if 'whatever' in mail.Subject: # just a condition (optional)
mail.Delete()
Hope this helps someone.

Am I missing something? Neither Application nor NameSpace objects appear to have a GetEntryByID method, though the rest of what Dedalus pointed out was correct.
Namespace objects have a GetItemFromID method, and MailItem objects have a EntryID property which will uniquely identify them so long as they don't get reorganized into different folders.
Documentation: https://learn.microsoft.com/en-us/office/vba/outlook/how-to/items-folders-and-stores/working-with-entryids-and-storeids
My full solve:
import win32com.client
outlook = win32com.client.gencache.EnsureDispatch("Outlook.Application")
folders = outlook.GetNamespace("MAPI")
inbox= folders.GetDefaultFolder(6)
messages=inbox.Items
email_ids = []
folder_id = inbox.StoreID
# Here create a function to isolate/exclude. Below is just an example of filtering by a subject line.
email_subjects = ['Subj1','Subj2','Subj3']
for i in range(len(messages)):
if any(header in inbox.Items[i].Subject for header in email_subjects):
email_ids.append(inbox.Items[i].EntryID)
for id in email_ids:
folders.GetItemFromID(id, folder_id).Delete()

I've implemented an alternative solution in local Outlook, by moving email ítems from.inbox folder to deleted items folder or to an archive folder, by using VBA code or Outlook filter rules directly.
This way, I just mannualy empty the deleted items folder once a week (of course this periodic step can also be programmed).
I observed that this strategy can be more efficient instead of delete item per item using code (you mentioned the internal.indexes problem).

Is it possible to get recent 10 emails using Gmail-api?

For now, I can use gmail api to get all UNREAD emails or all emails in INBOX.
GMAIL.users().messages().list(userId='me', labelIds=['UNREAD', 'INBOX']).execute()
Because getting all the emails could be annoying, I was wondering is it possible to get only the recent 10 UNREAD emails from gmail api?
Thanks for any hint that will allow me to do such thing.

The documentation tells us that we need to pass maxResults and set it to 10:
GMAIL.users().messages().list(userId='me', labelIds=['UNREAD'], maxResults=10).execute()

Go through this doc
You can query using the advanced search options.
You may use
newer_than:
Example: newer_than:2d
Search for messages older or newer than a time period using d (day), m (month), and y (year)

GMAIL.users().messages().list(
userId="me",
labelIds=["UNREAD", "INBOX"],
maxResults=10,
q="newer_than:6d"
).execute()

Extract attendee response from google calendar event with python

This is a follow on question to one answered recently by wpercy and Kieran.
I'm trying to fashion some Python code to improve a Zap in Zapier.
The first stage involved extracting the attendee emails from the supplied (by Google) string variable containing the emails separated by commas.
What I now need to figure out is how to also extract the attendee responses and pair them or somehow get them to follow their corresponding attendee email address as the remaining steps in the Zap are carried out, once for each email/attendee.
Here is the solution code I have successfully tested. It deals with just the emails:
emails = []
attendeeList = input_data['attendeeEmails'].split(',')
for email in attendeeList:
a = {'Email' : email.strip()}
emails.append(a)
return emails
Here is the other solution offered by Kieran:
[{'Email': email.strip()} for email in input_data['attendeeEmails'].split(',')]
The Google Calendar data looks like this:
attendees:
1:
displayName: Doug Christensen
email: xxxx#gmail.com
responseStatus: needsAction
2:
displayName: Doug Christensen
email: yyyyyy#gmail.com
responseStatus: needsAction
3:
self: true
email: zzzz#xyzmadscience.com
organizer: true
responseStatus: accepted
So I want to get "responseStatus" and the only thing I could think to do was the following:
emails = []
position = 0
responseList = input_data['attendeeReponses'].split(',')
attendeeList = input_data['attendeeEmails'].split(',')
for email in attendeeList:
a = {'Email' : email.strip(), 'responseStatus' : reponseStatus(position).strip()}
a = {'Email' : email.strip()}
emails.append(a)
position += 1
return emails
...but that does not work (says "error" in Zapier).
I'm pretty confused by the fact that the attendee emails are available in 2 Google variables "Attendee Emails" and "Attendees Email". One actually shows up in the variables to pass to the Zap's Python code as 'Attendees[]Email' while the other shows as 'Attendee Emails'. For the attendee responses there is only one option which manifests as 'Attendees[]ResponseStatus'.
I'm clearly no expert but these labels suggest to me a bit of a data structure? when the '[]' is included, making me think that an even more elegant method of extraction of the email and pairing with the attendee response, is possible.
I want the Python code to return the email and its corresponding attendee response in a way such that the following Zap steps will be performed once for each email/response pair.
Again, any guidance would be greatly appreciated.
Doug

The reason for your error is that you're trying to access an element in the list with parentheses (). You should be using brackets [].
Even after fixing that, you can be doing this in a far more pythonic fashion. Instead of keeping track of your position in the list with its own variable, you should use the built-in function enumerate(). This will keep track of the index for you, and you won't have to increment it manually.
You would use it like this
emails = []
responseList = input_data['attendeeReponses'].split(',')
attendeeList = input_data['attendeeEmails'].split(',')
for i,email in enumerate(attendeeList):
a = {'Email': email.strip(), 'responseStatus': reponseStatus[i].strip()}
emails.append(a)
return emails

get email unread content, without affecting unread state [duplicate]

This question already has answers here:
Fetch an email with imaplib but do not mark it as SEEN
(4 answers)
Closed 7 years ago.
Right now its a gmail box but sooner or later I want it to scale.
I want to sync a copy of a live personal mailbox (inbox and outbox) somewhere else, but I don't want to affect the unread state of any unread messages.
what type of access will make this easiest? I can't find any information if IMAP will affect the read state, but it appears I can manually reset a message to unread. Pop by definition doesn't affect unread state but nobody seems to use pop to access their gmail, why?

In the IMAP world, each message has flags. You can set the individual flags on each message. When you Fetch a message, it's actually possible to read the message, without applying the \Seen flag.
Most mail clients will apply the \Seen flag when the message is read. So, if the message has already been read, outside of your app, then you will need to remove the \Seen flag.
Just as fyi...here is the relevant part about flags from the RFCs:
A system flag is a flag name that is pre-defined in this
specification. All system flags begin with "\". Certain system
flags (\Deleted and \Seen) have special semantics described
elsewhere. The currently-defined system flags are:
\Seen
Message has been read
\Answered
Message has been answered
\Flagged
Message is "flagged" for urgent/special attention
\Deleted
Message is "deleted" for removal by later EXPUNGE
\Draft
Message has not completed composition (marked as a draft).
\Recent
Message is "recently" arrived in this mailbox. This session
is the first session to have been notified about this
message; if the session is read-write, subsequent sessions
will not see \Recent set for this message. This flag can not
be altered by the client.
If it is not possible to determine whether or not this
session is the first session to be notified about a message,
then that message SHOULD be considered recent.
If multiple connections have the same mailbox selected
simultaneously, it is undefined which of these connections
will see newly-arrived messages with \Recent set and which
will see it without \Recent set.

There is a .PEEK option on the FETCH command in IMAP that will explicitly not set the /Seen flag.
Look at the FETCH command in RFC 3501 and scroll down a bit to page 57 or search for "BODY.PEEK".

You need to specify section when you use BODY.PEEK. Sections are explained in IMAP Fetch Command documentations under BODY[<section>]<<partial>>
import getpass, imaplib
M = imaplib.IMAP4()
M.login(getpass.getuser(), getpass.getpass())
M.select()
typ, data = M.search(None, 'ALL')
for num in data[0].split():
typ, data = M.fetch(num, '(BODY.PEEK[])')
print 'Message %s\n%s\n' % (num, data[0][5])
M.close()
M.logout()
PS: I wanted to fix answer given Gene Wood but was not allowed because edit was smaller than 6 characters (BODY.PEEK -> BODY.PEEK[])

Nobody uses POP because typically they want the extra functionality of IMAP, such as tracking message state. When that functionality is only getting in your way and needs workarounds, I think using POP's your best bet!-)

if it helps anyone, GAE allows you to receive email as an HTTP request, so for now i'm just forwarding emails there.

To follow up on Dan Goldstein's answer above, in python the syntax to use the ".PEEK" option would be to call IMAP4.fetch and pass it "BODY.PEEK"
To apply this to the example in the python docs :
import getpass, imaplib
M = imaplib.IMAP4()
M.login(getpass.getuser(), getpass.getpass())
M.select()
typ, data = M.search(None, 'ALL')
for num in data[0].split():
typ, data = M.fetch(num, '(BODY.PEEK)')
print 'Message %s\n%s\n' % (num, data[0][5])
M.close()
M.logout()

How do I validate the MX record for a domain in python?

I have a large number of email addresses to validate. Initially I parse them with a regexp to throw out the completely crazy ones. I'm left with the ones that look sensible but still might contain errors.
I want to find which addresses have valid domains, so given me#abcxyz.com I want to know if it's even possible to send emails to abcxyz.com .
I want to test that to see if it corresponds to a valid A or MX record - is there an easy way to do it using only Python standard library? I'd rather not add an additional dependency to my project just to support this feature.

There is no DNS interface in the standard library so you will either have to roll your own or use a third party library.
This is not a fast-changing concept though, so the external libraries are stable and well tested.
The one I've used successful for the same task as your question is PyDNS.
A very rough sketch of my code is something like this:
import DNS, smtplib
DNS.DiscoverNameServers()
mx_hosts = DNS.mxlookup(hostname)
# Just doing the mxlookup might be enough for you,
# but do something like this to test for SMTP server
for mx in mx_hosts:
smtp = smtplib.SMTP()
#.. if this doesn't raise an exception it is a valid MX host...
try:
smtp.connect(mx[1])
except smtplib.SMTPConnectError:
continue # try the next MX server in list
Another library that might be better/faster than PyDNS is dnsmodule although it looks like it hasn't had any activity since 2002, compared to PyDNS last update in August 2008.
Edit: I would also like to point out that email addresses can't be easily parsed with a regexp. You are better off using the parseaddr() function in the standard library email.utils module (see my answer to this question for example).

The easy way to do this NOT in the standard library is to use the validate_email package:
from validate_email import validate_email
is_valid = validate_email('example#example.com', check_mx=True)
For faster results to process a large number of email addresses (e.g. list emails, you could stash the domains and only do a check_mx if the domain isn't there. Something like:
emails = ["email#example.com", "email#bad_domain", "email2#example.com", ...]
verified_domains = set()
for email in emails:
domain = email.split("#")[-1]
domain_verified = domain in verified_domains
is_valid = validate_email(email, check_mx=not domain_verified)
if is_valid:
verified_domains.add(domain)

An easy and effective way is to use a python package named as validate_email.
This package provides both the facilities. Check this article which will help you to check if your email actually exists or not.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.