Getting emails from Gmail via Python

I'm using the following code to download all my emails from gmail, but unfortunately, the total number of emails returned does not match the total number of emails in the account. In particular, I'm able to get the first 43 messages, but I count 20+ more in the inbox that are missed. Perhaps this is some sort of limit on the number that can be pulled back(?). Thanks in advance for any assistance provided!
import imaplib, email, base64
def fetch_messages(username, password):
    messages = []
    conn = imaplib.IMAP4_SSL("imap.gmail.com", 993)
    conn.login(username, password)
    conn.select()
    typ, data = conn.uid('search', None, 'ALL')
    for num in data[0].split():
        typ, msg_data = conn.uid('fetch', num, '(RFC822)')
        for response_part in msg_data:
            if isinstance(response_part, tuple):
                messages.append(email.message_from_string(response_part[1]))
        typ, response = conn.store(num, '+FLAGS', r'(\Seen)')
    return messages

I use the following to get all email messages.
resp,data = mail.uid('FETCH', '1:*' , '(RFC822)')
and to get all the ids I use:
result, data = mail.uid('search', None, "ALL")
print data[0].split()
gives:
['1', '2', '3', '4', '5', '6', '7', '8', '9', '10', ... etc ]
EDIT
In my case, the following returns 202 dates, which is more than the count the OP is seeing and matches the actual number of messages.
resp,data = mail.uid('FETCH', '1:*' , '(RFC822)')
messages = [data[i][1].strip() for i in xrange(0, len(data), 2)]
for msg in messages:
    msg_str = email.message_from_string(msg)
    print msg_str.get('Date')
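One thing worth checking in the question's code: it searches with conn.uid(...), which returns UIDs, but then calls conn.store(num, ...), which treats num as a sequence number, so the \Seen flag may land on a different message than the one just fetched. Whether that explains the missing messages is not confirmed anywhere in this thread. Below is a minimal Python 3 sketch that addresses messages by UID throughout. fetch_all_messages is a hypothetical helper name, and conn is assumed to be a logged-in imaplib.IMAP4_SSL object with a mailbox already selected.

```python
import email


def fetch_all_messages(conn):
    """Return every message in the selected mailbox, addressing by UID throughout.

    `conn` is assumed to be a logged-in imaplib.IMAP4_SSL object on which
    select() has already been called.
    """
    messages = []
    typ, data = conn.uid('search', None, 'ALL')
    for uid in data[0].split():
        uid = uid.decode()  # imaplib returns UIDs as bytes in Python 3
        typ, msg_data = conn.uid('fetch', uid, '(RFC822)')
        for response_part in msg_data:
            # Message content arrives as (envelope, raw_bytes) tuples
            if isinstance(response_part, tuple):
                messages.append(email.message_from_bytes(response_part[1]))
        # UID STORE, so the flag lands on the message that was just fetched
        conn.uid('store', uid, '+FLAGS', r'(\Seen)')
    return messages
```

Comparing len(messages) with the number of UIDs returned by the search is a quick way to see whether any fetch came back without a message part.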

Related

No national characters in mail subject using imaplib, email in Python to read gmail inbox

Windows. Python 3.9.
As a value of mail subject I get other characters instead of Polish characters - I get:
Odpowied�� automatyczna: "Re: Program licz��cy ceny i sprzeda�� w allegro dla EAN��w"
instead of:
Odpowiedź automatyczna: "Re: Program liczący ceny i sprzedaż w allegro dla EANów"
How to make it correct? Should I apply some codepage information somewhere?
I notice that all the values in the out dictionary are strings, except for the subject, which is of type Header.
import imaplib, email
mail = imaplib.IMAP4_SSL('imap.gmail.com')
mail.login('user', 'pwd')
mail.select('inbox')
data = mail.search(None, 'ALL')
_, data = mail.fetch(str(7), '(RFC822)')
message = email.message_from_bytes(data[0][1])
out = {
    'from': message['from'],
    'subject': message['subject'],
    'to': message['Delivered-To'],
    'datetime': message['Date'],
    'cc': message['Cc']
}
If I understand correctly, you need to decode the header bytes.
Try something like:
from email.header import decode_header
subject, encoding = decode_header(message["subject"])[0]
if isinstance(subject, bytes):
    # encoding can be None for unencoded fragments, so fall back to UTF-8
    subject = subject.decode(encoding or "utf-8")
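A subject can consist of several encoded fragments, in which case looking only at element [0] drops the rest. As a small sketch, the standard library's make_header can join all the fragments returned by decode_header into one string. decode_subject is a hypothetical helper name, and the sample header value is made up for illustration.

```python
from email.header import decode_header, make_header


def decode_subject(raw_subject):
    """Return a header value as one str, decoding any RFC 2047 encoded words."""
    return str(make_header(decode_header(raw_subject)))


# Illustrative value only: a Q-encoded UTF-8 subject
raw = "=?utf-8?q?Odpowied=C5=BA_automatyczna?="
print(decode_subject(raw))
```

In the question's dictionary this would be used as 'subject': decode_subject(message['subject']).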

IMAP, view email's labels, Python & Gmail

How can I see what labels an email has?
mail = imaplib.IMAP4_SSL('imap.gmail.com')
mail.login('myaccountxyz#gmail.com', mypassword)
mail.select("my-folder") # finds emails with this label
result, data = mail.uid('search', None, 'all')
for email_uid in data[0].split():
    result, data_single = mail.uid('fetch', email_uid, '(RFC822)')
    raw_email = data_single[0][1]
    email_message = email.message_from_string(raw_email)
    sender = email_message['From']
    # get list of this email's labels
I haven't tried this code myself, but according to Google IMAP Extensions, you should be able to just fetch the X-GM-LABELS item:
typ, dat = mail.uid('fetch', email_uid, 'X-GM-LABELS')
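The fetch returns the labels embedded in a response line, so they still need to be pulled out. Here is a rough sketch of one way to do that, assuming the line looks like the examples in Google's IMAP extensions documentation (a parenthesized list of bare or double-quoted labels). parse_gmail_labels is a hypothetical helper, it assumes labels contain no parentheses, and it does not decode non-ASCII labels that arrive in an encoded form.

```python
import re


def parse_gmail_labels(fetch_line):
    """Extract label names from one X-GM-LABELS fetch response line (bytes).

    Assumes labels contain no parentheses; real responses may vary.
    """
    text = fetch_line.decode('utf-8', errors='replace')
    match = re.search(r'X-GM-LABELS \(([^)]*)\)', text)
    if match is None:
        return []
    # A label is either a quoted string (may contain spaces) or a bare atom
    tokens = re.findall(r'"((?:[^"\\]|\\.)*)"|(\S+)', match.group(1))
    labels = []
    for quoted, atom in tokens:
        if atom:
            labels.append(atom)
        else:
            # Undo backslash escapes inside quoted strings
            labels.append(re.sub(r'\\(.)', r'\1', quoted))
    return labels


# Sample line modelled on the documentation's example, not captured from a server
sample = b'1 (X-GM-LABELS (\\Inbox \\Sent Important "Muy Importante") UID 5)'
print(parse_gmail_labels(sample))
```

With the answer's call, the line to pass in would be dat[0].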

How to delete emails using UID in imaplib python

I'm trying to delete emails by UID. It's a hotmail email account I'm accessing.
Here's what I'm doing:
1. Connect to Email
imap = imaplib.IMAP4_SSL('imap-mail.outlook.com')
imap.login('my_email#hotmail.com', "password")
2. Get UID from emails
resp, _ = imap.select('Inbox')
mbox_response, msgnums = imap.search( None,'FROM', 'email#sender.com')
messages = []  # UIDs are appended to this list
for num in msgnums[0].split():
    msg_uid = imap.fetch(num, 'UID')
    messages.append({'uid':imap.fetch(num, 'UID')})
3. Print UIDs
print(messages)
I get the following output:
[{
'uid': ('OK', [b'1 (UID 111)']),
'uid': ('OK', [b'2 (UID 114)'])
}]
4. How do I delete?
How do I use these UIDs to delete the specific message?
I've tried this without success...
for m in messages:
    imap.store(m['uid'],'+X-GM-LABELS', '\\Trash')
I get the following error:
TypeError: can't concat tuple to bytes
from imap_tools import MailBox
with MailBox('imap.mail.com').login('test#mail.com', 'pwd', 'INBOX/test') as mailbox:
    # DELETE all messages from current folder (INBOX/test)
    mailbox.delete([msg.uid for msg in mailbox.fetch()])
https://github.com/ikvk/imap_tools
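For anyone staying with plain imaplib: the TypeError in the question comes from passing the whole (status, data) tuple returned by imap.fetch to imap.store, which expects a message set string. Also, X-GM-LABELS is a Gmail extension, so on a Hotmail/Outlook server the standard \Deleted flag followed by an expunge is the more likely route. The sketch below shows that approach under those assumptions. extract_uid and delete_by_uid are hypothetical helper names, and imap is assumed to be a logged-in connection with the mailbox selected.

```python
import re


def extract_uid(fetch_line):
    """Pull the UID out of a fetch response line such as b'1 (UID 111)'."""
    match = re.search(rb'UID (\d+)', fetch_line)
    return match.group(1).decode() if match else None


def delete_by_uid(imap, uids):
    """Flag the given UIDs as deleted, then expunge the selected mailbox.

    Note: expunge() removes every message flagged \\Deleted in the mailbox,
    not only the ones flagged here.
    """
    if not uids:
        return
    uid_set = ','.join(uids)  # e.g. '111,114'
    imap.uid('STORE', uid_set, '+FLAGS', r'(\Deleted)')
    imap.expunge()


# Usage with the question's variables (requires a live connection):
# uids = [extract_uid(imap.fetch(num, 'UID')[1][0]) for num in msgnums[0].split()]
# delete_by_uid(imap, uids)
```

Calling imap.uid('search', None, 'FROM', ...) instead of imap.search would return the UIDs directly and make the extra fetch per message unnecessary.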

Convert email body to a string

I'm working on something that uses a regex to search for text in an email fetched via the imaplib module. Right now I can't get it to work, even after using the str() function.
result, data = mail.fetch(x, '(RFC822)')
eemail = email.message_from_bytes(data[0][1])
print(str(eemail))
Trying to regex it:
print(re.search("button", eemail))
Regex gives me no matches even after making the email a string object.
This is what I use:
import imaplib
import email
import re
# SMTP_SERVER, SMTP_PORT, FROM_EMAIL and FROM_PWD are placeholders for your own settings
mail = imaplib.IMAP4_SSL(SMTP_SERVER, SMTP_PORT)
mail.login(FROM_EMAIL,FROM_PWD)
mail.select('inbox')
status, response = mail.search(None, '(UNSEEN)')
unread_msg_nums = response[0].split()
for e_id in unread_msg_nums:
    # Fetch the full message so the headers needed for MIME parsing are present
    _, response = mail.fetch(e_id, '(RFC822)')
    b = email.message_from_bytes(response[0][1])
    for part in b.walk():
        if part.is_multipart():
            continue  # containers have no payload of their own
        payload = part.get_payload(decode=True)  # bytes
        charset = part.get_content_charset() or 'utf-8'
        print(re.search("button", payload.decode(charset, errors='replace')))
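The underlying point is that re.search needs a str (or bytes), not the Message object itself. As an alternative sketch using the newer email API in the standard library, parsing with policy.default gives an EmailMessage whose get_body() and get_content() return the text body as a str. body_text is a hypothetical helper name, and the demo message is built locally to stand in for data[0][1] from mail.fetch.

```python
import email
import re
from email import policy
from email.message import EmailMessage


def body_text(raw_bytes):
    """Return the message's preferred text body as a str ('' if none is found)."""
    msg = email.message_from_bytes(raw_bytes, policy=policy.default)
    part = msg.get_body(preferencelist=('plain', 'html'))
    return part.get_content() if part is not None else ''


# Build a small message locally to stand in for data[0][1] from mail.fetch
demo = EmailMessage()
demo['Subject'] = 'demo'
demo.set_content('Please click the button below.')

text = body_text(demo.as_bytes())
print(re.search('button', text))
```

This picks one body part only; messages with attachments or several alternative parts may need the walk() loop instead.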

How to deal with flooded unseen messages

I have written an email parsing mechanism in python.
It finds a new email and passes the data correctly. I am 99.999% certain that my code is functioning correctly, so there should be no issue there. The problem is that occasionally, the Gmail inbox will get flooded with messages that are considered "unseen". At this point, there is nothing that my code can do.
It fails with:
imaplib.error: FETCH command error: BAD ['Could not parse command']
This is distressing, and I would love to have either
a way to check whether the unseen messages have overflowed to this state, or
a way to manually (via imaplib) mark all messages as read, including a way to detect this particular error.
Any thoughts on how to accomplish this?
Here is my code:
#!/usr/bin/env python
import imaplib, re, sys, time, OSC, threading, os
iparg = 'localhost'
oportarg = 9000
iportarg = 9002
usern = 'myusrname#gmail.com'
gpass = 'mypass'
kill_program = False
server = imaplib.IMAP4_SSL('imap.googlemail.com', 993)
oclient = OSC.OSCClient()
email_interval = 2.0
def login():
    server.login(usern, gpass)
    oclient.connect((iparg, oportarg))
def logout_handle(addr, tags, stuff, source):
    print 'received kill call'
    global kill_program
    kill_program = True
def filter_signature(s): #so annoying; wish i didn't have to do this
    try:
        a_sig = re.sub(r'Sent|--Sent', '', s)
        b_sig = re.sub(r'using SMS-to-email. Reply to this email to text the sender back and', '', a_sig)
        c_sig = re.sub(r'save on SMS fees.', '', b_sig)
        d_sig = re.sub(r'https://www.google.com/voice', '', c_sig)
        no_lines = re.sub(r'\n|=|\r?', '', d_sig) #add weird characters to this as needed
    except:
        no_lines = s
    return no_lines
def parse_email(interval):
    while True:
        server.select('INBOX')
        status, ids = server.search(None, 'UnSeen')
        print 'status is: ', status
        if not ids or ids[0] is '':
            print 'no new messages'
        else:
            try:
                print 'found a message; attempting to parse...'
                latest_id = ids[0]
                status, msg_data = server.fetch(latest_id, '(UID BODY[TEXT])')
                raw_data = msg_data[0][1]
                raw_filter = raw_data
                print 'message result: ', raw_filter
        time.sleep(interval)
#execute main block
while not kill_program:
    login()
    parse_email(email_interval)
st.kill()
sys.exit()
Based upon the error, I would very carefully check the parameters that you're passing to fetch. Gmail is telling you that it could not parse the command that you sent to it.
Also, you can do a STORE +FLAGS \Seen to mark the messages as read.
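To make that concrete: in the question's code, ids[0] holds all matching ids as one space-separated string, so passing it straight to fetch works for a single unseen message but sends a command the server cannot parse when there are several. That is one likely reading of the error, not something the thread confirms. Here is a Python 3 sketch of turning the search result into a valid message set and marking everything as read. to_message_set and mark_all_unseen_as_read are hypothetical helper names, and server is assumed to be a logged-in imaplib connection with INBOX selected.

```python
def to_message_set(search_data):
    """Turn the data part of a search() result into an IMAP message set.

    imaplib returns ids as one space-separated bytes string, e.g. [b'1 2 3'];
    FETCH and STORE expect a single id or a comma-separated set such as '1,2,3'.
    """
    if not search_data or not search_data[0]:
        return ''
    return b','.join(search_data[0].split()).decode()


def mark_all_unseen_as_read(server):
    """Flag every unseen message in the selected mailbox as seen; return the count."""
    status, ids = server.search(None, 'UNSEEN')
    message_set = to_message_set(ids)
    if not message_set:
        return 0
    server.store(message_set, '+FLAGS', r'(\Seen)')
    return message_set.count(',') + 1
```

Checking the length of ids[0].split() before fetching would also serve as the "has the inbox been flooded" test the question asks for, and wrapping the fetch in try/except imaplib.IMAP4.error is one way to detect the failure itself.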
