Downloading emails with UTF-8 B encoded header

I have a problem with code that is supposed to download emails as .eml files.
It is supposed to go through the INBOX listing, retrieve each email's content and attachments (if any), and create an .eml file that contains all of that.
It works for emails with a text content type and for most multipart emails. However, if an email in the listing has a UTF-8 B-encoded (Base64) header, the code simply behaves as if it had reached the end of the listing, without displaying any error.
The code in question is:
result, data = p.uid('search', None, search_criteria) # search_criteria is defined earlier in code
if result == 'OK':
    data = get_newer_emails_first(data) # get_newer_emails_first() is a function defined to return the list of UIDs in reverse order (newer first)
    context['emailsum'] = len(data) # total amount of emails based on the search_criteria parameter.
    for num in data:
        mymail2 = {}
        result, data1 = p.uid('fetch', num, '(RFC822)')
        email_message = email.message_from_bytes(data1[0][1])
        fullemail = email_message.as_bytes()
        default_charset = 'ASCII'
        if email_message.is_multipart():
            m_subject = make_header(decode_header(email_message['Subject']))
        else:
            m_subject = r''.join([ six.text_type(t[0], t[1] or default_charset) for t in email.header.decode_header(email_message['Subject']) ])
        m_from = string(make_header(decode_header(email_message['From'])))
        m_date = email_message['Date']
I have done my tests and discovered that the fullemail variable contains the email properly (so the data is read from the actual email successfully). The problem should therefore be in the if/else immediately after, or in the lines that follow it, but I cannot find what exactly it is.
Any ideas?
PS: I accidentally posted this question as a guest, but I opted to delete it and repost it from my account.

Apparently the error lay in my code in the silliest of ways.
Instead of:
m_from = string(make_header(decode_header(email_message['From'])))
m_date = email_message['Date']
It should be:
m_from = str(make_header(decode_header(email_message['From'])))
m_date = str(make_header(decode_header(email_message['Date'])))
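A minimal sketch of the same fix in isolation, assuming Python 3. The helper name decode_mime_header and the sample subject are made up for illustration; only decode_header and make_header come from the code above.

```python
from email.header import decode_header, make_header


def decode_mime_header(raw_value):
    """Return a header value as plain text, decoding RFC 2047 encoded words.

    Hypothetical helper: it wraps the str(make_header(decode_header(...)))
    idiom used in the answer and tolerates a missing header (None).
    """
    if raw_value is None:
        return ''
    return str(make_header(decode_header(raw_value)))


# A Base64 ("B") encoded UTF-8 subject, as it would appear in a raw email.
encoded_subject = '=?UTF-8?B?R3LDvMOfZQ==?='
print(decode_mime_header(encoded_subject))

# Plain ASCII headers pass through unchanged.
print(decode_mime_header('Hello'))
```

Calling the built-in str() on the Header object is the important part: string is not a built-in in Python 3, so the original line raised a NameError, which was presumably swallowed somewhere further up.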

Related

Python generated *.msg file

I'm Jan, this is my first post here, and the following code is also my first Python code, so please don't judge me if it is not well shaped :) Note that I had to shorten the mail body.
With the following code I am trying to generate several .msg files from a list called "customer_names". The idea is to iterate through this list and adjust the email body, especially the "[Customer]" placeholder. The rest of the body content is not so important. Everything works well except the iteration through the list. I suspect that I may need to increment the list index inside the loop. Do you have any ideas?
import win32com.client as win32
import datetime
import random
# List of customer names
customer_names = ['Name1','Name2','Name3']
current_customer_index = 0
# List of email providers
email_providers = ['provider1', 'provider2', 'provider3']
# set up the Outlook application
outlook = win32.Dispatch('outlook.application')
# create a new email
mail = outlook.CreateItem(0)
# set the subject and recipients
mail.Subject = "Request for Customer Information"
mail.To = "user@emailprovider.net"
# Message body for the email
message = f"Dear User,\n\nThe information we require from [Customer] is as follows:\n\n- Email Address: [Email Address] \n\nWe kindly request that you send the requested information to us within [number of days] days. \n\n Kind regards."
# set the number of days for the customer to respond
num_days = 7
# set the importance of the email to normal
mail.Importance = 1 # 0=Low, 1=Normal, 2=High
# set the sensitivity of the email to normal
mail.Sensitivity = 0 # 0=Normal, 1=Personal, 2=Private, 3=Confidential
# set the read receipt option to true
mail.ReadReceiptRequested = True
# add a reminder for the sender to follow up in 3 days
mail.FlagRequest = "Follow up"
mail.FlagDueBy = (datetime.date.today() + datetime.timedelta(days=3)).strftime('%m/%d/%Y')
# Generate a random email
for i in range(len(customer_names)):
    customer = customer_names[i]
    message_with_data = message.replace("[Customer]", customer)
    message_with_data = message_with_data.replace("[number of days]", str(num_days))
    mail.Body = message_with_data
    file_name = "Request_" + customer + ".msg"
    file_path = "C:/Users/user/Desktop/Email/" + file_name
    mail.SaveAs(file_path)

Pull variable data from all outlook emails that have a particular subject line then take date from the body

I get an email every day with fruit quantities sold on the day. Though I now have come up with some code to log the relevant data going forward, I have been unable to do it going backwards.
The data is stored in the body of the email like so:
Date of report:,01-Jan-2020
Apples,8
Pears,5
Lemons,7
Oranges,9
Tomatoes,6
Melons,3
Bananas,0
Grapes,4
Grapefruit,8
Cucumber,2
Satsuma,1
What I would like for the code to do is first search through my emails and find the emails that match a particular subject, iterate line by line through and find the variables I'm searching for, and then log them in a dataframe with the "Date of Report" logged in a date column and converted into the format: "%m-%d-%Y".
I think I can achieve this by doing some amendments to the code I've written to deal with keeping track of it going forward:
# change for the fruit you're looking for
Fruit_1 = "Apples"
Fruit_2 = "Pears"
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
inbox = outlook.GetDefaultFolder(6)
messages = inbox.Items
messages.Sort("[ReceivedTime]", True)
# find data email
for message in messages:
    if message.subject == 'FRUIT QUANTITIES':
        if Fruit_1 and Fruit_2 in message.body:
            data = str(message.body)
            break
        else:
            print('No data for', Fruit_1, 'or', Fruit_2, 'was found')
            break
fruitd = open("fruitd.txt", "w") # copy the contents of the latest email into a .txt file
fruitd.write(data)
fruitd.close()
def get_vals(filename: str, searches: list) -> dict:
    #Searches file for search terms and returns the values
    dct = {}
    with open(filename) as file:
        for line in file:
            term, *value = line.strip().split(',')
            if term in searches:
                dct[term] = float(value[0]) # Unpack value
    # if terms are not found update the dictionary w remaining and set value to None
    if len(dct.keys()) != len(searches):
        dct.update({x: None for x in search_terms if x not in dct})
    return dct
searchf = [
    Fruit_1,
    Fruit_2
] # the list of search terms the function searches for
result = get_vals("fruitd.txt", searchf) # search for terms
print(result)
# create new dataframe with the values from the dictionary
d = {**{'date':today}, **result}
fruit_vals = pd.DataFrame([d]).rename(columns=lambda z: z.upper())
fruit_vals['DATE'] = pd.to_datetime(fruit_vals['DATE'], format='%d-%m-%Y')
print(fruit_vals)
I'm creating a .txt titled 'fruitd' because I was unsure how I could iterate through an email message body any other way. Unfortunately I don't think creating a .txt for each of the past emails is really feasible and I was wondering whether there's a better way of doing it?
Any advice or pointers would be most welcome.
**EDIT Ideally would like to get all the variables in the search list; so Fruit_1 & Fruit_2 with room to expand it to a Fruit_3 + Fruit_4 (etc) if necessary.

import pandas as pd
import win32com.client
#PREP THE STUFF
Fruit_1 = "Apples"
Fruit_2 = "Pears"
SEARCHF = [
    Fruit_1,
    Fruit_2
]
#DEF THE STUFF
# modified to take a list of list of strs as `report` arg
# apparently IDK how to type-hint; type-hinting removed
def get_report_vals(report, searches):
    dct = {}
    for line in report:
        term, *value = line
        # `str.casefold` is similar to `str.lower`, arguably better form
        # if there might ever be a possibility of dealing with non-Latin chars
        if term.casefold().startswith('date'):
            #FIXED (now takes `date` str out of list)
            dct['date'] = pd.to_datetime(value[0])
        elif term in searches:
            dct[term] = float(value[0])
    if len(dct.keys()) != len(searches):
        # corrected (?) `search_terms` to `searches`
        dct.update({x: None for x in searches if x not in dct})
    return dct
#DO THE STUFF
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
inbox = outlook.GetDefaultFolder(6)
messages = inbox.Items
messages.Sort("[ReceivedTime]", True)
results = []
for message in messages:
    if message.subject == 'FRUIT QUANTITIES':
        # are you looking for:
        # Fruit_1 /and/ Fruit_2
        # or:
        # Fruit_1 /or/ Fruit_2
        if Fruit_1 in message.body and Fruit_2 in message.body:
            # FIXED
            data = [line.strip().split(",") for line in message.body.split('\n')]
            results.append(get_report_vals(data, SEARCHF))
        else:
            pass
fruit_vals = pd.DataFrame(results)
fruit_vals.columns = map(str.upper, fruit_vals.columns)
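The parsing step can also be tried without Outlook or pandas. The following is only a sketch under stated assumptions: parse_report is a hypothetical name, the body is assumed to follow the "name,value" layout shown in the question, and the date is assumed to always look like 01-Jan-2020 (English month abbreviations). It formats the date as "%m-%d-%Y", as the question asks.

```python
from datetime import datetime


def parse_report(body, searches):
    """Parse one 'name,value' report body into a flat dict.

    Hypothetical helper: every search term gets a key (None when the term
    is missing) and the report date is stored under 'date' as MM-DD-YYYY.
    """
    row = {name: None for name in searches}
    row["date"] = None
    for raw_line in body.splitlines():
        # partition() splits on the first comma only and never raises
        term, _, value = raw_line.strip().partition(",")
        value = value.strip()
        if term.casefold().startswith("date") and value:
            parsed = datetime.strptime(value, "%d-%b-%Y")
            row["date"] = parsed.strftime("%m-%d-%Y")
        elif term in searches and value:
            row[term] = float(value)
    return row


# Sample body in the layout from the question (Outlook uses \r\n line ends).
sample_body = "Date of report:,01-Jan-2020\r\nApples,8\r\nPears,5\r\nLemons,7\r\n"
print(parse_report(sample_body, ["Apples", "Pears", "Grapes"]))
```

A list of such rows, one per matching message.body, can be handed to pd.DataFrame in the same way as results above, so no intermediate .txt file is needed.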

Search outlook body for text, then create a variable for the line the text is on

Every morning I get spot data on FX volumes via an email, I'd like to build a process to search two pieces of data within the body of the email and save them as a new variable which I can then refer to later.
I've got the process to search my emails, order them according to date and check whether the entered data exists within the emails, but because the data is contained within a format between two commas, I am unsure how to take that data out and assign it to a new variable.
Format for example is this:
BWP/USD,0
CHF/AMD T,0
This is what I've achieved thus far:
import win32com.client
import os
import time
import re
# change the ticker to the one you're looking for
FX_volume1 = "BWP/USD"
FX_volume2 = "CHF/AMD"
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
inbox = outlook.GetDefaultFolder(6)
messages = inbox.Items
messages.Sort("[ReceivedTime]", True)
# find spot data
for message in messages:
    if message.subject.startswith("FX SPOT FIGURES"):
        if FX_volume1 and FX_volume2 in message.body:
            data = message.body
            print(data)
        else:
            print('No data for', FX_volume1, 'or', FX_volume2, 'was found')
        break
Any idea how to take this forward?
Thanks for any assistance/pointers

import win32com.client
import os
import time
import re
# change the ticker to the one you're looking for
FX_volume1 = "BWP/USD"
FX_volume2 = "CHF/AMD"
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
inbox = outlook.GetDefaultFolder(6)
messages = inbox.Items
messages.Sort("[ReceivedTime]", True)
# find spot data
for message in messages:
    if message.subject.startswith("FX SPOT FIGURES"):
        # re.search scans the whole body; re.match would only test the very start of it
        case1 = re.search(FX_volume1 + r",(\d*)", message.body)
        case2 = re.search(FX_volume2 + r",(\d*)", message.body)
case1 and case2 will be match objects if a match is found; otherwise they will be None. To retrieve a value, use val = case1.group(1). Hence:
EDIT:
if case1 is not None:
    FX_vol1_val = case1.group(1)
if case2 is not None:
    FX_vol2_val = case2.group(1)
For more info on match objects:
https://docs.python.org/3/library/re.html#match-objects
If you are expecting floats, see the following link:
Regular expression for floating point numbers
EDIT 2:
Hi, since you couldn't get it working, I gave it a quick try and it worked for me with the following example. A note on regex notation: anything you put in parentheses () is a capture group, so if the pattern matches, the text matched inside the parentheses is stored.
import re
my_text = "BWP/USD,1"
FX_pattern = "BWP/USD," # added the comma here for my convenience
my_match = re.match(FX_pattern + r"(\d*)", my_text)
print("Group 0:", my_match.group(0))
print("Group 1:", my_match.group(1))
Printout:
Group 0: BWP/USD,1
Group 1: 1
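For a multi-line body, a line-anchored search may be more robust. This is a sketch rather than a drop-in solution: extract_volume is a hypothetical helper, and it assumes each ticker starts a line and that extra text may sit between the ticker and the comma (as in the "CHF/AMD T,0" line from the question).

```python
import re


def extract_volume(body, ticker):
    """Return the number after `ticker` on its line, or None if absent.

    Hypothetical helper. The ticker is escaped so that characters which are
    special in regular expressions are matched literally.
    """
    pattern = r"^" + re.escape(ticker) + r"[^,\r\n]*,\s*(\d+(?:\.\d+)?)"
    # re.MULTILINE lets ^ match at the start of every line in the body
    match = re.search(pattern, body, flags=re.MULTILINE)
    return float(match.group(1)) if match else None


# Made-up body in the layout from the question.
body = "FX SPOT FIGURES\r\nBWP/USD,12\r\nCHF/AMD T,7.5\r\n"
print(extract_volume(body, "BWP/USD"))
print(extract_volume(body, "CHF/AMD"))
print(extract_volume(body, "EUR/USD"))
```

Because the pattern allows trailing text after the ticker, a ticker that is a prefix of another one would match the longer line too; tighten the pattern if that can happen in your data.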

IMAP COPY command not working on Inbox - Python

I am using Python 3.6 with the IMAP4 module (imaplib). I am trying to copy emails from "Inbox" to "mytestfolder".
I get "OK" as the response, but the email itself is not copied to "mytestfolder".
The same code snippet works from "someotherfolder" to "mytestfolder" without any problem the first time, and after that it doesn't work. Below is the code snippet; can someone please help me resolve this?
import config
import imaplib
from creds import username,password
imap = imaplib.IMAP4_SSL(config.imap_server,config.imap_port)
r, d = imap.login(username, password)
assert r == 'OK', 'login failed: %s' % str (r)
print(" > Signed in as %s" % username, d)
imap.select("Inbox")
r, d = imap.search(None, "ALL")
allIds = d[0].decode('utf8').split(' ')
''' Login works and iam getting msg_ids as well'''
for msg_id in allIds:
    apply_lbl_msg = imap.uid('COPY', msg_id, 'mytestfolder')
    if apply_lbl_msg[0] == 'OK':
        mov, data = imap.uid('STORE', msg_id , '+FLAGS', '(\Deleted)')
        imap.expunge()

TLDR: You're miscounting by removing things and then indexing by what used to be the order.
Your code does:
r, d = imap.search(None, "ALL")
"Give me the sequence numbers of all messages in the inbox", so you get 1, 2, 3, 4, 5 and so on. The last number in d will equal the return value from select() a few lines above. Then you loop, I'll explain the first iteration:
apply_lbl_msg = imap.uid('COPY', msg_id, 'mytestfolder')
if apply_lbl_msg[0] == 'OK':
"Copy the first message to mytestfolder, and if that works…."
mov, data = imap.uid('STORE', msg_id , '+FLAGS', '(\Deleted)')
imap.expunge()
"… then delete the first message in the inbox", which means that what was the second message now becomes the first.
The next iteration operates on the message that's currently the second in the mailbox, and was once the third, so you never operate on the message that was 2 at the start. The third iteration operates on the message that's currently the third, and was once the... fifth I think? It doesn't matter.
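You can make this correct by switching to the UID version of SEARCH as well, i.e. imap.uid('SEARCH', None, 'ALL'), so that the IDs you pass to uid('COPY') and uid('STORE') really are UIDs. UIDs don't change when messages are expunged and the mailbox is renumbered.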
You could also make this correct and very much faster by issuing one single COPY command that copies all messages, and then one single STORE that marks them as deleted. You don't even need the SEARCH, because the result of the search is just all the numbers from 1 to the return value of select().
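The single-COPY, single-STORE variant could look roughly like this. It is a sketch, not tested against a real server: move_all_messages is a hypothetical name, imap is assumed to be an already logged-in imaplib.IMAP4 or IMAP4_SSL object, and the message set "1:N" is built from the count that select() returns.

```python
def move_all_messages(imap, source, target):
    """Copy every message in `source` to `target`, then delete the originals.

    Hypothetical helper; `imap` is assumed to be a logged-in imaplib
    connection. Uses sequence numbers throughout, so no SEARCH is needed.
    """
    typ, data = imap.select(source)
    if typ != "OK":
        raise RuntimeError("could not select %r" % source)
    count = int(data[0])  # select() reports the number of messages
    if count == 0:
        return 0
    message_set = "1:%d" % count  # sequence numbers 1..count
    typ, _ = imap.copy(message_set, target)
    if typ != "OK":
        raise RuntimeError("COPY to %r failed" % target)
    # Flag everything that was copied, then remove it in one go.
    imap.store(message_set, "+FLAGS", r"(\Deleted)")
    imap.expunge()
    return count
```

With the connection from the question this would be called as move_all_messages(imap, "Inbox", "mytestfolder"). Nothing is expunged until the whole set has been copied, so the renumbering problem described above cannot occur.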

How to forward an email message captured using poplib to a different email address?

I have the following script, which processes emails and saves them to a CSV file. Later the script will be extended to use the mechanize library to submit the extracted email data to another web interface. That step may sometimes fail. I can trap the specific email that failed without any problem, but how can I forward the trapped email to a different address, where I can process it manually or see what's wrong with it?
Here's the script
import ConfigParser
import poplib
import email
import BeautifulSoup
import csv
import time
DEBUG = False
CFG = 'email' # 'email' or 'test_email'
#def get_config():
def get_config(fnames=['cron/orderP/get_orders.ini'], section=CFG):
    """
    Read settings from one or more .ini files
    """
    cfg = ConfigParser.SafeConfigParser()
    cfg.read(*fnames)
    return {
        'host': cfg.get(section, 'host'),
        'use_ssl': cfg.getboolean(section, 'use_ssl'),
        'user': cfg.get(section, 'user'),
        'pwd': cfg.get(section, 'pwd')
    }
def get_emails(cfg, debuglevel=0):
    """
    Returns a list of emails
    """
    # pick the appropriate POP3 class (uses SSL or not)
    #pop = [poplib.POP3, poplib.POP3_SSL][cfg['use_ssl']]
    emails = []
    try:
        # connect!
        print('Connecting...')
        host = cfg['host']
        mail = poplib.POP3(host)
        mail.set_debuglevel(debuglevel) # 0 (none), 1 (summary), 2 (verbose)
        mail.user(cfg['user'])
        mail.pass_(cfg['pwd'])
        # how many messages?
        num_messages = mail.stat()[0]
        print('{0} new messages'.format(num_messages))
        # get text of messages
        if num_messages:
            get = lambda i: mail.retr(i)[1] # retrieve each line in the email
            txt = lambda ss: '\n'.join(ss) # join them into a single string
            eml = lambda s: email.message_from_string(s) # parse the string as an email
            print('Getting emails...')
            emails = [eml(txt(get(i))) for i in xrange(1, num_messages+1)]
            print('Done!')
    except poplib.error_proto, e:
        print('Email error: {0}'.format(e.message))
    mail.quit() # close connection
    return emails
def parse_order_page(html):
    """
    Accept an HTML order form
    Returns (sku, shipto, [items])
    """
    bs = BeautifulSoup.BeautifulSoup(html) # parse html
    # sku is in first <p>, shipto is in second <p>...
    ps = bs.findAll('p') # find all paragraphs in data
    sku = ps[0].contents[1].strip() # sku as unicode string
    shipto_lines = [line.strip() for line in ps[1].contents[2::2]]
    shipto = '\n'.join(shipto_lines) # shipping address as unicode string
    # items are in three-column table
    cells = bs.findAll('td') # find all table cells
    txt = [cell.contents[0] for cell in cells] # get cell contents
    items = zip(txt[0::3], txt[1::3], txt[2::3]) # group by threes - code, description, and quantity for each item
    return sku, shipto, items
def get_orders(emails):
    """
    Accepts a list of order emails
    Returns order details as list of (sku, shipto, [items])
    """
    orders = []
    for i,eml in enumerate(emails, 1):
        pl = eml.get_payload()
        if isinstance(pl, list):
            sku, shipto, items = parse_order_page(pl[1].get_payload())
            orders.append([sku, shipto, items])
        else:
            print("Email #{0}: unrecognized format".format(i))
    return orders
def write_to_csv(orders, fname):
    """
    Accepts a list of orders
    Write to csv file, one line per item ordered
    """
    outf = open(fname, 'wb')
    outcsv = csv.writer(outf)
    for poNumber, shipto, items in orders:
        outcsv.writerow([]) # leave blank row between orders
        for code, description, qty in items:
            outcsv.writerow([poNumber, shipto, code, description, qty])
            # The point where mechanize will come to play
def main():
    cfg = get_config()
    emails = get_emails(cfg)
    orders = get_orders(emails)
    write_to_csv(orders, 'cron/orderP/{0}.csv'.format(int(time.time())))
if __name__=="__main__":
    main()

POP3 is used solely for retrieval, so there is no point in trying to send a message with it; that is why I asked how to forward an email message captured with poplib to a different email address.
The complete answer was:
smtplib can be used to forward a message captured with poplib: take the captured message and send it with smtplib to the desired email address. As Aleksandr Dezhin pointed out, keep in mind that some SMTP servers impose their own restrictions on the messages they process.
Besides that, on a Unix machine you can use sendmail to achieve the same thing.
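One way this could look, as a sketch only. It uses Python 3 (the script above is Python 2), the function names and all addresses are hypothetical, and it forwards the original as a message/rfc822 attachment rather than resending it unchanged; that is one option among several. The smtp argument is assumed to be an already connected smtplib.SMTP object.

```python
import email
from email import policy
from email.message import EmailMessage


def message_from_pop_lines(lines):
    """Parse the list of byte lines returned by poplib's retr() (Python 3)."""
    return email.message_from_bytes(b"\r\n".join(lines), policy=policy.default)


def build_forward(original, from_addr, to_addr,
                  note="Automatic processing failed; original attached."):
    """Wrap `original` in a new message addressed to `to_addr`."""
    fwd = EmailMessage()
    fwd["From"] = from_addr
    fwd["To"] = to_addr
    fwd["Subject"] = "Fwd: " + str(original.get("Subject", ""))
    fwd.set_content(note)
    # Attaching a message object yields a message/rfc822 part, so the
    # original headers and body reach the recipient untouched.
    fwd.add_attachment(original)
    return fwd


def forward_message(smtp, original, from_addr, to_addr):
    """Send the wrapped message through an already connected SMTP object."""
    fwd = build_forward(original, from_addr, to_addr)
    smtp.send_message(fwd)
    return fwd
```

In use, the lines from mail.retr(i)[1] would go through message_from_pop_lines, and forward_message would be called inside a block such as with smtplib.SMTP(host) as smtp:, with whatever login or TLS steps the server requires.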
