How to read in Excel email attachments received in Outlook? - python

So I am trying to read in an Excel file from an attachment in Microsoft Outlook. The code below works, but only while the email I am trying to read the attachment from is at the top of my inbox. How can I adjust my code so it looks through all emails in my inbox folder for the attachment, or finds the email based on the subject provided? Also, I would eventually like to have this work with a shared mailbox, but that is a secondary issue right now.
from win32com.client import Dispatch
import email
import datetime as date
import os
import pandas as pd
outlook = Dispatch("Outlook.Application").GetNamespace("MAPI")
inbox = outlook.GetDefaultFolder(6)
all_inbox = inbox.Items
val_date = date.date.today()
sub_today = 'subject of email'
att_today = 'Filename'
for msg in all_inbox:
    yourstring = msg.Subject.encode('ascii', 'ignore').decode('ascii')
    if(yourstring.find('"Filename"') != -1):
        break
for att in msg.Attachments:
    if att.FileName == att_today:
        attachments = msg.Attachments
        break
attachment = attachments.Item(1)
fn = os.getcwd() + '\\' + att_today
attachment.SaveASFILE(fn)
df = pd.read_excel(fn)

First of all, to iterate over all items you need to skip the items that don't match rather than break out of the loop. Use the continue keyword instead of break in the loop.
The Outlook object model provides the Find/FindNext and Restrict methods for getting items that correspond to your conditions. The store provider does the job more efficiently than iterating over all items in the folder. Read more about these methods in the following articles:
How To: Use Find and FindNext methods to retrieve Outlook mail items from a folder (C#, VB.NET)
How To: Use Restrict method to retrieve Outlook mail items from a folder
You can use the following search criteria where the query performs a phrase match query for hello in the message subject (VBA syntax):
filter = "@SQL=" & Chr(34) & "http://schemas.microsoft.com/mapi/proptag/0x0037001E" _
    & Chr(34) & " ci_phrasematch " & "'hello'"
To get a shared folder, use the NameSpace.GetSharedDefaultFolder method, which returns a Folder object that represents the specified default folder for the specified user. This method is used in a delegation scenario, where one user has delegated access to another user for one or more of their default folders (for example, their shared Calendar folder).
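In Python, the same filter can be built as an ordinary string and passed to Items.Restrict. The following is a minimal sketch, not tested against a live mailbox: the helper names subject_phrase_filter and find_messages are hypothetical, the shared mailbox address is whatever the caller supplies, and it assumes pywin32 and Outlook are installed. Content-indexer keywords such as ci_phrasematch generally need Instant Search to be enabled on the store.

```python
# DASL name of the subject property, as used in the VBA filter above.
SUBJECT_PROPTAG = "http://schemas.microsoft.com/mapi/proptag/0x0037001E"
OL_FOLDER_INBOX = 6  # OlDefaultFolders.olFolderInbox


def subject_phrase_filter(phrase):
    """Build a DASL filter that phrase-matches `phrase` in the subject."""
    if "'" in phrase:
        # Quoting rules for embedded quotes are not covered in this sketch.
        raise ValueError("phrase must not contain a single quote")
    return f"@SQL=\"{SUBJECT_PROPTAG}\" ci_phrasematch '{phrase}'"


def find_messages(phrase, shared_mailbox=None):
    """Return the Inbox items whose subject matches `phrase`.

    win32com is imported lazily so the filter helper above can be used
    without Outlook being present.
    """
    import win32com.client

    namespace = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
    if shared_mailbox is None:
        inbox = namespace.GetDefaultFolder(OL_FOLDER_INBOX)
    else:
        # Delegation scenario: resolve the owner, then open their Inbox.
        recipient = namespace.CreateRecipient(shared_mailbox)
        recipient.Resolve()
        inbox = namespace.GetSharedDefaultFolder(recipient, OL_FOLDER_INBOX)
    return inbox.Items.Restrict(subject_phrase_filter(phrase))
```

The result of find_messages can be looped over like inbox.Items, but it only contains the matching messages, so no break or continue is needed to skip the rest.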

Related

Viewing content of outlook attachment in python

I'm trying to use python to get some data that is in an attachment on an outlook email and then use that data in python. I've managed to write the code that will get into the outlook inbox and folder I want and then get the attachments of a specific message, however I'm not sure how to view the content of that attachment. A lot of the other questions and tutorials I've found seem to be more related to saving the attachment in a folder location rather than viewing the attachment in python itself.
For context, the data I'm trying to get to is an exported report from Adobe Analytics. The report is a CSV file that is attached to an email as a zip file. The CSV file shows some data for a specific time period, and I'm planning on scheduling this report to run weekly. What I want to do is get Python to look through all the emails with this report on, stack all of the data into one dataframe so that I have all the history plus the latest week's data in one place, and then export that file.
Please find the code below that I've written so far. If you need more details or I haven't explained anything very well please let me know. I am fairly new to python especially the win32com library so there might be obvious stuff I'm missing.
#STEP 1---------------------------------------------
#import all methods needed
from pathlib import Path
import win32com.client
import requests
import time
import datetime
import os
import zipfile
from zipfile import ZipFile
import pandas as pd
#STEP 2 --------------------------------------------
#connect to outlook
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
#STEP 3 --------------------------------------------
#connect to inbox
inbox = outlook.GetDefaultFolder(6)
#STEP 4 --------------------------------------------
#connect to adobe data reports folder within inbox
adobe_data_reports_folder = inbox.Folders['Cust Insights'].Folders['Adobe data reports']
#STEP 5 --------------------------------------------
#get all messages from adobe reports folder
messages_from_adr_folder = adobe_data_reports_folder.Items
#STEP 6 ---------------------------------------------
#get attachement for a specific message (this is just for testing in real world I'll do this for all messages)
for message in messages_from_adr_folder:
    if message.SentOn.strftime("%d-%m-%y") == '07-12-22':
        attachment = message.Attachments
    else:
        pass
#STEP 7 ----------------------------------------------
#get the content of the attachment
##????????????????????????????
With the Outlook Object Model, the best you can do is save the attachment as a file (Attachment.SaveAsFile) - keep in mind that MailItem.Attachments property returns the Attachments collection, not a single Attachment object - loop through all attachments in the collection, figure out which one you want (if there is more than one), and save it as file.
To access file attachment data directly without saving as a file, you will need to use Extended MAPI (C++ or Delphi only) or Redemption (any language, I am its author).
As Dmitry mentioned in his answer, the Outlook object model has no option to view attachment content directly.
So I've come up with a solution, which basically involves using the save method to save the attachment into a folder under the current working directory and then, once the file is saved, loading it back into Python as a dataframe. The only thing to note is that I've added an if statement so that only CSV files are saved; this part can be removed if needed.
If you want to do this with multiple files and stack them all into a single dataframe, create a blank dataframe at the start (with the column names of the files that will be loaded), then inside the "attachment" for loop concatenate it with "importeddata", so that each pass appends the data saved and loaded from that attachment.
#STEP 1---------------------------------------------
#import all methods needed
from pathlib import Path
import win32com.client
import requests
import time
import datetime
import os
import zipfile
from zipfile import ZipFile
import pandas as pd
#STEP 1b ---------------------------------------------
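#create a directory where I can save the files
output_dir = Path.cwd() / "outlook_testing"
#make sure the directory exists before anything is saved into it
output_dir.mkdir(parents=True, exist_ok=True)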
#STEP 2 --------------------------------------------
#connect to outlook
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
#STEP 3 --------------------------------------------
#connect to inbox
inbox = outlook.GetDefaultFolder(6)
#STEP 4 --------------------------------------------
#connect to adobe data reports folder within inbox
adobe_data_reports_folder = inbox.Folders['Cust Insights'].Folders['Adobe data reports']
#STEP 5 --------------------------------------------
#get all messages from adobe reports folder
messages_from_adr_folder = adobe_data_reports_folder.Items
#STEP 6 ---------------------------------------------
#get attachment for a specific message (this is just for testing in real world
#I'll do this for all messages)
for message in messages_from_adr_folder:
    body = message.Body
    if message.SentOn.strftime("%d-%m-%y") == '07-12-22':
        attachments = message.Attachments
        for attachment in attachments:
            stringofattachment = str(attachment)
            #STEP 6b - if the attachment is a csv file then save the attachment to a folder
            if stringofattachment.find('.csv') != -1:
                #SaveAsFile is a COM call, so pass the path as a plain string
                attachment.SaveAsFile(str(output_dir / str(attachment)))
                print(output_dir / str(attachment))
                #STEP 6C - reload the saved file as a dataframe
                importeddata = pd.read_csv(output_dir / str(attachment))
            else:
                print('NOT CSV')
    else:
        pass
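The question mentions that the report arrives as a CSV inside a zip file, which the code above does not handle. Once the zip attachments have been saved with SaveAsFile, something like the following could read every CSV inside them and stack the rows. This is a minimal sketch using only the standard library; the function name stack_csv_rows, the folder layout and the UTF-8 encoding are assumptions. With pandas, the opened member could instead be passed to pd.read_csv and the frames combined with pd.concat.

```python
import csv
import io
import zipfile
from pathlib import Path


def stack_csv_rows(folder):
    """Collect the rows of every CSV found inside the zip files in `folder`.

    Returns a list of dicts (one per CSV row), in zip-file-name order.
    """
    rows = []
    for zip_path in sorted(Path(folder).glob("*.zip")):
        with zipfile.ZipFile(zip_path) as archive:
            for member in archive.namelist():
                if not member.lower().endswith(".csv"):
                    continue  # skip anything that is not a CSV
                with archive.open(member) as raw:
                    # ZipFile.open yields bytes; wrap it to decode as text.
                    # Assumption: the export is UTF-8 (with or without BOM).
                    text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
                    rows.extend(csv.DictReader(text))
    return rows
```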

Why is this throwing an exception when I try to save the attachment from Outlook?

I am trying to iterate through the contents of a subfolder, and if the message contains an .xlsx attachment, download the attachment to a local directory. I have confirmed all other parts of this program work until that line, which throws an exception each time.
I am running the following code in a Jupyter notebook through VSCode:
# import libraries
import win32com.client
import re
import os
# set up connection to outlook
path = os.path.expanduser("~\\Desktop\\SBD_DB")
print(path)
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
inbox = outlook.GetDefaultFolder(6)
target_folder = inbox.Folders['SBD - Productivity'].Folders['Productivity Data Request']
target_folder.Name
messages = target_folder.Items
message = messages.GetLast()
# while True:
x=0
while x < 100:
    try:
        # print(message.subject) # get the subject of the email
        for attachment in message.attachments:
            if 'xlsx' in attachment.FileName:
                # print("reached")
                attachment.SaveAsFile(os.path.join(path, str(attachment.FileName)))
                print("found excel:", attachment.FileName)
        message = messages.GetPrevious()
        x+=1
    except:
        print("exception")
        message = messages.GetPrevious()
        x+=1
Looks like the following line of code throws an exception at runtime:
attachment.SaveAsFile(os.path.join(path, str(attachment.FileName)))
First, make sure that you are dealing with an attached file, not a link to the actual file. The Attachment.Type property returns an OlAttachmentType constant indicating the type of the specified object. You are interested in the olByValue value, which means the attachment is a copy of the original file and can be accessed even if the original file is removed.
Second, make sure that the file path (especially the FileName property) doesn't contain forbidden symbols; see What characters are forbidden in Windows and Linux directory names? for more information.
Third, make sure that the target folder exists on the disk and points to a local folder. According to the exception message:
'Cannot save the attachment. Path does not exist. Verify the path is correct.'
That is it: according to the error message, the path doesn't exist. Try to open the folder manually first. Before calling the SaveAsFile method, you need to create the target folder or make sure it already exists.
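The three checks above can be sketched in Python roughly as follows. The helper names safe_file_name and save_attachment are hypothetical, and the attachment argument is assumed to be an Outlook Attachment object obtained as in the question; the value 1 for olByValue comes from the OlAttachmentType enumeration.

```python
import os
import re

OL_BY_VALUE = 1  # OlAttachmentType.olByValue: the attachment is a copy of the file

# Characters Windows does not allow in file names, plus ASCII control characters.
_FORBIDDEN = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def safe_file_name(name, replacement="_"):
    """Replace characters that are not allowed in Windows file names."""
    cleaned = _FORBIDDEN.sub(replacement, name).rstrip(" .")
    return cleaned or "attachment"


def save_attachment(attachment, folder):
    """Save a by-value attachment into `folder`, creating the folder if needed.

    Returns the full path that was written, or None if the attachment
    is a link or embedded item rather than a file copy.
    """
    if attachment.Type != OL_BY_VALUE:
        return None
    os.makedirs(folder, exist_ok=True)  # avoids "Path does not exist"
    target = os.path.join(folder, safe_file_name(attachment.FileName))
    attachment.SaveAsFile(target)  # the COM call expects a plain string path
    return target
```

In the question's loop, the SaveAsFile line could then become save_attachment(attachment, path).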

Python Loop through emails in outlook

This code takes PDF attachments from emails, downloads them, merges them into one PDF file and sends it on.
Right now it takes all emails in that inbox that are marked with a specific category, so it merges the PDFs from all of those emails into one file.
But I want it to take the emails one by one: after downloading the PDFs from one email it should merge and send them, delete them from the folder, and only then move on to the second email.
How can I write such a loop for this code?
import datetime
import os
import win32com.client as win32
from PyPDF2 import PdfFileMerger
from pathlib import Path
path = ('C:\\Users\\Desktop\\Work')
today = datetime.date.today()
outlook = win32.Dispatch("Outlook.Application").GetNamespace("MAPI")
inbox = outlook.GetDefaultFolder(6)
subFolder = inbox.Folders("Test")
messages = subFolder.Items
def save_attachments(subject):
    for message in messages:
        if message.Categories == "Red Category":
            for attachment in message.Attachments:
                print(attachment.FileName)
                attachment.SaveAsFile(os.path.join(path, str(attachment)))
if __name__ == "__main__":
    save_attachments('PB report - next steps')
#Merge PDF's
merger = PdfFileMerger()
path_to_files = r'C:\Users\Desktop\Work/'
for root, dirs, file_names in os.walk(path_to_files):
    for file_name in file_names:
        merger.append(path_to_files + file_name)
merger.write(r"C:\Users\Desktop\Work\merged.pdf")
merger.close()
#Send PDF with outlook
# construct Outlook application instance
olApp = win32.Dispatch('Outlook.Application')
olNS = olApp.GetNameSpace('MAPI')
# construct the email item object
mailItem = olApp.CreateItem(0)
mailItem.Subject = 'Test'
mailItem.BodyFormat = 1
mailItem.Body = "Pdf merged"
mailItem.To = 'email'
path = (os.path.join('C:\\Users\\Desktop\\Work\\merged.pdf'))
mailItem.Attachments.Add(path)
mailItem.Display()
mailItem.Save()
mailItem.Send()
#Delete PDF's from folder
[f.unlink() for f in Path("C:\\Users\\Desktop\\Work").glob("*") if f.is_file()]
Iterating over all items in the folder is not really a good idea:
for message in messages:
    if message.Categories == "Red Category":
Instead, you need to use the Find/FindNext or Restrict methods of the Items class from the Outlook object model. So, in that case you will get all items that correspond to your search criteria and iterate over them only. Read more about these methods in the following articles:
How To: Use Find and FindNext methods to retrieve Outlook mail items from a folder (C#, VB.NET)
How To: Use Restrict method to retrieve Outlook mail items from a folder
Second, there is no need to create a new Outlook Application instance:
# construct Outlook application instance
olApp = win32.Dispatch('Outlook.Application')
olNS = olApp.GetNameSpace('MAPI')
Re-use the existing application instance instead. Moreover, Outlook is a singleton: you can't have two instances running at the same time.
Third, there is no need to display and save the item created before sending:
mailItem.Attachments.Add(path)
mailItem.Send()
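The points above do not show the one-email-at-a-time loop the question asks for. One possible structure is to give each message its own temporary folder, so that saving, merging, sending and cleanup all finish before the next message is touched. In this sketch, process_one_by_one and the handle callback are hypothetical names: handle stands in for the merge-and-send step, and messages can be any iterable of Outlook items, for example the result of Items.Restrict.

```python
import os
import tempfile


def process_one_by_one(messages, handle, extension=".pdf"):
    """Save each message's attachments to its own folder and pass them to `handle`.

    `handle(message, paths)` is called once per message that has matching
    attachments; the temporary folder is deleted before the next message.
    Returns the number of messages handled.
    """
    handled = 0
    for message in messages:
        with tempfile.TemporaryDirectory() as work_dir:
            paths = []
            for attachment in message.Attachments:
                name = attachment.FileName
                if not name.lower().endswith(extension):
                    continue  # ignore attachments of other types
                target = os.path.join(work_dir, name)
                attachment.SaveAsFile(target)
                paths.append(target)
            if paths:
                handle(message, paths)  # e.g. merge the PDFs and send the mail
                handled += 1
        # work_dir and the files in it no longer exist at this point
    return handled
```

Because the folder is removed automatically, the separate "Delete PDF's from folder" step is no longer needed in this arrangement.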

Python Command to Mark Mails as Read in Outlook (MAPI)

I am writing Python code to download a specific attachment from unread emails in Outlook and mark those emails as read. I have managed to finish 90% of it, i.e. I can iterate over the unread emails and download the attachments with a specific name. However, I have two issues.
1. I am downloading attachments with the same name, and if there are two attachments with the same name, only the one extracted in the last iteration is kept. I tried appending a time stamp to the end of the file name, but it has the same effect. Any help would be appreciated. This is not a mandatory requirement, since the mail comes at stipulated intervals and I can write separate Python code to rename the files, but I want to pack everything into this single script.
2. I would like to mark the email as read after the attachment is downloaded. I do not know the command for this one. I have attached the code for your reference.
P.S. This is my first real Python code. Also, this is my first post here. Apologies if this was already asked elsewhere.
import win32com.client
import os
import time
date_time_stamp = time.strftime("%Y%m%d-%H%M%S")
#set custom working directory
os.chdir('C:\\Users\\user_name\\Desktop\\')
print(os.getcwd())
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
main_inbox = outlook.GetDefaultFolder(6)
subfolder = main_inbox.Folders.Item("my_child_folder_under_inbox")
subfolderitems = subfolder.Items
message = subfolderitems.GetFirst()
attachment_name = 'my_attachment_name'
#Loop to pick messages that are unread
for message in subfolderitems:
    if message.Unread == True:
        print("New Mail Found... Downloading Attachment...")
        #Loop to check if the attachment name is the same
        for attachments in message.Attachments:
            if attachments.Filename == attachment_name:
                #Saves to the attachment to the working directory
                attachments.SaveASFile(os.getcwd() + '\\' + 'my_attachment_name' + date_time_stamp + '.csv')
                print (attachments)
                time.sleep(2)
                break
        #Go to next unread messages if any
        message = subfolderitems.GetNext()
    else:
        print ("Checking...")
--
Thanks and Regards,
Sakthi Ganesh K.
I think it has to do with your 'date_time_stamp': it is computed once at the top of the script, so every file saved during that run gets the same name and the system only keeps the last one. You could try using a UUID instead to ensure the suffix is a unique string:
import uuid
file_uuid = str(uuid.uuid4())
...
attachments.SaveASFile(os.getcwd() + '\\' + 'my_attachment_name' + file_uuid + '.csv')
To mark the message as Read, you could simply do:
message.Unread = False
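Putting both suggestions together, a sketch might look like the following. The helper names unique_name and save_and_mark_read are hypothetical, and message is assumed to be an Outlook mail item obtained as in the question. The property is spelled UnRead in the Outlook object model; the question's message.Unread reaches the same flag when win32com uses late binding.

```python
import os
import uuid


def unique_name(base, extension=".csv"):
    """Return `base` plus a random UUID so repeated saves never collide."""
    return f"{base}_{uuid.uuid4().hex}{extension}"


def save_and_mark_read(message, attachment_name, folder):
    """Save every attachment called `attachment_name`, then mark the message read.

    Returns the list of paths written; the message is left unread if
    nothing was saved.
    """
    saved = []
    for attachment in message.Attachments:
        if attachment.FileName == attachment_name:
            base = os.path.splitext(attachment_name)[0]
            target = os.path.join(folder, unique_name(base))
            attachment.SaveAsFile(target)
            saved.append(target)
    if saved:
        message.UnRead = False
    return saved
```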

How to Add File as Attachment to Outlook Item in Python

I have just created a few files and zipped them up, then begun an email to send it. It's probably simple, but I haven't been able to figure out how to specify a file by path to attach. Feeding the filepath alone doesn't seem to work?
ZipName = 'Order'+OrderNumber+'.zip'
zip = zipfile.ZipFile(ZipName, 'a', 8)
for file in os.listdir(filepath_out):
    if file.endswith(".epw"):
        zip.write(file)
zip.close()
outlook = win32.Dispatch('outlook.application')
mail = outlook.CreateItem(0)
#mail.From = 'sales@c1.com'
mail.To = 'support@c2.com'
mail.Subject = 'Files for Order ' + OrderNumber
mail.HtmlBody = ""
mail.Attachments.Add(ZipName)
mail.Display(True)
It's off topic but related; is there an easy way to specify a non-default "from" email address? "From" doesn't seem to be a property and "Sender" doesn't change anything.
Attachments.Add takes a fully qualified file name (e.g. c:\temp\order1.zip), not just a file name.
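In Python, a relative name such as ZipName can be turned into a fully qualified path with os.path.abspath before it is passed to Attachments.Add. A minimal sketch, where attach_file is a hypothetical helper and mail is assumed to be the item returned by CreateItem(0):

```python
import os


def attach_file(mail, file_name):
    """Attach `file_name` to `mail` using its absolute path."""
    # abspath resolves the name against the current working directory.
    full_path = os.path.abspath(file_name)
    if not os.path.isfile(full_path):
        raise FileNotFoundError(full_path)
    mail.Attachments.Add(full_path)
    return full_path
```

In the question's code this would replace mail.Attachments.Add(ZipName) with attach_file(mail, ZipName), provided the zip was written to the current working directory.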
