Modify Subject, Body after message_from_string in Python - python

I am trying to modify an email2sms script for Smstools 3.
A sample incoming sms file:
$ cat /var/spool/sms/incoming/GSM1.AtEO8G
From: 950
From_TOA: D0 alphanumeric, unknown
From_SMSC: 421950900050
Sent: 17-09-13 17:41:17
Received: 17-09-13 17:48:21
Subject: GSM1
Modem: GSM1
IMSI: 231030011459971
Report: no
Alphabet: ISO
Length: 5
test1
The script is using the following code to format the message:
if (statuscode == 'RECEIVED'):
    smsfile = open(smsfilename)
    msg = email.message_from_string(smsfile.read())
    msg['Original-From'] = msg['From']
    msg['To'] = forwardto
The problem: I want to modify the Subject field in the code above. I tried something like msg['Subject '] = 'Example' (after msg['To']), but the Subject field is not overwritten; it is duplicated. Does anybody know how to modify it after calling email.message_from_string()?

You want to replace the existing Subject header of the message:
msg.replace_header('Subject', 'Example Subject')
Assigning to a key never overwrites an existing header; it always appends a new one. Use assignment only when the header doesn't exist yet:
msg['Subject'] = 'Example Subject' # adds a second Subject header
print(msg.items())
>> [('From', '950'), ('From_TOA', 'D0 alphanumeric, unknown'),
('From_SMSC', '421950900050'), ('Sent', '17-09-13 17:41:17'),
('Received', '17-09-13 17:48:21'), ('Subject', 'GSM1'),
('Modem', 'GSM1'), ('IMSI', '231030011459971'),
('Report', 'no'), ('Alphabet', 'ISO'),
('Length', '5'), ('Original-From', '950'),
('Subject', 'Example Subject')]
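A minimal sketch of two ways to end up with a single header, assuming a shortened copy of the sample SMS file (the string raw below is made up for illustration). replace_header raises KeyError when the header is missing, so the sketch checks for it first. Deleting the header and assigning again is an alternative, although the header then moves to the end of the list.

```python
import email

# Shortened stand-in for the SMS file from the question (illustrative only).
raw = "From: 950\nSubject: GSM1\nModem: GSM1\n\ntest1\n"
msg = email.message_from_string(raw)

# Option 1: replace in place; the header keeps its original position.
if 'Subject' in msg:
    msg.replace_header('Subject', 'Example Subject')
else:
    msg['Subject'] = 'Example Subject'

# Option 2: delete every existing occurrence, then assign once.
del msg['Modem']
msg['Modem'] = 'GSM2'

print(msg.get_all('Subject'))  # only one Subject value remains
print(msg.keys())
```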

Related

Python generated *.msg file

I'm Jan, this is my first post here, and the following code is also my first Python code. So please don't judge me if the code is not well shaped :) and don't be surprised that I had to shorten my mail body.
With the following code I am trying to generate several .msg files, one per entry in a list called "customer_names". The idea is to iterate through this list and adjust the email body, especially the placeholder for "Customer". The rest of the body content is not so important. Everything works well except the iteration through the list. I suspect that I may need to increment the list index in the loop. Do you have any ideas?
import win32com.client as win32
import datetime
import random
# List of customer names
customer_names = ['Name1','Name2','Name3']
current_customer_index = 0
# List of email providers
email_providers = ['provider1', 'provider2', 'provider3']
# set up the Outlook application
outlook = win32.Dispatch('outlook.application')
# create a new email
mail = outlook.CreateItem(0)
# set the subject and recipients
mail.Subject = "Request for Customer Information"
mail.To = "user@emailprovider.net"
# Message body for the email
message = f"Dear User,\n\nThe information we require from [Customer] is as follows:\n\n- Email Address: [Email Address] \n\nWe kindly request that you send the requested information to us within [number of days] days. \n\n Kind regards."
# set the number of days for the customer to respond
num_days = 7
# set the importance of the email to normal
mail.Importance = 1 # 0=Low, 1=Normal, 2=High
# set the sensitivity of the email to normal
mail.Sensitivity = 0 # 0=Normal, 1=Personal, 2=Private, 3=Confidential
# set the read receipt option to true
mail.ReadReceiptRequested = True
# add a reminder for the sender to follow up in 3 days
mail.FlagRequest = "Follow up"
mail.FlagDueBy = (datetime.date.today() + datetime.timedelta(days=3)).strftime('%m/%d/%Y')
# Generate a random email
for i in range(len(customer_names)):
    customer = customer_names[i]
    message_with_data = message.replace("[Customer]", customer)
    message_with_data = message_with_data.replace("[number of days]", str(num_days))
    mail.Body = message_with_data
    file_name = "Request_" + customer + ".msg"
    file_path = "C:/Users/user/Desktop/Email/" + file_name
    mail.SaveAs(file_path)
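The post does not say what exactly goes wrong in the loop, so the following is only a sketch of the placeholder filling, kept separate from Outlook so it can be run anywhere. The helper render_body and the shortened template are hypothetical. The loop iterates over the names directly, so no manual index is needed. With Outlook, it may also be worth creating a fresh item with outlook.CreateItem(0) inside the loop instead of reusing one mail object.

```python
customer_names = ['Name1', 'Name2', 'Name3']
num_days = 7

# Shortened version of the template from the post (illustrative only).
template = ("Dear User,\n\nThe information we require from [Customer] "
            "is needed within [number of days] days.\n\nKind regards.")


def render_body(template, customer, num_days):
    """Fill the two placeholders used in the post's message template."""
    return (template
            .replace("[Customer]", customer)
            .replace("[number of days]", str(num_days)))


# Map each output file name to its rendered body.
bodies = {}
for customer in customer_names:  # iterate over the names, not over indexes
    file_name = "Request_" + customer + ".msg"
    bodies[file_name] = render_body(template, customer, num_days)
    # In the Outlook version, this is where a new item would be created,
    # its Body set to the rendered text, and SaveAs called with the path.

print(sorted(bodies))
```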

Facing problem to decode ?UTF-8?B?ZnVjayDwn5CO?=! type in subject. Using IMAP and Python

I need to get the real string instead of the encoded one. Some subjects arrive as plain strings, but others arrive in this encoded format, and I don't know how to handle them.
How can I decode the string and print the decoded subject?
import email
import imaplib
FROM_EMAIL = "my_id@gmail.com"
FROM_PWD = "my Password"
SMTP_SERVER = "imap.gmail.com"
SMTP_PORT = 993
l=['Developer','Architect','NEED','Internship','Urgent']
def get_body(msg):
    if msg.is_multipart():
        return get_body(msg.get_payload(0))
    else:
        return msg.get_payload(None,True)
def readmail():
    mail = imaplib.IMAP4_SSL(SMTP_SERVER)
    mail.login(FROM_EMAIL,FROM_PWD)
    mail.select('inbox')
    type, data = mail.search(None, '(SINCE "20-May-2020" BEFORE "26-May-2020")')
    mail_ids = data[0]
    id_list = mail_ids.split()
    id_list=id_list[::-1]
    first_email_id = id_list[0]
    latest_email_id = id_list[-1]
    for byte_obj in id_list:
        typ, data = mail.fetch(byte_obj, '(RFC822)' )
        raw=email.message_from_bytes(data[0][1])
        msg=get_body(raw)
        s=''
        s=raw['SUBJECT']
        s1=raw['Date']
        print(s)
readmail()
output:
Winner announcement! Amazon Kindle Oasis.
[FREE WEBINAR] Natural Language Processing for Beginners
Godrej 24 | Get Rs. 2 Lakh Gold Voucher | 2 & 3 BHK at Rs. 83 Lakh*
=?UTF-8?B?TGFzdCBkYXkgdG8gc2F2ZSEgUG9wdWxhciBjb3Vyc2VzIGFzIGw=?=
=?UTF-8?B?b3cgYXMg4oK5NDU1?=
Panda just uploaded a video
Vernix Gamerz just uploaded a video
Most of your question has been answered here:
Find, decode and replace all base64 values in text file
To better understand your example, here is some additional information:
Some of your subject lines are RFC 2047 encoded words whose text is base64-encoded.
Take the following part of your string s=raw['SUBJECT'] as an example:
=?UTF-8?B?TGFzdCBkYXkgdG8gc2F2ZSEgUG9wdWxhciBjb3Vyc2VzIGFzIGw=?=
=?UTF-8?B?b3cgYXMg4oK5NDU1?=
Each encoded word has the following structure:
First comes the prefix, which names the character set (UTF-8) and the encoding (B for base64):
=?UTF-8?B?
Then comes the encoded string, including any trailing = padding:
TGFzdCBkYXkgdG8gc2F2ZSEgUG9wdWxhciBjb3Vyc2VzIGFzIGw=
Followed by the terminator:
?=
Decoding the string from base64 and reading the bytes as UTF-8 gives you the text:
Last day to save! Popular courses as l
You can verify this under https://www.base64decode.org/
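The standard library can do this decoding without manual base64 handling. A minimal sketch, assuming the two-line subject from the question is available as a string (raw_subject here stands in for raw['SUBJECT']): email.header.decode_header splits the header into (bytes, charset) chunks, and make_header joins them back into readable text.

```python
from email.header import decode_header, make_header

# Stand-in for raw['SUBJECT']: two encoded words on folded header lines.
raw_subject = (
    "=?UTF-8?B?TGFzdCBkYXkgdG8gc2F2ZSEgUG9wdWxhciBjb3Vyc2VzIGFzIGw=?=\n"
    " =?UTF-8?B?b3cgYXMg4oK5NDU1?="
)


def decode_subject(value):
    """Return the human-readable form of a possibly encoded header value."""
    return str(make_header(decode_header(value)))


print(decode_subject(raw_subject))
# Subjects that are not encoded pass through unchanged.
print(decode_subject("Panda just uploaded a video"))
```

In readmail() this would mean printing decode_subject(s) instead of s.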

How to Handle single and Multiple Email ID fetched from Database in Python

I have a scenario where I have to extract email addresses from a database and send mails to the respective users.
The value fetched from the database can be a single email ID or multiple email IDs.
I have written the following code, and it throws an error when it encounters multiple email IDs separated by commas.
for index, row in df1.iterrows():
    myVar1 = row["abc"]
    #myVar2 = row["Email"]
    if row["Email"].count('@') > 1:
        myVar2 = ','.join(row["Email"])
    else:
        myVar2 = row["Email"]
    msg = email.message.Message()
    msg['From'] = 'do.not.reply@xyz.com'
    msg['To'] = myVar2
    msg['Subject'] = "abc to be read - {0}".format(myVar1)
    msg.add_header('Content-Type', 'text')
    msg.set_payload("Hello Users,\n\n ABC - {0} has reached its limit.".format(myVar1))
    smtp_obj = smtplib.SMTP("outlook.xyz.com")
    smtp_obj.sendmail(msg['From'],msg['To'], msg.as_string())
    smtp_obj.quit()
If it is a single email ID, the mail is triggered properly, but if multiple email IDs are passed, every character ends up separated by a comma:
input 'abc@xyz.com,asd@xyz.com'
error message : a,b,c,@,x,y,z,.,c,o,m,,,a,s,d,@,x,y,z,.,c,o,m
Please help me with this.
Thanks
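The output above is what ','.join() produces when it is given a string: it iterates over the characters. A minimal sketch of one possible fix, assuming the database value is a comma-separated string: split it into a list, use the list as the envelope recipients for sendmail, and join it again only for the To header. The helpers parse_recipients and build_message are hypothetical names, and the SMTP call is left as a comment because it needs a reachable server.

```python
from email.message import EmailMessage


def parse_recipients(raw):
    """Split a comma-separated address string into a clean list."""
    return [addr.strip() for addr in raw.split(',') if addr.strip()]


def build_message(sender, recipients, subject, body):
    """Build a plain-text message addressed to every recipient."""
    msg = EmailMessage()
    msg['From'] = sender
    msg['To'] = ', '.join(recipients)  # header: one comma-separated string
    msg['Subject'] = subject
    msg.set_content(body)
    return msg


recipients = parse_recipients('abc@xyz.com,asd@xyz.com')
msg = build_message('do.not.reply@xyz.com', recipients,
                    'abc to be read - 42',
                    'Hello Users,\n\n ABC - 42 has reached its limit.')

# Envelope: pass the list itself, not the header string, e.g.
# smtp_obj.sendmail(msg['From'], recipients, msg.as_string())
print(recipients)
```

The same code handles a single address, because splitting a string without commas yields a one-element list.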

Python Retrieve Email Addresses

I want to create a Python program to resend an email using the "email" and "smtplib" packages.
import email
import smtplib
f = open('email_source.eml')
em = email.message_from_file(f)
email_from = em['FROM'] # '"Me" <me@xyz.com>'
email_to = em['TO'] # '"John, A" <john@abc.com>, "Peter, B" <peter@def.com>'
In the above case I have 2 recipients, and I want to resend the email to both of them with smtplib.
import smtplib
smtp = smtplib.SMTP('localhost', 25)
smtp.sendmail(email_from, email_to, em.as_string())
If I pass the string email_to to sendmail, the email is only sent to the first person. If I replace email_to with a list,
email_to_list = ['"John, A" <john@abc.com>', '"Peter, B" <peter@def.com>']
the email is sent to both people.
My question is: can I extract the recipients from the em['TO'] and em['CC'] strings into a list?
Thank you.
The problem is that smtp.sendmail requires a list of addresses, according to the documentation:
SMTP.sendmail(from_addr, to_addrs, msg, mail_options=[], rcpt_options=[])
Send mail. The required arguments are an RFC 822 from-address string, a list of RFC 822 to-address strings (a bare string will be treated as a list with 1 address) […]
From the email-package you get a string, which the smtp-package then interprets as only one address.
In simple words, you need to split your to-address-string into a list of addresses.
How do you do this? You could do it manually, but it's best to just rely on the library:
import email.utils
email_to_raw = '"John, A" <john@abc.com>, "Peter, B" <peter@def.com>'
# split into (Name, Addr) tuple
email_to_split = email.utils.getaddresses([email_to_raw])
# combine the tuples into addresses, but keep the list
email_to = [email.utils.formataddr(pair) for pair in email_to_split]
print(email_to) # ['"John, A" <john@abc.com>', '"Peter, B" <peter@def.com>']
After swearing a bit at the designer of the API, you wrap it up into a function:
import email.utils
def split_combined_addresses(addresses):
    # addresses must be a list of header strings, not a single string
    parts = email.utils.getaddresses(addresses)
    return [email.utils.formataddr(name_addr) for name_addr in parts]
print(split_combined_addresses(email_to))
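To cover the To and Cc headers together, as the question asks, the same library calls can be applied to the parsed message directly. A minimal sketch, assuming a small made-up message in place of email_source.eml: get_all collects every occurrence of a header, and getaddresses turns the combined values into (name, address) pairs. Only the bare addresses are kept here, which is all that sendmail needs for the envelope.

```python
import email
from email.utils import getaddresses

# Made-up stand-in for the contents of email_source.eml.
raw = (
    'From: "Me" <me@xyz.com>\n'
    'To: "John, A" <john@abc.com>, "Peter, B" <peter@def.com>\n'
    'Cc: carol@ghi.com\n'
    'Subject: hello\n'
    '\n'
    'body\n'
)
em = email.message_from_string(raw)

# get_all returns a list of header values, or the default if none exist.
headers = em.get_all('To', []) + em.get_all('Cc', [])
recipients = [addr for _name, addr in getaddresses(headers)]

print(recipients)
# smtp.sendmail(email_from, recipients, em.as_string())
```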
This is how I do it.
I assume the addresses are separated by semicolons, but you can replace ; with ,
Here is some sample data:
addresses='"Johnny Test" <johnny@test.com>; Jack <another@test.com>; "Scott Summers" <scotts@test.com>; noname@test.com'
Source code:
import re
def split_combined_addresses(addresses):
    # remove line breaks and tabs
    addresses = re.sub(r"\n|\r|\t", "", addresses)
    # addrs = re.findall(r'(.*?)\s<(.*?)>,?', addresses) # comma separated
    addrs = re.findall(r'(.*?)\s<(.*?)>;?', addresses) # semicolon separated
    # strip whitespace and double quotes from the names
    addrs_clean = [(name.replace('"', '').strip(), addr) for name, addr in addrs]
    # add addresses that appear without a display name
    known = {addr for _, addr in addrs_clean}
    for addr in re.findall(r"[\w.+-]+@[\w-]+\.[\w.-]+", addresses):
        if addr not in known:
            addrs_clean.append(('', addr))
    return addrs_clean

Python email quoted-printable encoding problem

I am extracting emails from Gmail using the following:
def getMsgs():
    try:
        conn = imaplib.IMAP4_SSL("imap.gmail.com", 993)
    except:
        print('Failed to connect')
        print('Is your internet connection working?')
        sys.exit()
    try:
        conn.login(username, password)
    except:
        print('Failed to login')
        print('Is the username and password correct?')
        sys.exit()
    conn.select('Inbox')
    # typ, data = conn.search(None, '(UNSEEN SUBJECT "%s")' % subject)
    typ, data = conn.search(None, '(SUBJECT "%s")' % subject)
    for num in data[0].split():
        typ, data = conn.fetch(num, '(RFC822)')
        msg = email.message_from_string(data[0][1])
        yield walkMsg(msg)
def walkMsg(msg):
    for part in msg.walk():
        if part.get_content_type() != "text/plain":
            continue
        return part.get_payload()
However, for some emails it is nearly impossible to extract dates (using regex), because encoding-related characters such as '=' land in the middle of various text fields. Here's an example where this occurs in a date range I want to extract:
Name: KIRSTI Email:
kirsti@blah.blah Phone #: + 999
99995192 Total in party: 4 total, 0
children Arrival/Departure: Oct 9=
,
2010 - Oct 13, 2010 - Oct 13, 2010
Is there a way to remove these encoding characters?
You could/should use the email.parser module to decode mail messages, for example (quick and dirty example!):
from email.parser import FeedParser
f = FeedParser()
f.feed("<insert mail message here, including all headers>")
rootMessage = f.close()
# Now you can access the message and its submessages (if it's multipart)
print(rootMessage.is_multipart())
# Or check for errors
print(rootMessage.defects)
# If it's a multipart message, you can get the first submessage and then its payload
# (i.e. content) like so:
rootMessage.get_payload(0).get_payload(decode=True)
Using the "decode" parameter of Message.get_payload, the module automatically decodes the content, depending on its encoding (e.g. quoted printables as in your question).
If you are using Python 3.6 or later, you can use the email.message.EmailMessage.get_content() method to decode the text automatically. This method supersedes get_payload(), though get_payload() is still available.
Say you have a string s containing this email message (based on the examples in the docs):
Subject: Ayons asperges pour le =?utf-8?q?d=C3=A9jeuner?=
From: =?utf-8?q?Pep=C3=A9?= Le Pew <pepe@example.com>
To: Penelope Pussycat <penelope@example.com>,
 Fabrette Pussycat <fabrette@example.com>
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0

Salut!
Cela ressemble =C3=A0 un excellent recipie[1] d=C3=A9jeuner.
[1] http://www.yummly.com/recipe/Roasted-Asparagus-Epicurious-203718
--Pep=C3=A9
=20
Non-ascii characters in the string have been encoded with the quoted-printable encoding, as specified in the Content-Transfer-Encoding header.
Create an email object:
import email
from email import policy
msg = email.message_from_string(s, policy=policy.default)
Setting the policy is required here; otherwise policy.compat32 is used, which returns a legacy Message instance that doesn't have the get_content method. policy.default will eventually become the default policy, but as of Python 3.7 the default is still policy.compat32.
The get_content() method handles decoding automatically:
print(msg.get_content())
Salut!
Cela ressemble à un excellent recipie[1] déjeuner.
[1] http://www.yummly.com/recipe/Roasted-Asparagus-Epicurious-203718
--Pepé
If you have a multipart message, get_content() needs to be called on the individual parts, like this:
for part in msg.iter_parts():
    print(part.get_content())
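Put together, the steps of this answer can be sketched as one runnable snippet. The message text is a shortened copy of the example above, with the blank line between headers and body that the parser needs. With policy.default, both the quoted-printable body and the encoded Subject header come back decoded.

```python
import email
from email import policy

# Shortened copy of the example message (single recipient, shorter body).
raw = (
    "Subject: Ayons asperges pour le =?utf-8?q?d=C3=A9jeuner?=\n"
    "From: =?utf-8?q?Pep=C3=A9?= Le Pew <pepe@example.com>\n"
    "To: Penelope Pussycat <penelope@example.com>\n"
    'Content-Type: text/plain; charset="utf-8"\n'
    "Content-Transfer-Encoding: quoted-printable\n"
    "MIME-Version: 1.0\n"
    "\n"
    "Salut!\n"
    "\n"
    "Cela ressemble =C3=A0 un excellent recipie[1] d=C3=A9jeuner.\n"
)

# policy.default yields an EmailMessage, which provides get_content().
msg = email.message_from_string(raw, policy=policy.default)

body = msg.get_content()       # quoted-printable is decoded to str
subject = str(msg['Subject'])  # encoded words in headers are decoded too

print(subject)
print(body)
```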
That's known as quoted-printable encoding. You probably want to use something like quopri.decodestring - http://docs.python.org/library/quopri.html
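A small sketch of what quopri.decodestring does with the date range from the question, assuming the text is available as bytes (the sample is retyped from the excerpt above). A trailing = is a soft line break, so the decoder removes it and joins the two lines.

```python
import quopri

# The date range from the question, with its soft line break ("=\n").
encoded = b"Arrival/Departure: Oct 9=\n,\n2010 - Oct 13, 2010"

# decodestring works on bytes and returns bytes.
decoded = quopri.decodestring(encoded)

print(decoded.decode("utf-8"))
```

For whole messages, letting the email package decode the payload is usually less error-prone, since it also takes the declared charset into account.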
