Adding if/else to a Python parser


I have a snippet of code here that uses Gmail POP to parse messages coming from a text message (1xxxxxxxxx7@vtext.com). I want the parser to be able to search for multiple strings in the message and run different code for each string. Right now, the parser is set to find sequences with 'Thank you', but I don't know how to expand on this as I am extremely new to Python. My code is as follows:
import poplib
from email import parser
pop_conn = poplib.POP3_SSL('pop.gmail.com')
pop_conn.user('xxxxxxxxxxxxx')
pop_conn.pass_('xxxxxxxxxxxxx')
#Get messages from server:
messages = [pop_conn.retr(i) for i in range(1, len(pop_conn.list()[1]) + 1)]
# Concat message pieces:
messages = ["\n".join(mssg[1]) for mssg in messages]
#Parse message into an email object:
messages = [parser.Parser().parsestr(Thankyou) for Thankyou in messages]
for message in messages:
    print 'Data Received'
pop_conn.quit()

The code snippet you provided uses list comprehensions, one of the most useful constructs in Python. They are worth learning if you want to write idiomatic Python.
As for your question: Thankyou here is just a variable name. It has no special meaning and does not make the parser look for the text 'Thank you'.

It looks like you're struggling with list comprehensions.
#List comprehension
messages = [parser.Parser().parsestr(Thankyou) for Thankyou in messages]
#Equivalent for loop
#Temporary list
temp = []
#Loop through all elements in messages
for Thankyou in messages:
    #Parse the raw text of the current element into an email Message object
    temp.append(parser.Parser().parsestr(Thankyou))
#Overwrite the messages list with the temporary one
messages = temp
As you can see, the list comprehension is a lot more concise. They're used a lot in Python code, but they're not scary. Just think of them as a for loop that iterates through every element in the given container and collects one result per element.
Note that parsestr() does not search for anything: it returns a Message object for every string it is given. To search for more tokens, test the text of each parsed message (for example, the value returned by get_payload()) with if/elif/else inside the final for loop, and run the code you want in each branch.
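As a minimal sketch of that kind of branching, assuming Python 3 and plain-text message bodies: the keyword "help", the return labels, and the function names body_text and classify are hypothetical and only illustrate the pattern. The sample message is built locally so the example runs without a mail server.

```python
from email import message_from_string


def body_text(message):
    """Return the plain-text body of a parsed message, lowercased."""
    if message.is_multipart():
        # Collect every text/plain part of a multipart message.
        parts = [part.get_payload() for part in message.walk()
                 if part.get_content_type() == "text/plain"]
        return "\n".join(parts).lower()
    return message.get_payload().lower()


def classify(message):
    """Pick a branch based on which string appears in the body."""
    text = body_text(message)
    if "thank you" in text:
        return "thanks"      # run the code for 'Thank you' here
    elif "help" in text:     # hypothetical second keyword
        return "help"        # run the code for 'help' here
    else:
        return "unknown"     # nothing matched


# Locally built sample message (not fetched from Gmail).
raw = "From: sender@example.com\nSubject: demo\n\nThank you for the update\n"
result = classify(message_from_string(raw))
```

In the original script, the same if/elif/else chain could sit inside the `for message in messages:` loop in place of the print statement.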

Related

I am trying to download attachments from Outlook using Python by specifying subject line. But the code does not download any attachment

Below is my code:
import win32com.client
import os
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
inbox = outlook.GetDefaultFolder(6) # "6" refers to the index of a folder - in this case the inbox. You can change that number to reference
messages = inbox.Items
message = messages.GetFirst()
subject = message.Subject
body = message.body
#
get_path = 'C:\\Users\\username\\Downloads'
for m in messages:
    if m.Subject == "Dummy report":
        attachments = message.Attachments
        num_attach = len([x for x in attachments])
        for x in range(1, num_attach):
            attachment = attachments.Item(x)
            attachment.SaveAsFile(os.path.join(get_path,attachment.FileName))
            print (attachment.FileName)
        break
    else:
        message = messages.GetNext()
Please let me know what is wrong with this code. I was able to find the specific mail but I was not able to download the attachment associated with that mail.

First, the subject line may contain forbidden symbols for file names. Make sure the file name string is safe. The file will not be saved if the string contains any forbidden symbols.
Second, it makes sense to check the Attachment.Type property which returns an OlAttachmentType constant indicating the type of the specified object. Make sure that you deal with real attached files by making sure the property is set to the olByValue value.
Third, make sure the FileName property is not empty. In some cases you may need to use the DisplayName property value instead.
Fourth, direct comparison of the subject line is not the best way to find items with a specified subject line. It may be prepended with RE: or FW: prefixes.
for m in messages:
    if m.Subject == "Dummy report":
Instead, you need to use the Find/FindNext or Restrict methods of the Items class. They allow getting items that correspond to your conditions without iterating over all items in the folder. Read more about these methods in the articles I wrote for the technical blog:
How To: Use Find and FindNext methods to retrieve Outlook mail items from a folder (C#, VB.NET)
How To: Use Restrict method to retrieve Outlook mail items from a folder
For example, you could use the following search criteria (VBA syntax):
criteria = "@SQL=" & Chr(34) _
& "urn:schemas:httpmail:subject" & Chr(34) _
& " ci_phrasematch 'question'"
This example shows Equivalence Matching, assuming that the folder you are searching contains items with the following subjects:
Question
Questionable
Unquestionable
RE: Question
The big question
If a store is indexed, searching with content indexer keywords is more efficient than with like. If your search scenarios include substring matching (which content indexer keywords don't support), use the like keyword in a DASL query.
Read more about that in the Filtering Items Using a String Comparison article.
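The first and third points (safe file names, and falling back to DisplayName when FileName is empty) could be sketched in Python roughly as follows. The helper name safe_filename, the fallback name "attachment", and the choice of "_" as the replacement character are assumptions for illustration. The character set is the one Windows rejects in file names.

```python
import re

# Characters Windows does not allow in file names, plus control characters.
# Assumption: the attachments are saved on a Windows file system.
_FORBIDDEN = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def safe_filename(file_name, display_name="", fallback="attachment"):
    """Return a name that is safe to pass to SaveAsFile.

    Prefers file_name, then display_name, then a fixed fallback,
    and replaces every forbidden character with an underscore.
    """
    name = (file_name or display_name or fallback).strip()
    # Windows also rejects names that end in a dot or a space.
    name = _FORBIDDEN.sub("_", name).rstrip(". ")
    return name or fallback


cleaned = safe_filename("re: report?.xlsx")
```

With an Outlook attachment object, the call would look like safe_filename(attachment.FileName, attachment.DisplayName), with the result joined to the target folder before saving.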

Taking on board some of @Eugene Astafiev's points, this code will iterate over the Inbox for a matching subject:
import win32com.client as wc
from os.path import join
ol = wc.gencache.EnsureDispatch('Outlook.Application')
ns = ol.GetNamespace('MAPI')
inbox = ns.GetDefaultFolder(wc.constants.olFolderInbox)
items = inbox.Items
pattern = 'Dummy report'
criteria = '@SQL="urn:schemas:httpmail:subject" like \'%' + pattern + '%\''
msg = items.Find(criteria)
while msg is not None:
    for att in msg.Attachments:
        if att.Type == wc.constants.olByValue:
            att.SaveAsFile(join('c:\\temp',att.FileName))
            print(att.FileName)
    msg = items.FindNext()
I have to say that I did try to use ci_phrasematch with Find, as suggested, but it would not work for me (even cutting & pasting the MS example into VBA). The keyword like does seem to work though.
NB. By using EnsureDispatch to create the Outlook object, you can access the Outlook enumerated constants from the documentation (such as olFolderInbox and olByValue) without resorting to magic numbers.

parser.BytesFeedParser().close() returns empty list in Python's email.parser

I'm trying to parse all emails from my mail server. But when I process the data and call close() on Python's email.parser.BytesFeedParser, I get an empty list error:
IndexError: pop from empty list
The code is as follows:
import poplib
from email import parser
client = poplib.POP3(host=theServer)
client.user(user=user)
client.pass_(pswd=password)
oriData = [client.retr(i)[1] for i in range(1, len(client.list()[1]) + 1)]
a = parser.BytesFeedParser()
for eleList in oriData:
    for ele in eleList:
        print(ele)
        a.feed(ele)
print(a)
test = a.close()
The related Python docs say:
close()
Complete the parsing of all previously fed data and return the
root message object. It is undefined what happens if feed() is called
after this method has been called.
That confuses me. I understand it to mean that when I call close(), it will return the EmailMessage object. I'm sure that the data I fed in is not empty, but it finally returns an empty list. I have searched for this problem with many search engines but found no answers, so I am posting this question on Stack Overflow. Can someone help me and explain what mistake I am making? 😢😢😢
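One thing that may be worth checking, offered as an assumption rather than a confirmed diagnosis: poplib returns each message as a list of lines with the line terminators removed, and the snippet above feeds the lines of every message into one shared parser. A minimal sketch that restores the terminators and uses one parser per message is shown below. The function name parse_pop_lines and the sample lines are hypothetical, and the sample stands in for the list found at retr(i)[1].

```python
from email.parser import BytesFeedParser


def parse_pop_lines(lines):
    """Parse one message given as a list of byte lines without terminators."""
    feed_parser = BytesFeedParser()   # a fresh parser for each message
    for line in lines:
        # Put back the line ending that poplib removed.
        feed_parser.feed(line + b"\r\n")
    return feed_parser.close()        # root message object


# Stand-in for client.retr(i)[1]; not real server data.
sample_lines = [b"From: a@example.com", b"Subject: hello", b"", b"body text"]
msg = parse_pop_lines(sample_lines)
```

Applied to the question's data, this would be called once per element of oriData instead of feeding everything into a single parser.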

How to delete messages with imap-tools

I use the following code to delete messages from my IMAP server
uids = []
for msg in mailbox.fetch(filter):
    print(msg.uid, msg.date, msg.from_, msg.subject)
    uids.append(msg.uid)
mailbox.delete([msg.uid])
That doesn't delete the intended messages, though. If the filter returns, e.g., 3 messages, only the first filtered message is deleted and then maybe two others (though I'm not sure about those two others).
I've read about MSNs that cause errors when used instead of UIDs when deleting messages. But I don't see the problem in the code above. Here is the example code from the repo which seems to work fine, but I don't understand the difference:
mailbox.delete([msg.uid for msg in mailbox.fetch()])
Can anybody point me in the right direction?

You collect message UIDs into a list (uids), and then remove only the last message (after the loop is finished).
This is probably what you intended to do (minimal changes for clarity):
uids = []
for msg in mailbox.fetch(filter):
    print(msg.uid, msg.date, msg.from_, msg.subject)
    uids.append(msg.uid)
mailbox.delete(uids)

How to format a list of strings into an html table?

I'm trying to send a sample message in gmail. My message contains text which is a list of items. I wanted to print each list item in a tabular format. Right now I am trying with the new line but every time when the string length is different, it is not aligned properly with the email. Can someone let me know how to achieve keeping my list items inside a message?
Following is the demo code I am trying with.
def html_table(list_name):
    print('<table>')
    for sublist in list_name:
        print(' <tr><td>')
        print(sublist)
        print(' </td></tr>')
    print('</table>')
l = ["--> ServiceName is ec2 , this is a sample string inside list item1 ",
"-------- ServiceName is s3 , this message is a sample string inside list item2------"]
message = """
Hi ,
The following are different aws services
{0}
Best regards,
Team
""".format('\n\t'.join(map(str, l)))
print(message)
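One possible approach, sketched under the assumption that the mail is sent as HTML: build the table as a string and return it instead of printing it, then place it in an HTML body. The variable names services, html_message, and mime are illustrative. html.escape keeps characters such as ">" in the items from being read as markup.

```python
from email.mime.text import MIMEText
from html import escape


def html_table(rows):
    """Return an HTML table with one escaped item per row."""
    body = "\n".join(
        "  <tr><td>{}</td></tr>".format(escape(str(row))) for row in rows
    )
    return "<table>\n{}\n</table>".format(body)


services = [
    "--> ServiceName is ec2 , this is a sample string inside list item1 ",
    "-------- ServiceName is s3 , this message is a sample string inside list item2------",
]

html_message = (
    "<p>Hi,</p>\n"
    "<p>The following are different aws services</p>\n"
    "{}\n"
    "<p>Best regards,<br>Team</p>"
).format(html_table(services))

# Mark the body as HTML so a mail client renders the table.
mime = MIMEText(html_message, "html")
```

The mime object can then be given From, To, and Subject headers and sent with whichever mail-sending code is already in use.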

Random.choice pulling entire list, instead of single items

I am (continually) working on a project with gmail and imaplib. I am searching gmail for emails that contain a specific word, and receiving a list of unique ids (most of my code is based off of/is Doug Hellman's, from his excellent imaplib tutorial). Those unique ids are saved to a list. I am trying to pull a single id from the list, but random.choice keeps pulling the entire list. Here is the code:
import imaplib
import random
import imaplib_connect
from imaplib_list_parse import parse_list_response
c = imaplib_connect.open_connection()
msg_ids = []
c.select('[Gmail]/Chats', readonly=True)
typ, msg_ids = c.search(None, '(BODY "friend")')
random_id = random.choice(msg_ids)
print random_id
I've messed around in the interpreter and msg_ids is definitely a list. I've also attempted to pull specific elements of the array (ex: msg_ids[1] etc) but it says "IndexError: list index out of range" which I understand to mean "the thing you are looking for isn't there", which is confusing because it is there.
Is there anytime a list is not a list? Or something? I'm confused.
As always, I appreciate any feedback the wonderful people of stackoverflow could give :)

I think msg_ids is a list containing a single element, something like ['1 2 3 5'] (one string holding all the ids, separated by spaces). So when you access msg_ids[1], it raises IndexError. And since there is only one element in the list, random.choice() always returns that element. You can try:
print random.choice(msg_ids[0].split())
To debug this kind of thing, print msg_ids, or run your code interactively in IPython to find the problem.
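A small sketch of the same idea in Python 3, where the search result holds bytes: the helper name pick_random_id and the sample data are hypothetical, and the sample stands in for the second value returned by c.search(), so no IMAP connection is needed.

```python
import random


def pick_random_id(search_data):
    """Return one message id from the data part of an IMAP search result.

    search_data is expected to be a one-element list whose item holds
    all matching ids separated by spaces.
    """
    if not search_data or not search_data[0]:
        return None                    # no messages matched
    ids = search_data[0].split()       # split the single item into ids
    return random.choice(ids)


# Stand-in for the msg_ids value from the question.
sample = [b"1 2 3 5"]
random_id = pick_random_id(sample)
```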
