I am building an XML document.
I want to do something like this:
I can now append every email to ValidationRequest, but the output is in the wrong format.
This is my code:
import lxml.etree
import lxml.builder
email = ['12345#gmail.com','564685#yahoo.com.hk','54321#yahoo.com.hk']
E = lxml.builder.ElementMaker()
ValidationRequest = E.ValidationRequest
Username = E.Username
Password = E.Password
Email = E.Email
op = []
for a in email:
    GG = Email(a)
    op.append(lxml.etree.tostring(GG, pretty_print=True))
print (lxml.etree.tostring(GG, pretty_print=True))
ed = ''.join(map(str, op))
the_doc = ValidationRequest(
print (lxml.etree.tostring(the_doc, pretty_print=True))
The response:
[b'<Email>12345#gmail.com</Email>\n', b'<Email>564685#yahoo.com.hk</Email>\n', b'<Email>54321#yahoo.com.hk</Email>\n']
How can I keep the <Email></Email> tags and put the elements inside ValidationRequest?
You had the right idea; there's just a lot of code there that only moves things around. You can bring it down to this:
import lxml.etree
import lxml.builder
email = ['12345#gmail.com','564685#yahoo.com.hk','54321#yahoo.com.hk']
em = lxml.builder.ElementMaker()
the_doc = em.ValidationRequest(
    *(em.Email(address) for address in email)
)
print(lxml.etree.tostring(the_doc, pretty_print=True).decode())
The .decode() is only there for it to print nicely. If you need the bytes, you can just leave that off.
Note this line in particular:
*(em.Email(address) for address in email)
This loops over email, generating an Email element for each address as it goes, and then unpacks the resulting generator (the parentheses around the expression make it a generator expression, not a tuple) into the constructor for the ValidationRequest element.
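To see the unpacking on its own, here is a minimal sketch that runs with only the standard library. The make_element helper is hypothetical: it is a stand-in written for this illustration that mimics how an ElementMaker call accepts children as positional arguments, and it is not part of lxml.

```python
import xml.etree.ElementTree as ET


def make_element(tag, *children):
    """Hypothetical stand-in for an ElementMaker call.

    Strings become the element's text; elements are appended as children.
    """
    element = ET.Element(tag)
    for child in children:
        if isinstance(child, str):
            element.text = (element.text or "") + child
        else:
            element.append(child)
    return element


emails = ['12345#gmail.com', '564685#yahoo.com.hk']

# The parentheses create a generator expression; the leading * unpacks it
# into separate positional arguments, one per Email element.
request = make_element(
    "ValidationRequest",
    *(make_element("Email", address) for address in emails)
)

xml_text = ET.tostring(request, encoding="unicode")
print(xml_text)
```

The same call shape should carry over to the lxml version, because the star operator works on any iterable regardless of which function receives the arguments.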
I've read the post "How can I search the Outlook (2010) Global Address List for a name?" and found a working solution for getting a name from the Outlook GAL.
I have 3 questions:
1. I can get the contact if the search_string is an email address. When it's a name, the search doesn't work: recipient.Resolved is False but recipient.Sendable is True, and then I get an error when using the ae object. What am I doing wrong?
2. I don't understand the code well enough to modify it for searching multiple names. I simply created a for loop, but maybe there is a more efficient way? For example, can I reuse the outlook.Session object between different searches?
3. Is the line recipient.Resolve() necessary?
Thanks in advance!
My attempt is below.
from __future__ import print_function
import win32com.client
search_strings = ['Doe John', 'Doe Jane']
outlook = win32com.client.gencache.EnsureDispatch('Outlook.Application')
for search_string in search_strings:
    recipient = outlook.Session.CreateRecipient(search_string)
    print('Resolved OK: ', recipient.Resolved)
    print('Is it a sendable? (address): ', recipient.Sendable)
    print('Name: ', recipient.Name)
    ae = recipient.AddressEntry
    email_address = None
    if 'EX' == ae.Type:
        eu = ae.GetExchangeUser()
        email_address = eu.PrimarySmtpAddress
    if 'SMTP' == ae.Type:
        email_address = ae.Address
    print('Email address: ', email_address)
Can't believe I found the solution so quickly after posting the question. Since the answer is hard to find, I'm sharing my findings here.
It's inspired by "How to fetch exact match of addressEntry object from GAL (Global Address List)", though that is in C# rather than Python.
This method uses an exact match on the display name rather than relying on Outlook to resolve the name. It is also possible to loop through the global address list and do a partial match yourself.
import win32com.client
search_string = 'Doe John'
outlook = win32com.client.gencache.EnsureDispatch('Outlook.Application')
gal = outlook.Session.GetGlobalAddressList()
entries = gal.AddressEntries
ae = entries[search_string]
email_address = None
if 'EX' == ae.Type:
    eu = ae.GetExchangeUser()
    email_address = eu.PrimarySmtpAddress
if 'SMTP' == ae.Type:
    email_address = ae.Address
print('Email address: ', email_address)
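The partial match mentioned above could look roughly like the sketch below. It assumes that the AddressEntries collection can be iterated from Python and that each entry exposes a Name property; the find_entries helper and the stand-in objects are made up for illustration so that the sketch runs without Outlook.

```python
from types import SimpleNamespace


def find_entries(entries, fragment):
    """Return entries whose display name contains `fragment`, ignoring case."""
    needle = fragment.lower()
    return [entry for entry in entries if needle in entry.Name.lower()]


# Stand-in objects so the sketch runs without Outlook installed.
fake_entries = [
    SimpleNamespace(Name="Doe John"),
    SimpleNamespace(Name="Doe Jane"),
    SimpleNamespace(Name="Smith Alice"),
]

matches = find_entries(fake_entries, "doe")
print([entry.Name for entry in matches])
```

With a real address list you would pass gal.AddressEntries instead of fake_entries. Walking a large GAL this way may be slow, so the exact-match lookup is probably preferable when the full display name is known.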
I can print rev['contributor'] for a while, but then every attempt to access rev['contributor'] raises the following:
TypeError: string indices must be integers
I'm trying to extract data from an XML file using xmltodict with this code:
import xmltodict, json
with open('Sockpuppet_articles.xml', encoding='utf-8') as xml_file:
    dic_xml = xmltodict.parse(xml_file.read(), xml_attribs=False)
    for page in dic_xml['mediawiki']['page']:
        for rev in page['revision']:
            for user in open("Sockpuppet_names.txt", "r", encoding='utf-8'):
                user = user.strip()
                if 'username' in rev['contributor'] and rev['contributor']['username'] == user:
I get this error in the last line with the if-statement:
TypeError: string indices must be integers
Weird thing is, it works on another xml-file.
I got the same error when the next level has only one element.
## Read XML
import codecs
import glob
import os
import xmltodict
pastas = [os.path.join(caminho, name) for name in os.listdir(caminho)]
pastas = filter(os.path.isdir, pastas)
for pasta in pastas:
    for arq in glob.glob(os.path.join(pasta, "*.xml")):
        xmlData = codecs.open(arq, 'r', encoding='utf8').read()
        xmlDict = xmltodict.parse(xmlData, xml_attribs=True)["XMLBIBLE"]
        # xmltodict prefixes attribute keys with "@" by default
        bible_name = xmlDict["@biblename"]
        list_verse = []
        for xml_inBook in xmlDict["BIBLEBOOK"]:
            bnumber = xml_inBook["@bnumber"]
            bname = xml_inBook["@bname"]
            for xml_chapter in xml_inBook["CHAPTER"]:
                cnumber = xml_chapter["@cnumber"]
                for xml_verse in xml_chapter["VERS"]:
                    vnumber = xml_verse["@vnumber"]
                    vtext = xml_verse["#text"]
TypeError: string indices must be integers
The error occurs when the book is "Obadiah". It has only one chapter.
Clicking the CHAPTER value shows the following view. You would expect xml_chapter to look the same, but that is true only if the book has more than one chapter:
But the loop returns "@cnumber" instead of an OrderedDict, because iterating over a single OrderedDict yields its keys.
I solved that by converting the OrderedDict to a list when the book has only one chapter.
if len(xml_inBook["CHAPTER"]) == 2:
    # a single chapter is one OrderedDict with two keys: "@cnumber" and "VERS"
    xml_chapter = list(xml_inBook["CHAPTER"].items())
    cnumber = xml_chapter[0][1]
    for xml_verse in xml_chapter[1][1]:
        vnumber = xml_verse["@vnumber"]
        vtext = xml_verse["#text"]
I am using Python 3.6.
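A more general way to handle the single-element case is to normalise every repeated element to a list before looping. The sketch below uses plain dicts as stand-ins for xmltodict output so that it runs without any XML file; the as_list and verses helpers are hypothetical names, and the "@"-prefixed keys follow xmltodict's default attribute prefix. xmltodict's parse() also accepts a force_list argument that aims at the same problem at parse time, which may be worth trying first.

```python
def as_list(value):
    """Wrap a lone item in a list so single and repeated elements loop alike."""
    if value is None:
        return []
    if isinstance(value, list):
        return value
    return [value]


# Plain dicts standing in for what xmltodict returns.
book_one_chapter = {
    "CHAPTER": {"@cnumber": "1", "VERS": {"@vnumber": "1", "#text": "a"}},
}
book_two_chapters = {
    "CHAPTER": [
        {"@cnumber": "1", "VERS": [{"@vnumber": "1", "#text": "a"},
                                   {"@vnumber": "2", "#text": "b"}]},
        {"@cnumber": "2", "VERS": {"@vnumber": "1", "#text": "c"}},
    ],
}


def verses(book):
    """Collect (chapter, verse, text) tuples regardless of element count."""
    found = []
    for chapter in as_list(book.get("CHAPTER")):
        for verse in as_list(chapter.get("VERS")):
            found.append((chapter["@cnumber"], verse["@vnumber"], verse["#text"]))
    return found


print(verses(book_one_chapter))
print(verses(book_two_chapters))
```

The same wrapper would apply to page['revision'] in the question above, where a page with a single revision is presumably what triggers the error.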
I have the following script, which processes emails and saves them to a CSV file. The script will later be extended to use the mechanize library to submit the extracted email data to another web interface. That step may sometimes fail. I can trap the specific email without any problem, but how can I forward the trapped email to a different address, where I can process it manually or see what's wrong with it?
Here's the script
import ConfigParser
import poplib
import email
import BeautifulSoup
import csv
import time
DEBUG = False
CFG = 'email' # 'email' or 'test_email'
#def get_config():
def get_config(fnames=['cron/orderP/get_orders.ini'], section=CFG):
    """Read settings from one or more .ini files"""
    cfg = ConfigParser.SafeConfigParser()
    cfg.read(fnames)  # load the .ini files before querying them
    return {
        'host': cfg.get(section, 'host'),
        'use_ssl': cfg.getboolean(section, 'use_ssl'),
        'user': cfg.get(section, 'user'),
        'pwd': cfg.get(section, 'pwd')
    }
def get_emails(cfg, debuglevel=0):
    """Returns a list of emails"""
    # pick the appropriate POP3 class (uses SSL or not)
    #pop = [poplib.POP3, poplib.POP3_SSL][cfg['use_ssl']]
    emails = []
    try:
        # connect!
        host = cfg['host']
        mail = poplib.POP3(host)
        mail.set_debuglevel(debuglevel)  # 0 (none), 1 (summary), 2 (verbose)
        # authenticate with the credentials from the config
        mail.user(cfg['user'])
        mail.pass_(cfg['pwd'])
        # how many messages?
        num_messages = mail.stat()[0]
        print('{0} new messages'.format(num_messages))
        # get text of messages
        if num_messages:
            get = lambda i: mail.retr(i)[1]  # retrieve each line in the email
            txt = lambda ss: '\n'.join(ss)  # join them into a single string
            eml = lambda s: email.message_from_string(s)  # parse the string as an email
            print('Getting emails...')
            emails = [eml(txt(get(i))) for i in xrange(1, num_messages+1)]
    except poplib.error_proto, e:
        print('Email error: {0}'.format(e.message))
    mail.quit()  # close connection
    return emails
def parse_order_page(html):
    """
    Accept an HTML order form
    Returns (sku, shipto, [items])
    """
    bs = BeautifulSoup.BeautifulSoup(html)  # parse html
    # sku is in first <p>, shipto is in second <p>...
    ps = bs.findAll('p')  # find all paragraphs in data
    sku = ps[0].contents[1].strip()  # sku as unicode string
    shipto_lines = [line.strip() for line in ps[1].contents[2::2]]
    shipto = '\n'.join(shipto_lines)  # shipping address as unicode string
    # items are in three-column table
    cells = bs.findAll('td')  # find all table cells
    txt = [cell.contents[0] for cell in cells]  # get cell contents
    items = zip(txt[0::3], txt[1::3], txt[2::3])  # group by threes - code, description, and quantity for each item
    return sku, shipto, items
def get_orders(emails):
    """
    Accepts a list of order emails
    Returns order details as list of (sku, shipto, [items])
    """
    orders = []
    for i, eml in enumerate(emails, 1):
        pl = eml.get_payload()
        if isinstance(pl, list):
            sku, shipto, items = parse_order_page(pl[1].get_payload())
            orders.append([sku, shipto, items])
        else:
            print("Email #{0}: unrecognized format".format(i))
    return orders
def write_to_csv(orders, fname):
    """
    Accepts a list of orders
    Write to csv file, one line per item ordered
    """
    # the with block closes the file even if a row fails to write
    with open(fname, 'wb') as outf:
        outcsv = csv.writer(outf)
        for poNumber, shipto, items in orders:
            outcsv.writerow([])  # leave blank row between orders
            for code, description, qty in items:
                outcsv.writerow([poNumber, shipto, code, description, qty])
                # The point where mechanize will come to play
def main():
    cfg = get_config()
    emails = get_emails(cfg)
    orders = get_orders(emails)
    write_to_csv(orders, 'cron/orderP/{0}.csv'.format(int(time.time())))
if __name__=="__main__":
    main()
POP3 is used solely for retrieval (as anyone familiar with how email works will know), so there is no point in using POP3 to send messages. That is why I asked how to forward an email message captured with poplib to a different email address.
The complete answer was:
smtplib can be used to forward an email message captured with poplib: all you need to do is capture the message body and send it with smtplib to the desired address. Furthermore, as Aleksandr Dezhin noted, and I agree with him, some SMTP servers impose restrictions on the messages they process.
Besides that, you can use sendmail to achieve the same thing if you are on a Unix machine.
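A minimal sketch of the smtplib route follows. It is written for Python 3 and its standard email package, whereas the script above is Python 2, so treat it as an outline rather than a drop-in patch. The function names, the addresses, and the localhost:25 server are all placeholders invented for this example. Attaching the original message instead of copying its body is one possible design; it keeps the original headers intact for later inspection.

```python
import email
import smtplib
from email.message import EmailMessage


def build_forward(original, from_addr, to_addr):
    """Wrap `original` (a parsed email message) as an attachment of a new message."""
    fwd = EmailMessage()
    fwd["From"] = from_addr
    fwd["To"] = to_addr
    fwd["Subject"] = "Fwd: {0}".format(original.get("Subject", "(no subject)"))
    fwd.set_content("This order email could not be processed automatically.")
    # Attaching a message object produces a message/rfc822 part,
    # so the original headers and body travel along unchanged.
    fwd.add_attachment(original)
    return fwd


def send_forward(fwd, host="localhost", port=25):
    """Send through an SMTP server; host and port are placeholders."""
    with smtplib.SMTP(host, port) as smtp:
        smtp.send_message(fwd)


# A made-up message standing in for one retrieved with poplib.
original = email.message_from_string(
    "From: shop@example.com\nSubject: Order 1\n\nunparseable body\n"
)
forward = build_forward(original, "robot@example.com", "me@example.com")
print(forward["Subject"])
```

In the script above, the natural place to call build_forward and send_forward would be the branch of get_orders that reports an unrecognized format. Whether the SMTP server accepts the message depends on its own policies, as noted in the answer.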
Hey friends, I am generating XML data using Python libraries as follows:
def multiwan_info_save(request):
    data = {}
    init = "init"
    form = Addmultiwanform(request.POST)
    if form.is_valid():
        from_sv = form.save(commit=False)
        obj_get = False
        try:
            obj_get = MultiWAN.objects.get(isp_name=from_sv.isp_name)
        except MultiWAN.DoesNotExist:
            obj_get = False
        nameservr = request.POST.getlist('nameserver_mw')
        for nm in nameservr:
            nameserver1, is_new = NameServer.objects.get_or_create(name=nm)
            from_sv.nameserver = nameserver1
        # main(init)
        top = Element('ispinfo')
        # comment = Comment('Generated for PyMOTW')
        all_connection = MultiWAN.objects.all()
        for conn in all_connection:
            child = SubElement(top, 'connection number ='+str(conn.id)+'name='+conn.isp_name+'desc='+conn.description )
            subchild_ip = SubElement(child,'ip_address')
            subchild_subnt = SubElement(child,'subnet')
            subchild_gtwy = SubElement(child,'gateway')
            subchild_nm1 = SubElement(child,'probe_server1')
            subchild_nm2 = SubElement(child,'probe_server2')
            subchild_interface = SubElement(child,'interface')
            subchild_weight = SubElement(child,'weight')
            subchild_ip.text = str(conn.ip_address)
            subchild_subnt.text = str(conn.subnet)
            subchild_gtwy.text = str(conn.gateway)
            subchild_nm1.text = str(conn.nameserver.name)
            # subchild_nm2.text = conn.
            subchild_weight.text = str(conn.weight)
            subchild_interface.text = str(conn.interface)
        print "trying to print _____________________________"
        print tostring(top)
        print "let seeeeeeeeeeeeeeeeee +++++++++++++++++++++++++"
But I am getting output like the following:
<ispinfo><connection number =5name=Airtelllldesc=Largets TRelecome ><ip_address></ip_address><subnet></subnet><gateway></gateway><probe_server1></probe_server1><probe_server2 /><interface>eth0</interface><weight>160</weight></connection number =5name=Airtelllldesc=Largets TRelecome ><connection number =6name=Uninordesc=Uninor><ip_address></ip_address><subnet></subnet><gateway></gateway><probe_server1></probe_server1><probe_server2 /><interface>eth0</interface><weight>160</weight></connection number =6name=Uninordesc=Uninor><connection number =7name=Airteldesc=Largets TRelecome ><ip_address></ip_address><subnet></subnet><gateway></gateway><probe_server1></probe_server1><probe_server2 /><interface>eth0</interface><weight>160</weight></connection number =7name=Airteldesc=Largets TRelecome ></ispinfo>
I just want to know how I can write this XML in proper XML format.
Thanks in advance.
UPDATED to include simulation of both creating and printing of the XML tree
The Basic Issue
Your code is generating invalid connection tags like this:
<connection number =5name=Airtelllldesc=Largets TRelecome ></connection number =5name=Airteldesc=Largets TRelecome >
when they should look like this (I am omitting the sub-elements in between. Your code is generating these correctly):
<connection number="5" name="Airtellll" desc="Largets TRelecome" ></connection>
If you had valid XML, this code would print it neatly:
from lxml import etree
xml = '''<ispinfo><connection number="5" name="Airtellll" desc="Largets TRelecome" ><ip_address></ip_address><subnet></subnet><gateway></gateway><probe_server1></probe_server1><probe_server2 /><interface>eth0</interface><weight>160</weight></connection></ispinfo>'''
xml = etree.XML(xml)
print etree.tostring(xml, pretty_print = True)
Generating Valid XML
A small simulation follows:
from lxml import etree
# Some dummy text
conn_id = 5
conn_name = "Airtelll"
conn_desc = "Largets TRelecome"
ip = ""
# Building the XML tree
# Note how attributes and text are added, using the Element methods
# and not by concatenating strings as in your question
root = etree.Element("ispinfo")
child = etree.SubElement(root, 'connection',
                         number = str(conn_id),
                         name = conn_name,
                         desc = conn_desc)
subchild_ip = etree.SubElement(child, 'ip_address')
subchild_ip.text = ip
# and pretty-printing it
print etree.tostring(root, pretty_print=True)
This will produce:
<connection desc="Largets TRelecome" number="5" name="Airtelll">
A single line is proper, in the sense that an XML parser will understand it.
For printing to sys.stdout, use the module-level dump function, passing it the Element.
For writing to a stream, use the write method of ElementTree.
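For comparison, here is a minimal sketch of the same tree built with the standard library's xml.etree.ElementTree instead of lxml. The values are dummy data in the spirit of the example above, and the IP address is made up. ET.indent is only available from Python 3.9, so the sketch guards the call and falls back to single-line output on older versions.

```python
import io
import xml.etree.ElementTree as ET

root = ET.Element("ispinfo")
# Attributes are passed as keyword arguments, never glued onto the tag name.
child = ET.SubElement(root, "connection",
                      number=str(5),
                      name="Airtelll",
                      desc="Largets TRelecome")
subchild_ip = ET.SubElement(child, "ip_address")
subchild_ip.text = "10.0.0.1"  # made-up address for illustration

# ET.indent adds line breaks and indentation in place (Python 3.9+).
if hasattr(ET, "indent"):
    ET.indent(root)

pretty = ET.tostring(root, encoding="unicode")
print(pretty)

# Writing to a stream goes through an ElementTree wrapper.
buffer = io.StringIO()
ET.ElementTree(root).write(buffer, encoding="unicode")
```

Either way, the attribute values are quoted and escaped by the library, which is what makes the result well-formed.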
Kindly have a look at the code below; I am using it to generate XML with Python.
from lxml import etree
# Some dummy text
conn_id = 5
conn_name = "Airtelll"
conn_desc = "Largets TRelecome"
ip = ""
# Building the XML tree
# Note how attributes and text are added, using the Element methods
# and not by concatenating strings as in your question
root = etree.Element("ispinfo")
child = etree.SubElement(root, 'connection',
                         number = str(conn_id),
                         name = conn_name,
                         desc = conn_desc)
subchild_ip = etree.SubElement(child, 'ip_address')
subchild_ip.text = ip
# and pretty-printing it
print etree.tostring(root, pretty_print=True)
This will produce:
<connection desc="Largets TRelecome" number="5" name="Airtelll">
But I want it to be like:
<connection desc="Largets TRelecome" number='1' name="Airtelll">
That is, the number attribute should be in single quotes. Any idea how I can achieve this?
There is no flag in lxml to do this, so you have to resort to manual manipulation.
import re
re.sub(r'number="([0-9]+)"',r"number='\1'", etree.tostring(root, pretty_print=True))
However, why do you want to do this? Single and double quotes around an attribute value are equivalent in XML, so the difference is purely cosmetic.
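The equivalence can be checked with a short sketch that uses only the standard library. The XML string is a hand-written stand-in for the serialized element above, and the substitution is the same regular expression applied to a str rather than to the output of tostring.

```python
import re
import xml.etree.ElementTree as ET

double_quoted = '<connection desc="Largets TRelecome" number="5" name="Airtelll" />'

# Swap the quotes around the number attribute only.
single_quoted = re.sub(r'number="([0-9]+)"', r"number='\1'", double_quoted)
print(single_quoted)

# Both spellings parse to the same attributes.
same = ET.fromstring(double_quoted).attrib == ET.fromstring(single_quoted).attrib
print(same)
```

Note that on Python 3, lxml's tostring returns bytes by default, so the result would need a .decode() (or encoding="unicode") before a str pattern like this one can be applied.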