How to filter a huge email list by domain with Python - python

I need help with Python.
How can I filter a huge email list by domain with Python?
My email list contains addresses from different providers: AOL, Gmail, Hotmail, ...
I want to select one domain, e.g. Gmail, and create a new file containing only the Gmail addresses.
This is the regex I have. How can I edit it to match only Gmail accounts?
regex = re.compile(("([a-z0-9!#$%&*+\/=?^_{|}~-]+(?:\.[a-z0-9!#$%&'*+\/=?^_" "{|}~-]+)*(@|\sat\s)(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?(\.|" "\sdot\s))+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?)"))

Can you provide an example of input data?
Anyway, you don't need a regex here; just split each email address on its @ and take the domain.
If you have a string with one address per line, you can do the following.
hosts = {}
for address in addresses.splitlines():
    _, host = address.split('@')
    if host not in hosts:
        hosts[host] = [address]
    else:
        hosts[host].append(address)
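To get from that grouping idea to the new file the question asks for, here is a minimal sketch that streams the input line by line, so a huge list never has to sit in memory. The function names `filter_by_domain` and `write_filtered` and the file paths are hypothetical, and it assumes one address per line with the standard @ separator.

```python
def filter_by_domain(lines, domain):
    """Yield the addresses from `lines` whose domain equals `domain`."""
    wanted = domain.lower()
    for line in lines:
        address = line.strip()
        # rpartition splits on the last "@"; sep is empty when there is none.
        _, sep, host = address.rpartition("@")
        if sep and host.lower() == wanted:
            yield address


def write_filtered(in_path, out_path, domain):
    """Copy the addresses for one domain from in_path to out_path."""
    count = 0
    with open(in_path, encoding="utf-8") as src, \
            open(out_path, "w", encoding="utf-8") as dst:
        for address in filter_by_domain(src, domain):
            dst.write(address + "\n")
            count += 1
    return count  # number of addresses written
```

A call such as write_filtered("emails.txt", "gmail_only.txt", "gmail.com") would then produce the Gmail-only file; lines without an @ are skipped rather than raising.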

Related

Recipient Name & Required Attendees not matching Outlook Python

I am trying to fetch recipients' email addresses (some are saved contacts, some are people inside the organization, and the rest are outside it). I want to get the names and email addresses of all the Required Attendees from a shared calendar. The following is what I have tried so far, but I am not able to get the correct "Recipient" names. The Required Attendees data is correct, but the recipient names don't match.
for appointmentItem in restrictedItems:
    Organizer = appointmentItem.Organizer
    Required_Attendees = appointmentItem.RequiredAttendees
    recipient = appointmentItem.Recipients
    for r in recipient:
        recipients_list.append(r)
    for rec in recipients_list:
        rec_nam = rec.name
        email_add = rec.Address
        ## email_add = rec.AddressEntry.GetExchangeUser().PrimarySmtpAddress
        ## the above command does not give me all email addresses..
        ## ..i dont get out of organization email addresses
I am not sure what I am doing wrong! Any help would be appreciated!
Issue Faced -
Recipient names not same as Required Attendees
Recipient email address not fetched correctly
Need to get email address of all people - be it from same organization or outside organization
Thanks in advance!!!

python IMAP content of email contains a string

I am able to log in to a Gmail account with Python's imaplib:
imap = imaplib.IMAP4_SSL('imap.gmail.com')
imap.login(myDict["emailUsername"], myDict["emailPassword"])
imap.select(mailbox='inbox', readonly=False)
resp, items = imap.search(None, 'All')
email_ids = items[0].split()
latest_email_id = email_ids[-1]
resp, data = imap.fetch(latest_email_id, "(UID)")
print ("resp= ", resp, " data=", data)
#msg_uid = parse_uid(data[0])
match = pattern_uid.match(data[0].decode("utf-8"))
#print ("match= ", match)
msg_uid = match.group('uid')
I need to make sure that the UID for the last email I have contains a certain string (XYZ). I am NOT looking for header subject but the content of email. How can I do that ?
There are a couple of ways you could go:
Fetch the message and walk through the text body parts looking for your string; there is an example at "Finding links in an emails body with Python".
Get the server to do the search by supplying 'latest_email_id' and your search criteria back to the server in a UID SEARCH command. For Gmail, you can even use the X-GM-RAW attribute to use the same syntax supported by the Gmail web interface. See https://developers.google.com/gmail/imap/imap-extensions for details.
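The first option (fetch the message and walk its text parts) could look roughly like the sketch below. The helper name `body_contains` is hypothetical; it uses only the standard library `email` package and assumes you fetched the full message with "(RFC822)" rather than "(UID)", so that the raw bytes are in data[0][1].

```python
import email
from email import policy


def body_contains(raw_message, needle):
    """Return True if any text part of the raw message contains needle."""
    msg = email.message_from_bytes(raw_message, policy=policy.default)
    for part in msg.walk():
        # Skip multipart containers and non-text attachments; headers such
        # as Subject are never inspected here.
        if part.get_content_maintype() != "text":
            continue
        if needle in part.get_content():
            return True
    return False


# Assumed usage with the connection from the question:
#   resp, data = imap.fetch(latest_email_id, "(RFC822)")
#   found = body_contains(data[0][1], "XYZ")
```

Note that this checks both text/plain and text/html parts, so a match inside HTML markup also counts.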

Python IMAP - Read Gmail with '+' in email address

I've previously used imaplib in Python 3 to extract emails from Gmail. However, I now want to write a script that differentiates emails sent to the same address with different strings after a plus sign. For example, the base email address can be:
example@gmail.com
Then I would want to separately read all emails with the addresses:
example+test1@gmail.com,
example+test2@gmail.com,
example@gmail.com.
Therefore I would wind up with a dictionary of lists containing the specific emails. This only works for example@gmail.com. For example:
{'example':[],
'example_test':[],
'example_test2':[]}
Currently I can retrieve the emails that I need with this function from a class:
def get_emails(self):
    """Retrieve emails"""
    self.M = imaplib.IMAP4_SSL(self.server)
    self.M.login(self.emailaddress,self.password)
    self.M.select(readonly=1)
    self.M.select('INBOX', readonly=True)
    #Yesterdays date
    date = (datetime.date.today() - datetime.timedelta(self.daysback)).strftime("%d-%b-%Y")
    print("Selecting email messages since %s" % date)
    #Retrieve all emails from yesterday on
    result,data = self.M.uid('search', None, '(SENTSINCE {date})'.format(date=date))
    return result,data
You should use the exact mail address you want directly in the IMAP search request. For example, it could be something like:
result,data = self.M.uid('search', None, '(SENTSINCE {date})'.format(date=date),
                         ('TO example+test1@gmail.com'))
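If you would rather fetch everything once and sort the messages locally, a sketch along these lines builds the dictionary of lists from the question. The names `plus_key` and `group_by_plus_tag` are hypothetical, and it assumes the messages have already been parsed into `email.message.Message` objects and that the plus tag appears in the To header.

```python
from collections import defaultdict
from email.utils import getaddresses


def plus_key(address):
    """Map 'example+test1@gmail.com' to 'example_test1'; no tag gives 'example'."""
    local = address.rpartition("@")[0] or address
    base, plus, tag = local.partition("+")
    return f"{base}_{tag}" if plus and tag else base


def group_by_plus_tag(messages):
    """Group parsed messages by the plus tag of each address in their To header."""
    groups = defaultdict(list)
    for msg in messages:
        for _, addr in getaddresses(msg.get_all("To", [])):
            if addr:
                groups[plus_key(addr.lower())].append(msg)
    return dict(groups)
```

The server-side TO search shown above transfers less data; grouping locally is mainly useful when the set of tags is not known in advance.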

How do I extract everything to the left of an @ in python using string indexing?

I'm trying to take a user's email address as input and print out the website in their address.
email=input('What is your email address?')
website=email[40:]
print(website)
user, at, domain = email.partition("@")
Now user is the user name, at is the @ symbol, and domain is the domain name.
If there is no @ symbol, at and domain will be empty strings. You could test for this and change domain to a default value:
at, domain = at or "@", domain or "gmail.com"
Or just issue an error message.
Split once on the @ sign and take the last element:
website = email.split('@', 1)[-1]
This does not raise even if there is no @ sign in the input string; in that case it returns the whole input.
I think this would work best:
out=email.split('@')
try:
    print(out[1])
except IndexError:
    print('Invalid email address!')
This will explode if the address is ill-formed: server = email.split('@')[1]. Catch the exception with try/except, and report the error back to the user.
And it's not a "website" it's a "server". A website is just one of many services you can put on a server.
>>> email=input('What is your email address?')
What is your email address?me@somewhere.com
>>> i = email.index("@")
>>> i
2
>>> email[i+1:]
'somewhere.com'
>>> email.split("@")[1]
'somewhere.com'
>>> email.partition("@")[-1]
'somewhere.com'
'somewhere.com'
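Pulling the answers above together, a small helper can return the domain and reject input with no @ instead of silently returning something else. This is only a sketch; the name `domain_of` is hypothetical and the validation is deliberately minimal (it checks only that there is text on both sides of the last @).

```python
def domain_of(address):
    """Return the part after the last '@', or raise ValueError if it is missing."""
    user, at, domain = address.strip().rpartition("@")
    if not (user and at and domain):
        raise ValueError(f"not a valid email address: {address!r}")
    return domain


try:
    print(domain_of("me@somewhere.com"))  # somewhere.com
    print(domain_of("no-at-sign"))        # raises ValueError
except ValueError as err:
    print(err)
```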

Get sender email address with Python IMAP

I have this python IMAP script, but my problem is that, every time I want to get the sender's email address, (From), I always get the sender's first name followed by their email address:
Example:
Souleiman Benhida <souleb@gmail.com>
How can I extract just the email address (souleb@gmail.com)?
I did this before, in PHP:
$headerinfo = imap_headerinfo($connection, $count)
or die("Couldn't get header for message " . $count . " : " . imap_last_error());
$from = $headerinfo->fromaddress;
But, in python I can only get the full name w/address, how can I get the address alone? I currently use this:
typ, data = M.fetch(num, '(RFC822)')
mail = email.message_from_string(data[0][1])
headers = HeaderParser().parsestr(data[0][1])
message = parse_message(mail) #body
org = headers['From']
Thanks!
Just one more step, using email.utils:
email.utils.parseaddr(address)
Parse address – which should be the value of some address-containing field such as To or Cc – into its constituent realname and email address parts. Returns a tuple of that information, unless the parse fails, in which case a 2-tuple of ('', '') is returned.
Note: originally referenced rfc822, which is now deprecated.
to = email.utils.parseaddr(msg['cc'])
This works for me.
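As a quick illustration of parseaddr on the From value from the question (standard library only; the header string is typed in here rather than read from a real message):

```python
from email.utils import parseaddr

# parseaddr returns a (realname, address) tuple.
real_name, address = parseaddr("Souleiman Benhida <souleb@gmail.com>")
print(real_name)  # Souleiman Benhida
print(address)    # souleb@gmail.com
```

With the code from the question, that would be parseaddr(headers['From'])[1].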
My external library, https://github.com/ikvk/imap_tools, lets you work with mail instead of reading the IMAP specifications.
from imap_tools import MailBox, A
# get all emails from INBOX folder
with MailBox('imap.mail.com').login('test@mail.com', 'pwd', 'INBOX') as mailbox:
    for msg in mailbox.fetch(A(all=True)):
        print(msg.date, msg.from_, msg.to, len(msg.text or msg.html))
msg.from_, msg.to - parsed addresses, like: 'Sender@ya.ru'
I didn't like the existing solutions so I decided to make a sister library for my email sender called Red Box.
Here is how to search and process emails including getting the from address:
from redbox import EmailBox
# Create email box instance
box = EmailBox(
    host="imap.example.com",
    port=993,
    username="me@example.com",
    password="<PASSWORD>"
)
# Select an email folder
inbox = box["INBOX"]
# Search and process messages
for msg in inbox.search(unseen=True):
    # Process the message
    print(msg.from_)
    print(msg.to)
    print(msg.subject)
    print(msg.text_body)
    print(msg.html_body)
    # Flag the email as read/seen
    msg.read()
I also wrote extensive documentation for it. It also has query language that fully supports nested logical operations.
