Python: Verify if e-mail does not exist safely

We have a database of over 200,000 e-mail addresses and associated contacts. I had an idea: if I could find out which e-mail addresses no longer exist, I could deactivate those contacts and keep the database more up to date. My main goal is not to validate whether an e-mail exists; my main goal is to find as many non-existent e-mail addresses as possible.
I have based a lot of my research on these answers: How to check if an email address exists without sending an email?
I have tried the Python validate_email library, but it was very unreliable. It's also unsafe, because you could get banned if you try to validate several contacts at the same company. It returned False for my active company e-mail, and None for my active Gmail address as well... so definitely unreliable.
I have tried DNS MX-record lookups with py3dns, and also the VRFY command. Unfortunately none of these seemed reliable, since any e-mail server could send a fake response.
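For context, the kind of check I mean is roughly the following. This is only a minimal sketch, using dnspython and smtplib rather than py3dns, and the HELO host and MAIL FROM address are placeholders:

import smtplib
import dns.resolver

def probe_address(address, helo_host="example.com", mail_from="probe@example.com"):
    # Look up the MX records for the address's domain and pick the
    # host with the lowest preference value.
    domain = address.rsplit("@", 1)[-1]
    answers = dns.resolver.resolve(domain, "MX")
    mx_host = str(sorted(answers, key=lambda r: r.preference)[0].exchange).rstrip(".")

    # Ask that server whether it would accept mail for the address,
    # without actually sending anything.
    server = smtplib.SMTP(mx_host, 25, timeout=10)
    try:
        server.helo(helo_host)
        server.mail(mail_from)
        code, message = server.rcpt(address)
    finally:
        server.quit()
    return code, message

if __name__ == "__main__":
    print(probe_address("someone@example.org"))

Even when this runs cleanly, a 250 only means the server accepted the RCPT TO; many servers accept everything (catch-all) and others greylist with a 4xx, which is exactly the reliability problem described below.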
Greylisting is also a problem:
There is also an antispam technique called greylisting, which will cause the server to reject the address initially, expecting that a real SMTP server would attempt a re-delivery some time later. This will mess up attempts to validate the address.
The idea also occurred to me that I could send a dummy e-mail, or two (because of greylisting) with a bit of delay between them. I am afraid that this could get me blacklisted after a while, especially if several of these contacts work at the same company. Another idea is to do this from randomly generated e-mail addresses and hosts, but that is probably not possible. Is there any way I could determine that an e-mail does not exist, preferably in a way that keeps the chance of getting banned minimal?

Related

Is there a method for automating emails with SMTP or otherwise?

I've been writing some python scripts in order to do some automation for my work. One of the scripts is intended to gather some test results, compile the string of results with a "message" string, and send it as an email every 12-24 hours (if there are results) to each individual who needs this information. Additionally, we're running this script on Linux; either in a Jenkins pipeline, or in a crontab (this script will most likely be run via crontab).
I was initially using Gmail's SMTP server (smtp.gmail.com, port 587) to send these, since we're working off of our own personal Gmail anyway, and it worked for a bit once I gave the script an "App Password" (since it was a "less secure app" to Google). However, after about an hour of testing with it, Google disabled the account for spam. Any subsequent accounts I try to set up for the same purpose are disabled on the spot as well (the moment I try to send an email with one, it's halted and disabled). It's been a few days since I requested reviews on both accounts, but I don't think they'll get back to me any time soon, nor do I expect the outcome to be in my favor.
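For reference, the sending part of the script is nothing exotic; it does roughly the following (a minimal sketch with placeholder host, port, and credentials, since those are exactly what I'm trying to figure out):

import smtplib
from email.message import EmailMessage

SMTP_HOST = "smtp.example.com"   # placeholder relay
SMTP_PORT = 587
SMTP_USER = "reports@example.com"
SMTP_PASS = "app-password-or-api-key"

def send_report(recipients, subject, body):
    msg = EmailMessage()
    msg["From"] = SMTP_USER
    msg["To"] = ", ".join(recipients)
    msg["Subject"] = subject
    msg.set_content(body)

    with smtplib.SMTP(SMTP_HOST, SMTP_PORT, timeout=30) as server:
        server.starttls()                 # upgrade to TLS before authenticating
        server.login(SMTP_USER, SMTP_PASS)
        server.send_message(msg)

if __name__ == "__main__":
    send_report(["dev@example.com"], "Nightly test results", "All green.")

So the question is really about what to point SMTP_HOST at once Gmail is off the table.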
Since Google was no longer viable, I looked online and saw that there are plenty of SMTP hosting options available, but we're not looking for a paid service just to send an email once every few days or so. In terms of free options, I was able to find one other post about PHP/Ruby sending emails without SMTP (Send email without external SMTP service), but if possible I'd like to keep this within Linux/Python only, unless there is a simpler way or one that links well with Linux/Python. Even then, I'm still concerned that using SMTP is necessary for our Gmail accounts to receive these emails. If I'm wrong, please correct me, because it certainly seems that way to me.
Given this situation, how could I adjust my strategy to automate email updates of this nature?

Workaround for SPAM in a catch-all email

We are building a website expecting to serve around 10k users or more.
We want a functionality whereby each user has an email address at our domain; however, this address will not be accessible by the user. It will only be used to receive a very small number of emails (say, it will be published so that gift cards can be sent to that user), and we need to notify the user whenever a gift card is received and do something accordingly at our end.
We want to use Google App Engine with Python. What first came to mind is that we don't actually create any real mailboxes; instead we create a single catch-all address that receives any mail sent to our domain, even to mailboxes that don't exist, and then we filter all received emails and map them to our users' accounts.
Within the current Google receive limits (1 email per second) this would work. The main issue is that spam floods the catch-all mailbox, quickly hitting the receive limit, and the email account gets suspended by Google.
We want a smarter solution: we only care about emails that come from a specific sender domain (say, giftcards.com), and we can drop all other emails.
Is there any reliable service, or setup on our side, that we can use to filter out unwanted emails and forward only the emails from giftcards.com (for example) to our main Google Apps email?
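To make the filtering idea concrete, here is a minimal sketch of the check we have in mind, independent of how the raw message actually reaches us (catch-all forward, App Engine inbound mail hook, etc.); giftcards.com is just the example domain from above, and the helper names are ours:

from email import message_from_bytes
from email.utils import parseaddr

ALLOWED_SENDER_DOMAINS = {"giftcards.com"}

def should_accept(raw_message: bytes) -> bool:
    # Keep the message only if the From domain is on the allow-list.
    msg = message_from_bytes(raw_message)
    _, sender = parseaddr(msg.get("From", ""))
    domain = sender.rsplit("@", 1)[-1].lower() if "@" in sender else ""
    return domain in ALLOWED_SENDER_DOMAINS

def recipient_local_part(raw_message: bytes) -> str:
    # Map the catch-all recipient back to one of our users,
    # e.g. "user1234@yourdomain.com" -> "user1234".
    msg = message_from_bytes(raw_message)
    _, recipient = parseaddr(msg.get("To", ""))
    return recipient.split("@", 1)[0]

Note that the From header can be forged, so for anything beyond convenience filtering we would also want to check SPF/DKIM results, or filter at the MTA level before the mail ever reaches the catch-all account.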

Fetch mails from another mailbox in Google App Engine

I am trying to fetch mails from another mailbox (xxx@domain.com or xxx@gmail.com) in Google App Engine.
I don't want to read mails from the appspotmail inbox, as it is being used for a different purpose.
Is there any efficient way in which I can make this happen?
Two options:
You could read an inbox via POP/IMAP, but this requires a bit of coding. You also need to have the Outgoing Sockets API enabled, which requires a paid app. This approach is not push-based, which means you will constantly need to poll for new messages.
Forward emails to a new appspotmail address (you can have many). This is pretty easy, especially since you already process incoming emails. Since you can have multiple accounts, e.g. xyz@yourappid.appspotmail.com, you can distinguish between them in code (see the sketch below).
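A rough sketch of option 2 on the legacy Python 2 runtime, assuming the webapp2 inbound mail handler; the specific local parts (reports@, alerts@) are placeholders for whatever forwarding addresses you set up:

import logging

import webapp2
from google.appengine.ext.webapp import mail_handlers

class InboxHandler(mail_handlers.InboundMailHandler):
    def receive(self, mail_message):
        # All *@yourappid.appspotmail.com addresses hit this one handler;
        # branch on the recipient to tell them apart.
        to_field = str(getattr(mail_message, "to", "")).lower()
        if "reports@" in to_field:
            self.handle_report(mail_message)
        elif "alerts@" in to_field:
            self.handle_alert(mail_message)

    def handle_report(self, mail_message):
        for _, body in mail_message.bodies("text/plain"):
            logging.info("report body: %s", body.decode())

    def handle_alert(self, mail_message):
        logging.info("alert from %s", mail_message.sender)

app = webapp2.WSGIApplication([InboxHandler.mapping()])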
You can use IMAP + OAuth to read email from a Google address. If you google it, the very first result is what you need: https://developers.google.com/gmail/oauth_overview
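A minimal sketch of that approach with plain imaplib and Gmail's XOAUTH2 mechanism; obtaining the OAuth access token (e.g. with google-auth) is out of scope here, so USER and ACCESS_TOKEN are placeholders:

import email
import imaplib

USER = "xxx@gmail.com"
ACCESS_TOKEN = "ya29....placeholder"

def fetch_unread():
    # Gmail's XOAUTH2 SASL string: user=<addr>\1auth=Bearer <token>\1\1
    auth_string = "user=%s\1auth=Bearer %s\1\1" % (USER, ACCESS_TOKEN)
    imap = imaplib.IMAP4_SSL("imap.gmail.com")
    imap.authenticate("XOAUTH2", lambda _: auth_string.encode())
    imap.select("INBOX")
    _, data = imap.search(None, "UNSEEN")
    for num in data[0].split():
        _, msg_data = imap.fetch(num, "(RFC822)")
        msg = email.message_from_bytes(msg_data[0][1])
        print(msg["From"], msg["Subject"])
    imap.logout()

if __name__ == "__main__":
    fetch_unread()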

Process dynamic email addresses using python

I need to do the following and I was wondering if anyone has done something similar, and if so what they did.
I need to write a program that will handle incoming emails for different clients, process them, and then depending on the email address, do something (add to database, reply, etc).
The thing that makes this a little more challenging is that the email addresses aren't static; they are dynamic. For example, the emails would look like this: dynamic-email1@dynamic-subdomain1.domain.com. The emails are grouped by client using a dynamic subdomain; in this example it would be 'dynamic-subdomain1'. A client would have their own subdomain assigned to them. Each client can create their own email addresses under their subdomain and assign an event to each address. These email addresses and subdomains can change all the time: new ones added, old ones removed, etc.
So, for example, if an email comes in for 'dynamic-email1@dynamic-subdomain1.domain.com', I would need to look up in the database which client is assigned the 'dynamic-subdomain1' subdomain, then see which event maps to the email address 'dynamic-email1', and then execute that event. I have the event processing already; I'm just not sure how to map the email addresses to the events.
Since the email addresses are dynamic, it would be a real pain to handle this with file-based configuration files; it would be nicer to look them up in a database instead. I did some research and found some projects that do something similar, but not exactly. The closest I found is Zed Shaw's Lamson project: http://lamsonproject.org
More background:
I'm using python, django, linux, mysql, memcached currently.
Questions:
Has anyone used Lamson to do what I'm looking to do? If so, how did you like it?
Are there any other projects that do something similar, maybe in a different language besides Python?
How would I setup my DNS MX record to handle something like this?
Thanks for your help.
Update:
I did some more research on the Google App Engine suggestion and it might work, but I would need to change too many things and it would add too many moving parts. I would also need a catch-all email forwarder; does anyone know of any good, cheap ones? I would prefer to deploy on a system that handles all the email itself. It looks like people have used Postfix listening on port 25 and forwarding requests to Lamson. This seems reasonable; I'm going to try it out and see how it goes. I'll update with my results.
Update 2:
I did some more research and I found a couple of websites that do something like this for me, so I'm going to look at them next.
http://mailgun.net
http://www.emailyak.com
I've done some work on a couple projects using dynamic email addresses, but never with dynamic subdomains at the same time. My thoughts on your questions:
I've never used Lamson, so I can't comment on that.
I usually use App Engine's API to receive and handle incoming messages, and it works quite well. You could easily turn each received message into a basic POST request on your own server with e.g. To, From, Subject, and Message fields and handle those with standard django.
One downside with GAE email is having to use *@yourappname.appspotmail.com, but you could get around that by setting up a catch-all email forwarder for *@yourdomain.com to direct everything to secretaddress@yourappname.appspotmail.com. That would let you receive the messages on the custom domain and handle them with GAE.
The other issue/benefit with GAE is using Google's servers instead of your own (at least for the email bit).
For the subdomain issue, you could try setting up wildcard DNS for the MX records, which (in theory) would direct all mail sent to any subdomain to the same server(s). This would enable you to receive email on all subdomains (for better or worse--look out for spam!)
For lamson, have you tried something as simple as:
from lamson.routing import route

@route("(address)@(subdomain).(host)", address=".+", subdomain=r"[^\.]+")
def START(message, address=None, subdomain=None, host=None):
    pass  # handle the message for (subdomain, address) here
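Expanding that a bit, the handler body could do the database lookup directly. A sketch assuming hypothetical Django models Client (with a subdomain field) and EmailEvent (with a local_part field and an execute() method); adjust the names to your actual schema:

from lamson.routing import route, stateless

from myapp.models import Client, EmailEvent   # hypothetical models

@route("(address)@(subdomain).(host)", address=".+", subdomain=r"[^\.]+")
@stateless
def START(message, address=None, subdomain=None, host=None):
    try:
        # Map the subdomain to a client, then the local part to an event.
        client = Client.objects.get(subdomain=subdomain)
        event = EmailEvent.objects.get(client=client, local_part=address)
    except (Client.DoesNotExist, EmailEvent.DoesNotExist):
        return  # unknown subdomain/address: drop (or bounce) the message
    event.execute(message)   # hook into your existing event-processing code

Since both lookups are plain database queries, adding or removing addresses is just a row insert or delete, with no configuration files to touch.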

Google App Engine (Python)- Strange behaviour of REMOTE_ADDR

In order to make the registration process on my website easy, I allow users to enter their email address, to which I will send a verification code; alternatively they can solve a captcha.
The problem is that, in order to prevent robots from registering accounts (with fake emails), I limit the number of registrations allowed per IP address, and if this limit is exceeded I trigger a warning in the logs.
However, what seems to be happening is that I am using os.environ['REMOTE_ADDR'] to check the remote address, but I am triggering warnings on addresses that are owned by Google (66.249.65.XXX). It is possible that this happens only after I change the app version (but this is not confirmed). Does anyone know how/why this might be happening? Shouldn't REMOTE_ADDR return the address of the client computer in all cases?
I am curious whether there is some behind-the-scenes redirection going on, and whether this is a normal event or only happens when a new version is deployed (perhaps when a new version is installed, the original server proxies the user to the new server, creating the illusion that the IP address is an internal one).
I believe I have figured out the reason for seeing so many warnings from Google server IP addresses. It seems that immediately after a new user registers, the Google crawlers visit the same (registration) webpage (to which I send information as a GET instead of a POST, for reasons I won't get into). Of course, since many users are registering but only a few crawler machines periodically check my website for updates, I am triggering warnings that a particular (Google) IP is accessing the registration area repeatedly.
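One way to keep those crawler hits out of the per-IP count is to only count real form submissions. This is just a sketch, under the assumption that the registration action can be switched to (or at least distinguished as) a POST and that filtering on the user agent is acceptable; the cleanest fix is still to stop exposing the registration as a crawlable GET URL:

import os

def should_count_registration():
    # Only count actual form submissions, not crawler re-fetches of the GET URL.
    if os.environ.get("REQUEST_METHOD", "GET") != "POST":
        return False
    user_agent = os.environ.get("HTTP_USER_AGENT", "").lower()
    if "googlebot" in user_agent:
        return False
    return True

def client_ip():
    # Same value used for the per-IP limit in the question.
    return os.environ.get("REMOTE_ADDR", "")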
