Workaround for Python & Selenium: authenticate against Active Directory

Workaround for Python & Selenium: authenticate against Active Directory - python

I am using Python (2.7) and Selenium (3.4.3) to drive Firefox (52.2.0 ESR) via geckodriver (0.19.0) to automate a process on a CentOS 7 machine.
I need totally unattended operation of this automation with user credentials passed through; no storage allowed and no breaking in.
One piece of drama is being caused by the fact that the internal website required for the process is within an Active Directory domain while the machine running my automation is not. I have no need to validate the user, only pass the credentials to the website in such a way as to not require human interaction or for the person to be a local user on the machine.
I have tried various permutations of:
[protocol]://[user,pass]#[url]
driver.switch_to_alert() + send_keys
It seems some of those only work on IE, something I have no access to.
I have checked for libraries to handle this and all to no avail.
I can add libraries to python and I have sudo access to the machine - can't touch authentication, so AD integration is not possible.
How can I give this AD website the credentials of an arbitrary user such that no local storage of their credentials happens an no user interaction is required?
Thank you
EDIT
I think something like a proxy which could authenticate the user then retain that authentication for selenium to do its thing ...
Is there a simple LDAP/AD proxy available?
EDIT 2
Perhaps a very simple way of stating this is that I want to pass user credentials and prevent the authentication popup from happening.

Solution Found:
I needed to use a browser extension.
My solution has been built for chromium but it should port almost-unchanged for Firefox and maybe edge.
First up, you need 2 APIs to be available for your browser:
webRequest.onAuthRequired - Chrome & Firefox
runtime.nativeMessaging - Chrome & Firefox
While both browser APIs are very similar, they do have some significant differences - such as Chrome's implementation lacking Promises.
If you setup your Native Messaging Host to send a properly-formed JSON string, you need only poll it once. This means you can use a single call to runtime.sendNativeMessage() and be assured that your credentials are paresable. Pun intended.
Next, we need to look at how we're supposed to handle the webRequest.onAuthRequired event.
Since I'm working in Chromium, I need to use the promise-less Chrome API.
chrome.webRequest.onAuthRequired.addListener(
callbackFunctionHere,
{urls:[targetUrls]},
['asyncBlocking'] // --> this line is important, too. Very.
The Change:
I'll be calling my function provideCredentials because I'm a big stealy-stealer and used an example from this source. Look for the asynchronous version.
The example code fetches the credentials from storage.local ...
chrome.storage.local.get(null, gotCredentials);
We don't want that. Nope.
We want to get the credentials from a single call to sendNativeMessage so we'll change that one line.
chrome.runtime.sendNativeMessage(hostName, { text: "Ready" }, gotCredentials);
That's all it takes. Seriously. As long as your Host plays nice, this is the big secret. I won't even tell you how long it took me to find it!
Links:
My questions with helpful links:
Here - Workaround for Authenticating against Active Directory
Here - Also has some working code for a functional NM Host
Here - Some enlightening material on promises
So this turns out to be a non-trivial problem.
I haven't implemented the solution, yet, but I know how to get there...
Passing values to an extension is the first step - this can be done in both Chrome and Firefox. Watch the version to make sure the API required, nativeMessaging, actually exists in your version. I have had to switch to chromium for this reason.
Alternatively, one can use the storage API to put values in browser storage first. [edit: I did not go this way for security concerns]
Next is to use the onAuthRequired event from the webRequest API . Setup a listener on the event and pass in the values you need.
Caveats: I have built everything right up to the extension itself for the nativeMessaging API solution and there's still a problem with getting the script to recognise the data. This is almost certainly my JavaScript skills clashing with the arcane knowledge required to make these APIs make much sense ...
I have yet to attempt the storage method as it's less secure (in my mind) but it does seem to be simpler.

Related

How to integrate BIRT with Python Django Project by using Py4j

Hi is there anyone who is help me to Integrate BIRT report with Django Projects? or any suggestion for connect third party reporting tools with Django like Crystal or Crystal Clear Report.

Some of the 3rd-party Crystal Reports viewers listed here provide a full command line API, so your python code can preview/export/print reports via subprocess.call()
The resulting process can span anything between an interactive Crystal Report viewer session (user can login, set/change parameters, print, export) and an automated (no user interaction) report printing/exporting.
While this would simplify your code, it would restrict deployment to Windows.

For prototyping, or if you don't mind performance, you can call from BIRT from the command line.
For example, download the POJO runtime and use the script genReport.bat (IIRC) to generate a report to a file (eg. PDF format). You can specify the output options and the report parameters on the command line.
However, the BIRT startup is heavy overhead (several seconds).
For achieving reasonable performance, it is much better to perform this only once.
To achieve this goal, there are at least two possible ways:
You can use the BIRT viewer servlet (which is included as a WAR file with the POJO runtime). So you start the servlet with a web server, then you use HTTP requests to generate reports.
This looks technically old-fashioned (eg. no JSON Requests), but it should work. However, I never used this approach.
The other option is to write your own BIRT server.
In our product, we followed this approach.
You can take the viewer servlet as a template for seeing how this could work.
The basic idea is:
You start one (or possibly more than one) Java process.
The Java process initializes the BIRT runtime (this is what takes some seconds).
After that, the Java process listens for requests somehow (we used a plain socket listener, but of course you could use HTTP or some REST server framework as well).
A request would contain the following information:
which module to run
which output format
report parameters (specific to the module)
possibly other data/metadata, e.g. for authentication
This would create a RunAndRenderTask or separate RunTask and RenderTasks.
Depending on your reports, you might consider returning the resulting output (e.g. PDF) directly as a response, or using an asynchronous approach.
Note that BIRT will happily create several reports at the same time - multi-threading is no problem (except for the initialization), given enough RAM.
Be warned, however, that you will need at least a few days to build a POC for this "create your own server" approach, and probably some weeks for prodction quality.
So if you just want to build something fast to see if the right tool for you, you should start with the command line approach, then the servlet approach and only then, and only if you find that the servlet approach is not quite good enough, you should go the "create your own server" way.
It's a pity that currently there doesn't seem to exist an open-source, production-quality, modern BIRT REST service.
That would make a really good contribution to the BIRT open-source project... (https://github.com/eclipse/birt)

API endpoint for data from Microsoft Exchange Online Protection?

I am working on a project where I have been using Python to make API calls to our organization's various technologies to get data, which I then push to Power BI to track metrics over time relating to IT Security.
My boss wants to see info added from Exchange Online Protection such as malware detected in emails, spam blocks etc., essentially replicating some of the email and collaboration reports you'd see in M365 defender > reports > email and collaboration (security.microsoft.com/emailandcollabreport).
I have tried the Defender API and MS Graph API, read through a ton of documentation, and can't seem to find anywhere to pull this info from. Has anyone done something similar, or know where this data can be pulled from?
Thanks in advance.

You can try using the Microsoft Graph Security API using which you can get the alerts, information protection, secure score using that. Also you can refer the alerts section in the documentation which talks about the list of supported providers at this point using the Microsoft Graph security api.

In case anyone else runs into this, this is the solution I ended up using (hacky as it may be);
The only way to extract the pertinent info seems to be through PowerShell, you need the modules ExchangeOnlineManagement and PSWSMan so those will need to be installed.
You need to add an app to your Azure instance with global reader role minimum (or something custom) and generate and upload self-signed certificates to the app.
I then ran the following lines as a ps1 script:
Connect-ExchangeOnline -CertificateFilePath "<PATH>" -AppID "<APPID>" -Organization "<ORG>.onmicrosoft.com" -CertificatePassword (ConvertTo-SecureString -String '<PASSWORD>' -AsPlainText -Force)
$dte = (Get-Date).AddDays(-30)
Get-MailflowStatusReport -StartDate $dte -EndDate (Get-Date)
Disconnect-ExchangeOnline
I used python to call the powershell script, then extract the info I needed from the output and push it to PowerBI.
I'm sure there is a more secure and efficient way to do this but I was able to accomplish the task this way.

Limit access to a specific file from only a specific Python script in Linux

Problem:
Customer would like to make sure that the script I've developed in Python, running under CentOS7, can sufficiently obscure the credentials required to access a protected web service (that only supports Basic Auth) such that someone with access to the CentOS login cannot determine those credentials.
Discussion:
I have a Python script that needs to run as a specific CentOS user, say "joe".
That script needs to access a protected Web Service.
In order to attempt to make the credentials external to the code I have put them into a configuration file.
I have hidden the file (name starts with a period "."), and base64 encoded the credentials for obscurity, but the requirement is to only allow the Python script to be able to "see" and open the file, vs anyone with access to the CentOS account.
Even though the file is hidden, "joe" can still do an ls -a and see the file, and then cat the contents.
As a possible solution, is there a way to set the file permissions in CentOS such that they will only allow that Python script to see and open the file, but still have the script run under the context of a CentOS account?

Naive solution
For this use-case I would probably create with a script (sh or zsh or whatever, but I guess u use the default one here) a temporal user iamtemporal-and-wontstayafterifinish. Then creating the config file for being able to read ONLY by specifically this user (and none permission for all the others). Read here for the how: https://www.thegeekdiary.com/understanding-basic-file-permissions-and-ownership-in-linux/
Getting harder
If the problem still raises in case someone would have root-rights (for any such reason), then just simply forget everything above, and start planning for a vacation, cuz' this will be a lot longer then anyone would think.
Is not anymore a simple python problem, but needs a different business logic. The best u could do is to implement (at least this credentials handling part) in a low-level language so could handle memory in a customized way and ask for them runtime only, don't store them...
Or maybe if u could limit the scope of this user accesses towards the protected Web Service as u say.
Bonus
Even tho it wasn't explicitly asked, I would discourage you from storing credentials with using a simple base64...
For this purpose a simple solution could be the following one at least (without the knowledge of the whole armada of cryptography):
encrypt the passw with a asymmetric cryptographic algorithm (probably RSA with a huge key
inject the key for decryption as a env var while you have an open ssh session to the remote terminal
ideally u use this key only while u decrypt and send it, afterwards make sure u delete the references to the variables
Sidenote: it's still filled with 'flaws'. If security is really a problem, I would consider changing technology or using some sort of lib that handles these stuff more securely. I would start probably here: Securely Erasing Password in Memory (Python)
Not to mention memory dumps can be read 'easily' (if u know what u are looking for...): https://cyberarms.wordpress.com/2011/11/04/memory-forensics-how-to-pull-passwords-from-a-memory-dump/
So yeah, having a key-server which sends you the private key to decrypt is not enough, if you read these last two web entries...

Inexplicable Urllib2 problem between virtualenv's.

I have some test code (as a part of a webapp) that uses urllib2 to perform an operation I would usually perform via a browser:
Log in to a remote website
Move to another page
Perform a POST by filling in a form
I've created 4 separate, clean virtualenvs (with --no-site-packages) on 3 different machines, all with different versions of python but the exact same packages (via pip requirements file), and the code only works on the two virtualenvs on my local development machine(2.6.1 and 2.7.2) - it won't work on either of my production VPSs
In the failing cases, I can log in successfully, move to the correct page but when I submit the form, the remote server replies telling me that there has been an error - it's an application server error page ('we couldn't complete your request') and not a webserver error.
because I can successfully log in and maneuver to a second page, this doesn't seem to be a session or a cookie problem - it's particular to the final POST
because I can perform the operation on a particular machine with the EXACT same headers and data, this doesn't seem to be a problem with what I am requesting/posting
because I am trying the code on two separate VPS rented from different companies, this doesn't seem to be a problem with the VPS physical environment
because the code works on 2 different python versions, I can't imagine it being an incompabilty problem
I'm completely lost at this stage as to why this wouldn't work. I've even 'turned-it-off-and-turn-it-on-again' because I just can't see what the problem could be.
I think it has to be something to do with the final POST coming from a VPS that the remote server doesn't like, but I can't figure out what that could be. I feel like there is something going on under the hood of URLlib that is causing the remote server to dislike the reply.
EDIT
I've installed the exact same Python version (2.6.1) on the VPS as is on my working local copy and it doesn't work remotely, so it must be something to do with originating from a VPS. How could this effect the Http request? Is it something lower level?

You might try setting the debuglevel=1 for urllib2 and see what it comes up with:
import urllib2
h=urllib2.HTTPHandler(debuglevel=1)
opener = urllib2.build_opener(h)
...

This is a total shot in the dark, but are your VPSs 64-bit and your home computer 32-bit, or vice versa? Maybe a difference in default sizes or accuracies of something could be freaking out the server.
Barring that, can you try to find out any information on the software stack the web server is using?

I had similar issues with urllib2 (working with Zimbra's REST api), in the end switched to pycurl with success.
PS
for operations of login/navigate/post, I usually find Mechanize useful and easier to use. Maybe you can give it a show.

Well, it looks like I know why the problem was happening, but I'm not 100% the reason for it.
I simply had to make the server wait (time.sleep()) after it sent the 2nd request (Move to another page) before doing the 3rd request (Perform a POST by filling in a form).
I don't know is it because of a condition with the 3rd party server, or if it's some sort of odd issue with URLlib? The reason it seemed to work on my development machine is presumably because it was slower then the server at running the code?

How to set Selenium browsers to treat Selenium's hub as a proxy server in python on Selenium Grid?

I'm running Selenium 2.0b4dev on Selenium Grid in Ubuntu 10.04, using Python code to write test cases. I've been having trouble with getting basic HTTP authentication to a specific site working, and with a quick google search found that my problem could be solved with the addition of the line self.selenium.add_custom_request_header("Authorization", "Basic %s" % _encoded) (with a proper line break in the middle to conform to PEP 8, of course.)
Unfortunately, apparently also through my search I found in order for that line of code to work I need to configure my browser (whichever one I'm using to run the test cases on the grid) to treat Selenium's (automatically running, apparently?) proxy server as a proxy for that browser to use. But apparently I need to modify the profile of Firefox (or IE)'s launcher to automatically use that proxy, since the whole point of these Selenium Grid test cases is that they aren't supposed to require user intervention, and I have little to no idea how to do that. I've just been using the "ant launch-hub" and "ant launch-remote-control" and then running python programs on the hub that import selenium and unittest.
If anyone could help, that would be just fantastic.

I wrote up an article on how to do this in Ruby. It links to a complementary article on testing self-signed certificates and gives you the set of flags you need to launch Selenium with.
http://mogotest.com/blog/2010/06/23/how-to-perform-basic-auth-in-selenium
To pass args through from grid to the underlying RC server, you need to use something like:
ant -DseleniumArgs="-trustAllSSLCertificates" launch-remote-control
Re: browsers . . . firefox will auto-enable the proxy mode stuff if you pass trustAllSSLCertificates now. Otherwise you need to use *firefoxproxy. IE requires the use of *iexploreproxy or a custom HTA launcher that configures the proxy (the article links to one we open-sourced but would need to be updated to work with 2.0 beta 4).

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.