I've recently been working on a project in which I need to access an ASP.NET web API in order to get some data. The way I've been gaining access to this API so far is by manually setting the cookies within the code and then using requests to get the information that I need. My task now is to automate this process. I get the cookies from the Network tab of the Chrome developer tools. Obviously the cookies change every once in a while, so I've been trying to build something that will refresh them automatically.
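In rough outline, the current manual approach looks like this (names and values are placeholders, since I can't copy the real code out of the network):

import requests

# cookie names/values copied by hand from the Chrome dev tools Network tab
cookies = {"sessionId": "xyz", "ASP.NET_SessionId": "abc"}
resp = requests.get("https://appserver.example.com/api/data", cookies=cookies)
print(resp.status_code)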
I should mention that the network where this is being done is air-gapped, and getting Python libraries inside is tedious, so I am trying to avoid that. It is also the reason why sharing code examples here is very complicated.
The way the login process works in this web app is as follows (data from the Chrome dev tools):
Upon entering the URL there are a bunch of redirects which seem to do nothing.
A request is made to /login.aspx, which returns a "Set-Cookie: sessionId=xyz" header and redirects to /LandingPage.aspx.
A request is made to /LandingPage.aspx with said cookie, which returns a "Set-Cookie" header with a bunch of cookies (ASP.NET etc.). These are the cookies I need in order to make the Python script access the API.
What's written above is how the browser does things. When I try to imitate this in Python requests, I get the first cookie from /login.aspx, but when it redirects to /LandingPage.aspx I get a 401 Unauthorized with the following headers:
WWW-Authenticate: Negotiate
WWW-Authenticate: NTLM
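The imitation attempt looks roughly like this (host name is a placeholder, since I can't copy the real code out):

import requests

s = requests.Session()
# /login.aspx sets sessionId; requests then follows the redirect on its own
resp = s.get("https://appserver.example.com/login.aspx")
print(resp.status_code)                        # 401, on /LandingPage.aspx
print(resp.headers.get("WWW-Authenticate"))    # Negotiate, NTLM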
After having done some reading, I understood that these response headers are related to the NTLM and Kerberos protocols (side question: if it responds with both headers, does that mean I need to provide both authentications, or will either one suffice?).
A quick Google search suggested that these responses should be followed by a request carrying a Kerberos/NTLM token (which I have no idea how to acquire) in order to get a 200 response. I find this pretty weird, considering the browser doesn't appear to make any of these requests and the web app just gives it the cookies without seemingly transferring any NTLM or Kerberos data.
I've thought of a few ways to overcome this, and hopefully you can help me figure out whether they would work.
Trying to get the requests-kerberos or requests-ntlm libraries for Python and using those to overcome this problem. I would like your opinion as to whether this would work. I am reluctant to use this method, though, because of what was mentioned above.
Somehow using PowerShell to get these tokens and then passing them to Python requests without the above-mentioned libraries. But I have no idea whether this would work either.
I would very much appreciate anyone who could further explain the process that's happening here in general, and of course I would greatly appreciate any help with solving this.
Thank you very much!
Trying to get the requests-kerberos or requests-ntlm libraries for Python and using those to overcome this problem. I would like your opinion as to whether this would work. I am reluctant to use this method, though, because of what was mentioned above.
Yes, requests-kerberos would work. HTTP Negotiate means Kerberos almost 100% of the time.
For Linux I'd slightly prefer requests-gssapi, which is based on a more maintained 'gssapi' backend, but at the moment it's limited to Unix-ish systems only – while requests-kerberos has the advantage of supporting Windows through the 'winkerberos' backend. But it doesn't really matter; both will do the job fine.
Don't use NTLM if you can avoid it. Your domain admins will appreciate being able to turn off NTLM domain-wide as soon as they can.
Somehow using PowerShell to get these tokens and then passing them to Python requests without the above-mentioned libraries. But I have no idea whether this would work either.
Technically it's possible, but doing this via PowerShell (or .NET in general) is going the long way around. You can achieve exactly the same thing using Python's sspi module, which talks directly to the actual Windows SSPI interface that handles Kerberos ticket acquisition (and NTLM, for that matter).
(The gssapi module is the Linux equivalent, and the spnego module is a cross-platform wrapper around both.)
You can see a few examples here – OP has a .NET example, the answer has Python.
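For a rough idea, a minimal Windows-side sketch using pywin32's sspi module (the SPN and URLs are assumptions; an SPN is usually HTTP/<server fqdn>):

import base64
import requests
import sspi  # part of pywin32, Windows only

# ask SSPI for an initial Negotiate token for the target service
ctx = sspi.ClientAuth("Negotiate", targetspn="HTTP/appserver.example.com")
err, out_buf = ctx.authorize(None)
token = base64.b64encode(out_buf[0].Buffer).decode("ascii")
resp = requests.get("https://appserver.example.com/LandingPage.aspx",
                    headers={"Authorization": "Negotiate " + token})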
But keep in mind that Kerberos tokens contain not only the service ticket but also a one-time-use authenticator (to prevent replay attacks), so you need to get a fresh token for every HTTP request.
So don't reinvent the wheel and just use requests-kerberos, which will automatically call SSPI to get a token whenever needed.
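A minimal sketch of that (the URL is a placeholder):

import requests
from requests_kerberos import HTTPKerberosAuth

session = requests.Session()
session.auth = HTTPKerberosAuth()  # fresh Negotiate token per request via SSPI/GSSAPI
resp = session.get("https://appserver.example.com/login.aspx")
print(resp.status_code)
print(session.cookies.get_dict())  # the ASP.NET cookies you were copying by hand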
It says that in order for requests-kerberos to work, there has to be a TGT already cached on the PC. This program is supposed to run for weeks without interference, and to my understanding these tickets expire after about 10 hours.
That's typical for all Kerberos use, not just requests-kerberos specifically.
If you run the app on Windows, from an interactive session, then Windows will automatically renew Kerberos tickets as needed (it keeps your password cached in LSA memory for that purpose). However, don't run long-term tasks in interactive sessions...
If you run the app on Windows, as a service, then it will use the "machine credentials" aka "computer account" (see details), and again LSA will keep the tickets up-to-date.
If you run the app on Linux, then you can create a keytab that stores the client credentials for the application. (This doesn't need domain admin rights, you only need to know the app account's password.)
On Linux there are at least 4 different ways to use a keytab for long-term jobs: k5start (third-party, but common); KRB5_CLIENT_KTNAME (built-in to MIT Kerberos, but only in recent versions); gss-proxy (from RedHat, might already be part of the OS); or a basic cronjob that just re-runs kinit to acquire new tickets every 4-6 hours.
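For example, the cronjob variant is just a line like this (principal and keytab path are placeholders):

# crontab entry: refresh the ticket cache from the keytab every 4 hours
0 */4 * * * /usr/bin/kinit -kt /etc/krb5.app.keytab svc-app@EXAMPLE.COM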
I find this pretty weird, considering the browser doesn't appear to make any of these requests and the web app just gives it the cookies without seemingly transferring any NTLM or Kerberos data.
It likely does; you might just be overlooking it.
Note that some SSO systems use JavaScript to dynamically probe for whether the browser has Kerberos authentication properly set up – if the main page really doesn't send a token, then it might be an iframe or an AJAX/XHR request that does.
Related
I've been hitting a REST API for a website for months with the Python requests library without issue. It's on a private network that I have no control over. It also requires client certificate authentication, so the call looks something like:
r = requests.get('https://example.com/some/endpoint?param=something', cert=(key_path, cert_path), verify=ca_path)
Upon inspection of the request, I was able to see that the server initially replies with a 302 redirect code to another URI on the server in order to set an authentication cookie, and then redirects back to the resource I was requesting, which is then fulfilled successfully. Hakuna matata.
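Incidentally, you can dump the redirect chain that requests follows, to compare against what Node sees (reusing the same key_path/cert_path/ca_path as above):

import requests

r = requests.get('https://example.com/some/endpoint?param=something',
                 cert=(key_path, cert_path), verify=ca_path)
for hop in r.history:  # each 302 hop, in order
    print(hop.status_code, hop.url, hop.headers.get('Set-Cookie'))
print(r.status_code, r.url)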
We've been standing up a new NodeJS server component and attempting to connect it to the same endpoint. I've tried with a few different libraries, and they all get identical results. Here's an example of one of my attempts using the popular got library with the tough-cookie CookieJar:
// requires for this snippet (keyPath/certPath/caPath are defined elsewhere)
const fs = require('fs');
const got = require('got');
const { CookieJar } = require('tough-cookie');

let opts = {
    key: fs.readFileSync(keyPath),
    cert: fs.readFileSync(certPath),
    ca: [fs.readFileSync(caPath)],
    cookieJar: new CookieJar()
};

// Using the Promise API here because I'm running on an older version of
// NodeJS which unfortunately doesn't have async/await :(
got('https://example.com/some/endpoint?param=something', opts).then((res) => {
    // ...
});
Strangely, the server responds with a 200 code, and an HTML page that basically says (and I'm paraphrasing of course):
"You seem to have lost your session info, dude. Maybe your browser restarted after a plugin installation or something."
It has embedded links to the site's login page. Unfortunately I can't copy and paste the output because everything's on a closed private network.
I can't for the life of me figure out why this server (which the response headers indicate is an Apache server) would respond differently to Node's HTTP request vs Python's. I've even changed the HTTP request headers on both the requests and the got calls to be absolutely identical, to no avail. It always works for Python, and never works for Node.
This makes no sense to me. Is there anyone familiar with Python's requests library and Node's HTTP module who can identify the subtle differences in the way they might be connecting to the server that could be causing this issue? The server clearly is able to distinguish between the two requests, even though the headers are identical. Is anyone familiar with Apache who has any ideas or things to look for that could shed light on the issue?
In case it's relevant, we're using Python v3.6 and Node v8.x (Can't remember exactly which minor version... and unfortunately I can't access the machine at the moment, will update later)
Any suggestions? What other things can I try to get the requests to complete successfully in Node like they do in Python?
I'm in the process of building an app for Facebook using Python and Django. I'm investigating different solutions for integrating with the Facebook authentication API.
So far I've found the two viable solutions:
django-social-auth
python-sdk
I've already tried the first one and it seems to work nicely. I've just read about the second one and it seems to use Facebook JavaScript SDK.
My question is: are those two libraries doing authentication differently? Do I understand correctly that the first one uses OAuth directly to communicate with Facebook and get an authentication token from there, whereas the second one just displays some JavaScript-enriched intermediate pages that request the authentication token at the level of the web browser?
In general: are there different ways of going about Facebook authentication (JavaScript SDK vs something else)? Why is the JavaScript SDK the recommended approach? And is the "something else" approach incapable of producing cookies and therefore less efficient in any way?
When you use a backend implementation (Python, PHP, Perl, etc.), you generally have to use URL redirects (Graph API) to interact with Facebook and the user. Personally, I don't think this is a good user experience.
Using the JavaScript SDK, you can do everything inline, which means the user never has to leave your page to grant permissions, post to the wall, send requests, etc. You can still use the backend libs to do other things, and you would need to if you are doing any "offline" activity or subscribing to real-time events.
In the end, you end up with the same authorization rights. Both are making similar calls to Facebook to get a valid, authorized session. So either one, or both works.
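To make the difference concrete, a rough sketch of the first step of the server-side redirect flow, building the OAuth dialog URL (app ID and callback URL are placeholders):

from urllib.parse import urlencode

params = {
    "client_id": "YOUR_APP_ID",                         # placeholder
    "redirect_uri": "https://example.com/fb/callback",  # placeholder
    "scope": "email",
}
login_url = "https://www.facebook.com/dialog/oauth?" + urlencode(params)
# redirect the user to login_url; Facebook sends them back with ?code=...,
# which the backend then exchanges for an access token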
Here is what I would like to do, and I want to know how some people with experience in this field do this:
With three POST requests I get from the HTTP server:
widgets and layout
and then app logic (minimal)
data
Or maybe it's better to combine the first two, or all three. I'm thinking of using PyQt. I think I can load .ui files. I can parse JSON data. I just think it would be rather dangerous to pass code over a network to be executed on the client. If someone can hijack the connection, or change the app's settings so it accesses a bogus server, that is nasty.
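For the widgets-and-layout request, something like this sketch is what I have in mind (PyQt5 shown; the URL and CA bundle are placeholders):

import io
import requests
from PyQt5 import uic, QtWidgets

app = QtWidgets.QApplication([])
# fetch the .ui description over HTTPS, validating the server cert
resp = requests.get("https://appserver.example.com/main.ui", verify="ca.pem")
widget = uic.loadUi(io.StringIO(resp.text))  # loadUi accepts a file-like object
widget.show()
app.exec_()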
I want to do it this way because it keeps all the clients up-to-date. It's sort of like a webapp but simpler because of Qt. Essentially the "thin" app is just a minimal compiled python file that loads data from a server.
How can I do this without introducing security issues on the client? Is https good enough? Is there a way to get pyqt to run in a sandbox of sorts?
PS. I'm not stuck on Qt or Python. I do like the concept, though. I don't really want to use Java - server or client side.
Your desire to send "app logic" from the server to the client without sending "code" is inherently self-contradictory, though you may not realize that yet -- even if the "logic" you're sending is in some simplified ad-hoc "language" (which you don't even think of as a language;-), to all intents and purposes your Python code will be interpreting that language and thereby execute that code. You may "sandbox" things to some extent, but in the end, that's what you're doing.
To avoid hijackings and other tricks, instead, use HTTPS and validate the server's cert in your client: that will protect you from all the problems you're worrying about (if somebody can edit the app enough to defeat the HTTPS cert validation, they can edit it enough to make it run whatever code they want, without any need to send that code from a server;-).
Once you're using https, having the server send Python modules (in source form if you need to support multiple Python versions on the clients, else bytecode is fine) and the client thereby save them to disk and import / reload them, will be just fine. You'll basically be doing a variant of the classic "plugins architecture" where the "plugins" are in fact being sent from the server (instead of being found on disk in a given location).
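A bare-bones sketch of that idea (URL, module name, and CA bundle are placeholders):

import importlib
import requests

# cert validation is the security boundary here, as argued above
resp = requests.get("https://appserver.example.com/plugins/applogic.py",
                    verify="ca.pem")
with open("applogic.py", "w") as f:
    f.write(resp.text)

import applogic             # first load
importlib.reload(applogic)  # later refreshes, when the server ships a new version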
Use a web browser: it is a well-documented system that does everything you want. It is also relatively fast to create simple graphical applications in a browser. Examples for my reasoning:
The Sage math environment has built their graphical client as an application that runs in a browser, together with a local web-server.
There is the Pyjamas project that compiles Python to Javascript. This is IMHO worth a try.
Edit:
You could try PyPy's sandboxed interpreter as a secure Python interpreter for the code that was transferred over the network.
And then there is the simplest solution: simply send Python modules over the network, but sign and/or encrypt them. This is the way Linux distributions work. You store a cryptographic token on the local computer. The server signs/encrypts the code before it sends it, with a matching token. GPG should be able to do it.
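A minimal sketch of the verification step using the gpg CLI (file names are placeholders; assumes the signer's public key is already in the local keyring):

import subprocess

# refuse to load the module unless its detached signature checks out
result = subprocess.run(["gpg", "--verify", "applogic.py.sig", "applogic.py"])
if result.returncode != 0:
    raise RuntimeError("signature check failed; not importing applogic")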
We use a lot of Python to do much of our deployment, and it would be handy to connect to our TFS server to get information on iteration paths, tickets, etc. I can see the web service but am unable to find any documentation. Just wondering if anyone knew of anything?
The web services are not documented by Microsoft as it is not an officially supported route to talk to TFS. The officially supported route is to use their .NET API.
In the case of your sort of application, the course of action I usually recommend is to create your own web service shim that lives on the TFS server (or another server) and uses their API to talk to the server but allows you to present the data in a nice way to your application.
Their object model simplifies the interactions a great deal (depending on what you want to do), so it actually means less code overall - but better-tested and testable code - and you can also work around things such as the NTLM auth used by the TFS web services.
Hope that helps,
Martin.
So, this question is friggin' old, but let me take a whack at it (since it keeps coming up in my Google searches).
There's no officially supported API for on-premises TFS (the MSFT-hosted one has http://www.visualstudio.com/en-us/integrate/api/overview).
That said, you can always use Fiddler (http://www.telerik.com/fiddler) or something like it to inspect the calls that the web client for TFS is making to the server and do your magic to turn those into the scripts in python you want.
You'll need to run your Python scripts under a service account that has TFS privileges appropriate to what it is trying to do (read, update, configure... whatever).
Since it sounds like you are just trying to read from TFS, this might be a really easy way for you to get what you want, since an HTTP GET to
http://yourserver/tfs/yourcollection/yourproject/_workitems#id=yourworkitemid
will hand you back (halfway) sane HTML payloads.
If you want lists of iterations or teams or whatever, then your service account needs to have the appropriate admin privileges and hit things like
http://yourserver/tfs/yourcollection/yourproject/_admin/_iterations
and use that response.
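A rough sketch of such a GET under the service account, using requests plus requests_ntlm to handle the NTLM auth (server, project, and account names are placeholders):

import requests
from requests_ntlm import HTTPNtlmAuth

url = "http://yourserver/tfs/yourcollection/yourproject/_workitems"
resp = requests.get(url, auth=HTTPNtlmAuth("DOMAIN\\svc-tfs-read", "password"))
print(resp.status_code)
print(resp.text[:200])  # the (halfway) sane HTML mentioned above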
I have been running a service for some months now using Google's OAuth2 authentication. Most of the time everything works well, but there are occasional issues with callbacks coming back empty from Google to me: something along the lines of 1 out of 15 callbacks arrives at my service completely without parameters in the GET request. Just a naked /my-callback-url request, no parameters at all.
I'm having quite some difficulty explaining this behaviour, and neither can I find many references to it when searching the net.
I'm also so far unable to re-create this phenomenon in my own development environment, so my solution ideas have had to be mostly speculation: my first hunch at a quick-and-dirty workaround was to re-generate the OAuth request URL and return a 302 redirect response so Google can have another go. But that sounds like risking an infinite redirect loop if it turns out that the problem originates in my code. I would very much prefer to understand what's actually going on.
Do any of you have experience of 'empty' OAuth2 callbacks from Google? And in that case, what would be the most sensible way of handling them? Or is there a typical error when generating the authentication URLs that causes this behaviour? (I'm using Python & Requests-OAuthlib to handle my OAuth2 interaction.)
I suspect that these requests are not redirects back from Google. There are crawlers and other hackers trying to hit every endpoint that they find on the web. So these could be just abusive requests.
If you can correlate a request with empty parameters to a request that was redirected from your server (based on IP address or a cookie you set before redirecting to Google), then we can try to investigate further.
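For what it's worth, Requests-OAuthlib already generates a state value you can use for exactly this correlation (client ID, scopes, and callback URL are placeholders):

from requests_oauthlib import OAuth2Session

oauth = OAuth2Session(
    "CLIENT_ID",
    redirect_uri="https://example.com/my-callback-url",
    scope=["openid", "email"],
)
auth_url, state = oauth.authorization_url(
    "https://accounts.google.com/o/oauth2/v2/auth")
# store `state` in the user's session before redirecting; a callback arriving
# with no parameters (or a state you never issued) did not come from your flow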