How to safely authenticate a user using LDAP? - python

For context: I am developing a web application where users need to authenticate to view internal documents. I neither need any detailed info on users nor special permission management, two states are sufficient: Either a session belongs to an authenticated user (→ documents can be accessed) or it does not (→ documents cannot be accessed). A user authenticates by providing a username and a password, which I want to check against an LDAP server.
I am using Python 3.10 and the ldap3 Python library.
The code
I am currently using the following code to authenticate a user:
#!/usr/bin/env python3
import ssl
from ldap3 import Tls, Server, Connection
from ldap3.core.exceptions import LDAPBindError, LDAPPasswordIsMandatoryError
def is_valid(username: str, password: str) -> bool:
tls_configuration = Tls(validate=ssl.CERT_REQUIRED)
server = Server("ldaps://ldap.example.com", tls=tls_configuration)
user_dn = f"cn={username},ou=ops,dc=ldap,dc=example,dc=com"
try:
with Connection(server, user=user_dn, password=password):
return True
except (LDAPBindError, LDAPPasswordIsMandatoryError):
return False
Demo instance
If you want to run this code, you could try using the FreeIPA's project demo LDAP server.
Replace CERT_REQUIRED with CERT_NONE because the server only provides a self-signed cert (this obviously is a security flaw, but required to use this particular demo – the server I want to use uses a Let's Encrypt certificate).
Replace "ldaps://ldap.example.com" with ldaps://ipa.demo1.freeipa.org
Replace the user_dn with f"uid={username},cn=users,cn=accounts,dc=demo1,dc=freeipa,dc=org"
After doing so, you could try running the following commands:
>>> is_valid("admin", "Secret123")
True
>>> is_valid("admin", "Secret1234")
False
>>> is_valid("admin", "")
False
>>> is_valid("admin", None)
False
>>> is_valid("nonexistent", "Secret123")
False
My question(s)
Does the code above safely determine if a user has provided valid credentials?
Notably, I am concerned about the following particular aspects:
Is attempting to bind to the LDAP server enough to verify credentials?
The body of the with statement should only be executed if binding was successful and therefore returns True without further ado. Is this safe? Or could it be possible that binding succeeds but the password provided would still be considered wrong and not sufficient to authenticate the user against the web app.
Am I opening myself up to injection attacks? If so, how to properly mitigate them?
user_dn = f"cn={username},ou=ops,dc=ldap,dc=example,dc=com" uses the untrusted username (that came directly from the web form) to build a string. That basically screams LDAP injection.
Is TLS properly configured?
The connection should use modern TLS encryption and verify the certificate presented by the server, just like a normal browser would do.
Also, of course, if there is anything else unsafe about my code, I'd be happy to know what it is.
Resources I've already found
I've already searched for answers to the particular aspects. Sadly, I have found nothing definite (i.e. no one definitely saying something I do here is bad or good), but I wanted to provide them as a starting point for a potential answer:
Probably yes.
“How to bind (authenticate) a user with ldap3 in python3” uses a similar code snippet to bind, and no one explicitly says that that's bad.
Auth0 uses this method in their blog post “Using LDAP and Active Directory with C# 101” and they probably know what they're doing.
Probably not, so no mitigation is needed.
There are a few questions on LDAP injection (like “How to prevent LDAP-injection in ldap3 for python3”) but they always only mention filtering and search, not binding.
The OWASP Cheat Sheet on LDAP Injection mentions enabling bind authentication as a way to mitigate LDAP injection when filtering, but say nothing about sanitization needed for the bind DN.
I suppose you could even argue that this scenario is not susceptible to injection attacks, because we are indeed processing untrusted input, but only where untrusted input is expected. Anyone can type anything into a login form, but they can also put anything into a request to bind to an LDAP server (without even bothering with the web app). As long as I don't put untrusted input somewhere where trusted input is expected (e.g. using a username in a filter query after binding with an LDAP admin account), I should be safe.
However, the ldap3 documentation of the Connection object does mention one should use escape_rdn when binding with an untrusted username. This is at odds with my suppositions, who's right?
Probably yes.
At least an error was thrown when I tried to use this code with a server that only presented a self-signed certificate, so I suppose I should be safe.

Is attempting to bind to the LDAP server enough to verify credentials?
From the LDAP protocol side, yes, and many systems already rely on this behavior (e.g. pam_ldap for Linux OS-level authentication against an LDAP server). I've never heard of any server where the bind result would be deferred until another operation.
From the ldap3 module side I'd be more worried, as in my experience initializing a Connection did not attempt to connect – much less bind – to the server until I explicitly called .bind() (or unless I specified auto_bind=True), but if your example works then I assume that using a with block does this correctly.
In old code (which holds a persistent connection, no 'with') I've used this, but it may be outdated:
conn = ldap3.Connection(server, raise_exceptions=True)
conn.bind()
(For some apps I use Apache as a reverse proxy and its mod_auth_ldap handles LDAP authentication for me, especially when "is authenticated" is sufficient.)
Am I opening myself up to injection attacks? If so, how to properly mitigate them?
Well, kind of, but not in a way that would be easily exploitable. The bind DN is not a free-form query – it's only a weird-looking "user name" field and it must exactly match an existing entry; you can't put wildcards in it.
(It's in the LDAP server's best interests to be strict about what the "bind" operation accepts, because it's literally the user-facing operation for logging into an LDAP server before anything else is done – it's not just a "password check" function.)
For example, if you have some users at OU=Ops and some at OU=Superops,OU=Ops, then someone could specify Foo,OU=Superops as their username resulting in UID=Foo,OU=Superops,OU=Ops, as the DN – but they'd still have to provide the correct password for that account anyway; they cannot trick the server into using one account's privileges while checking another account's password.
However, it's easy to avoid injection regardless. DN component values can be escaped using:
ldap3: ldap3.utils.dn.escape_rdn(string)
python-ldap: ldap.dn.escape_dn_chars(string)
That being said, I dislike "DN template" approach for a completely different reason – its rather limited usefulness; it only works when all of your accounts are under the same OU (flat hierarchy) and only when they're named after the uid attribute.
That may be the case for a purpose-built LDAP directory, but on a typical Microsoft Active Directory server (or, I believe, on some FreeIPA servers as well) the user account entries are named after their full name (the cn attribute) and can be scattered across many OUs. A two-step approach is more common:
Bind using your app's service credentials, then search the directory for any "user" entries that have the username in their uid attribute, or similar, and verify that you found exactly one entry;
Unbind (optional?), then bind again with the user's found DN and the provided password.
When searching, you do have to worry about LDAP filter injection attacks a bit more, as a username like foo)(uid=* might give undesirable results. (But requiring the results to match exactly 1 entry – not "at least 1" – helps with mitigating this as well.)
Filter values can be escaped using:
ldap3: ldap3.utils.conv.escape_filter_chars(string)
python-ldap: ldap.filter.escape_filter_chars(string)
(python-ldap also has a convenient wrapper ldap.filter.filter_format around this, but it's basically just the_filter % tuple(map(escape_filter_chars, args)).)
The escaping rules for filter values are different from those for RDN values, so you need to use the correct one for the specific context. But at least unlike SQL, they are exactly the same everywhere, so the functions that come with your LDAP client module will work with any server.
Is TLS properly configured?
ldap3/core/tls.py looks good to me – it uses ssl.create_default_context() when supported, loads the system default CA certificates, so no extra configuration should be needed. Although it does implement custom hostname checking instead of relying on the ssl module's check_hostname so that's a bit weird. (Perhaps the LDAP-over-TLS spec defines wildcard matching rules that are slightly incompatible with the usual HTTP-over-TLS ones.)
An alternative approach instead of manually escaping DN templates:
dn = build_dn({"CN": f"{last}, {first} ({username})"},
{"OU": "Faculty of Foo and Bar (XYZF)"},
{"OU": "Staff"},
ad.BASE_DN)
def build_dn(*args):
components = []
for rdn in args:
if isinstance(rdn, dict):
rdn = [(a, ldap.dn.escape_dn_chars(v))
for a, v in rdn.items()]
rdn.sort()
rdn = "+".join(["%s=%s" % av for av in rdn])
components.append(rdn)
elif isinstance(rdn, str):
components.append(rdn)
else:
raise ValueError("Unacceptable RDN type for %r" % (rdn,))
return ",".join(components)

Related

How to replicate python's `ssl.get_default_context()` in Windows C++?

My final goal is to port over a simple mqtt-paho-python script to C++ for integration within a large application.
The python example using paho is quite simple:
client = mqtt.Client(transport="websockets")
client.username_pw_set(settings['username'], password=settings['password'])
client.tls_set_context(context=ssl.create_default_context())
They set up the default TLS context, authenticate with a username and password, and then connect. This works great!
However, now I want to try to get the same secure configuration using paho-mqtt-cpp. The basic example, borrowing from their async examples, goes like this:
mqtt::connect_options connOpts;
connOpts.set_keep_alive_interval(20);
connOpts.set_clean_session(true);
connOpts.set_user_name("username");
connOpts.set_password("password123");
mqtt::ssl_options sslOpts;
connOpts.set_ssl(sslOpts);
mqtt::async_client client("wss://test.mosquitto.org:8081", "myClient");
callback cb(client, connOpts);
client.set_callback(cb);
However, ssl.get_default_context() in python's ssl library seems to do quite a bit of setup for me that isn't replicated in C++; from python's own documentation:
"For client use, if you don’t have any special requirements for your security policy, it is highly recommended that you use the create_default_context() function to create your SSL context. It will load the system’s trusted CA certificates, enable certificate validation and hostname checking, and try to choose reasonably secure protocol and cipher settings."
Most WSS connections I've tried require a certificate, and create_default_context() seems to be able to provide the proper certificates without me generating any myself.
So my questions:
(1) Where are Windows' System Default Certificates that I can use for secure connections? and
(2) What other settings do I need to manually configure that create_default_context() might be setting up for me under the hood?
I've tried looking at the source, but it's not easily discernible where the OS-specific options are.

Name resolving in http requests

I am trying to a simple make a http request to a server inside my company, from a dev server. I figured out that depending on the origin / destination server, I might, or not, to be forced to use qualified name of the destination server, like srvdestination.com.company.world instead of just srvdestination.
I am ok with this, but I don't understand how come my DB connection works?
Let's say I have srvorigin. Now, to make http request, I must use qualified name srvdestination.com.company.world. However, for database connection, the connection string with un-qualified name is enough psycopg.connect(host='srvdestination', ...) I understand that protocols are different, but how psycopg2 does to resolve the real name?
First it all depend on how the name resolution subsystem of your OS is configured. If you are on Unix (you did not specify), this is governed by /etc/resolv.conf. Here you can provide the OS with a search list: if a name has not "enough" dots (the number is configurable) then a suffix is added to retry resolution.
The library you use to do the HTTP request may not query the OS for name resolution and do its DNS resolution itself. In which case, it can only work with the information you give it (but it could as well re-use the OS /etc/resolv.conf and information in it), hence the need to use the full name.
On the contrary, the psycopg2 may use the OS resolution mechanism and hence dealing with "short" names just fine.
Both libraries should have documentation on how they handle hostnames... or otherwise you need to study their source code. I guess psycopg2 is a wrapper around the default libpq standard library, written in C if I am not mistaken, which hence certainly use the standard OS resolution process.
I can understand the curiosity around this difference but anyway my advice is to keep short names when you type commands on the shell and equivalent (and even there it could be a problem), but always use FQDNs (Fully Qualified Domain Names) in your program and configuration files. You will avoid a lot of problems.

Pyramid.security: Is getting user info from a database with unauthenticated_userid(request) really secure?

I'm trying to make an accesible cache of user data using Pyramid doc's "Making A “User Object” Available as a Request Attribute" example.
They're using this code to return a user object to set_request_property:
from pyramid.security import unauthenticated_userid
def get_user(request):
# the below line is just an example, use your own method of
# accessing a database connection here (this could even be another
# request property such as request.db, implemented using this same
# pattern).
dbconn = request.registry.settings['dbconn']
userid = unauthenticated_userid(request)
if userid is not None:
# this should return None if the user doesn't exist
# in the database
return dbconn['users'].query({'id':userid})
I don't understand why they're using unauthenticated_userid(request) to lookup user info from the database...isn't that insecure? That means that user might not be logged in, so why are you using that ID to get there private info from the database?
Shouldn't
userid = authenticated_userid(request)
be used instead to make sure the user is logged in? What's the advantage of using unauthenticated_userid(request)? Please help me understand what's going on here.
The unauthenticated_userid call is a cheaper call; it looks up the user id from the request without going through the whole authentication process again.
The key concept there is the word again. You should only use the method in views that have already been authorized. In other words, by the time you reach code that uses unauthenticated_userid you've already verified the user, and specifically do not want to do this again for this particular call.
Authenticating users against a backend persistent storage can be expensive, especially if such a storage doesn't support caching. The unauthenticated_userid API method is an optimization where the request is basically your userid cache.
This is a late reply but it was linked as a source of confusion for some users of Pyramid.
The accepted answer here is not the actual reason that unauthenticated_userid is used for request.user. It has nothing to do with performance.
The reason that it uses unauthenticated_userid is because it makes it easier to reuse the authentication policy between applications with smaller modifications required. Your application needs a "source of truth" for whether the user is allowed to be considered authenticated and usually the policy's internal logic is not enough to make this determination. A valid cookie is nice, but you usually want to verify it with your backend before trusting it. Great, so where do we put that logic? Well unauthenticated_userid doesn't make sense because it is the reusable part of the policy that focuses specifically on parsing the request headers. You could put it into authenticated_userid but this method is not the one you normally care about in your application. You normally use request.user in your apps (rarely do you probably care about request.authenticated_userid directly) and lastly the request.user is a superset of functionality - it provides an entire user object, not just an id. It would be silly to verify the id without verifying the entire object in most cases. We can only have one "source of truth" and so the recipe declares it to be request.user. The groupfinder (and thus authenticated_userid) can now depend on request.user and trust that what it gets back from there has been properly verified with the backend. Also request.user is already reified and thus speeds up subsequent calls to request.authenticated_userid naturally.
Looks like Martijn Pieters is right.
My micro benchmark to test this (in my project I use Redis as DB for users and everything else):
print ('start test')
t1 = time()
authenticated_userid(self.request)
print ('authenticated: ' + str(time()-t1))
t1 = time()
unauthenticated_userid(self.request)
print ('unauthenticated: ' + str(time()-t1))
print ('test_stop')
Results:
start test
REDIS AUTH! # <-- actually this is query to groups finder in Redis
authenticated: 0.00032901763916
unauthenticated: 7.31945037842e-05
test_stop
It was tested for few times, results are constant :) Do you think I should add Martijn's answer to that article in Pyramid docs to make things more 'clear'? :)

How to add authentication to a (Python) twisted xmlrpc server

I am trying to add authentication to a xmlrpc server (which will be running on nodes of a P2P network) without using user:password#host as this will reveal the password to all attackers. The authentication is so to basically create a private network, preventing unauthorised users from accessing it.
My solution to this was to create a challenge response system very similar to this but I have no clue how to add this to the xmlrpc server code.
I found a similar question (Where custom authentication was needed) here.
So I tried creating a module that would be called whenever a client connected to the server. This would connect to a challenge-response server running on the client and if the client responded correctly would return True. The only problem was that I could only call the module once and then I got a reactor cannot be restarted error. So is there some way of having a class that whenever the "check()" function is called it will connect and do this?
Would the simplest thing to do be to connect using SSL? Would that protect the password? Although this solution would not be optimal as I am trying to avoid having to generate SSL certificates for all the nodes.
Don't invent your own authentication scheme. There are plenty of great schemes already, and you don't want to become responsible for doing the security research into what vulnerabilities exist in your invention.
There are two very widely supported authentication mechanisms for HTTP (over which XML-RPC runs, therefore they apply to XML-RPC). One is "Basic" and the other is "Digest". "Basic" is fine if you decide to run over SSL. Digest is more appropriate if you really can't use SSL.
Both are supported by Twisted Web via twisted.web.guard.HTTPAuthSessionWrapper, with copious documentation.
Based on your problem description, it sounds like the Secure Remote Password Protocol might be what you're looking for. It's a password-based mechanism that provides strong, mutual authentication without the complexity of SSL certificate management. It may not be quite as flexible as SSL certificates but it's easy to use and understand (the full protocol description fits on a single page). I've often found it a useful tool for situations where a trusted third party (aka Kerberos/CA authorities) isn't appropriate.
For anyone that was looking for a full example below is mine (thanks to Rakis for pointing me in the right direction). In this the user and password is stored in a file called 'passwd' (see the first useful link for more details and how to change it).
Server:
#!/usr/bin/env python
import bjsonrpc
from SRPSocket import SRPSocket
import SocketServer
from bjsonrpc.handlers import BaseHandler
import time
class handler(BaseHandler):
def time(self):
return time.time()
class SecureServer(SRPSocket.SRPHost):
def auth_socket(self, socket):
server = bjsonrpc.server.Server(socket, handler_factory=handler)
server.serve()
s = SocketServer.ForkingTCPServer(('', 1337), SecureServer)
s.serve_forever()
Client:
#! /usr/bin/env python
import bjsonrpc
from bjsonrpc.handlers import BaseHandler
from SRPSocket import SRPSocket
import time
class handler(BaseHandler):
def time(self):
return time.time()
socket, key = SRPSocket.SRPSocket('localhost', 1337, 'dht', 'testpass')
connection = bjsonrpc.connection.Connection(socket, handler_factory=handler)
test = connection.call.time()
print test
time.sleep(1)
Some useful links:
http://members.tripod.com/professor_tom/archives/srpsocket.html
http://packages.python.org/bjsonrpc/tutorial1/index.html

Exposing passwords in django

For a git repository that is shared with others, is it a vulnerability to expose your database password in the settings.py file? (My initial thought was no, since you still need the ssh password.)
It depends on who has read access to the repository, but it's generally a good idea not to put passwords into version control. It's probably better to put it in a seperate file like password.py with only the password in it, like this:
password = 'asdasd'
and import or execfile this in your settings.py. You can then add the password.py to your .gitignore.
That assumes your database is only accessible from one specific host, and even then, why would you want to give a potential attacker another piece of information? Suppose you deploy this to a shared host and I have an account on there, I could connect to your database just by logging into my account on that box.
Also, depending on who you are writing this for and what kind of auditing they need to go through (PCI, state audits, etc), this might just not be allowed.
I would try to find a way around checking in the password.

Categories