Kafka authentication and authorization - Python

I read Kafka's documentation, but I did not understand it. Can I use a username and password for Python producers?
Can I specify that a given producer may only produce to a particular topic, similar to MySQL grants? (The producer is written in Python.)

Yes, you can have a username/password per topic; see the Authorization and ACLs section of the official documentation.
You can enable security with either SSL or SASL. Kafka supports the following SASL mechanisms:
SASL/GSSAPI (Kerberos) - starting at version 0.9.0.0
SASL/PLAIN - starting at version 0.10.0.0
SASL/SCRAM-SHA-256 and SASL/SCRAM-SHA-512 - starting at version 0.10.2.0
From the docs, an example of adding ACLs:
Suppose you want to add an acl "Principals User:Bob and User:Alice are allowed to perform Operation Read and Write on Topic Test-Topic from IP 198.51.100.0 and IP 198.51.100.1". You can do that by executing the CLI with following options:
bin/kafka-acls.sh --authorizer-properties zookeeper.connect=localhost:2181 --add --allow-principal User:Bob --allow-principal User:Alice --allow-host 198.51.100.0 --allow-host 198.51.100.1 --operation Read --operation Write --topic Test-topic
This blog post also has some useful information.
I'm not sure which library you are using, but it should just be a matter of passing the proper properties to the producer/client. kafka-python has support:
Support SASL/Kerberos
Support for ACL based kafka
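For instance, here is a minimal, hedged sketch of a kafka-python producer authenticating over SASL/GSSAPI (Kerberos). The broker address and topic are placeholders, and a valid Kerberos ticket (e.g. obtained via kinit) plus the gssapi package are assumed:

from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers='kafka1:9093',      # placeholder broker
    security_protocol='SASL_SSL',
    sasl_mechanism='GSSAPI',
    sasl_kerberos_service_name='kafka',   # kafka-python's default service name
)
producer.send('Test-topic', b'hello')
producer.flush()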

If you want to use username+password for authentication, you need to enable SASL authentication using the Plain mechanism on your cluster. See the Authentication using SASL section on the Kafka website for the full instructions.
Note that you also probably want to enable SSL (SASL_SSL), as otherwise, SASL Plain would transmit credentials in plaintext.
Several Python clients support SASL Plain, for example:
kafka-python: https://github.com/dpkp/kafka-python
confluent-kafka-python: https://github.com/confluentinc/confluent-kafka-python
Regarding authorizations, using the default authorizer, kafka.security.auth.SimpleAclAuthorizer, you can restrict a producer to only be able to produce to a topic. Again this is all fully documented on Kafka's website in the Authorization and ACLs section.
For example with SASL Plain, by default, the Principal name is the username that was used to connect. Using the following command you can restrict user Alice to only be able to produce to the topic named testtopic:
bin/kafka-acls.sh --authorizer-properties zookeeper.connect=localhost:2181 --add --allow-principal User:Alice --producer --topic testtopic
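On the client side, a kafka-python producer for user Alice might then look like this hedged sketch (broker address, CA file, and password are placeholders):

from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers='kafka1:9093',        # placeholder broker
    security_protocol='SASL_SSL',
    ssl_cafile='ca.pem',                    # placeholder CA certificate
    sasl_mechanism='PLAIN',
    sasl_plain_username='Alice',
    sasl_plain_password='alice-secret',     # placeholder
)
# Allowed by the ACL above; producing to any other topic would be denied
producer.send('testtopic', b'some message')
producer.flush()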

Do you mean something like this?

from kafka import KafkaConsumer
# import ssl  # only needed if you enable the TLS options below

topic = "test"
sasl_mechanism = "PLAIN"
username = "admin"
password = "pwd$"
security_protocol = "SASL_PLAINTEXT"

# context = ssl.create_default_context()
# context.options |= ssl.OP_NO_TLSv1
# context.options |= ssl.OP_NO_TLSv1_1

consumer = KafkaConsumer(topic,
                         bootstrap_servers='kafka1:9092',
                         # api_version=(0, 10),
                         security_protocol=security_protocol,
                         # ssl_context=context,
                         # ssl_check_hostname=True,
                         # ssl_cafile='../keys/CARoot.pem',
                         # ssl_certfile='../keys/certificate.pem',
                         # ssl_keyfile='../keys/key.pem',
                         sasl_mechanism=sasl_mechanism,
                         sasl_plain_username=username,
                         sasl_plain_password=password)

Related

Active Directory Authentication of Logged-in User using ADODB/ADSDSOObject

My system is connected to Active Directory and I can query it by binding using a username and password.
I noticed that I am also able to query it without explicitly providing a username and password when using ADO or the ADsDSOObject provider (tried in Java/Python/VBA).
I would like to understand how the authentication is done in this case.
Example of the first case, where a username and password are explicitly needed:

from ldap3 import Server, Connection
from ldap3.extend.microsoft.addMembersToGroups import ad_add_members_to_groups as addUsersInGroups

server = Server('172.16.10.50', port=636, use_ssl=True)
conn = Connection(server,
                  'CN=ldap_bind_account,OU=1_Service_Accounts,OU=0_Users,DC=TG,DC=LOCAL',
                  'Passw0rds123!',
                  auto_bind=True)
print(conn)
Example of the second case, where no username and password are explicitly needed:

Const ADS_SCOPE_SUBTREE = 2  ' well-known ADSI search-scope constant

Set objConnection = CreateObject("ADODB.Connection")
Set objCommand = CreateObject("ADODB.Command")
objConnection.Provider = "ADsDSOObject"
objConnection.Open "Active Directory Provider"
Set objCommand.ActiveConnection = objConnection
objCommand.CommandText = "SELECT Name FROM 'LDAP://DC=mydomain,DC=com' WHERE objectClass = 'Computer'"
objCommand.Properties("Page Size") = 1000
objCommand.Properties("Searchscope") = ADS_SCOPE_SUBTREE
Set objRecordSet = objCommand.Execute
I tried to look at the source code of the libraries but was not able to understand what is being done.
In the second case, it's using the credentials of the account running the program, or it could even be using the computer account (every computer joined to the domain has an account on the domain, with a password that no person ever sees).
Python's ldap3 package doesn't do that automatically; however, there appears to be a way to make it work without specifying credentials, using Kerberos authentication. For example, from this issue:
I know that, for GSSAPI and GSS-SPNEGO, if you specify "authentication=SASL, sasl_mechanism=GSSAPI" (or spnego as needed) in your connection, then you don't need to specify user/password at all.
And there's also this StackOverflow question on the same topic: Passwordless Python LDAP3 authentication from Windows client
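A minimal sketch of such a passwordless bind with ldap3, assuming the gssapi package is installed and the logged-in user already holds a Kerberos ticket (the server name is a placeholder; Kerberos needs a real host name, not an IP):

from ldap3 import Server, Connection, SASL, KERBEROS

server = Server('dc1.mydomain.com', use_ssl=True)  # placeholder DC host name
conn = Connection(server, authentication=SASL, sasl_mechanism=KERBEROS, auto_bind=True)
print(conn.extend.standard.who_am_i())             # shows which identity was used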

Setting Python KafkaProducer sasl mechanism property

The SASL mechanism we are using is SCRAM-SHA-256, but the Kafka producer will only accept PLAIN, GSSAPI, or OAUTHBEARER as sasl_mechanism.
The following config gives the error:
sasl_mechanism must be in PLAIN, GSSAPI, OAUTHBEARER
Config:

ssl_produce = KafkaProducer(bootstrap_servers='brokerCName:9093',
                            security_protocol='SASL_SSL',
                            ssl_cafile='pemfilename.pem',
                            sasl_mechanism='SCRAM-SHA-256',
                            sasl_plain_username='password',
                            sasl_plain_password='secret')
I need to know how I can specify the correct SASL mechanism.
Thanks
Updated answer for kafka-python v2.0.0+
Since 2.0.0, kafka-python supports both SCRAM-SHA-256 and SCRAM-SHA-512.
Previous answer for older versions of kafka-python
As far as I understand, you are using the kafka-python client. From the source code, I can see that sasl_mechanism='SCRAM-SHA-256' is not a valid option:
"""
...
sasl_mechanism (str): Authentication mechanism when security_protocol
is configured for SASL_PLAINTEXT or SASL_SSL. Valid values are:
PLAIN, GSSAPI, OAUTHBEARER.
...
"""
if self.config['security_protocol'] in ('SASL_PLAINTEXT', 'SASL_SSL'):
assert self.config['sasl_mechanism'] in self.SASL_MECHANISMS, (
'sasl_mechanism must be in ' + ', '.join(self.SASL_MECHANISMS))
One quick workaround is to use the confluent-kafka client, which supports sasl_mechanism='SCRAM-SHA-256':
from confluent_kafka import Producer

# See https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md
conf = {
    'bootstrap.servers': 'localhost:9092',
    'security.protocol': 'SASL_SSL',
    'sasl.mechanisms': 'SCRAM-SHA-256',
    'sasl.username': 'yourUsername',
    'sasl.password': 'yourPassword',
    # any other config you like ..
}

p = Producer(conf)  # the config dict is passed as a single positional argument
# Rest of your code goes here..
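For completeness, a hedged sketch of actually producing with that client (topic name and payload are placeholders):

def delivery_report(err, msg):
    # Called once per message to report delivery success or failure
    if err is not None:
        print('Delivery failed: {}'.format(err))
    else:
        print('Delivered to {} [{}]'.format(msg.topic(), msg.partition()))

p.produce('my-topic', b'some message', callback=delivery_report)
p.flush()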
(As noted above, kafka-python itself also supports SCRAM-SHA-256 and SCRAM-SHA-512 as of version 2.0.0.)
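With that version, a minimal sketch of the original config should work once the mechanism is accepted (broker, CA file, and credentials are placeholders from the question):

from kafka import KafkaProducer

ssl_produce = KafkaProducer(bootstrap_servers='brokerCName:9093',
                            security_protocol='SASL_SSL',
                            ssl_cafile='pemfilename.pem',
                            sasl_mechanism='SCRAM-SHA-256',
                            sasl_plain_username='yourUsername',
                            sasl_plain_password='yourPassword')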

Google Admin Directory API - Send a query via apiclient

I am retrieving a ChromeOS device MAC address via the Google Admin Directory API, using the device's Serial Number as a reference, and am making my calls through apiclient.
service = discovery.build('admin', 'directory_v1', developerKey=settings.API_KEY)
Here are the calls available for ChromeOS devices; my issue is that I require a Device ID in order to execute the following:
service.chromeosdevices().get(customerId=settings.CID, deviceId=obtained_id, projection=None).execute()
I can send a GET query via the following format:
"https://www.googleapis.com/admin/directory/v1/customer/my_customer/devices/chromeos?projection=full&query=id:" + serial + "&orderBy=status&sortOrder=ascending&maxResults=10"
... but I'm trying to avoid using OAuth2 and just use my API key. Passing the key in a GET request doesn't work either, as it still returns a "Login Required" notice.
How do I squeeze the above query into an apiclient-friendly format? The only option I found via the above calls was to request every device we have (via list), then sift through the mountain of data for the matching Serial number, which seems silly and excessive.
I did notice I could call apiclient.http.HttpRequests, but I couldn't find a way to pass the API key through it either. There's new_batch_http_request, but I can't discern from the docs how to simply pass a URL to it.
Thank you!
Got it!
You can't use just an API key for Directory API queries; you need a service account.
I'm using google-auth (see here) since oauth2client is deprecated.
You also need to:
Delegate the necessary permissions for your service account (mine has the role of Viewer and has scope access to https://www.googleapis.com/auth/admin.directory.device.chromeos.readonly)
Delegate API access to it separately in the Admin Console (Security -> Advanced Settings -> Authentication)
Get your json client secret key and place it with your app (don't include it in your VCS)
Obtain your credentials like this:
from google.oauth2 import service_account

credentials = service_account.Credentials.from_service_account_file(
    settings.CLIENT_KEY,
    scopes=settings.SCOPES,
    subject=settings.ADMIN_USER)
where ADMIN_USER is the email address of an authorized Domain admin.
Then you send a GET request like so:
from google.auth.transport.requests import AuthorizedSession

authed_session = AuthorizedSession(credentials)
response = authed_session.get(request_id_url)
This returns a requests Response object whose body you can read via response.content.
Hope it helps someone else!
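If you would rather keep the apiclient/discovery interface from the question, the same service-account credentials can be passed to discovery.build. A hedged sketch follows (key file, scope, and admin address are placeholders, and serial is the variable from the question):

from google.oauth2 import service_account
from googleapiclient import discovery

credentials = service_account.Credentials.from_service_account_file(
    'client_secret.json',  # placeholder key file
    scopes=['https://www.googleapis.com/auth/admin.directory.device.chromeos.readonly'],
    subject='admin@mydomain.com')  # placeholder admin user

service = discovery.build('admin', 'directory_v1', credentials=credentials)

# Filter server-side by serial number instead of listing every device
result = service.chromeosdevices().list(customerId='my_customer',
                                        query='id:' + serial,
                                        projection='FULL').execute()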

What is "principal" argument of kerbros-sspi?

I was trying to connect to a remote machine via WinRM in Python (pywinrm) using a domain account, following the instructions in
How to connect to remote machine via WinRM in Python (pywinrm) using domain account?
using
session = winrm.Session(server, auth=('user@DOMAIN', 'doesNotMatterBecauseYouAreUsingAKerbTicket'), transport='kerberos')
but I got this:
NotImplementedError("Can't use 'principal' argument with kerberos-sspi.")
I googled "principal argument" and only found its meaning in mathematics, in complex analysis (https://en.m.wikipedia.org/wiki/Argument_(complex_analysis)), which is definitely not the right meaning here. I'm not a native English speaker and I got stuck at this point.
The original code is here:
https://github.com/requests/requests-kerberos/blob/master/requests_kerberos/kerberos_.py
def generate_request_header(self, response, host, is_preemptive=False):
    """
    Generates the GSSAPI authentication token with kerberos.
    If any GSSAPI step fails, raise KerberosExchangeError
    with failure detail.
    """
    # Flags used by kerberos module.
    gssflags = kerberos.GSS_C_MUTUAL_FLAG | kerberos.GSS_C_SEQUENCE_FLAG
    if self.delegate:
        gssflags |= kerberos.GSS_C_DELEG_FLAG
    try:
        kerb_stage = "authGSSClientInit()"
        # contexts still need to be stored by host, but hostname_override
        # allows use of an arbitrary hostname for the kerberos exchange
        # (eg, in cases of aliased hosts, internal vs external, CNAMEs
        # w/ name-based HTTP hosting)
        kerb_host = self.hostname_override if self.hostname_override is not None else host
        kerb_spn = "{0}@{1}".format(self.service, kerb_host)

        kwargs = {}
        # kerberos-sspi: Never pass principal. Raise if user tries to specify one.
        if not self._using_kerberos_sspi:
            kwargs['principal'] = self.principal
        elif self.principal:
            raise NotImplementedError("Can't use 'principal' argument with kerberos-sspi.")
Any help will be greatly appreciated.

Can I call the Bluemix message hub service from Python?

The kafka-python client supports Kafka 0.9 but doesn't obviously include the new authentication and encryption features, so my guess is that it only works with open servers (as in previous releases). In any case, even the Java client needs a special Message Hub login module to connect (or so it would seem from the example), which suggests that nothing will work unless there is a similar module available for Python.
My specific scenario is that I want to use the message hub service from a Jupyter notebook also hosted in Bluemix (the Apache Spark service).
I was able to connect using the kafka-python library:
$ pip install --user kafka-python
Then ...
from kafka import KafkaProducer
from kafka.errors import KafkaError
import logging
import ssl

log = logging.getLogger(__name__)

############################################
# Service credentials from Bluemix UI:
############################################
bootstrap_servers = # kafka_brokers_sasl
sasl_plain_username = # user
sasl_plain_password = # password
############################################

sasl_mechanism = 'PLAIN'
security_protocol = 'SASL_SSL'

# Create a new context using system defaults, disable all but TLS1.2
context = ssl.create_default_context()
context.options |= ssl.OP_NO_TLSv1    # |= sets the flag; &= would have cleared it
context.options |= ssl.OP_NO_TLSv1_1

producer = KafkaProducer(bootstrap_servers=bootstrap_servers,
                         sasl_plain_username=sasl_plain_username,
                         sasl_plain_password=sasl_plain_password,
                         security_protocol=security_protocol,
                         ssl_context=context,
                         sasl_mechanism=sasl_mechanism,
                         api_version=(0, 10))

# Asynchronous by default
future = producer.send('my-topic', b'raw_bytes')

# Block for 'synchronous' sends
try:
    record_metadata = future.get(timeout=10)
except KafkaError:
    # Decide what to do if the produce request failed...
    log.exception('produce failed')
else:
    # Successful result returns assigned partition and offset
    print(record_metadata.topic)
    print(record_metadata.partition)
    print(record_metadata.offset)
This worked for me from the Bluemix Spark service in a Jupyter notebook; note, however, that this approach does not use Spark itself. The code just runs on the driver host.
SASL support in the kafka-python client has been requested (https://github.com/dpkp/kafka-python/issues/533), but until the username/password login method used by Message Hub is supported, it won't work.
Until this is natively supported by the Bluemix Apache Spark Service, you can follow the same approach as the Realtime Sentiment Analysis project. Helper code for this can be found on the cds labs spark samples github repo.
We've added some text to our documentation on non-Java language support - see the "CONNECTING AND AUTHENTICATING IN A NON-JAVA APPLICATION" section:
https://www.ng.bluemix.net/docs/services/MessageHub/index.html
Our current authentication method is non-standard and not supported by the Apache project; it was intended as a temporary solution. The Message Hub team is working with the Apache Kafka community to develop KIP-43. Once this is finalised, we'll change the Message Hub authentication implementation to match, and it will be possible to implement clients to that spec in any language.
