The kafka-python client supports Kafka 0.9, but it doesn't appear to include the new authentication and encryption features, so my guess is that it only works with open servers (as in previous releases). In any case, even the Java client needs a special Message Hub login module to connect (or so it would seem from the example), which suggests that nothing will work unless a similar module is available for Python.
My specific scenario is that I want to use the message hub service from a Jupyter notebook also hosted in Bluemix (the Apache Spark service).
I was able to connect using the kafka-python library:
$ pip install --user kafka-python
Then ...
from kafka import KafkaProducer
from kafka.errors import KafkaError
import logging
import ssl

log = logging.getLogger(__name__)

############################################
# Service credentials from Bluemix UI:
############################################
bootstrap_servers = # kafka_brokers_sasl
sasl_plain_username = # user
sasl_plain_password = # password
############################################

sasl_mechanism = 'PLAIN'
security_protocol = 'SASL_SSL'

# Create a new context using system defaults, disable all but TLS1.2
context = ssl.create_default_context()
context.options |= ssl.OP_NO_TLSv1
context.options |= ssl.OP_NO_TLSv1_1

producer = KafkaProducer(bootstrap_servers=bootstrap_servers,
                         sasl_plain_username=sasl_plain_username,
                         sasl_plain_password=sasl_plain_password,
                         security_protocol=security_protocol,
                         ssl_context=context,
                         sasl_mechanism=sasl_mechanism,
                         api_version=(0, 10))

# Asynchronous by default
future = producer.send('my-topic', b'raw_bytes')

# Block for 'synchronous' sends
try:
    record_metadata = future.get(timeout=10)
except KafkaError:
    # Decide what to do if the produce request failed...
    log.exception('Failed to send message')
else:
    # Successful result returns assigned partition and offset
    print(record_metadata.topic)
    print(record_metadata.partition)
    print(record_metadata.offset)
This worked for me from the Bluemix Apache Spark service in a Jupyter notebook; however, note that this approach does not use Spark itself. The code just runs on the driver host.
SASL support in the kafka-python client has been requested (https://github.com/dpkp/kafka-python/issues/533), but until the username/password login method used by Message Hub is supported, it won't work.
Until this is natively supported by the Bluemix Apache Spark service, you can follow the same approach as the Realtime Sentiment Analysis project. Helper code for this can be found in the cds labs spark samples GitHub repo.
We've added some text to our documentation on non-Java language support - see the "CONNECTING AND AUTHENTICATING IN A NON-JAVA APPLICATION" section:
https://www.ng.bluemix.net/docs/services/MessageHub/index.html
Our current authentication method is non-standard and not supported by the Apache project; it was intended as a temporary solution. The Message Hub team is working with the Apache Kafka community to develop KIP-43. Once this is finalised, we'll change the Message Hub authentication implementation to match, and it will then be possible to implement clients to that spec in any language.
Related
I have the following code
conn_str = "HostName=my_host.azure-devices.net;DeviceId=MY_DEVICE;SharedAccessKey=MY_KEY"
device_conn = IoTHubDeviceClient.create_from_connection_string(conn_str)
await device_conn.connect()
This works fine, but only because I've manually retrieved the connection string from the IoT hub and pasted it into the code. We are going to have hundreds of these devices, so is there a way to retrieve the connection string programmatically?
It would be the equivalent of the following:
az iot hub device-identity connection-string show --device-id MY_DEVICE --hub-name MY_HUB --subscription ABCD1234
How do I do this?
The device id and key are what you give to each device, and you choose where to store them and how to load them. The connection string is just a convenience for getting started; it has no special meaning at the technical level.
You can use create_from_symmetric_key(symmetric_key, hostname, device_id, **kwargs) to pass the key, device id, and hub URI directly to the SDK.
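As a rough sketch (assuming the azure-iot-device SDK and using placeholder values for the hub hostname, device id, and key), the call could look like this:
from azure.iot.device import IoTHubDeviceClient

# Placeholder values: substitute your hub hostname, device id, and device key.
HUB_HOST = "YOURHOST.azure-devices.net"
DEVICE_ID = "MY_DEVICE"
DEVICE_KEY = "MY_KEY"

# Build the client directly from the symmetric key, no connection string required.
client = IoTHubDeviceClient.create_from_symmetric_key(
    symmetric_key=DEVICE_KEY,
    hostname=HUB_HOST,
    device_id=DEVICE_ID)
client.connect()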
I found it's not possible to retrieve the actual connection string, but a connection string can be built from the device primary key:
from azure.iot.hub import IoTHubRegistryManager
from azure.iot.device import IoTHubDeviceClient

# HUB_HOST is YOURHOST.azure-devices.net
# SHARED_ACCESS_KEY is from the registryReadWrite connection string
reg_str = "HostName={0};SharedAccessKeyName=registryReadWrite;SharedAccessKey={1}".format(
    HUB_HOST, SHARED_ACCESS_KEY)

device = IoTHubRegistryManager(reg_str).get_device("MY_DEVICE_ID")
device_key = device.authentication.symmetric_key.primary_key

conn_str = "HostName={0};DeviceId={1};SharedAccessKey={2}".format(
    HUB_HOST, "MY_DEVICE_ID", device_key)
client = IoTHubDeviceClient.create_from_connection_string(conn_str)
client.connect()

# Remaining code here...
Other options you could consider include:
Use the Device Provisioning Service to manage provisioning and connecting your devices to your IoT hub. You won't need to generate connection strings manually in this case (see the sketch after this list).
Use X.509 certificates (recommended for production environments instead of SAS). Each device has an X.509 cert derived from the root cert in your hub. See: https://learn.microsoft.com/azure/iot-hub/tutorial-x509-introduction
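As an illustration of the Device Provisioning Service option above, here is a minimal sketch assuming the azure-iot-device SDK, symmetric-key attestation, and placeholder values for the ID scope, registration id, and key:
from azure.iot.device import ProvisioningDeviceClient, IoTHubDeviceClient

# Placeholder values: substitute your DPS ID scope, registration id, and device key.
ID_SCOPE = "0ne00000000"
REGISTRATION_ID = "my-device-001"
SYMMETRIC_KEY = "DEVICE_SYMMETRIC_KEY"

# Register the device with DPS; the result reports which IoT hub it was assigned to.
provisioning_client = ProvisioningDeviceClient.create_from_symmetric_key(
    provisioning_host="global.azure-devices-provisioning.net",
    registration_id=REGISTRATION_ID,
    id_scope=ID_SCOPE,
    symmetric_key=SYMMETRIC_KEY)
registration_result = provisioning_client.register()

# Connect to the assigned hub with the same symmetric key; no connection string needed.
device_client = IoTHubDeviceClient.create_from_symmetric_key(
    symmetric_key=SYMMETRIC_KEY,
    hostname=registration_result.registration_state.assigned_hub,
    device_id=registration_result.registration_state.device_id)
device_client.connect()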
I am new to Kafka and I'm trying to read messages from Kafka topics using a Python consumer. I am using the piece of code below to read the messages.
from kafka import KafkaConsumer

topic = 'topic'
bootstrap_servers = 'server'

consumer = KafkaConsumer(topic,
                         bootstrap_servers=[bootstrap_servers],
                         auto_offset_reset='earliest',
                         enable_auto_commit=True,
                         security_protocol='SASL_PLAINTEXT',
                         sasl_mechanism='GSSAPI',
                         consumer_timeout_ms=1000)
When I run this, I get the error message 'Could not find KfW installation' and the client fails to connect to Kafka. After installing the Kerberos for Windows (KfW) MSI and rerunning, it is able to establish connectivity.
However, I am trying to avoid the KfW installation on the local system and instead find a way to pass the keytab file and principal to use in the authentication process and read the data from the Kafka topic (if that's possible).
But I am not sure which argument of KafkaConsumer takes the keytab file.
Please suggest a better way if one is available.
The SASL mechanism we are using is SCRAM-SHA-256, but the Kafka producer will only accept sasl_mechanism values of PLAIN, GSSAPI, OAUTHBEARER.
The following config will give the error
sasl_mechanism must be in PLAIN, GSSAPI, OAUTHBEARER
config
ssl_produce = KafkaProducer(bootstrap_servers='brokerCName:9093',
                            security_protocol='SASL_SSL',
                            ssl_cafile='pemfilename.pem',
                            sasl_mechanism='SCRAM-SHA-256',
                            sasl_plain_username='password',
                            sasl_plain_password='secret')
I need to know how I can specify the correct SASL mechanism.
Thanks
Updated answer for kafka-python v2.0.0+
Since 2.0.0, kafka-python supports both SCRAM-SHA-256 and SCRAM-SHA-512.
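For example, a SCRAM-SHA-256 producer with kafka-python 2.0.0+ can be configured roughly as below (the broker address, CA file, and credentials are placeholders):
from kafka import KafkaProducer

producer = KafkaProducer(bootstrap_servers='brokerCName:9093',
                         security_protocol='SASL_SSL',
                         ssl_cafile='pemfilename.pem',
                         sasl_mechanism='SCRAM-SHA-256',
                         sasl_plain_username='yourUsername',
                         sasl_plain_password='yourPassword')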
Previous answer for older versions of kafka-python
As far as I understand, you are using the kafka-python client. From the source code, I can see that sasl_mechanism='SCRAM-SHA-256' is not a valid option:
"""
...
sasl_mechanism (str): Authentication mechanism when security_protocol
is configured for SASL_PLAINTEXT or SASL_SSL. Valid values are:
PLAIN, GSSAPI, OAUTHBEARER.
...
"""
if self.config['security_protocol'] in ('SASL_PLAINTEXT', 'SASL_SSL'):
assert self.config['sasl_mechanism'] in self.SASL_MECHANISMS, (
'sasl_mechanism must be in ' + ', '.join(self.SASL_MECHANISMS))
One quick workaround is to use the confluent-kafka client, which supports sasl_mechanism='SCRAM-SHA-256':
from confluent_kafka import Producer

# See https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md
conf = {
    'bootstrap.servers': 'localhost:9092',
    'security.protocol': 'SASL_SSL',
    'sasl.mechanisms': 'SCRAM-SHA-256',
    'sasl.username': 'yourUsername',
    'sasl.password': 'yourPassword',
    # any other config you like ..
}

p = Producer(**conf)

# Rest of your code goes here..
kafka-python supports both SCRAM-SHA-256 and SCRAM-SHA-512 in version 2.0.0.
I have read Kafka's documentation, but I did not understand it. Can I use a username and password for Python producers?
Can I specify that a given producer can only produce to a particular topic, similar to MySQL grants? (The producer is written in Python.)
Yes, you can have a username/password per topic. See the official documentation on Authorization and ACLs.
You can enable security with either SSL or SASL. Kafka's SASL support includes:
SASL/GSSAPI (Kerberos) - starting at version 0.9.0.0
SASL/PLAIN - starting at version 0.10.0.0
SASL/SCRAM-SHA-256 and SASL/SCRAM-SHA-512 - starting at version 0.10.2.0
From the docs, an example of adding ACLs:
Suppose you want to add an acl "Principals User:Bob and User:Alice are allowed to perform Operation Read and Write on Topic Test-Topic from IP 198.51.100.0 and IP 198.51.100.1". You can do that by executing the CLI with following options:
bin/kafka-acls.sh --authorizer-properties zookeeper.connect=localhost:2181 --add --allow-principal User:Bob --allow-principal User:Alice --allow-host 198.51.100.0 --allow-host 198.51.100.1 --operation Read --operation Write --topic Test-topic
You can also find some information in this blog post.
I'm not sure which library you are using, but it should just be a matter of passing the proper properties to the producer/client; kafka-python has support:
Support for SASL/Kerberos
Support for ACL-based Kafka
If you want to use username+password for authentication, you need to enable SASL authentication using the Plain mechanism on your cluster. See the Authentication using SASL section on the Kafka website for the full instructions.
Note that you also probably want to enable SSL (SASL_SSL), as otherwise, SASL Plain would transmit credentials in plaintext.
Several Python clients support SASL Plain, for example:
kafka-python: https://github.com/dpkp/kafka-python
confluent-kafka-python: https://github.com/confluentinc/confluent-kafka-python
Regarding authorizations, using the default authorizer, kafka.security.auth.SimpleAclAuthorizer, you can restrict a producer to only be able to produce to a topic. Again this is all fully documented on Kafka's website in the Authorization and ACLs section.
For example with SASL Plain, by default, the Principal name is the username that was used to connect. Using the following command you can restrict user Alice to only be able to produce to the topic named testtopic:
bin/kafka-acls.sh --authorizer-properties zookeeper.connect=localhost:2181 --add --allow-principal User:Alice --producer --topic testtopic
Do you mean something like this:
topic = "test"
sasl_mechanism = "PLAIN"
username = "admin"
password = "pwd$"
security_protocol = "SASL_PLAINTEXT"
#context = ssl.create_default_context()
#context.options &= ssl.OP_NO_TLSv1
#context.options &= ssl.OP_NO_TLSv1_1
consumer = KafkaConsumer(topic, bootstrap_servers='kafka1:9092',
#api_version=(0, 10),
security_protocol=security_protocol,
#ssl_context=context,
#ssl_check_hostname=True,
#ssl_cafile='../keys/CARoot.pem',
sasl_mechanism = sasl_mechanism,
sasl_plain_username = username,
sasl_plain_password = password)
#ssl_certfile='../keys/certificate.pem',
#ssl_keyfile='../keys/key.pem')#,api_version = (0, 10))
I was trying to connect to a remote machine via WinRM in Python (pywinrm) using a domain account, following the instructions in
How to connect to remote machine via WinRM in Python (pywinrm) using domain account?
using
session = winrm.Session(server, auth=('user@DOMAIN', 'doesNotMatterBecauseYouAreUsingAKerbTicket'), transport='kerberos')
but I got this:
NotImplementedError("Can't use 'principal' argument with kerberos-sspi.")
I googled "principal argument" and I got its meaning in mathematics,which is in complex_analysis (https://en.m.wikipedia.org/wiki/Argument_(complex_analysis)) and definitely not the right meaning. I'm not a native English speaker and I got stuck here.
The original code is here:
https://github.com/requests/requests-kerberos/blob/master/requests_kerberos/kerberos_.py
def generate_request_header(self, response, host, is_preemptive=False):
    """
    Generates the GSSAPI authentication token with kerberos.

    If any GSSAPI step fails, raise KerberosExchangeError
    with failure detail.
    """

    # Flags used by kerberos module.
    gssflags = kerberos.GSS_C_MUTUAL_FLAG | kerberos.GSS_C_SEQUENCE_FLAG
    if self.delegate:
        gssflags |= kerberos.GSS_C_DELEG_FLAG

    try:
        kerb_stage = "authGSSClientInit()"
        # contexts still need to be stored by host, but hostname_override
        # allows use of an arbitrary hostname for the kerberos exchange
        # (eg, in cases of aliased hosts, internal vs external, CNAMEs
        # w/ name-based HTTP hosting)
        kerb_host = self.hostname_override if self.hostname_override is not None else host
        kerb_spn = "{0}@{1}".format(self.service, kerb_host)

        kwargs = {}
        # kerberos-sspi: Never pass principal. Raise if user tries to specify one.
        if not self._using_kerberos_sspi:
            kwargs['principal'] = self.principal
        elif self.principal:
            raise NotImplementedError("Can't use 'principal' argument with kerberos-sspi.")
Any help will be greatly appreciated.