I am trying to send a file over a Tornado WebSocket like this:
import base64

in_file = open("/home/rootkit/Pictures/test.png", "rb")
data = in_file.read()
in_file.close()
d = {'file': base64.b64encode(data), 'filename': 'test.png'}
self.ws.write_message(message=d)
as per the Tornado documentation:
The message may be either a string or a dict (which will be encoded as json). If the binary argument is false, the message will be sent as utf8; in binary mode any byte string is allowed.
But I am getting this exception.
ERROR:asyncio:Future exception was never retrieved
future: <Future finished exception=TypeError("Expected bytes, unicode, or None; got <class 'dict'>",)>
Traceback (most recent call last):
File "/home/rootkit/.local/lib/python3.5/site-packages/tornado/gen.py", line 1147, in run
yielded = self.gen.send(value)
File "/home/rootkit/PycharmProjects/socketserver/WebSocketClient.py", line 42, in run
self.ws.write_message(message=d, binary=True)
File "/home/rootkit/.local/lib/python3.5/site-packages/tornado/websocket.py", line 1213, in write_message
return self.protocol.write_message(message, binary=binary)
File "/home/rootkit/.local/lib/python3.5/site-packages/tornado/websocket.py", line 854, in write_message
message = tornado.escape.utf8(message)
File "/home/rootkit/.local/lib/python3.5/site-packages/tornado/escape.py", line 197, in utf8
"Expected bytes, unicode, or None; got %r" % type(value)
TypeError: Expected bytes, unicode, or None; got <class 'dict'>
The documentation you're citing is for WebSocketHandler, which is meant for serving a WebSocket connection, whereas you're using a WebSocket client. You'll have to manually convert your dictionary to JSON.
from tornado.escape import json_encode
self.ws.write_message(message=json_encode(d))
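One caveat: on Python 3, base64.b64encode() returns bytes, which json_encode() cannot serialize, so decode it to str first. A minimal end-to-end sketch (self.ws is assumed to be the client connection from the question):
import base64

from tornado.escape import json_encode

with open("/home/rootkit/Pictures/test.png", "rb") as in_file:
    data = in_file.read()

d = {
    # bytes -> str, since JSON cannot encode raw bytes
    'file': base64.b64encode(data).decode('ascii'),
    'filename': 'test.png',
}
self.ws.write_message(message=json_encode(d))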
In my Python application, I make a query to a Cassandra database. I'm trying to implement pagination with the cassandra-driver package. As you can see from the code below, paging_state is of the bytes data type. I can convert this value to a string and send the value of the str_paging_state variable to the client. If the client sends str_paging_state back to me, I want to use it in my query.
This part of the code works:
query = "select * from users where user_type = 'clients';"
statement = SimpleStatement(query, fetch_size=10)
results = session.execute(statement)
paging_state = results.paging_state
print(type(paging_state)) # <class 'bytes'>
str_paging_state = str(paging_state)
print(str_paging_state) # "b'\\x00C\\x00\\x00\\x00\\x02\\x00\\x00\\x00\\x03_hk\\x00\\x00\\x00\\x11P]5C#\\x8bGD~\\x8b\\xc7g\\xda\\xe5rH\\xb0\\x00\\x00\\x00\\x03_rk\\x00\\x00\\x00\\x18\\xee\\x14\\xf7\\x83\\x84\\x00tTmw[\\x00\\xec\\xdb\\x9b\\xa9\\xfd\\x00\\xb9\\xff\\xff\\xff\\xff\\xfe\\x01\\x00'"
This part of the code raises an error:
results = session.execute(
statement,
paging_state=bytes(str_paging_state.encode())
)
Error:
[ERROR] NoHostAvailable: ('Unable to complete the operation against any hosts')
Traceback (most recent call last):
File "/var/task/lambda_function.py", line 51, in lambda_handler
results = cassandra_connection.execute(statement, paging_state=bytes(paging_state.encode()))
File "/opt/python/lib/python3.8/site-packages/cassandra/cluster.py", line 2618, in execute
return self.execute_async(query, parameters, trace, custom_payload, timeout, execution_profile, paging_state, host, execute_as).result()
File "/opt/python/lib/python3.8/site-packages/cassandra/cluster.py", line 4877, in result
raise self._final_exception
END RequestId: 4b7bf588-a2d2-45e5-ad7e-8611f1704313
In the Java driver documentation I found the .fromString method, which creates a PagingState object from a string previously generated with toString(). Unfortunately, I couldn't find an equivalent method in Python.
I also tried using the codecs package to decode and encode paging_state.
str_paging_state = codecs.decode(paging_state, encoding='utf-8', errors='ignore')
# "\u0000C\u0000\u0000\u0000\u0002\u0000\u0000\u0000\u0003_hk\u0000\u0000\u0000\u0011P]5C#GD~grH\u0000\u0000\u0000\u0003_rk\u0000\u0000\u0000\u0018\u0014\u0000tTmw[\u0000ۛ\u0000\u0001\u0000"
# Raise error
results = session.execute(statement, paging_state=codecs.encode(str_paging_state, encoding='utf-8', errors='ignore'))
In this case I see the following error:
[ERROR] ProtocolException: <Error from server: code=000a [Protocol error] message="Invalid value for the paging state">
Traceback (most recent call last):
File "/var/task/lambda_function.py", line 50, in lambda_handler
results = cassandra_connection.execute(
File "/opt/python/lib/python3.8/site-packages/cassandra/cluster.py", line 2618, in execute
return self.execute_async(query, parameters, trace, custom_payload, timeout, execution_profile, paging_state, host, execute_as).result()
File "/opt/python/lib/python3.8/site-packages/cassandra/cluster.py", line 4877, in result
raise self._final_exception
END RequestId: 979f098a-a566-4904-821a-2ce06522d909
In my case, protocol version is 4.
cluster = Cluster(..., protocol_version=4)
I would appreciate any help!
Just convert the binary data into a hex string or base64; the binascii module has functions for both. For hex, use hexlify/unhexlify (or, in Python 3, the .hex() method on bytes and bytes.fromhex()); for base64, use b2a_base64/a2b_base64.
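For example, a minimal round-trip sketch using hex (statement and session are assumed to be the objects from the question):
# Server side: turn the opaque bytes into a hex string for the client.
results = session.execute(statement)
paging_state = results.paging_state        # bytes
str_paging_state = paging_state.hex()      # or binascii.hexlify(paging_state)

# Later, when the client sends the string back: hex -> bytes round-trips
# losslessly, unlike str()/encode(), which mangles the raw bytes.
results = session.execute(
    statement,
    paging_state=bytes.fromhex(str_paging_state)
)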
I am trying to run a Python script from
Google Cloud Natural Language API Python Samples
https://github.com/GoogleCloudPlatform/python-docs-samples/tree/master/language/cloud-client/v1beta2/snippets.py
I have not made any modifications, so I expected it would just work.
Specifically, I want to run entity analysis on a text file/document. The relevant part of the code is below.
from google.cloud import language_v1beta2
from google.cloud.language_v1beta2 import enums
from google.cloud.language_v1beta2 import types


def entities_file(gcs_uri):
    """Detects entities in the file located in Google Cloud Storage."""
    client = language_v1beta2.LanguageServiceClient()

    # Instantiates a plain text document.
    document = types.Document(
        gcs_content_uri=gcs_uri,
        type=enums.Document.Type.PLAIN_TEXT)

    # Detects entities in the document. You can also analyze HTML with:
    #   document.type == enums.Document.Type.HTML
    entities = client.analyze_entities(document).entities

    # entity types from enums.Entity.Type
    entity_type = ('UNKNOWN', 'PERSON', 'LOCATION', 'ORGANIZATION',
                   'EVENT', 'WORK_OF_ART', 'CONSUMER_GOOD', 'OTHER')

    for entity in entities:
        print('=' * 20)
        print(u'{:<16}: {}'.format('name', entity.name))
        print(u'{:<16}: {}'.format('type', entity_type[entity.type]))
        print(u'{:<16}: {}'.format('metadata', entity.metadata))
        print(u'{:<16}: {}'.format('salience', entity.salience))
        print(u'{:<16}: {}'.format('wikipedia_url',
                                   entity.metadata.get('wikipedia_url', '-')))
I have put my text file (UTF-8 encoded) on Cloud Storage at
gs://neotokyo-cloud-bucket/TXT/TTS-01.txt
I am running the script in Google Cloud Shell, and when I run the file:
python snippets.py entities-file gs://neotokyo-cloud-bucket/TXT/TTS-01.txt
I get the following error, which appears to be protobuf related.
[libprotobuf ERROR google/protobuf/wire_format_lite.cc:629]
String field 'google.cloud.language.v1beta2.TextSpan.content'
contains invalid UTF-8 data when parsing a protocol buffer.
Use the 'bytes' type if you intend to send raw bytes.
ERROR:root:Exception deserializing message!
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/grpc/_common.py", line 87, in _transform
return transformer(message)
DecodeError: Error parsing message
Traceback (most recent call last):
File "snippets.py", line 336, in <module>
entities_file(args.gcs_uri)
File "snippets.py", line 114, in entities_file
entities = client.analyze_entities(document).entities
File "/usr/local/lib/python2.7/dist- packages/google/cloud/language_v1beta2/gapic/language_service_client.py", line 226, in analyze_entities
return self._analyze_entities(request, retry=retry, timeout=timeout)
File "/usr/local/lib/python2.7/dist-packages/google/api_core/gapic_v1/method.py", line 139, in __call__
return wrapped_func(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/google/api_core/retry.py", line 260, in retry_wrapped_func
on_error=on_error,
File "/usr/local/lib/python2.7/dist-packages/google/api_core/retry.py", line 177, in retry_target
return target()
File "/usr/local/lib/python2.7/dist-packages/google/api_core/timeout.py", line 206, in func_with_timeout
return func(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/google/api_core/grpc_helpers.py", line 56, in error_remapped_callable
six.raise_from(exceptions.from_grpc_error(exc), exc)
File "/usr/local/lib/python2.7/dist-packages/six.py", line 737, in raise_from
raise value
google.api_core.exceptions.InternalServerError: 500 Exception deserializing response!
I do not know protobuf, so any help is appreciated!
Where is your text file from?
Python's ParseFromString/SerializeToString work with bytes. Try converting your text file to bytes before parsing.
It looks like your file starts with a byte order mark (utf-8-sig). Try converting your content to standard UTF-8 before calling the client.
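For example, a minimal sketch of re-encoding the file without the BOM before uploading it (the local filename is hypothetical; the utf-8-sig codec consumes a leading BOM on read):
import io

# Read with utf-8-sig: a leading BOM, if present, is stripped here.
with io.open('TTS-01.txt', 'r', encoding='utf-8-sig') as f:
    content = f.read()

# Write back as plain UTF-8, i.e. without a BOM.
with io.open('TTS-01.txt', 'w', encoding='utf-8') as f:
    f.write(content)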
I have a Kafka topic that is receiving binary data (raw packet capture data). I can confirm that data is indeed landing using the Kafka CLI tools; I receive multiple messages each second.
kafka-console-consumer.sh --zookeeper svr:2181 --topic test
But when I use kafka-python, I cannot retrieve any messages. The poll method simply returns no results.
(Pdb) consumer = kafka.KafkaConsumer("test", bootstrap_servers=["svr:9092"])
(Pdb) consumer.poll(5000)
{}
I have been able to use kafka-python to pull messages from a separate topic that contains just text strings.
I am curious if somehow internally kafka-python is dropping the messages because they are binary and failing some sort of validation. How can I dig deeper and see why no messages can be retrieved?
The problem was that the data sent to the topic was using snappy compression. All I had to do was install an additional module to handle snappy.
pip install python-snappy
Unfortunately, using the code I outlined in the question, it simply returns no data rather than telling me that the issue is related to compression.
For comparison, I used the older consumer API, which does correctly report the problem and led me to this solution.
>>> client = kafka.SimpleClient("svr:9092")
>>> consumer.close()
>>> consumer = kafka.SimpleConsumer(client, "group", "test")
>>> for message in consumer:
... print(message)
...
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.7/site-packages/kafka/consumer/simple.py", line 353, in __iter__
message = self.get_message(True, timeout)
File "/usr/lib/python2.7/site-packages/kafka/consumer/simple.py", line 305, in get_message
return self._get_message(block, timeout, get_partition_info)
File "/usr/lib/python2.7/site-packages/kafka/consumer/simple.py", line 320, in _get_message
self._fetch()
File "/usr/lib/python2.7/site-packages/kafka/consumer/simple.py", line 379, in _fetch
fail_on_error=False
File "/usr/lib/python2.7/site-packages/kafka/client.py", line 665, in send_fetch_request
KafkaProtocol.decode_fetch_response)
File "/usr/lib/python2.7/site-packages/kafka/client.py", line 295, in _send_broker_aware_request
for payload_response in decoder_fn(future.value):
File "/usr/lib/python2.7/site-packages/kafka/protocol/legacy.py", line 212, in decode_fetch_response
for partition, error, highwater_offset, messages in partitions
File "/usr/lib/python2.7/site-packages/kafka/protocol/legacy.py", line 219, in decode_message_set
inner_messages = message.decompress()
File "/usr/lib/python2.7/site-packages/kafka/protocol/message.py", line 121, in decompress
assert has_snappy(), 'Snappy decompression unsupported'
AssertionError: Snappy decompression unsupported
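For reference, once python-snappy is installed, the original KafkaConsumer code returns records as expected. A minimal sketch (broker and topic as in the question; has_snappy is the same check the traceback asserts on):
import kafka
from kafka.codec import has_snappy

assert has_snappy(), 'python-snappy is still missing'  # sanity check

consumer = kafka.KafkaConsumer("test", bootstrap_servers=["svr:9092"])
records = consumer.poll(5000)  # {TopicPartition: [ConsumerRecord, ...]}
for tp, messages in records.items():
    for message in messages:
        print(len(message.value))  # message.value holds the raw bytes payload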
I just started learning Python and Kafka.
This is the first example I tried to get started.
http://www.giantflyingsaucer.com/blog/?p=5541
And I got an exception:
Traceback (most recent call last):
File "producer.py", line 23, in <module>
main()
File "producer.py", line 18, in main
print_response(producer.send_messages(topic, msg))
File "D:\Setups\Python35-32\lib\site-packages\kafka\producer\simple.py", line 50, in send_messages
topic, partition, *msg
File "D:\Setups\Python35-32\lib\site-packages\kafka\producer\base.py", line 379, in send_messages
return self._send_messages(topic, partition, *msg)
File "D:\Setups\Python35-32\lib\site-packages\kafka\producer\base.py", line 396, in _send_messages
raise TypeError("all produce message payloads must be null or type bytes")
TypeError: all produce message payloads must be null or type bytes
I've searched on Google, but I'm not quite sure what the problem is.
Could anyone please give me some advice?
Thank you very much!
Here is my code:
from kafka import SimpleProducer, KafkaClient


def print_response(response=None):
    if response:
        print('Error: {0}'.format(response[0].error))
        print('Offset: {0}'.format(response[0].offset))


def main():
    kafka = KafkaClient("10.2.5.53:9092")
    producer = SimpleProducer(kafka)
    topic = 'test'
    msg = 'Hello World'
    print_response(producer.send_messages(topic, msg))
    kafka.close()


if __name__ == "__main__":
    main()
Looking at the example you posted, I just realized that strings must be byte strings, prefixed with a b character, as explained here. Also, you're missing a try-except clause that catches exceptions for when the server is not ready.
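A minimal sketch of both fixes, assuming the rest of your code stays as posted (the exception is caught broadly here, since the exact kafka-python exception class varies across versions):
def main():
    kafka = KafkaClient("10.2.5.53:9092")
    producer = SimpleProducer(kafka)
    topic = 'test'
    msg = b'Hello World'  # byte string; or 'Hello World'.encode('utf-8')
    try:
        print_response(producer.send_messages(topic, msg))
    except Exception as e:  # e.g. the server is not ready/reachable
        print('Failed to send message: {0}'.format(e))
    finally:
        kafka.close()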
I'm trying to take some HTTP proxies, append them to a list, and then test them individually by opening them with urllib, but I get the following TypeError. I have tried wrapping proxy with str() in the test function, but that returns another error.
import urllib

proxies = []

with open('working_proxies.txt', 'rb') as working_proxies:
    for proxy in working_proxies:
        proxy.rstrip()
        proxies.append(proxy)

def test(proxy):
    try:
        urllib.urlopen(
            "http://google.com",
            proxies={'http': proxy}
        )
    except IOError:
        print "Connection error! (Check proxy)"
    else:
        working_proxy = True

working_proxy = False

while working_proxy == False:
    myProxy = proxies.pop()
    test(myProxy)
My error:
Connection error! (Check proxy)
Traceback (most recent call last):
File "proxy_hand.py", line 26, in <module>
test(proxy)
File "proxy_hand.py", line 16, in test
proxies={'http': proxy}
File "/usr/lib/python2.7/urllib.py", line 87, in urlopen
return opener.open(url)
File "/usr/lib/python2.7/urllib.py", line 193, in open
urltype, proxyhost = splittype(proxy)
File "/usr/lib/python2.7/urllib.py", line 1074, in splittype
match = _typeprog.match(url)
TypeError: expected string or buffer
You opened the file with proxies as binary here:
with open('working_proxies.txt', 'rb') as working_proxies:
The b in the 'rb' mode string means you'll be reading binary, i.e. bytes objects.
Either open the file in text mode (and perhaps specify a codec other than your system default) or decode your bytes objects to str using an explicit bytes.decode() call:
proxies.append(proxy.decode('ascii'))
I'd expect ASCII to be sufficient to decode hostnames suitable to be used as proxies.
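Alternatively, a minimal sketch of the text-mode option; note that rstrip() returns a new string rather than modifying in place, so keep its result:
proxies = []
with open('working_proxies.txt') as working_proxies:  # text mode
    for proxy in working_proxies:
        proxies.append(proxy.rstrip())  # strip the trailing newline, keep str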
Note that your working_proxy flag won't work; it is not marked as global in test. Perhaps you want to catch the IOError exception outside of test instead, or move the loop into that function, as sketched below. You'll also need to figure out what to do when you run out of proxies (i.e. when none of them work).
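For instance, a hedged sketch of one such restructuring: have test() return a success flag and loop until a proxy works or the list is exhausted (Python 2, matching the question):
def test(proxy):
    try:
        urllib.urlopen("http://google.com", proxies={'http': proxy})
    except IOError:
        print "Connection error! (Check proxy)"
        return False
    return True

working_proxy = False
while not working_proxy and proxies:
    working_proxy = test(proxies.pop())

if not working_proxy:
    print "No working proxies left."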