Using python BOTO with AWS SQS, getting back nonsense characters - python

So, I am using Python and boto to access my AWS SQS queue. I have some messages in the queue, which I can see from the AWS dashboard. However, when I try to get these messages through Python, the characters that come through are just gibberish. Any idea what is going on here?
import boto.sqs
conn = boto.sqs.connect_to_region("us-east-1")
q = conn.get_queue('my-worker-queue')
print q
#read from message queue
message = q.read(60)
print message
print message.get_body()
Given the code above, I get the following:
Queue(https://queue.amazonaws.com/247124526695/my-worker-queue)
<boto.sqs.message.Message instance at 0x16f31b8>
??e??b?+??-
The text in the message queue is the following:
hello this is a test

The likely cause is a base64 decoding issue: boto base64-encodes message bodies when writing and decodes them when reading, so a message that was sent as plain text gets decoded as if it were base64. You can try the get_body_encoded method:
print message.get_body_encoded()
Another option is to switch the queue to RawMessage, which skips the base64 step:
from boto.sqs.message import RawMessage
q.set_message_class(RawMessage)
Update
Yes it is, it became clear with your test case:
>>> print 'hello this is a test'.decode('base64')
??e??b?+??-

Related

Unable to decode AWS Session Manager websocket output in python

Hope you're doing great !
The usecase
I'm trying to build a PoC on AWS. The use case is that we need to check, across our whole infrastructure, that every instance is reachable through AWS Session Manager.
To do that, I will use a Lambda in Python 3.7; for now I am running my PoC locally. I'm able to open the websocket, send the token payload, and get an output that contains a shell prompt.
The problem is that the byte output contains characters that Python's decode function can't handle in any of the many encodings I tested; every time, something fails.
The output
Here is the output I have after sending the payload :
print(event)
b'\x00\x00\x00toutput_stream_data \x00\x00\x00\x01\x00\x00\x01m\x1a\x1b\x9b\x15\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\xb1\x0b?\x19\x99A\xfc\xae%\xb2b\xab\xfd\x02A\xd7C\xcd\xd8}L\xa8\xb2J\xad\x12\xe3\x94\n\xed\xb81\xfa\xb6\x11\x18\xc2\xecR\xf66&4\x18\xf6\xbdd\x00\x00\x00\x01\x00\x00\x00\x10\x1b[?1034hsh-4.2$ '
What I already tried
I researched a lot on Stack Overflow and tried to decode with ascii, cp1252, cp1251, cp1250, iso8859-1, utf-16, utf-8, and utf_16_be, but every time either nothing is decoded or an error is raised because a character is unknown.
I also tried chardet.detect, but the encoding it returns does not work and its confidence is very low. I also tried to strip the \x00 bytes, but strip doesn't work in this case.
I know that shell output can contain color codes and other control sequences that make it look garbled, but I tried running it through colorama and matching ANSI sequences with regexes, and nothing successfully decodes this bytes response.
The code
Here is the code for my PoC, feel free to use it to try, you just have to change the target instance id (your instance needs to have the latest amazon-ssm-agent running on it).
import boto3
import uuid
import json
from websocket import create_connection
# Setting the boto3 client and the target
client = boto3.client('ssm','eu-west-1')
target = 'i-012345678910'
# Starting a session, this return a WebSocket URL and a Token for the Payload
response = client.start_session(Target=target)
# Creating a session with websocket.create_connection()
ws = create_connection(response['StreamUrl'])
# Building the Payload with the Token
payload = {
"MessageSchemaVersion": "1.0",
"RequestId": str(uuid.uuid4()),
"TokenValue": response['TokenValue']
}
# Sending the Payload
ws.send(json.dumps(payload))
# Receiving, printing and measuring the received message
event = ws.recv()
print(event)
print(len(event))
# Sending pwd, that should output /usr/bin
ws.send('pwd')
# Checking the result of the received message after the pwd
event = ws.recv()
print(event)
print(len(event))
Expected output
In the final solution, I expect to be able to do something like a curl http://169.254.169.254/latest/meta-data/instance-id through the websocket, and compare the instance-id of the command output against the target, to validate that instance is reachable. But I need to be able to decode the websocket output before achieving that.
Thank you in advance for any help on this.
Enjoy the rest of your day !
From my reading of the amazon-ssm-agent code, the payload exchanged over the websocket connection and managed by the session-manager channels follows a specific structure called the AgentMessage.
You will have to comply with this structure to use session-manager with the remote agent through the MGS service, which means serializing the messages you send and deserializing the responses.
The fields of that struct are further broken down into models via additional structs.
It shouldn't take too long to re-implement that in Python. Good luck!

cTrader decode protobuf message from Report API Events (tunnel)

I am working with the cTrader trading platform.
My project is written in Python 3 on Tornado,
and I have a problem decoding the protobuf messages from the Report API events.
Below I list what I have achieved so far and where the problem is.
1. cTrader has a REST API for reports.
I got the .proto file and generated the Python 3 module from it.
The generated module is called cTraderReportingMessages5_9_pb2.
From the REST Report API I receive a protobuf message and can decode it as follows, because I know which descriptor to pass:
from models import cTraderReportingMessages5_9_pb2
from protobuf_to_dict import protobuf_to_dict
raw_response = yield async_client.fetch(base_url, method=method, body=form_data, headers=headers)
decoded_response = cTraderReportingMessages5_9_pb2._reflection.ParseMessage(descriptors[endpoint]['decode'], raw_response.body)
descriptors[endpoint]['decode'] is my descriptor; I know exactly which one to pass to decode each message.
The contents of cTraderReportingMessages5_9_pb2:
# the module generated from the .proto file for Python 3 is too big to paste here
https://ufile.io/2p2d6
So up to this point, using the REST API and knowing exactly which descriptor to pass, I can decode the protobuf message and move on.
2. Now the issue I face
Connecting with Python 3 to the tunnel on 127.0.0.:5672,
I listen for events and receive data like this:
b'\x08\x00\x12\x88\x01\x08\xda\xc9\x06\x10\xb6\xc9\x03\x18\xa1\x8b\xb8\x01 \x00*\x00:\x00B\x00J\x00R\x00Z\x00b\x00j\x00r\x00z\x00\x80\x01\xe9\x9b\x8c\xb5\x99-\x90\x01d\x98\x01\xea\x9b\x8c\xb5\x99-\xa2\x01\x00\xaa\x01\x00\xb0\x01\x00\xb8\x01\x01\xc0\x01\x00\xd1\x01\x00\x00\x00\x00\x00\x00\x00\x00\xd9\x01\x00\x00\x00\x00\x00\x00\x00\x00\xe1\x01\x00\x00\x00\x00\x00\x00\x00\x00\xea\x01\x00\xf0\x01\x01\xf8\x01\x00\x80\x02\x00\x88\x02\x00\x90\x02\x00\x98\x02\x00\xa8\x02\x00\xb0\x02\x00\xb8\x02\x90N\xc0\x02\x00\xc8\x02\x00
As recommended, I need to use the same .proto module generated for Python as in step 1 to decode the message, but I have had no success because I don't know which descriptor to pass.
In step 1 this worked perfectly:
decoded_response = cTraderReportingMessages5_9_pb2._reflection.ParseMessage(descriptors[endpoint]['decode'], raw_response.body)
But in the second step I cannot decode the message the same way. What am I missing, or how do I decode the message using the same .proto file?
I finally found a workaround myself. It may be primitive, but it is the only thing that worked for me.
According to the answer I got from the provider, the same .proto file has to be used in both situations.
SOLUTION:
1. Build a list of all the descriptors from the .proto module
(the module generated for Python 3 is too big to paste here)
https://ufile.io/2p2d6
descriptors = [cTraderReportingMessages5_9_pb2.descriptor_1, cTraderReportingMessages5_9_pb2.descriptor_2]
2. Loop through the list and pass the descriptors one by one
for d in descriptors:
    decoded_response = cTraderReportingMessages5_9_pb2._reflection.ParseMessage(d, raw_response.body)
3. Inside the loop, check whether decoded_response is not blank
    if decoded_response:
        # descriptor was found
        # response is decoded
        break
    else:
        # no descriptor, try the next one
        continue
4. Once the response is decoded, convert it into a dict:
from protobuf_to_dict import protobuf_to_dict
decoded_response_to_dict = protobuf_to_dict(decoded_response)
After weeks spent on it, this solution finally worked.
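A complementary check, separate from the workaround above: the protobuf wire format can be scanned without any descriptor, which shows the top-level field numbers and wire types and may help narrow down which message type an event belongs to. This is a minimal sketch in pure Python with no error handling for truncated input; read_varint and scan_fields are hypothetical helper names, not part of the protobuf library.

```python
def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def scan_fields(buf):
    """List (field_number, wire_type, value) for each top-level field."""
    pos = 0
    fields = []
    while pos < len(buf):
        tag, pos = read_varint(buf, pos)
        field_no, wire_type = tag >> 3, tag & 0x07
        if wire_type == 0:      # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:    # fixed 64-bit, kept as raw bytes
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:    # length-delimited (bytes, string, submessage)
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:    # fixed 32-bit, kept as raw bytes
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError("unsupported wire type %d" % wire_type)
        fields.append((field_no, wire_type, value))
    return fields


print(scan_fields(b"\x08\x00\x12\x03abc"))
```

Applied to the event shown in the question, the first bytes \x08\x00\x12\x88\x01 read as field 1 (varint, value 0) followed by field 2 (length-delimited, 136 bytes), which suggests a small wrapper message with the actual event nested inside field 2.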

Converting data from serial/usb using PySerial

I have a UBlox receiver connected to my computer and I am trying to read it using PySerial. However, I am new to Python and was hoping to get some clarification/help on understanding the data.
My code looks like:
import serial
# open the connection port
connection = serial.Serial('/dev/ttyACM0', 9600)
# open a file to print the data. I am doing this to make
# sure it is working
file1 = open('output_file', 'wb+')
# All messages from ublox receivers end with a carriage return
# and a newline
msg = connection.readline()
# print the message to the file
print >> file1, msg
This is what I get in the file (when I print the type of msg, it is a list):
['\xb5b\x01\x064\x00\xe0\x88\x96#\xd3\xb9\xff\xffX\x07\x03\xdd6\xc31\xf6\xfd)\x18\xea\xe6\x8fd\x1d\x00\x01\x00\x00\x00\x00\x00\x00\xfd\xff\xff\xff\x01\x00\x00\x00\x02\x00\x00\x00p\x00\x02\x0f\x16\xa2\x02\x00\x9c\xeb\xb5b\x01\x07\\x00\xe0\x88\x96#\xe0\x07\x01\x17\x15237\x04\x00\x00\x00\xd6\xb9\xff\xff\x03\x01\n']
["\x1a\x0c\x04\x19'y\x00$\xf7\xff\xff\x1a\x1d\x04\x01\x00\x007\x00\x00\x00\x00\x00\x02\x1f\x0c\x01\x00+:\x00\x00\x00\x00\x00\x01 \r\x07&-\x9f\x00\xff\x01\x00\x00\x17\xc1\x0c\x04\x16\n"]
The ublox messages come in two formats, which have to be interpreted/decoded differently. Some of the messages are in NMEA format (basically comma-delimited):
$MSG, 1, 2, 3, 4
The other messages are straight hexadecimal, where each byte or set of bytes represents some piece of information:
[AA BB CC DD EE]
So my question is: is there a way to interpret/convert the data from the serial connection into a readable or more usable format, so I can actually work with the messages? As I said, I am new to Python and more used to C++-style strings or arrays of characters.
A typical parsing task. In this case, it'll probably be the simplest to make tokenization two-stage:
- read the data until you run into a message boundary (you didn't give enough info on how to recognize it)
- split the read message into its meaningful parts
  - for type 1, it's likely re.split(", *",text)
  - for type 2, none needed
- display the parts however you want
As for why serial.Serial.readline returns a list: I consulted the sources. serial.Serial delegates readline to io.IOBase, and its source indeed shows that it should return a bytestring.
So, the function might be overridden in your code by something. E.g. what do print connection.readline and print serial.Serial.readline show?
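For the binary messages, the first stage (finding message boundaries) could look like the sketch below. It assumes the u-blox UBX framing, which is not stated in the question but matches the \xb5b bytes in the sample output: two sync bytes 0xB5 0x62, a class byte, an ID byte, a little-endian 16-bit payload length, the payload, and a two-byte checksum. It is written for Python 3, where iterating over bytes yields integers; the function names are illustrative.

```python
import struct

SYNC = b"\xb5\x62"


def ubx_checksum(data):
    """8-bit Fletcher checksum over class, id, length and payload."""
    ck_a = ck_b = 0
    for byte in data:
        ck_a = (ck_a + byte) & 0xFF
        ck_b = (ck_b + ck_a) & 0xFF
    return ck_a, ck_b


def extract_ubx_frames(buf):
    """Return (msg_class, msg_id, payload) for each valid frame in buf."""
    frames = []
    pos = buf.find(SYNC)
    while pos != -1 and pos + 8 <= len(buf):
        msg_class, msg_id, length = struct.unpack_from("<BBH", buf, pos + 2)
        end = pos + 6 + length + 2
        if end > len(buf):
            break  # frame is incomplete; wait for more data
        body = buf[pos + 2:pos + 6 + length]
        if ubx_checksum(body) == (buf[end - 2], buf[end - 1]):
            frames.append((msg_class, msg_id, buf[pos + 6:pos + 6 + length]))
            pos = buf.find(SYNC, end)
        else:
            # False sync match inside other data; resume the search.
            pos = buf.find(SYNC, pos + 1)
    return frames
```

Because binary payloads can contain a newline byte, reading with readline can cut a frame in half; reading fixed-size chunks into a buffer and scanning it this way avoids that. The payload fields themselves can then be unpacked with struct once the layout of each message class and ID is known.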

Manually signing an email with DKIM in Python

I'm new to Python and trying to create an email-sending script via socket communication, but I can't seem to sign the message with the dkimpy lib. I tried a couple of examples from the web, but all returned the same error when running dkim.sign:
File "C:\Python34\lib\re.py", line 196, in split return _compile(pattern,flags).split(string, maxsplit)
TypeError: expected string or buffer
As near as I can tell, the first argument of the dkim.sign function should be a string, so I tried readlines() and even .as_string() just to be sure. I have checked the message and it seems RFC822 compliant, but I'll double-check if anyone thinks that might be the problem. Without the dkim.sign call it works perfectly (minus any security like SPF/DKIM).
This is a snippet of the code I'm using:
from email.mime.text import MIMEText
from email.parser import Parser
import dkim
f=open('mail.htm','r')
text=MIMEText(f.read(),'html')
headers = Parser().parse(open('mail.htm', 'r'))
sender=headers['From']
receiver=headers['To']
subj=headers['Subject']
f.close()
private_key = open('default.pem').read()
headers = ['To', 'From', 'Subject']
sig = dkim.sign(text, 'default', '<mydomain.here>', private_key, include_headers=headers)
The parsed headers are also used as input to the socket sending script. I do have a dkim key for test purposes, but I do not think it even reaches that point.
Any insight?
EDIT: Ok, I just tried parsing the string ( instead of signing it ) with dkim.rfc822_parse from dkimpy lib and I get the following error:
return _compile(pattern, flags).split(string, maxsplit)
TypeError: can't use a bytes pattern on a string-like object
Am I reading this right, or does it seem that the code is expecting a string but the pattern is in bytes?
FIXED: Oddly enough, I did not think to check the private_key. I created the key manually on Windows, so unbeknownst to me, Windows added an invisible linebreak character that even vim or nano could not see. After removing it with MCEdit, the program worked without a hitch.
Thanks for the help :)
If I remember correctly, dkim.sign expects the full message source as its parameter, but you are passing a MIMEText object.
Try passing text.as_string() instead:
sig = dkim.sign(text.as_string(), .... )
Python 3 enforces a strict distinction between bytes and strings.
The easiest way I found to avoid conversions when using the dkim module is to stay in bytes. Here is what I use:
from email.parser import BytesParser
import dkim
mail = BytesParser().parse(open('mail.eml', 'rb'))
print(dkim.verify(mail.as_bytes()))
The "rb" is for opening the file in bytes mode.
Give it a try.
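The TypeError from the question can be reproduced with the standard library alone. The sketch below does not call dkimpy; it only illustrates the mismatch, on the assumption that the library splits the message with a bytes regular expression. The addresses and the helper name split_lines are made up for the example.

```python
import re
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "sender@example.com"
msg["To"] = "receiver@example.com"
msg["Subject"] = "test"
msg.set_content("Hello")

raw_text = msg.as_string()   # str
raw_bytes = msg.as_bytes()   # bytes


def split_lines(raw):
    """Split on line endings with a bytes pattern, as a bytes-based parser would."""
    return re.split(b"\r?\n", raw)


try:
    split_lines(raw_text)
    str_accepted = True
except TypeError as exc:
    # A bytes pattern cannot be applied to a str in Python 3.
    str_accepted = False
    print(exc)

lines = split_lines(raw_bytes)
print(lines[0])
```

Staying in bytes from the moment the file is opened (mode 'rb') through to as_bytes() avoids this class of error entirely.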

comparing strings and decoded unicode in python3

I'm doing some socket/select programming and one of my events is triggered by the incoming byte string of 'OK'. I'm using utf_8 to encode everything sent from the server and decoding it on the client. However, my client comparisons aren't working and my if statement never evaluates to true. Here is the code in question:
Server side:
def broadcast_string(self, data, omit_sock): # broadcasts data utf_8 encoded to all socks
    for sock in self.descriptors:
        if sock is not self.server and sock is not omit_sock:
            sock.send(data.encode('utf_8'))
    print(data)
def start_game(self): # i call this to send 'OK'
    data = 'OK'
    self.broadcast_string(data, 0)
    self.new_round()
Client side:
else: # got data from server
    if data.decode('utf_8') == 'OK': # i've tried substituting this with a var, no luck
        self.playstarted = True
    else:
        sys.stdout.write(data.decode('utf_8') + "\n")
        sys.stdout.flush()
    if self.playstarted is True: # never reached because if statement never True
        command = input("-->")
I've read this and I think I'm following it but apparently not. I've even done the examples using the python shell and have had them evaluate to True, but not when I run this program.
Thanks!
TCP sockets don't have message boundaries. As your last comment says, you are getting multiple messages in one long string. You are responsible for queuing up data until you have a complete message, and then processing it as one complete message.
Each time select says a socket has some data to read, append the data to a read buffer, then check to see if the buffer contains a complete message. If it does, extract just the message from the front of the buffer and process it. Continue until no more complete messages are found, then call select again. Note also you should only decode a complete message, since you might receive a partial UTF-8 multi-byte character otherwise.
Rough example using \n as a message terminator (no error handling):
tmp = sock.recv(1000)
readbuf += tmp
while b'\n' in readbuf:
    msg,readbuf = readbuf.split(b'\n',1)
    process(msg.decode('utf8'))
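The same idea can be wrapped in a small helper that can be exercised without a socket. This is a sketch under the same assumption as the rough example (messages are terminated by \n); the class name LineBuffer is illustrative. Each call to feed takes whatever recv returned and yields only the complete messages seen so far.

```python
class LineBuffer:
    """Accumulate raw chunks and return complete, decoded messages."""

    def __init__(self, terminator=b"\n"):
        self._buf = b""
        self._terminator = terminator

    def feed(self, chunk):
        self._buf += chunk
        messages = []
        while self._terminator in self._buf:
            raw, self._buf = self._buf.split(self._terminator, 1)
            # Decode only complete messages, so a multi-byte UTF-8
            # character split across two chunks is never decoded in half.
            messages.append(raw.decode("utf-8"))
        return messages


buf = LineBuffer()
print(buf.feed(b"O"))            # nothing complete yet
print(buf.feed(b"K\nhel"))       # one complete message, one partial
print(buf.feed(b"lo\n"))         # the partial message is now complete
```

With this in place, the client's comparison against 'OK' is made on one whole message at a time, provided the server also appends the terminator to everything it sends.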
