I am using MQTT for the first time to transfer some binary files. So far I have had no issues transferring them with code like the following:
import paho.mqtt.client as paho
# Read the whole file as raw bytes; the context manager closes it afterwards
with open("./file_name.csv.gz", "rb") as f:
    file_content = f.read()
byteArray = bytearray(file_content)
mqttc = paho.Client()
mqttc.will_set("/event/dropped", "Sorry, I seem to have died.")
mqttc.connect(*connection definition here*)
mqttc.publish("hello/world", byteArray )
However, together with the file itself there is some extra information I want to send (the original file name, creation date, etc.). I can't find a proper way to transfer it using MQTT. Is there a way to do that, or do I need to add that information to the message byteArray itself? How would I do that?
You need to build your own data structure to hold the file and its metadata.
How you build that structure is up to you. A couple of options would be:
Base64-encode (or uuencode) the file, add it as a field in a JSON object, store the metadata in other fields, then publish the JSON object.
Build a Python dict with the file as one field and the metadata as other fields, then use pickle to serialise the dict.
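The first option could look roughly like the sketch below. The helper names pack_message and unpack_message and the field name content_b64 are made up for illustration, and the sample bytes and metadata stand in for a real file.

```python
import base64
import json


def pack_message(file_bytes, metadata):
    """Bundle raw file bytes and a metadata dict into one JSON payload."""
    envelope = dict(metadata)  # copy so the caller's dict is not modified
    # JSON cannot carry raw bytes, so the file content is Base64-encoded
    envelope["content_b64"] = base64.b64encode(file_bytes).decode("ascii")
    return json.dumps(envelope).encode("utf-8")


def unpack_message(payload):
    """Reverse pack_message: return (file_bytes, metadata)."""
    envelope = json.loads(payload.decode("utf-8"))
    file_bytes = base64.b64decode(envelope.pop("content_b64"))
    return file_bytes, envelope


# Sample data standing in for the real file and its metadata
payload = pack_message(
    b"\x1f\x8b\x08\x00 not a real gzip file",
    {"filename": "file_name.csv.gz", "created": "2019-01-01T00:00:00"},
)
data, meta = unpack_message(payload)
```

The resulting payload is a bytes object, so it can be passed to mqttc.publish("hello/world", payload) in place of byteArray. Base64 makes the file roughly a third larger, which may matter for big files.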
I'm trying to retrieve pickled data that I uploaded to OpenStack object storage, using openstacksdk's connection.get_object(container, object). I get a response from it, but the file body is a string. I can even save it to a file with the outfile option without issues. However, I would like to work with it directly, without first saving it to a file and then loading it with pickle.
Simply using pickle's load and loads doesn't work, as neither takes string objects. Is there another way to retrieve the data so I can work with the pickled data directly, or is there some way to parse the string or set a config parameter on get_object()?
If you are using Python 3, pickle expects a bytes-like object. The load method takes a file object opened in binary mode and reads the bytes from it. The loads method takes a bytes-like object directly, not a string, so you will need to convert the string to bytes.
See: Best way to convert string to bytes in Python 3?
EDIT:
I found the solution. For pickled objects, or any other files retrieved from OpenStack with openstacksdk, there are a few ways of dealing with the data without resorting to disk.
First, the solution I implemented was to use the connection's get_object_raw method:
import pickle
conn = connection(foo, bar, **args)  # fill in your own connection arguments
pickle.loads(conn.get_object_raw('containerName', 'ObjectName').content)
get_object_raw returns a requests Response object whose content attribute holds the binary file content, which is the pickled data that pickle.loads can load.
You could also create a temporary in-memory file with io.BytesIO and pass it as the outfile argument to the connection's get_object.
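Both approaches come down to handing pickle bytes rather than a str. The sketch below shows that without any OpenStack connection: the raw variable stands in for the object body, and writing it into the buffer stands in for what get_object would do with an outfile.

```python
import io
import pickle

original = {"name": "example", "values": [1, 2, 3]}
raw = pickle.dumps(original)  # stands in for the downloaded object body (bytes)

# Option 1: bytes in hand, e.g. the content attribute of the raw response
from_bytes = pickle.loads(raw)

# Option 2: an in-memory file, e.g. one that was passed as outfile
buffer = io.BytesIO()
buffer.write(raw)  # stands in for the download writing into the buffer
buffer.seek(0)     # rewind, otherwise pickle.load starts at the end
from_buffer = pickle.load(buffer)
```

The seek(0) call is easy to forget: after a write the buffer position is at the end, and pickle.load would raise EOFError.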
I am working with the cTrader trading platform.
My project is written in Python 3 on Tornado.
I have an issue decoding the protobuf messages from the Report API events.
Below I list everything I have achieved and where the problem is.
1. First, cTrader has a REST API for reports.
I got the .proto file and generated the Python 3 module from it.
The generated module is called cTraderReportingMessages5_9_pb2.
From the REST Report API I get a protobuf message and can decode it in the following way, because I know which descriptor to pass for decoding:
from models import cTraderReportingMessages5_9_pb2
from protobuf_to_dict import protobuf_to_dict
raw_response = yield async_client.fetch(base_url, method=method, body=form_data, headers=headers)
decoded_response = cTraderReportingMessages5_9_pb2._reflection.ParseMessage(descriptors[endpoint]['decode'], raw_response.body)
descriptors[endpoint]['decode'] is my own mapping, which tells me exactly which descriptor to pass to decode the message.
The content of cTraderReportingMessages5_9_pb2 (the module generated from the .proto file for Python 3) is too big to paste here:
https://ufile.io/2p2d6
So up to this point, using the REST API and knowing exactly which descriptor to pass, I am able to decode the protobuf message and move forward.
2. Now the issue I face:
Connecting with Python 3 to the tunnel on 127.0.0.:5672,
I am listening for events and receiving this kind of data back:
b'\x08\x00\x12\x88\x01\x08\xda\xc9\x06\x10\xb6\xc9\x03\x18\xa1\x8b\xb8\x01 \x00*\x00:\x00B\x00J\x00R\x00Z\x00b\x00j\x00r\x00z\x00\x80\x01\xe9\x9b\x8c\xb5\x99-\x90\x01d\x98\x01\xea\x9b\x8c\xb5\x99-\xa2\x01\x00\xaa\x01\x00\xb0\x01\x00\xb8\x01\x01\xc0\x01\x00\xd1\x01\x00\x00\x00\x00\x00\x00\x00\x00\xd9\x01\x00\x00\x00\x00\x00\x00\x00\x00\xe1\x01\x00\x00\x00\x00\x00\x00\x00\x00\xea\x01\x00\xf0\x01\x01\xf8\x01\x00\x80\x02\x00\x88\x02\x00\x90\x02\x00\x98\x02\x00\xa8\x02\x00\xb0\x02\x00\xb8\x02\x90N\xc0\x02\x00\xc8\x02\x00
According to the recommendation I got, I need to use the same generated Python module as in step 1 to decode the message, but I have had no success because I don't know which descriptor needs to be passed.
In step 1 I was doing it this way, and it worked perfectly:
decoded_response = cTraderReportingMessages5_9_pb2._reflection.ParseMessage(descriptors[endpoint]['decode'], raw_response.body)
But in the second step I cannot decode the message in the same way. What am I missing, or how do I decode the message using the same .proto file?
I finally found a workaround myself. It may be a primitive way, but it is the only thing that worked for me.
According to the answer I got from the provider, the same .proto file must be used in both situations.
SOLUTION:
1. Make a list of all the descriptors from the generated module
(the module generated from the .proto file for Python 3 is too big to paste here):
https://ufile.io/2p2d6
descriptors = [cTraderReportingMessages5_9_pb2.descriptor_1, cTraderReportingMessages5_9_pb2.descriptor_2]
2. Loop through the list and pass the descriptors one by one:
for d in descriptors:
    decoded_response = cTraderReportingMessages5_9_pb2._reflection.ParseMessage(d, raw_response.body)
3. Inside the loop, check whether decoded_response is not blank:
    if decoded_response:
        # descriptor was found
        # response is decoded
        break
    else:
        # no match with this descriptor, try the next one
        continue
4. After decoding the response, convert it into a dict:
from protobuf_to_dict import protobuf_to_dict
decoded_response_to_dict = protobuf_to_dict(decoded_response)
This solution, which I spent weeks on, finally worked.
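Before trying every descriptor, it can help to look at the top-level fields of the raw bytes, since the protobuf wire format can be read without any descriptor. The sketch below is a plain-Python scanner for that purpose. It is not part of the cTrader API or the protobuf library, the function names are made up, and the sample bytes are a short invented message rather than the captured event.

```python
def read_varint(buf, pos):
    """Decode one base-128 varint starting at pos; return (value, new_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear means this was the last byte
            return result, pos
        shift += 7


def scan_top_level_fields(buf):
    """Return a list of (field_number, wire_type, value) for the outer message."""
    fields = []
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:    # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:  # fixed 64-bit, kept as raw bytes
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:  # length-delimited (bytes, string, nested message)
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:  # fixed 32-bit, kept as raw bytes
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError("unsupported wire type %d" % wire_type)
        fields.append((field_number, wire_type, value))
    return fields


# Invented sample: field 1 = varint 0, field 2 = 3 bytes of payload
sample = b"\x08\x00\x12\x03abc"
fields = scan_top_level_fields(sample)
```

The captured event above starts with \x08\x00\x12\x88\x01, which reads as field 1 holding the varint 0, followed by field 2 as a length-delimited value declaring 136 bytes. That may point to a wrapper message carrying a type and a nested payload, but that reading is an assumption to check against the .proto file.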
I am trying to extract data from a Wiktionary XML dump using the wiktextract Python module. However, its website does not give me enough information. I could not use the command-line program that comes with it, since it isn't a Windows executable, so I tried the programmatic way. The following code takes a while to run, so it seems to be doing something, but I'm not sure what to do with the ctx variable. Can anyone help me?
import wiktextract
def word_cb(data):
    print(data)
ctx = wiktextract.parse_wiktionary(
    r'myfile.xml', word_cb,
    languages=["English", "Translingual"])
You are on the right track, but you don't have to worry too much about the ctx object.
As the documentation says:
The parse_wiktionary call will call word_cb(data) for words and redirects found in the Wiktionary dump. data is information about a single word and part-of-speech as a dictionary (multiple senses of the same part-of-speech are combined into the same dictionary). It may also be a redirect (indicated by presence of a redirect key in the dictionary).
The returned ctx object mostly contains summary information (the number of sections processed, etc.); you can use dir(ctx) to see some of its fields.
The useful results are not the ones in the returned ctx object, but the ones passed to word_cb on a word-by-word basis. So you might just try something like the following to get a JSON dump from a wiktionary XML dump. Because the full dumps are many gigabytes, I put a small one on a server for convenience in this example.
import json
import wiktextract
import requests
xml_fn = 'enwiktionary-20190220-pages-articles-sample.xml'
print("Downloading XML dump to " + xml_fn)
response = requests.get('http://45.61.148.79/' + xml_fn, stream=True)
# Throw an error for bad status codes
response.raise_for_status()
with open(xml_fn, 'wb') as handle:
    for block in response.iter_content(4096):
        handle.write(block)
print("Downloaded XML dump, beginning processing...")
fh = open("output.json", "wb")
def word_cb(data):
    # json.dumps returns str; encode it because fh is opened in binary mode
    fh.write(json.dumps(data).encode("utf-8"))
ctx = wiktextract.parse_wiktionary(
    xml_fn, word_cb,
    languages=["English", "Translingual"])
print("{} English entries processed.".format(ctx.language_counts["English"]))
print("{} bytes written to output.json".format(fh.tell()))
fh.close()
For me this produces:
Downloading XML dump to enwiktionary-20190220-pages-articles-sample.xml
Downloaded XML dump, beginning processing...
684 English entries processed.
326478 bytes written to output.json
with the small dump extract I placed on a server for convenience. It will take much longer to run on the full dump.
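If the entries need to be read back one by one later, one option is to have the callback write one JSON object per line. The sketch below shows only that callback pattern. fake_parse is a made-up stand-in for wiktextract.parse_wiktionary, and the word and pos keys in the sample entries are assumptions, not the module's documented output.

```python
import io
import json


def make_jsonl_writer(stream):
    """Return a callback that writes each entry as one JSON line to stream."""
    def word_cb(data):
        stream.write(json.dumps(data, ensure_ascii=False) + "\n")
    return word_cb


def fake_parse(word_cb):
    """Hypothetical stand-in for the real parser: calls word_cb once per entry."""
    for entry in ({"word": "cat", "pos": "noun"}, {"word": "run", "pos": "verb"}):
        word_cb(entry)


stream = io.StringIO()  # a real run would use a file opened in text mode
fake_parse(make_jsonl_writer(stream))

# Reading back: every line is an independent JSON document
entries = [json.loads(line) for line in stream.getvalue().splitlines()]
```

Writing line by line keeps memory use flat on a large dump, because nothing has to be collected in a list before it is saved.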
Hi, I am working on a simple program that takes data from a JSON file (entered through an HTML form, with Flask handling the data) and uses this data to make calls to an API.
So I have some JSON like this:
[{"id": "ßLÙ", "server": "NA"}]
and I want to send the id to an api call like this example:
http://apicallnamewhatever+id=ßLÙ
However, when I load the JSON file into my app.py with the following command
ids = json.load(open('../names.json'))
json.load seems to alter the id from 'ßLÙ' to 'ÃŸLÃ™'.
I'm not sure why this happens during json.load, but I need to find a way to get 'ßLÙ' into the API call instead of the deformed 'ÃŸLÃ™'.
It looks as if your names.json is encoded in "utf-8", but you are opening it as "windows-1252" [*] or something like that. Try
json.load(open('names.json', encoding="utf-8"))
You should probably also URL-encode the id instead of concatenating it directly with the server address, something along these lines (Python 3; on Python 2 the equivalent is urllib2.quote(idExtractedFromJson.encode("utf-8"))):
urllib.parse.quote(idExtractedFromJson)
[*] Thanks @jDo for pointing that out, I initially guessed the wrong codepage.
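The effect can be reproduced without any file on disk. The sketch below builds the UTF-8 bytes that names.json would contain, decodes them once with the wrong codec and once with the right one, and then URL-encodes the correct id. The variable names are illustrative only.

```python
import json
from urllib.parse import quote

# The bytes a UTF-8 encoded names.json would contain
raw = json.dumps([{"id": "ßLÙ", "server": "NA"}], ensure_ascii=False).encode("utf-8")

# Decoding with the wrong codec silently produces mojibake
wrong_id = json.loads(raw.decode("cp1252"))[0]["id"]

# Decoding as UTF-8 gives back the original text
right_id = json.loads(raw.decode("utf-8"))[0]["id"]

# quote() percent-encodes the UTF-8 bytes so the id is safe inside a URL
encoded_id = quote(right_id)
```

Here wrong_id is 'ÃŸLÃ™', right_id is 'ßLÙ', and encoded_id is '%C3%9FL%C3%99'.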
I've got a large Python project with several components that exchange information through JSON files. This project is our internal tool for analysis and integration testing, and our developers use it either from a web UI or from the command line.
The Python modules process a labeled database consisting of a large number of files, with the labels encoded in the file names. For example, the file name ab001l_AS_5_15Fps_1.raw indicates that it stores data from user ab001l, collected in session number 1 under conditions that we encode as AS.
Several such encodings exist.
JSON files usually store file names.
My question is: how can I save Python code in a JSON file, so that another module can load it and decode a file name into its components?
I guess you can store Python code as text in JSON, then use the exec built-in function to execute the text. See
https://docs.python.org/3/library/functions.html?highlight=exec#exec
But it seems a much better approach to share your module and import it like any other Python code.
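A middle ground that avoids exec is to store only a description of the encoding in JSON, for instance a regular expression with named groups, and keep the decoding function in a shared module. This is a different technique from storing code, sketched here under assumptions. The key file_name_pattern and the group names param and fps are guesses, since the question only explains the user, session and conditions parts of the name.

```python
import json
import re

# What one component might write into the JSON file (pattern is an assumption)
config_text = json.dumps({
    "file_name_pattern": (
        r"(?P<user>[^_]+)_(?P<condition>[^_]+)_(?P<param>\d+)"
        r"_(?P<fps>\d+)Fps_(?P<session>\d+)\.raw"
    )
})


def decode_file_name(name, config):
    """Split a file name into labeled parts using the pattern from the config."""
    match = re.fullmatch(config["file_name_pattern"], name)
    if match is None:
        raise ValueError("file name does not match the pattern: " + name)
    return match.groupdict()


# What another component does after loading the JSON file
config = json.loads(config_text)
parts = decode_file_name("ab001l_AS_5_15Fps_1.raw", config)
```

Each encoding then becomes one pattern string in JSON, and no module ever executes text it loaded from a file.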
You can use jsonpickle. Please check the documentation page for usage.
import jsonpickle
class Thing(object):
    def __init__(self, name):
        self.name = name
obj = Thing('Awesome')
frozen = jsonpickle.encode(obj)
thawed = jsonpickle.decode(frozen)