I have a JSON file full of event data that I need to send into Snowplow from Python using an Iglu webhook, but I'm having trouble finding any solid guidance on this. Most of the documentation I've found relates to tracking specific events as they happen, but I need to backfill historical data in the same way I'll capture forward-looking data, which means sending a large JSON of activity history at the outset.
Is this possible using snowplow/python/iglu or am I approaching the problem incorrectly?
This question is getting old and OP may have moved on, but I'll leave an answer for anyone else who might stumble upon it.
A Snowplow collector (e.g., the stream-collector) receives data over HTTP. In theory, any method of sending an HTTP request will work, but there are SDKs that address the common use cases. For Python specifically, there is the snowplow-python-tracker; you can refer to the full documentation here: Snowplow Python Tracker Docs.
You do not need to use an Iglu webhook. You can point your Python tracker instance directly at your collector via the existing request paths, which are documented here. Yes, one of those paths is for requests via the Iglu webhook adapter, but that adapter is meant for situations where you don't control the environment in which the tracker is instantiated, e.g., third-party vendor systems.
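For example, a minimal backfill script with the snowplow-python-tracker might look roughly like the sketch below. The collector endpoint, the iglu:com.acme/activity schema URI, and the history.json file are placeholders, and the exact tracker API varies a little between versions, so treat this as a starting point rather than a drop-in solution:

# Sketch only: assumes an older-style snowplow-python-tracker API and a
# hypothetical iglu:com.acme/activity/jsonschema/1-0-0 schema.
import json
from snowplow_tracker import Emitter, Tracker, SelfDescribingJson

emitter = Emitter("collector.acme.com")          # your collector endpoint
tracker = Tracker(emitter, app_id="backfill")

with open("history.json") as f:                  # the large JSON of historical events
    events = json.load(f)

for event in events:
    tracker.track_self_describing_event(
        SelfDescribingJson("iglu:com.acme/activity/jsonschema/1-0-0", event)
    )

tracker.flush()                                  # make sure buffered events are sent

Each event ends up as a self-describing event against your own Iglu schema, which is the same shape your forward-looking tracking would produce.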
I'm experimenting (for the first time) with Prometheus. I've set up Prometheus to send messages to a local Flask server:
remote_write:
- url: "http://localhost:5000/metric"
I'm able to read the incoming bytes; however, I'm not able to convert the incoming messages into any meaningful data.
I'm very new to Prometheus (and Protobuf!) so I'm not sure what the best approach is. I would rather not use a third party package, but want to learn and understand the Protobuf de/serialization myself.
I copied the metrics.proto definitions from the Prometheus GitHub repo, compiled them with protoc, then imported the generated metrics_pb2.py file and tried parsing the incoming message:
read_metric = metrics_pb2.Metric()
read_metric.ParseFromString(request.data)
I also tried the remote.proto definitions (specifically WriteRequest), which didn't work either:
read_metric = remote_pb2.WriteRequest()
read_metric.ParseFromString(request.data)
This results in:
google.protobuf.message.DecodeError: Error parsing message
So I suspect that I'm using the wrong Protobuf definitions?
I would really appreciate any help & advice on this!
To provide some more context for what I'm attempting to accomplish:
I'm trying to stream data from multiple Prometheus instances to a message queue so they can be passed to a machine learning model.
I'm using online training with an active learning model, and I want the data to be (near) real-time. That's why I thought the remote_write functionality would be a better approach than continuously scraping each instance. If you have any other ideas on how I could build this system, feel free to share - I've only been playing around with it for a couple of days, so I'm open to any feedback!
ANSWER EDIT:
I first had to decompress the data using snappy (thanks, larsks!):
import snappy  # the python-snappy package

raw = request.data                      # snappy-compressed protobuf payload
decompressed = snappy.uncompress(raw)   # block format, so plain uncompress()
read_metric = remote_pb2.WriteRequest()
read_metric.ParseFromString(decompressed)
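Once the WriteRequest parses, the useful data is in its repeated timeseries field; each series carries a set of labels and a list of samples. A quick way to sanity-check the decoded payload (field names taken from Prometheus's remote.proto/types.proto, so verify against the version you compiled):

# Inspect the decoded remote-write payload.
for ts in read_metric.timeseries:
    labels = {label.name: label.value for label in ts.labels}
    for sample in ts.samples:
        # sample.timestamp is milliseconds since the Unix epoch
        print(labels.get("__name__"), labels, sample.value, sample.timestamp)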
The remote.proto file is the correct protobuf definition. You may also find this document useful, which explicitly defines the remote write protocol. It includes the "official" protobuf specification and mentions that:
The remote write request MUST be encoded using Google Protobuf 3, and MUST use the schema defined above. Note the Prometheus implementation uses gogoproto optimisations - for receivers written in languages other than Golang the gogoproto types MAY be substituted for line-level equivalents.
The document also notes that the body of remote write requests is compressed:
The remote write request in the body of the HTTP POST MUST be compressed with Google’s Snappy. The block format MUST be used - the framed format MUST NOT be used.
So before you can parse the request body, you'll need a Python solution for decompressing snappy-compressed data (the python-snappy package is one option).
(I found a link to that google doc from this article that talks about the development of the remote write protocol.)
I'm building a project using Python and Grafana where I'd like to generate a number of copies of certain Grafana dashboards based on certain criteria. I've installed the grafanalib library to help with that, and I've read through the Generating Dashboards From Code section of the grafanalib website, but I feel like I still need more context to understand how to use this library.
So my first question is: how do I convert a Grafana dashboard JSON model into a Python-friendly format? How should it be organized? I saw the dashboard generation example in the grafanalib documentation, but it looked quite different from how my JSON data is organized. I'd just like some further description of how to do the conversion.
My second question is: once I've converted my Grafana JSON into Python, how do I get the information needed to send the generated dashboard to my Grafana server? I see the "upload_to_grafana" function in the grafanalib documentation used to send the dashboard, and it takes three parameters (json, server, api_key). I understand where the json parameter comes from, but I don't get where the server information or API key come from, or where to find them.
Just to put it out there, this is all being developed on a Raspberry Pi 4. I'm working on a personal smart-agriculture project as a way to develop my coding abilities further, as I'm self-taught. Any help is most appreciated. Thank you.
Create an API key in Grafana's configuration (the API Keys page). The secret shown when you create it is the api_key value. The server is the address of your Grafana instance, e.g. localhost:3000 for a locally installed Grafana.
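To make that concrete, here is a rough sketch of the pattern from the grafanalib docs: define the dashboard in Python rather than translating the JSON field by field, render it with DashboardEncoder, and POST it to the Grafana HTTP API with the key in an Authorization header. The panel layout, data source name, Prometheus query, and server address are placeholders, and the exact grafanalib classes available depend on your version:

# Sketch: build a dashboard with grafanalib and push it to Grafana's HTTP API.
# "Prometheus", the expr, and the server/api_key values are placeholders.
import json
import requests
from grafanalib.core import Dashboard, Graph, Row, Target
from grafanalib._gen import DashboardEncoder

dashboard = Dashboard(
    title="Greenhouse sensors",
    rows=[Row(panels=[
        Graph(
            title="Soil moisture",
            dataSource="Prometheus",
            targets=[Target(expr="soil_moisture_percent", refId="A")],
        ),
    ])],
).auto_panel_ids()

payload = json.dumps(
    {"dashboard": dashboard.to_json_data(), "overwrite": True},
    cls=DashboardEncoder,
)

server = "http://localhost:3000"       # where your Grafana is reachable
api_key = "eyJrIjoi..."                # the secret shown when you create the key

resp = requests.post(
    f"{server}/api/dashboards/db",
    data=payload,
    headers={"Authorization": f"Bearer {api_key}",
             "Content-Type": "application/json"},
)
resp.raise_for_status()

Generating multiple copies then becomes a matter of looping over your criteria and building a Dashboard object per copy before uploading each one.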
I have recently started developing an application to analyse my all-time exercises in the Polar platform.
I'm using their Accesslink API to get new sessions and I have exported my old sessions through another service they offer.
The exported sessions come with fully detailed information (instantaneous GPS location, speed, heart rate), but the JSON data provided by the API is just a summary. I am looking for a way to get the initial position (GPS location) of each session so that I can later look up the city name from another source. I think the only way to do this is by getting the GPS info of my sessions.
Although the sessions have a has-route field, I cannot find a way in their documentation to request this route. They have provided a working example, but it does not show how to get this data.
Does anyone know if this is possible and, if so, could you please give me some directions?
Thanks in advance.
It turns out that the GPS information is provided through GPX files, which are available from the API mentioned in the question. There is a method implemented in their GitHub example (link also in the question) that already performs this task. I have added a call to this method and saved its output in my project.
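Once you have a GPX export for a session, pulling out the starting position is straightforward. Here's a small sketch using the gpxpy package (the session.gpx filename is just a placeholder for whatever the Accesslink example saves):

# Sketch: read the first trackpoint of a downloaded GPX file with gpxpy.
import gpxpy

with open("session.gpx") as f:
    gpx = gpxpy.parse(f)

first_point = gpx.tracks[0].segments[0].points[0]
print(first_point.latitude, first_point.longitude, first_point.time)
# These coordinates can then be reverse-geocoded to a city name elsewhere.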
I am trying to use an API that I have previously used for various jobs to query and fetch relevant data. But lately I am unable to do that, because of an unusual exception in the response that I honestly have no idea about.
The CODE:
import SIEMAuth
import requests

alert_id = '144116287822364672|12101929'
query_params = {"id": {"value": alert_id}, "format": {"format": 0}}

response = requests.post(SIEMAuth.url + 'ipsGetAlertPacket', json=query_params,
                         headers=SIEMAuth.session_headers, verify=False)
print(response.text)
The following exception/traceback response is returned on querying this:
Can not construct instance of com.mcafee.siem.api.data.alert.EsmPacketFormat: no suitable constructor found, can not deserialize from Object value (missing default constructor or creator, or perhaps need to add/enable type information?)
at [Source: java.io.StringReader#1a15fbf; line: 1, column: 2]
Process finished with exit code 0
When searching the internet to learn more about the exception, most of the results relate to the Jackson JSON parser in the Java programming environment, which is not something I am working with or familiar with.
If anybody could help, I'd be extremely grateful.
Unfortunately it's as I suggested; basically one way or another it's broken. The response from their support is as follows.
I have reach out to my development team for this question. I got below response.
That particular get is not meant to be used in the external API. It should only be used from the interface, and has been removed since the version of the ESM you are on. If you want to use that externally then you need to submit it as a per.
I hope this clears your questions.
Edit: This has actually been expanded on in a thread on their support forums. You need a login to see the original thread.
Name notwithstanding, this API does not return the actual data packet associated with an event. In fact, when aggregation is enabled, not all of the packets associated with a given event are available on the ESM. Raw packet data can be retrieved from the ELM through the UI, but unfortunately there currently is not a way to do that programmatically.
We are building an extensive API link with the Exact Online OData API. The problem we are having is that many objects can't be updated or deleted, for instance BankEntryLines and GeneralJournalEntryLines.
We have worked around this by creating new EntryLines for each update or delete, but this creates a lot of ambiguity in some cases.
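For anyone trying the same workaround, the pattern is to POST a new correcting entry rather than PATCH or DELETE the original line. A very rough sketch with requests is below; the generaljournalentry/GeneralJournalEntries resource path and the field names are assumptions you should check against the Exact Online REST API reference, and the access token comes from the normal OAuth flow:

# Sketch only: the resource path and field names below are assumptions; verify
# them against the Exact Online REST API reference before relying on this.
import requests

access_token = "YOUR_OAUTH_ACCESS_TOKEN"   # obtained through the normal OAuth flow
division = "1234567"                       # your Exact Online division number
base = f"https://start.exactonline.nl/api/v1/{division}"
headers = {
    "Authorization": f"Bearer {access_token}",
    "Accept": "application/json",
    "Content-Type": "application/json",
}

# Instead of updating or deleting an existing line, post a new journal entry
# that corrects it (the values here are illustrative).
correction = {
    "JournalCode": "90",
    "GeneralJournalEntryLines": [
        {
            "GLAccount": "00000000-0000-0000-0000-000000000000",  # GUID of the GL account
            "AmountFC": -100.00,
            "Description": "Correction of original entry line",
        },
    ],
}

resp = requests.post(f"{base}/generaljournalentry/GeneralJournalEntries",
                     json=correction, headers=headers)
resp.raise_for_status()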
Can the API be changed, or can I get extra authorization so that I can update or delete these objects, just as is possible in the GUI?
As the Exact Online REST API doesn't support modifying quite a few objects, there is no way to achieve what you want using the REST API. If the Exact Online XML API doesn't support updating them either, there is only one solution left.
That solution is forbidden by Exact, and it could cost you your application developer status: you can make those changes using HTTP POSTs against the web site itself. If you capture the calls made by the screens, you can mimic their behaviour, and by replaying them you can modify what you need.
If you want to build an integration with Exact Online and you are just starting development, I suggest you take a look at Invantive Data Hub, which allows updating Exact Online using SQL syntax. (Full disclosure: I work for that company.)