I am writing a FastAPI program that is just a bunch of @app.get endpoints for querying data. There are many, many different query arguments that can be used, all automatically generated from a config file. For example, the @app.get("/ADJUST_COLOR/") endpoint could look something like /ADJUST_COLOR/?RED_darker=10&BLUE_lighter=43&GREEN_inverse=true, where all those parameters are generated from a list of colors and a list of operations to perform on those colors (this is only an example, not what I am actually doing).
The way I am doing that is to take in the request object like this:
@app.get("/ADJUST_COLOR/")
def query_COLORS(request: Request):
    return look_through_parameters(request.query_params)
But the problem is that the automatically generated Swagger UI does not show any useful data. Since I am parsing the request manually, no parameters are generated in the docs. But since I have a full list of the parameters I am expecting, I should be able to generate my own documentation and have the UI show it.
I have looked through these two documents: https://fastapi.tiangolo.com/tutorial/path-operation-configuration/
And https://fastapi.tiangolo.com/advanced/path-operation-advanced-configuration/
But I was not able to figure out whether this is possible.
You can define a custom API schema for your route via openapi_extra (this is a relatively recent FastAPI feature; 0.68 works, but I'm not sure of the exact earliest version that supports it):
@app.get("/ADJUST_COLOR/", openapi_extra={
"parameters": [
{
"in": "query",
"name": "RED_darker",
"schema": {
"type": "integer"
},
"description": "The level of RED_darker"
},
{
"in": "query",
"name": "BLUE_lighter",
"schema": {
"type": "integer"
},
"description": "The level of BLUE_lighter"
},
{
"in": "query",
"name": "GREEN_inverse",
"schema": {
"type": "boolean"
},
"description": "is GREEN_inverse?"
},
]
})
async def query_COLORS(request: Request):
return look_through_parameters(request.query_params)
This is then rendered with full parameter documentation in your API's /docs page.
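Since your parameters come from a config file anyway, you could also build that parameters list programmatically instead of writing it by hand. A minimal sketch, where COLORS and OPERATIONS are hypothetical stand-ins for whatever your config actually provides:

from fastapi import FastAPI, Request

app = FastAPI()

# Hypothetical stand-ins for whatever your config file actually provides.
COLORS = ["RED", "BLUE", "GREEN"]
OPERATIONS = {"darker": "integer", "lighter": "integer", "inverse": "boolean"}

def build_query_parameters():
    # One OpenAPI query parameter per (color, operation) pair.
    return [
        {
            "in": "query",
            "name": f"{color}_{op}",
            "schema": {"type": json_type},
            "description": f"Apply '{op}' to {color}",
        }
        for color in COLORS
        for op, json_type in OPERATIONS.items()
    ]

@app.get("/ADJUST_COLOR/", openapi_extra={"parameters": build_query_parameters()})
async def query_COLORS(request: Request):
    return look_through_parameters(request.query_params)  # your existing handler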
Attached is an example Avro schema:
{
"type": "record",
"name": "DummySampleAvroValue",
"namespace": "de.company.dummydomain",
"fields": [
{
"name": "ID",
"type": "int"
},
{
"name": "NAME",
"type": [
"null",
"string"
]
},
{
"name": "STATE",
"type": "int"
},
{
"name": "TIMESTAMP",
"type": [
"null",
"string"
]
}
]
}
According to the "JSON Encoding" section of the official Avro specification (see https://avro.apache.org/docs/current/spec.html#json_encoding), a JSON message that validates against the above Avro schema should look like the following, because of the union types used:
{
"ID":1,
"NAME":{
"string":"Kafka"
},
"STATE":-1,
"TIMESTAMP":{
"string":"2022-04-28T10:57:03.048413"
}
}
When producing this message via the Confluent REST Proxy (Avro), everything works fine: the data is accepted, validated, and present in Kafka.
When using the SerializingProducer from the confluent_kafka Python package, the example message is not accepted and only "regular" JSON works, e.g.:
{
"ID":1,
"NAME":"Kafka",
"STATE":-1,
"TIMESTAMP":"2022-04-28T10:57:03.048413"
}
Is this intended behaviour or am I doing something wrong? Can I tell the SerializingProducer to accept this encoding?
I need to keep both ways of producing messages available, but the sending system can/wants to provide only one of the above payloads. Is there a way to support both with the same payload?
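For illustration only (this is not a switch offered by confluent_kafka, as far as I know): one way to accept both payload shapes with the same producer would be to normalize the Avro-JSON-encoded form, where union values arrive wrapped as {"type": value}, into the plain form before serializing. A sketch with a hypothetical unwrap helper:

def unwrap_avro_json(record, union_fields=("NAME", "TIMESTAMP")):
    # Unwrap {"string": "Kafka"} -> "Kafka" for the union-typed fields only,
    # so plain values and genuine dict-valued fields are left untouched.
    plain = dict(record)
    for field in union_fields:
        value = plain.get(field)
        if isinstance(value, dict) and len(value) == 1:
            plain[field] = next(iter(value.values()))
    return plain

payload = {
    "ID": 1,
    "NAME": {"string": "Kafka"},
    "STATE": -1,
    "TIMESTAMP": {"string": "2022-04-28T10:57:03.048413"},
}
print(unwrap_avro_json(payload))
# {'ID': 1, 'NAME': 'Kafka', 'STATE': -1, 'TIMESTAMP': '2022-04-28T10:57:03.048413'}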
Thanks in advance.
Best regards
I am a new Elasticsearch user, and I am struggling to accomplish something that was easy for me in Splunk. There are a few specific fields that I want from each event in my search, but the search hits are always returned in a big JSON structure that is 95% useless to me. I do my searches with the Python requests module, so I can parse out the results I want in Python when they return, but I have to access millions of events and performance is important, so I hope there is a faster way.
Here is an example of one single event returned from an Elasticsearch search:
<Response [200]>
{
"hits": {
"hits": [
{
"sort": [
1559438581000
],
"_type": "_doc",
"_source": {
"datapoint": {
"updated_at": "2019-06-02T00:01:02Z",
"value": 102
},
"metadata": {
"id": "AB33",
"property_name": "some_property",
"oem_model": "some_model"
}
},
"_score": null,
"_index": "datapoint-2019.06",
"_id": "datapoint+4+314372003"
},
What I would prefer is for my search to return only results in a table/.csv/dataframe format of the updated_at,value,id,property_name,oem_model values like this:
2019-06-02T00:01:02Z,102,AB33,some_property,some_model
..... and similar for other events ...
Does anyone know if this is possible to do with Elasticsearch or with the requests library, without parsing the JSON after the search output is returned? Thank you very much for any help.
Yes, sure, with source filtering. You filter the fields to be returned by your query, so you get only the useful fields and then you don't need to parse the whole JSON. Have a look here:
from elasticsearch import Elasticsearch
es = Elasticsearch()
query = {
"_source": [ "obj1.*", "obj2.*" ], #this is the list of the fields that you would return as a doc
"query" : {
"term" : { "user" : "kimchy" }
}
}
res = es.search(index="your_index_name", body=query)
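Continuing from the search above, you still unpack each (now much smaller) hit, but only the fields you asked for are present. A sketch that flattens the hits into the CSV layout from the question, assuming the datapoint/metadata document structure shown there:

import csv
import sys

# Each hit now carries only the filtered _source fields; flatten them
# into updated_at,value,id,property_name,oem_model rows.
writer = csv.writer(sys.stdout)
for hit in res["hits"]["hits"]:
    src = hit["_source"]
    writer.writerow([
        src["datapoint"]["updated_at"],
        src["datapoint"]["value"],
        src["metadata"]["id"],
        src["metadata"]["property_name"],
        src["metadata"]["oem_model"],
    ])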
I am using Python 3.5 and Django for a web API. When I refer to input, I mean HTTP request parameters. I have a parameter where I am expecting JSON data which I need to validate before processing further.
I have a base JSON structure that the input has to follow.
Example,
{
"error": "bool",
"data": [
{
"name": "string",
"age": "number"
},
{
"name": "string",
"age": "number"
},
...
]
}
The above JSON represents the structure that I want my input to be in. The keys are predefined, and each value represents the datatype I am expecting for that key. I came across a Python library (jsonschema) that does this validation, but I can't find any documentation on how it works with dynamic data, i.e. where the JSON array 'data' can contain any number of objects. Of course, this is the most simple scenario I could come up with to explain the basic requirement. In cases like these, how can I validate my JSON?
The solution here didn't help because it just checks whether the JSON is valid against a Django model. My JSON has no relation to a Django model; it's a simple JSON structure. It still doesn't tell me how to validate dynamic objects.
JSON Schema is a specification for validating JSON; jsonschema is just a Python library that implements it. It certainly does allow you to specify that an array can contain any number of elements.
An example of a JSON Schema that validates your input might be:
{
"$schema": "http://json-schema.org/draft-04/schema#",
"type": "object",
"additionalProperties": false,
"required": [
"error",
"data"
],
"properties": {
"error": {
"type": "boolean"
},
"data": {
"type": "array",
"items": {
"type": "object",
"additionalProperties": false,
"properties": {
"name": {
"type": "string"
},
"age": {
"type": "integer"
}
}
}
}
}
}
See https://spacetelescope.github.io/understanding-json-schema/ for a good overview
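For completeness, validating an input document against that schema with the jsonschema package might look like this (assuming the schema above is saved as schema.json):

import json
from jsonschema import validate, ValidationError

# Load the JSON Schema shown above (assumed to be saved as schema.json).
with open("schema.json") as f:
    schema = json.load(f)

payload = {
    "error": False,
    "data": [
        {"name": "Alice", "age": 30},
        {"name": "Bob", "age": 25},
    ],
}

try:
    validate(instance=payload, schema=schema)
    print("valid")
except ValidationError as exc:
    print("invalid:", exc.message)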
Take a look at the documentation of Python's JSON API. I believe json.tool is what you're looking for; however, there are a couple of other ways to validate JSON using that API.
I've got a question about the REST API for ConnectWise. I've been doing GET and POST requests with no issue, but when I do a PATCH request I get a 400 response with a 'field value is invalid' message regardless of what I try. I'm on 2016v1 and using the REST API, making calls from Python with the requests library.
The REST API documentation says the following object is supposed to be passed in the body, but I haven't a clue what values are supposed to go with these keys:
{
op (string, optional),
path (string,optional),
value (string,optional)
}
I've tried dozens of calls including with the following bodies:
{'summary': 'updatedsummarytext'}
{'value': {'summary': 'updatedsummarytext'}}
{'op': {'summary': 'updatedsummarytext'}}
I have only gotten the following response so far:
<Response [400]>
{
"code": "InvalidObject",
"message": "operations object is invalid",
"errors": [
{
"code": "InvalidField",
"message": "The field value is invalid.",
"resource": "operations",
"field": "value"
}
]
}
Is there a specific value that ConnectWise is expecting for the op or value keys, or is there something I'm missing that is unique to PATCH REST API calls?
At a basic level, the PATCH calls use JSON Patch (RFC 6902).
Consider the following (simplified) Ticket document:
{
"summary": "Initial Ticket Summary",
"id": 1,
"company": {
"id": 5
},
"board": {
"id": 10
}
}
If you wish to update the summary field, your PATCH request JSON would look like this:
[
{"op": "replace", "path": "/summary", "value": "Updated Summary"}
]
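Sent from Python with the requests library, that might look like the following sketch; the URL and credentials are placeholders for your ConnectWise instance:

import requests

# Placeholder URL and credentials for your ConnectWise instance.
url = "https://your-cw-host/v4_6_release/apis/3.0/service/tickets/1"
operations = [
    {"op": "replace", "path": "/summary", "value": "Updated Summary"},
]

resp = requests.patch(url, json=operations, auth=("company+public_key", "private_key"))
print(resp.status_code, resp.json())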
Hope this helps.
We're in the process of writing a Django app that lets users send private messages among themselves, as well as send messages to a group, and we are looking to implement per-user customized search functionality so each user can search and view only the messages they have received.
How do we offer a search experience that's customized to each user? Some messages are part of threads sent to thousands of users as part of a group, whereas others may be private messages sent between 2 users and even others may be "pending" messages that are held for moderation.
Do we hard-code the filters that determine whether a user can view a message into each query we send to Elasticsearch, or, if a message goes to a group with 1000 members, do we add 1000 identical documents to Elasticsearch with only the recipient changing?
Update
So here's an individual message in its serialized form:
{
"snippet": "Hi All,Though Marylan...", // Friendly snippet, this will be needed in the result
"thread_id": 28719, // Unique ID for this thread
"thread_title": "Great Thread Title Here", // Title for the thread, will be used to diplay in search results
"sent_at": "2015-03-19 07:28:15.092030-05:00", // Datetime the message was originr
"text": "Clean Message Test Here", // Text to be queryable
"pending": false, // If pending, this should only appear in the search results of the sender
"id": 30580, // Unique ID for this message across the entire
"sender": {
"sender_is_staff": false, // If the sender is a staff member or not (Filterable)
"sender": "Anna M.", // Friendly name (we'll need this to display on the result page)
"sender_guid": "23234304-eeee-bbbb-1234-bfb19d56ad68" // Guid of sender (necessary to display a link to the user's profile in the result)
},
"recipient" {
"name": "", // Not filled in for group messages
"recipient_guid": "" // Not filled in for group messages
},
"type": "group", // Values for this can be 'direct' or 'group'
"group_id": 43 // This could be null
}
A user should be able to search:
All the messages that they're the "sender" of
All messages where their GUID is in the "recipient" area (and the "type" is "direct")
All the messages sent to the groups IDs they're a member of that are not pending (they could be a member of 100 groups though, so it could be [10,14,15,18,25,44,50,60,75,80,81,82,83,...])
In SQL that'd be SELECT * FROM messages WHERE text contains 'query here' AND (sender.guid = 'my-guid' OR recipient.guid = 'my-guid' OR (group_id in [10,14,15,18,25,44,50,60,75,80,81,82,83,...] AND pending != True))
I hope I'm understanding your problem correctly.
So you have a messaging system where there are 3 types of messages (group, 2-users, moderated). Your goal is to allow your users to search through all messages, with the option to apply filters on type, user, date, etc.
Take advantage of the scalable nature of Elasticsearch for storing your searchable data. First, consider the servers your ES nodes are running on: do they have sufficient resources (memory, CPU, network, disk speed) for your traffic and the size/quantity of your documents? Once you've decided on the server specs, you can simply add more nodes as needed to distribute data and processing.
Next, create your message document structure. I imagine your mapping may look something like this:
"message": {
"properties": {
"id": {
"type": "long"
},
"type": {
"type": "string"
},
"body": {
"type": "string"
},
"from_user": {
"type": "object",
"properties": {
"id": {
"type": "integer"
},
"name": {
"type": "string"
}
}
},
"to_user": {
"type": "object",
"properties": {
"id": {
"type": "integer"
},
"name": {
"type": "string"
}
}
},
"group": {
"type": "object",
"properties": {
"id": {
"type": "integer"
},
"name": {
"type": "string"
}
}
},
"added_on": {
"type": "date"
},
"updated_on": {
"type": "date"
},
"status_id": {
"type": "short"
}
}}
You may want to create custom analyzers for the "body" and "name" fields to customize your search results to fit your expectations. Then it's just a matter of writing queries and using filters/sorts to allow users to search globally or from/to specific users or groups.
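For example, the three visibility rules from the question could be expressed as a single bool query. This is a sketch using the field names from the mapping above, plus an assumed pending boolean field taken from the question's document:

user_id = 42
group_ids = [10, 14, 15, 18, 25]

query = {
    "query": {
        "bool": {
            "must": [{"match": {"body": "query here"}}],
            "filter": {
                "bool": {
                    "minimum_should_match": 1,
                    "should": [
                        # 1. messages the user sent
                        {"term": {"from_user.id": user_id}},
                        # 2. direct messages addressed to the user
                        {"bool": {"must": [
                            {"term": {"to_user.id": user_id}},
                            {"term": {"type": "direct"}},
                        ]}},
                        # 3. non-pending messages in the user's groups
                        {"bool": {
                            "must": [{"terms": {"group.id": group_ids}}],
                            "must_not": [{"term": {"pending": True}}],
                        }},
                    ],
                }
            },
        }
    }
}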
After that, you just need to set up a bridge between your database and your ES index for syncing your messages for search. Sync frequency depends on how quickly you want messages to be made available for search.
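One simple form of such a bridge is to index each message whenever it is saved, e.g. via a Django post_save signal. A sketch in which Message and its to_search_doc() helper are hypothetical:

from django.db.models.signals import post_save
from django.dispatch import receiver
from elasticsearch import Elasticsearch

from myapp.models import Message  # hypothetical message model

es = Elasticsearch()

@receiver(post_save, sender=Message)
def index_message(sender, instance, created, **kwargs):
    # Index (or re-index) the message document whenever it is saved;
    # to_search_doc() is a hypothetical serializer on the model.
    es.index(index="messages", id=instance.id, body=instance.to_search_doc())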
Well, I truly hope I understood your question correctly. Otherwise, OK...