youtube api TopicId translating? - python

So I have topicIds for a video
[u'/m/0lbxz:/m/04rlf', u'/m/06m6j', u'/m/017bqr', u'/m/05r5c'].
I am trying to translate the list into something like [people, children...]
Is there an API call that takes in a topicID and returns the pertaining topic?

Those are Freebase topics and you can retrieve them using the Freebase topic API https://developers.google.com/freebase/v1/topic-overview you can also dump and build your own local dataset using freebase dumps https://developers.google.com/freebase/data
However I would advise caution on using this, as freebase API has been deprecated.
At any case, here is a simple example on how to retrieve the data that you are looking for
$ curl "https://www.googleapis.com/freebase/v1/search?query=/m/06m6j&indent=true"
{
"status": "200 OK",
"result": [
{
"mid": "/m/06m6j",
"id": "/en/ragtime",
"name": "Ragtime",
"notable": {
"name": "Musical genre",
"id": "/music/genre"
},
"lang": "en",
"score": 20.126026
}
],
"cost": 4,
"hits": 1
}

Related

AVRO - JSON Enconding of messages in Confluent Python SerializingProducer vs. Confluent Rest Proxy with UNIONS

attached an example AVRO-Schema
{
"type": "record",
"name": "DummySampleAvroValue",
"namespace": "de.company.dummydomain",
"fields": [
{
"name": "ID",
"type": "int"
},
{
"name": "NAME",
"type": [
"null",
"string"
]
},
{
"name": "STATE",
"type": "int"
},
{
"name": "TIMESTAMP",
"type": [
"null",
"string"
]
}
]
}
Regarding the section "JSON Encoding" of the official AVRO-Specs - see: https://avro.apache.org/docs/current/spec.html#json_encoding - a JSON Message which validates against the above AVRO-Schema should look like the following because of the UNION-Types used:
{
"ID":1,
"NAME":{
"string":"Kafka"
},
"STATE":-1,
"TIMESTAMP":{
"string":"2022-04-28T10:57:03.048413"
}
}
When producing this message via Confluent Rest Proxy (AVRO), everything works fine, the data is accepted, validated and present in Kafka.
When using the "SearializingProducer" from the confluent_kafka Python Package, the example message is not accepted and only "regular" JSON works, e. g.:
{
"ID":1,
"NAME":"Kafka",
"STATE":-1,
"TIMESTAMP":"2022-04-28T10:57:03.048413"
}
Is this intended behaviour or am I doing something wrong? Can I tell the SerializingProducer to accept this encoding?
I need to hold open both ways to produce messages but the sending system can/want´s only to provide one of the above Payloads. Is there a way to use both with the same payload?
Thanks in advance.
Best regards

I am getting an error while using Python YouTube API

I am trying to create a Python program to get channel statisitics, but when I run it the YouTube API website gives this output (error):
{
"error": {
"code": 400,
"message": "'statisitcs'",
"errors": [
{
"message": "'statisitcs'",
"domain": "youtube.part",
"reason": "unknownPart",
"location": "part",
"locationType": "parameter"
}
]
}
}
This is my code:
class YTstats:
def __init__(self, api_key, channel_id):
self.api_key = api_key
self.channel_id = channel_id
self.channel_stats = None
def get_channel_statistics(self):
url = f'https://www.googleapis.com/youtube/v3/channels?part=statisitcs&id={self.channel_id}&key={self.api_key}'
print(url)
API_KEY = 'I cannot share my api key so I am not showing it but it is in my code'
yt = YTstats(API_KEY, 'UCbXgNpp0jedKWcQiULLbDTA')
yt.get_channel_statistics()
This problem is fixed now (it was a typo)
You have a typo here part=statisitcs - as you mentioned in your comment - and after looking closely to the code you provided in your question.
Next time, check closely your code and try to replicate the error using the try-it demo feature in the YouTube Data API documentation.
I do get the statistics of the channel_id you provided - that is: UCbXgNpp0jedKWcQiULLbDTA:
These are their statistics:
"statistics": {
"viewCount": "5642720",
"subscriberCount": "98400",
"hiddenSubscriberCount": false,
"videoCount": "158"
}
}
See the demo here.
Here is the full response:
{
"kind": "youtube#channelListResponse",
"etag": "lG-nYlbLnN81gtjVKe1zKPW6v7A",
"pageInfo": {
"totalResults": 1,
"resultsPerPage": 5
},
"items": [
{
"kind": "youtube#channel",
"etag": "j4Fo8qKbWrLHnQYB8sCI8_I4v9A",
"id": "UCbXgNpp0jedKWcQiULLbDTA",
"snippet": {
"title": "Python Engineer",
"description": "Free Python and Machine Learning Tutorials!\n\nHi, I'm Patrick. I’m a passionate Software Engineer who loves Machine Learning, Computer Vision, and Data Science. I create free content in order to help more people get into those fields. If you have any questions, feedback, or comments, just shoot me a message! I am happy to talk to you :)\n\nIf you like my content, please subscribe to the channel!\n\nPlease check out my website for more information:\nhttps://www.python-engineer.com\n\nIf you find these videos useful and would like to support my work you can find me on Patreon:\nhttps://www.patreon.com/patrickloeber\n\nLegal: https://www.python-engineer.com/legal-notice/\n",
"customUrl": "pythonengineer",
"publishedAt": "2019-05-03T11:22:33Z",
"thumbnails": {
"default": {
"url": "https://yt3.ggpht.com/ytc/AKedOLTs-Pvd4mvUi-m2rDLd8bzrKwS5a8C9HnDbkUDzHw=s88-c-k-c0x00ffffff-no-rj",
"width": 88,
"height": 88
},
"medium": {
"url": "https://yt3.ggpht.com/ytc/AKedOLTs-Pvd4mvUi-m2rDLd8bzrKwS5a8C9HnDbkUDzHw=s240-c-k-c0x00ffffff-no-rj",
"width": 240,
"height": 240
},
"high": {
"url": "https://yt3.ggpht.com/ytc/AKedOLTs-Pvd4mvUi-m2rDLd8bzrKwS5a8C9HnDbkUDzHw=s800-c-k-c0x00ffffff-no-rj",
"width": 800,
"height": 800
}
},
"localized": {
"title": "Python Engineer",
"description": "Free Python and Machine Learning Tutorials!\n\nHi, I'm Patrick. I’m a passionate Software Engineer who loves Machine Learning, Computer Vision, and Data Science. I create free content in order to help more people get into those fields. If you have any questions, feedback, or comments, just shoot me a message! I am happy to talk to you :)\n\nIf you like my content, please subscribe to the channel!\n\nPlease check out my website for more information:\nhttps://www.python-engineer.com\n\nIf you find these videos useful and would like to support my work you can find me on Patreon:\nhttps://www.patreon.com/patrickloeber\n\nLegal: https://www.python-engineer.com/legal-notice/\n"
}
},
"contentDetails": {
"relatedPlaylists": {
"likes": "",
"uploads": "UUbXgNpp0jedKWcQiULLbDTA"
}
},
"statistics": {
"viewCount": "5642720",
"subscriberCount": "98400",
"hiddenSubscriberCount": false,
"videoCount": "158"
}
}
]
}
Try to wait a few minutes after the call you made - probably, the API couldn't retrieve the data you requested due to excessive requests or the channel itself didn't have their statistics publicly available.

Override the automatically generated parameters in FastAPI

I am writing FastAPI program that is just a bunch of #app.get endpoints for querying data. There are many, many different query arguments they could use that are automatically generated from a config file, for example the #app.get("/ADJUST_COLOR/") endpoint could look something like /ADJUST_COLOR/?RED_darker=10&BLUE_lighter=43&GREEN_inverse=true where all those parameters are generated from a list of colors and a list of operations to perform on those colors (This is only an example, not what I am actually doing).
The way I am doing that is to take in the request object like this:
#app.get("/ADJUST_COLOR/")
def query_COLORS( request: Request ):
return look_through_parameters(request.query_params)
But the problem is that the automatically generated swagger UI does not show any useful data:
Since I am parsing the request manually there are no parameters generated. But since I have a full list of the parameters I am expecting then I should be able to generate my own documentation and have the UI show it.
I have looked through these two documents: https://fastapi.tiangolo.com/tutorial/path-operation-configuration/
And https://fastapi.tiangolo.com/advanced/path-operation-advanced-configuration/
But I was not able to figure out if it was possible or not
You can define custom api schema in your route via openapi_extra (this is a recent feature of FastAPI, 0.68 will work but I'm not sure the exact earliest version that supports this):
#app.get("/ADJUST_COLOR/", openapi_extra={
"parameters": [
{
"in": "query",
"name": "RED_darker",
"schema": {
"type": "integer"
},
"description": "The level of RED_darker"
},
{
"in": "query",
"name": "BLUE_lighter",
"schema": {
"type": "integer"
},
"description": "The level of BLUE_lighter"
},
{
"in": "query",
"name": "GREEN_inverse",
"schema": {
"type": "boolean"
},
"description": "is GREEN_inverse?"
},
]
})
async def query_COLORS(request: Request):
return look_through_parameters(request.query_params)
Which is rendered like this in your api /docs:

Elasticsearch: return specific fields from within json in .csv/table format

I am a new Elasticsearch user, but I am struggling to accomplish something that was easy for me in Splunk. There are a few specific fields that I want from each event in my search, but the search "hit" outputs are always returned in a big json structure that is 95% useless for me. I do my searches with the python requests module, so I can parse the results I want in python when they return, but I have to access millions of events and performance is important, so I hope there is a faster way.
Here is an example of one single event returned from an Elasticsearch search:
<Response [200]>
{
"hits": {
"hits": [
{
"sort": [
1559438581000
],
"_type": "_doc",
"_source": {
"datapoint": {
"updated_at": "2019-06-02T00:01:02Z",
"value": 102
},
"metadata": {
"id": "AB33",
"property_name": "some_property",
"oem_model": "some_model"
}
},
"_score": null,
"_index": "datapoint-2019.06",
"_id": "datapoint+4+314372003"
},
What I would prefer is for my search to return only results in a table/.csv/dataframe format of the updated_at,value,id,property_name,oem_model values like this:
2019-06-02T00:01:02Z,102,AB33,some_property,some_model
..... and similar for other events ...
Does anyone know if this is possible to do with Elasticsearch or with the requests library without parsing the json after the search output is returned? Thank you very much for any help.
Yes, sure with the source filtering. Doc here
You filter the field to be returned from your query, so in this way tou choose only the useful fields and then you should not parse the json. Have a look here:
from elasticsearch import Elasticsearch
es = Elasticsearch()
query = {
"_source": [ "obj1.*", "obj2.*" ], #this is the list of the fields that you would return as a doc
"query" : {
"term" : { "user" : "kimchy" }
}
}
res = es.search(index="your_index_name", body=query)

How to validate a json structure in Python3

I am using Python3.5 and Django for a web api.When I refer to input, I refer to a HTTP request parameters. I have a parameter where I am expecting a JSON data which I need to validate before processing further.
I have a base json structure that the input has to be in.
Example,
{
"error": "bool",
"data": [
{
"name": "string",
"age": "number"
},
{
"name": "string",
"age": "number"
},
...
]
}
The above JSON represents the structure that I want my input to be in. The keys are predefined, and the value represents the datatype of that key that I am expecting. I came across a Python library(jsonschema) that does this validation, but I can't find any documentation where it works with dynamic data. i.e. the objects inside the JSON array 'data' can be of any number, of course this is the most simple scenario I came up with for explaining the basic requirement. In cases like these, how can I validate my json?
The solution here didn't help because it's just checking if the json is proper or not based on the Django model. My json has no relation with Django model. Its a simple json structure. It still doesn't tell me how to validate dynamic object
JSON Schema is a specification for validating JSON; jsonschema is just a Python library that implements it. It certainly does allow you to specify that a key can contain any number of elements.
An example of a JSON Schema that validates your code might be:
{
"$schema": "http://json-schema.org/draft-04/schema#",
"type": "object",
"additionalProperties": false,
"required": [
"error",
"data"
],
"properties": {
"error": {
"type": "boolean"
},
"data": {
"type": "array",
"items": {
"type": "object",
"additionalProperties": false,
"properties": {
"name": {
"type": "string"
},
"age": {
"type": "integer"
}
}
}
}
}
}
See https://spacetelescope.github.io/understanding-json-schema/ for a good overview
Take a look into the documentation of Python's JSON API. I believe json.tool is what you're looking for, however there are a couple of other ways to validate JSON using that API.

Categories