I am using PyFlink to read protobuf messages and write them to a table. The protobuf looks like:
message test {
    int64 id = 1;
    string val = 2;
}
Actually, the protobuf descriptor is quite long, so I don't want to write the schema by hand like this:
Schema() \
    .field('id', DataTypes.BIGINT()) \
    .field('val', DataTypes.STRING())
How can I avoid that?
I found that I can use pyflink.table.protobuf.ProtobufSchemaConverter to convert the protobuf descriptor to a Flink schema, like:
ProtobufSchemaConverter().fromDescriptor(desc)
in PyFlink 1.10, but I can't find this function in PyFlink 1.16.
I have a highly nested dictionary containing pandas DataFrames, like:
{HEAD: {
    NameOne: {TAG: VALUE},
    NameTwo: DataFrame,
    NameThree: DataFrame
}}
and I want to send it to MongoDB via PyMongo
client = MongoClient('mylink')
db = client['DB_NAME']
collection = db['COLLECTION_NAME']
file = {...}
collection.insert_one(file)
But I get this error:
bson.errors.InvalidDocument: cannot encode object: (it shows my DataFrame here), of type:
PyMongo needs to be able to convert each element of the dictionary into something it can store as a BSON document. If you try to insert something it can't convert (such as a pandas DataFrame), you will see the InvalidDocument exception.
You will have to convert each of the embedded DataFrames into something PyMongo can encode before you can store the document in MongoDB.
You could start with df.to_dict().
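For example, a minimal sketch that walks the dictionary and replaces every DataFrame with a plain dict before inserting. encode_for_mongo is a hypothetical helper, and orient="records" is just one choice of how to flatten the rows:

import pandas as pd

def encode_for_mongo(obj):
    """Recursively replace DataFrames with structures BSON can encode."""
    if isinstance(obj, pd.DataFrame):
        return obj.to_dict(orient="records")  # list of row dicts
    if isinstance(obj, dict):
        return {key: encode_for_mongo(value) for key, value in obj.items()}
    return obj

collection.insert_one(encode_for_mongo(file))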
I've seen a wide variety of good answers on how to store JSONB in Postgres, but as of Postgres 9.5 we can insert into an existing JSONB array without rewriting the whole value in the column. None of the material I can find documents this anywhere, and since I'm new to SQLAlchemy (and somewhat new to Python), reading the code isn't helping me as much as I'd like.
I'm using Postgres 10.9, Python 3.7, and SQLAlchemy 1.3.8 (with the GINO wrapper).
This is just the latest attempt, but there have been many with a myriad of different errors:
await gathering. \
    update(participation=func.jsonb_insert("participation",
                                           "{applications}",
                                           func.to_jsonb(json.dumps(application)))). \
    apply()
In the participation column I have a JSONB object
{
"applications": [ { ... element to be appended} ]
}
In this particular case, the code yields an error:
could not determine polymorphic type because input has type unknown
Figured it out..
ajs = json.dumps(application)

await gathering. \
    update(participation=func.jsonb_insert(
        json.dumps(gathering.participation),
        ["applications", "0"], ajs)). \
    apply()
Here's to trial and error. There has GOT to be a more elegant solution than this.
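One possibly cleaner variant, sketched under assumptions: Gathering is the hypothetical GINO model class behind the gathering instance, and participation is its JSONB column. The idea is to reference the column itself so Postgres reads the current value server-side (no json.dumps of the existing data), and to cast both the path and the new element explicitly so no parameter arrives with an unknown type:

import json
import sqlalchemy as sa
from sqlalchemy.dialects.postgresql import JSONB, array

# Explicit casts so Postgres never sees an unknown-typed parameter.
path = sa.cast(array(["applications", "0"]), sa.ARRAY(sa.Text))
new_app = sa.cast(json.dumps(application), JSONB)

await gathering.update(
    participation=sa.func.jsonb_insert(
        Gathering.participation,  # column reference: Postgres reads the current value itself
        path,                     # path to the first element of the "applications" array
        new_app,
        True,                     # insert_after=True appends after the targeted element
    )
).apply()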
I'm writing an insert function and I have a nested dictionary that I want to insert into a column in Postgres. Is there a way to insert the whole JSON into the column? Let's say I have to insert the value of the key "val" into a column; how can I achieve that? I'm using the psycopg2 library in my Python code.
"val": {
"name": {
"mike": "2.3",
"roy": "4.2"
}
}
Yes, you can extract nested JSON using at least Postgres 9.4 and up by casting your string to JSON and using the "Get JSON object field by key" operator:
YOUR_STRING ::CAST JSON_OPERATOR
'{"val":1}' ::JSON -> 'val'
This works in at least Postgres 9.4 and up:
INSERT INTO my_table (my_json)
VALUES ('{"val":{"name":{"mike":"2.3"}}}'::JSON->'val');
Depending on your column type you may choose to cast to JSONB instead of JSON (the above will only work for TEXT and JSON).
INSERT INTO my_table (my_json)
VALUES ('{"val":{"name":{"mike":"2.3"}}}'::JSONB->'val');
See: https://www.postgresql.org/docs/9.5/static/functions-json.html
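If you'd rather do the insert from Python, a minimal psycopg2 sketch is to wrap the dict in psycopg2.extras.Json, which adapts it to a JSON/JSONB parameter. The connection string and the my_table/my_json names below are placeholders:

import psycopg2
from psycopg2.extras import Json

payload = {
    "val": {
        "name": {"mike": "2.3", "roy": "4.2"}
    }
}

conn = psycopg2.connect("dbname=mydb user=me")  # placeholder connection string
with conn, conn.cursor() as cur:
    # Json() serializes the dict so the nested structure is stored as-is.
    cur.execute(
        "INSERT INTO my_table (my_json) VALUES (%s)",
        (Json(payload["val"]),),
    )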
I am trying to query Azure Table Storage using Python. A column with an Int32 data type doesn't return its value; it returns something like azure.storage.table.models.EntityProperty obj..... In the case of string columns, I am not facing any such issue. Could someone please help me?
The column Pos in the script below is an integer column in the table:
queryfilter = "startDateTime gt datetime'%s' and temp eq '%s'" % (datefilter, temp)
task = table_service.query_entities(azureTable, filter=queryfilter)

for t in task:
    print(t.Pos)
Looking at the documentation here: https://learn.microsoft.com/en-us/python/api/azure.cosmosdb.table.models.entityproperty?view=azure-python, can you try the following?
for t in task:
    print(t.Pos.value)
Azure Table Storage has a new Python library in preview that is available for installation via pip. To install it, use the following pip command:
pip install azure-data-tables
This SDK is able to target either a Tables or Cosmos endpoint (albeit there are known issues with Cosmos).
The new library uses a similar TableEntity, which is a key-value type that inherits from the Python dictionary; the values are the same EntityProperty. There are two ways to access entity properties. If the type is an Int32 (the default integer type) or a String, they can be accessed directly:
my_value = entity.my_key # direct access
my_value = entity['my_key'] # same access pattern as a dict
If the EntityProperty is of type INT64 or BINARY, then you will have to use the .value notation:
my_value = entity.my_key.value # direct access
my_value = entity['my_key'].value # same access pattern as a dict
FYI I am a full-time engineer at Microsoft on the Azure SDK for Python team.
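Putting that together for the original query, here is a sketch only: the connection string and table name are placeholders, datefilter and temp come from the question, and Pos is assumed to be a plain Int32 as described:

from azure.data.tables import TableClient

# Placeholder connection string and table name.
table_client = TableClient.from_connection_string(conn_str, table_name="azureTable")

queryfilter = "startDateTime gt datetime'%s' and temp eq '%s'" % (datefilter, temp)
for entity in table_client.query_entities(queryfilter):
    print(entity["Pos"])  # Int32 values come back as plain ints in azure-data-tables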
I am using pymongo.
I have a MongoDB database in which all documents have a field like
"timestamp" : "25-OCT-2011"
so a string is stored under the key timestamp in every document.
I want to apply the Python function below to these string dates and convert them into datetime objects. What's the best way to do this in MongoDB?
import datetime

def make_date(str_date):
    return datetime.datetime.strptime(str_date, "%d-%b-%Y")
To fit your needs:
import bson

for document in list(database.collection.find({})):
    converted_date = make_date(document['timestamp'])
    database.collection.update_one(
        {"_id": bson.objectid.ObjectId(document['_id'])},
        {"$set": {"converted": converted_date}}  # set the new field without replacing the document
    )
I use the ObjectId as the query filter to be sure that I update the document I just retrieved. I do that because I'm unsure whether timestamp collisions would lead to unwanted consequences.
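If the collection is large, the per-document round trips can be batched with bulk_write; a sketch under the same assumptions, reusing make_date from above (connection details are placeholders from the question):

from pymongo import MongoClient, UpdateOne

client = MongoClient('mylink')  # placeholder connection string
collection = client['DB_NAME']['COLLECTION_NAME']

operations = [
    UpdateOne(
        {"_id": document["_id"]},
        {"$set": {"converted": make_date(document["timestamp"])}},
    )
    for document in collection.find({}, {"timestamp": 1})
]

if operations:
    collection.bulk_write(operations)  # sends the updates in batches instead of one at a time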