Matching the JSON schema while receiving data from the client - python

I wrote a Flask REST implementation to receive the following data.
After checking the client's API key, the server should store the data that arrives in the following API definition. The issue I am facing is that the field 'services' holds many strings rather than a single one, and I would appreciate any help.
{
  "id": "string",
  "termsAndConditions": "string",
  "offererBranchId": "string",
  "requesterBranchId": "string",
  "accepted": "2017-05-24T10:06:31.012Z",
  "services": [
    {
      "id": "string",
      "name": "string",
      "aggregationLevel": ["string"],
      "aggregationMethod": ["string"],
      "timestep": ["string"]
    }
  ]
}
My code is below; it works if the field 'services' holds a single string, like the other fields (i.e. "id", "termsAndConditions", etc.).
from flask import Flask, request, jsonify
from flask_pymongo import PyMongo

app = Flask(__name__)
app.config['MONGO_DBNAME'] = 'demo'
app.config['MONGO_URI'] = 'mongodb://xxxx@xxxx.mlab.com:xxxx/demo'
mongo = PyMongo(app)
users = mongo.db.users

@app.route('/service-offer/confirmed/REQUESTER', methods=['POST'])
def serviceofferconfirmed():
    key = request.headers.get('X-API-Key')
    users = mongo.db.users
    api_record = users.find_one({'name': "apikey"})
    actual_API_key = api_record['X-API-Key']
    if key == actual_API_key:
        offer = {
            "id": request.json["id"],
            "termsAndConditions": request.json["termsAndConditions"],
            "offererBranchId": request.json["offererBranchId"],
            "requesterBranchId": request.json["requesterBranchId"],
            "accepted": request.json["accepted"],
            "services": request.json["services"]  # Here I need help to match the schema.
        }
        users.insert_one(offer)
        return "Service Data Successfully Stored"
    return jsonify("Please check your API Key or URL")
I wish to receive the whole payload, which contains many strings, and store it under the field 'services'.

You can use isinstance() to check the type of request.json["services"].
If you don't want the value of 'services' to be a plain string:
if not isinstance(request.json["services"], str):
    # your code..........
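Building on that check, here is a minimal validation sketch (an illustration, not part of the original answer) that verifies 'services' is a list of objects carrying the keys from the API definition above before the insert:
REQUIRED_SERVICE_KEYS = {"id", "name", "aggregationLevel", "aggregationMethod", "timestep"}

services = request.json.get("services")
if not isinstance(services, list):
    return jsonify("'services' must be a list"), 400
for service in services:
    # each entry must be an object containing at least the keys from the API definition
    if not isinstance(service, dict) or not REQUIRED_SERVICE_KEYS <= service.keys():
        return jsonify("each service must match the schema"), 400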


How to call a Python function by getting MongoDB collection values

How do I create a document and collection in MongoDB to drive Python code configuration, i.e. get the attribute name, datatype, and function to be called from MongoDB?
MongoDB collection sample example:
db.attributes.insertMany([
  { attributes_names: "email", attributes_datype: "string", attributes_isNull: "false", attributes_std_function: "email_valid" },
  { attributes_names: "address", attributes_datype: "string", attributes_isNull: "false", attributes_std_function: "address_valid" }
]);
Python script and function:
from pyspark.sql.functions import regexp_replace, lower, expr

def email_valid(df):
    df1 = df.withColumn(df.columns[0], regexp_replace(lower(df.columns[0]), "[^a-zA-Z0-9@\._\-| ]", ""))
    extract_expr = expr(
        "regexp_extract_all(emails, '(\\\w+([\\\.-]?\\\w+)*@[A-Za-z\-\.]+([\\\.-]?\\\w+)*(\\\.\\\w{2,3})+)', 0)")
    df2 = df1.withColumn(df.columns[0], extract_expr) \
        .select(df.columns[0])
    return df2
How do I get all the MongoDB values into the Python script and call the function according to the attributes?
To create a MongoDB collection from a Python script:
import pymongo
# connect to your mongodb client
client = pymongo.MongoClient(connection_url)
# connect to the database
db = client[database_name]
# get the collection
mycol = db[collection_name]
from bson import ObjectId
from random_object_id import generate
# create a sample dictionary for the collection data
mydict = { "_id": ObjectId(generate()),
"attributes_names": "email",
"attributes_datype": "string",
"attributes_isNull":"false",
"attributes_std_function" : "email_valid" }
# insert the dictionary into the collection
mycol.insert_one(mydict)
To insert multiple values into MongoDB, use insert_many() instead of insert_one() and pass a list of dictionaries to it. So your list of dictionaries will look like this (the insert_many call is shown after the list):
mydict = [{ "_id": ObjectId(generate()),
"attributes_names": "email",
"attributes_datype": "string",
"attributes_isNull":"false",
"attributes_std_function" : "email_valid" },
{ "_id": ObjectId(generate()),
"attributes_names": "email",
"attributes_datype": "string",
"attributes_isNull":"false",
"attributes_std_function" : "email_valid" }]
To get all the data from a MongoDB collection into the Python script:
data = list()
for x in mycol.find():
    data.append(x)
import pandas as pd
df = pd.json_normalize(data)
And then access the data as you access an element of a list of dictionaries:
value = data[0]["attributes_names"]
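To actually call the function according to the attributes, one option is a plain dispatch table. This is a hedged sketch that assumes the validation functions (email_valid, address_valid) are defined at module level as in the question, and that df is the Spark DataFrame being validated:
# map the attributes_std_function strings stored in MongoDB to real functions
validators = {
    "email_valid": email_valid,
    "address_valid": address_valid,
}

for attr in mycol.find():
    func = validators.get(attr["attributes_std_function"])
    if func is not None:
        df = func(df)  # apply the configured validation to the DataFrame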

Handling special JSON characters in REDIS using python

I am trying to write a JSON document with special characters to REDIS and retrieve it, but the special characters are getting converted.
The special characters in "Mój" and "Můj" come back escaped instead of as the original text.
from rejson import Client, Path
import json

rj = Client(host='localhost', port=6360, decode_responses=True)
app_details = {
    "applist": [
        {
            "appname": "Mój",
            "country": "PL"
        },
        {
            "appname": "Můj",
            "country": "CZ"
        }
    ],
    "lasttimestamp": "2021-01-03 12:58:26",
    "loadtype": "F"
}
rj.jsonset('app_details', Path.rootPath(), app_details)
valo = rj.jsonget('app_details', Path('.applist'))
print(type(valo[0]))
print(valo)
for i in valo:
    app = i["appname"]
    country = i["country"]
    print(app)
The issue was resolved by adding an extra parameter to the JSONGET call:
valo = rj.jsonget('app_details',Path('.applist'),no_escape=True)
This solved the problem and the data is fetched properly.
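As a quick round-trip check (a sketch against the same local instance as above), the fetched values should now compare equal to the originals:
valo = rj.jsonget('app_details', Path('.applist'), no_escape=True)
assert valo[0]["appname"] == "Mój"
assert valo[1]["appname"] == "Můj"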

bson.errors.InvalidDocument: key '$numberDecimal' must not start with '$' when using json

I have a small JSON file with the following lines:
{
"IdTitulo": "Jaws",
"IdDirector": "Steven Spielberg",
"IdNumber": 8,
"IdDecimal": "2.33"
}
And there is a schema in my db collection, named test_dec. This is what I've used to create the schema:
db.createCollection("test_dec", {
  validator: {
    $jsonSchema: {
      bsonType: "object",
      required: ["IdTitulo", "IdDirector"],
      properties: {
        IdTitulo: {
          "bsonType": "string",
          "description": "string type, name of the movie"
        },
        IdDirector: {
          "bsonType": "string",
          "description": "string type, name of the director"
        },
        IdNumber: {
          "bsonType": "int",
          "description": "number type to test"
        },
        IdDecimal: {
          "bsonType": "decimal",
          "description": "decimal type"
        }
      }
    }
  }
})
I've made multiple attempts to insert the data. The problem is in the IdDecimal field value.
Some of the trials replaced the IdDecimal line with:
"IdDecimal": 2.33
"IdDecimal": {"$numberDecimal": "2.33"}
"IdDecimal": NumberDecimal("2.33")
None of them works. The second one is the formal solution provided by the MongoDB manuals (mongodb-extended-json), and the error is the output I've placed in my question: bson.errors.InvalidDocument: key '$numberDecimal' must not start with '$'.
I am currently using a Python script to load the JSON. I've been playing around with this file:
import os, sys
import re
import io
import json
from pymongo import MongoClient
from bson.raw_bson import RawBSONDocument
from bson.json_util import CANONICAL_JSON_OPTIONS, dumps, loads
import bsonjs as bs

# connection
client = MongoClient('localhost', 27018, document_class=RawBSONDocument)
db = client['myDB']
coll = db['test_dec']
other_col = db['free']

for fname in os.listdir('/mnt/win/load'):
    num = re.findall("\d+", fname)
    if num:
        with io.open(fname, encoding="ISO-8859-1") as f:
            doc_data = loads(dumps(f, json_options=CANONICAL_JSON_OPTIONS))
            print(doc_data)

test = '{"idTitulo":"La pelicula","idRelease":2019}'
raw_bson = bs.loads(test)
load_raw = RawBSONDocument(raw_bson)
db.other_col.insert_one(load_raw)
client.close()
I am using a JSON file. If I try to put anything like Decimal128('2.33') in it, the output is "ValueError: No JSON object could be decoded", because that makes my JSON invalid.
The result of
db.other_col.insert_one(load_raw)
is that the content of "test" is inserted.
But I cannot use doc_data with RawBSONDocument, because it fails. It says:
TypeError: unpack_from() argument 1 must be string or buffer, not list:
When I manage to parse the JSON directly into a RawBSONDocument, I get all the raw text within, and the record in the database looks like this sample:
{
"_id" : ObjectId("5eb2920a34eea737626667c2"),
"0" : "{\n",
"1" : "\t\"IdTitulo\": \"Gremlins\",\n",
"2" : "\t\"IdDirector\": \"Joe Dante\",\n",
"3" : "\t\"IdNumber\": 6,\n",
"4" : "\"IdDate\": {\"$date\": \"2010-06-18T:00.12:00Z\"}\t\n",
"5" : "}\n"
}
It seems it is not that simple to load an extended JSON into MongoDB. I want the extended version because I want to use schema validation.
Oleg pointed out that it is numberDecimal and not NumberDecimal as I had it before. I've fixed the JSON file, but nothing changed.
Executed:
with io.open(fname, encoding="ISO-8859-1") as f:
    doc_data = json.load(f)
coll.insert(doc_data)
And the json file:
{
"IdTitulo": "Gremlins",
"IdDirector": "Joe Dante",
"IdNumber": 6,
"IdDecimal": {"$numberDecimal": "3.45"}
}
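Note that plain json.load keeps {"$numberDecimal": "3.45"} as an ordinary dict, so pymongo still sees a key starting with '$'. A minimal sketch of the same read using bson.json_util instead, which decodes that key to Decimal128:
from bson.json_util import loads

with io.open(fname, encoding="ISO-8859-1") as f:
    doc_data = loads(f.read())  # IdDecimal becomes Decimal128('3.45')
coll.insert_one(doc_data)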
One more roll of the dice from me. Since you are using schema validation, I would recommend defining a class and being explicit about how each field is defined and converted to the relevant Python datatype. While your solution is generic, the data structure has to be rigid to match the validation.
IMO this is clearer, and you have control over any errors etc. within the class.
Just to confirm I ran the schema validation and this works with the supplied validation.
from pymongo import MongoClient
import bson.json_util
import dateutil.parser
import json

class Film:
    def __init__(self, file):
        data = file.read()
        loaded = json.loads(data)
        self.IdTitulo = loaded.get('IdTitulo')
        self.IdDirector = loaded.get('IdDirector')
        self.IdDecimal = bson.json_util.Decimal128(loaded.get('IdDecimal'))
        self.IdNumber = int(loaded.get('IdNumber'))
        self.IdDateTime = dateutil.parser.parse(loaded.get('IdDateTime'))

    def insert_one(self, collection):
        collection.insert_one(self.__dict__)

client = MongoClient()
mycollection = client.mydatabase.test_dec

with open('c:/temp/1.json', 'r') as jfile:
    film = Film(jfile)
    film.insert_one(mycollection)
gives:
> db.test_dec.findOne()
{
"_id" : ObjectId("5eba79eabf951a15d32843ae"),
"IdTitulo" : "Jaws",
"IdDirector" : "Steven Spielberg",
"IdDecimal" : NumberDecimal("2.33"),
"IdNumber" : 8,
"IdDateTime" : ISODate("2020-05-12T10:08:21Z")
}
>
JSON file used:
{
"IdTitulo": "Jaws",
"IdDirector": "Steven Spielberg",
"IdNumber": 8,
"IdDecimal": "2.33",
"IdDateTime": "2020-05-12T11:08:21+0100"
}
JSON with type information is called Extended JSON. Following the examples, construct extended json for your data:
ext_json = '''
{
"IdTitulo": "Jaws",
"IdDirector": "Steven Spielberg",
"IdNumber": 8,
"IdDecimal": {"$numberDecimal":"2.33"}
}
'''
In Python, use json_util to load extended json into a Python dictionary:
from bson.json_util import loads
doc = loads(ext_json)
print(doc)
# {u'IdTitulo': u'Jaws', u'IdDirector': u'Steven Spielberg', u'IdDecimal': Decimal128('2.33'), u'IdNumber': 8}
The result of this load is sometimes referred to as a "BSON document" but it is not BSON, which is binary. "BSON" in this context really means that some values are not of python standard library types. The "document" part basically means the object is a dictionary.
You will notice that IdDecimal is of a non-standard library type:
print(type(doc['IdDecimal']))
# <class 'bson.decimal128.Decimal128'>
To insert this dictionary into MongoDB, follow the pymongo tutorial:
from pymongo import MongoClient
client = MongoClient('localhost', 14420)
db = client.test_database
collection = db.test_collection
collection.insert_one(doc)
print(doc)
Finally, I've got the solution and it is using RawBSONDocument.
First the json file:
{
"IdTitulo": "Dead Snow",
"IdDirector": "Tommy Wirkola",
"IdNumber": 11,
"IdDecimal": {"$numberDecimal": "2.22"}
}
and the validation schema file:
db.createCollection("test_dec", {
  validator: {
    $jsonSchema: {
      bsonType: "object",
      required: ["IdTitulo", "IdDirector"],
      properties: {
        IdTitulo: {
          "bsonType": "string",
          "description": "string type, name of the movie"
        },
        IdDirector: {
          "bsonType": "string",
          "description": "string type, name of the director"
        },
        IdNumber: {
          "bsonType": "int",
          "description": "number type to test"
        },
        IdDecimal: {
          "bsonType": "decimal",
          "description": "decimal type"
        }
      }
    }
  }
})
So, the collection in this case is "test_dec".
And here is the Python script that opens the ".json" file, reads it, and parses it to be imported into MongoDB:
import json
from bson.raw_bson import RawBSONDocument
from pymongo import MongoClient
import bsonjs

# connection
client = MongoClient('localhost', 27018)
db = client['movieDB']
coll = db['test_dec']

# open and read the file
with open('1.json', 'r') as jfile:
    data = jfile.read()
loaded = json.loads(data)
dumped = json.dumps(loaded, indent=4)
bson_bytes = bsonjs.loads(dumped)
coll.insert_one(RawBSONDocument(bson_bytes))
client.close()
The inserted document:
{
"_id" : ObjectId("5eb971ec6fbab859dfae8a6f"),
"IdTitulo" : "Dead Snow",
"IdDirector" : "Toomy Wirkola",
"IdDecimal" : NumberDecimal("2.22"),
"IdNumber" : 11
}
I don't know how it flipped the fields IdDecimal and IdNumber, but it passes the validation and I am really happy.
I tried a document with 'hello' instead of a number in NumberDecimal and the insertion resulted in:
{
"_id" : ObjectId("5eb973b76fbab859dfae8ecd"),
"IdTitulo" : "Shining",
"IdDirector" : "Stanley Kubrick",
"IdDecimal" : NumberDecimal("NaN"),
"IdNumber" : 19
}
Thanks to all who tried to help, especially Oleg!!! Thank you for being so patient.
Could you not just use bson.decimal128.Decimal128? Or am I missing something?
from pymongo import MongoClient
from bson.decimal128 import Decimal128
db = MongoClient()['mydatabase']
data = {
"IdTitulo": "Jaws",
"IdDirector": "Steven Spielberg",
"IdNumber": 8,
"IdDecimal": "2.33"
}
data['IdDecimal'] = Decimal128(data['IdDecimal'])
db.other_col.insert_one(data)
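On the way back out, pymongo returns bson.decimal128.Decimal128 values; converting one to a standard decimal.Decimal is one extra call. A small sketch, assuming the document inserted above:
doc = db.other_col.find_one({"IdTitulo": "Jaws"})
value = doc["IdDecimal"].to_decimal()  # decimal.Decimal('2.33')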

How to accept None for String type field when using Flask-RESTPlus

I am just starting to develop with flask-restplus and I am not a native speaker,
but I will try to describe my question as clearly as I can.
I know there is a fields module in flask-restplus that helps us define and filter the response data type,
such as String, Integer, List, and so on.
Is there any way to allow NULL / None when using the fields module?
The following is my code that uses the fields module to catch the value:
add_group = api.model(
    "add_group",
    {"team_groups": fields.List(fields.Nested(api.model("team_groups", {
        "name": fields.String(example="chicago bulls", description="name of add group"),
        "display_name": fields.String(example="bulls", description="display name of add group")})))})
and if the data type of display_name is not String, the following error is raised:
{
"errors": {
"team_groups.0.display_name": "123 is not of type 'string'"
},
"message": "Input payload validation failed"
}
What I want is that when entering display_name, I can enter "bulls" or None.
It seems little reference data / few questions can be found on this; I only found one result related
to my question, and it eventually converts the value to a non-null one to solve the issue.
If any part of my question is not clear,
please let me know. Thank you.
The following is my development environment:
flask-restplus 0.13.0
Python 3.7.4
postman 7.18.1
The following is my updated code:
from flask_restplus import Namespace, fields

class NullableString(fields.String):
    __schema_type__ = ['string', 'null']
    __schema_example__ = 'nullable string'

class DeviceGroupDto:
    api = Namespace("device/group", description="device groups")
    header = api.parser().add_argument("Authorization", location="headers", help="Bearer ")
    get_detail_group = api.model(
        "getdetail",
        {"team_groups": fields.List(fields.String(required=True,
            description="team group id to get detail", example=1))})
    add_group = api.model(
        "add_group",
        {"team_groups": fields.List(fields.Nested(api.model("team_groups", {
            "name": fields.String(example="chicago bulls", description="name of add group"),
            "display_name": NullableString(attribute='a')})))})
If I input the following payload (via Postman):
{
"team_groups": [
{
"name": "chicago bulls",
"display_name": null
}
]
}
It still returns:
{
"errors": {
"team_groups.0.display_name": "None is not of type 'string'"
},
"message": "Input payload validation failed"
}
Yes, you can create a child class and use it instead of the default ones; it will accept None as well:
class NullableString(fields.String):
    __schema_type__ = ['string', 'null']
    __schema_example__ = 'nullable string'
So your code will look like
{ "property": NullableString(attribute=value)}
Additionally you can visit the issue github.com/noirbizarre/flask-restplus/issues/179
If some of your fields are optional, then set required=False:
add_group = api.model(
    "add_group",
    {"team_groups": fields.List(fields.Nested(api.model("team_groups", {
        "name": fields.String(example="chicago bulls", description="name of add group"),
        "display_name": fields.String(example="bulls", description="display name of add group", required=False)})))})
Here's a slightly evolved approach that I use. It lets you make fields of any type nullable.
def nullable(fld, *args, **kwargs):
    """Makes any field nullable."""

    class NullableField(fld):
        """Nullable wrapper."""

        __schema_type__ = [fld.__schema_type__, "null"]
        __schema_example__ = f"nullable {fld.__schema_type__}"

    return NullableField(*args, **kwargs)

employee = api.model(
    "Employee",
    {
        "office": nullable(fields.String),
        "photo_key": nullable(fields.String, required=True),
    },
)

python - Assigning a string to a field object

I'm working on Google BigQuery's Python client.
I'm writing a program to automate the creation/export of tables.
Everything is working fine, but there is a slight problem: I'd like to take the schema of the table as input from the user.
Here's how the table schema is assigned currently:
table.schema = (
bigquery.SchemaField('Name', 'STRING'),
bigquery.SchemaField('Gender', 'STRING'),
bigquery.SchemaField('Frequency', 'INTEGER')
)
It's hardcoded in the code.
I've written the code to process the user input and convert it to the format mentioned above.
What my code returns is a string - bq_schema - which looks like:
bigquery.SchemaField(Name, STRING),bigquery.SchemaField(Gender, STRING),bigquery.SchemaField(Frequency, INTEGER)
Now when I try to assign this string to the table schema,
table.schema = (bq_schema)
I get an error stating Schema items must be fields
So how do I make the table schema dynamic depending on the user input?
EDIT: As requested, here is the code for converting user input to a string:
s_raw = raw_input('\n\nEnter the schema for the table in the following format- fieldName:TYPE, anotherField:TYPE\n')
s = s_raw.split(',')
schema = []
for obj in s:
    temp = obj.split(':')
    schema.append(temp[0])
    schema.append(temp[1])

bq_schema = ''
for i in range(0, len(schema), 2):
    bq_schema += 'bigquery.SchemaField(\'{}\', \'{}\'),'.format(schema[i], schema[i+1])
To define how a field behaves in BigQuery, you basically need 3 inputs: name, type, and mode.
One issue you might find when processing the input schema is managing fields of type RECORD, because each of those fields has other fields defined inside of it.
This being the case, it would be somewhat difficult for a user to describe in plain strings the schema setup they're working with.
What I recommend you do, therefore, is receive JSON-like input data with the corresponding schema. For instance, you could receive this as input:
[{"name": "sku", "type": "INT64", "mode": "NULLABLE"},
{"name": "name", "type": "STRING", "mode": "NULLABLE"},
{"name": "type", "type": "STRING", "mode": "NULLABLE"},
{"fields":
[{"name": "id", "type": "STRING", "mode": "NULLABLE"}, {"name": "name", "type": "STRING", "mode": "NULLABLE"}],
"name": "category", "type": "RECORD", "mode": "REPEATED"},
{"name": "description", "type": "STRING", "mode": "NULLABLE"},
{"name": "manufacturer", "type": "STRING", "mode": "NULLABLE"}]
Notice that this JSON fully defines a schema in a straightforward manner. The field "category" is of type "RECORD" and it contains the schema definitions for each of its child fields, that is, "name" and "id".
In Python, all you'd have to do is process this JSON input. This function might do the trick:
from google.cloud.bigquery import SchemaField

def get_schema_from_json(input_json, res=None):
    # use None rather than a mutable default argument so repeated calls start clean
    if res is None:
        res = []
    if not input_json:
        return res
    cur = input_json.pop()
    # recurse into RECORD fields; SchemaField expects a sequence for fields, so default to ()
    sub_fields = get_schema_from_json(cur['fields'], []) if 'fields' in cur else ()
    res.append(SchemaField(name=cur['name'], field_type=cur['type'], mode=cur['mode'], fields=sub_fields))
    return get_schema_from_json(input_json, res)
And then you just can extract the schema like so:
table.schema = get_schema_from_json(user_json_input_data)
Question: What my code returns is a string - bq_schema
What you are facing is the __str__ representation of the bigquery object.
Change the following:
from google.cloud.bigquery import SchemaField

bq_schema = []
for i in range(0, len(schema), 2):
    bq_schema.append(SchemaField(schema[i], schema[i+1]))
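Putting it together, a small end-to-end sketch under the same assumptions as the question (the "fieldName:TYPE,anotherField:TYPE" input format, and a table object that already exists in the program):
from google.cloud import bigquery

s_raw = raw_input('\n\nEnter the schema for the table in the following format- fieldName:TYPE, anotherField:TYPE\n')
bq_schema = []
for pair in s_raw.split(','):
    name, field_type = pair.split(':')
    bq_schema.append(bigquery.SchemaField(name.strip(), field_type.strip()))

table.schema = bq_schema  # a list of SchemaField objects, not a string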
