Keep jsonschema from always making requests to URI - python

Background
I am trying to validate a JSON file using jsonschema. However, the library is trying to make a GET request and I want to avoid that.
from jsonschema import validate

point_schema = {
    "$id": "https://example.com/schemas/point",
    "type": "object",
    "properties": {"x": {"type": "number"}, "y": {"type": "number"}},
    "required": ["x", "y"],
}

polygon_schema = {
    "$id": "https://example.com/schemas/polygon",
    "type": "array",
    "items": {"$ref": "https://example.com/schemas/point"},
}

a_polygon = [{'x': 1, 'y': 2}, {'x': 3, 'y': 4}, {'x': 1, 'y': 2}]

validate(instance=a_polygon, schema=polygon_schema)
Error
I am trying to connect both schemas using a $ref key from the spec:
https://json-schema.org/understanding-json-schema/structuring.html?highlight=ref#ref
Unfortunately for me, this means the library will make a GET request to the URI specified and try to decode it:
Traceback (most recent call last):
File "/home/user/anaconda3/envs/myapp-py/lib/python3.7/site-packages/jsonschema/validators.py", line 777, in resolve_from_url
document = self.resolve_remote(url)
File "/home/user/anaconda3/envs/myapp-py/lib/python3.7/site-packages/jsonschema/validators.py", line 860, in resolve_remote
result = requests.get(uri).json()
File "/home/user/anaconda3/envs/myapp-py/lib/python3.7/site-packages/requests/models.py", line 910, in json
return complexjson.loads(self.text, **kwargs)
File "/home/user/anaconda3/envs/myapp-py/lib/python3.7/json/__init__.py", line 348, in loads
return _default_decoder.decode(s)
File "/home/user/anaconda3/envs/myapp-py/lib/python3.7/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/home/user/anaconda3/envs/myapp-py/lib/python3.7/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/user/anaconda3/envs/myapp-py/lib/python3.7/site-packages/jsonschema/validators.py", line 932, in validate
error = exceptions.best_match(validator.iter_errors(instance))
File "/home/user/anaconda3/envs/myapp-py/lib/python3.7/site-packages/jsonschema/exceptions.py", line 367, in best_match
best = next(errors, None)
File "/home/user/anaconda3/envs/myapp-py/lib/python3.7/site-packages/jsonschema/validators.py", line 328, in iter_errors
for error in errors:
File "/home/user/anaconda3/envs/myapp-py/lib/python3.7/site-packages/jsonschema/_validators.py", line 81, in items
for error in validator.descend(item, items, path=index):
File "/home/user/anaconda3/envs/myapp-py/lib/python3.7/site-packages/jsonschema/validators.py", line 344, in descend
for error in self.iter_errors(instance, schema):
File "/home/user/anaconda3/envs/myapp-py/lib/python3.7/site-packages/jsonschema/validators.py", line 328, in iter_errors
for error in errors:
File "/home/user/anaconda3/envs/myapp-py/lib/python3.7/site-packages/jsonschema/_validators.py", line 259, in ref
scope, resolved = validator.resolver.resolve(ref)
File "/home/user/anaconda3/envs/myapp-py/lib/python3.7/site-packages/jsonschema/validators.py", line 766, in resolve
return url, self._remote_cache(url)
File "/home/user/anaconda3/envs/myapp-py/lib/python3.7/site-packages/jsonschema/validators.py", line 779, in resolve_from_url
raise exceptions.RefResolutionError(exc)
jsonschema.exceptions.RefResolutionError: Expecting value: line 1 column 1 (char 0)
I don't want this; I just want the polygon schema to reference the point schema right above it (for this purpose, a polygon is a list of points).
In fact, these schemas are in the same file.
Questions
I could always do the following:
point_schema = {
    "$id": "https://example.com/schemas/point",
    "type": "object",
    "properties": {"x": {"type": "number"}, "y": {"type": "number"}},
    "required": ["x", "y"],
}

polygon_schema = {
    "$id": "https://example.com/schemas/polygon",
    "type": "array",
    "items": point_schema,
}
And this would technically work.
However, I would simply be building a bigger dictionary, and I would not be using the spec as it was designed to be used.
How can I use the spec to solve my problem?

You have to provide your other schemas to the implementation.
With this implementation, you must provide a RefResolver to the validate function.
You'll need to either provide a single base_uri and referrer (the schema), or a store containing a dictionary that maps URIs to schemas.
Additionally, you may handle protocols with a function.
Your RefResolver would look like the following...
refResolver = jsonschema.RefResolver(referrer=point_schema, base_uri='https://example.com/schemas/point')
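For reference, here is a minimal sketch of that approach, assuming a jsonschema version that still ships RefResolver (it is deprecated in newer releases). The store maps each schema's $id to the schema itself so the $ref is resolved locally and no GET request is made:

import jsonschema
from jsonschema import validate

# Map each schema's $id to the schema so $ref lookups stay in memory.
schema_store = {
    point_schema["$id"]: point_schema,
    polygon_schema["$id"]: polygon_schema,
}

resolver = jsonschema.RefResolver.from_schema(polygon_schema, store=schema_store)

# validate() forwards extra keyword arguments to the validator class,
# so the resolver is used instead of fetching the URI over HTTP.
validate(instance=a_polygon, schema=polygon_schema, resolver=resolver)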

Related

Why am I getting a JSONDecodeError when trying to load a JSON file in Python?

I'm trying to load a JSON file into Python, but it's giving me a JSONDecodeError which seems to suggest the file is empty... which I know is not true.
I've reviewed similar questions I can find on here (65466227 & 62286816), and while one or two provide useful alternatives... that's not really the point. I feel like what I'm doing should work, but it doesn't.
I put my JSON into jsonlint.com and it confirmed that my JSON is indeed valid.
Error message/traceback:
Traceback (most recent call last):
File "utils/fmt_test.py", line 62, in <module>
test_cases_json = json.load(jf)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/json/__init__.py", line 296, in load
parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/json/__init__.py", line 348, in loads
return _default_decoder.decode(s)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
My Python code:
test_case_file, _ = os.path.splitext(fmt_file)
test_case_file = f"{test_case_file}.json"
if os.path.isfile(test_case_file):
    with open(test_case_file, "r") as jf:
        test_cases_json = json.load(jf)
My JSON:
{
    "tests": [
        {
            "id": "746fd83s23dy",
            "event": "%t LFA_DUMMYSTRING LFA_DUMMYSTRING TEST LFA_DUMMYSTRING1 LFA_DUMMYSTRING2 TEST2 LFA_DUMMYSTRING",
            "match": "True"
        },
        {
            "id": "990gb98s34dm",
            "event": "%t LFA_DUMMYSTRING LFA_DUMMYSTRING1 LFA_DUMMYSTRING2",
            "match": "True"
        },
        {
            "id": "100ma09x31ui",
            "event": "%t localhost LFA_DUMMYSTRING1 LFA_DUMMYSTRING2 TEST3 LFA_DUMMYSTRING1 LFA_DUMMYSTRING2",
            "match": "True"
        }
    ]
}
Any help is much appreciated, thanks.
There is likely a UTF-8 BOM encoded at the beginning of the file since it is complaining about the first byte. Open with encoding='utf-8-sig' and it will be removed if present:
>>> import json
>>> data = {}
>>> with open('test.json','w',encoding='utf-8-sig') as f: # write data with BOM
... json.dump(data,f)
...
>>> with open('test.json') as f: # read it back
... data2 = json.load(f)
...
Traceback (most recent call last):
<traceback removed>
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0) # same error
>>> with open('test.json',encoding='utf-8-sig') as f:
... data2 = json.load(f)
...
>>> # worked
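As a quick check before changing any code (a minimal sketch; the file name is an assumption), you can confirm whether the file really starts with a UTF-8 BOM by reading its first three raw bytes:

with open('test.json', 'rb') as f:           # binary mode: no decoding
    has_bom = f.read(3) == b'\xef\xbb\xbf'   # the UTF-8 BOM
print(has_bom)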

json.loads() - json.decoder.JSONDecodeError: Expecting value

I need to read a JSON file located in the same folder as the Python file.
The code is this:
import json
import os
with open(os.path.join(os.path.dirname(__file__), 'datasets.json'), 'r') as f:
    dataset = json.loads(f.read())
This is the error:
Traceback (most recent call last):
File "Desktop/proj/ai/index.py", line 6, in <module>
dataset = json.loads(f.read())
File "/opt/anaconda3/lib/python3.7/json/__init__.py", line 348, in loads
return _default_decoder.decode(s)
File "/opt/anaconda3/lib/python3.7/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/opt/anaconda3/lib/python3.7/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 14 column 1 (char 284)
This is the JSON:
[
    {
        "name": "linear1",
        "values": [[1,3],[2,5],[3,7]]
    },
    {
        "name": "linear2",
        "values": [[1.1,2],[2.1,4.3],[2.9,6.4],[4.1,7.9],[5.2,9.7],[6.4,12],[6.5,13.3],[8,15.9],[8.9,18.1],[9.7,20.4]]
    },
    {
        "name": "parabolic1",
        "values": [[1,1],[2,4],[3,9]]
    },
]
Your JSON is incorrect. See where I have removed a comma in the JSON below:
[
    {
        "name": "linear1",
        "values": [[1,3],[2,5],[3,7]]
    },
    {
        "name": "linear2",
        "values": [[1.1,2],[2.1,4.3],[2.9,6.4],[4.1,7.9],[5.2,9.7],[6.4,12],[6.5,13.3],[8,15.9],[8.9,18.1],[9.7,20.4]]
    },
    {
        "name": "parabolic1",
        "values": [[1,1],[2,4],[3,9]]
    } < ---- Removed this comma
]
What are the contents of your JSON file? There is probably some incorrect formatting in there. There are various JSON validators online, like https://jsonlint.com/, which can help check for issues like these.
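If you want to locate this kind of problem programmatically, here is a minimal sketch (the file name comes from the question; the rest is illustrative) that catches JSONDecodeError and prints the text around the reported position:

import json
import os

path = os.path.join(os.path.dirname(__file__), 'datasets.json')
with open(path, 'r') as f:
    text = f.read()

try:
    dataset = json.loads(text)
except json.JSONDecodeError as e:
    # e.pos is the character offset the decoder stopped at (char 284 above).
    snippet = text[max(e.pos - 30, 0):e.pos + 30]
    print(f"Parse failed at line {e.lineno}, column {e.colno}: ...{snippet}...")
    raise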

Why MongoEngine/pymongo giving error when trying to access object first time only

I have defined MongoEngine classes which are mapped to MongoDB. When I try to access the data using MongoEngine, a specific piece of code fails on the first attempt but successfully returns data on the second attempt with the same code. I am executing the code in a Python terminal:
from Project.Mongo import User
user = User.objects(username = 'xyz#xyz.com').first()
from Project.Mongo import Asset
Asset.objects(org = user.org)
The last line of the code generates the following error on the first attempt.
Traceback (most recent call last):
File "", line 1, in
File "/usr/local/lib/python3.5/dist-packages/mongoengine/queryset/manager.py", line 37, in get
queryset = queryset_class(owner, owner._get_collection())
File "/usr/local/lib/python3.5/dist-packages/mongoengine/document.py", line 209, in _get_collection
cls.ensure_indexes()
File "/usr/local/lib/python3.5/dist-packages/mongoengine/document.py", line 765, in ensure_indexes
collection.create_index(fields, background=background, **opts)
File "/usr/local/lib/python3.5/dist-packages/pymongo/collection.py", line 1754, in create_index
self.__create_index(keys, kwargs, session, **cmd_options)
File "/usr/local/lib/python3.5/dist-packages/pymongo/collection.py", line 1656, in __create_index
session=session)
File "/usr/local/lib/python3.5/dist-packages/pymongo/collection.py", line 245, in _command
retryable_write=retryable_write)
File "/usr/local/lib/python3.5/dist-packages/pymongo/pool.py", line 517, in command
collation=collation)
File "/usr/local/lib/python3.5/dist-packages/pymongo/network.py", line 125, in command
parse_write_concern_error=parse_write_concern_error)
File "/usr/local/lib/python3.5/dist-packages/pymongo/helpers.py", line 145, in _check_command_response
raise OperationFailure(msg % errmsg, code, response)
pymongo.errors.OperationFailure: Index: { v: 2, key: { org: 1, _fts: "text", _ftsx: 1 }, name: "org_1_name_content_text_description_text_content_text_tag_content_text_remote.source_text", ns: "digitile.asset", weights: { content: 3, description: 1, name_content: 10, remote.owner__name: 20, remote.source: 2, tag_content: 2 }, default_language: "english", background: false, language_override: "language", textIndexVersion: 3 } already exists with different options: { v: 2, key: { org: 1, _fts: "text", _ftsx: 1 }, name: "org_1_name_text_description_text_content_text_tag_content_text_remote.source_text", ns: "digitile.asset", default_language: "english", background: false, weights: { content: 3, description: 1, name: 10, remote.owner__name: 20, remote.source: 2, tag_content: 2 }, language_override: "language", textIndexVersion: 3 }
When I try the same last line a second time, it produces the correct result.
I am using Python 3.5.2, pymongo 3.7.2, and mongoengine 0.10.6.
The first time you call .objects on a document class, mongoengine tries to create the indexes if they don't exist.
In this case it fails during the creation of an index on the asset collection (the index definitions are taken from your Asset/User Document classes), as you can see in the error message:
pymongo.errors.OperationFailure: Index: {...new index details...} already exists with different options {...existing index details...}.
The second time you make that call, mongoengine assumes the indexes were already created and doesn't attempt to create them again, which explains why the second call passes.
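One way to resolve the conflict (a sketch based only on the names in the error message, not something from the original answer) is to drop the stale index so mongoengine can recreate it with the current definition, or to disable automatic index creation on the Document class:

from pymongo import MongoClient

# Database and collection names come from the "ns" field in the error ("digitile.asset").
client = MongoClient()
assets = client['digitile']['asset']

# Drop the existing index that conflicts with the new definition; mongoengine
# will recreate it on the next .objects call.
assets.drop_index('org_1_name_text_description_text_content_text_tag_content_text_remote.source_text')

# Alternatively, set meta = {'auto_create_index': False} on the Asset Document
# class and manage the index manually.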

How to fix ValueError: Expecting property name: line 4 column 1 (char 43)

When I try to run python manage.py runserver, the code gives an error, and its traceback is strange.
I tried
JSON ValueError: Expecting property name: line 1 column 2 (char 1)
and all similar questions, but didn't find what exactly I am facing.
Traceback (most recent call last):
File "manage.py", line 22, in <module>
execute_from_command_line(sys.argv)
File "/home/tousif/.local/lib/python2.7/site-packages/django/core/management/__init__.py", line 364, in execute_from_command_line
utility.execute()
File "/home/tousif/.local/lib/python2.7/site-packages/django/core/management/__init__.py", line 308, in execute
settings.INSTALLED_APPS
File "/home/tousif/.local/lib/python2.7/site-packages/django/conf/__init__.py", line 56, in __getattr__
self._setup(name)
File "/home/tousif/.local/lib/python2.7/site-packages/django/conf/__init__.py", line 41, in _setup
self._wrapped = Settings(settings_module)
File "/home/tousif/.local/lib/python2.7/site-packages/django/conf/__init__.py", line 110, in __init__
mod = importlib.import_module(self.SETTINGS_MODULE)
File "/usr/lib/python2.7/importlib/__init__.py", line 37, in import_module
__import__(name)
File "/home/tousif/Desktop/ITP/ITP/itpcrm/itpcrm/settings.py", line 55, in <module>
cfg = json.loads(open('/home/tousif/Desktop/ITP/ITP/itpcrm/config.json', 'r').read())
File "/usr/lib/python2.7/json/__init__.py", line 339, in loads
return _default_decoder.decode(s)
File "/usr/lib/python2.7/json/decoder.py", line 364, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib/python2.7/json/decoder.py", line 380, in raw_decode
obj, end = self.scan_once(s, idx)
ValueError: Expecting property name: line 4 column 1 (char 43)
My config.json file contains credentials etc. (I have changed the credentials to post here). I got this file from the live server, where it works fine, but locally it gives this error.
{
"dev": {
"db": {
​
"ENGINE": "django.db.backends.mysql",
"NAME": "itpcrm",
"USER": "root",
"PASSWORD": "password",
"HOST": "localhost",
"PORT": "3306"
},
"jwt_key": "GRESDFwef3452fwefer",
"voice_api_url": "http://192.112.255.32:9040",
"voice_api_key": "3123",
"auth_api_key": "379h4f73f",
"provisioner_api_key": "abc",
"quote_approval_url": "http://192.112.255.145:9998/quotes/customer-approval?token=",
"docusign_base_url": "https://demo.docusign.net/restapi",
"docusign_integrator_key": "8a256bde-405b",
"docusign_oauth_base_url": "account-d.docusign.com",
"docusign_redirect_uri": "http://192.112.255.145:9998/api/callbacks/docusign",
"docusign_private_key_filename": "/home/itp/docusign-examples/keys/docusign_private_key.txt",
"docusign_user_id": "7f2444f-ae99-54922fec68f6",
"docusign_user_name": "dor.com"
},
"prod": {
"db": {
​
"ENGINE": "django.db.backends.mysql",
"NAME": "itp",
"USER": "it",
"PASSWORD": "password",
"HOST": "192.168.3.111",
"PORT": "3306"
},
"jwt_key": "rRregrgERg54g564heRGRfdsger",
"voice_api_url": "https://api.crm.itpscorp.com/itpvoice",
"voice_api_key": "abc1",
"auth_api_key": "379h4f73f3279fy927yf928oowqofabdbf",
"provisioner_api_key": "abc123123",
"quote_approval_url": "http://192.112.255.145:9998/quotes/customer-approval?token=",
"docusign_base_url": "https://demo.docusign.net/restapi",
"docusign_integrator_key": "8a256bde-405b-4032-bf24-be0245631f03",
"docusign_oauth_base_url": "account-d.docusign.com",
"docusign_redirect_uri": "http://192.112.255.145:9998/api/callbacks/docusign",
"docusign_private_key_filename": "/home/itp/docusign-examples/keys/docusign_private_key.txt",
"docusign_user_id": "7f26f6bb-8a39-444f-ae99-54922fec68f6",
"docusign_user_name": "docusign#itpfiber.com"
},
"mode": "dev"
}
The empty lines after "db" start with the Unicode codepoint 0x200B ('ZERO WIDTH SPACE'). That is what trips up the JSON decoder.
I copied the text into gvim and took a screenshot, in which the stray characters are clearly visible.
Remove those characters (or the whole line) and it works...
(Looking at the JSON file with a hex editor would also show the problem clearly.)
If you look closely at the error message, you can see that it correctly identifies the problem:
ValueError: Expecting property name: line 4 column 1 (char 43)
The moral of this story: look out for whitespace codepoints.
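If the file cannot be cleaned at the source, a minimal workaround sketch (the path is taken from the traceback; whether silently stripping characters is acceptable for your config is your call) is to remove the zero-width spaces before decoding:

import json

with open('/home/tousif/Desktop/ITP/ITP/itpcrm/config.json', 'r') as f:
    raw = f.read()

# U+200B (ZERO WIDTH SPACE) is not valid JSON whitespace, so strip it out.
cfg = json.loads(raw.replace('\u200b', ''))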

OverflowError: MongoDB can only handle up to 8-byte ints?

I have spent the last 12 hours scouring the web. I am completely lost, please help.
I am trying to pull data from an API endpoint and put it into MongoDB. The data looks like this:
{"_links": {
"self": {
"href": "https://us.api.battle.net/data/sc2/ladder/271302?namespace=prod"
}
},
"league": {
"league_key": {
"league_id": 5,
"season_id": 37,
"queue_id": 201,
"team_type": 0
},
"key": {
"href": "https://us.api.battle.net/data/sc2/league/37/201/0/5?namespace=prod"
}
},
"team": [
{
"id": 6956151645604413000,
"rating": 5321,
"wins": 131,
"losses": 64,
"ties": 0,
"points": 1601,
"longest_win_streak": 15,
"current_win_streak": 4,
"current_rank": 1,
"highest_rank": 10,
"previous_rank": 1,
"join_time_stamp": 1534903699,
"last_played_time_stamp": 1537822019,
"member": [
{
"legacy_link": {
"id": 9964871,
"realm": 1,
"name": "mTOR#378",
"path": "/profile/9964871/1/mTOR"
},
"played_race_count": [
{
"race": "Zerg",
"count": 195
}
],
"character_link": {
"id": 9964871,
"battle_tag": "Hellghost#11903",
"key": {
"href": "https://us.api.battle.net/data/sc2/character/Hellghost-11903/9964871?namespace=prod"
}
}
}
]
},
{
"id": 11611747760398664000, .....
....
Here's the code:
for ladder_number in ladder_array:
    ladder_call_url = ladder_call + slash + str(ladder_number) + eng_locale + access_token
    url = str(ladder_call_url)
    response = requests.get(url)
    print('trying ladder number ' + str(ladder_number))
    print('calling :' + url)
    if response.status_code == 200:
        print('status: ' + str(response))
        mmr_db.ladders.insert_one(response.json())
I get an error:
OverflowError: MongoDB can only handle up to 8-byte ints?
Is this because the data I am trying to load is too large? Are the "ID" integers too large?
Oh man, any help would be sincerely appreciated.
_______ EDIT ____________
Edited to include the Traceback:
Traceback (most recent call last):
File "C:\scripts\mmr_from_ladders.py", line 96, in <module>
mmr_db.ladders.insert_one(response.json(), bypass_document_validation=True)
File "C:\Users\me\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pymongo\collection.py", line 693, in insert_one
session=session),
File "C:\Users\me\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pymongo\collection.py", line 607, in _insert
bypass_doc_val, session)
File "C:\Users\me\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pymongo\collection.py", line 595, in _insert_one
acknowledged, _insert_command, session)
File "C:\Users\me\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pymongo\mongo_client.py", line 1243, in _retryable_write
return self._retry_with_session(retryable, func, s, None)
File "C:\Users\me\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pymongo\mongo_client.py", line 1196, in _retry_with_session
return func(session, sock_info, retryable)
File "C:\Users\me\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pymongo\collection.py", line 590, in _insert_command
retryable_write=retryable_write)
File "C:\Users\me\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pymongo\pool.py", line 584, in command
self._raise_connection_failure(error)
File "C:\Users\me\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pymongo\pool.py", line 745, in _raise_connection_failure
raise error
File "C:\Users\me\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pymongo\pool.py", line 579, in command
unacknowledged=unacknowledged)
File "C:\Users\me\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pymongo\network.py", line 114, in command
codec_options, ctx=compression_ctx)
File "C:\Users\me\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pymongo\message.py", line 679, in _op_msg
flags, command, identifier, docs, check_keys, opts)
OverflowError: MongoDB can only handle up to 8-byte ints
The BSON spec (MongoDB's native binary serialization format) only supports 32-bit (signed) and 64-bit (signed) integers; 8 bytes is 64 bits.
The maximum integer value that can be stored in a 64-bit signed int is:
9,223,372,036,854,775,807
In your example you appear to have larger ids, for example:
11,611,747,760,398,664,000
I'm guessing that the app generating this data is using uint64 types (an unsigned 64-bit integer can hold values up to 18,446,744,073,709,551,615, roughly twice the signed maximum).
I would start by looking at either of these potential solutions, if possible:
Changing the other side to use int64 (signed) types for the IDs.
Replacing the incoming IDs with ObjectId(), which gives you a 12-byte, GUID-like value for your unique IDs. A third, purely client-side workaround is sketched below.
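This sketch is an assumption about the payload shape, not part of the original answer: convert any integer outside the signed 64-bit range to a string before inserting, so pymongo no longer raises OverflowError.

INT64_MAX = 2**63 - 1
INT64_MIN = -2**63

def sanitize(value):
    # Recursively convert out-of-range ints to strings so they fit into BSON.
    if isinstance(value, bool):
        return value
    if isinstance(value, int) and not INT64_MIN <= value <= INT64_MAX:
        return str(value)
    if isinstance(value, dict):
        return {k: sanitize(v) for k, v in value.items()}
    if isinstance(value, list):
        return [sanitize(v) for v in value]
    return value

if response.status_code == 200:
    mmr_db.ladders.insert_one(sanitize(response.json()))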
