In Python, is there a way to check if a string is valid JSON before trying to parse it?
For example working with things like the Facebook Graph API, sometimes it returns JSON, sometimes it could return an image file.
You can try to do json.loads(), which will throw a ValueError if the string you pass can't be decoded as JSON.
In general, the "Pythonic" philosophy for this kind of situation is called EAFP, for Easier to Ask for Forgiveness than Permission.
Example Python script returns a boolean if a string is valid json:
import json
def is_json(myjson):
try:
json.loads(myjson)
except ValueError as e:
return False
return True
Which prints:
print is_json("{}") #prints True
print is_json("{asdf}") #prints False
print is_json('{ "age":100}') #prints True
print is_json("{'age':100 }") #prints False
print is_json("{\"age\":100 }") #prints True
print is_json('{"age":100 }') #prints True
print is_json('{"foo":[5,6.8],"foo":"bar"}') #prints True
Convert a JSON string to a Python dictionary:
import json
mydict = json.loads('{"foo":"bar"}')
print(mydict['foo']) #prints bar
mylist = json.loads("[5,6,7]")
print(mylist)
[5, 6, 7]
Convert a python object to JSON string:
foo = {}
foo['gummy'] = 'bear'
print(json.dumps(foo)) #prints {"gummy": "bear"}
If you want access to low-level parsing, don't roll your own, use an existing library: http://www.json.org/
Great tutorial on python JSON module: https://pymotw.com/2/json/
Is String JSON and show syntax errors and error messages:
sudo cpan JSON::XS
echo '{"foo":[5,6.8],"foo":"bar" bar}' > myjson.json
json_xs -t none < myjson.json
Prints:
, or } expected while parsing object/hash, at character offset 28 (before "bar}
at /usr/local/bin/json_xs line 183, <STDIN> line 1.
json_xs is capable of syntax checking, parsing, prittifying, encoding, decoding and more:
https://metacpan.org/pod/json_xs
I would say parsing it is the only way you can really entirely tell. Exception will be raised by python's json.loads() function (almost certainly) if not the correct format. However, the the purposes of your example you can probably just check the first couple of non-whitespace characters...
I'm not familiar with the JSON that facebook sends back, but most JSON strings from web apps will start with a open square [ or curly { bracket. No images formats I know of start with those characters.
Conversely if you know what image formats might show up, you can check the start of the string for their signatures to identify images, and assume you have JSON if it's not an image.
Another simple hack to identify a graphic, rather than a text string, in the case you're looking for a graphic, is just to test for non-ASCII characters in the first couple of dozen characters of the string (assuming the JSON is ASCII).
I came up with an generic, interesting solution to this problem:
class SafeInvocator(object):
def __init__(self, module):
self._module = module
def _safe(self, func):
def inner(*args, **kwargs):
try:
return func(*args, **kwargs)
except:
return None
return inner
def __getattr__(self, item):
obj = getattr(self.module, item)
return self._safe(obj) if hasattr(obj, '__call__') else obj
and you can use it like so:
safe_json = SafeInvocator(json)
text = "{'foo':'bar'}"
item = safe_json.loads(text)
if item:
# do something
An effective, and reliable way to check for valid JSON. If the 'get' accessor does't throw an AttributeError then the JSON is valid.
import json
valid_json = {'type': 'doc', 'version': 1, 'content': [{'type': 'paragraph', 'content': [{'text': 'Request for widget', 'type': 'text'}]}]}
invalid_json = 'opo'
def check_json(p, attr):
doc = json.loads(json.dumps(p))
try:
doc.get(attr) # we don't care if the value exists. Only that 'get()' is accessible
return True
except AttributeError:
return False
To use, we call the function and look for a key.
# Valid JSON
print(check_json(valid_json, 'type'))
Returns 'True'
# Invalid JSON / Key not found
print(check_json(invalid_json, 'type'))
Returns 'False'
Much simple in try block. You can then validate if the body is a valid JSON
async def get_body(request: Request):
try:
body = await request.json()
except:
body = await request.body()
return body
Related
In views.py VENDOR_MAPPER is list of dictionary each dictionary has id, name, placeholder and autocommit key. I also tried sending json instead of Response object.
resp_object = {}
resp_object['supported_vendors'] = VENDOR_MAPPER
resp_object['vendor_name'] = ""
resp_object['create_vo_entry'] = False
resp_object['generate_signature_flag'] = False
resp_object['branch_flag'] = False
resp_object['trunk_flag'] = False
resp_object['branch_name'] = ""
resp_object['advisory'] = ""
data = {'data': resp_object}
return Response(data)
On home.html I am accessing the vendors_supported which is list and iterate through it, however instead of object i am getting string as type of variable.
var supported_vendors = "{{data.supported_vendors|safe}}";
console.log(supported_vendors);
console.log("Supported_vendors ", supported_vendors);
console.log("Supported_vendors_type:", typeof(supported_vendors));
data.supported_vendors|safe (django template tagging) is used to remove the unwanted characters in the response i have also tried without safe, but still the type was string
also tried converted as well as parse the response but type is shown as string
var supported_vendors = "{{data.supported_vendors}}";
console.log(JSON.parse(supported_vendors));
console.log(JSON.stringify(supported_vendors));
Output generated, i have printed the response type and values i get, also converting using JSON.parse and JSON.stringify did not work and output every time was string
[1]: https://i.stack.imgur.com/DuSMb.png
I want to convert the property into javascript object and perform some computations
You can try this instead ,
return HttpResponse(json.dumps(data),
content_type="application/json")
I got the answer:
var supported_vendors = "{{data.supported_vendors}}";
Converted the above line to
var supported_vendors = {{data.supported_vendors}};
removed quotes from the variable
I am using voluptuous a lot to validate yaml description files. Often the errors are cumbersome to decipher, especially for regular users.
I am looking for a way to make the error a bit more readable. One way is to identify which line in the YAML file is incrimined.
from voluptuous import Schema
import yaml
from io import StringIO
Validate = Schema({
'name': str,
'age': int,
})
data = """
name: John
age: oops
"""
data = Validate(yaml.load(StringIO(data)))
In the above example, I get this error:
MultipleInvalid: expected int for dictionary value # data['age']
I would rather prefer an error like:
Error: validation failed on line 2, data.age should be an integer.
Is there an elegant way to achieve this?
The problem is that on the API boundary of yaml.load, all representational information of the source has been lost. Validate gets a Python dict and does not know where it originated from, and moreover the dict does not contain this information.
You can, however, implement this yourself. voluptuous' Invalid error carries a path which is a list of keys to follow. Having this path, you can parse the YAML again into nodes (which carry representation information) and discover the position of the item:
import yaml
def line_from(path, yaml_input):
node = yaml.compose(yaml_input)
for item in path:
for entry in node.value:
if entry[0].value == item:
node = entry[1]
break
else: raise ValueError("unknown path element: " + item)
return node.start_mark.line
# demostrating this on more complex input than yours
data = """
spam:
egg:
sausage:
spam
"""
print(line_from(["spam", "egg", "sausage"], data))
# gives 4
Having this, you can then do
try:
data = Validate(yaml.load(StringIO(data)))
except Invalid as e:
line = line_from(e.path, data)
path = "data." + ".".join(e.path)
print(f"Error: validation failed on line {line} ({path}): {e.error_message}")
I'll go this far for this answer as it shows you how to discover the origin line of an error. You will probably need to extend this to:
handle YAML sequences (my code assumes that every intermediate node is a MappingNode, a SequenceNode will have single nodes in its value list instead of a key-value tuple)
handle MultipleInvalid to issue a message for each inner error
rewrite expected int to should be an integer if you really want to (no idea how you'd do that)
abort after printing the error
With the help of flyx I found ruamel.yaml which provide the line and col of a parsed YAML file. So one can manage to get the wanted error with:
from voluptuous import Schema
from ruamel.yaml import load, RoundTripLoader
from io import StringIO
Validate = Schema({
'name': {
'firstname': str,
'lastname': str
},
'age': int,
})
data = """
name:
firstname: John
lastname: 12.0
age: 42
"""
class Validate:
def __init__(self, stream):
self._yaml = load(stream, Loader=RoundTripLoader)
return self.validate()
def validate(self):
try:
self.data = Criteria(self._yaml)
except Invalid as e:
node = self._yaml
for key in e.path:
if (hasattr(node[key], '_yaml_line_col')):
node = node[key]
else:
break
path = '/'.join(e.path)
print(f"Error: validation failed on line {node._yaml_line_col.line}:{node._yaml_line_col.col} (/{path}): {e.error_message}")
else:
return self.data
data = Validate(StringIO(data))
With this I get this error message:
Error: validation failed on line 2:4 (/name): extra keys not allowed
I'm fairly new to Python so bear with me please.
I have a function that takes two parameters, an api response and an output object, i need to assign some values from the api response to the output object:
def map_data(output, response):
try:
output['car']['name'] = response['name']
output['car']['color'] = response['color']
output['car']['date'] = response['date']
#other mapping
.
.
.
.
#other mapping
except KeyError as e:
logging.error("Key Missing in api Response: %s", str(e))
pass
return output
Now sometimes, the api response is missing some keys i'm using to generate my output object, so i used the KeyError exception to handle this case.
Now my question is, in a case where the 'color' key is missing from the api response, how can i catch the exception and continue to the line after it output['car']['date'] = response['date'] and the rest of the instructions.
i tried the pass instruction but it didn't have any affect.
Ps: i know i can check the existence of the key using:
if response.get('color') is not None:
output['car']['color'] = response['color']
and then assign the values but seeing that i have about 30 values i need to map, is there any other way i can implement ? Thank you
A few immediate ideas
(FYI - I'm not going to explain everything in detail - you can check out the python docs for more info, examples etc - that will help you learn more, rather than trying to explain everything here)
Google 'python handling dict missing keys' for a million methods/ideas/approaches - it's a common use case!
Convert your response dict to a defaultdict. In that case you can have a default value returned (eg None, '', 'N/A' ...whatever you like) if there is no actual value returned.
In this case you could do away with the try and every line would be executed.
from collections import defaultdict
resp=defaultdict(lambda: 'NA', response)
output['car']['date'] = response['date'] # will have value 'NA' if 'date' isnt in response
Use the in syntax, perhaps in combination with a ternary else
output['car']['color'] = response['color'] if 'color' in response
output['car']['date'] = response['date'] if 'date' in response else 'NA'
Again you can do away with the try block and every line will execute.
Use the dictionary get function, which allows you to specify a default if there is no value for that key:
output['car']['color'] = response.get('car', 'no car specified')
You can create a utility function that gets the value from the response and if the value is not found, it returns an empty string. See example below:
def get_value_from_response_or_null(response, key):
try:
value = response[key]
return value
except KeyError as e:
logging.error("Key Missing in api Response: %s", str(e))
return ""
def map_data(output, response):
output['car']['name'] = get_value_from_response_or_null(response, 'name')
output['car']['color'] = get_value_from_response_or_null(response, 'color')
output['car']['date'] = get_value_from_response_or_null(response, 'date')
# other mapping
# other mapping
return output
I tried to generate uid for a user confirmation email.
'uid':urlsafe_base64_encode(force_bytes(user.pk)),
so, it's works nice, it returns something like "Tm9uZQ"
Then, when I tried to decode it, using force_text(urlsafe_base64_decode(uidb64))
it return None.
The next string
urlsafe_base64_decode(uidb64)
also, return b'None'
I tried to google it, and see different implementations, but copy-paste code not works.
I write something like
b64_string = uidb64
b64_string += "=" * ((4 - len(b64_string) % 4) % 4)
print(b64_string)
print(force_text(base64.urlsafe_b64decode(b64_string)))
and the result still None:
Tm9uZQ==
None
I don't understand how the default decode doesn't work.
"Tm9uZQ==" is the base64 encoding of the string "None",
>>> from base64 import b64encode, b64decode
>>> s = b'None'
>>>
>>> b64encode(s)
b'Tm9uZQ=='
>>> b64decode(b64encode(s))
b'None'
>>>
It could be possible that some of your data is missing. E.g. user.pk is not set. I think that force_bytes is turning a None user.pk into the bytestring b'None', from the Django source,
def force_bytes(s, encoding='utf-8', strings_only=False, errors='strict'):
"""
Similar to smart_bytes, except that lazy instances are resolved to
strings, rather than kept as lazy objects.
If strings_only is True, don't convert (some) non-string-like objects.
"""
# Handle the common case first for performance reasons.
if isinstance(s, bytes):
if encoding == 'utf-8':
return s
else:
return s.decode('utf-8', errors).encode(encoding, errors)
if strings_only and is_protected_type(s):
return s
if isinstance(s, memoryview):
return bytes(s)
return str(s).encode(encoding, errors)
You might be able to prevent None being turned into b'None' by setting strings_only=True when calling force_bytes.
The following code is giving me:
Runtime.MarshalError: Unable to marshal response: {'Yes'} is not JSON serializable
from calendar import monthrange
def time_remaining_less_than_fourteen(year, month, day):
a_year = int(input['year'])
b_month = int(input['month'])
c_day = int(input['day'])
days_in_month = monthrange(int(a_year), int(b_month))[1]
time_remaining = ""
if (days_in_month - c_day) < 14:
time_remaining = "No"
return time_remaining
else:
time_remaining = "Yes"
return time_remaining
output = {time_remaining_less_than_fourteen((input['year']), (input['month']), (input['day']))}
#print(output)
When I remove {...} it then throws: 'unicode' object has no attribute 'copy'
I encountered this issue when working with lambda transformation blueprint kinesis-firehose-process-record-python for Kinesis Firehose which led me here. Thus I will post a solution to anyone who also finds this questions when having issues with the lambda.
The blueprint is:
from __future__ import print_function
import base64
print('Loading function')
def lambda_handler(event, context):
output = []
for record in event['records']:
print(record['recordId'])
payload = base64.b64decode(record['data'])
# Do custom processing on the payload here
output_record = {
'recordId': record['recordId'],
'result': 'Ok',
'data': base64.b64encode(payload)
}
output.append(output_record)
print('Successfully processed {} records.'.format(len(event['records'])))
return {'records': output}
The thing to note is that the Firehose lambda blueprints for python provided by AWS are for Python 2.7, and they don't work with Python 3. The reason is that in Python 3, strings and byte arrays are different.
The key change to make it work with lambda powered by Python 3.x runtime was:
changing
'data': base64.b64encode(payload)
into
'data': base64.b64encode(payload).decode("utf-8")
Otherwise, the lambda had an error due to inability to serialize JSON with byte array returned from base64.b64encode.
David here, from the Zapier Platform team.
Per the docs:
output: A dictionary or list of dictionaries that will be the "return value" of this code. You can explicitly return early if you like. This must be JSON serializable!
In your case, output is a set:
>>> output = {'Yes'}
>>> type(output)
<class 'set'>
>>> json.dumps(output)
Object of type set is not JSON serializable
To be serializable, you need a dict (which has keys and values). Change your last line to include a key and it'll work like you expect:
# \ here /
output = {'result': time_remaining_less_than_fourteen((input['year']), (input['month']), (input['day']))}