Python: construct a multipart/form-data request as defined in swagger.json

I am trying to construct a Python request based on a swagger.json schema. The schema specifies multipart/form-data, which I have researched. The remaining issue is the property of type "array": I am not sure how to send it. Below is the swagger.json schema.
"requestBody": {
"required": true,
"content": {
"multipart/form-data": {
"schema": {
"type": "object",
"properties": {
"id": {
"type": "string"
},
"name": {
"type": "string"
},
"file": {
"items": {
"type": "string",
"format": "binary"
},
"type": "array"
}
},
"required": [
"id",
"name",
"file"
]
}
}
}
}
I found that the files parameter of the Python requests module can send multipart form data (see "How to send a "multipart/form-data" with requests in python?"), but I don't know how to build the 'file' part, which is an array here. If it were a single object rather than an array, I would use 'file': ('testfile', open('testfile', 'rb')).
The UI side has not been deployed yet, so I cannot test. Could anyone help? Thanks.
data = {
'id' : test_id,
'name' : test_name,
'file': []
}
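For an array of binary items, one common approach with requests is to pass files as a list of tuples instead of a dict, repeating the field name once per file. The following is a minimal sketch under that assumption; the server may instead expect a name such as file[], so check against the real endpoint once it is deployed. The helper names build_multipart_parts and upload, and the url argument, are hypothetical.

```python
def build_multipart_parts(test_id, test_name, named_contents):
    """Return (data, files) suitable for requests.post(url, data=data, files=files).

    named_contents is a list of (filename, bytes_or_open_file) pairs.
    """
    data = {"id": test_id, "name": test_name}
    # A list of tuples (not a dict) lets the same field name appear several times,
    # which is how an array of binary parts is usually sent.
    files = [("file", (filename, content)) for filename, content in named_contents]
    return data, files


def upload(url, test_id, test_name, paths):
    """Hypothetical helper: POST every file in paths under the repeated 'file' field."""
    import os
    import requests  # third-party; imported here so the builder above has no dependencies

    handles = [open(path, "rb") for path in paths]
    try:
        data, files = build_multipart_parts(
            test_id,
            test_name,
            [(os.path.basename(p), h) for p, h in zip(paths, handles)],
        )
        # Do not set Content-Type yourself; requests adds the multipart boundary.
        return requests.post(url, data=data, files=files)
    finally:
        for handle in handles:
            handle.close()
```

The plain string fields go in data and only the binary parts go in files, so 'file' should not also appear in the data dict.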

Related

Why does json.load() not work with json schema?

I created a JSON schema and validated it with several online validators, e.g. Hyperjump.io. I want to load the schema with json.load() in Python, but it always raises the error
JSONDecodeError: Expecting value
According to jsonschema it should be possible to load JSON schema using json.load(). I tried it with simpler examples without success. I don't know what I'm missing and hope someone can help me out.
My schema:
{
"$schema": "https://json-schema.org/draft/2019-09/schema",
"type": "object",
"properties":
{
"Person":
{
"type": "array",
"title": "Person",
"items":
{
"type": "object",
"properties":
{
"FirstName":
{
"type": "string"
}
}
}
}
}
}
A simpler schema:
{
"$schema": "https://json-schema.org/draft/2019-09/schema",
"type": "object",
"properties":
{
"Person":
{
"type": "string"
}
}
}
My code:
import json
import os

schema = r"PATH/TO/JSON_FILE.json"
with open(schema) as schema_file:
    if os.path.splitext(schema)[1] == '.json':
        schema_data = json.load(schema_file)
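Both schemas shown are valid JSON, so the file content as read by Python is worth checking. "Expecting value" at the very start of the input is what json reports for an empty file, and it can also appear when a UTF-8 byte order mark is decoded with a non-UTF-8 default encoding. The sketch below is one way to rule out both; it assumes the file is meant to be UTF-8, and load_schema is a hypothetical helper name.

```python
import json


def load_schema(path):
    """Load a JSON file, tolerating a UTF-8 byte order mark and reporting empty files."""
    # utf-8-sig strips a leading BOM if present and otherwise behaves like utf-8.
    with open(path, encoding="utf-8-sig") as schema_file:
        text = schema_file.read()
    if not text.strip():
        raise ValueError(f"{path} is empty, so there is no JSON to parse")
    return json.loads(text)
```

If this still fails, printing repr(text[:50]) shows exactly which characters the parser sees first.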

Python: JSON to CSV

I am receiving a JSON file from a Docparser API, which I would like to convert to a CSV document.
The structure is here below:
{
"type": "object",
"properties": {
"id": {
"type": "string"
},
"document_id": {
"type": "string"
},
"remote_id": {
"type": "string"
},
"file_name": {
"type": "string"
},
"page_count": {
"type": "integer"
},
"uploaded_at": {
"type": "string"
},
"processed_at": {
"type": "string"
},
"table_data": [
{
"type": "array",
"items": {
"type": "object",
"properties": {
"account_ref": {
"type": "string"
},
"client": {
"type": "string"
},
"transaction_type": {
"type": "string"
},
"key_4": {
"type": "string"
},
"date_yyyymmdd": {
"type": "string"
},
"amount_excl": {
"type": "string"
}
},
"required": [
"account_ref",
"client",
"transaction_type",
"key_4",
"date_yyyymmdd",
"amount_excl"
]
}
}
]
}
}
The first problem that I have is how to only work with the table_data section?
My second problem is writing the actual code that allows me to put each section, i.e. account_ref, client, etc., into their own columns. I had so many changes to my code, the output varied from adding the properties into columns and dumping the table_data part into one cell, to only printing the headers into a single cell (as a list).
Here's my current code (which is not working correctly):
import pydocparser
import json
import pandas as pd
parser = pydocparser.Parser()
parser.login('API')
data2 = str(parser.fetch("Name of Parser", 'documentID'))
data2 = str(data2).replace("'", '"') # I had to put this in because it kept saying that it needs double quotes.
y = json.loads(str(data2))
json_file = open(r"C:\File.json", "w")
json_file.write(str(y))
json_file.close()
df1 = df = pd.DataFrame({str(y)})
df1.to_csv(r"C:\jsonCSV.csv")
Thanks for your help!
Pandas has a nice built-in function called pandas.json_normalize().
If you're using a pandas version lower than 1.0.0, use pandas.io.json.json_normalize() instead; it should split the columns nicely.
Read more about it here:
>=1.0.0:
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.json_normalize.html
<1.0.0:
https://pandas.pydata.org/pandas-docs/version/0.22/generated/pandas.io.json.json_normalize.html
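To work only with the table_data section, the idea is to pull out that list and treat each of its entries as one CSV row. The sketch below uses the standard csv module rather than pandas so that it runs without extra dependencies; with pandas, pandas.json_normalize(document, record_path="table_data") should give a comparable DataFrame. It assumes the fetched document is already a Python dict whose table_data is a list of flat row dicts; the sample values and the name table_data_to_csv are made up for illustration.

```python
import csv
import io


def table_data_to_csv(document, out_file):
    """Write only the rows under document['table_data'] to out_file as CSV."""
    rows = document.get("table_data", [])
    if not rows:
        return 0
    # Use the keys of the first row as the column order.
    fieldnames = list(rows[0].keys())
    writer = csv.DictWriter(out_file, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return len(rows)


# Made-up sample shaped like the structure in the question.
sample = {
    "id": "abc",
    "file_name": "statement.pdf",
    "table_data": [
        {"account_ref": "A1", "client": "Acme", "amount_excl": "10.00"},
        {"account_ref": "A2", "client": "Globex", "amount_excl": "25.50"},
    ],
}
buffer = io.StringIO()
table_data_to_csv(sample, buffer)
print(buffer.getvalue())
```

When writing to a real file, open it with newline="" as the csv documentation recommends. If parser.fetch already returns Python objects (the single quotes in its str() output suggest so), the str() and replace() round trip in the question is unnecessary and can corrupt values that contain quotes.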

JSON Schema: How to check if a field contains a value

I have a JSON schema validator where I need to check a specific field, email, to see if it's one of 4 possible emails. Let's call the possibilities ['test1', 'test2', 'test3', 'test4']. Sometimes the emails contain a \n newline separator, so I need to account for that also. Is it possible to do a string "contains" check in JSON Schema?
Here is my schema without the email checks:
{
"type": "object",
"properties": {
"data": {
"type":"object",
"properties": {
"email": {
"type": "string"
}
},
"required": ["email"]
}
}
}
My input payload is:
{
"data": {
"email": "test3\njunktext"
}
}
I need the payload above to pass validation since it has test3 in it. Thanks!
I can think of two ways:
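Using enum you can define a list of valid emails (enum only accepts exact values, so it will not accept an email with trailing text such as "test3\njunktext"):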
{
"type": "object",
"properties": {
"data": {
"type": "object",
"properties": {
"email": {
"enum": [
"test1",
"test2",
"test3",
"test4"
]
}
},
"required": [
"email"
]
}
}
}
Or with pattern, which lets you use a regular expression for matching a valid email. Because the pattern is not implicitly anchored, it behaves like a "contains" check:
{
"type": "object",
"properties": {
"data": {
"type": "object",
"properties": {
"email": {
"pattern": "test"
}
},
"required": [
"email"
]
}
}
}
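The difference between the two options can be checked outside a validator. The sketch below mimics the "contains" behaviour with Python's re.search, using the tighter pattern test[1-4] instead of the bare "test" so that only the four allowed values match. This is an illustration in Python's regex dialect, which agrees with JSON Schema's for a pattern this simple; the names ALLOWED, CONTAINS_PATTERN and email_matches are hypothetical.

```python
import re

ALLOWED = ["test1", "test2", "test3", "test4"]
# Unanchored: matches if one of the allowed values occurs anywhere in the string.
CONTAINS_PATTERN = "test[1-4]"


def email_matches(value):
    """Mimic how a JSON Schema 'pattern' is applied: a search, not a full match."""
    return re.search(CONTAINS_PATTERN, value) is not None


payload_email = "test3\njunktext"
print(email_matches(payload_email))   # pattern-style check
print(payload_email in ALLOWED)       # enum-style exact check
```

Putting "pattern": "test[1-4]" on the email property should therefore accept the payload from the question while rejecting values such as "test5".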

ElasticSearch Parse Error

I am attempting to read JSON data from a network port scan and store the results in an ElasticSearch index as a document. However, whenever I try to do this, I get a MapperParsingException error on the scan output results. In my mapping, I even tried changing the index setting to not_analyzed and no, but the error doesn't go away. Then I figured that ES might be trying to interpret certain values as dates and attempted to set date_format to 0 or none. That led to a dead end as well, with the mapping throwing an Unsupported option exception.
I have a dump of the values that I want to index in ElasticSearch here:
{
"protocol": "tcp",
"service": "ssh",
"state": "open",
"script_out": [
{
"output": "\n 1024 de:4e:50:33:cd:f6:8a:d0:c4:5a:e9:7d:1e:7b:13:12 (DSA)\nssh-dss AAAAB3NzaC1kc3MAAACBANkPx1nphZwsN1SVPPQHwz93abIHuEC4wMEeZiXdBC8RoSUUeCmdgPfIh4or0LvZ1pqaZP/k0qzCLyVxFt/eI7n36Lb9sZdVMf1Ao7E9TSc7lj9wg5ffY58WbWob/GQs1llGZ2K9Gp7oWuwCjKP164MsxMvahoJAAaWfap48ZiXpAAAAFQCnRMwRp8wBzzQU6lia8NegIb5rswAAAIEAxvN66VMDxE5aU8SvwwVmcUNwtVQWZ6pxn2W0gzF6H7JL1BhcnbCwQ3J/S6WdtqL2Dscw8drdAvsrN4XC8RT6Jowsir4q4HSQCybll6fSpNEdlv/nLIlYsH5ZuZZUIMxbTQ9vT0oYvzpDHejIQ/Zl1inYnJ+6XJmOc0LPUsu5PEsAAACAQO+Tsd3inLGskrqyrWSDO0VDD3cApYW7C+uTWXBfIoh/sVw+X9+OPa833w/PQkpacm68kYPXKS7GK8lqhg93dwbUNYFKz9MMNY6WVOjeAX9HtUAbglgLyRIt0CBqmL4snoZeKab22Nlmaf4aU5cHFlG9gnFEcK0vVIwIWp2EM/I=\n 2048 94:5f:86:77:81:39:2e:03:e0:42:d8:7d:10:a5:60:f0 (RSA)\nssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDV9BKj+QSavAr4UcDaCoHADVIaMOpcI5/hx/X9CRLDTxmB/WvEiL42tziMZEx7ipHT28/hl4HOwK64eXZuK75JMrMDutCZ2gmvRmvFKl6mAVbUEOlVkMGZeNJxATCZyWQyrZ6wA9E2ns5+id6l9C8we+bdq39cIR/e+yR8Ht8sfaigDi0gcW67GrHDI/oIgTQ79l+T/xAqCVrtQxqn/6pCuaCWQUVCxgOPXmJPbsd+g+oqZtm0aEjIJvcDJocMkZ2qMMlgMPeJBN27FCTKB80UUbV57iHXHzZF+cD7v+Jlw0fmyMapMkkPH+aabOUy7Kkbty1mucrFxaisLsckEf47",
"elements": {
"null": [
{
"type": "ssh-dss",
"bits": "1024",
"key": "AAAAB3NzaC1kc3MAAACBANkPx1nphZwsN1SVPPQHwz93abIHuEC4wMEeZiXdBC8RoSUUeCmdgPfIh4or0LvZ1pqaZP/k0qzCLyVxFt/eI7n36Lb9sZdVMf1Ao7E9TSc7lj9wg5ffY58WbWob/GQs1llGZ2K9Gp7oWuwCjKP164MsxMvahoJAAaWfap48ZiXpAAAAFQCnRMwRp8wBzzQU6lia8NegIb5rswAAAIEAxvN66VMDxE5aU8SvwwVmcUNwtVQWZ6pxn2W0gzF6H7JL1BhcnbCwQ3J/S6WdtqL2Dscw8drdAvsrN4XC8RT6Jowsir4q4HSQCybll6fSpNEdlv/nLIlYsH5ZuZZUIMxbTQ9vT0oYvzpDHejIQ/Zl1inYnJ+6XJmOc0LPUsu5PEsAAACAQO+Tsd3inLGskrqyrWSDO0VDD3cApYW7C+uTWXBfIoh/sVw+X9+OPa833w/PQkpacm68kYPXKS7GK8lqhg93dwbUNYFKz9MMNY6WVOjeAX9HtUAbglgLyRIt0CBqmL4snoZeKab22Nlmaf4aU5cHFlG9gnFEcK0vVIwIWp2EM/I=",
"fingerprint": "de4e5033cdf68ad0c45ae97d1e7b1312"
},
{
"type": "ssh-rsa",
"bits": "2048",
"key": "AAAAB3NzaC1yc2EAAAADAQABAAABAQDV9BKj+QSavAr4UcDaCoHADVIaMOpcI5/hx/X9CRLDTxmB/WvEiL42tziMZEx7ipHT28/hl4HOwK64eXZuK75JMrMDutCZ2gmvRmvFKl6mAVbUEOlVkMGZeNJxATCZyWQyrZ6wA9E2ns5+id6l9C8we+bdq39cIR/e+yR8Ht8sfaigDi0gcW67GrHDI/oIgTQ79l+T/xAqCVrtQxqn/6pCuaCWQUVCxgOPXmJPbsd+g+oqZtm0aEjIJvcDJocMkZ2qMMlgMPeJBN27FCTKB80UUbV57iHXHzZF+cD7v+Jlw0fmyMapMkkPH+aabOUy7Kkbty1mucrFxaisLsckEf47",
"fingerprint": "945f867781392e03e042d87d10a560f0"
}
]
},
"id": "ssh-hostkey"
}
],
"banner": "product: OpenSSH version: 6.2 extrainfo: protocol 2.0",
"port": "22"
},
Update
I am able to index the content in the "output" key. However, the error appears when I try and index the content in the "elements" key
Update 2
There's a possibility that there's something wrong with my mapping. This is the python code that I am using for the mapping.
"scan_info": {
"properties": {
"protocol": {
"type": "string",
"index": "analyzed"
},
"service": {
"type": "string",
"index": "analyzed"
},
"state": {
"type": "string",
"index": "not_analyzed"
},
"banner": {
"type": "string",
"index": "analyzed"
},
"port": {
"type": "string",
"index": "not_analyzed"
},
"script_out": { #is this the problem??
"type": "object",
"dynamic": True
}
}
}
I am drawing a blank here. What do I need to do?
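Since the first update narrows the failure to the "elements" key, one possible workaround is to avoid mapping that subtree at all by serialising it to a string before indexing. The sketch below does not diagnose the underlying mapping problem, and it assumes you do not need to query inside elements; flatten_script_elements is a hypothetical helper name.

```python
import copy
import json


def flatten_script_elements(scan_doc):
    """Return a copy of scan_doc where each script_out[*].elements is a JSON string."""
    doc = copy.deepcopy(scan_doc)  # leave the caller's document untouched
    for script in doc.get("script_out", []):
        if "elements" in script:
            # Store the nested structure as text so it is indexed as a plain string.
            script["elements"] = json.dumps(script["elements"], sort_keys=True)
    return doc
```

The original structure can be recovered later with json.loads on the stored string.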

Is there a way to use JSON schemas to enforce values between fields?

I've recently started playing with JSON schemas to start enforcing API payloads. I'm hitting a bit of a roadblock with defining the schema for a legacy API that has some pretty kludgy design logic which has resulted (along with poor documentation) in clients misusing the endpoint.
Here's the schema so far:
{
"type": "array",
"items": {
"type": "object",
"properties": {
"type": {
"type": "string"
},
"object_id": {
"type": "string"
},
"question_id": {
"type": "string",
"pattern": "^-1|\\d+$"
},
"question_set_id": {
"type": "string",
"pattern": "^-1|\\d+$"
},
"timestamp": {
"type": "string",
"format": "date-time"
},
"values": {
"type": "array",
"items": {
"type": "string"
}
}
},
"required": [
"type",
"object_id",
"question_id",
"question_set_id",
"timestamp",
"values"
],
"additionalProperties": false
}
}
Notice that for question_id and question_set_id, they both take a numeric string that can either be a -1 or some other non-negative integer.
My question: is there a way to enforce that if question_id is set to -1, question_set_id is also set to -1, and vice versa?
It would be awesome if I could have that be validated by the parser rather than having to do that check in application logic.
Just for additional context, I've been using python's jsl module to generate this schema.
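A side note on the patterns in the schema above: in "^-1|\\d+$" the alternation binds more loosely than the anchors, so the expression reads as "starts with -1" or "ends with digits" rather than "is exactly -1 or digits". The sketch below shows the difference using Python's re module as a stand-in for the validator's regex engine; the names ORIGINAL, GROUPED and matches are hypothetical.

```python
import re

ORIGINAL = r"^-1|\d+$"    # parsed as (^-1) OR (\d+$)
GROUPED = r"^(-1|\d+)$"   # the whole string must be -1 or digits


def matches(pattern, value):
    """Apply the pattern the way JSON Schema does: as an unanchored search."""
    return re.search(pattern, value) is not None


for value in ["-1", "42", "abc123", "-1xyz"]:
    print(value, matches(ORIGINAL, value), matches(GROUPED, value))
```

Grouping the alternatives, as in "^(-1|\\d+)$", keeps both anchors attached to both branches.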
You can achieve the desired behavior by adding the following to your items schema. It asserts that each item must conform to at least one of the schemas in the list: either both values are "-1" or both are non-negative integers. (I assume you have good reason for representing integers as strings.)
"anyOf": [
{
"properties": {
"question_id": { "enum": ["-1"] },
"question_set_id": { "enum": ["-1"] }
}
},
{
"properties": {
"question_id": {
"type": "string",
"pattern": "^\\d+$"
},
"question_set_id": {
"type": "string",
"pattern": "^\\d+$"
}
}
}
]
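To see which combinations the two anyOf branches accept, the same rule can be written as a plain Python check. This is only a sketch for reasoning about the schema, not a replacement for validation; it assumes both keys are present (the outer schema already requires them), and ids_are_consistent is a hypothetical name.

```python
import re

# Used with fullmatch, so the whole string must consist of ASCII digits.
DIGITS = re.compile(r"[0-9]+")


def ids_are_consistent(item):
    """Mirror the anyOf: both ids are "-1", or both are digit strings."""
    question_id = item["question_id"]
    question_set_id = item["question_set_id"]
    both_minus_one = question_id == "-1" and question_set_id == "-1"
    both_numeric = (
        DIGITS.fullmatch(question_id) is not None
        and DIGITS.fullmatch(question_set_id) is not None
    )
    return both_minus_one or both_numeric
```

Mixed pairs such as ("-1", "7") fail both branches, which is the behaviour the question asks for.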
