flask-sqlalchemy dynamically construct query - python

I have an input json like the following:
{
  "page": 2,
  "limit": 10,
  "order": [
    {
      "field": "id",
      "type": "asc"
    },
    {
      "field": "email",
      "type": "desc"
    },
    ...
    {
      "field": "fieldN",
      "type": "desc"
    }
  ],
  "filter": [
    {
      "field": "company_id",
      "type": "=",
      "value": 1
    },
    ...
    {
      "field": "counter",
      "type": ">",
      "value": 5
    }
  ]
}
How do I dynamically construct a SQLAlchemy query based on my input JSON if I don't know the field count in advance?
Something like this:

User.query.filter(filter.field, filter.type, filter.value).filter(filter.field1, filter.type1, filter.value1)...filter(filter.fieldN, filter.typeN, filter.valueN).order_by("id", "asc").order_by("email", "desc").order_by("x1", "y1")....order_by("fieldN"...."desc").all()

Convert the JSON into a dictionary and retrieve the values.
If your JSON is in a file (say, data.json), the json library will satisfy your needs:

import json

with open("data.json") as f:
    data = json.load(f)

If your JSON is a string (say, json_data):

import json

data = json.loads(json_data)

If your JSON is a response from the Python requests library, i.e. res = requests.get(...), then res.json() will return a dictionary:

data = res.json()

In all three cases data is a plain dict, so data["filter"] and data["order"] are ordinary lists you can iterate over to build the query, as shown in the sketch below.
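One way to go from "I don't know the field count" to a query is to resolve each "field" to a model column with getattr and map the operator strings onto SQLAlchemy expressions. A minimal sketch, assuming a User model with the columns used in the example payload (the model, the OPERATORS map and the build_query helper are illustrative, not an existing API):

from flask import Flask
from flask_sqlalchemy import SQLAlchemy

app = Flask(__name__)
app.config["SQLALCHEMY_DATABASE_URI"] = "sqlite://"
db = SQLAlchemy(app)

class User(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    email = db.Column(db.String)
    company_id = db.Column(db.Integer)
    counter = db.Column(db.Integer)

# map the JSON operator strings onto SQLAlchemy column expressions
OPERATORS = {
    "=": lambda col, val: col == val,
    ">": lambda col, val: col > val,
    "<": lambda col, val: col < val,
}

def build_query(data):
    query = User.query
    for f in data.get("filter", []):
        column = getattr(User, f["field"])  # unknown field names raise AttributeError
        query = query.filter(OPERATORS[f["type"]](column, f["value"]))
    for o in data.get("order", []):
        column = getattr(User, o["field"])
        query = query.order_by(column.desc() if o["type"] == "desc" else column.asc())
    page, limit = data.get("page", 1), data.get("limit", 10)
    return query.limit(limit).offset((page - 1) * limit)

# usage, with data parsed as above:
users = build_query(data).all()

Letting an unknown field raise AttributeError is usually preferable to silently ignoring it, but you can whitelist allowed columns instead.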

Related

how to retrieve data from json file using python

I'm making API requests to get JSON that is then parsed and converted into data frames. The JSON may sometimes have empty fields. I'm posting two possible cases: the first JSON has the field I am looking for, and the second has that field empty.
1st json file:
print(resp2)
{
  "entityId": "proc_1234",
  "displayName": "oracle12",
  "firstSeenTms": 1639034760000,
  "lastSeenTms": 1650386100000,
  "properties": {
    "detectedName": "oracle.sysman.gcagent.tmmain.TMMain",
    "bitness": "64",
    "jvmVendor": "IBM",
    "metadata": [
      {
        "key": "COMMAND_LINE_ARGS",
        "value": "/usr/local/oracle/oem/agent12c/agent_13.3.0.0.0"
      },
      {
        "key": "EXE_NAME",
        "value": "java"
      },
      {
        "key": "EXE_PATH",
        "value": "/usr/local/oracle/oem/agent*c/agent_*/oracle_common/jdk/bin/java"
      },
      {
        "key": "JAVA_MAIN_CLASS",
        "value": "oracle.sysman.gcagent.tmmain.TMMain"
      },
      {
        "key": "EXE_PATH",
        "value": "/usr/local/oracle/oem/agent12c/agent_13.3.0.0.0/oracle_common/jdk/bin/java"
      }
    ]
  }
}
2nd Json file:
print(resp2)
{
  "entityId": "PROCESS_GROUP_INSTANCE-FB8C65551916D57D",
  "displayName": "Windows System",
  "firstSeenTms": 1619147697131,
  "lastSeenTms": 1653404640000,
  "properties": {
    "detectedName": "Windows System",
    "bitness": "32",
    "metadata": [],
    "awsNameTag": "Windows System",
    "softwareTechnologies": [
      {
        "type": "WINDOWS_SYSTEM"
      }
    ],
    "processType": "WINDOWS_SYSTEM"
  }
}
As you can see, "metadata": [] is empty.
I need to extract entityId and detectedName, and if metadata has data, I also need to get EXE_NAME and EXE_PATH. If the metadata section is empty, I still need to get the entityId and detectedName from this JSON file and form a data frame.
So I have done this:

# retrieve the detectedName value from the json
det_name = resp2.get('properties', {}).get('detectedName')

# retrieve EXE_NAME, EXE_PATH and entityId from the json.
# This part works when the metadata section has data
Procdf = (
    pd.json_normalize(resp2, record_path=['properties', 'metadata'], meta=['entityId'])
    .drop_duplicates(subset=['key'])
    .query("key in ['EXE_NAME','EXE_PATH']")
    .assign(detectedName=det_name)
    .pivot('entityId', 'key', 'value')
    .reset_index()
)

# Add detectedName to the Procdf data frame
Procdf["detectedName"] = det_name

The code above works when metadata has data; when it is empty ([]), I still need to create a data frame with entityId and detectedName, with EXE_NAME and EXE_PATH being empty.
How can I do this? Right now, when metadata is [], I get the error name 'key' is not defined and that JSON is skipped.
Why not create a new dict based on whether there's a value for metadata or not?
Here's an example (this should work with both response types):
import pandas as pd

def find_value(response: dict, key: str) -> str:
    result = []
    try:
        for x in response['properties']['metadata']:
            if x['key'] == key:
                result.append(x['value'])
    except KeyError:
        return ""
    return result[0] if result else ""

def get_values(response: dict) -> dict:
    return {
        "entityId": response['entityId'],
        "displayName": response['displayName'],
        "EXE_NAME": find_value(response, 'EXE_NAME'),
        "EXE_PATH": find_value(response, 'EXE_PATH'),
    }

sample_response = {
    "entityId": "PROCESS_GROUP_INSTANCE-FB8C65551916D57D",
    "displayName": "Windows System",
    "firstSeenTms": 1619147697131,
    "lastSeenTms": 1653404640000,
    "properties": {
        "detectedName": "Windows System",
        "bitness": "32",
        "awsNameTag": "Windows System",
        "metadata": [],
        "softwareTechnologies": [
            {
                "type": "WINDOWS_SYSTEM"
            }
        ],
        "processType": "WINDOWS_SYSTEM"
    }
}
print(pd.json_normalize(get_values(sample_response)))
Sample output for metadata being empty:

                                   entityId     displayName EXE_NAME EXE_PATH
0  PROCESS_GROUP_INSTANCE-FB8C65551916D57D  Windows System

And one when metadata carries, well, data:

    entityId  ...                                           EXE_PATH
0  proc_1234  ...  /usr/local/oracle/oem/agent*c/agent_*/oracle_c...

Extract values from JSON file

I have this JSON file:
{
  "entityId": "proc_1234",
  "displayName": "oracle12",
  "firstSeenTms": 1639034760000,
  "lastSeenTms": 1650386100000,
  "properties": {
    "detectedName": "oracle.sysman.gcagent.tmmain.TMMain",
    "bitness": "64",
    "jvmVendor": "IBM",
    "metadata": [
      {
        "key": "COMMAND_LINE_ARGS",
        "value": "/usr/local/oracle/oem/agent12c/agent_13.3.0.0.0"
      },
      {
        "key": "EXE_NAME",
        "value": "java"
      },
      {
        "key": "EXE_PATH",
        "value": "/usr/local/oracle/oem/agent*c/agent_*/oracle_common/jdk/bin/java"
      },
      {
        "key": "JAVA_MAIN_CLASS",
        "value": "oracle.sysman.gcagent.tmmain.TMMain"
      },
      {
        "key": "EXE_PATH",
        "value": "/usr/local/oracle/oem/agent12c/agent_13.3.0.0.0/oracle_common/jdk/bin/java"
      }
    ]
  }
}
I need to extract entityId, detectedName, EXE_NAME, EXE_PATH from the json file.
The output should look like this:

entityId   detectedName                         EXE_NAME  EXE_PATH
proc_1234  oracle.sysman.gcagent.tmmain.TMMain  java      /usr/local/oracle/oem/agent*c/agent_*/oracle_common/jdk/bin/java
I have tried this:
Procdf = (
    pd.json_normalize(resp2, record_path=['properties', 'metadata'], meta=['entityId'])
    .drop_duplicates(subset=['key'])
    .query("key in ['EXE_NAME','EXE_PATH']")
    .pivot('entityId', 'key', 'value', 'detectedName')
    .reset_index()
)
I get this error:
TypeError: pivot() takes from 1 to 4 positional arguments but 5 were given
It is not clear to me what exactly the purpose of the pivot is, but you are trying to pivot detectedName, which is not in your dataframe. Below might be what you need.
import pandas as pd

det_name = resp2['properties']['detectedName']
dataframe = (
    pd.json_normalize(resp2, record_path=['properties', 'metadata'], meta=['entityId'])
    .drop_duplicates(subset=['key'])
    .query("key in ['EXE_NAME','EXE_PATH']")
    .assign(detectedName=det_name)
    .T
)
print(type(dataframe))
<class 'pandas.core.frame.DataFrame'>
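If the wide one-row layout shown in the question is the goal, a sketch that passes pivot its arguments by keyword might help: DataFrame.pivot accepts only index, columns and values, which is why the call with four positional arguments raised the TypeError. This assumes resp2 already holds the parsed JSON shown above:

import pandas as pd

# resp2 is assumed to be the parsed JSON dict from the question
det_name = resp2['properties']['detectedName']
Procdf = (
    pd.json_normalize(resp2, record_path=['properties', 'metadata'], meta=['entityId'])
    .drop_duplicates(subset=['key'])                  # keep one row per metadata key
    .query("key in ['EXE_NAME','EXE_PATH']")
    .pivot(index='entityId', columns='key', values='value')
    .reset_index()
    .assign(detectedName=det_name)                    # add detectedName as its own column
)
print(Procdf)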

How to convert a JSON file from GET request into pandas dataframe?

I'm trying to convert JSON obtained from a Python GET request (requests library) into a pandas dataframe.
I've tried some other solutions on the subject, including json_normalize; however, it does not appear to be working. The dataframe appears as a single column of dictionaries.
response = requests.get(myUrl, headers=head)
data = response.json()
#what now?
gives me the following json:
"data": [
{
"timestamp": "2019-04-10T11:40:13.437Z",
"score": 87,
"sensors": [
{
"comp": "temp",
"value": 20.010000228881836
},
{
"comp": "humid",
"value": 34.4900016784668
},
{
"comp": "co2",
"value": 418
},
{
"comp": "voc",
"value": 166
},
{
"comp": "pm25",
"value": 4
},
{
"comp": "lux",
"value": 961.4000244140625
},
{
"comp": "spl_a",
"value": 45.70000076293945
}
],
"indices": [
{
"comp": "temp",
"value": -1
},
{
"comp": "humid",
"value": -2
},
{
"comp": "co2",
"value": 0
},
{
"comp": "voc",
"value": 0
},
{
"comp": "pm25",
"value": 0
}
]
}
How do I convert this into a dataframe? The end result is supposed to have the following headers:
You can import json to use the json package. The json package has a loads() method that converts a JSON string into a dict object; you can then index that dict by key to get the values to put into a dataframe.
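If the nested "sensors" list is what ends up as a single column of dictionaries, pd.json_normalize with record_path can flatten it into one row per reading. A minimal sketch, assuming the response body has the "data" list shown above (the inline sample payload here is truncated to two sensors for brevity):

import pandas as pd

# sample payload shaped like the response above
data = {
    "data": [
        {
            "timestamp": "2019-04-10T11:40:13.437Z",
            "score": 87,
            "sensors": [
                {"comp": "temp", "value": 20.01},
                {"comp": "humid", "value": 34.49},
            ],
        }
    ]
}

# one row per sensor reading, keeping timestamp and score as metadata columns
df = pd.json_normalize(data["data"], record_path=["sensors"], meta=["timestamp", "score"])

# optional: pivot the "comp" values into column headers
wide = df.pivot(index="timestamp", columns="comp", values="value").reset_index()
print(wide)

With a live request, data = response.json() would replace the inline sample.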

Python - Parse complex JSON with objectpath

I need to parse a Terraform state file, written in JSON format. I have to extract two pieces of data, the resource name and its id. This is an example file:
{
  "version": 1,
  "serial": 1,
  "modules": [
    {
      "path": [
        "root"
      ],
      "outputs": {},
      "resources": {
        "aws_security_group.vpc-xxxxxxx-test-1": {
          "type": "aws_security_group",
          "primary": {
            "id": "sg-xxxxxxxxxxxxxx",
            "attributes": {
              "description": "test-1",
              "name": "test-1"
            }
          }
        },
        "aws_security_group.vpc-xxxxxxx-test-2": {
          "type": "aws_security_group",
          "primary": {
            "id": "sg-yyyyyyyyyyyy",
            "attributes": {
              "description": "test-2",
              "name": "test-2"
            }
          }
        }
      }
    }
  ]
}
For each resource, I need to export the first key and the value of id; in this case, aws_security_group.vpc-xxxxxxx-test-1 sg-xxxxxxxxxxxxxx and aws_security_group.vpc-xxxxxxx-test-2 sg-yyyyyyyyyyyy.
I have tried to write this in Python:
#!/usr/bin/python3.6
import json
import objectpath

with open('file.json') as json_file:
    data = json.load(json_file)

json_tree = objectpath.Tree(data['modules'])
result = tuple(json_tree.execute('$..resources[0]'))
result is:

('aws_security_group.vpc-xxxxxxx-test-1', 'aws_security_group.vpc-xxxxxxx-test-2')

That's OK, but I can't extract the id. Any help is appreciated; other methods are welcome too.
Thanks
I don't know objectpath, but I think you need:

json_tree.execute('$..resources[0]..primary.id')

or even just

json_tree.execute('$..resources[0]..id')
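If objectpath turns out to be more trouble than it's worth, a plain walk over the parsed dict gives both the resource name and the id directly. A minimal sketch, assuming the state layout shown above:

import json

with open('file.json') as json_file:
    data = json.load(json_file)

# walk every module's resources and print "<resource name> <primary id>"
for module in data['modules']:
    for name, resource in module['resources'].items():
        print(name, resource['primary']['id'])

# aws_security_group.vpc-xxxxxxx-test-1 sg-xxxxxxxxxxxxxx
# aws_security_group.vpc-xxxxxxx-test-2 sg-yyyyyyyyyyyy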

How to modify nested JSON with python

I need to update (CRUD) a nested JSON file using Python, i.e. to be able to call Python functions to update/delete/create entries and write the result back to the JSON file.
Here is a sample file.
I am looking at the remap library, but I am not sure whether it will work.
{
  "groups": [
    {
      "name": "group1",
      "properties": [
        {
          "name": "Test-Key-String",
          "value": {
            "type": "String",
            "encoding": "utf-8",
            "data": "value1"
          }
        },
        {
          "name": "Test-Key-Integer",
          "value": {
            "type": "Integer",
            "data": 1000
          }
        }
      ],
      "groups": [
        {
          "name": "group-child",
          "properties": [
            {
              "name": "Test-Key-String",
              "value": {
                "type": "String",
                "encoding": "utf-8",
                "data": "value1"
              }
            },
            {
              "name": "Test-Key-Integer",
              "value": {
                "type": "Integer",
                "data": 1000
              }
            }
          ]
        }
      ]
    },
    {
      "name": "group2",
      "properties": [
        {
          "name": "Test-Key2-String",
          "value": {
            "type": "String",
            "encoding": "utf-8",
            "data": "value2"
          }
        }
      ]
    }
  ]
}
I feel like I'm missing something in your question. In any event, what I understand is that you want to read a json file, edit the data as a python object, then write it back out with the updated data?
Read the json file:
import json

with open("data.json") as f:
    data = json.load(f)
That creates a dictionary (given the format shown) that you can manipulate however you want. Assuming you want to write it out:
with open("data.json","w") as f:
json.dump(data,f)
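As a concrete example of the "manipulate however you want" step, here is a hypothetical update against the sample structure above: it changes the data of Test-Key-Integer in group1, then writes the file back. The group and property names come from the sample; pick whatever entries you actually need to edit.

import json

with open("data.json") as f:
    data = json.load(f)

# update: set Test-Key-Integer in group1 to a new value
for group in data["groups"]:
    if group["name"] == "group1":
        for prop in group["properties"]:
            if prop["name"] == "Test-Key-Integer":
                prop["value"]["data"] = 2000

# delete/create work the same way: remove or append items in these lists,
# then dump the structure back out
with open("data.json", "w") as f:
    json.dump(data, f, indent=2)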
