JSONPath issues with Python and jsonpath_ng (Parse error near token ?) - python
I'm trying to work with jsonpath_ng Python library. For most of the JSONPath filters I usually use it works.
However, I'm struggling with a simple filter clause. It can be summarized in 2 lines.
from jsonpath_ng.ext import parse
jsonpath_expression = parse(f"$.jobs.*.jobSummary.[?(@.storagePolicy.storagePolicyName=='{SPname}')].sizeOfApplication")
My JSON payload is this one:
{
"processinginstructioninfo": {
"attributes": [
{
"name": "WebServer",
"value": "IDM-COMMSERVE"
}
]
},
"totalRecordsWithoutPaging": 161,
"jobs": [
{
"jobSummary": {
"sizeOfApplication": 65552265428,
"vsaParentJobID": 28329591,
"commcellId": 2,
"backupSetName": "defaultBackupSet",
"opType": 59,
"totalFailedFolders": 0,
"totalFailedFiles": 0,
"alertColorLevel": 0,
"jobAttributes": 288232025419153408,
"jobAttributesEx": 67108864,
"isVisible": true,
"localizedStatus": "Completed",
"isAged": false,
"totalNumOfFiles": 0,
"jobId": 28329592,
"jobSubmitErrorCode": 0,
"sizeOfMediaOnDisk": 34199,
"currentPhase": 0,
"status": "Completed",
"lastUpdateTime": 1661877467,
"percentSavings": 99.99995,
"localizedOperationName": "Snap Backup",
"statusColor": "black",
"pendingReason": "",
"errorType": 0,
"backupLevel": 2,
"jobElapsedTime": 59,
"jobStartTime": 1661877408,
"currentPhaseName": "",
"jobType": "Snap Backup",
"isPreemptable": 0,
"backupLevelName": "Incremental",
"attemptStartTime": 0,
"pendingReasonErrorCode": "",
"appTypeName": "Virtual Server",
"percentComplete": 100,
"averageThroughput": 27472.637,
"localizedBackupLevelName": "Incremental",
"currentThroughput": 0,
"subclientName": "default",
"destClientName": "desktop-1058kvf",
"jobEndTime": 1661877467,
"dataSource": {
"dataSourceId": 0
},
"subclient": {
"clientName": "desktop-1058kvf",
"instanceName": "VMInstance",
"backupsetId": 161,
"commCellName": "idm-commserve",
"instanceId": 2,
"subclientId": 235,
"clientId": 71,
"appName": "Virtual Server",
"backupsetName": "defaultBackupSet",
"applicationId": 106,
"subclientName": "default"
},
"storagePolicy": {
"storagePolicyName": "IDM-Metallic-Replica_ReplicationPlan",
"storagePolicyId": 78
},
"destinationClient": {
"clientId": 71,
"clientName": "desktop-1058kvf",
"displayName": "idm-laptop1"
},
"userName": {
"userName": "admin",
"userId": 1
},
"clientGroups": [
{
"clientGroupId": 4,
"clientGroupName": "Laptop Clients"
},
{
"clientGroupId": 46,
"clientGroupName": "Clients For Commserv LiveSync"
},
{
"clientGroupId": 47,
"clientGroupName": "idm-vcsa"
},
{
"clientGroupId": 55,
"clientGroupName": "Laptop plan test clients"
}
]
}
}
]
}
I need to get just the "sizeOfApplication" value for every object with a particular "storagePolicyName". That's it. Say, as an example, that the "storagePolicyName" I'm looking for is "IDM-Metallic-Replica_ReplicationPlan".
I usually test the JSONPath expressions I use on my favourite JSONPath site, and this one works there:
"$.jobs.*.jobSummary.[?(@.storagePolicy.storagePolicyName=='IDM-Metallic-Replica_ReplicationPlan')].sizeOfApplication"
On the Python side, however, I keep getting "jsonpath_ng.exceptions.JsonPathParserError: Parse error at 1:21 near token ? (?)" errors.
What am I doing wrong?
Thank you!
Mattia
I think the problem here is that jsonpath_ng follows the JSONPath proposal more strictly than the other parsers you have tried.
The first problem is that there shouldn't be a . immediately before a filter condition [?(...)]. So the first step is to remove the . after jobSummary in jobSummary.[?(@.storagePolicy....
I made that change to your JSONPath expression, and used jsonpath_ng to run it on your sample data. The parser error had gone, but it returned no matches. So it's still not right.
From reading the JSONPath proposal, it's not clear if you can use a filter operator such as [?(...)] on an object, or only on an array. When used on an array it would return all elements of the array that match the filter. If a JSONPath parser does support a filter on an object, then it seems it returns the object if the filter matches and an empty list of matches otherwise.
I would guess that jsonpath_ng only permits filters on arrays. So let's modify your JSONPath expression to only use filters on arrays. Your JSON has an array in $.jobs, and within each element of this array you want to look within the jobSummary object for a storagePolicy with storagePolicyName=={SPname}. So the following JSONPath expression should return the matching job:
$.jobs[?(@.jobSummary.storagePolicy.storagePolicyName=='{SPname}')]
What we then want to do is to get the value of the sizeOfApplication property within the jobSummary object within each matching job. Note that the matches returned by the above JSONPath expression are elements of the jobs array, not jobSummary objects. We can't just select sizeOfApplication because we're one level further up than we were before. We need to go back into the jobSummary object to get the sizeOfApplication:
$.jobs[?(@.jobSummary.storagePolicy.storagePolicyName=='{SPname}')].jobSummary.sizeOfApplication
I used jsonpath_ng to run this JSONPath expression on your sample data and it gave me the output [65552265428], which seems to be the expected output.
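For reference, here is a minimal sketch of how that final expression might be run from Python. It assumes a trimmed-down copy of the payload (most fields omitted, plus a hypothetical second job with a different policy), and the helper names sizes_jsonpath and sizes_plain are made up for illustration. If jsonpath_ng isn't installed, the sketch falls back to a plain-Python equivalent of the same filter.

```python
# Trimmed-down copy of the payload from the question; the second job is
# hypothetical and only there to show that the filter excludes it.
data = {
    "jobs": [
        {"jobSummary": {
            "sizeOfApplication": 65552265428,
            "storagePolicy": {
                "storagePolicyName": "IDM-Metallic-Replica_ReplicationPlan",
                "storagePolicyId": 78,
            },
        }},
        {"jobSummary": {
            "sizeOfApplication": 123,
            "storagePolicy": {"storagePolicyName": "Other", "storagePolicyId": 1},
        }},
    ]
}

SPname = "IDM-Metallic-Replica_ReplicationPlan"


def sizes_plain(payload, policy_name):
    """Plain-Python equivalent of the JSONPath filter (no third-party code)."""
    return [
        job["jobSummary"]["sizeOfApplication"]
        for job in payload["jobs"]
        if job["jobSummary"]["storagePolicy"]["storagePolicyName"] == policy_name
    ]


def sizes_jsonpath(payload, policy_name):
    """Same query via jsonpath_ng: filter the jobs array, then drill down."""
    from jsonpath_ng.ext import parse  # third-party; imported lazily

    expr = parse(
        "$.jobs[?(@.jobSummary.storagePolicy.storagePolicyName=="
        f"'{policy_name}')].jobSummary.sizeOfApplication"
    )
    return [match.value for match in expr.find(payload)]


try:
    sizes = sizes_jsonpath(data, SPname)
except ImportError:
    # jsonpath_ng is not installed; use the plain-Python version instead.
    sizes = sizes_plain(data, SPname)

print(sizes)
```

Note that the filter is applied to the jobs array itself, so each match is a whole job, and the trailing .jobSummary.sizeOfApplication then picks the value out of it.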
Related
How to fix 'parse error on (VAR_SIGN)' in a graphql query in python
I am having trouble with GraphQL queries made in Python. It says that the $ sign that marks a variable in the query cannot be parsed. Error message:
{"errors":[{"message":"Parse error on \"$\" (VAR_SIGN) at [3, 3]","locations":[{"line":3,"column":3}]}]}
Is there another way to use variables in this kind of request? Here is my query. I didn't paste the fragment because I think it's not the problem:
query ApplicationIndexQuery(
  $status: Boolean!
  $page: Int
  $perPage: Int
  $filters: ApplicationFilter
  $sort: String
) {
  allOpportunityApplication(page: $page, per_page: $perPage, filters: $filters, sort: $sort) {
    ...ApplicationList_list
  }
}
variables = {
    "status": True,
    "page": 1,
    "perPage": 517,
    "filters": {
        "date_realized": {
            "from": "2018-12-01",
            "to": "2019-03-31"
        },
        "person_home_mc": 1535,
        "programmes": 5
    }
query should be at the top level, but it seems like in your example it's enclosed in curly braces. See below: https://github.com/apollographql/graphql-tag/issues/180#issuecomment-386540792
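As a rough sketch of what this could look like in Python: the operation text starts with the query keyword at the top level, and the variables travel separately in the same JSON request body. The names QUERY, variables and payload are illustrative, and the fragment definition (not shown in the question) would still have to be appended to the query text before sending.

```python
import json

# The operation starts with the `query` keyword at the top level; it is
# not wrapped in an extra pair of curly braces.
# NOTE: the ApplicationList_list fragment definition is not shown in the
# question and would need to be appended to this string.
QUERY = """
query ApplicationIndexQuery(
  $status: Boolean!
  $page: Int
  $perPage: Int
  $filters: ApplicationFilter
  $sort: String
) {
  allOpportunityApplication(page: $page, per_page: $perPage, filters: $filters, sort: $sort) {
    ...ApplicationList_list
  }
}
"""

# Variable values from the question, as a regular Python dict.
variables = {
    "status": True,
    "page": 1,
    "perPage": 517,
    "filters": {
        "date_realized": {"from": "2018-12-01", "to": "2019-03-31"},
        "person_home_mc": 1535,
        "programmes": 5,
    },
}

# Conventional GraphQL-over-HTTP body: query text and variables side by side.
payload = {"query": QUERY, "variables": variables}
body = json.dumps(payload)
print(body[:60])
```

The resulting body can then be sent as the JSON body of a POST request to whatever endpoint the API uses.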
I need to filter a specific values inside a list with tuples
I need to filter data out of a list that I get:
[{"end": 1547230999000, "attributes": {}, "metric": "system.cpu.idle", "interval": 20, "start": 1547227400000, "length": 180, "query_index": 0, "aggr": null, "scope": "host:osboxes", "pointlist": [[1547227400000.0, 99.6485366821289], [1547227420000.0, 99.60060119628906], [1547227440000.0, 99.40513610839844], "expression": "system.cpu.idle{host:osboxes}", "unit": [{"family": "percentage", "scale_factor": 1.0, "name": "percent", "short_name": "%", "plural": "percent", "id": 17}, null], "display_name": "system.cpu.idle"}], "to_date": 1547231000000, "resp_version": 1, "query": "system.cpu.idle{*}by{host}", "message": "", "group_by": ["host"]}
1) I only want the data that comes after the pointlist key and before the expression key. I thought about using a regex for this, but I'm not sure about it.
2) From the tuples I'd get after the first filter, for example [1547227420000.0, 99.60060119628906], I need just the second value of each one in some kind of structure. I tried using a regex but I can't seem to find the correct rule to grab just the stuff I want.
Here's the whole json ( dictionary) that i get as an input: [{"end": 1547230999000, "attributes": {}, "metric": "system.cpu.idle", "interval": 20, "start": 1547227400000, "length": 180, "query_index": 0, "aggr": null, "scope": "host:osboxes", "pointlist": [[1547227400000.0, 99.6485366821289], [1547227420000.0, 99.60060119628906], [1547227440000.0, 99.40513610839844], [1547227460000.0, 99.5660171508789], [1547227480000.0, 99.68238067626953], [1547227500000.0, 99.58213806152344], [1547227520000.0, 99.56404876708984], [1547227540000.0, 99.59886169433594], [1547227560000.0, 99.33905792236328], [1547227580000.0, 99.49874877929688], [1547227600000.0, 99.69874572753906], [1547227620000.0, 99.58246231079102], [1547227640000.0, 99.01371002197266], [1547227660000.0, 99.53114318847656], [1547227680000.0, 99.48202896118164], [1547227700000.0, 99.49647521972656], [1547227720000.0, 99.68254089355469], [1547227740000.0, 99.43094635009766], [1547227760000.0, 99.38209533691406], [1547227780000.0, 99.6488265991211], [1547227800000.0, 99.42307662963867], [1547227820000.0, 99.28117370605469], [1547227840000.0, 99.51512908935547], [1547227860000.0, 96.35371780395508], [1547227880000.0, 99.0471420288086], [1547227900000.0, 99.59866333007812], [1547227920000.0, 99.41494750976562], [1547227940000.0, 99.4984130859375], [1547227960000.0, 99.5489501953125], [1547227980000.0, 99.48962783813477], [1547228000000.0, 99.58173370361328], [1547228020000.0, 99.63229370117188], [1547228040000.0, 99.38098907470703], [1547228060000.0, 99.46452331542969], [1547228080000.0, 99.5501480102539], [1547228100000.0, 99.46395111083984], [1547228120000.0, 99.6651611328125], [1547228140000.0, 99.66544342041016], [1547228160000.0, 99.5067024230957], [1547228180000.0, 99.53192901611328], [1547228200000.0, 99.58263397216797], [1547228220000.0, 99.4233169555664], [1547228240000.0, 99.51488494873047], [1547228260000.0, 99.69884490966797], [1547228280000.0, 99.17123413085938], [1547228300000.0, 99.48178100585938], 
[1547228320000.0, 99.61544799804688], [1547228340000.0, 99.38138961791992], [1547228360000.0, 99.49983215332031], [1547228380000.0, 99.58074951171875], [1547228400000.0, 99.44026565551758], [1547228420000.0, 99.56558227539062], [1547228440000.0, 99.61634826660156], [1547228460000.0, 99.2971076965332], [1547228480000.0, 99.514404296875], [1547228500000.0, 99.56529235839844], [1547228520000.0, 99.48181915283203], [1547228540000.0, 99.49799346923828], [1547228560000.0, 99.56507110595703], [1547228580000.0, 99.47320556640625], [1547228600000.0, 99.49816131591797], [1547228620000.0, 99.59886169433594], [1547228640000.0, 99.0047836303711], [1547228660000.0, 99.48117065429688], [1547228680000.0, 99.66544342041016], [1547228700000.0, 99.49843215942383], [1547228720000.0, 99.48194885253906], [1547228740000.0, 99.63235473632812], [1547228760000.0, 99.36409378051758], [1547228780000.0, 93.1688461303711], [1547228800000.0, 99.34782409667969], [1547228820000.0, 99.46506118774414], [1547228840000.0, 99.33065795898438], [1547228860000.0, 99.59893035888672], [1547228880000.0, 99.47415924072266], [1547228900000.0, 99.46299743652344], [1547228920000.0, 99.5824966430664], [1547228940000.0, 99.39748764038086], [1547228960000.0, 99.46452331542969], [1547228980000.0, 99.71566772460938], [1547229000000.0, 99.4896354675293], [1547229020000.0, 99.481689453125], [1547229040000.0, 99.48186492919922], [1547229060000.0, 99.43965148925781], [1547229080000.0, 99.41500854492188], [1547229100000.0, 99.56536102294922], [1547229120000.0, 99.45612716674805], [1547229140000.0, 99.28033447265625], [1547229160000.0, 98.72547149658203], [1547229180000.0, 99.36448669433594], [1547229200000.0, 99.39749145507812], [1547229220000.0, 99.55000305175781], [1547229240000.0, 99.32996368408203], [1547229260000.0, 99.43115234375], [1547229280000.0, 99.41422271728516], [1547229300000.0, 99.41427993774414], [1547229320000.0, 99.4978256225586], [1547229340000.0, 99.63327026367188], [1547229360000.0, 
99.45573425292969], [1547229380000.0, 99.04618835449219], [1547229400000.0, 99.56463623046875], [1547229420000.0, 99.42306137084961], [1547229440000.0, 99.36380004882812], [1547229460000.0, 99.6164779663086], [1547229480000.0, 99.48064422607422], [1547229500000.0, 99.44741821289062], [1547229520000.0, 99.5820083618164], [1547229540000.0, 99.23918914794922], [1547229560000.0, 99.38034057617188], [1547229580000.0, 99.58187103271484], [1547229600000.0, 99.47303771972656], [1547229620000.0, 99.44770050048828], [1547229640000.0, 99.56521606445312], [1547229660000.0, 99.36420822143555], [1547229680000.0, 93.31424713134766], [1547229700000.0, 99.19745635986328], [1547229720000.0, 99.3642807006836], [1547229740000.0, 99.3148422241211], [1547229760000.0, 99.41403198242188], [1547229780000.0, 98.98696899414062], [1547229800000.0, 99.36422729492188], [1547229820000.0, 99.59711456298828], [1547229840000.0, 99.41479110717773], [1547229860000.0, 99.4476089477539], [1547229880000.0, 99.59845733642578], [1547229900000.0, 99.42321014404297], [1547229920000.0, 99.46488189697266], [1547229940000.0, 99.59845733642578], [1547229960000.0, 99.51408767700195], [1547229980000.0, 99.53137969970703], [1547230000000.0, 99.59893035888672], "expression": "system.cpu.idle{host:osboxes}", "unit": [{"family": "percentage", "scale_factor": 1.0, "name": "percent", "short_name": "%", "plural": "percent", "id": 17}, null], "display_name": "system.cpu.idle"}], "to_date": 1547231000000, "resp_version": 1, "query": "system.cpu.idle{*}by{host}", "message": "", "group_by": ["host"]} So the end result should achieve something of that nature: structure = [99.6485366821289,99.60060119628906,99.40513610839844...and so on]`
I'm assuming you're dealing with regular Python objects, so regexes aren't the way to go. If you are reading some JSON in, you can do this beforehand:
import json
with open("source_file.json", "r") as fh:
    data = json.load(fh)
Collecting the data you need is then:
new_list = []
for obj in data:
    new_list.extend(second for _, second in obj['pointlist'])
I'm guessing here that in case you have multiple "pointlist" instances you'd want to gather all of them. If you know there will be only one, this would work just as well:
new_list = [second for _, second in data[0]['pointlist']]
This is known as a list comprehension and is a quick way to process lists. The _, second part is called destructuring. Here _ is a dummy name; you could write first, second just as well, and the technique works whenever you have a list or tuple with a known number of items.
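To make the approach above concrete, here is a small self-contained sketch. It assumes a shortened, hypothetical version of the input (one series, only the first three points from the question) parsed from a string instead of a file.

```python
import json

# Shortened, hypothetical input: one series with the first three points
# from the question.
raw = (
    '[{"metric": "system.cpu.idle", "pointlist": ['
    "[1547227400000.0, 99.6485366821289], "
    "[1547227420000.0, 99.60060119628906], "
    "[1547227440000.0, 99.40513610839844]]}]"
)
data = json.loads(raw)

# Walk every series, unpack each [timestamp, value] pair and keep the value.
values = [value for obj in data for _timestamp, value in obj["pointlist"]]
print(values)
```

The nested comprehension does the same job as the extend loop: it gathers the second item of every pair across all "pointlist" entries into one flat list.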
Comparing value in a JSON using Python
I receive a fairly uncomfortable JSON to work with, which looks as follows:
[
  {
    "attributes": [
      {
        "type": "COMMAND",
        "name": "COMMAND",
        "value": [
          "buttonState"
        ]
      },
      {
        "type": "std_msgs.msg.Bool",
        "name": "buttonState",
        "value": {
          "data": false
        }
      }
    ],
    "type": "sensor",
    "id": "s_2"
  }
]
I would like to compare a piece of data (more precisely, the value of the button state) but I seem to fail. I tried the following:
import requests
import json
yo = 1
switchPost = "http://192.168.0.104:7896/iot/d?k=123456789&i=san_1_switch&d=sw|{}"
robGet = "http://192.168.0.109:10100/robot/sen_2"
r = requests.get(robGet, headers={"content-type":"application/json"})
resp = json.loads(r.text)
for attrs in (resp['attributes']['value']):
    if attrs['data'] == false:
        yo = 100
        break
g = requests.post(switchPost.format(yo), headers={"content-type":"text/plain"})
print(r.text)
Unfortunately, the error I receive is the following:
for attrs in (resp['attributes']['value']):
TypeError: list indices must be integers, not str
In your JSON, the fact that it is wrapped in [ and ] means it is a JSON array, but with just one element. So, as your error message suggests, resp needs an integer index to say which element of the array you want. resp[0] then refers to
{
  "attributes": [
    {
      "type": "COMMAND",
      "name": "COMMAND",
      "value": [
        "buttonState"
      ]
    },
    {
      "type": "std_msgs.msg.Bool",
      "name": "buttonState",
      "value": {
        "data": false
      }
    }
  ],
  "type": "sensor",
  "id": "s_2"
}
(notice there is no [] now, so it's a JSON object). Then you want resp[0]['attributes'] to refer to the single part of this object, 'attributes', which again refers to an array. Therefore
for attribute in resp[0]['attributes']
will allow you to loop through this array. To get the boolean value you want, you'll then want to find which element of that array has a 'name' of 'buttonState' and check the corresponding 'value'. In all, you're probably looking for something like:
for attribute in resp[0]['attributes']:
    if attribute['name'] == 'buttonState' and attribute['value']['data'] is False:
        # Do your thing here
resp is a list, so to get its first element, access it as resp[0]. The same goes for resp[0]['attributes'], so you can access the value as follows:
resp[0]['attributes'][0]['value']
You can restructure your for loop as follows:
for d in resp[0]['attributes']:
    if isinstance(d['value'], dict) and d['value'].get('data') is False:
        yo = 100
        break
The answer is in the error message I think:
TypeError: list indices must be integers, not str
The first entry in attributes has a value that is a list, so you can't get 'data' from that. Since you have a mix of types, you might need to check if 'value' is a list or a dict.
Edit: Jumped the gun here I think. @dennlinger gives an explanation to your error message. But you'll get it again once you're past that...
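Putting the points from these answers together, a minimal sketch might look like the following. It parses the sample JSON from a string instead of calling the robot's HTTP endpoint, and the helper name button_state is made up for illustration.

```python
import json

# Sample response from the question, parsed from a string instead of
# being fetched over HTTP.
raw = """[{"attributes": [
    {"type": "COMMAND", "name": "COMMAND", "value": ["buttonState"]},
    {"type": "std_msgs.msg.Bool", "name": "buttonState", "value": {"data": false}}
  ], "type": "sensor", "id": "s_2"}]"""
resp = json.loads(raw)


def button_state(sensors):
    """Return the buttonState 'data' value of the first sensor, or None."""
    for attribute in sensors[0]["attributes"]:
        value = attribute["value"]
        # 'value' is a list for the COMMAND entry and a dict for buttonState,
        # so check the type before looking up 'data'.
        if attribute["name"] == "buttonState" and isinstance(value, dict):
            return value.get("data")
    return None


# JSON false becomes Python False after json.loads.
yo = 100 if button_state(resp) is False else 1
print(yo)
```

Checking both the name and the type of 'value' avoids the second TypeError mentioned above, which would otherwise appear when the loop reaches the COMMAND entry.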
Accessing nested objects with python
I have a response that I receive from Foursquare in the form of JSON. I have tried to access certain parts of the object but have had no success. How would I access, say, the address of the object? Here is the code that I have tried:
url = 'https://api.foursquare.com/v2/venues/explore'
params = dict(client_id=foursquare_client_id,
              client_secret=foursquare_client_secret,
              v='20170801',
              ll=''+lat+','+long+'',
              query=mealType,
              limit=100)
resp = requests.get(url=url, params=params)
data = json.loads(resp.text)
msg = '{} {}'.format("Restaurant Address: ", data['response']['groups'][0]['items'][0]['venue']['location']['address'])
print(msg)
Here is an example of the JSON response:
"items": [
  {
    "reasons": {
      "count": 0,
      "items": [
        {
          "summary": "This spot is popular",
          "type": "general",
          "reasonName": "globalInteractionReason"
        }
      ]
    },
    "venue": {
      "id": "412d2800f964a520df0c1fe3",
      "name": "Central Park",
      "contact": {
        "phone": "2123106600",
        "formattedPhone": "(212) 310-6600",
        "twitter": "centralparknyc",
        "instagram": "centralparknyc",
        "facebook": "37965424481",
        "facebookUsername": "centralparknyc",
        "facebookName": "Central Park"
      },
      "location": {
        "address": "59th St to 110th St",
        "crossStreet": "5th Ave to Central Park West",
        "lat": 40.78408342593807,
        "lng": -73.96485328674316,
        "labeledLatLngs": [
          {
            "label": "display",
            "lat": 40.78408342593807,
            "lng": -73.96485328674316
          }
        ],
The full response can be found here.
Like so addrs=data['items'][2]['location']['address']
Your code (at least as far as loading and accessing the object) looks correct to me. I loaded the JSON from a file (since I don't have your Foursquare id) and it worked fine. You are correctly using object/dictionary keys and array positions to navigate to what you want. However, you misspelled "address" in the line where you drill down to the data. Adding the missing 'a' made it work. I'm also correcting the typo in the URL you posted. I answered this assuming that the example JSON you linked to is what is stored in data. If that isn't the case, a relatively easy way to see exactly what Python has stored in data is to import pprint and use it like so: pprint.pprint(data). You could also start an interactive Python shell by running the program with the -i switch and examine the variable yourself.
data["items"][2]["location"]["address"] This will access the address for you.
You can go to any level of nesting by using an integer index for an array and a string index for a dict. In your case items is an array:
#items[int index]
items[0]
Now items[0] is a dictionary, so we access it by string indexes:
items[0]['venue']['location']
This is again an object, so we use a string index:
items[0]['venue']['location']['address']
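As an illustration of that rule, here is a short sketch. It uses a heavily trimmed stand-in for the response (the response/groups wrapper is assumed from the access path in the question's code), and dig is a made-up helper that returns a default instead of raising when a step is missing.

```python
# Heavily trimmed stand-in for the Foursquare response; the outer
# "response" -> "groups" wrapper is assumed from the question's code.
data = {
    "response": {
        "groups": [
            {"items": [
                {"venue": {
                    "name": "Central Park",
                    "location": {"address": "59th St to 110th St"},
                }}
            ]}
        ]
    }
}


def dig(obj, *path, default=None):
    """Follow string keys (dicts) and integer indexes (lists) step by step."""
    for key in path:
        try:
            obj = obj[key]
        except (KeyError, IndexError, TypeError):
            # Missing key, index out of range, or wrong container type.
            return default
    return obj


address = dig(data, "response", "groups", 0, "items", 0,
              "venue", "location", "address")
print("Restaurant Address: {}".format(address))
```

Direct indexing such as data['response']['groups'][0]... works just as well when every level is known to exist; the helper is only a convenience for responses where some venues lack an address.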
Parse Json data
a = [{
    "data": {
        "check": true,
    },
    "AMI": {
        "status": 1,
        "firewall": {
            "status": enable
        },
        "d_suffix": "x.y.com",
        "id": 4
    },
    "tags": [  #Sometime tags could be like "tags": ["default","auto"]
        "default"
    ],
    "hostname": "abc.com",
}]
How can I get a hostname on the basis of tags? I am trying to implement it using
for i in a:
    if i['tags'] == 'default':
        output = i['hostname']
but it's failing because 'tags' is a list, which is not mapping to the hostname key. Is there any way I can get the hostname on the basis of 'tags'?
Use in to test if something is in a list. You also need to put default in quotes to make it a string.
for i in a:
    if 'default' in i['tags']:
        output = i['hostname']
        break
If you only need to find one match, you should break out of the loop once you find it. If you need to find multiple matches, use @phihag's answer with the list comprehension.
To get all hostnames tagged as default, use a list comprehension:
def_hostnames = [i['hostname'] for i in a if 'default' in i['tags']]
print('Default hostnames: %s' % ','.join(def_hostnames))
If you only want the first hit, either use def_hostnames[0] or the equivalent generator expression:
print('first: %s' % next(i['hostname'] for i in a if 'default' in i['tags']))
Your current code fails because it uses default, which is a variable named default. You want to look for a string default.
Make sure that the data is valid Python (True instead of true, and "enable" in quotes), like
a = [{
    "data": {
        "check": True,
    },
    "AMI": {
        "status": 1,
        "firewall": {
            "status": "enable"
        },
        "d_suffix": "x.y.com",
        "id": 4
    },
    "tags": [
        "default"
    ],
    "hostname": "abc.com",
}]
and then you can easily get it by using in:
for i in a:
    if 'default' in i['tags']:
        output = i['hostname']
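The answers above can be combined into a small sketch like this one. The second and third records are hypothetical, added only to show multiple matches and a non-match, and hostnames_with_tag is an illustrative helper name.

```python
# First record mirrors the question (trimmed); the other two are hypothetical.
a = [
    {"tags": ["default"], "hostname": "abc.com"},
    {"tags": ["default", "auto"], "hostname": "def.com"},
    {"tags": ["auto"], "hostname": "ghi.com"},
]


def hostnames_with_tag(records, tag):
    """Return the hostname of every record whose 'tags' list contains tag."""
    return [r["hostname"] for r in records if tag in r.get("tags", [])]


def_hostnames = hostnames_with_tag(a, "default")

# First match only; the None default avoids StopIteration when nothing matches.
first = next((r["hostname"] for r in a if "default" in r["tags"]), None)

print("Default hostnames: %s" % ",".join(def_hostnames))
print("first: %s" % first)
```

Passing a default to next is a small design choice: without it, an input with no matching tag would raise StopIteration.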