json strip multiple lists [duplicate] - python

I am looking for more info regarding this issue I have. So far I have checked the JSON encoding/decoding but it was not precisely what I was looking for.
I am looking for some way to strip this kind of list quite easily:
//response
{
"age":[
{"#":"1","age":10},
{"#":"2","age":12},
{"#":"3","age":16},
{"#":"4","age":3}
],
"age2":[
{"#":"1","age":10},
{"#":"2","age":12},
{"#":"3","age":16},
{"#":"4","age":3}
],
"days_month":31,
"year":2017
}
So how do I easily extract the data? For example, I want to get the age of the person in age2 whose # is 3.
To get the results for year/days_month I found this solution with Google:
j=json.loads(r.content)
print(j['year'])
to retrieve the data. Probably I have missed something somewhere on the internet, but I could not find the specific solution for this case.

I think this is what @Jean-François Fabre tried to indicate:
import json
response = """
{
"age":[
{"#":"1","age":10},
{"#":"2","age":12},
{"#":"3","age":16},
{"#":"4","age":3}
],
"age2":[
{"#":"1","age":10},
{"#":"2","age":12},
{"#":"3","age":16},
{"#":"4","age":3}
],
"days_month":31,
"year":2017
}
"""
j = json.loads(response)
# note that the [2] means the third element in the "age2" list-of-dicts
print(j['age2'][2]['#']) # -> 3
print(j['age2'][2]['age']) # -> 16
json.loads() converts a string in JSON format into a Python object. In particular, it converts JSON objects into Python dictionaries and JSON arrays into Python lists. This means you can access the contents of the result stored in j just as you would any native combination of those Python datatypes, and its structure looks very similar to what is shown in the response.
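As a quick check of that type mapping, here is a minimal sketch using a trimmed copy of the response from the question (the shortened string is only for illustration):

```python
import json

# A trimmed copy of the response from the question.
response = '{"age2": [{"#": "3", "age": 16}], "days_month": 31, "year": 2017}'
j = json.loads(response)

print(type(j).__name__)                  # dict (JSON object)
print(type(j["age2"]).__name__)          # list (JSON array)
print(type(j["age2"][0]).__name__)       # dict (JSON object inside the array)
print(type(j["age2"][0]["#"]).__name__)  # str: "#" holds the string "3", not the number 3
print(type(j["year"]).__name__)          # int
```

Note that "#" is stored as a string in this data, so any comparison against it has to use '3' rather than 3.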

As the search criterion you are looking for is not contained in the indices of the respective datastructures, I would do it using a list comprehension. For your example, this would be
[person['age'] for person in j['age2'] if person['#'] == u'3'][0]
This iterates through all the items in the list under 'age2' and collects the 'age' of every item whose '#' is '3' into a new list. The [0] selects the first entry of that list (and raises an IndexError if nothing matched).
However, the comprehension scans the whole list on every lookup, which gets slow for large datasets or repeated queries. In that case you might want to have a look at pandas:
import pandas
df = pandas.DataFrame(j['age2'])
df[df['#'] == '3']['age']
This returns a Series of the matching ages and tends to be faster, as long as your data can be represented as a series or table.
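If the same data is queried many times and pulling in pandas is not desired, another option is to build a dictionary keyed by '#' once and look values up directly. This is only a sketch: the name age_by_number is made up for illustration, and it assumes each '#' appears at most once per list.

```python
import json

# A trimmed copy of the "age2" part of the response from the question.
response = """
{"age2": [
    {"#": "1", "age": 10},
    {"#": "2", "age": 12},
    {"#": "3", "age": 16},
    {"#": "4", "age": 3}
]}
"""
j = json.loads(response)

# Build the index once; later lookups no longer scan the list.
# age_by_number is a hypothetical name, not part of the original answers.
age_by_number = {person["#"]: person["age"] for person in j["age2"]}

print(age_by_number["3"])      # 16
print(age_by_number.get("9"))  # None, because there is no entry with "#" == "9"
```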


join two JSON objects in Python on common data point [duplicate]

I keep getting a syntax error for this and Google is no help for my specific issue.
I'm trying to merge two data sets into a single dictionary. One data set comes from https://universalis.app/api/v2/marketable and looks to be an array. The other comes from https://raw.githubusercontent.com/ffxiv-teamcraft/ffxiv-teamcraft/master/apps/client/src/assets/data/items.json and appears to be just an object of objects. Example below with what I've tried.
Code:
import requests
import json
url = "https://universalis.app/api/v2/marketable"
response = json.loads(requests.get(url).text)
marketableItems = [
item
for item in response
]
url = "https://raw.githubusercontent.com/ffxiv-teamcraft/ffxiv-teamcraft/master/apps/client/src/assets/data/items.json"
allItemsResponse = json.loads(requests.get(url).text)
itemDictionary = [
Item, allItemsResponse[str(Item)]["en"]
for Item in marketableItems
]
this produces:
Item, allItemsResponse[str(Item)]["en"]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
SyntaxError: did you forget parentheses around the comprehension target?
I've googled a fair bit for this exact Syntax Error, but I'm not really able to find any sort of guide on how to join two objects like this. I'm able to get allItemsResponse[str(Item)]["en"] to return data, I just want it paired with the original data from the first URL.
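The error message points at the fix: a bare tuple such as `Item, value` is not allowed as the element of a list comprehension, so it needs parentheses. A minimal sketch follows, with small made-up stand-ins for the two API responses so that it runs offline; the item IDs and names are hypothetical, not real data from either URL.

```python
# Hypothetical stand-ins for the two responses, so the sketch runs offline.
marketable_items = [2, 3, 5]
all_items_response = {
    "2": {"en": "Item A"},
    "3": {"en": "Item B"},
    "5": {"en": "Item C"},
}

# Parenthesize the tuple to get a list of (id, name) pairs.
pairs = [
    (item, all_items_response[str(item)]["en"])
    for item in marketable_items
]

# Or, since the goal is a dictionary, use a dict comprehension directly.
item_dictionary = {
    item: all_items_response[str(item)]["en"]
    for item in marketable_items
}

print(pairs)            # [(2, 'Item A'), (3, 'Item B'), (5, 'Item C')]
print(item_dictionary)  # {2: 'Item A', 3: 'Item B', 5: 'Item C'}
```

If some marketable IDs might be missing from the second data set, the lookup would raise a KeyError; filtering with `if str(item) in all_items_response` is one way to guard against that.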

Data inside a JSON has letters and numbers I do not need, how to get data I need in Python

I am looking at extracting data from within a JSON file, but the data I need has numbers and letters before and sometimes after it. I would like to know if it is possible to remove the unnecessary numbers and letters I do not need. Here is an example of the data:
"most_common_aircraft":[{"planned_aircraft":"B738/L","dcount":4592},{"planned_aircraft":"H/B744/L","dcount":3639},{"planned_aircraft":"H/B77L/L","dcount":2579},{"planned_aircraft":"H/B772/L","dcount":1894},{"planned_aircraft":"H/B763/L","dcount":1661},{"planned_aircraft":"H/B748/L","dcount":1303},{"planned_aircraft":"B712/L","dcount":1289},{"planned_aircraft":"B739/L","dcount":1198},{"planned_aircraft":"H/B77W/L","dcount":978},{"planned_aircraft":"B738","dcount":957}]
"H/B77L/L , B752/L, A320/X, B738,"
All I am interested in is the main 4 letters/numbers: for example, instead of "H/B77L/L" I want just "B77L", and instead of "B752/L" I want "B752". The data is very mixed: some entries have letters in front, some at the end, some both, and others are already in the format I want. Is there a way to remove the additional letters while extracting the data from the JSON file using Python? If not, since I am using Pandas, would it be better to extract everything to a dataframe and then compare it to another dataframe which has the correct sequence without the additional letters?
I have managed to find the answer and solve my problem. I will put it here to help others who may have a similar problem:
for entry in json_data['results']:
    for value in entry['most_common_aircraft']:
        for splitted_string in value['planned_aircraft'].split('/'):
            if len(splitted_string) == 4:
                value['planned_aircraft'] = splitted_string
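The same split-on-'/' idea can be wrapped in a small helper so it can be tested on its own or applied to a dataframe column later. This is a sketch, not the code from the answer: the function name strip_aircraft_code is made up, and unlike the loop above it returns the first 4-character segment rather than the last one.

```python
def strip_aircraft_code(code):
    """Return the first 4-character segment of a code such as 'H/B77L/L'."""
    for part in code.split("/"):
        if len(part) == 4:
            return part
    return code  # no 4-character segment found: leave the value unchanged


# Sample values taken from the question.
samples = ["H/B77L/L", "B752/L", "A320/X", "B738"]
print([strip_aircraft_code(s) for s in samples])
# ['B77L', 'B752', 'A320', 'B738']
```

With pandas, the helper could be applied to a column with something like `df['planned_aircraft'].map(strip_aircraft_code)`, assuming the column holds plain strings.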

Best way to store dictionary items for later retrieval? [duplicate]

I am doing a fairly complex task of reading in a Python model and performing various tasks on it; afterwards the result gets written out as individual XML files. Along with this, I need to provide various summary files depending on what the individual Python model contains.
In Ruby, I would simply store this data in a struct and then parse the array of data. In Python, the dictionary is equivalent to struct, but what's not obvious to me in my testing is how I can add to the values in a dictionary so that if I have:
name: "John"
place: "Atlanta"
age: "18"
All of this neatly fits into a dictionary. But what about the next record?
When I use update, it replaces the dictionary items with the new data. So I thought, I would then use a list to simply append the list with my dictionary data. However, when I append my list (because I used update for the dictionary), my list now contains a list of all the same data.
What is the proper Python way to store multiple dictionary items so they can be accessed later like they are a single record? I thought maybe a tuple but that didn't seem to get me very far either, so I'm obviously doing something very wrong.
I would make a list of dictionaries, so the result would look like this:
struct = [{"name": "John", "place": "Atlanta", "age": "18"}, {"name": "Mary", "place": "New York", "age": "22"}]
Then you can, for example, loop over the list and print the values like this:
for ls in struct:
    print("Name:", ls["name"])
    print("Place:", ls["place"])
    print("Age:", ls["age"])
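The symptom described in the question (a list full of identical records) usually comes from appending the same dictionary object on every pass and changing it with update. A minimal sketch of the difference, assuming the records arrive as simple tuples; the variable names rows, shared, wrong and records are made up for illustration:

```python
# Hypothetical input records.
rows = [("John", "Atlanta", "18"), ("Mary", "New York", "22")]

# Reusing one dictionary: every list entry refers to the same object,
# so each update() overwrites what the earlier entries show.
shared = {}
wrong = []
for name, place, age in rows:
    shared.update({"name": name, "place": place, "age": age})
    wrong.append(shared)
print(wrong[0]["name"])  # Mary

# Creating a new dictionary per record keeps the entries independent.
records = []
for name, place, age in rows:
    record = {"name": name, "place": place, "age": age}
    records.append(record)
print(records[0]["name"])  # John
```

If the existing code has to keep using update, appending a copy (for example `wrong.append(dict(shared))`) is another way to avoid the shared reference.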

Django query returning too many values

I'm trying to get a list of the names in a table using Django. The field I'm searching for is "name", and I print out my response, which gives the following:
[u"name1", u"name2"]
However, when I send that to a website in javascript, I see that the length is 16, though console.log shows the same result as the python print statements. When I try to iterate over the list that prints as above, I get the integers 0-15 (the loop I am using is
for (var name in names)).
Why is the string representation of this list so much different than the actual representation, and how do I get a representation that matches the print representation if I can't iterate over it or anything?
This is because names is actually a string within your javascript. You need to pass back the json list or convert the stringified json into objects. This second part can be done with JSON.parse(). Unfortunately, your question doesn't show how you're returning the data or how you're handling the data within javascript, so I can't help you any further than this for now.
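On the Python side, the difference between a printed list and real JSON can be sketched as follows. Since the question does not show the view code, this only assumes the names end up in a plain Python list before being sent; it is not the asker's actual code.

```python
import json

names = ["name1", "name2"]

as_text = str(names)         # Python's own representation, not valid JSON
as_json = json.dumps(names)  # a JSON array that JSON.parse() can read

print(as_text)  # ['name1', 'name2']
print(as_json)  # ["name1", "name2"]

# Round-tripping the JSON text gives back an equal list.
print(json.loads(as_json) == names)  # True
```

Either way, what arrives in the browser is a string until it is parsed, which is why iterating over it yields character indices rather than names.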

How do I remove an extra square bracket from JSON in Python?

I have JSON of the following form:
{"blah":
[
[
{"first":
{"something":"that","something":"else","another":"thing","key":"value"}...[etc.]
}
]
]
}
that I'm trying to parse in Python. I've imported json (or simplejson, depending on what version of Python you're using) and everything goes pretty well until I get to this block:
for result in page['blah']:
    that = result['first']
    a_list.append(that)
which throws the error "TypeError: list indices must be integers, not str".
I'm pretty sure this error is due to the extra pair of square brackets that makes the JSON inside look like a list.
My question, assuming that's the case, is, How do I remove it and still have valid JSON to parse as dictionaries?
Other workarounds welcome. If I need to supply more info, let me know. Thanks!
(Added the missing curly bracket and changed a couple of confusing terms--I was trying to come up with generic terms on the fly, sorry for any confusion.)
If there's always exactly one "extra" set of array brackets:
for result in page['blah']:
    that = result[0]['this']
    a_list.append(that)
There is no need to remove the brackets from the JSON string. I think making sure to remove only the right ones is not worth the effort. Just figure out the right way to access the values you want.
The "extra" brackets are not the only problem. 'this' is a property of an object which is the value of 'first'. So to access 'this', you'd have to write
that = result[0]['first']['this']
Whether this always works or not depends on the left-out JSON data.
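A runnable sketch of that access pattern, looping over the inner list instead of indexing [0] so that more than one object per inner list is handled. The sample data is a guess at the elided structure and assumes every inner object has a 'first' key containing 'this'.

```python
# Assumed sample data; the real JSON in the question is partly elided.
page = {"blah": [[{"first": {"this": "that", "another": "thing"}}]]}

a_list = []
for result in page["blah"]:  # result is the inner list
    for item in result:      # item is a dictionary with a "first" key
        a_list.append(item["first"]["this"])

print(a_list)  # ['that']
```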
First, you are right - your error is related to the incorrect use of the Python data types and JSON output.
Second, don't use list or other Python built-in names (list is a built-in type, not a reserved word, so Python will not stop you from shadowing it) when naming your variables.
Finally, if you simply want to get all results for 'this' inner key, you can try using the following code:
data = {"blah":
[
[
{"first":
{"this":"that",
"something":"else","another":"thing","key":"value"}}
]
]
}
outres = []
for k, v in data.items():  # iterate over the top dictionary ('blah', ...)
    for sv in v:  # iterate through the first list
        for tv in sv:  # iterate through the second list
            for fk, fv in tv.items():  # iterate through each dictionary from the second list
                if 'this' in fv:
                    outres.append(fv['this'])
print(outres)
Please note that my sample is based on your data sample, so if there are any additional levels in your data structure, or if any other rules should be applied, then the code should be modified.
