Nested JSON array not handled by pandas DataFrame / pd.json_normalize - Python

Any help is really appreciated.
I have the JSON below, returned by an API call. I've omitted sensitive data, but the key names are as presented ("value", "value_raw").
[{
"Position": "1234",
"StartDate": "2020-11-21",
"ID": "1234",
"CloseDate": "2020-12-07",
"Title": "This is a title",
"data": [{
"value": 1234
},
{
"value": "some text"
},
{
"value": "some text",
"value_raw": 11111
},
{
"value_raw": 11111,
"value": "some text"
},
{
"value": "null"
},
{
"value": "some text",
"value_raw": 22222
},
{
"value_raw": 2222222,
"value": "some text"
},
{
"value_raw": "null",
"value": "null"
},
{
"value_raw": "null",
"value": "null"
},
{
"value_raw": 2222222,
"value": "some text"
},
{
"value": "null"
},
{
"value": "some text",
"value_raw": 2222
},
{
"value": 1
},
{
"value": "some text",
"value_raw": 22222
}
]
}, {
"Position": "1235",
"StartDate": "2020-12-21",
"ID": "1235",
"CloseDate": "2021-01-12",
"Title": "some text",
"data": [{
"value": 1235
},
{
"value": "some text"
},
{
"value": "some text",
"value_raw": 1111
},
{
"value": "some text",
"value_raw": 1111
},
{
"value": "null"
},
{
"value_raw": 1111,
"value": "some text"
},
{
"value_raw": 11111,
"value": "some text"
},
{
"value_raw": "null",
"value": "null"
},
{
"value": "some text",
"value_raw": 1111
},
{
"value_raw": "null",
"value": "null"
},
{
"value": "null"
},
{
"value": "some text",
"value_raw": 22222
},
{
"value": 1
},
{
"value_raw": 22222,
"value": "some text"
}
]
}, {
"ID": "1236",
"Position": "1236",
"StartDate": "2021-07-12",
"data": [{
"value": 1236
},
{
"value": "some text"
},
{
"value_raw": 1111,
"value": "some text"
},
{
"value": "some text",
"value_raw": 1111
},
{
"value": "null"
},
{
"value_raw": 1111,
"value": "some text"
},
{
"value_raw": 1111,
"value": "some text"
},
{
"value_raw": "null",
"value": "null"
},
{
"value": "null",
"value_raw": "null"
},
{
"value": "some text",
"value_raw": 111
},
{
"value": "null"
},
{
"value": "some text",
"value_raw": 12223
},
{
"value": 1
},
{
"value": "some text",
"value_raw": 2222
}
],
"Title": "some text",
"CloseDate": "2021-07-23"
}
]
When I normalize "data" using:
df = pd.json_normalize(mydata, record_path=['data'])
I end up with an output of 2 columns x 42 rows (excl. headings), illustration:
value              value_raw
1234
This is a title
some text          11111
Corporation        11111
null
some text          22222
some text          2222222
null               null
null               null
...
The only data I'm interested in is the key "value". I'd also like to know how to lay this data out as 3 rows x 14 columns (one row for each ID = '1234', '1235' & '1236'); no column headings are needed, as they provide zero benefit when every column is named "value".
Any starting point would be great. I have spent hours looking at previous questions, and what I have noticed is that the JSON I receive is very different from all of the examples out there.
Thanks everyone

Nice question. I don't think this is possible with json_normalize alone, so I did it with a loop and a list comprehension:
import pandas as pd

# json_list is the parsed JSON from the question (the list of records)
values_list_all_rows = []
for json_element in json_list:
    values_list_per_row = [value_dict["value"] for value_dict in json_element["data"] if "value" in value_dict]
    values_list_all_rows.append(values_list_per_row)
pd.DataFrame(values_list_all_rows)
This gives the following (None is filled in where a row has fewer values than the others):
0 1 2 3 4 5 6 7 8 9 10 11 12 13
1234 some text some text some text null some text some text null null some text null some text 1 some text
1235 some text some text some text null some text some text null some text null null some text 1 some text
1236 some text some text some text null some text some text null null some text null some text 1 None
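If you also want each row labelled by its ID rather than 0, 1, 2, a small variation of the above (a sketch assuming every record carries an "ID" key, with mydata being the parsed JSON list from the question) is:
import pandas as pd

# one list of "value" entries per record, in the same order as the records
values = [[d["value"] for d in record["data"] if "value" in d] for record in mydata]
ids = [record["ID"] for record in mydata]

df = pd.DataFrame(values, index=ids)  # shorter rows are padded with NaN
print(df)
When exporting, the unwanted column headings can be dropped with df.to_csv('out.csv', header=False).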

Related

Python - trying to convert time from utc to cst in api response

Below is the code I am using to get data from an API, and below that is the response. I am trying to convert the datetime from UTC to CST and present the data in that time zone instead, but I am having trouble isolating the datetime.
import requests
import json
weather = requests.get('...')
j = json.loads(weather.text)
print (json.dumps(j, indent=2))
Response:
{
"metadata": null,
"data": [
{
"datetime": "2022-12-11T05:00:00Z",
"is_day_time": false,
"icon_code": 5,
"weather_text": "Clear with few low clouds and few cirrus",
"temperature": {
"value": 45.968,
"units": "F"
},
"feels_like_temperature": {
"value": 39.092,
"units": "F"
},
"relative_humidity": 56,
"precipitation": {
"precipitation_probability": 4,
"total_precipitation": {
"value": 0.0,
"units": "in"
}
},
"wind": {
"speed": {
"value": 5.144953471725125,
"units": "mi/h"
},
"direction": 25
},
"wind_gust": {
"value": 9.014853256979242,
"units": "mi/h"
},
"pressure": {
"value": 29.4171829577118,
"units": "inHg"
},
"visibility": {
"value": 6.835083114610673,
"units": "mi"
},
"dew_point": {
"value": 31.01,
"units": "F"
},
"cloud_cover": 31
},
{
"datetime": "2022-12-11T06:00:00Z",
"is_day_time": false,
"icon_code": 4,
"weather_text": "Clear with few low clouds",
"temperature": {
"value": 45.068,
"units": "F"
},
"feels_like_temperature": {
"value": 38.066,
"units": "F"
},
"relative_humidity": 56,
"precipitation": {
"precipitation_probability": 5,
"total_precipitation": {
"value": 0.0,
"units": "in"
}
},
"wind": {
"speed": {
"value": 5.167322834645669,
"units": "mi/h"
},
"direction": 27
},
"wind_gust": {
"value": 8.724051539012168,
"units": "mi/h"
},
"pressure": {
"value": 29.4213171559632,
"units": "inHg"
},
"visibility": {
"value": 5.592340730136005,
"units": "mi"
},
"dew_point": {
"value": 30.2,
"units": "F"
},
"cloud_cover": 13
},
{
"datetime": "2022-12-11T07:00:00Z",
"is_day_time": false,
"icon_code": 4,
"weather_text": "Clear with few low clouds",
"temperature": {
"value": 44.33,
"units": "F"
},
"feels_like_temperature": {
"value": 37.364,
"units": "F"
},
"relative_humidity": 56,
"precipitation": {
"precipitation_probability": 4,
"total_precipitation": {
"value": 0.0,
"units": "in"
}
},
"wind": {
"speed": {
"value": 4.988367931281317,
"units": "mi/h"
},
"direction": 28
},
"wind_gust": {
"value": 8.254294917680744,
"units": "mi/h"
},
"pressure": {
"value": 29.4165923579616,
"units": "inHg"
},
"visibility": {
"value": 7.456454306848007,
"units": "mi"
},
"dew_point": {
"value": 29.714,
"units": "F"
},
"cloud_cover": 22
}
],
"error": null
I am assuming what you mean is that you want to present the data in the current time of the Central Time zone. As of the date this question was asked, that would be CST (Central Standard Time). At another time it will be CDT (Central Daylight Time) based on daylight savings time rules that are followed in the Country/City for the time zone for which you wish to localize the data. The rules are all nicely kept in the IANA Timezone Database.
So the trick is that you pick your Country/City from the Timezone DB that follows the rules as they apply to your current time zone. For Central Time, America/Chicago usually works but YMMV.
There are a lot of ways to do this. This example is inefficiently iterating through the dictionary created by json.loads and replacing the time string with a converted string. The key is using the dateutil library to parse the timestamp string and convert using the proper UTC offset as defined for the time zone in the IANA database.
Hopefully this example has enough pieces you can copy and adapt to your own needs.
from dateutil.parser import parse
from dateutil import tz
import json

j = json.loads(weather.text)  # weather is the requests response from the question

# Loop through each data entry, reformatting the time
for entry in j["data"]:
    if "datetime" in entry:
        parsed_dt = parse(entry["datetime"])
        converted = parsed_dt.astimezone(tz.gettz("America/Chicago"))
        entry["datetime"] = converted.isoformat()

print(json.dumps(j, indent=2))
The resulting JSON has datetime fields that contain an ISO timestamp for the CST time.
{
"metadata": null,
"data": [{
"datetime": "2022-12-10T23:00:00-06:00",
"is_day_time": false,
"icon_code": 5,
"weather_text": "Clear with few low clouds and few cirrus",
"temperature": {
"value": 45.968,
"units": "F"
},
"feels_like_temperature": {
"value": 39.092,
"units": "F"
},
"relative_humidity": 56,
"precipitation": {
"precipitation_probability": 4,
"total_precipitation": {
"value": 0.0,
"units": "in"
}
},
"wind": {
"speed": {
"value": 5.144953471725125,
"units": "mi/h"
},
"direction": 25
},
"wind_gust": {
"value": 9.014853256979242,
"units": "mi/h"
},
"pressure": {
"value": 29.4171829577118,
"units": "inHg"
},
"visibility": {
"value": 6.835083114610673,
"units": "mi"
},
"dew_point": {
"value": 31.01,
"units": "F"
},
"cloud_cover": 31
},
{
"datetime": "2022-12-11T00:00:00-06:00",
"is_day_time": false,
"icon_code": 4,
"weather_text": "Clear with few low clouds",
"temperature": {
"value": 45.068,
"units": "F"
},
"feels_like_temperature": {
"value": 38.066,
"units": "F"
},
"relative_humidity": 56,
"precipitation": {
"precipitation_probability": 5,
"total_precipitation": {
"value": 0.0,
"units": "in"
}
},
"wind": {
"speed": {
"value": 5.167322834645669,
"units": "mi/h"
},
"direction": 27
},
"wind_gust": {
"value": 8.724051539012168,
"units": "mi/h"
},
"pressure": {
"value": 29.4213171559632,
"units": "inHg"
},
"visibility": {
"value": 5.592340730136005,
"units": "mi"
},
"dew_point": {
"value": 30.2,
"units": "F"
},
"cloud_cover": 13
},
{
"datetime": "2022-12-11T01:00:00-06:00",
"is_day_time": false,
"icon_code": 4,
"weather_text": "Clear with few low clouds",
"temperature": {
"value": 44.33,
"units": "F"
},
"feels_like_temperature": {
"value": 37.364,
"units": "F"
},
"relative_humidity": 56,
"precipitation": {
"precipitation_probability": 4,
"total_precipitation": {
"value": 0.0,
"units": "in"
}
},
"wind": {
"speed": {
"value": 4.988367931281317,
"units": "mi/h"
},
"direction": 28
},
"wind_gust": {
"value": 8.254294917680744,
"units": "mi/h"
},
"pressure": {
"value": 29.4165923579616,
"units": "inHg"
},
"visibility": {
"value": 7.456454306848007,
"units": "mi"
},
"dew_point": {
"value": 29.714,
"units": "F"
},
"cloud_cover": 22
}
],
"error": null
}
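If you are on Python 3.9+ and would rather avoid the extra dependency, the same conversion can be sketched with the standard-library zoneinfo module (reusing the parsed j dictionary from above):
from datetime import datetime
from zoneinfo import ZoneInfo

central = ZoneInfo("America/Chicago")
for entry in j["data"]:
    if "datetime" in entry:
        # fromisoformat() before Python 3.11 does not accept a trailing "Z",
        # so swap it for an explicit UTC offset first
        utc_dt = datetime.fromisoformat(entry["datetime"].replace("Z", "+00:00"))
        entry["datetime"] = utc_dt.astimezone(central).isoformat()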

Extract data from JSON index loaded file

My JSON file looks like:
{
"numAccounts": xxxx,
"filtersApplied": {
"accountIds": "All",
"checkIds": "All",
"categories": [
"cost_optimizing"
],
"statuses": "All",
"regions": "All",
"organizationalUnitIds": [
"yyyyy"
]
},
"categoryStatusMap": {
"cost_optimizing": {
"statusMap": {
"RULE_ERROR": {
"name": "Blue",
"count": 11
},
"ERROR": {
"name": "Red",
"count": 11
},
"OK": {
"name": "Green",
"count": 11
},
"WARN": {
"name": "Yellow",
"count": 11
}
},
"name": "Cost Optimizing",
"monthlySavings": 1111
}
},
"accountStatusMap": {
"xxxxxxxx": {
"cost_optimizing": {
"statusMap": {
"OK": {
"name": "Green",
"count": 1111
},
"WARN": {
"name": "Yellow",
"count": 111
}
},
"name": "Cost Optimizing",
"monthlySavings": 1111
}
},
Which I load into memory using pandas:
df = pd.read_json('file.json', orient='index')
I find the index orient the most suitable because it gives me:
print(df)
0
numAccounts 125
filtersApplied {'accountIds': 'All', 'checkIds': 'All', 'cate...
categoryStatusMap {'cost_optimizing': {'statusMap': {'RULE_ERROR...
accountStatusMap {'xxxxxxx': {'cost_optimizing': {'statusM...
Now, how can I access the accountStatusMap entry?
I tried account_status_map = df['accountStatusMap'], which gives me a
KeyError: 'accountStatusMap'
Is there something specific to the index orientation in how to access specific entries in a DataFrame?
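For what it's worth, with orient='index' the top-level JSON keys become the row index and the values end up in a single column named 0, so the lookup has to go through the row labels rather than the columns. A minimal sketch:
import pandas as pd

df = pd.read_json('file.json', orient='index')

# row-label lookup: the nested dict sits in column 0 of the 'accountStatusMap' row
account_status_map = df.loc['accountStatusMap', 0]
print(account_status_map)
If the goal is mainly to drill into the nested dicts, loading the file with json.load and working with the plain dictionary (or feeding parts of it to pd.json_normalize) may be simpler than read_json here.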

Python - Get Nested Data from Multiple Levels

Wasn't sure how to title this question, but I am working with the QuickBooks Online API, and when querying a report like BalanceSheet or GeneralLedger the API returns data rows at multiple nested levels, which is quite frustrating to parse through.
An example of the BalanceSheet return is included below. I am only interested in the data from "Row" objects, but as you can see those can be returned at 1, 2, 3 or more different levels. I am thinking of going through each level to check for Rows and then getting each Row, but that seems overly complex, as I would need multiple for loops for each level.
I'm wondering if there is a better way to get each "Row" in that data without regard to which level it is on? Any ideas would be appreciated!
Here's an example of a return from their sandbox data:
{
"Header": {
"Time": "2021-04-28T14:12:17-07:00",
"ReportName": "BalanceSheet",
"DateMacro": "this calendar year-to-date",
"ReportBasis": "Accrual",
"StartPeriod": "2021-01-01",
"EndPeriod": "2021-04-28",
"SummarizeColumnsBy": "Month",
"Currency": "USD",
"Option": [
{
"Name": "AccountingStandard",
"Value": "GAAP"
},
{
"Name": "NoReportData",
"Value": "false"
}
]
},
"Columns": {
"Column": [
{
"ColTitle": "",
"ColType": "Account",
"MetaData": [
{
"Name": "ColKey",
"Value": "account"
}
]
},
{
"ColTitle": "Jan 2021",
"ColType": "Money",
"MetaData": [
{
"Name": "StartDate",
"Value": "2021-01-01"
},
{
"Name": "EndDate",
"Value": "2021-01-31"
},
{
"Name": "ColKey",
"Value": "Jan 2021"
}
]
},
{
"ColTitle": "Feb 2021",
"ColType": "Money",
"MetaData": [
{
"Name": "StartDate",
"Value": "2021-02-01"
},
{
"Name": "EndDate",
"Value": "2021-02-28"
},
{
"Name": "ColKey",
"Value": "Feb 2021"
}
]
},
{
"ColTitle": "Mar 2021",
"ColType": "Money",
"MetaData": [
{
"Name": "StartDate",
"Value": "2021-03-01"
},
{
"Name": "EndDate",
"Value": "2021-03-31"
},
{
"Name": "ColKey",
"Value": "Mar 2021"
}
]
},
{
"ColTitle": "Apr 1-28, 2021",
"ColType": "Money",
"MetaData": [
{
"Name": "StartDate",
"Value": "2021-04-01"
},
{
"Name": "EndDate",
"Value": "2021-04-28"
},
{
"Name": "ColKey",
"Value": "Apr 1-28, 2021"
}
]
}
]
},
"Rows": {
"Row": [
{
"Header": {
"ColData": [
{
"value": "ASSETS"
},
{
"value": ""
},
{
"value": ""
},
{
"value": ""
},
{
"value": ""
}
]
},
"Rows": {
"Row": [
{
"Header": {
"ColData": [
{
"value": "Current Assets"
},
{
"value": ""
},
{
"value": ""
},
{
"value": ""
},
{
"value": ""
}
]
},
"Rows": {
"Row": [
{
"Header": {
"ColData": [
{
"value": "Bank Accounts"
},
{
"value": ""
},
{
"value": ""
},
{
"value": ""
},
{
"value": ""
}
]
},
"Rows": {
"Row": [
{
"ColData": [
{
"value": "Checking",
"id": "35"
},
{
"value": "1201.00"
},
{
"value": "1201.00"
},
{
"value": "1201.00"
},
{
"value": "1201.00"
}
],
"type": "Data"
},
{
"ColData": [
{
"value": "Savings",
"id": "36"
},
{
"value": "800.00"
},
{
"value": "800.00"
},
{
"value": "800.00"
},
{
"value": "800.00"
}
],
"type": "Data"
}
]
},
"Summary": {
"ColData": [
{
"value": "Total Bank Accounts"
},
{
"value": "2001.00"
},
{
"value": "2001.00"
},
{
"value": "2001.00"
},
{
"value": "2001.00"
}
]
},
"type": "Section",
"group": "BankAccounts"
},
{
"Header": {
"ColData": [
{
"value": "Accounts Receivable"
},
{
"value": ""
},
{
"value": ""
},
{
"value": ""
},
{
"value": ""
}
]
},
"Rows": {
"Row": [
{
"ColData": [
{
"value": "Accounts Receivable (A/R)",
"id": "84"
},
{
"value": "5281.52"
},
{
"value": "5281.52"
},
{
"value": "5281.52"
},
{
"value": "5281.52"
}
],
"type": "Data"
}
]
},
"Summary": {
"ColData": [
{
"value": "Total Accounts Receivable"
},
{
"value": "5281.52"
},
{
"value": "5281.52"
},
{
"value": "5281.52"
},
{
"value": "5281.52"
}
]
},
"type": "Section",
"group": "AR"
},
{
"Header": {
"ColData": [
{
"value": "Other Current Assets"
},
{
"value": ""
},
{
"value": ""
},
{
"value": ""
},
{
"value": ""
}
]
},
"Rows": {
"Row": [
{
"ColData": [
{
"value": "Inventory Asset",
"id": "81"
},
{
"value": "596.25"
},
{
"value": "596.25"
},
{
"value": "596.25"
},
{
"value": "596.25"
}
],
"type": "Data"
},
{
"ColData": [
{
"value": "Undeposited Funds",
"id": "4"
},
{
"value": "2062.52"
},
{
"value": "2062.52"
},
{
"value": "2062.52"
},
{
"value": "2062.52"
}
],
"type": "Data"
}
]
},
"Summary": {
"ColData": [
{
"value": "Total Other Current Assets"
},
{
"value": "2658.77"
},
{
"value": "2658.77"
},
{
"value": "2658.77"
},
{
"value": "2658.77"
}
]
},
"type": "Section",
"group": "OtherCurrentAssets"
}
]
},
"Summary": {
"ColData": [
{
"value": "Total Current Assets"
},
{
"value": "9941.29"
},
{
"value": "9941.29"
},
{
"value": "9941.29"
},
{
"value": "9941.29"
}
]
},
"type": "Section",
"group": "CurrentAssets"
},
{
"Header": {
"ColData": [
{
"value": "Fixed Assets"
},
{
"value": ""
},
{
"value": ""
},
{
"value": ""
},
{
"value": ""
}
]
},
"Rows": {
"Row": [
{
"Header": {
"ColData": [
{
"value": "Truck",
"id": "37"
},
{
"value": ""
},
{
"value": ""
},
{
"value": ""
},
{
"value": ""
}
]
},
"Rows": {
"Row": [
{
"ColData": [
{
"value": "Original Cost",
"id": "38"
},
{
"value": "13495.00"
},
{
"value": "13495.00"
},
{
"value": "13495.00"
},
{
"value": "13495.00"
}
],
"type": "Data"
}
]
},
"Summary": {
"ColData": [
{
"value": "Total Truck"
},
{
"value": "13495.00"
},
{
"value": "13495.00"
},
{
"value": "13495.00"
},
{
"value": "13495.00"
}
]
},
"type": "Section"
}
]
},
"Summary": {
"ColData": [
{
"value": "Total Fixed Assets"
},
{
"value": "13495.00"
},
{
"value": "13495.00"
},
{
"value": "13495.00"
},
{
"value": "13495.00"
}
]
},
"type": "Section",
"group": "FixedAssets"
}
]
},
"Summary": {
"ColData": [
{
"value": "TOTAL ASSETS"
},
{
"value": "23436.29"
},
{
"value": "23436.29"
},
{
"value": "23436.29"
},
{
"value": "23436.29"
}
]
},
"type": "Section",
"group": "TotalAssets"
},
{
"Header": {
"ColData": [
{
"value": "LIABILITIES AND EQUITY"
},
{
"value": ""
},
{
"value": ""
},
{
"value": ""
},
{
"value": ""
}
]
},
"Rows": {
"Row": [
{
"Header": {
"ColData": [
{
"value": "Liabilities"
},
{
"value": ""
},
{
"value": ""
},
{
"value": ""
},
{
"value": ""
}
]
},
"Rows": {
"Row": [
{
"Header": {
"ColData": [
{
"value": "Current Liabilities"
},
{
"value": ""
},
{
"value": ""
},
{
"value": ""
},
{
"value": ""
}
]
},
"Rows": {
"Row": [
{
"Header": {
"ColData": [
{
"value": "Accounts Payable"
},
{
"value": ""
},
{
"value": ""
},
{
"value": ""
},
{
"value": ""
}
]
},
"Rows": {
"Row": [
{
"ColData": [
{
"value": "Accounts Payable (A/P)",
"id": "33"
},
{
"value": "1602.67"
},
{
"value": "1602.67"
},
{
"value": "1602.67"
},
{
"value": "1602.67"
}
],
"type": "Data"
}
]
},
"Summary": {
"ColData": [
{
"value": "Total Accounts Payable"
},
{
"value": "1602.67"
},
{
"value": "1602.67"
},
{
"value": "1602.67"
},
{
"value": "1602.67"
}
]
},
"type": "Section",
"group": "AP"
},
{
"Header": {
"ColData": [
{
"value": "Credit Cards"
},
{
"value": ""
},
{
"value": ""
},
{
"value": ""
},
{
"value": ""
}
]
},
"Rows": {
"Row": [
{
"ColData": [
{
"value": "Mastercard",
"id": "41"
},
{
"value": "157.72"
},
{
"value": "157.72"
},
{
"value": "157.72"
},
{
"value": "157.72"
}
],
"type": "Data"
}
]
},
"Summary": {
"ColData": [
{
"value": "Total Credit Cards"
},
{
"value": "157.72"
},
{
"value": "157.72"
},
{
"value": "157.72"
},
{
"value": "157.72"
}
]
},
"type": "Section",
"group": "CreditCards"
},
{
"Header": {
"ColData": [
{
"value": "Other Current Liabilities"
},
{
"value": ""
},
{
"value": ""
},
{
"value": ""
},
{
"value": ""
}
]
},
"Rows": {
"Row": [
{
"ColData": [
{
"value": "Arizona Dept. of Revenue Payable",
"id": "89"
},
{
"value": "0.00"
},
{
"value": "0.00"
},
{
"value": "0.00"
},
{
"value": "0.00"
}
],
"type": "Data"
},
{
"ColData": [
{
"value": "Board of Equalization Payable",
"id": "90"
},
{
"value": "370.94"
},
{
"value": "370.94"
},
{
"value": "370.94"
},
{
"value": "370.94"
}
],
"type": "Data"
},
{
"ColData": [
{
"value": "Loan Payable",
"id": "43"
},
{
"value": "4000.00"
},
{
"value": "4000.00"
},
{
"value": "4000.00"
},
{
"value": "4000.00"
}
],
"type": "Data"
}
]
},
"Summary": {
"ColData": [
{
"value": "Total Other Current Liabilities"
},
{
"value": "4370.94"
},
{
"value": "4370.94"
},
{
"value": "4370.94"
},
{
"value": "4370.94"
}
]
},
"type": "Section",
"group": "OtherCurrentLiabilities"
}
]
},
"Summary": {
"ColData": [
{
"value": "Total Current Liabilities"
},
{
"value": "6131.33"
},
{
"value": "6131.33"
},
{
"value": "6131.33"
},
{
"value": "6131.33"
}
]
},
"type": "Section",
"group": "CurrentLiabilities"
},
{
"Header": {
"ColData": [
{
"value": "Long-Term Liabilities"
},
{
"value": ""
},
{
"value": ""
},
{
"value": ""
},
{
"value": ""
}
]
},
"Rows": {
"Row": [
{
"ColData": [
{
"value": "Notes Payable",
"id": "44"
},
{
"value": "25000.00"
},
{
"value": "25000.00"
},
{
"value": "25000.00"
},
{
"value": "25000.00"
}
],
"type": "Data"
}
]
},
"Summary": {
"ColData": [
{
"value": "Total Long-Term Liabilities"
},
{
"value": "25000.00"
},
{
"value": "25000.00"
},
{
"value": "25000.00"
},
{
"value": "25000.00"
}
]
},
"type": "Section",
"group": "LongTermLiabilities"
}
]
},
"Summary": {
"ColData": [
{
"value": "Total Liabilities"
},
{
"value": "31131.33"
},
{
"value": "31131.33"
},
{
"value": "31131.33"
},
{
"value": "31131.33"
}
]
},
"type": "Section",
"group": "Liabilities"
},
{
"Header": {
"ColData": [
{
"value": "Equity"
},
{
"value": ""
},
{
"value": ""
},
{
"value": ""
},
{
"value": ""
}
]
},
"Rows": {
"Row": [
{
"ColData": [
{
"value": "Opening Balance Equity",
"id": "34"
},
{
"value": "-9337.50"
},
{
"value": "-9337.50"
},
{
"value": "-9337.50"
},
{
"value": "-9337.50"
}
],
"type": "Data"
},
{
"ColData": [
{
"value": "Retained Earnings",
"id": "2"
},
{
"value": "1642.46"
},
{
"value": "1642.46"
},
{
"value": "1642.46"
},
{
"value": "1642.46"
}
],
"type": "Data"
},
{
"ColData": [
{
"value": "Net Income"
},
{
"value": ""
},
{
"value": ""
},
{
"value": ""
},
{
"value": ""
}
],
"type": "Data",
"group": "NetIncome"
}
]
},
"Summary": {
"ColData": [
{
"value": "Total Equity"
},
{
"value": "-7695.04"
},
{
"value": "-7695.04"
},
{
"value": "-7695.04"
},
{
"value": "-7695.04"
}
]
},
"type": "Section",
"group": "Equity"
}
]
},
"Summary": {
"ColData": [
{
"value": "TOTAL LIABILITIES AND EQUITY"
},
{
"value": "23436.29"
},
{
"value": "23436.29"
},
{
"value": "23436.29"
},
{
"value": "23436.29"
}
]
},
"type": "Section",
"group": "TotalLiabilitiesAndEquity"
}
]
}
}
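One way to collect every "Row", no matter how deeply it is nested, is a small recursive generator. A sketch against the sample payload above, where report stands for the parsed response dict:
def iter_rows(node):
    # Yield every dict found in a "Rows" -> "Row" list, recursing into each one,
    # since a Section row can contain further nested "Rows".
    if isinstance(node, dict):
        for row in node.get("Rows", {}).get("Row", []):
            yield row
            yield from iter_rows(row)

# Example: print the ColData values of every leaf "Data" row
for row in iter_rows(report):
    if row.get("type") == "Data":
        print([col.get("value") for col in row.get("ColData", [])])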

Get different values from repeating item JSON

I have this JSON-derived dict:
{
"stats": [
{
"name": "Jengas",
"time": 166,
"uid": "177098244407558145",
"id": 1
},
{
"name": "- k",
"time": 20,
"uid": "199295228664872961",
"id": 2
},
{
"name": "MAD MARX",
"time": "0",
"uid": "336539711785009153",
"id": 3
},
{
"name": "loli",
"time": 20,
"uid": "366299640976375818",
"id": 4
},
{
"name": "Woona",
"time": 20,
"uid": "246996981178695686",
"id": 5
}
]
}
I want to get the "time" from everybody in the list and use it with sort.
So the result I get has this:
TOP 10:
Jengas: 166
Loli: 20
My first try was just to pull a single value out of the repeating items. Right now the code is:
import json

with open('db.json') as json_data:
    topvjson = json.load(json_data)
print(topvjson)
d = topvjson['stats'][0]['time']
print(d)
Extract the stats list, apply sort to it with the appropriate key:
from json import loads
data = loads("""{
"stats": [{
"name": "Jengas",
"time": 166,
"uid": "177098244407558145",
"id": 1
}, {
"name": "- k",
"time": 20,
"uid": "199295228664872961",
"id": 2
}, {
"name": "MAD MARX",
"time": "0",
"uid": "336539711785009153",
"id": 3
}, {
"name": "loli",
"time": 20,
"uid": "366299640976375818",
"id": 4
}, {
"name": "Woona",
"time": 20,
"uid": "246996981178695686",
"id": 5
}]
}""")
stats = data['stats']
stats.sort(key = lambda entry: int(entry['time']), reverse=True)
print("TOP 10:")
for entry in stats[:10]:
print("%s: %d" % (entry['name'], int(entry['time'])))
This prints:
TOP 10:
Jengas: 166
- k: 20
loli: 20
Woona: 20
MAD MARX: 0
Note that your time values are not consistently typed: the dataset contains both numbers (166, 20) and strings ("0"). That's why you need the int(...) conversion.
You can sort the list of dict values like:
Code:
top_three = [(x[1], -x[0]) for x in sorted(
(-int(user['time']), user['name']) for user in stats['stats'])][:3]
This works by taking the time and the name and building a tuple. The tuples can then be sorted, and the names can be extracted (via x[1]) after the sort.
Test Code:
stats = {
"stats": [{
"name": "Jengas",
"time": 166,
"uid": "177098244407558145",
"id": 1
}, {
"name": "- k",
"time": 20,
"uid": "199295228664872961",
"id": 2
}, {
"name": "MAD MARX",
"time": "0",
"uid": "336539711785009153",
"id": 3
}, {
"name": "loli",
"time": 20,
"uid": "366299640976375818",
"id": 4
}, {
"name": "Woona",
"time": 20,
"uid": "246996981178695686",
"id": 5
}]
}
top_three = [(x[1], -x[0]) for x in sorted(
    (-int(user['time']), user['name']) for user in stats['stats'])][:3]
print(top_three)
Results:
[('Jengas', 166), ('- k', 20), ('Woona', 20)]
Here's a way to do it using the built-in sorted() function:
data = {
"stats": [
{
"name": "Jengas",
"time": 166,
"uid": "177098244407558145",
"id": 1
},
{
etc ...
}
]
}
print('TOP 3')
sorted_by_time = sorted(data['stats'], key=lambda d: int(d['time']), reverse=True)
for i, d in enumerate(sorted_by_time, 1):
    if i > 3: break
    print('{name}: {time}'.format(**d))
Output:
TOP 3
Jengas: 166
- k: 20
loli: 20
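If you only ever need the top N entries, heapq.nlargest avoids sorting the whole list; a small sketch using the same data dict as the previous example:
import heapq

top_ten = heapq.nlargest(10, data['stats'], key=lambda d: int(d['time']))
for entry in top_ten:
    print('%s: %d' % (entry['name'], int(entry['time'])))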

Combine 2 JSON files into 1 file in Node or Python (i.e. longitude and latitude)

I want to append the longitude to the latitude; they are stored in 2 separate JSON files.
The result should be stored in a 3rd file.
How can I do that in Python or JavaScript/Node?
Many thanks for your support.
LATITUDE
{
"tags": [{
"name": "LATITUDE_deg",
"results": [{
"groups": [{
"name": "type",
"type": "number"
}],
"values": [
[1123306773000, 46.9976859318, 3],
[1123306774000, 46.9976859319, 3]
],
"attributes": {
"customer": ["Acme"],
"host": ["server1"]
}
}],
"stats": {
"rawCount": 2
}
}]
}
LONGITUDE
{
"tags": [{
"name": "LONGITUDE_deg",
"results": [{
"groups": [{
"name": "type",
"type": "number"
}],
"values": [
[1123306773000, 36.9976859318, 3],
[1123306774000, 36.9976859317, 3]
],
"attributes": {
"customer": ["Acme"],
"host": ["server1"]
}
}],
"stats": {
"rawCount": 2
}
}]
}
Expected result: LATITUDE_AND_LONGITUDE
{
"tags": [{
"name": "LATITUDE_AND_LONGITUDE_deg",
"results": [{
"groups": [{
"name": "type",
"type": "number"
}],
"values": [
[1123306773000, 46.9976859318, 36.9976859318, 3],
[1123306774000, 46.9976859319, 36.9976859317, 3]
],
"attributes": {
"customer": ["Acme"],
"host": ["server1"]
}
}],
"stats": {
"rawCount": 2
}
}]
}
I have written the solution with a colleague; you can find the source code on GitHub: https://gist.github.com/Abdelkrim/715eb222cc318219196c8be293c233bf
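For reference, a minimal Python sketch of the merge itself (assuming both files contain a single tag whose values arrays line up by timestamp, as in the samples above; the file names are illustrative):
import json

with open('latitude.json') as f_lat, open('longitude.json') as f_lon:
    lat = json.load(f_lat)
    lon = json.load(f_lon)

lat_result = lat['tags'][0]['results'][0]
# timestamp -> longitude lookup built from the second file
lon_by_ts = {row[0]: row[1] for row in lon['tags'][0]['results'][0]['values']}

# insert the longitude right after the latitude for each matching timestamp
lat_result['values'] = [[ts, lat_val, lon_by_ts[ts], quality]
                        for ts, lat_val, quality in lat_result['values']]
lat['tags'][0]['name'] = 'LATITUDE_AND_LONGITUDE_deg'

with open('latitude_and_longitude.json', 'w') as f_out:
    json.dump(lat, f_out, indent=2)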
