I have the JSON file below, from which I want to extract only the
("workers": {"usersRunning": 1, "usersWaiting": 0, "total": 8, "jobsWaiting": 0, "inUse": 4})
part and then write it to a CSV file or a tab-delimited text file. I am new to Python, so any help will be appreciated.
{
"workers": {
"usersRunning": 1,
"usersWaiting": 0,
"total": 8,
"jobsWaiting": 0,
"inUse": 4
},
"users": {
"activeUsers": 1,
"activity": [{
"maxWorkers": 4,
"inProgress": 4,
"displayName": "abc",
"waiting": 0
}]
}
}
I recommend using pandas. It has methods to read the json and you can use dataframe filtering to find the data you need.
Examples here: https://www.listendata.com/2019/07/how-to-filter-pandas-dataframe.html
I would recommend you use pandas and convert to Excel.
The following is a sample that should help you get your answer:
json_ = {"workers": {"usersRunning": 1, "usersWaiting": 0, "total": 8, "jobsWaiting": 0, "inUse": 4}, "users": {"activeUsers": 1, "activity": [{"maxWorkers": 4, "inProgress": 4, "displayName": "abc", "waiting": 0}]}}
import pandas as pd
df = pd.DataFrame(data= json_['workers'], index=[0])
df.to_excel('json_.xlsx')
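Since the question asked for a CSV or tab-delimited file rather than Excel, here is a minimal sketch using only the standard library. The helper name workers_to_tsv and the file name workers.tsv are assumptions made for illustration; the sample data is trimmed from the question.

```python
import csv
import io

# Sample data from the question (the "users" part is trimmed).
data = {
    "workers": {"usersRunning": 1, "usersWaiting": 0, "total": 8,
                "jobsWaiting": 0, "inUse": 4},
    "users": {"activeUsers": 1},
}

workers = data["workers"]


def workers_to_tsv(workers, out):
    """Write one header row and one value row, separated by tabs."""
    writer = csv.DictWriter(out, fieldnames=list(workers),
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerow(workers)


# Write into an in-memory buffer so the result can be inspected.
buffer = io.StringIO()
workers_to_tsv(workers, buffer)
tsv_text = buffer.getvalue()
print(tsv_text)

# To write to disk instead (the file name is an assumption):
# with open("workers.tsv", "w", newline="") as f:
#     workers_to_tsv(workers, f)
```

If you prefer to stay in pandas, df.to_csv('workers.tsv', sep='\t', index=False) on the DataFrame built above should produce the same kind of file.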
Could someone please help me convert this nested JSON to a CSV file?
{"campaignTitle": "Template Campaign", "listName": "Trial", "leadId": 573, "timezone": "Australia/Sydney", "isComplete": 0, "customerKey": "576", "phone1": "+61212345678", "phone2": "", "phone3": "", "leadUploadDate": "2020-07-03 16:25:07", "lastDiallerTimestamp": "2020-07-09 13:59:55", "scheduledCallTimestamp": "2020-07-09 15:59:50", "campaignId": 4, "listId": 4, "isDialling": 0, "leadData": "{\"Email\":\"xxx#xxx.com.au\",\"Address\":\"73 White Road\",\"MQL20\":null,\"HQL2\":null,\"HQL1\":null,\"Industry\":\"\",\"CompanyName\":\"Cofi-Com Trading Pty Limited\",\"HQL3\":null,\"RecordComments1\":null,\"RecordComments2\":null,\"RecordComments3\":null,\"RecordComments4\":null,\"MQL10\":null,\"MQL4\":null,\"MQL14\":null,\"MQL5\":null,\"MQL13\":null,\"MQL6\":null,\"MQL12\":null,\"City\":\"West Ryde\",\"MQL7\":null,\"MQL11\":null,\"MQL18\":null,\"Postcode\":\"2114\",\"MQL1\":null,\"MQL17\":null,\"BasicLead\":null,\"MQL2\":null,\"MQL16\":null,\"CallRecording\":null,\"MQL3\":null,\"MQL15\":null,\"MQL19\":null,\"MQL8\":null,\"MQL9\":null,\"State\":\"\",\"GlobalCompanySize\":null,\"Country\":\"AU\",\"LastName\":\"Black\",\"LocalCompanySize\":\"100 - 249\",\"HQL_Timeframe3\":null,\"HQL_Timeframe2\":null,\"HQL_Timeframe1\":null,\"Authority\":null,\"Content_Syndication\":null,\"Salutation\":\"Mr\",\"JobTitle\":\"Information Technology Head\",\"Filtering3\":null,\"Filtering2\":null,\"Filtering4\":null,\"FirstName\":\"John\",\"Filtering1\":null,\"RH_RID\":null,\"HQL_OpID1\":null,\"HQL_OpID3\":null,\"Meiro_ID\":null,\"HQL_OpID2\":null,\"QCOptIn\":null}", "dialAttempts": 1, "diallerOutcomes": [], "wrapCodeId": 0, "leadInteractions": [{"interactionId": 578, "activities": [642]}], "leadActivities": [{"activityId": 642, "interactions": [578]}]}
Refer to the following code to convert the JSON into a DataFrame for the given sample data. It uses the built-in pd.DataFrame.from_dict() together with the json module.
import json
import pandas as pd
with open('json_in.json', 'r') as f:
    json_in = f.read()
json_in = json.loads(json_in)
#json_in={"campaignTitle": "Template Campaign", "listName": "Trial", "leadId": 573, "timezone": "Australia/Sydney", "isComplete": 0, "customerKey": "576", "phone1": "+61212345678", "phone2": "", "phone3": "", "leadUploadDate": "2020-07-03 16:25:07", "lastDiallerTimestamp": "2020-07-09 13:59:55", "scheduledCallTimestamp": "2020-07-09 15:59:50", "campaignId": 4, "listId": 4, "isDialling": 0, "leadData": "{\"Email\":\"xxx#xxx.com.au\",\"Address\":\"73 White Road\",\"MQL20\":null,\"HQL2\":null,\"HQL1\":null,\"Industry\":\"\",\"CompanyName\":\"Cofi-Com Trading Pty Limited\",\"HQL3\":null,\"RecordComments1\":null,\"RecordComments2\":null,\"RecordComments3\":null,\"RecordComments4\":null,\"MQL10\":null,\"MQL4\":null,\"MQL14\":null,\"MQL5\":null,\"MQL13\":null,\"MQL6\":null,\"MQL12\":null,\"City\":\"West Ryde\",\"MQL7\":null,\"MQL11\":null,\"MQL18\":null,\"Postcode\":\"2114\",\"MQL1\":null,\"MQL17\":null,\"BasicLead\":null,\"MQL2\":null,\"MQL16\":null,\"CallRecording\":null,\"MQL3\":null,\"MQL15\":null,\"MQL19\":null,\"MQL8\":null,\"MQL9\":null,\"State\":\"\",\"GlobalCompanySize\":null,\"Country\":\"AU\",\"LastName\":\"Black\",\"LocalCompanySize\":\"100 - 249\",\"HQL_Timeframe3\":null,\"HQL_Timeframe2\":null,\"HQL_Timeframe1\":null,\"Authority\":null,\"Content_Syndication\":null,\"Salutation\":\"Mr\",\"JobTitle\":\"Information Technology Head\",\"Filtering3\":null,\"Filtering2\":null,\"Filtering4\":null,\"FirstName\":\"John\",\"Filtering1\":null,\"RH_RID\":null,\"HQL_OpID1\":null,\"HQL_OpID3\":null,\"Meiro_ID\":null,\"HQL_OpID2\":null,\"QCOptIn\":null}", "dialAttempts": 1, "diallerOutcomes": [], "wrapCodeId": 0, "leadInteractions": [{"interactionId": 578, "activities": [642]}], "leadActivities": [{"activityId": 642, "interactions": [578]}]}
df=pd.DataFrame.from_dict(json_in, orient='index')
df_final=pd.DataFrame.from_dict(json.loads(df.loc['leadData',:][0]), orient='index')
#To get transpose of the dataframe - Values in the Column Rather than index
df_final=df_final.T
#To copy a particular column to another dataframe
df_final.loc[:,"campaignTitle"]=df.loc["campaignTitle",:][0]
df_final.to_csv("<output-file.csv>", index=None)
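The key point in the answer above is that leadData is itself a JSON document stored inside a string, so it has to be decoded a second time. As a rough sketch of the same idea without pandas, the following flattens one record into a single CSV row. The record is heavily abbreviated from the question, and the helper name flatten_lead and the "leadData." column prefix are assumptions chosen for illustration.

```python
import csv
import io
import json

# Abbreviated version of the record from the question (most fields omitted).
record = {
    "campaignTitle": "Template Campaign",
    "leadId": 573,
    "leadData": '{"City": "West Ryde", "Country": "AU", "MQL1": null}',
    "diallerOutcomes": [],
}


def flatten_lead(record):
    """Return a flat dict: top-level values plus the decoded leadData fields."""
    flat = {}
    for key, value in record.items():
        if key == "leadData":
            continue
        # Lists and dicts are re-serialised as JSON text so they fit in one cell.
        flat[key] = json.dumps(value) if isinstance(value, (list, dict)) else value
    # leadData is JSON inside a string, so it needs its own json.loads call.
    for key, value in json.loads(record["leadData"]).items():
        flat["leadData." + key] = value
    return flat


row = flatten_lead(record)

# Write a header line and one data line; None becomes an empty cell.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=list(row), lineterminator="\n")
writer.writeheader()
writer.writerow(row)
csv_text = out.getvalue()
print(csv_text)
```

For a file containing many such records, the same function could be applied to each record before writing all rows with writer.writerows().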
I have this sample json data, and need to grab only the MAC addresses so I can convert the mac to a list of manufacturers later.
[
{
"aps": {
"00:20:90:B3:16:25": {
"ssid": "",
"encryption": "Open",
"hidden": 1,
"channel": 11,
"signal": -23,
"wps": 0,
"last_seen": 1594356454,
"clients": []
},
"06:AA:A0:84:7F:D8": {
"ssid": "",
"encryption": "Open",
"hidden": 1,
"channel": 6,
"signal": -75,
"wps": 0,
"last_seen": 1594356452,
"clients": []
},
"1E:51:A4:D4:B7:29": {
"ssid": "",
"encryption": "WPA Mixed PSK (CCMP TKIP)",
"hidden": 1,
"channel": 11,
"signal": -63,
"wps": 0,
"last_seen": 1594356448,
"clients": []
}
}
}
]
This is my Python program so far, but I'm not sure how to isolate the MAC addresses:
import json
f = open('recon_data.json',)
data = json.load(f)
print(data["aps"])
f.close()
I get an error every time I run the program, whether I'm asking for the aps or the ssid information:
Traceback (most recent call last):
  File "recon.py", line 12, in <module>
    print(data["ssid"])
TypeError: list indices must be integers or slices, not str
This is because the data you're loading is a list. Try data[0]["aps"].
As for getting all the MAC addresses: they are the keys of that inner dict, so you can call list() on it to get all the keys:
import json
with open('recon_data.json') as f:
    data = json.load(f)
print(list(data[0]['aps']))
This will print a list of all the MAC addresses:
['00:20:90:B3:16:25', '06:AA:A0:84:7F:D8', '1E:51:A4:D4:B7:29']
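If the outer list ever holds more than one scan result, indexing data[0] only covers the first one. The sketch below walks every entry and also extracts the first three octets of each address, which is the prefix that manufacturer lookup tables are usually keyed on. The sample is trimmed from the question, and the variable names macs and ouis are assumptions.

```python
import json

# Trimmed copy of the sample from the question, as a JSON string.
raw = """[
  {"aps": {
    "00:20:90:B3:16:25": {"ssid": "", "channel": 11},
    "06:AA:A0:84:7F:D8": {"ssid": "", "channel": 6},
    "1E:51:A4:D4:B7:29": {"ssid": "", "channel": 11}
  }}
]"""

data = json.loads(raw)

# Walk every entry in the outer list, not just data[0].
# Iterating over a dict yields its keys, which here are the MAC addresses.
macs = [mac for entry in data for mac in entry.get("aps", {})]

# First three octets of each address, de-duplicated and sorted.
ouis = sorted({mac[:8].upper() for mac in macs})

print(macs)
print(ouis)
```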
The input data looks like the sample below, but it actually contains thousands of dictionaries in this list, and serial_ids are repeated throughout the list.
[{
"serial_id": 1,
"name": "ABC"
},
{
"serial_id": 6,
"name": "DEF"
},
{
"serial_id": 8,
"name": "GHI"
},
{
"serial_id": 0,
"name": "JKL"
},
{
"serial_id": 6,
"name": "VVV"
}]
Now, I know the range of serial_id, but I don't want to hardcode it.
My task is to find the total number of users (i.e. name_count) per serial_id. Ideally I would get a table-like structure, sorted in descending order, with the columns serial_id and user_count.
Questions are:
Can we make use of a DataFrame? If possible, I would like to.
I am unable to find any method that achieves the required output.
Thanks in advance!
Since the JSON data is pulled from an API, below is the code I tried, which failed badly.
#Python libraries
import json
import requests
import numpy as np
import pandas as pd
from pandas import DataFrame, Series
from collections import Counter
url1 = 'INPUT URL'
#print ('Retrieving',url1)
#uh = urllib2.urlopen(url1)
r = requests.get(url1)
r = r.text
#print r
#print ('Retrieved', len(r), 'characters')
try:js = json.loads(r) # js -> Native Python list
except:js = None
#print js
info = json.dumps(js , indent =4) #Prints out the JSON data in a nice format which we call as "Pretty Print"
#print (info)
'''
#print ('User Count:' , len(info))
for item in (js):
print ('Name' , item["name"])
'''
'''
user_count = 0
for item in (js):
#df = {'serial_id': Series[item["affiliate_id"]]} //ERROR
df = DataFrame({'serial_id': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]})
#Hard-coded the serial_id since we know the range of the affiliate_id
print(df)
Let's use Pandas dataframes:
from io import StringIO
import pandas as pd
jstring = StringIO("""[{
"serial_id": 1,
"name": "ABC"
},
{
"serial_id": 6,
"name": "DEF"
},
{
"serial_id": 8,
"name": "GHI"
},
{
"serial_id": 0,
"name": "JKL"
},
{
"serial_id": 6,
"name": "VVV"
}]""")
df = pd.read_json(jstring)
df_out = df.groupby('serial_id')['name'].count().reset_index(name='name_count')
print(df_out)
Output:
   serial_id  name_count
0          0           1
1          1           1
2          6           2
3          8           1
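The question also asked for the result sorted in descending order of the count, which the output above is not (it is ordered by serial_id). With the DataFrame above, df_out.sort_values('name_count', ascending=False) should do it. As a dependency-free sketch of the same counting step, collections.Counter can be used; the variable names users, counts and table are assumptions.

```python
from collections import Counter

# Sample records from the question.
users = [
    {"serial_id": 1, "name": "ABC"},
    {"serial_id": 6, "name": "DEF"},
    {"serial_id": 8, "name": "GHI"},
    {"serial_id": 0, "name": "JKL"},
    {"serial_id": 6, "name": "VVV"},
]

# Count how many records carry each serial_id; no range is hardcoded.
counts = Counter(user["serial_id"] for user in users)

# most_common() returns (serial_id, count) pairs, highest count first.
table = counts.most_common()

print("serial_id\tuser_count")
for serial_id, user_count in table:
    print(f"{serial_id}\t{user_count}")
```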
I have a queryset that, once converted to a list, gives the following JSON string:
[{"pid": 1, "loc": "KL", "sid": 1, "sd": "south-1"},
{"pid": 1, "loc": "KL", "sid": 2, "sd": "north-5"},
{"pid": 1, "loc": "KL", "sid": 3, "sd": "west-3"}
]
I have tried many serializer options but have no idea how to turn the above into:
[{"pid": 1,
"s": [{"sid": 1, "sd": "south-1",
"sid": 2, "sd": "north-5",
"sid": 3, "sd": "west-3"
}]
}]
Firstly, there's an error in your expected output. You probably meant:
[{"pid": 1,
"s": [{"sid": 1, "sd": "south-1"},
{"sid": 2, "sd": "north-5"},
{"sid": 3, "sd": "west-3"}
],
"loc": "KL"
}]
i.e. s should be a list of dictionaries rather than a single dict with clashing keys. I've added "loc": "KL" since it looks like it was missing.
Assuming each query returns only the same pid and loc, you can create s as a list with each sid and sd in the original query:
>>> import json
>>> q = ... # as above
>>> r = {"pid": q[0]["pid"], "loc": q[0]["loc"]} # since pid and loc are always the same
>>> r["s"] = [{"sid": x["sid"], "sd": x["sd"]} for x in q]
>>> print([r]) # wrapped in a list to match the expected output
[{'pid': 1, 'loc': 'KL', 's': [{'sid': 1, 'sd': 'south-1'}, {'sid': 2, 'sd': 'north-5'}, {'sid': 3, 'sd': 'west-3'}]}]
>>> print(json.dumps([r])) # gives the output as a json string
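The answer above assumes every row shares the same pid and loc. If a query can return several pids, one possible generalisation is to group the rows by (pid, loc) first. This is only a sketch: the helper name nest_by_pid is hypothetical, and the fourth row is invented to show a second group.

```python
import json

rows = [
    {"pid": 1, "loc": "KL", "sid": 1, "sd": "south-1"},
    {"pid": 1, "loc": "KL", "sid": 2, "sd": "north-5"},
    {"pid": 1, "loc": "KL", "sid": 3, "sd": "west-3"},
    {"pid": 2, "loc": "TN", "sid": 4, "sd": "east-2"},  # extra row, not in the question
]


def nest_by_pid(rows):
    """Group flat rows into one dict per (pid, loc) with a nested "s" list."""
    groups = {}
    for row in rows:
        key = (row["pid"], row["loc"])
        # Create the group on first sight, then reuse it for later rows.
        group = groups.setdefault(
            key, {"pid": row["pid"], "loc": row["loc"], "s": []}
        )
        group["s"].append({"sid": row["sid"], "sd": row["sd"]})
    # Dicts preserve insertion order, so groups appear in first-seen order.
    return list(groups.values())


nested = nest_by_pid(rows)
print(json.dumps(nested, indent=2))
```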
I have a json file
{
"rows": [
{
"votes": {
"funny": 0,
"useful": 1,
"cool": 0
},
"user_id": "zvNimI98mrmhgNOOrzOiGg",
"review_id": "I7Kte2FwXWPCwdm7ispu1A",
"text": "Pretty good dinner with a nice selection of food"
},
{
"votes": {
"funny": 2,
"useful": 5,
"cool": 0
},
"user_id": "Au3Qs-AAZEWu2_4gIMwRgw",
"review_id": "SSlO5u2nIJ8PoAKAgN5m3Q",
"text": "Yeah, thats right a five freakin star rating."
}
]
}
I just want to read the "text" one by one i.e. I want to access the first "text", do some operation on it, and then move onto the next "text".
It's a simple matter to open a file, read the contents as JSON, then iterate over the data you get:
import json
with open("my_data.json") as my_data_file:
    my_data = json.load(my_data_file)
for row in my_data["rows"]:
    do_something(row["text"])  # do_something is a placeholder for your own function
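Here do_something stands in for whatever operation you need. As a self-contained illustration, the sketch below parses a trimmed copy of the sample from a string and applies a hypothetical word_count function to each "text" in turn; the function and the results list are assumptions, not part of the original answer.

```python
import json

# Trimmed copy of the sample file contents from the question.
text = """{"rows": [
  {"votes": {"funny": 0, "useful": 1, "cool": 0},
   "text": "Pretty good dinner with a nice selection of food"},
  {"votes": {"funny": 2, "useful": 5, "cool": 0},
   "text": "Yeah, thats right a five freakin star rating."}
]}"""


def word_count(review_text):
    """Hypothetical per-review operation: count whitespace-separated words."""
    return len(review_text.split())


json_data = json.loads(text)

# Visit each review in order and apply the operation to its text.
results = []
for row in json_data["rows"]:
    results.append((row["text"], word_count(row["text"])))

for review_text, n_words in results:
    print(n_words, review_text)
```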
You can simply access the data like in a dict, since your current json data is already one:
>>> text = """{
"rows": [
{
"votes": {
"funny": 0,
"useful": 1,
"cool": 0
},
"user_id": "zvNimI98mrmhgNOOrzOiGg",
"review_id": "I7Kte2FwXWPCwdm7ispu1A",
"text": "Pretty good dinner with a nice selection of food"
},
{
"votes": {
"funny": 2,
"useful": 5,
"cool": 0
},
"user_id": "Au3Qs-AAZEWu2_4gIMwRgw",
"review_id": "SSlO5u2nIJ8PoAKAgN5m3Q",
"text": "Yeah, thats right a five freakin star rating."
}
]
}"""
Assuming the above is your JSON text (which can be obtained using a simple
with open("json_file.txt", "r") as f: text = f.read()), you can now convert the JSON into a dictionary using
>>> import json
>>> json_data = json.loads(text)
To access the data, you can now operate on it as you would on any dict.
So, in a list comprehension, this becomes:
>>> print([d["text"] for d in json_data["rows"]])
['Pretty good dinner with a nice selection of food',
'Yeah, thats right a five freakin star rating.']
And in a loop, this becomes:
>>> for d in json_data["rows"]:
...     print(d["text"])
Pretty good dinner with a nice selection of food
Yeah, thats right a five freakin star rating.
Note that the JSON is not read line by line; it is parsed in its entirety, and only then are the required fields accessed.