Converting a NDJSON to CSV in Python

Could someone please help me convert this nested JSON to a CSV file?
{"campaignTitle": "Template Campaign", "listName": "Trial", "leadId": 573, "timezone": "Australia/Sydney", "isComplete": 0, "customerKey": "576", "phone1": "+61212345678", "phone2": "", "phone3": "", "leadUploadDate": "2020-07-03 16:25:07", "lastDiallerTimestamp": "2020-07-09 13:59:55", "scheduledCallTimestamp": "2020-07-09 15:59:50", "campaignId": 4, "listId": 4, "isDialling": 0, "leadData": "{\"Email\":\"xxx#xxx.com.au\",\"Address\":\"73 White Road\",\"MQL20\":null,\"HQL2\":null,\"HQL1\":null,\"Industry\":\"\",\"CompanyName\":\"Cofi-Com Trading Pty Limited\",\"HQL3\":null,\"RecordComments1\":null,\"RecordComments2\":null,\"RecordComments3\":null,\"RecordComments4\":null,\"MQL10\":null,\"MQL4\":null,\"MQL14\":null,\"MQL5\":null,\"MQL13\":null,\"MQL6\":null,\"MQL12\":null,\"City\":\"West Ryde\",\"MQL7\":null,\"MQL11\":null,\"MQL18\":null,\"Postcode\":\"2114\",\"MQL1\":null,\"MQL17\":null,\"BasicLead\":null,\"MQL2\":null,\"MQL16\":null,\"CallRecording\":null,\"MQL3\":null,\"MQL15\":null,\"MQL19\":null,\"MQL8\":null,\"MQL9\":null,\"State\":\"\",\"GlobalCompanySize\":null,\"Country\":\"AU\",\"LastName\":\"Black\",\"LocalCompanySize\":\"100 - 249\",\"HQL_Timeframe3\":null,\"HQL_Timeframe2\":null,\"HQL_Timeframe1\":null,\"Authority\":null,\"Content_Syndication\":null,\"Salutation\":\"Mr\",\"JobTitle\":\"Information Technology Head\",\"Filtering3\":null,\"Filtering2\":null,\"Filtering4\":null,\"FirstName\":\"John\",\"Filtering1\":null,\"RH_RID\":null,\"HQL_OpID1\":null,\"HQL_OpID3\":null,\"Meiro_ID\":null,\"HQL_OpID2\":null,\"QCOptIn\":null}", "dialAttempts": 1, "diallerOutcomes": [], "wrapCodeId": 0, "leadInteractions": [{"interactionId": 578, "activities": [642]}], "leadActivities": [{"activityId": 642, "interactions": [578]}]}

The following code converts the given sample data into a DataFrame. It uses pandas' pd.DataFrame.from_dict() together with the standard json module.
import json
import pandas as pd
with open('json_in.json', 'r') as f:
    json_in = f.read()
json_in = json.loads(json_in)
#json_in={"campaignTitle": "Template Campaign", "listName": "Trial", "leadId": 573, "timezone": "Australia/Sydney", "isComplete": 0, "customerKey": "576", "phone1": "+61212345678", "phone2": "", "phone3": "", "leadUploadDate": "2020-07-03 16:25:07", "lastDiallerTimestamp": "2020-07-09 13:59:55", "scheduledCallTimestamp": "2020-07-09 15:59:50", "campaignId": 4, "listId": 4, "isDialling": 0, "leadData": "{\"Email\":\"xxx#xxx.com.au\",\"Address\":\"73 White Road\",\"MQL20\":null,\"HQL2\":null,\"HQL1\":null,\"Industry\":\"\",\"CompanyName\":\"Cofi-Com Trading Pty Limited\",\"HQL3\":null,\"RecordComments1\":null,\"RecordComments2\":null,\"RecordComments3\":null,\"RecordComments4\":null,\"MQL10\":null,\"MQL4\":null,\"MQL14\":null,\"MQL5\":null,\"MQL13\":null,\"MQL6\":null,\"MQL12\":null,\"City\":\"West Ryde\",\"MQL7\":null,\"MQL11\":null,\"MQL18\":null,\"Postcode\":\"2114\",\"MQL1\":null,\"MQL17\":null,\"BasicLead\":null,\"MQL2\":null,\"MQL16\":null,\"CallRecording\":null,\"MQL3\":null,\"MQL15\":null,\"MQL19\":null,\"MQL8\":null,\"MQL9\":null,\"State\":\"\",\"GlobalCompanySize\":null,\"Country\":\"AU\",\"LastName\":\"Black\",\"LocalCompanySize\":\"100 - 249\",\"HQL_Timeframe3\":null,\"HQL_Timeframe2\":null,\"HQL_Timeframe1\":null,\"Authority\":null,\"Content_Syndication\":null,\"Salutation\":\"Mr\",\"JobTitle\":\"Information Technology Head\",\"Filtering3\":null,\"Filtering2\":null,\"Filtering4\":null,\"FirstName\":\"John\",\"Filtering1\":null,\"RH_RID\":null,\"HQL_OpID1\":null,\"HQL_OpID3\":null,\"Meiro_ID\":null,\"HQL_OpID2\":null,\"QCOptIn\":null}", "dialAttempts": 1, "diallerOutcomes": [], "wrapCodeId": 0, "leadInteractions": [{"interactionId": 578, "activities": [642]}], "leadActivities": [{"activityId": 642, "interactions": [578]}]}
# Build a one-column DataFrame whose index holds the top-level keys
df = pd.DataFrame.from_dict(json_in, orient='index')
# leadData is itself a JSON string, so it is parsed separately
df_final = pd.DataFrame.from_dict(json.loads(df.loc['leadData', :][0]), orient='index')
# Transpose so that the values sit in columns rather than in the index
df_final = df_final.T
# Copy a particular top-level field into the final DataFrame
df_final.loc[:, "campaignTitle"] = df.loc["campaignTitle", :][0]
df_final.to_csv("<output-file.csv>", index=False)
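The title mentions NDJSON, which normally means one JSON object per line, whereas the code above parses a single object. A minimal sketch for the multi-line case follows, using only the standard library. The helper names (ndjson_to_rows, write_csv), the choice of which top-level fields to keep, and the second sample record are all made up for illustration; only the first record's values come from the question.

```python
import csv
import io
import json


def ndjson_to_rows(lines, keep_fields=("campaignTitle", "leadId")):
    """Yield one flat dict per NDJSON line, merging in the parsed leadData."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # leadData is a JSON string embedded in the record, so parse it again
        row = json.loads(record.get("leadData") or "{}")
        for field in keep_fields:
            row[field] = record.get(field)
        yield row


def write_csv(rows, out):
    """Write dict rows to a file-like object, using the union of all keys."""
    rows = list(rows)
    # Keys in first-seen order; missing values are written as empty cells
    fieldnames = list(dict.fromkeys(key for row in rows for key in row))
    writer = csv.DictWriter(out, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)


# Two sample lines; the second record is invented for the example
sample_lines = [
    json.dumps({"campaignTitle": "Template Campaign", "leadId": 573,
                "leadData": json.dumps({"City": "West Ryde", "Country": "AU"})}),
    json.dumps({"campaignTitle": "Template Campaign", "leadId": 574,
                "leadData": json.dumps({"City": "Sydney", "State": "NSW"})}),
]

buffer = io.StringIO()
write_csv(ndjson_to_rows(sample_lines), buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

For a real file, the same functions should work with open('json_in.json') as the line source and open('out.csv', 'w', newline='') as the output.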

Related

Pandas how to have row with specified value and then subrow of this row

Hello everyone. I'm creating an xlsx file using pandas, and I'm using json_normalize() to process the data. I know there is the MultiIndex DataFrame, but I can't find a way to assign a value to the main row itself, like in the image below. Is there a way to do that?
data = [
    {
        "header": "INDUSTRY",
        "surface": 540,
        "gaz": 405,
        "fioul": 135
    },
    {
        "header": "AGRI",
        "surface": 540,
        "gaz": 405,
        "fioul": 135
    },
    {
        "header": "INDUSTRY",
        "surface": 55,
        "gaz": 405,
        "fioul": 135
    },
]

I just used an empty string for the second index level, and it worked:
import pandas as pd
df_multindex = pd.DataFrame(
    {'valeurs': [5, 0, 3, 4, 4, 5, 6],
     'unités': ["m", 360, 180, 360, 360, 540, 720]},
    index=[['surface', 'surface', 'surface', 'surface',
            'montant', 'montant', 'montant'],
           ['', 'gaz', 'charbon', 'fioul',
            '', 'montant charbon', 'montant fioul']])
df_multindex.index.rename(['champ', ''], inplace=True)
df_multindex
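The same empty-label trick can probably be applied to records shaped like the ones in the question. This is only a sketch under assumptions: the layout (the parent row carries "surface", the sub-rows carry "gaz" and "fioul") and the names "detail" and "value" are guesses at what the missing image showed, and only two of the question's records are used so that the index stays unique.

```python
import pandas as pd

data = [
    {"header": "INDUSTRY", "surface": 540, "gaz": 405, "fioul": 135},
    {"header": "AGRI", "surface": 540, "gaz": 405, "fioul": 135},
]

index_tuples = []
values = []
for record in data:
    # Parent row: empty second level, carries the "surface" value
    index_tuples.append((record["header"], ""))
    values.append(record["surface"])
    # Sub-rows: one per remaining field
    for key in ("gaz", "fioul"):
        index_tuples.append((record["header"], key))
        values.append(record[key])

index = pd.MultiIndex.from_tuples(index_tuples, names=["header", "detail"])
df = pd.DataFrame({"value": values}, index=index)
print(df)
```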

Json file content extract and copy to excel/text

I have the JSON file below, from which I want to extract only the
("workers": {"usersRunning": 1, "usersWaiting": 0, "total": 8, "jobsWaiting": 0, "inUse": 4})
part and then write it to a CSV file or a tab-delimited text file. I am new to Python, so any help will be appreciated.
{
    "workers": {
        "usersRunning": 1,
        "usersWaiting": 0,
        "total": 8,
        "jobsWaiting": 0,
        "inUse": 4
    },
    "users": {
        "activeUsers": 1,
        "activity": [{
            "maxWorkers": 4,
            "inProgress": 4,
            "displayName": "abc",
            "waiting": 0
        }]
    }
}

I recommend using pandas. It has methods to read the json and you can use dataframe filtering to find the data you need.
Examples here: https://www.listendata.com/2019/07/how-to-filter-pandas-dataframe.html

I would recommend using pandas and converting to Excel.
The following sample should help you get your answer:
import pandas as pd
json_ = {"workers": {"usersRunning": 1, "usersWaiting": 0, "total": 8, "jobsWaiting": 0, "inUse": 4}, "users": {"activeUsers": 1, "activity": [{"maxWorkers": 4, "inProgress": 4, "displayName": "abc", "waiting": 0}]}}
# Build a one-row DataFrame from the "workers" dict only
df = pd.DataFrame(data=json_['workers'], index=[0])
df.to_excel('json_.xlsx')
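Since the question also asks for a tab-delimited text file, here is a minimal sketch that uses only the standard library instead of pandas. It writes to an in-memory buffer so it can be run as-is; the variable names are arbitrary.

```python
import csv
import io
import json

raw = """
{
    "workers": {"usersRunning": 1, "usersWaiting": 0, "total": 8,
                "jobsWaiting": 0, "inUse": 4},
    "users": {"activeUsers": 1,
              "activity": [{"maxWorkers": 4, "inProgress": 4,
                            "displayName": "abc", "waiting": 0}]}
}
"""
payload = json.loads(raw)
workers = payload["workers"]  # keep only the "workers" part

buffer = io.StringIO()
# delimiter="\t" produces tab-separated output; the dict keys become the header
writer = csv.DictWriter(buffer, fieldnames=list(workers), delimiter="\t")
writer.writeheader()
writer.writerow(workers)
tsv_text = buffer.getvalue()
print(tsv_text)
```

To write a real file, replacing the buffer with open('workers.txt', 'w', newline='') should be enough; dropping the delimiter argument gives an ordinary comma-separated file.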

How can I fetch specific information for a dictionary thats in an api

I am writing some code that should give me certain information. The URL https://api.brawlhalla.com/player/28472387/ranked?api_key=MY_API_KEY
provides information about my profile. When I print the response as text, I get:
{
    "name": "Twitter: ufrz_",
    "brawlhalla_id": 28472387,
    "rating": 2093,
    "peak_rating": 2110,
    "tier": "Diamond",
    "wins": 140,
    "games": 257,
    "region": "US-E",
    "global_rank": 0,
    "region_rank": 0,
    "legends": [
        {
            "legend_id": 3,
            "legend_name_key": "bodvar",
            "rating": 870,
            "peak_rating": 870,
            "tier": "Tin 4",
            "wins": 2,
            "games": 4
        },
        {
            "legend_id": 4,
            "legend_name_key": "cassidy",
            "rating": 968,
            "peak_rating": 968,
            "tier": "Bronze 2",
            "wins": 0,
            "games": 0
        },
        {
            "legend_id": 5,
            "legend_name_key": "orion",
            "rating": 1131,
            "peak_rating": 1131,
            "tier": "Silver 1",
            "wins": 1,
            "games": 3
        },
(This is not the full response.)
Here is the code I used to fetch this:
import requests
url = "https://api.brawlhalla.com/player/28472387/ranked?api_key=MY_API_KEY"
r = requests.get(url)
print(r.text)
Now, for example, how would I go about fetching my rating, meaning not the word itself but the number 2093? I tried a few approaches but they didn't work. I am using bs4 and requests and am new to both, so I really don't know how to get this.
(Sorry for the poorly worded question. I don't really know how to word my issue, so my apologies in advance.)

First of all, you have to convert your result to a JSON object:
data = r.json()
Then you can read the rating with data['rating'].
As for your follow-up question:
How would I go about getting the ranking for the legend_name_key "bodvar"? How could I specifically get that legend's ranking?
for legend in data['legends']:
    if legend['legend_name_key'] == "bodvar":
        print(legend['rating'])
        break
Or, using a function:
def getLegendByName(data, legendName):
    # Return the first legend dict whose key matches, or None if absent
    for legend in data['legends']:
        if legend['legend_name_key'] == legendName:
            return legend
    return None
legendName = "bodvar"
data = r.json()
legend = getLegendByName(data, legendName)
if legend is not None:
    legendRating = legend['rating']
else:
    print("There is no legend with this name")
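If several legends need to be looked up, one option is to index the list once by legend_name_key. The sketch below runs offline on a trimmed copy of the response from the question, so no API key is needed; json.loads on the text does roughly what r.json() does on a live response. The name ratings_by_legend is arbitrary.

```python
import json

# Trimmed copy of the response shown in the question
response_text = """
{
    "name": "Twitter: ufrz_",
    "rating": 2093,
    "legends": [
        {"legend_id": 3, "legend_name_key": "bodvar", "rating": 870},
        {"legend_id": 4, "legend_name_key": "cassidy", "rating": 968},
        {"legend_id": 5, "legend_name_key": "orion", "rating": 1131}
    ]
}
"""
data = json.loads(response_text)

# Top-level values are plain dictionary lookups
overall_rating = data["rating"]

# Index the legends by name so repeated lookups do not rescan the list
ratings_by_legend = {
    legend["legend_name_key"]: legend["rating"] for legend in data["legends"]
}

print(overall_rating)
print(ratings_by_legend.get("bodvar"))
print(ratings_by_legend.get("unknown"))  # .get returns None for a missing name
```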

Json to Python converting

I have this sample JSON data and need to grab only the MAC addresses, so that I can convert them to a list of manufacturers later.
[
    {
        "aps": {
            "00:20:90:B3:16:25": {
                "ssid": "",
                "encryption": "Open",
                "hidden": 1,
                "channel": 11,
                "signal": -23,
                "wps": 0,
                "last_seen": 1594356454,
                "clients": []
            },
            "06:AA:A0:84:7F:D8": {
                "ssid": "",
                "encryption": "Open",
                "hidden": 1,
                "channel": 6,
                "signal": -75,
                "wps": 0,
                "last_seen": 1594356452,
                "clients": []
            },
            "1E:51:A4:D4:B7:29": {
                "ssid": "",
                "encryption": "WPA Mixed PSK (CCMP TKIP)",
                "hidden": 1,
                "channel": 11,
                "signal": -63,
                "wps": 0,
                "last_seen": 1594356448,
                "clients": []
            }
        }
    }
]
This is my Python program so far, but I'm not sure how to isolate the MAC addresses:
import json
f = open('recon_data.json')
data = json.load(f)
print(data["aps"])
f.close()
I get an error every time I run the program, whether I'm asking for the aps or the ssid information:
Traceback (most recent call last):
  File "recon.py", line 12, in <module>
    print(data["ssid"])
TypeError: list indices must be integers or slices, not str

This is because the data you're loading is a list. Try data[0]["aps"].
As for getting all the MAC addresses: they are the keys of that dict, so you can just call list() on the inner dict to get all the keys:
import json
with open('recon_data.json') as f:
    data = json.load(f)
print(list(data[0]['aps']))
This will print a list of all the MAC addresses:
['00:20:90:B3:16:25', '06:AA:A0:84:7F:D8', '1E:51:A4:D4:B7:29']
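For the later manufacturer lookup, the part of a MAC address that identifies the vendor is its first three octets (the OUI). A small sketch, assuming the outer list may contain more than one entry, could collect the addresses from every entry and derive those prefixes. The sample is a trimmed copy of the question's data, and the actual vendor lookup is left out because it needs an external registry.

```python
import json

raw = """
[
    {"aps": {
        "00:20:90:B3:16:25": {"ssid": "", "channel": 11},
        "06:AA:A0:84:7F:D8": {"ssid": "", "channel": 6},
        "1E:51:A4:D4:B7:29": {"ssid": "", "channel": 11}
    }}
]
"""
data = json.loads(raw)

# Gather keys from every element of the outer list, not only data[0]
macs = [mac for entry in data for mac in entry.get("aps", {})]

# The first three octets ("AA:BB:CC", 8 characters) form the vendor prefix
ouis = sorted({mac.upper()[:8] for mac in macs})

print(macs)
print(ouis)
```

Note that addresses whose first octet has the locally administered bit set, as 06:... and 1E:... do here, typically will not resolve to a manufacturer.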

replace quotes in json file using python

How do we convert single quotes to double quotes in a JSON file using a Python script?
File name: strings.json
File content:
[{'postId':'328e9497740b456154c636349','postTimestamp': '1521543600','pageType': '/home.php:topnews','viewTime': 1521545993647,'user_name': 'windows-super-user','gender': 3,'likes': '8','id': 'ffa1e07529ac917f6d573a','postImg': 1,'postDesc': [753],'origLink': 0,'duration': 0,'timestamp': 9936471521545,'back_time': 1521545993693},{'postId':'15545154c636349','postTimestamp': '547773600', 'pageType': '/home.php:topnews','viewTime': 45993647,'user_name': 'linux user','gender': 3,'likes': '8','id': '695e45a17f6d573a','postImg': 1,'postDesc': [953],'origLink': 0,'duration': 0,'timestamp': 545993647,'back_time': 85993693},{'postId':'9098897740b456154c636349','postTimestamp': '899943600', 'pageType': '/home.php:topnews','viewTime': 1521545993647,'user_name': 'unix_super_user','gender': 3,'likes': '8','id': '917f6d573a695e45affa1e07','postImg': 1,'postDesc': [253],'origLink': 0,'duration': 0,'timestamp': 193647,'back_time': 1521545993693}]
I have tried the code below, and it is not working:
with open('strings.json') as f:
    jstr = json.dump(f)
    print(jstr)
Expected output:
[
    {
        "postId": "328e9497740b456154c636349",
        "postTimestamp": "1521543600",
        "pageType": "/home.php:topnews",
        "viewTime": 1521545993647,
        "user_name": "windows-super-user",
        "gender": 3,
        "likes": "8",
        "id": "ffa1e07529ac917f6d573a",
        "postImg": 1,
        "postDesc": [753],
        "origLink": 0,
        "duration": 0,
        "timestamp": 9936471521545,
        "back_time": 1521545993693
    },
    {
        "postId": "15545154c636349",
        "postTimestamp": "547773600",
        "pageType": "/home.php:topnews",
        "viewTime": 45993647,
        "user_name": "linux user",
        "gender": 3,
        "likes": "8",
        "id": "695e45a17f6d573a",
        "postImg": 1,
        "postDesc": [953],
        "origLink": 0,
        "duration": 0,
        "timestamp": 545993647,
        "back_time": 85993693
    }
]

Single quotes are not valid for strings in JSON, so that file isn't valid JSON as far as any parser is concerned.
If you want to replace all single quotes with double quotes, just do something like:
import json
# Read in the file contents as text
with open('strings.json') as f:
    invalid_json = f.read()
# Replace all ' with "
valid_json = invalid_json.replace("'", '"')
# Verify that the JSON is valid now and this doesn't raise an exception
json.loads(valid_json)
# Save the modified text to a new file
with open('strings.json.fixed', 'w') as f:
    f.write(valid_json)
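One caveat with a plain text replacement is that it also rewrites apostrophes inside values. Because the file content happens to be valid Python literal syntax, an alternative sketch is to parse it with ast.literal_eval and re-serialize it with json.dumps. This is a different technique from the replacement above, and it only works while the text contains Python literals (it would fail on JSON's true, false or null). The sample is shortened, and the "o'brien" value is invented to show the apostrophe case.

```python
import ast
import json

# Shortened sample in the style of strings.json; "o'brien" is a made-up value
text = "[{'postId': '328e9497740b456154c636349', 'likes': '8', 'postDesc': [753], 'user_name': \"o'brien\"}]"

# literal_eval only evaluates Python literals, not arbitrary code
records = ast.literal_eval(text)

# json.dumps always emits double-quoted strings
valid_json = json.dumps(records, indent=2)
print(valid_json)

# For comparison: the plain replacement produces invalid JSON for this input
try:
    json.loads(text.replace("'", '"'))
    naive_ok = True
except json.JSONDecodeError:
    naive_ok = False
print(naive_ok)
```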
