Unable to convert JSON file to CSV using Python - python

I was trying to convert the below JSON file into a csv file.
JSON file
[{
"SubmitID":1, "Worksheet":3, "UserID":65,
"Q1":"395",
"Q2":"2178",
"Q3":"2699",
"Q4":"1494"},{
"SubmitID":2, "Worksheet":3, "UserID":65,
"Q4":"1394"},{
"SubmitID":3, "Worksheet":4, "UserID":65,
"Q1":"1629",
"Q2":"1950",
"Q3":"0117",
"Q4":"1816",
"Empty":" "}]
However, my Python code below gives the error message "TypeError: Expected String or Unicode". How should I modify my program to make it work?
import json
import pandas as pd
f2 = open('temp.json')
useful_input = json.load(f2)
df=pd.read_json(useful_input)
print(df)
df.to_csv('results.csv')

You just need to pass the file path as a string to pd.read_json(), not the already-parsed object returned by json.load():
df=pd.read_json("temp.json")
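If the JSON has already been parsed with the json module, a possible alternative is to hand the resulting list of dicts to pd.DataFrame instead of pd.read_json. The sketch below is only an illustration: it uses a shortened inline copy of the question's data rather than temp.json, and the variable names are made up for the example.

```python
import json

import pandas as pd

# Shortened inline copy of the question's data (assumption: same shape as temp.json).
raw = """[
  {"SubmitID": 1, "Worksheet": 3, "UserID": 65, "Q1": "395", "Q4": "1494"},
  {"SubmitID": 2, "Worksheet": 3, "UserID": 65, "Q4": "1394"}
]"""

records = json.loads(raw)        # a plain Python list of dicts
df = pd.DataFrame(records)       # keys missing from a record become NaN
csv_text = df.to_csv(index=False)
print(csv_text)
```

Records that lack a key (such as Q1 in the second entry) end up as empty cells in the CSV.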

You don't need the json module at all.
Try:
import pandas as pd
df=pd.read_json("temp.json")
print(df)
df.to_csv('results.csv')

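# JSON to CSV, writing only the listed columns
import pandas as pd
df = pd.read_json('data.json')
df.to_csv('data.csv', index=False, columns=['title', 'subtitle', 'date', 'description'])
# CSV back to JSON
import pandas as pd
df = pd.read_csv("data.csv")
df = df[df.columns[:4]]  # keep the first four columns
df = df.dropna(how='all')  # dropna returns a new DataFrame, so assign the result
df.to_json('data.json', orient='records')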

Related

Why I am getting this error in reading csv file in jupyter

import pandas as pd
df = pd.read_csv(r"C:\Users\deban\OneDrive\Desktop\spam.csv")
df.head()
Details of the error
Try this:
import pandas as pd
df = pd.read_csv('C:/Users/deban/OneDrive/Desktop/spam.csv', engine='python')
df.head()
Or this:
import pandas as pd
df = pd.read_csv('C:/Users/deban/OneDrive/Desktop/spam.csv', encoding='utf-8')
df.head()
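The question does not show the actual error, so the following is only a guess at one common cause: a file that is not UTF-8 encoded, which makes read_csv raise UnicodeDecodeError. Under that assumption, a small helper can try several encodings in turn. The helper name read_csv_with_fallback, the encoding list, and the throwaway demo file are all invented for this sketch.

```python
import os
import tempfile

import pandas as pd


def read_csv_with_fallback(path, encodings=("utf-8", "latin-1")):
    """Try each encoding in turn and return the first DataFrame that parses."""
    last_error = None
    for enc in encodings:
        try:
            return pd.read_csv(path, encoding=enc)
        except UnicodeDecodeError as err:
            last_error = err  # remember the failure and try the next encoding
    raise last_error


# Demo on a throwaway file containing a byte that is not valid UTF-8.
with tempfile.NamedTemporaryFile("wb", suffix=".csv", delete=False) as tmp:
    tmp.write("label,text\nham,café\n".encode("latin-1"))
demo_path = tmp.name

demo_df = read_csv_with_fallback(demo_path)
os.remove(demo_path)
print(demo_df)
```

If the error is something else (a wrong path or a malformed row, for example), changing the encoding will not help.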

Convert json data into dataframe

I am unable to flatten the JSON data from this API into a dataframe. I have tried using json_normalize, but it gives a NotImplemented error. Can someone help me with it? I need the columns stationId, start, timestep and temperature, where temperature has several values and the rest of the columns repeat the same values.
import requests
import json
import pandas as pd
response_API = requests.get('https://dwd.api.proxy.bund.dev/v30/stationOverviewExtended?stationIds=10865,G005')
print(response_API.status_code)
data = response_API.text
json.loads(data)
df= ?
You can do it in many ways, but your current approach should use json() instead of text:
import requests
import json
import pandas as pd
response_API = requests.get('https://dwd.api.proxy.bund.dev/v30/stationOverviewExtended?stationIds=10865,G005')
print(response_API.status_code)
data = response_API.json()  # <--- it should be json()
print(data)
Or read the JSON directly into a DataFrame from the URL using read_json():
df = pd.read_json("https://dwd.api.proxy.bund.dev/v30/stationOverviewExtended?stationIds=10865,G005")
print(df)
Edit:
import requests
import json
import pandas as pd
response_API = requests.get('https://dwd.api.proxy.bund.dev/v30/stationOverviewExtended?stationIds=10865,G005')
print(response_API.status_code)
data = response_API.json()
result = []
for station, value in data.items():
    for forecast, val in value.items():
        if forecast in ['forecast1', 'forecast2']:
            result.append(val)
df = pd.DataFrame(result)
print(df)
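To get one row per temperature value with the other columns repeated, as the question asks, DataFrame.explode may be enough once the forecast dicts are collected. The sketch below runs on a small hand-written sample instead of calling the API; the nesting (station id, then forecast1, then stationId/start/timeStep/temperature) and all the values are assumptions modelled on the answer above, not the real response.

```python
import pandas as pd

# Hypothetical stand-in for response_API.json(); the real payload may differ.
data = {
    "10865": {
        "forecast1": {"stationId": "10865", "start": 1000,
                      "timeStep": 3600000, "temperature": [10, 12, 11]},
        "days": [],
    },
    "G005": {
        "forecast1": {"stationId": "G005", "start": 1000,
                      "timeStep": 3600000, "temperature": [7, 8]},
    },
}

# Collect one dict per station, keeping only the columns of interest.
rows = [station["forecast1"] for station in data.values() if "forecast1" in station]
df = pd.DataFrame(rows, columns=["stationId", "start", "timeStep", "temperature"])

# explode turns each list element into its own row and repeats the other columns.
long_df = df.explode("temperature", ignore_index=True)
print(long_df)
```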

Removing Values from Pandas Read Excel

I am trying to read values from an Excel file and convert them to JSON to use in my API.
I am getting:
{"Names":{"0":"Tom","1":"Bill","2":"Sally","3":"Cody","4":"Betty"}}
I only want to see the values. What I would like to get is this:
{"Names":{"Tom", "Bill", "Sally", "Cody", "Betty"}}
I haven't figured out how to remove the numbers before the values.
The code I am using is as follows:
import pandas as pd
df = pd.read_excel(r'C:\Users\User\Desktop\Names.xlsx')
json_str = df.to_json()
print(json_str)
As mentioned in the comments, your desired result is not valid JSON.
Maybe you can do this:
import json
import pandas as pd
df = pd.read_excel(r'C:\Users\User\Desktop\Names.xlsx')
json_str = df.to_json()
temp = json.loads(json_str)
temp['Names'] = list(temp['Names'].values())
print(json.dumps(temp))
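A shorter route to the same list-shaped result might be DataFrame.to_dict with orient="list", which skips the round trip through to_json and json.loads. This is a sketch only: the inline DataFrame stands in for the pd.read_excel call, since the Names.xlsx file is not available here.

```python
import json

import pandas as pd

# Stand-in for pd.read_excel(...): a one-column frame like the one in the question.
df = pd.DataFrame({"Names": ["Tom", "Bill", "Sally", "Cody", "Betty"]})

payload = df.to_dict(orient="list")   # {"Names": ["Tom", "Bill", ...]}
json_str = json.dumps(payload)
print(json_str)
```

The values come out as a JSON array, which is the closest valid JSON to the output the question asks for.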

how to save a pandas DataFrame to an excel file?

I am trying to load data from a web source and save it as an Excel file, but I am not sure how to do it. What should I do?
import requests
import pandas as pd
import xmltodict
url = "https://www.kstan.ua/sitemap.xml"
res = requests.get(url)
raw = xmltodict.parse(res.text)
data = [[r["loc"], r["lastmod"]] for r in raw["urlset"]["url"]]
print("Number of sitemaps:", len(data))
df = pd.DataFrame(data, columns=["links", "lastmod"])
df.to_csv("output.csv", index=False)
OR
df.to_excel("output.xlsx")
You can write the dataframe to excel using the pandas ExcelWriter, such as this:
import pandas as pd
with pd.ExcelWriter('path_to_file.xlsx') as writer:
    dataframe.to_excel(writer)
If you want to create multiple sheets in the same file:
with pd.ExcelWriter('csv_s/results.xlsx') as writer:
    same_res.to_excel(writer, sheet_name='same')
    diff_res.to_excel(writer, sheet_name='sheet2')

JSON to CSV with Leading Zeros

I'm writing code to convert JSON to CSV, and I need to retain the leading zeros.
I have the file emp.json, which has numeric values in a tag (e.g. 000, 001), along with other tags.
import pandas as pd
df = pd.read_json('emp.json')
df.to_csv('test1.csv', index= False)
I get the CSV file, but the leading zeros in the column are removed.
Convert the data type to string:
import pandas as pd
df = pd.read_json('emp.json',dtype=str)
df.to_csv('test1.csv', index= False)
Another way to do it
import json
import pandas as pd
jsondata = '[{"Code":"001","Description":"Afghanistan"},{"Code":"002","Description":"Albania"}]'
jdata = json.loads(jsondata)
df = pd.DataFrame(jdata)
print (df.T)
df.to_csv('test1.csv', index= False)
Code:https://repl.it/repls/BurdensomeCompassionateCommercialsoftware
Maybe pass object as the dtype argument:
import pandas as pd
df = pd.read_json('emp.json',dtype=object)
df.to_csv('test1.csv', index= False)
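object is not literally a synonym of str, but pandas has traditionally stored strings in object-dtype columns, so the values are left as they are instead of being converted to numbers.
Or you can use str: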
import pandas as pd
df = pd.read_json('emp.json',dtype=str)
df.to_csv('test1.csv', index= False)
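To check the dtype=str behaviour without the emp.json file, the same call can be run on an in-memory JSON string. The sketch below reuses the two-record sample from the earlier answer and wraps it in io.StringIO so that read_json treats it as a file-like object; the variable names are illustrative only.

```python
import io

import pandas as pd

# Sample records taken from the answer above; emp.json itself is not available here.
jsondata = '[{"Code":"001","Description":"Afghanistan"},{"Code":"002","Description":"Albania"}]'

# dtype=str asks read_json to keep every column as text instead of inferring numbers.
df = pd.read_json(io.StringIO(jsondata), dtype=str)
csv_text = df.to_csv(index=False)
print(csv_text)
```

Note that spreadsheet programs may still hide the leading zeros when they open the CSV, even though the file itself contains them.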
