I am new here and need some help with writing to a JSON file.
I have a dataframe with the values below, created by reading an Excel file.
I need to write this to a JSON file with the records nested under an object, dtls.
Desired output:
A similar task is covered in the question:
Converting Excel into JSON using Python
Several approaches are possible for this problem; I hope this one works for your case.
import pandas as pd
import json
df = pd.read_excel('./TfidfVectorizer_sklearn.xlsx')
df.to_json('new_file1.json', orient='records') # excel to json
# read the JSON back and nest it under a 'details' key
with open('./new_file1.json', 'r') as json_file:
    a = {}
    data = json.load(json_file)
    a['details'] = data

# write the new JSON with details in it
with open("./new_file1.json", "w") as jsonFile:
    json.dump(a, jsonFile)
JSON Output: the file now holds a single object whose details key contains the list of row records.
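A one-pass alternative is to build the wrapper dict straight from the DataFrame and skip the read-back of the intermediate file. This is only a minimal sketch, reusing the same file names as above:
import json
import pandas as pd

df = pd.read_excel('./TfidfVectorizer_sklearn.xlsx')

# build {"details": [row records]} directly from the DataFrame
wrapped = {'details': df.to_dict(orient='records')}

# default=str guards against values json cannot serialize, such as Timestamps
with open('new_file1.json', 'w') as json_file:
    json.dump(wrapped, json_file, default=str)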
Can someone help me convert this Excel file to JSON format using Python, please? I have tasks and subtasks like in this picture link.
You can either use the xlrd library or the pandas library. Read the documentation of both and choose which would be best for you.
Pandas would look like this (stolen from someone else):
import pandas
import json
# Read excel document
excel_data_df = pandas.read_excel('data.xlsx', sheet_name='sheet1')
# Convert the DataFrame to a JSON string
# (orient='records' emits one object per row, top to bottom)
thisisjson = excel_data_df.to_json(orient='records')
# Print out the result
print('Excel Sheet to JSON:\n', thisisjson)
# Parse the string back into Python objects so it can be written to a JSON file
thisisjson_dict = json.loads(thisisjson)
# Open the output file in write mode and dump the parsed data with json.dump()
with open('data.json', 'w') as json_file:
    json.dump(thisisjson_dict, json_file)
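If the intermediate string is not needed, to_json can also write straight to a file. A minimal sketch, assuming the same file and sheet names as above:
import pandas

excel_data_df = pandas.read_excel('data.xlsx', sheet_name='sheet1')

# write the DataFrame directly to a JSON file, one object per row
excel_data_df.to_json('data.json', orient='records')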
I have a JSON file data.json which contains a large number of Twitter records. I am trying to load this JSON file in Jupyter and then convert it to a pandas DataFrame for further analysis. So far I have written the following code.
A sample tweet looks like this (truncated): '{"_id":{"$oid":"5ec248611c9b498cdbf095a1"},"created_at":"Mon Dec 31 23:19:39 +0000 2018","id":{"$numberLong":"1079879790738325504"},"id_str":"1079879790738325504","text":"NPAF's Artist in Residence, Composer Glenn McClure is at the Park at work on his unusual sonification compostions
import json
import csv
json_file = "\\Users\\data.json"
header = ["id_str", "created_at", "lang", "text"]
tweets_processed = 0
with open(json_file, 'r') as infile:
    print("json file: ", json_file)
    for line in infile:
        tweet = json.loads(line)
        #row = [tweet['id_str'], tweet['created_at'], tweet['lang'], tweet['text']]
        #csvwriter.writerow(row)
        tweets_processed += 1
        #print("tweet processed: ", tweet_processed)
This is the code I have written so far; it only reads the JSON file line by line. How can I get this JSON data into a pandas DataFrame? Thanks in advance.
You never imported the name csvwriter into the namespace.
In this instance, you likely want csv.writer() and its writerow() method. Alternatively, if you are actually trying to use the csvwriter package (which I doubt), you need to add import csvwriter to the top of the file.
The takeaway is to read the docs of the package you are trying to use and import everything into the proper namespace.
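For reference, here is a minimal sketch of how the commented-out lines could work with csv.writer, reusing the path and header list from the question ('tweets.csv' is just a hypothetical output name):
import csv
import json

json_file = "\\Users\\data.json"
header = ["id_str", "created_at", "lang", "text"]

with open(json_file, 'r') as infile, open('tweets.csv', 'w', newline='') as outfile:
    csvwriter = csv.writer(outfile)
    csvwriter.writerow(header)              # write the header row once
    for line in infile:
        tweet = json.loads(line)            # one JSON object per line
        csvwriter.writerow([tweet['id_str'], tweet['created_at'],
                            tweet['lang'], tweet['text']])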
The easiest way to do this is to use pandas' own read_json() function.
import pandas as pd
dataset = pd.read_json('file.json')
The file must contain valid JSON in a format that read_json can parse.
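Since the sample in the question looks like newline-delimited JSON (one object per line), read_json may need lines=True. A minimal sketch, assuming the path and column names from the question:
import pandas as pd

# one JSON object per line -> lines=True
dataset = pd.read_json("\\Users\\data.json", lines=True)

# keep just the columns listed in the question
print(dataset[["id_str", "created_at", "lang", "text"]].head())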
I am trying to convert a pd.Series output into a JSON string in the format [{"Name":0},{"Count of Outliers":4},{"Age":4},{"Photo":0}], using the output from Sum_Outliers as per the screenshot, and write the JSON file to my local drive. Below is the code I am trying to work with. It would also be great if someone could help me convert the series into a pandas DataFrame and write it out as similar JSON.
# Output to be written
import os
import json

os.chdir(Output)

ini_string = {'Columns': sum_Outliers}

# serialize the dictionary to a JSON string
ini_string = json.dumps(ini_string)

# parse the string back into a dictionary
final_dictionary = json.loads(ini_string)

with open('Optimal_K_value_result.txt', 'w') as json_file:
    json.dump(final_dictionary, json_file)
Thanks in advance
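For what it's worth, a minimal sketch of one way to get that shape, assuming sum_Outliers is a pandas Series whose index holds the column names (the values below are hypothetical stand-ins taken from the target format in the question):
import json
import pandas as pd

# hypothetical stand-in for the Series shown in the screenshot
sum_Outliers = pd.Series({'Name': 0, 'Count of Outliers': 4, 'Age': 4, 'Photo': 0})

# one single-key object per entry: [{"Name": 0}, {"Count of Outliers": 4}, ...]
# int() converts numpy integers, which json cannot serialize directly
records = [{col: int(val)} for col, val in sum_Outliers.items()]

with open('Optimal_K_value_result.txt', 'w') as json_file:
    json.dump(records, json_file)

# the same Series as a one-column DataFrame, if that form is preferred
df = sum_Outliers.to_frame(name='count')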
Trying to do a simple script to save as CSV. My Python version is 3.8.3 and I am using Windows 10.
I am trying to use the pandas library: https://pandas.pydata.org
I am trying to get data from the URL https://barcode.monster/api/3061990141101. I installed pandas to convert the JSON to CSV, but there is an "index" problem and none of the answers I found worked:
ValueError: If using all scalar values, you must pass an index
I looked all over Google and forums, and tried adding index_col=0 and also index=False.
Below is my Python script:
import json
import urllib.request
import pandas as pd

with urllib.request.urlopen("https://barcode.monster/api/3061990141101") as url:
    data = json.loads(url.read().decode())
    print(data)

with open('data.json', encoding='utf-8-sig') as f_input:
    df = pd.read_json(f_input, index_col=0)

df.to_csv('test.csv', encoding='utf-8', index=False)
I'm sure this is obvious for any Python dev, I'm just learning from scratch. Many thanks.
There are many ways to solve this. If you're going to open the file as text, you'll need to parse the string into JSON; try json.load(f). Then you can build a DataFrame. You will also need to either pass an index for the scalar values or wrap the JSON data in an outer object.
For example:
with open('data.json', "r") as f_input:
    text = f_input.read()
    jsonData = json.loads(text)

df = pd.DataFrame(jsonData, index=[0])
df.to_csv('test.csv', encoding='utf-8', index=False)
Or:
with open('data.json', 'r') as f:
    data = json.load(f)

df = pd.DataFrame({'data': data})
df.to_csv('test.csv', encoding='utf-8', index=False)
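If the goal is simply to get the API response into a one-row DataFrame, pd.json_normalize can also be applied to the decoded response directly, skipping the intermediate file. A minimal sketch, assuming the same URL and output name:
import json
import urllib.request
import pandas as pd

with urllib.request.urlopen("https://barcode.monster/api/3061990141101") as url:
    data = json.loads(url.read().decode())

# flatten the (possibly nested) JSON object into a one-row DataFrame
df = pd.json_normalize(data)
df.to_csv('test.csv', encoding='utf-8', index=False)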
I have a collection of JSON documents. I need to aggregate the data from all these documents into a portable format like CSV, for easy access in Excel or other analytics tools.
The problem I face is that I create each JSON document by adding keys one by one. Because of this, the key order in the JSON documents is not consistent, and I'm not sure that when I parse a JSON document into CSV it will retain its schema (not in the RDBMS sense, but the 2-D Excel-style layout).
I just want to ensure that every time I update the CSV file with csv.writerow(), each value ends up under the header that was set the first time.
Any ideas how can I achieve my goal?
One way is to use csv.DictWriter to create the CSV file:
import json
import csv
# Two JSON documents
jsondoc1 = '''{"a":"aardvark", "b":"bengal tiger"}'''
jsondoc2 = '''{"a":"Samuel Adams", "b":"Carter Braxton"}'''
# Create a CSV file, then use csv.DictWriter() to write the header
# and one row for each JSON document
with open("output.csv", "wt", newline="") as output_file:
    writer = csv.DictWriter(output_file, ["a", "b"])
    writer.writeheader()
    writer.writerow(json.loads(jsondoc1))
    writer.writerow(json.loads(jsondoc2))
Result:
a,b
aardvark,bengal tiger
Samuel Adams,Carter Braxton
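To keep appending documents over time while preserving the original header order, the same DictWriter can be reused in append mode. A minimal sketch, assuming a fixed field list; append_doc is a hypothetical helper name:
import csv
import json
import os

FIELDS = ["a", "b"]          # header order is fixed once, up front

def append_doc(jsondoc, path="output.csv"):
    # hypothetical helper: append one JSON document as one CSV row
    write_header = not os.path.exists(path)
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, FIELDS, restval="", extrasaction="ignore")
        if write_header:
            writer.writeheader()          # only for a brand-new file
        writer.writerow(json.loads(jsondoc))

# keys may come in any order; each value still lands under the right header
append_doc('''{"b":"bobcat", "a":"anteater"}''')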