From grouped rows in Excel to json using python - python

Can someone please help me convert this Excel file to JSON format using Python? I have tasks and subtasks, as in this picture link.

You can use either the xlrd library or the pandas library. Read the documentation for both and choose whichever suits you best. Note that xlrd 2.0 and later reads only legacy .xls files, not .xlsx.
Pandas would look like this (stolen from someone else):
import pandas
import json
# Read excel document
excel_data_df = pandas.read_excel('data.xlsx', sheet_name='sheet1')
# Convert excel to string
# (define orientation of document in this case from up to down)
thisisjson = excel_data_df.to_json(orient='records')
# Print out the result
print('Excel Sheet to JSON:\n', thisisjson)
# Make the string into a list to be able to input in to a JSON-file
thisisjson_dict = json.loads(thisisjson)
# Define file to write to and 'w' for write option -> json.dump()
# defining the list to write from and file to write to
with open('data.json', 'w') as json_file:
    json.dump(thisisjson_dict, json_file)
Converting Excel into JSON using Python
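The flat records produced by to_json(orient='records') do not reflect the task/subtask grouping the question asks about. Here is a minimal sketch of one way to nest them. The picture is not available here, so the layout is an assumption: each task row is assumed to have a value in a hypothetical "task" column, and each subtask row is assumed to leave it blank and fill a hypothetical "subtask" column. The sample rows stand in for the list returned by json.loads(thisisjson).

```python
import json

# Stand-in for json.loads(thisisjson); blank Excel cells come back as None.
# The column names "task" and "subtask" are assumptions, not taken from the sheet.
rows = [
    {"task": "Design", "subtask": None},
    {"task": None, "subtask": "Wireframes"},
    {"task": None, "subtask": "Review"},
    {"task": "Build", "subtask": None},
    {"task": None, "subtask": "Backend"},
]


def nest_subtasks(records):
    """Group each run of subtask rows under the task row that precedes it."""
    tasks = []
    for row in records:
        if row["task"] is not None:
            # A filled "task" cell starts a new group
            tasks.append({"task": row["task"], "subtasks": []})
        elif tasks:
            # A blank "task" cell means the row belongs to the latest task
            tasks[-1]["subtasks"].append(row["subtask"])
    return tasks


nested = nest_subtasks(rows)
print(json.dumps(nested, indent=2))
```

If the real sheet marks the grouping differently (for example with an outline level or an indent column), only the test inside the loop would need to change.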

Related

convert excel to json file in python

I am new here and need some help with writing to a JSON file.
I have a dataframe with the values below, which is created by reading an Excel file.
I need to write it to a JSON file, with dtls as the enclosing object.
Output:
A similar task is considered in the question:
Converting Excel into JSON using Python
Several approaches are possible.
I hope this one works for you.
import pandas as pd
import json
df = pd.read_excel('./TfidfVectorizer_sklearn.xlsx')
df.to_json('new_file1.json', orient='records') # excel to json
# read json and then append details to it
with open('./new_file1.json', 'r') as json_file:
    a = {}
    data = json.load(json_file)
    a['details'] = data
# write new json with details in it
with open("./new_file1.json", "w") as jsonFile:
    json.dump(a, jsonFile)
JSON Output:

Convert a series object to json and write it to local drive

I am trying to convert a pd.Series output into a JSON string in the format [{"Name":0},{"Count of Outliers":4},{"Age":4},{"Photo":0}], using the output from Sum_Outliers as per the screenshot, and write the JSON file to my local drive. Below is the code I am trying to get working. It would also be great if someone could help me convert the Series into a pandas DataFrame and write it out as similar JSON.
#Output to be written
import os
import json
os.chdir(Output)
ini_string = {'Columns': sum_Outliers}
# printing initial json
ini_string = json.dumps(ini_string)
# converting string to json
final_dictionary = json.loads(ini_string)
with open('Optimal_K_value_result.txt', 'w') as json_file:
    json.dump(final_dictionary, json_file)
Thanks in advance
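One possible way to get the [{"Name":0}, ...] shape is to build one single-key dict per entry and dump the resulting list. This is only a sketch, assuming sum_Outliers maps column names to counts. A plain dict stands in for the Series here (for a real pandas Series, iterating over sum_Outliers.items() yields the same label/value pairs), and the output file name is hypothetical.

```python
import json
import os
import tempfile

# Stand-in for the Series; the values are taken from the format in the question.
sum_outliers = {"Name": 0, "Count of Outliers": 4, "Age": 4, "Photo": 0}

# One {label: value} dict per entry. int() turns NumPy integers, which the
# json module cannot serialise, into plain Python ints.
records = [{label: int(count)} for label, count in sum_outliers.items()]

json_string = json.dumps(records)
print(json_string)

# Hypothetical output location; replace with the intended folder.
out_path = os.path.join(tempfile.gettempdir(), "outliers.json")
with open(out_path, "w") as json_file:
    json.dump(records, json_file)
```

To work with a DataFrame instead, Series.to_frame() returns a one-column DataFrame built from the Series.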

Python transfer excel formatting between two Excel documents

I'd like to copy the formatting between two Excel sheets in python.
Here is the situation:
I have a script that effectively "alters" (i.e. overwrites) an Excel file by opening it using pd.ExcelWriter, then updates values in the rows. Finally, the file is overwritten using ExcelWriter.
The Excel file is printed/shared/read by humans between updates done by the code. Humans will do things like change number formatting, turn on/off word wrap, and alter column widths.
My goal is the code updates should only alter the content of the file, not the formatting of the columns.
Is there a way I can read/store/write the sheet format within python so the output file has the same column formatting as the input file?
Here's the basic idea of what I am doing right now:
df_in= pd.read_excel("myfile.xlsx")
# Here is where I'd like to read in format of the first sheet of this file
xlwriter = pd.ExcelWriter('myfile.xlsx', engine='xlsxwriter')
df_out = do_update(df_in)
df_out.to_excel(xlwriter,'sheet1')
# Here is where I'd like to apply the format I read earlier to the sheet
xlwriter.save()
Note: I have played with xlsxwriter.set_column and add_format. As far as I can tell, these don't help me read the format from the current file
Older versions of pandas use the xlrd package to parse Excel documents into DataFrames; newer versions read .xlsx files with openpyxl.
Interoperability between other xlsx packages and xlrd could be problematic when it comes to the data structure used to represent formatting information.
I suggest using openpyxl as your engine when instantiating pandas.ExcelWriter. It comes with reader and writer classes that are interoperable.
import pandas as pd
from openpyxl.styles.stylesheet import apply_stylesheet
from openpyxl.reader.excel import ExcelReader
xlreader = ExcelReader('myfile.xlsx', read_only=True)
xlwriter = pd.ExcelWriter('myfile.xlsx', engine='openpyxl')
df_in = pd.read_excel("myfile.xlsx")
df_out = do_update(df_in)
df_out.to_excel(xlwriter,'sheet1')
apply_stylesheet(xlreader.archive, xlwriter.book)
xlwriter.save()

Writing value to given field in csv file using pandas or csv module

Is there any way to write a value to a specific place in a given .csv file using pandas or the csv module?
I have tried using csv_reader to read the file and find the line that fits my requirements, but I couldn't figure out how to replace the value in the file with mine.
What I am trying to achieve is this: I have a spreadsheet of names and values. I use JSON to update the values from the server, and after that I want to update my spreadsheet as well.
The latest solution I came up with was to create a separate sheet from which I will get the updated data, but this is not working, because the dict is written to the file in no particular order.
import csv

def updateSheet(fileName, aValues):
    # Open in write mode; newline="" is what the csv module expects
    with open(fileName + ".csv", "w", newline="") as workingSheet:
        writer = csv.DictWriter(workingSheet, aValues.keys())
        writer.writeheader()
        writer.writerow(aValues)
I will appreciate any guidance and tips.
You can try writing the CSV file this way:
import pandas as pd
a = ['one','two','three']
b = [1,2,3]
english_column = pd.Series(a, name='english')
number_column = pd.Series(b, name='number')
predictions = pd.concat([english_column, number_column], axis=1)
save = pd.DataFrame({'english':a,'number':b})
save.to_csv('b.csv',index=False,sep=',')
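To change one value in an existing CSV file, a common pattern with the csv module is to read all rows into memory, modify the matching row, and write everything back. Here is a minimal sketch, assuming the file has a header row. The function name update_value, the column names, and the sample data are all hypothetical.

```python
import csv
import os
import tempfile


def update_value(path, key_column, key, column, new_value):
    """Rewrite the CSV at path, setting column in rows where key_column equals key."""
    with open(path, newline="") as f:
        reader = csv.DictReader(f)
        fieldnames = reader.fieldnames
        rows = list(reader)

    changed = 0
    for row in rows:
        if row[key_column] == key:
            row[column] = new_value
            changed += 1

    # Passing fieldnames fixes the column order, so header and values stay aligned
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return changed


# Demo on a small throwaway file
path = os.path.join(tempfile.mkdtemp(), "names.csv")
with open(path, "w", newline="") as f:
    f.write("name,value\nalice,1\nbob,2\n")

changed = update_value(path, "name", "bob", "value", "20")

with open(path, newline="") as f:
    result = f.read()
print(result)
```

Reusing the header read from the file as fieldnames also addresses the ordering concern in the question: DictWriter writes each value under its own header regardless of the order of keys in the dict.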

Updating a schema bound csv from a collection of json documents

I have a collection of JSON documents. I need to aggregate the data from all these documents into a portable format like CSV for easy access to data in excel or other analytics tools.
The problem I face is that I create each JSON document by adding keys one by one. Because of this, the keys in the JSON end up in no fixed order, and I'm not sure that when I parse the JSON documents into CSV the data will retain its schema (not as in an RDBMS, but the 2D Excel schema).
I just want to ensure that every time I update the CSV file with csv.writerow(), each value corresponds to its header, which was set the first time.
Any ideas how can I achieve my goal?
One way is to use csv.DictWriter to create the CSV file:
import json
import csv
# Two JSON documents
jsondoc1 = '''{"a":"aardvark", "b":"bengal tiger"}'''
jsondoc2 = '''{"a":"Samuel Adams", "b":"Carter Braxton"}'''
# Create a CSV file, then use csv.DictWriter() to write the header
# and one row for each JSON document
with open("output.csv", "wt", newline="") as output_file:
    writer = csv.DictWriter(output_file, ["a", "b"])
    writer.writeheader()
    writer.writerow(json.loads(jsondoc1))
    writer.writerow(json.loads(jsondoc2))
Result:
a,b
aardvark,bengal tiger
Samuel Adams,Carter Braxton
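The same approach can be sketched for documents whose keys arrive in a different order, or with keys missing or added. The field list, the extra column "c", and the sample documents below are assumptions for illustration. restval and extrasaction are standard csv.DictWriter parameters, and an in-memory buffer is used in place of a file.

```python
import csv
import io
import json

FIELDNAMES = ["a", "b", "c"]  # hypothetical fixed schema, set once

docs = [
    '{"b": "bengal tiger", "a": "aardvark"}',        # keys out of order, "c" missing
    '{"c": "cat", "a": "ant", "extra": "ignored"}',  # "b" missing, unknown key "extra"
]

buffer = io.StringIO()
writer = csv.DictWriter(
    buffer,
    fieldnames=FIELDNAMES,
    restval="",             # value written for keys a document lacks
    extrasaction="ignore",  # skip keys that are not in FIELDNAMES
)
writer.writeheader()
for doc in docs:
    writer.writerow(json.loads(doc))

lines = buffer.getvalue().splitlines()
print(lines)
```

Because DictWriter looks up each value by header name, the order of keys inside each JSON document does not affect which column a value lands in.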
