I am trying to convert a pd.Series into a JSON string in the format [{"Name":0},{"Count of Outliers":4},{"Age":4},{"Photo":0}], using the output from sum_Outliers as per the screenshot, and write the JSON file to my local drive. Below is the code I am trying to work with. It would also be great if someone could help me convert the series into a pandas DataFrame and write it to a similar JSON.
# Output to be written
import os
import json

os.chdir(Output)
# A pandas Series is not directly JSON serializable, so convert it to a dict first
ini_string = {'Columns': sum_Outliers.to_dict()}
# serialize the dict to a JSON string
ini_string = json.dumps(ini_string)
# parse the string back into a dictionary
final_dictionary = json.loads(ini_string)
with open('Optimal_K_value_result.txt', 'w') as json_file:
    json.dump(final_dictionary, json_file)
Thanks in advance
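A minimal sketch of one way to get the [{"Name":0},{"Count of Outliers":4},...] shape, assuming sum_Outliers is a Series whose index holds the column names and whose values hold the outlier counts; the sample values and output file names below are only placeholders:
import json
import pandas as pd
# Hypothetical stand-in for the real sum_Outliers Series
sum_Outliers = pd.Series({"Name": 0, "Count of Outliers": 4, "Age": 4, "Photo": 0})
# Build one single-key dict per entry; int() avoids numpy integers, which json cannot serialize
records = [{index: int(value)} for index, value in sum_Outliers.items()]
with open('Outliers_result.json', 'w') as json_file:
    json.dump(records, json_file)
# DataFrame variant: to_json(orient='records') writes one dict per row
df = sum_Outliers.to_frame(name='Outliers').reset_index().rename(columns={'index': 'Column'})
df.to_json('Outliers_df.json', orient='records')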
Related
I am new here and need some help with writing to a JSON file:
I have a DataFrame with the values below, which is created by reading an Excel file.
I need to write this to a JSON file with the column details nested under an object.
Output:
A similar task is considered in the question:
Converting Excel into JSON using Python
Different approaches are possible to solve this problem.
I hope it works for your solution.
import pandas as pd
import json

df = pd.read_excel('./TfidfVectorizer_sklearn.xlsx')
df.to_json('new_file1.json', orient='records')  # excel to json

# read the json back and nest the records under a 'details' key
with open('./new_file1.json', 'r') as json_file:
    a = {}
    data = json.load(json_file)
    a['details'] = data

# write the new json with 'details' in it
with open("./new_file1.json", "w") as jsonFile:
    json.dump(a, jsonFile)
JSON Output:
Can someone help me convert this Excel file to JSON format using Python, please? I have tasks and subtasks like in this picture link.
You can either use the xlrd library or the pandas library. Read the documentation of both and choose which would be best for you.
Pandas would look like this (stolen from someone else):
import pandas
import json

# Read the Excel document
excel_data_df = pandas.read_excel('data.xlsx', sheet_name='sheet1')

# Convert the sheet to a JSON string
# (orient='records' serializes the document row by row, top to bottom)
thisisjson = excel_data_df.to_json(orient='records')

# Print out the result
print('Excel Sheet to JSON:\n', thisisjson)

# Parse the string into a list so it can be written to a JSON file
thisisjson_dict = json.loads(thisisjson)

# Open the file to write to ('w' for write) and dump the list into it
with open('data.json', 'w') as json_file:
    json.dump(thisisjson_dict, json_file)
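For completeness, here is a rough sketch of the xlrd route mentioned above; note that this assumes an .xls workbook (xlrd 2.x no longer reads .xlsx files), and the file name and sheet layout are only placeholders:
import json
import xlrd
# Open the workbook and grab the first sheet
book = xlrd.open_workbook('data.xls')
sheet = book.sheet_by_index(0)
# Treat the first row as the header and build one dict per remaining row
header = sheet.row_values(0)
records = [dict(zip(header, sheet.row_values(row))) for row in range(1, sheet.nrows)]
with open('data.json', 'w') as json_file:
    json.dump(records, json_file)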
I have a JSON file, data.json, which contains a large number of Twitter records. I am trying to load this JSON file in Jupyter and then transfer it to a pandas DataFrame for further analysis. So far I have written the following code.
A sample of the tweets: '{"_id":{"$oid":"5ec248611c9b498cdbf095a1"},"created_at":"Mon Dec 31 23:19:39 +0000 2018","id":{"$numberLong":"1079879790738325504"},"id_str":"1079879790738325504","text":"NPAF's Artist in Residence, Composer Glenn McClure is at the Park at work on his unusual sonification compostions
import json
import csv

json_file = "\\Users\\data.json"
header = ["id_str", "created_at", "lang", "text"]
tweets_processed = 0

with open(json_file, 'r') as infile:
    print("json file: ", json_file)
    for line in infile:
        tweet = json.loads(line)
        #row = [tweet['id_str'], tweet['created_at'], tweet['lang'], tweet['text']]
        #csvwriter.writerow(row)
        tweets_processed += 1
        #print("tweet processed: ", tweet_processed)
This is the code I have written so far to read my JSON file and pass it to a pandas DataFrame. Any help on how to get my JSON data into a pandas DataFrame? Thanks in advance.
You never imported the name csvwriter into the namespace.
In this instance, you should likely be creating a writer with csv.writer() and calling its writerow() method. Alternatively, if you are trying to use the csvwriter package (which I doubt you are), then you need to add import csvwriter to the top of the file.
The takeaway is to read the docs of the package you are trying to use and import everything into the proper namespace.
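A minimal sketch of that approach, assuming the header columns and the one-tweet-per-line file from the question; the output file name tweets.csv is just a placeholder:
import csv
import json
json_file = "\\Users\\data.json"
header = ["id_str", "created_at", "lang", "text"]
with open(json_file, 'r') as infile, open('tweets.csv', 'w', newline='') as outfile:
    csvwriter = csv.writer(outfile)  # the writer object the original code never created
    csvwriter.writerow(header)
    for line in infile:
        tweet = json.loads(line)
        csvwriter.writerow([tweet['id_str'], tweet['created_at'], tweet['lang'], tweet['text']])
From there, pandas.read_csv('tweets.csv') would give the DataFrame the question asks for.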
The easiest way to do this is to use pandas' own function read_json().
import pandas as pd
dataset = pd.read_json('file.json')
The file must contain a format like this:
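Since the tweets in the question appear to be one JSON object per line, a small variation (my assumption) that pairs read_json with lines=True and keeps only the columns from the question's header list may be closer to what is needed:
import pandas as pd
# lines=True treats each line of the file as a separate JSON record
dataset = pd.read_json("\\Users\\data.json", lines=True)
# keep only the columns from the question's header list
dataset = dataset[["id_str", "created_at", "lang", "text"]]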
I have a 20 GB .ndjson file that I want to open with Python. The file is too big, so I found a way to split it into 50 pieces with an online tool. This is the tool: https://pinetools.com/split-files
Now I get a file that has the extension .ndjson.000 (and I do not know what that is).
I'm trying to open it as a JSON or CSV file to read it into pandas, but it does not work.
Do you have any idea how to solve this?
import json
import pandas as pd
First approach:
df = pd.read_json('dump.ndjson.000', lines=True)
Error: ValueError: Unmatched ''"' when when decoding 'string'
Second approach:
with open('dump.ndjson.000', 'r') as f:
    my_data = f.read()
print(my_data)
Error: json.decoder.JSONDecodeError: Unterminated string starting at: line 1 column 104925061 (char 104925060)
I think the problem is that I have some emojis in my file, and I do not know how to encode them.
ndjson is now supported out of the box with the argument lines=True:
import pandas as pd

df = pd.read_json('/path/to/records.ndjson', lines=True)
# to_json only accepts lines=True together with orient='records'
df.to_json('/path/to/export.ndjson', orient='records', lines=True)
I think pandas.read_json cannot handle ndjson correctly.
According to this issue, you can do something like this to read it:
import ujson as json
import pandas as pd
records = map(json.loads, open('/path/to/records.ndjson'))
df = pd.DataFrame.from_records(records)
P.S.: All credit for this code goes to KristianHolsheimer from the GitHub issue.
ndjson (newline-delimited JSON) is a JSON-lines format, that is, each line is a separate JSON document. It is ideal for a dataset lacking rigid structure ('non-SQL') where the file size is large enough to warrant multiple files.
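For illustration, such a file would look something like this, with one complete JSON object per line (the field names here are made up):
{"id": 1, "text": "first record"}
{"id": 2, "text": "second record"}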
You can use pandas:
import pandas as pd
data = pd.read_json('dump.ndjson.000', lines=True)
In case your JSON strings do not contain newlines, you can alternatively use:
import json

with open("dump.ndjson.000") as f:
    data = [json.loads(l) for l in f.readlines()]
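Given that the original file is around 20 GB, a sketch of reading it in chunks instead of splitting it with an external tool may also help; the file name and chunksize here are only illustrative, and chunksize requires lines=True:
import pandas as pd
# stream the newline-delimited file in manageable pieces instead of loading 20 GB at once
reader = pd.read_json('dump.ndjson', lines=True, chunksize=100_000)
for chunk in reader:
    # each chunk is an ordinary DataFrame; process or aggregate it here
    print(len(chunk))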
I'm new on Stack Overflow.
I'm trying to convert a .pkl file into a JSON file using Python. Below is my sample code:
import pickle
import pandas as pd
# Load pickle file
input_file = open('file.pkl', 'rb')
new_dict = pickle.load(input_file)
input_file()
# Create a Pandas DataFrame
data_frame = pd.DataFrame(new_dict)
# Copy DataFrame index as a column
data_frame['index'] = data_frame.index
# Move the new index column to the front of the DataFrame
index = data_frame['index']
data_frame.drop(labels=['index'], axis=1, inplace = True)
data_frame.insert(0, 'index', index)
# Convert to json values
json_data_frame = data_frame.to_json(orient='values', date_format='iso', date_unit='s')
with open('data.json', 'w') as js_file:
    js_file.write(json_data_frame)
When I run this code I get the error TypeError: '_io.TextIOWrapper' object is not callable. Following some similar issues (This one and This one), which suggested using the write method with input_file() at line 7, I still get the error io.UnsupportedOperation: write, which is apparently a writing method, but I'm opening the file for reading, and for reading I'm unable to find any such method.
I also tried to read the pickle file in the following way:
with open('file.pkl', 'rb') as input_file:
    new_dict = pickle.load(input_file)
and I'm getting this error:
DataFrame constructor not properly called!
I need some suggestions on how I can solve this problem.
Any suggestions about other tools that can perform this task would also be appreciated. Thanks
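A minimal sketch of one way this conversion could work, assuming (as the 'DataFrame constructor not properly called!' error hints) that the pickle may already contain a DataFrame; the file names are only placeholders:
import pickle
import pandas as pd
# read the pickle inside a with-block so the file is closed automatically
with open('file.pkl', 'rb') as input_file:
    obj = pickle.load(input_file)
# if the pickle already holds a DataFrame, use it directly; otherwise try to build one (e.g. from a dict)
if isinstance(obj, pd.DataFrame):
    data_frame = obj
else:
    data_frame = pd.DataFrame(obj)
# keep the index as the first column, then write JSON as in the question
data_frame = data_frame.reset_index()
data_frame.to_json('data.json', orient='values', date_format='iso', date_unit='s')
If the pickle was written by pandas itself, pandas.read_pickle('file.pkl') would also load it in one call.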