JSON to CSV in Python separating letters instead of values with comma

I have been trying to convert a JSON file to CSV in Python, but the resulting CSV is wrong: each letter is separated by a comma, instead of each value from the key-value pairs being written as a whole field. The code I tried and the CSV output I got are given below.
SAMPLE JSON FILE
"details":[
{
"name": "sreekumar, ananthu",
"type": "faculty/academician",
"personal": {
"age": "28",
"address": [
{
"street": "xyz",
"city": "abc",
}
]
}
SAMPLE CODE
import json
import csv
with open("json_data.json", "r") as f:
    data = json.load(f)
csv_file = open("csv_file.csv", "w")
csv_writer = csv.writer(csv_file)
for details in data['details']:
    for detail_key, detail_value in details.items():
        if detail_key == 'name':
            csv_writer.writerow(detail_value)
        if detail_key == 'personal':
            for personal_key, personal_value in detail_value.items():
                if personal_key == 'age':
                    csv_writer.writerow(personal_value)
csv_file.close()
SAMPLE OUTPUT
s,r,e,e,k,u,m,a,ra,n,a,n,t,h,u,2,8
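A likely cause of the output above: csv.writer.writerow expects an iterable of fields, and a string is an iterable of characters, so each letter becomes its own field. Wrapping the values in a list writes them as whole fields. The following is a minimal sketch, assuming in-memory data shaped like the sample JSON; the function name details_to_csv and the name/age header row are hypothetical choices for illustration.

```python
import csv
import io

# Hypothetical in-memory data shaped like the sample JSON in the question.
data = {
    "details": [
        {
            "name": "sreekumar, ananthu",
            "type": "faculty/academician",
            "personal": {"age": "28"},
        }
    ]
}

def details_to_csv(data):
    """Write one CSV row per entry, with each value as a whole field."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["name", "age"])  # header row (assumed column names)
    for detail in data["details"]:
        # Pass a list of fields; a bare string would be split into characters.
        writer.writerow([detail["name"], detail["personal"]["age"]])
    return buffer.getvalue()

csv_text = details_to_csv(data)
print(csv_text)
```

Because the name contains a comma, the csv module quotes that field so it stays a single column.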

Related

I am getting an identical sha256 for each JSON file in Python

I am in a huge hashing crisis. Using CHIP-0007's default format, I generated a few JSON files. I have been trying to generate a sha256 hash value from each of these files, and I expect a unique hash value for each file.
However, my Python code isn't doing so. I thought there might be an issue with the JSON files, but there is not. The problem has something to do with the sha256 code.
All the json files ->
JSON File 1
{ "format": "CHIP-0007", "name": "adewale-the-amebo", "description": "Adewale always wants to be in everyone's business.", "attributes": [ { "trait_type": "Gender", "value": "male" } ], "collection": { "name": "adewale-the-amebo Collection", "id": "1" } }
JSON File 2
{ "format": "CHIP-0007", "name": "alli-the-queeny", "description": "Alli is an LGBT Stan.", "attributes": [ { "trait_type": "Gender", "value": "male" } ], "collection": { "name": "alli-the-queeny Collection", "id": "2" } }
JSON File 3
{ "format": "CHIP-0007", "name": "aminat-the-snnobish", "description": "Aminat never really wants to talk to anyone.", "attributes": [ { "trait_type": "Gender", "value": "female" } ], "collection": { "name": "aminat-the-snnobish Collection", "id": "3" } }
Sample CSV File:
Series Number,Filename,Description,Gender
1,adewale-the-amebo,Adewale always wants to be in everyone's business.,male
2,alli-the-queeny,Alli is an LGBT Stan.,male
3,aminat-the-snnobish,Aminat never really wants to talk to anyone.,female
Python CODE
# TODO 2 : Generate a JSON file per entry in team's sheet in CHIP-0007's default format
new_jsonFile = f"{row[1]}.json"
json_data = {}
json_data["format"] = "CHIP-0007"
json_data["name"] = row[1]
json_data["description"] = row[2]
attribute_data = {}
attribute_data["trait_type"] = "Gender" # gender
attribute_data["value"] = row[3] # "value/male/female"
json_data["attributes"] = [attribute_data]
collection_data = {}
collection_data["name"] = f"{row[1]} Collection"
collection_data["id"] = row[0] # "ID of the NFT collection"
json_data["collection"] = collection_data
filepath = f"Json_Files/{new_jsonFile}"
with open(filepath, 'w') as f:
    json.dump(json_data, f, indent=2)
    C += 1
    sha256_hash = sha256_gen(filepath)
    temp.append(sha256_hash)
    NEW.append(temp)
# TODO 3 : Calculate sha256 of the each entry
def sha256_gen(fn):
    return hashlib.sha256(open(fn, 'rb').read()).hexdigest()
How can I generate a unique sha256 hash for each JSON?
I tried reading in byte blocks. That is also not working out. After many trials, I am going nowhere. Sharing the unexpected outputs of each JSON file:
[ All hashes are identical ]
Unexpected SHA256 output:
e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
Expected:
Unique Hash value. Different from each other
Because of output buffering, you're calling sha256_gen(filepath) before anything is written to the file, so you're getting the hash of an empty file. You should do that outside the with, so that the JSON file is closed and the buffer is flushed.
with open(filepath, 'w') as f:
    json.dump(json_data, f, indent=2)
C += 1
sha256_hash = sha256_gen(filepath)
temp.append(sha256_hash)
NEW.append(temp)
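One way to check this diagnosis: the repeated value e3b0c442...b855 in the question is the SHA-256 digest of zero bytes, which is what an unflushed, still-empty file contains. The sketch below hashes each file only after its with block has closed it. It writes to a temporary directory, and the two-field JSON payload is a shortened stand-in for the real CHIP-0007 data.

```python
import hashlib
import json
import os
import tempfile

def sha256_gen(path):
    """Return the hex SHA-256 digest of the file's bytes."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

# Digest of empty input: the value seen for every file in the question.
EMPTY_SHA256 = hashlib.sha256(b"").hexdigest()

hashes = []
with tempfile.TemporaryDirectory() as tmp:
    for name in ["adewale-the-amebo", "alli-the-queeny"]:
        path = os.path.join(tmp, f"{name}.json")
        with open(path, "w") as f:
            # Shortened stand-in payload, not the full CHIP-0007 record.
            json.dump({"format": "CHIP-0007", "name": name}, f, indent=2)
        # The file is closed and flushed here, so the hash covers real content.
        hashes.append(sha256_gen(path))

print(EMPTY_SHA256)
print(hashes)
```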

Flattening a JSON file using json_normalize and choosing specific elements to convert to an Excel sheet (Sample Attached)

{
"currency": {
"Wpn": {
"units": "KB_per_sec",
"type": "scalar",
"value": 528922.0,
"direction": "up"
}
},
"catalyst": {
"Wpn": {
"units": "ns",
"type": "scalar",
"value": 70144.0,
"direction": "down"
}
},
"common": {
"Wpn": {
"units": "ns",
"type": "scalar",
"value": 90624.0,
"direction": "down"
}
}
}
So I basically have to convert nested JSON into Excel. My approach was to flatten the JSON file using json_normalize, but as I am new to all this, I always seem to end up with a KeyError.
Here's my code so far, assuming that the file is named json.json:
import requests
from pandas import json_normalize
with open('json.json', 'r') as f:
    data = json.load(f)
df = pd.DataFrame(sum([i[['Wpn'], ['value']] for i in data], []))
df.to_excel('Ai.xlsx')
I'm trying to get an Excel sheet containing currency and common along with their respective values.
I know there are a lot of similar questions, but I have tried most of them and still didn't get the desired output. Please help me with this.
Try:
import json
import pandas as pd
with open('json.json', 'r') as f: data = json.load(f)
data = [{'key': k, 'wpn_value': v['Wpn']['value']} for k, v in data.items()]
print(data)
# here, the variable data looks like
# [{'key': 'currency', 'wpn_value': 528922.0}, {'key': 'catalyst', 'wpn_value': 70144.0}, {'key': 'common', 'wpn_value': 90624.0}]
df = pd.DataFrame(data).set_index('key') # set_index() optional
df.to_excel('Ai.xlsx')
The result looks like
key       wpn_value
currency  528922
catalyst  70144
common    90624
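The question asks for only currency and common, while the answer's comprehension keeps every top-level key. A small variation, sketched here with the sample data inlined, restricts the rows to a chosen set of keys. The tuple name wanted is a hypothetical choice.

```python
# Sample data from the question, inlined so the sketch is self-contained.
data = {
    "currency": {"Wpn": {"units": "KB_per_sec", "type": "scalar", "value": 528922.0, "direction": "up"}},
    "catalyst": {"Wpn": {"units": "ns", "type": "scalar", "value": 70144.0, "direction": "down"}},
    "common": {"Wpn": {"units": "ns", "type": "scalar", "value": 90624.0, "direction": "down"}},
}

# Hypothetical selection: only the keys the question asks for.
wanted = ("currency", "common")

# Build one flat row per wanted key, skipping keys missing from the data.
rows = [
    {"key": key, "wpn_value": data[key]["Wpn"]["value"]}
    for key in wanted
    if key in data
]
print(rows)
```

The resulting rows list has the same shape as the answer's data variable, so it can be passed to pd.DataFrame and to_excel in the same way.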

How to convert csv to nested arrays in json using python

I am trying to read data from a CSV file and convert it into nested arrays using Python.
The column names of my CSV are
"hallticket_Number ","student_name","gender","course_name","university_course_code ","university_college_code","caste","course_year","semester_yearly_exams","subject_name1","subject_code1","marks_or_grade_points_obtained1","maximum_marks_or_grade_points1","pass_mark1","no_of_credits1","pass_fail_absent1","subject_name2","subject_code2","marks_or_grade_points_obtained2","maximum_marks_or_grade_points2","no_of_credits2","pass_fail_absent2" ,"subject_name3","subject_code3", "marks_or_grade_points_obtained3","maximum_marks_or_grade_points3","no_of_credits3", "pass_fail_absent3" ,"subject_name4" ,"subject_code4" ,"marks_or_grade_points_obtained4","maximum_marks_or_grade_points4","no_of_credits4" , "pass_fail_absent4" ,"subject_code5", "marks_or_grade_points_obtained5" ,"maximum_marks_or_grade_points5","no_of_credits5","pass_fail_absent5","subject_name6","marks_or_grade_points_obtained6","maximum_marks_or_grade_points6", "no_of_credits6","pass_fail_absent","final_result_pass_fail","marks_or_sgpa_
The output I need in JSON is
{
"hallticket_": 22342,
"student_name": "abc",
"gender": "m",
"course_name":" fgd",
"course_code":52,
"college_code ":521,
"caste":"open",
"year":55,
"exam":"s1",
"subject": [ {
"subject_name1":"hh",
"subject_code1":52,
"marks_or_grade_points_obtained1":85,
"maximum_marks_or_grade_points1":50,
"pass_mark1":52,
"no_of_credits1":85,
"pass_fail_absent1":"pass"},]
"subject": [ {
"subject_name2":"hh",
"subject_code2":52,
"marks_or_grade_points_obtained2":85,
"maximum_marks_or_grade_points2":50,
"pass_mark2":52,
"no_of_credits2":85,
"pass_fail_absent2":"pass"},]
"subject": [ {
"subject_name3":"hh",
"subject_code3":52,
"marks_or_grade_points_obtained3":85,
"maximum_marks_or_grade_points3":50,
"pass_mark3":52,
"no_of_credits3":85,
"pass_fail_absent3":"pass"},]
"subject": [ {
"subject_name4":"hh",
"subject_code4":52,
"marks_or_grade_points_obtained4":85,
"maximum_marks_or_grade_points4":50,
"pass_mark4":52,
"no_of_credits4":85,
"pass_fail_absent4":"pass"},]
"subject": [ {
"subject_name5":"hh",
"subject_code5":52,
"marks_or_grade_points_obtained5":85,
"maximum_marks_or_grade_points5":50,
"pass_mark5":52,
"no_of_credits5":85,
"pass_fail_absent5":"pass"},]
"subject": [ {
"subject_name6":"hh",
"subject_code6":52,
"marks_or_grade_points_obtained6":85,
"maximum_marks_or_grade_points6":50,
"pass_mark6":52,
"no_of_credits6":85,
"pass_fail_absent6":"pass"},]
"final_result_pass_fail":"pass",
" marks_or_sgpa_obtained":"8.00",
"maximum_marks_sgpa":"10",
"total_credits":"135"
}
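import csv
import json
# Open the CSV; the with block closes the file when done
with open('data.csv', 'r', newline='') as f:
    reader = csv.DictReader(f)
    # Parse the CSV into JSON: each row becomes one flat object
    out = json.dumps([row for row in reader])
print(out)
Note that this produces one flat object per row; the subject columns are not grouped into a nested array. Hopefully this works as you expect!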
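To group the numbered subject columns into a nested "subject" array, as the question asks, one possible approach is to split each column name into a field name and a trailing number. This is a sketch under stated assumptions: every column ending in digits is treated as a subject field, the numeric suffix is dropped inside each subject object (the desired output in the question keeps it), values stay as strings, and the helper name nest_subjects and the shortened sample CSV are hypothetical.

```python
import csv
import io
import json
import re

# Matches column names such as "subject_name1": a field name plus a trailing number.
SUBJECT_COLUMN = re.compile(r"^(.*?)(\d+)$")

def nest_subjects(row):
    """Move numbered columns into a list of per-subject dicts."""
    record = {}
    subjects = {}
    for column, value in row.items():
        name = column.strip()  # the real headers contain stray spaces
        match = SUBJECT_COLUMN.match(name)
        if match:
            field, number = match.group(1), int(match.group(2))
            subjects.setdefault(number, {})[field] = value
        else:
            record[name] = value
    # Order the subjects by their numeric suffix.
    record["subject"] = [subjects[n] for n in sorted(subjects)]
    return record

# Shortened, hypothetical sample standing in for data.csv.
sample_csv = io.StringIO(
    "student_name,gender,subject_name1,subject_code1,subject_name2,subject_code2\n"
    "abc,m,hh,52,kk,53\n"
)
records = [nest_subjects(row) for row in csv.DictReader(sample_csv)]
print(json.dumps(records, indent=2))
```

Keeping a single "subject" list avoids the repeated "subject" keys in the desired output, which a JSON object cannot represent reliably.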

Write all values in one line csv.DictWriter

I'm having trouble generating a well-formatted CSV file from some data I fetched from the Leadfeeder API. In the CSV file that is currently being created, not all values are in one row: id and type end up one row higher than the rest. Like here:
CSV Output
Later I would also like to load another JSON file, use it to map some values via the id, and then add the visits per lead to my CSV file.
Do you have any advice for this as well?
This is my code so far:
import json
import csv
csv_columns = ['name', 'industry', 'website_url', 'status', 'crm_lead_id', 'crm_organization_id', 'employee_count', 'id', 'type' ]
with open('data.json', 'r') as d:
    d = json.load(d)
csv_file = 'lead_daten.csv'
try:
    with open('leads.csv', 'w', newline='') as csvfile:
        writer = csv.DictWriter(csvfile, fieldnames=csv_columns, extrasaction='ignore')
        writer.writeheader()
        for item in d['data']:
            writer.writerow(item)
            writer.writerow(item['attributes'])
except IOError:
    print("I/O error")
My JSON data has the following structure.
I also need some of the nested values, like the id in relationships!
{
"data": [
{
"attributes": {
"crm_lead_id": null,
"crm_organization_id": null,
"employee_count": 5000,
"facebook_url": null,
"first_visit_date": "2019-01-31",
"industry": "Furniture",
"last_visit_date": "2019-01-31",
"linkedin_url": null,
"name": "Example Inc",
"phone": null,
"status": "new",
"twitter_handle": "example",
"website_url": "http://www.example.com"
},
"id": "s7ybF6VxqhQqVM1m1BCnZT_8SRo9XnuoxSUP5ChvERZS9",
"relationships": {
"location": {
"data": {
"id": "8SRo9XnuoxSUP5ChvERZS9",
"type": "locations"
}
}
},
"type": "leads"
},
{
"attributes": {
"crm_lead_id": null,
When you write to a CSV, you must write one full row at a time. Your current code writes one row with only id and type, and then a different row with the other fields.
The correct way is to first fully build a dictionary containing all the fields and only then write it in one single operation. Code could be:
...
        writer.writeheader()
        for item in d['data']:
            item.update(item["attributes"])
            writer.writerow(item)
...
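The question also asks for nested values such as the id under relationships. The same idea extends to that case: build one flat dictionary per lead, then write it in a single call. This is a minimal sketch using one record from the sample data; the column name location_id and the helper name flatten_lead are hypothetical, and the output goes to an in-memory buffer instead of leads.csv.

```python
import csv
import io

csv_columns = ['name', 'industry', 'website_url', 'status', 'crm_lead_id',
               'crm_organization_id', 'employee_count', 'id', 'type',
               'location_id']  # location_id is an assumed extra column

def flatten_lead(item):
    """Merge attributes, top-level fields, and the nested location id."""
    row = dict(item["attributes"])  # copy, so the input is not mutated
    row["id"] = item["id"]
    row["type"] = item["type"]
    location = item.get("relationships", {}).get("location", {}).get("data") or {}
    row["location_id"] = location.get("id")
    return row

# One record from the sample JSON in the question (null becomes None).
d = {"data": [{
    "attributes": {
        "crm_lead_id": None, "crm_organization_id": None,
        "employee_count": 5000, "industry": "Furniture",
        "name": "Example Inc", "status": "new",
        "website_url": "http://www.example.com",
    },
    "id": "s7ybF6VxqhQqVM1m1BCnZT_8SRo9XnuoxSUP5ChvERZS9",
    "relationships": {"location": {"data": {"id": "8SRo9XnuoxSUP5ChvERZS9", "type": "locations"}}},
    "type": "leads",
}]}

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=csv_columns,
                        extrasaction='ignore', lineterminator="\n")
writer.writeheader()
for item in d["data"]:
    writer.writerow(flatten_lead(item))  # exactly one row per lead
csv_text = buffer.getvalue()
print(csv_text)
```

Copying the attributes into a new dictionary, instead of calling item.update, leaves the loaded JSON unchanged in case it is reused for the later mapping step.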

Reformat non-serializable JSON-ish data into a format suitable for value extraction in Python

With the following simple Python script:
import json
file = 'toy.json'
data = json.loads(file)
print(data['gas']) # example
My data generates the error ...is not JSON serializable.
With this, slightly more sophisticated, Python script:
import json
import sys
#load the data into an element
data = open('transactions000000000029.json', 'r')
#dumps the json object into an element
json_str = json.dumps(data)
#load the json to a string
resp = json.loads(json_str)
#extract an element in the response
print(resp['gas'])
The same.
What I'd like to do is extract all the values of a particular index, so ideally I'd like to render the input like so:
...
"hash": "0xf2b5b8fb173e371cbb427625b0339f6023f8b4ec3701b7a5c691fa9cef9daf63",
"gasUsed": "21000",
"hash": "0xf8f2a397b0f7bb1ff212b6bcc57e4a56ce3e27eb9f5839fef3e193c0252fab26"
"gasUsed": "21000"
...
The data looks like this:
{
"blockNumber": "1941794",
"blockHash": "0x41ee74e34cbf9ef4116febea958dbc260e2da3a6bf6f601bfaeb2cd9ab944a29",
"hash": "0xf2b5b8fb173e371cbb427625b0339f6023f8b4ec3701b7a5c691fa9cef9daf63",
"from": "0x3c0cbb196e3847d40cb4d77d7dd3b386222998d9",
"to": "0x2ba24c66cbff0bda0e3053ea07325479b3ed1393",
"gas": "121000",
"gasUsed": "21000",
"gasPrice": "20000000000",
"input": "",
"logs": [],
"nonce": "14",
"value": "0x24406420d09ce7440000",
"timestamp": "2016-07-24 20:28:11 UTC"
}
{
"blockNumber": "1941716",
"blockHash": "0x75e1602cad967a781f4a2ea9e19c97405fe1acaa8b9ad333fb7288d98f7b49e3",
"hash": "0xf8f2a397b0f7bb1ff212b6bcc57e4a56ce3e27eb9f5839fef3e193c0252fab26",
"from": "0xa0480c6f402b036e33e46f993d9c7b93913e7461",
"to": "0xb2ea1f1f997365d1036dd6f00c51b361e9a3f351",
"gas": "121000",
"gasUsed": "21000",
"gasPrice": "20000000000",
"input": "",
"logs": [],
"nonce": "1",
"value": "0xde0b6b3a7640000",
"timestamp": "2016-07-24 20:12:17 UTC"
}
What would be the best way to achieve that?
I've been thinking that perhaps the best way would be to reformat it as valid json?
Or maybe to just treat it like regex?
Your JSON file is not valid. The data should be a list of dictionaries, with each dictionary separated from the next by a comma, like this:
[
{
"blockNumber":"1941794",
"blockHash": "0x41ee74bf9ef411d9ab944a29",
"hash":"0xf2ef9daf63",
"from":"0x3c0cbb196e3847d40cb4d77d7dd3b386222998d9",
"to":"0x2ba24c66cbff0bda0e3053ea07325479b3ed1393",
"gas":"121000",
"gasUsed":"21000",
"gasPrice":"20000000000",
"input":"",
"logs":[
],
"nonce":"14",
"value":"0x24406420d09ce7440000",
"timestamp":"2016-07-24 20:28:11 UTC"
},
{
"blockNumber":"1941716",
"blockHash":"0x75e1602ca8d98f7b49e3",
"hash":"0xf8f2a397b0f7bb1ff212e193c0252fab26",
"from":"0xa0480c6f402b036e33e46f993d9c7b93913e7461",
"to":"0xb2ea1f1f997365d1036dd6f00c51b361e9a3f351",
"gas":"121000",
"gasUsed":"21000",
"gasPrice":"20000000000",
"input":"",
"logs":[
],
"nonce":"1",
"value":"0xde0b6b3a7640000",
"timestamp":"2016-07-24 20:12:17 UTC"
}
]
Then use this to open the file:
with open('toy.json') as data_file:
    data = json.load(data_file)
You can then render the desired output like:
for item in data:
    print(item['hash'])
    print(item['gasUsed'])
If each block is valid JSON data, you can parse them separately:
import json
data = []
with open('transactions000000000029.json') as inpt:
    lines = []
    for line in inpt:
        if line.startswith('{'):  # block starts
            lines = [line]
        else:
            lines.append(line)
        if line.startswith('}'):  # block ends
            data.append(json.loads(''.join(lines)))
for block in data:
    print("hash: {}".format(block['hash']))
    print("gasUsed: {}".format(block['gasUsed']))
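Another option that does not depend on where the braces fall on a line is json.JSONDecoder.raw_decode, which parses one JSON value starting at a given position and returns the value together with the index where it stopped. The sketch below uses it to walk through concatenated objects. The helper name iter_json_objects is hypothetical, and the sample text is a shortened version of the data in the question.

```python
import json

def iter_json_objects(text):
    """Yield each JSON value from a string of concatenated JSON documents."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        yield obj

# Shortened sample: two objects back to back, with no enclosing list.
sample = '''{
"hash": "0xf2b5b8fb173e371cbb427625b0339f6023f8b4ec3701b7a5c691fa9cef9daf63",
"gasUsed": "21000"
}
{
"hash": "0xf8f2a397b0f7bb1ff212b6bcc57e4a56ce3e27eb9f5839fef3e193c0252fab26",
"gasUsed": "21000"
}
'''

pairs = [(block["hash"], block["gasUsed"]) for block in iter_json_objects(sample)]
for block_hash, gas_used in pairs:
    print("hash: {}".format(block_hash))
    print("gasUsed: {}".format(gas_used))
```

For a real file, the text would come from reading the whole file into a string first, which is fine for moderate sizes but may need a streaming approach for very large dumps.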
