Convert JSON *files* to CSV *files* using Python (IDLE)

This question piggybacks on a question I posted yesterday. I actually got my code to work fine. I was starting small: I switched out the JSON embedded in the Python code for multiple JSON files outside of the code, and got that to work beautifully. And then there was some sort of catastrophe, and my code was lost.
I have spent several hours trying to recreate it, to no avail. I am using arcpy (ArcGIS's Python module), since I will later be using it to perform some spatial analysis, but I don't think you need to know much about arcpy to help me with this part (though it may help).
Here is one version of my latest attempts, but it is not working. I switched out my actual path for just "Pathname." Everything works up until the point where I try to populate the rows in the CSV (which hold latitude and longitude values; the latitude/longitude headers are written to the CSV files successfully). So apparently whatever is below dict_writer.writerows(openJSONfile) is not working:
import json, csv, arcpy
from arcpy import env

arcpy.env.workspace = r"C:\GIS\1GIS_DATA\Pathname"
workspaces = arcpy.ListWorkspaces("*", "Folder")
for workspace in workspaces:
    arcpy.env.workspace = workspace
    JSONfiles = arcpy.ListFiles("*.json")
    for JSONfile in JSONfiles:
        descJSONfile = arcpy.Describe(JSONfile)
        JSONfileName = descJSONfile.baseName
        openJSONfile = open(JSONfile, "wb+")
        print "JSON file is open"
        fieldnames = ['longitude', 'latitude']
        with open(JSONfileName + "test.csv", "wb+") as f:
            dict_writer = csv.DictWriter(f, fieldnames=fieldnames)
            dict_writer.writerow(dict(zip(fieldnames, fieldnames)))
            dict_writer.writerows(openJSONfile)

#Do I have to open the CSV files? Aren't they already open?
#openCSVfile = open(CSVfile, "r+")
for row in openJSONfile:
    f.writerow([row['longitude'], row['latitude']])
Any help is greatly appreciated!!

You're not actually loading the JSON file.
You're trying to write rows from an open file instead of writing rows from json.
You will need to add something like this:
rows = json.load(openJSONfile)
and later:
dict_writer.writerows(rows)
The last two lines you have should be removed: all the csv writing is done before you reach them, and they are outside of the loop, so they would only apply to the last file anyway (and they don't write anything, since there are no lines left in the file at that point).
Also, I see you're using with open... to open the csv file, but not the json file.
You should always use it rather than using open() without the with statement.

You should use a csv.DictWriter object to do everything. Here's something similar to your code, with all the Arc stuff removed (because I don't have it), that worked when I tested it:
import json, csv

JSONfiles = ['sample.json']
for JSONfile in JSONfiles:
    with open(JSONfile, "rb") as openJSONfile:
        rows = json.load(openJSONfile)
    fieldnames = ['longitude', 'latitude']
    with open(JSONfile + "test.csv", "wb") as f:
        dict_writer = csv.DictWriter(f, fieldnames=fieldnames)
        dict_writer.writeheader()
        dict_writer.writerows(rows)
It was unnecessary to write out each row individually, because your json file was a list of row dictionaries (assuming it was what you had embedded in your linked question).
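For reference, the assumed input format is a single JSON list of flat dictionaries, along these lines (the coordinate values below are made up purely for illustration):

[
    {"longitude": -122.42, "latitude": 37.77},
    {"longitude": -118.24, "latitude": 34.05}
]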

I can't say I know for sure what was wrong, but putting all of the .json files in the same folder as my code (and changing my code appropriately) works. I will have to keep investigating why reading from other folders gives me the error:
IOError: [Errno 2] No such file or directory:
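A likely culprit, for what it's worth: arcpy.ListFiles returns bare file names relative to the arcpy workspace, while Python's open resolves relative paths against the current working directory. Joining the workspace path onto each name should make the reads work from any folder; a rough sketch (untested, Python 2 style to match the code below):

import os

for JSONfile in JSONfiles:
    # build an absolute path instead of relying on the working directory
    JSONpath = os.path.join(arcpy.env.workspace, JSONfile)
    with open(JSONpath, "rb") as openJSONfile:
        rows = json.load(openJSONfile)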
For now, the following code DOES work :)
import json, csv, arcpy, os
from arcpy import env

arcpy.env.workspace = r"C:\GIS\1GIS_DATA\MyFolder"
JSONfiles = arcpy.ListFiles("*.json")
print JSONfiles
for JSONfile in JSONfiles:
    print "Current JSON file is: " + JSONfile
    descJSONfile = arcpy.Describe(JSONfile)
    JSONfileName = descJSONfile.baseName
    with open(JSONfile, "rb") as openJSONfile:
        rows = json.load(openJSONfile)
    print "JSON file is loaded"
    fieldnames = ['longitude', 'latitude']
    with open(JSONfileName + "test.csv", "wb") as f:
        dict_writer = csv.DictWriter(f, fieldnames=fieldnames)
        dict_writer.writerow(dict(zip(fieldnames, fieldnames)))
        dict_writer.writerows(rows)
    print "CSVs are populated with headers and rows from JSON file.", '\n'
Thanks everyone for your help.

Related

How do I read a csv file with Pythonista on an iPad?

I am fairly new to Python and I am trying to learn how to read and write csv files. I am programming on my iPad using Pythonista, and I've encountered a problem I can't seem to solve. I want to read a csv file whose directory I don't know, because of the limited iOS file management; the csv file is in the same folder as my Python file.
I've found on Google that I can find the absolute directory by using the following code:
import os
print(os.path.abspath("google_stock_data.csv"))
Which spits out:
/private/var/mobile/Library/Mobile Documents/iCloud~com~omz-software~Pythonista3/Documents/google_stock_data.csv
Alright now on to my problem:
import csv

path = "/private/var/mobile/Library/Mobile Documents/iCloud~com~omz-software~Pythonista3/Documents/google_stock_data.csv"
file = open(path, newline='')
reader = csv.reader(file)
header = next(reader)
data = [row for row in reader]
print(header)
print(data[0])
The above code gives me the error:
FileNotFoundError: [Errno 2] No such file or directory: '/private/var/mobile/Library/Mobile Documents/iCloud~com~omz-software~Pythonista3/Documents/google_stock_data.csv'
I know that the file exists and the directory should be correct since I’ve also tried finding it with pathlib and it turned out to be the same.
So what seems to cause the problem?
Try using the with open() (read) syntax. I have something very similar and this works for me. Your path is correct.
with open(path, 'r', encoding='utf-8') as reader:
    reader = csv.DictReader(reader)
    for row in reader:
        # ...do stuff
        pass
The problem lies in how I named my file. The csv file that I wanted to open was named "google_stock_data.csv". Note that this is the base filename and does not contain its file suffix at the end (which would be another ".csv").
If you want to use file = open(...), you also have to add the file suffix at the end of the filename.
This means that, in my case, it should look like this:
file = open('google_stock_data.csv.csv', newline='')
Finding the absolute path with print(os.path.abspath("enter_file_name")) is not needed if you have the file in the folder where your code is. If you do use it, for whatever reason, don't forget to add the file suffix at the end.
As for how to output anything from the file, both my code and
with open(path, 'r', encoding='utf-8') as reader:
    reader = csv.DictReader(reader)
    for row in reader:
        # ...do stuff
        pass
from @Michael M. work perfectly fine if you add .csv at the end of where you declare path.
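A quick way to catch this kind of naming surprise is to print the directory's actual file names, double suffixes included; a small sketch:

from pathlib import Path

# repr() makes trailing characters and double suffixes visible
for p in Path('.').iterdir():
    print(repr(p.name))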

Python: export to a formatted TXT file or to the clipboard

I have a csv file that contains something like BM13302, EM13203, etc.
I have to read this from a file and then reformat it to something like 'BM13302', 'EM13203', etc.
What I'm having problems with is how to export it (write it to either the clipboard or a file I can cut and paste from). This is a tiny little project for reformatting part of some SQL code that's given to me in an unclean format, and I have to spend a little while cleaning it up. I would like to just point Python at a directory, paste the list into a file, and have it export everything the way I need it.
I have the following code working
import csv

f = open(r"/User/person/Desktop/folder/file.csv")
csv_f = csv.reader(f)
for row in csv_f:
    print(row)
I get the expected results
I would like to find out how to take the list(?) and format it like this:
'BM1234', 'BM2351', '20394',....etc
and copy that to the clipboard
I thought about doing something like
with open('/Users/person/Desktop/csv/export.txt') as f:
    f.write("open=", + "', '")
f.close()
Nothing is printed, and I can't find an example of what I'm needing. Anyone able to help me out?? Much appreciated!
You can have the csv module quote things for you. As far as I know there is no clipboard support in the Python standard library, but there are various mechanisms out there. Here I'm using pyperclip, which is reasonable for text-only copies.
import pyperclip
import csv
import io

def clip_csv(filename):
    outbuf = io.StringIO()
    with open(filename, newline='') as infile:  # use the argument, not a hardcoded name
        incsv = csv.reader(infile, skipinitialspace=True)
        outcsv = csv.writer(outbuf, quotechar="'", quoting=csv.QUOTE_ALL)
        outcsv.writerows(incsv)
    pyperclip.copy(outbuf.getvalue())

clip_csv('file.csv')
# DEBUG: Verify by printing clipboard
print(pyperclip.paste())
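(Note that pyperclip is a third-party package, not part of the standard library; install it with pip install pyperclip.)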
I'm not sure, but I think you're trying to add the quote char ' to all the data in the csv:
import csv

with open('export.csv', 'w', newline='') as f:
    # use quote char `'` for all data
    writer = csv.writer(f, quotechar="'", quoting=csv.QUOTE_ALL)
    writer.writerow(["BM1234", "BM2351", "20394"])
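With those settings, export.csv ends up containing the single line:

'BM1234','BM2351','20394'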

Python - CSV file empty after rewriting using csv module

I'm attempting to rewrite specific cells in a csv file using Python.
However, whenever I try to modify an aspect of the csv file, the csv file ends up being emptied (the file contents become blank).
Minimal code example:
import csv

ReadFile = open("./Resources/File.csv", "rt", encoding="utf-8-sig")
Reader = csv.reader(ReadFile)
WriteFile = open("./Resources/File.csv", "wt", encoding="utf-8-sig")
Writer = csv.writer(WriteFile)

for row in Reader:
    row[3] = 4
    Writer.writerow(row)

ReadFile.close()
WriteFile.close()
'File.csv' looks like this:
1,2,3,FOUR,5
1,2,3,FOUR,5
1,2,3,FOUR,5
1,2,3,FOUR,5
1,2,3,FOUR,5
In this example, I'm attempting to change 'FOUR' to '4'.
Upon running this code, the csv file becomes empty instead.
So far, the only other question related to this that I've managed to find is this one, which does not seem to be dealing with rewriting specific cells in a csv file but instead deals with writing new rows to a csv file.
I'd be very grateful for any help anyone reading this could provide.
The following should work:
import csv

with open("./Resources/File.csv", "rt", encoding="utf-8-sig") as ReadFile:
    lines = list(csv.reader(ReadFile))

with open("./Resources/File.csv", "wt", encoding="utf-8-sig") as WriteFile:
    Writer = csv.writer(WriteFile)
    for line in lines:
        line[3] = 4
        Writer.writerow(line)
When you open a file with the w option, its contents are deleted and the file is written anew. At the point when you start reading, the file is therefore already empty.
Try writing to another file (like FileTemp.csv) and at the end of the program renaming FileTemp.csv to File.csv.
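A minimal sketch of that temp-file approach, assuming the same file layout as above (os.replace swaps the files in one step):

import csv
import os

# write the modified rows to a temporary file first
with open("./Resources/File.csv", "rt", encoding="utf-8-sig") as ReadFile, \
     open("./Resources/FileTemp.csv", "wt", encoding="utf-8-sig", newline="") as WriteFile:
    Writer = csv.writer(WriteFile)
    for row in csv.reader(ReadFile):
        row[3] = 4
        Writer.writerow(row)

# replace the original only after the rewrite succeeded
os.replace("./Resources/FileTemp.csv", "./Resources/File.csv")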

Python subprocess can't find the output of csv writer

I'm ripping some data from Mongo, sanitizing it via Python, and writing it to a text file to import into Vertica. Vertica can't parse the Python-written gzip (no idea why), so I'm trying to write the data to a csv and use bash to gzip the file instead.
csv_filename = '/home/deploy/tablecopy/{0}.csv'.format(vertica_table)
with open(csv_filename, 'wb') as csv_file:
    csv_writer = csv.writer(csv_file, delimiter=',')
    for replacement in mongo_object.find():
        replacement_id = clean_value(replacement, "_id")
        csv_writer.writerow([replacement_id, booking_id, style, added_ts])

subprocess.call(['gzip', 'file', csv_filename])
When I run this code, I get "gzip: file: No such file or directory," despite the fact that 1) the file is getting created immediately beforehand and 2) there's already a copy of the csv in the directory prior to the run, since this is a script that gets run repeatedly.
These points make me think that python is tying up the file somehow and bash can't see/access it. Any ideas on how to get this conversion to run?
Thanks
Just pass csv_filename. gzip is looking for a file literally called "file", which does not exist, so it errors out; it never gets to the csv_filename file:
subprocess.call(['gzip', csv_filename])
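As a side note (my suggestion, not part of the original answer), subprocess.check_call raises an exception when gzip exits with an error, so path mistakes like this surface immediately instead of just printing to stderr:

import subprocess

# raises CalledProcessError if gzip exits non-zero
subprocess.check_call(['gzip', csv_filename])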
There is no file argument for gzip; you simply need to pass the filename.
You've already got the correct answer to your problem... but alternatively, you can use the gzip module to compress as you write, so there is no need to call the gzip program at all. This example assumes you use Python 3.x and that you just have ASCII text.
import gzip
import csv

csv_filename = '/home/deploy/tablecopy/{0}.csv'.format(vertica_table)
with gzip.open(csv_filename + '.gz', 'wt', encoding='ascii', newline='') as csv_file:
    csv_writer = csv.writer(csv_file, delimiter=',')
    for replacement in mongo_object.find():
        replacement_id = clean_value(replacement, "_id")
        csv_writer.writerow([replacement_id, booking_id, style, added_ts])

Python error: need more than one value to unpack

Ok, so I'm learning Python, but for my studies I have to do rather complicated stuff already. I'm trying to run a script to analyse data in Excel files. This is how it looks:
#!/usr/bin/python
import sys

#lots of functions, not relevant

resultsdir = "/home/blah"
filename1 = sys.argv[1]
filename2 = sys.argv[2]
out = open(sys.argv[3], "w")
#filename1, filename2 = "CNVB_reads.403476", "CNVB_reads.403447"
file1 = open(resultsdir + "/" + filename1 + ".csv")
file2 = open(resultsdir + "/" + filename2 + ".csv")

for line in file1:
    start.p, end.p, type, nexons, start, end, cnvlength, chromosome, id, BF, rest = line.split("\t", 10)
    CNVs1[chr].append([int(start), int(end), float(BF)])
for line in file2:
    start.p, end.p, type, nexons, start, end, cnvlength, chromosome, id, BF, rest = line.split("\t", 10)
    CNVs2[chr].append([int(start), int(end), float(BF)])
These are the titles of the columns of the data in the Excel files, and I want to split them; I'm not even sure if that is necessary when using data from Excel files.
#more irrelevant stuff
out.write(filename1+","+filename2+","+str(chromosome)+","+str(type)+","+str(shared)+"\n")
This is what it should write in my output, 'shared' is what I have calculated, the rest is already in the files.
Ok, now my question, finally. When I call the script in my shell like this:
python script.py CNVB_reads.403476 CNVB_reads.403447 script.csv
I get the following error message:
start.p,end.p,type,nexons,start,end,cnvlength,chromosome,id,BF,rest=line.split("\t",10)
ValueError: need more than 1 value to unpack
I have no idea what is meant by that in relation to the data... Any ideas?
The line.split('\t', 10) call did not return eleven elements. Perhaps the line is empty?
You probably want to use the csv module instead to parse these files.
import csv
import os

for filename, target in ((filename1, CNVs1), (filename2, CNVs2)):
    with open(os.path.join(resultsdir, filename + ".csv"), 'rb') as csvfile:
        reader = csv.reader(csvfile, delimiter='\t')
        for row in reader:
            # per your field order: start and end are the 5th and 6th columns, BF the 10th
            start, end = row[4], row[5]
            BF = float(row[9])
            target[chr].append([int(start), int(end), BF])
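Since the original ValueError typically means some line produced fewer fields than expected (a blank trailing line, for instance), a guard before indexing avoids the crash; a sketch that slots into the loop above:

for row in reader:
    if len(row) < 10:
        # skip blank or malformed lines instead of crashing
        continue
    start, end, BF = row[4], row[5], float(row[9])
    target[chr].append([int(start), int(end), BF])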
