Python subprocess can't find the output of csv writer


I'm ripping some data from Mongo, sanitizing it via Python, and writing it to a text file to import into Vertica. Vertica can't parse the Python-written gzip (no idea why), so I'm trying to write the data to a csv and use bash to gzip the file instead.
csv_filename = '/home/deploy/tablecopy/{0}.csv'.format(vertica_table)
with open(csv_filename, 'wb') as csv_file:
    csv_writer = csv.writer(csv_file, delimiter=',')
    for replacement in mongo_object.find():
        replacement_id = clean_value(replacement, "_id")
        csv_writer.writerow([replacement_id, booking_id, style, added_ts])
subprocess.call(['gzip', 'file', csv_filename])
When I run this code, I get "gzip: file: No such file or directory," despite the fact that 1) the file is getting created immediately beforehand and 2) there's already a copy of the csv in the directory prior to the run, since this is a script that gets run repeatedly.
These points make me think that python is tying up the file somehow and bash can't see/access it. Any ideas on how to get this conversion to run?
Thanks

Just pass csv_filename. gzip is looking for a file literally named "file", which does not exist, so the error refers to that argument and not to the CSV file:
subprocess.call(['gzip', csv_filename])
gzip has no "file" argument; you simply pass the filename.
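A minimal sketch of the corrected call, assuming Python 3.5+ and a gzip binary on the PATH. The helper name gzip_file is hypothetical and not part of the answer above. check=True turns a non-zero exit status into an exception, and the -f flag overwrites a stale .gz left behind by an earlier run, which may matter for a script that runs repeatedly.

```python
import subprocess


def gzip_file(path):
    """Compress `path` in place with the external gzip program.

    Hypothetical helper for illustration. Returns the name of the
    compressed file that gzip creates next to the original.
    """
    # Each list element is passed to gzip as one argument, so the
    # filename is the only positional argument needed.
    # -f overwrites an existing path + '.gz' instead of refusing.
    subprocess.run(['gzip', '-f', path], check=True)
    return path + '.gz'
```

By default gzip removes the original file once the compressed copy has been written, so only the .gz file remains afterwards.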

You've already got the correct answer to your problem, but alternatively you can use the gzip module to compress as you write, so there is no need to call the gzip program at all. This example assumes you use Python 3.x and that you only have ASCII text.
import csv
import gzip
csv_filename = '/home/deploy/tablecopy/{0}.csv'.format(vertica_table)
with gzip.open(csv_filename + '.gz', 'wt', encoding='ascii', newline='') as csv_file:
    csv_writer = csv.writer(csv_file, delimiter=',')
    for replacement in mongo_object.find():
        replacement_id = clean_value(replacement, "_id")
        csv_writer.writerow([replacement_id, booking_id, style, added_ts])
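One way to check that a file written this way is readable is to read it back with the same module. The sketch below assumes Python 3; the function names write_rows_gz and read_rows_gz are hypothetical, and UTF-8 is used instead of ASCII as an assumption so that non-ASCII values would not raise an encoding error.

```python
import csv
import gzip


def write_rows_gz(path, rows):
    """Write `rows` (an iterable of lists) as gzip-compressed CSV."""
    # 'wt' opens the compressed stream in text mode; newline='' lets
    # the csv module control the line endings itself.
    with gzip.open(path, 'wt', encoding='utf-8', newline='') as f:
        csv.writer(f).writerows(rows)


def read_rows_gz(path):
    """Read a gzip-compressed CSV back into a list of rows."""
    with gzip.open(path, 'rt', encoding='utf-8', newline='') as f:
        return list(csv.reader(f))
```

Every value comes back as a string, because CSV carries no type information.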

Related

Why does my code add newlines into my csv file? How can I get rid of them? [duplicate]

import csv
with open('thefile.csv', 'rb') as f:
    data = list(csv.reader(f))
import collections
counter = collections.defaultdict(int)
for row in data:
    counter[row[10]] += 1
with open('/pythonwork/thefile_subset11.csv', 'w') as outfile:
    writer = csv.writer(outfile)
    for row in data:
        if counter[row[10]] >= 504:
            writer.writerow(row)
This code reads thefile.csv, makes changes, and writes the results to thefile_subset11.csv.
However, when I open the resulting csv in Microsoft Excel, there is an extra blank line after each record!
Is there a way to make it not put an extra blank line?

A csv.writer object controls line endings itself and writes \r\n into the file directly. In Python 3 the file must be opened in untranslated text mode with the parameters 'w', newline='' (empty string), or it will write \r\r\n on Windows, where the default text mode translates each \n into \r\n.
#!python3
with open('/pythonwork/thefile_subset11.csv', 'w', newline='') as outfile:
    writer = csv.writer(outfile)
In Python 2, use binary mode to open outfile with mode 'wb' instead of 'w' to prevent Windows newline translation. Python 2 also has problems with Unicode and requires other workarounds to write non-ASCII text. See the Python 2 link below and the UnicodeReader and UnicodeWriter examples at the end of the page if you have to deal with writing Unicode strings to CSVs on Python 2, or look into the 3rd party unicodecsv module:
#!python2
with open('/pythonwork/thefile_subset11.csv', 'wb') as outfile:
    writer = csv.writer(outfile)
Documentation Links
https://docs.python.org/3/library/csv.html#csv.writer
https://docs.python.org/2/library/csv.html#csv.writer
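The doubled carriage return can be reproduced on any platform. The sketch below assumes Python 3; passing newline='\r\n' forces the same \n to \r\n translation that Windows text mode applies by default, and the helper name written_bytes is hypothetical.

```python
import csv
import os
import tempfile


def written_bytes(newline):
    """Write one CSV row with the given `newline` setting and return the raw bytes."""
    fd, path = tempfile.mkstemp(suffix='.csv')
    os.close(fd)
    try:
        with open(path, 'w', newline=newline) as f:
            csv.writer(f).writerow(['a', 'b'])
        with open(path, 'rb') as f:
            return f.read()
    finally:
        os.remove(path)


correct = written_bytes('')      # no translation: the writer's \r\n is kept as-is
doubled = written_bytes('\r\n')  # every \n is translated to \r\n on write
print(correct)  # b'a,b\r\n'
print(doubled)  # b'a,b\r\r\n'
```

Excel shows the stray \r in the second result as an empty row after each record.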

Opening the file in binary mode "wb" will not work in Python 3+. Or rather, you'd have to convert your data to binary before writing it. That's just a hassle.
Instead, you should keep it in text mode, but override the newline as empty. Like so:
with open('/pythonwork/thefile_subset11.csv', 'w', newline='') as outfile:

Note: It seems this is not the preferred solution, because of how the extra line was being added on a Windows system. As stated in the Python documentation:
If csvfile is a file object, it must be opened with the ‘b’ flag on platforms where that makes a difference.
Windows is one such platform. While changing the line terminator as I describe below may have fixed the problem, the problem can be avoided altogether by opening the file in binary mode. One might say this solution is more "elegant". "Fiddling" with the line terminator would likely have resulted in code that is not portable between systems, whereas opening a file in binary mode has no effect on a Unix system, i.e. it results in cross-system compatible code.
From the Python docs:
On Windows, 'b' appended to the mode opens the file in binary mode, so there are also modes like 'rb', 'wb', and 'r+b'. Python on Windows makes a distinction between text and binary files; the end-of-line characters in text files are automatically altered slightly when data is read or written. This behind-the-scenes modification to file data is fine for ASCII text files, but it’ll corrupt binary data like that in JPEG or EXE files. Be very careful to use binary mode when reading and writing such files. On Unix, it doesn’t hurt to append a 'b' to the mode, so you can use it platform-independently for all binary files.
Original:
As part of the optional parameters for csv.writer, if you are getting extra blank lines you may have to change the lineterminator. The example below is adapted from the csv page of the Python docs. Change it from '\n' to whatever it should be. As this is just a stab in the dark at the problem, it may or may not work, but it's my best guess.
>>> import csv
>>> spamWriter = csv.writer(open('eggs.csv', 'w'), lineterminator='\n')
>>> spamWriter.writerow(['Spam'] * 5 + ['Baked Beans'])
>>> spamWriter.writerow(['Spam', 'Lovely Spam', 'Wonderful Spam'])

The simple answer is that csv files should always be opened in binary mode whether for input or output, as otherwise on Windows there are problems with the line ending. Specifically on output the csv module will write \r\n (the standard CSV row terminator) and then (in text mode) the runtime will replace the \n by \r\n (the Windows standard line terminator) giving a result of \r\r\n.
Fiddling with the lineterminator is NOT the solution.

A lot of the other answers have become out of date in the ten years since the original question. For Python3, the answer is right in the documentation:
If csvfile is a file object, it should be opened with newline=''
The footnote explains in more detail:
If newline='' is not specified, newlines embedded inside quoted fields will not be interpreted correctly, and on platforms that use \r\n linendings on write an extra \r will be added. It should always be safe to specify newline='', since the csv module does its own (universal) newline handling.

Use the function defined below to write data to the CSV file.
open('outputFile.csv', 'a', newline='')
Just add an additional newline='' parameter to the open call:
import csv
def writePhoneSpecsToCSV():
    rowData = ["field1", "field2"]
    with open('outputFile.csv', 'a', newline='') as csv_file:
        writer = csv.writer(csv_file)
        writer.writerow(rowData)
This writes CSV rows without creating additional blank rows!

I'm writing this answer with respect to Python 3, as I initially had the same problem.
I was supposed to get data from an Arduino using PySerial and write it to a .csv file. Each reading in my case ended with '\r\n', so a newline was always separating each line.
In my case, the newline='' option didn't work, because it showed an error like:
with open('op.csv', 'a',newline=' ') as csv_file:
ValueError: illegal newline value: ''
So it seemed that this newline value was not accepted here.
Following one of the other answers here, I set the line terminator in the writer object instead, like this:
writer = csv.writer(csv_file, delimiter=' ',lineterminator='\r')
and that worked for me for skipping the extra newlines.
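The call quoted in the traceback above passes newline=' ' (a single space) rather than an empty string, which may be what triggered the error. A small check, assuming Python 3, of which values open() accepts; the helper name newline_is_accepted is hypothetical.

```python
import os
import tempfile


def newline_is_accepted(value):
    """Return True if open() accepts `value` for its newline argument."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, 'a', newline=value):
            return True
    except ValueError:
        # open() rejects anything other than None, '', '\n', '\r', '\r\n'.
        return False
    finally:
        os.remove(path)


print(newline_is_accepted(''))   # empty string
print(newline_is_accepted(' '))  # single space
```

The empty string is accepted and the single space is rejected, so newline='' itself should still work for this case.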

with open(destPath+'\\'+csvXML, 'a+') as csvFile:
    writer = csv.writer(csvFile, delimiter=';', lineterminator='\r')
    writer.writerows(xmlList)
Setting lineterminator='\r' moves to the next row without leaving an empty row between the two.

Borrowing from this answer, it seems like the cleanest solution is to use io.TextIOWrapper. I managed to solve this problem for myself as follows:
from io import TextIOWrapper
...
with open(filename, 'wb') as csvfile, TextIOWrapper(csvfile, encoding='utf-8', newline='') as wrapper:
    csvwriter = csv.writer(wrapper)
    for data_row in data:
        csvwriter.writerow(data_row)
The above answer is not compatible with Python 2. For compatibility, I suppose one would simply need to wrap all the writing logic in an if block:
if sys.version_info < (3,):
    # Python 2 way of handling CSVs
    pass
else:
    # The above logic
    pass
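The same wrapping can be tried against an in-memory byte stream, which makes the exact bytes easy to inspect. This is a sketch assuming Python 3, with io.BytesIO standing in for the file opened in 'wb' mode; the function name rows_to_bytes is hypothetical.

```python
import csv
import io


def rows_to_bytes(rows):
    """Encode `rows` as UTF-8 CSV bytes through a TextIOWrapper."""
    buffer = io.BytesIO()
    # newline='' stops the wrapper from translating the writer's \r\n.
    wrapper = io.TextIOWrapper(buffer, encoding='utf-8', newline='')
    csv.writer(wrapper).writerows(rows)
    wrapper.flush()  # push buffered text down into the byte stream
    return buffer.getvalue()


print(rows_to_bytes([['1', 'caf\u00e9']]))  # b'1,caf\xc3\xa9\r\n'
```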

I used writerow
import csv
from itertools import permutations

def write_csv(writer, var1, var2, var3, var4):
    """
    write four variables into a csv file
    """
    writer.writerow([var1, var2, var3, var4])

numbers=set([1,2,3,4,5,6,7,2,4,6,8,10,12,14,16])
rules = list(permutations(numbers, 4))
#print(rules)
selection=[]
with open("count.csv", 'w',newline='') as csvfile:
    writer = csv.writer(csvfile)
    for rule in rules:
        number1,number2,number3,number4=rule
        if ((number1+number2+number3+number4)%5==0):
            #print(rule)
            selection.append(rule)
            write_csv(writer,number1,number2,number3,number4)

When using Python 3, the empty lines can be avoided by using the codecs module. As stated in the documentation, files are opened in binary mode, so no change of the newline kwarg is necessary. I was running into the same issue recently and that worked for me:
with codecs.open(csv_file, mode='w', encoding='utf-8') as out_csv:
    # fieldnames is required by DictWriter; it is assumed to be a list of column names defined elsewhere
    csv_out_file = csv.DictWriter(out_csv, fieldnames=fieldnames)

Reading and Writing into CSV file at the same time

I wanted to read some input from the csv file and then modify the input and replace it with the new value. For this purpose, I first read the value but then I'm stuck at this point as I want to modify all the values present in the file.
So is it possible to open the file in r mode in one for loop and then immediately in w mode in another loop to enter the modified data?
If there is a simpler way to do this please help me out
Thank you.

Yes, you can open the same file in different modes in the same program. Just be sure not to do it at the same time. For example, this is perfectly valid:
with open("data.csv") as f:
    # read data into a data structure (list, dictionary, etc.)
    # process lines here if you can do it line by line
    data = f.readlines()
# process data here as needed (replacing your values etc.)
# now open the same filename again for writing
# the main thing is that the file has been previously closed
# (after the previous `with` block finishes, python will auto close the file)
with open("data.csv", "w") as f:
    # write to f here
    f.writelines(data)
As others have pointed out in the comments, reading and writing on the same file handle at the same time is generally a bad idea and won't work as you expect (except in some very specific use cases).
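The read-then-write pattern can be sketched with the csv module as follows, assuming Python 3 and a file small enough to hold in memory. The function name rewrite_csv and the transform callback are hypothetical.

```python
import csv


def rewrite_csv(path, transform):
    """Read every row of `path`, apply `transform` to each, and write them back."""
    with open(path, newline='') as f:
        rows = [transform(row) for row in csv.reader(f)]
    # The read handle is closed at this point, so reopening the same
    # path for writing (which truncates it) is safe.
    with open(path, 'w', newline='') as f:
        csv.writer(f).writerows(rows)
```

For example, rewrite_csv("data.csv", lambda row: [cell.strip() for cell in row]) would strip whitespace from every cell.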

You can do open("data.csv", "r+"), which allows you to read and write through the same file handle. Note that "rw" is not a valid mode string in Python 3 and raises a ValueError.

As others have mentioned, using the same file as both input and output without any backup is a terrible idea, especially for a condensed format like most .csv files, which are normally more complicated than a plain .txt file. But if you insist, you might try the following:
import csv
file_path = 'some.csv'
with open(file_path, 'rw', newline='') as csvfile:
    read_file = csv.reader(csvfile)
    write_file = csv.writer(csvfile)
Note that the code above will trigger an error with the message ValueError: must have exactly one of create/read/write/append mode.
For safety, I prefer to split it into two different files:
import csv
in_path = 'some.csv'
out_path = 'Out.csv'
with open(in_path, 'r', newline='') as inputFile, open(out_path, 'w', newline='') as writerFile:
    read_file = csv.reader(inputFile)
    write_file = csv.writer(writerFile, delimiter=' ', quotechar='|', quoting=csv.QUOTE_MINIMAL)
    for row in read_file:
        # your modifying input data code here
        ...
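If the result should end up under the original filename, one possible extension of the two-file idea is to write to a temporary file in the same directory and swap it into place only after the write has finished. This is a sketch assuming Python 3.3+; the function name rewrite_csv_safely and the transform callback are hypothetical, and this technique is not taken from the answer above.

```python
import csv
import os
import tempfile


def rewrite_csv_safely(path, transform):
    """Rewrite `path` row by row via a temporary file in the same directory."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(suffix='.csv', dir=directory)
    try:
        # os.fdopen takes ownership of fd, so it is closed with the block.
        with os.fdopen(fd, 'w', newline='') as dst, \
                open(path, newline='') as src:
            writer = csv.writer(dst)
            for row in csv.reader(src):
                writer.writerow(transform(row))
        # Both files are closed here; replace the original in one step.
        os.replace(tmp_path, path)
    except BaseException:
        # Leave the original untouched and clean up the partial copy.
        os.remove(tmp_path)
        raise
```

If transform raises partway through, the original file is left as it was.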

Convert JSON files to CSV files using Python (Idle)

This question piggybacks a question I had posted yesterday. I actually got my code to work fine. I was starting small. I switched out the JSON in the Python code for multiple JSON files outside of the Python code. I actually got that to work beautifully. And then there was some sort of catastrophe, and my code was lost.
I have spent several hours trying to recreate it to no avail. I am actually using arcpy (ArcGIS's Python module) since I will later on be using it to perform some spatial analysis, but I don't think you need to know much about arcpy to help me out with this part (I don't think, but it may help).
Here is one version of my latest attempts, but it is not working. I switched out my actual path to just "Pathname." I actually have everything working up until the point when I try to populate the rows in the CSV, which hold latitude and longitude values. (It is successfully writing the latitude/longitude headers in the CSV files.) So apparently whatever is below dict_writer.writerows(openJSONfile) is not working:
import json, csv, arcpy
from arcpy import env
arcpy.env.workspace = r"C:\GIS\1GIS_DATA\Pathname"
workspaces = arcpy.ListWorkspaces("*", "Folder")
for workspace in workspaces:
    arcpy.env.workspace = workspace
    JSONfiles = arcpy.ListFiles("*.json")
    for JSONfile in JSONfiles:
        descJSONfile = arcpy.Describe(JSONfile)
        JSONfileName = descJSONfile.baseName
        openJSONfile = open(JSONfile, "wb+")
        print "JSON file is open"
        fieldnames = ['longitude', 'latitude']
        with open(JSONfileName+"test.csv", "wb+") as f:
            dict_writer = csv.DictWriter(f, fieldnames=fieldnames)
            dict_writer.writerow(dict(zip(fieldnames, fieldnames)))
            dict_writer.writerows(openJSONfile)
#Do I have to open the CSV files? Aren't they already open?
#openCSVfile = open(CSVfile, "r+")
for row in openJSONfile:
    f.writerow( [row['longitude'], row['latitude']] )
Any help is greatly appreciated!!

You're not actually loading the JSON file.
You're trying to write rows from an open file instead of writing rows from json.
You will need to add something like this:
rows = json.load(openJSONfile)
and later:
dict_writer.writerows(rows)
The last two lines you have should be removed, since all the CSV writing is done before you reach them. They are also outside of the loop, so they would only run for the last file anyway (and they would not write anything, since there are no lines left in the file at that point).
Also, I see you're using with open... to open the CSV file, but not the JSON file.
You should always use the with statement rather than calling open() on its own.

You should use a csv.DictWriter object to do everything. Here's something similar to your code with all the Arc stuff removed because I don't have it, that worked when I tested it:
import json, csv
JSONfiles = ['sample.json']
for JSONfile in JSONfiles:
    with open(JSONfile, "rb") as openJSONfile:
        rows = json.load(openJSONfile)
    fieldnames = ['longitude', 'latitude']
    with open(JSONfile+"test.csv", "wb") as f:
        dict_writer = csv.DictWriter(f, fieldnames=fieldnames)
        dict_writer.writeheader()
        dict_writer.writerows(rows)
It was unnecessary to write out each row because your json file was a list of row dictionaries (assuming it was what you had embedded in your linked question).
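For readers on Python 3, the same conversion might look like the sketch below. It assumes each JSON file holds a list of dictionaries, as described above; the function name json_to_csv is hypothetical, and extrasaction='ignore' is an added assumption so that keys other than the two field names are skipped instead of raising an error.

```python
import csv
import json


def json_to_csv(json_path, csv_path, fieldnames=('longitude', 'latitude')):
    """Convert a JSON list of row dictionaries into a CSV file."""
    with open(json_path, encoding='utf-8') as f:
        rows = json.load(f)  # assumed to be a list of dicts
    # Text mode with newline='' replaces the Python 2 'wb' mode.
    with open(csv_path, 'w', newline='', encoding='utf-8') as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames, extrasaction='ignore')
        writer.writeheader()
        writer.writerows(rows)
```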

I can't say I know for sure what was wrong, but putting all of the .JSON files in the same folder as my code (and changing my code appropriately) works. I will have to keep investigating why, when trying to read into other folders, it gives me the error:
IOError: [Errno 2] No such file or directory:
For now, the following code DOES work :)
import json, csv, arcpy, os
from arcpy import env
arcpy.env.workspace = r"C:\GIS\1GIS_DATA\MyFolder"
JSONfiles = arcpy.ListFiles("*.json")
print JSONfiles
for JSONfile in JSONfiles:
    print "Current JSON file is: " + JSONfile
    descJSONfile = arcpy.Describe(JSONfile)
    JSONfileName = descJSONfile.baseName
    with open(JSONfile, "rb") as openJSONfile:
        rows = json.load(openJSONfile)
    print "JSON file is loaded"
    fieldnames = ['longitude', 'latitude']
    with open(JSONfileName+"test.csv", "wb") as f:
        dict_writer = csv.DictWriter(f, fieldnames = fieldnames)
        dict_writer.writerow(dict(zip(fieldnames, fieldnames)))
        dict_writer.writerows(rows)
    print "CSVs are Populated with headers and rows from JSON file.", '\n'
Thanks everyone for your help.

CSV write error on Python 3

I am trying to save output from a module to CSV file and I got an error when I ran the following code, which is a part of a module:
base_keys = ['path', 'rDATE', 'cDate', 'cik', 'risk', 'word_count']
outFile = open('c:\\Users\\ahn_133\\Desktop\\Python Project\\MinkAhn_completed2.csv','wb')
dWriter = csv.DictWriter(outFile, fieldnames=base_keys)
dWriter.writerow(headerDict)
Here is the error message (base_keys are the headings.)
return self.writer.writerow(self._dict_to_list(rowdict))
TypeError: 'str' does not support the buffer interface
I don't even understand what the error message is about. I use Python 3.3 and Windows 7.
Thanks for your time.

Opening a file in binary mode to write csv data to doesn't work in Python 3, simply put. What you want is to open in text mode and either use the default encoding or specify one yourself, i.e., your code should be written like:
import csv
k = ['hi']
out = open('bleh.csv', 'w', newline='', encoding='utf8') # mode could be 'wt' for extra-clarity
writer = csv.DictWriter(out, k)
writer.writerow({'hi': 'hey'})
Note that you also need to specify newline='' when opening this file for writing the CSV output; otherwise an extra \r is written on platforms that use \r\n line endings.
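The difference between the two modes can be reproduced without touching the disk. In this sketch, assuming Python 3, io.BytesIO stands in for a file opened with 'wb' and io.StringIO for one opened in text mode; the helper name try_write is hypothetical, and the field names are borrowed from the question.

```python
import csv
import io


def try_write(stream):
    """Attempt to write a CSV header to `stream` and report what happened."""
    writer = csv.DictWriter(stream, fieldnames=['path', 'cik'])
    try:
        writer.writeheader()
        return 'ok'
    except TypeError:
        # The csv module hands str to stream.write(); a binary
        # stream only accepts bytes-like objects.
        return 'TypeError'


print(try_write(io.BytesIO()))            # TypeError
print(try_write(io.StringIO(newline='')))  # ok
```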

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.
