This is my test code; I have a more elaborate one as well, but neither works. This is in Python 3.x.
import sys

def main():
    inputfile = 'hi'
    print(inputfile)

if __name__ == '__main__':
    main()
EDIT: This is what I want to use the terminal for (and syntax errors - same problem):
import csv
import sys
import json

inputfile = sys.argv[1]
outputfile = sys.argv[2]

# reading the csv
with open(inputfile, 'r') as inhandle:  # 'r' is reading while 'w' is writing
    reader = csv.DictReader(inhandle)
    data = []
    for row in reader:
        data.append(row)
print(data)

# writing the json
with open(outputfile, 'w') as outhandle:  # note: the mode must be lowercase 'w'
    json.dump(data, outhandle, indent=2)
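For reference, with sys.argv the script is run from the terminal with the input and output files as arguments; assuming it is saved as, say, csv_to_json.py (an illustrative name):

python csv_to_json.py input.csv output.json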
As far as I understand from the code you've attached, hi must be written as 'hi'. In your original code, hi is treated as another variable being assigned to inputfile, but it hasn't been defined yet.
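For example, a minimal illustration of the difference:

inputfile = hi    # NameError: name 'hi' is not defined
inputfile = 'hi'  # assigns the string 'hi' to inputfile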
I have an SSIS loop package that calls a Python script multiple times.
The intent:
There is a folder of CSV files that I need converted to pipe-delimited text files. Some of the files have bad rows in them. The Python script converts the CSV files into pipe-delimited files while removing the bad records.
The Python code:
import csv
import sys

if len(sys.argv) != 4:
    print(sys.argv)
    sys.exit("usage: python csvtopipe.py <<SOURCE.csv>> <<TARGET.txt>> <<number of columns>>")

source = sys.argv[1]
target = sys.argv[2]
colcount = sys.argv[3]

file_comma = open(source, "r", encoding="unicode_escape")
reader_comma = csv.reader(file_comma, delimiter=',')

file_pipe = open(target, 'w', encoding="utf-8")
writer_pipe = csv.writer(file_pipe, delimiter='|', lineterminator='\n')

for row in reader_comma:
    if len(row) == int(colcount):
        print("write this..")
        writer_pipe.writerow(row)

file_pipe.close()
file_comma.close()
The Python call from SSIS:
python csvtopipe.py <<SOURCE.csv>> <<TARGET.txt>> <<number of columns>>
The problem:
The loop works correctly, but when the individual call finishes, the file is rewritten to 0 bytes. I can't tell if it's an SSIS problem or a Python problem.
Thanks!
UPDATE 1
This is the original version of the code; same result:
import csv
import sys

if len(sys.argv) != 4:
    print(sys.argv)
    sys.exit("usage: python csvtopipe.py <<SOURCE.csv>> <<TARGET.txt>> <<number of columns>>")

source = sys.argv[1]
target = sys.argv[2]
colcount = sys.argv[3]

with open(source, "r", encoding="unicode_escape") as file_comma:
    reader_comma = csv.reader(file_comma, delimiter=',')
    with open(target, 'w', encoding="utf-8") as file_pipe:
        writer_pipe = csv.writer(file_pipe, delimiter='|', lineterminator='\n')
        for row in reader_comma:
            if len(row) == int(colcount):
                print("write")
                writer_pipe.writerow(row)
Firstly, I would switch to using with open(...) rather than separate open() and close() calls. This helps ensure that the file is closed automatically in the event of a problem.
As the script is being invoked multiple times, I would add a timestamp to your output filename. This helps ensure that each run produces a different file.
Lastly, you could add a check to ensure that only one copy of the script executes at a time. For Windows-based applications this can be done using a Windows mutex; on Linux, a file lock can be used (see the sketch after the code below). This approach is sometimes referred to as the singleton pattern.
import win32event
import win32api
from winerror import ERROR_ALREADY_EXISTS
from datetime import datetime
import csv
import sys
import os
import time

if len(sys.argv) != 4:
    print(sys.argv)
    sys.exit("usage: python csvtopipe.py <<SOURCE.csv>> <<TARGET.txt>> <<number of columns>>")

# Wait up to 30 seconds for another copy of the script to stop running
windows_mutex = win32event.CreateMutex(None, False, 'CSV2PIPE')
win32event.WaitForSingleObject(windows_mutex, 30000)

source = sys.argv[1]
target = sys.argv[2]
colcount = sys.argv[3]

# Add a timestamp to the output filename
path, ext = os.path.splitext(target)
timestamp = datetime.now().strftime("%Y_%m_%d %H%M_%S")
target = f'{path}_{timestamp}{ext}'

with open(source, "r", encoding="unicode_escape") as file_comma, \
     open(target, 'w', encoding="utf-8") as file_pipe:
    reader_comma = csv.reader(file_comma, delimiter=',')
    writer_pipe = csv.writer(file_pipe, delimiter='|', lineterminator='\n')
    for row in reader_comma:
        if len(row) == int(colcount):
            print("write this..")
            writer_pipe.writerow(row)
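For completeness, here is a minimal sketch of the Linux file-lock variant mentioned above, using the standard fcntl module (the lock-file path is an arbitrary choice for illustration):

import fcntl

lock_handle = open('/tmp/csv2pipe.lock', 'w')
fcntl.flock(lock_handle, fcntl.LOCK_EX)  # blocks until no other copy holds the lock
try:
    pass  # run the CSV-to-pipe conversion here
finally:
    fcntl.flock(lock_handle, fcntl.LOCK_UN)
    lock_handle.close()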
I want to "debug" my pyomo model. The output of the model.pprint() method looks helpful but it is too long so the console only displays and stores the last lines. How can I see the first lines. And how can I store this output in a file
(I tried pickle, json, normal f.write but since the output of .pprint() is of type NONE I wasn't sucessfull until now. (I am also new to python and learning python and pyomo in parallel).
None of this works:

# attempt 1
with open('some_file2.txt', 'w') as f:
    serializer.dump(x, f)

# attempt 2
import pickle
object = Object()
filehandler = open('some_file', 'wb')
pickle.dump(x, filehandler)

# attempt 3
x = str(instance)
x = str(instance.pprint())
f = open('file6.txt', 'w')
f.write(x)
f.write(instance.pprint())
f.close()
Use the filename keyword argument to the pprint method:
instance.pprint(filename='foo.txt')
instance.pprint() prints to the console (stdout, the standard output), but does not return the content (the return value is None, as you said). To have it print to a file, you can redirect the standard output to that file.
Try:

import sys

f = open('file6.txt', 'w')
sys.stdout = f
instance.pprint()
sys.stdout = sys.__stdout__  # restore the original stdout before closing the file
f.close()
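If you prefer not to reassign sys.stdout by hand, the standard library's contextlib.redirect_stdout does the same redirection as a context manager and restores stdout automatically (a minimal sketch):

import contextlib

with open('file6.txt', 'w') as f, contextlib.redirect_stdout(f):
    instance.pprint()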
It looks like there is a cleaner solution from Bethany =)
For me the accepted answer does not work; pprint has a different signature.
help(instance.pprint)
pprint(ostream=None, verbose=False, prefix='') method of pyomo.core.base.PyomoModel.ConcreteModel instance
# working for me:
with open(path, 'w') as output_file:
    instance.pprint(output_file)
I have written a simple python script to hash a file and output the result. However, when I run the script (python scriptname.py), I don't get any output (expected it to print the checksum). I don't get any errors from the console either.
What am I doing wrong?
#!/usr/bin/env python
import hashlib
import sys

def sha256_checksum(filename, block_size=65536):
    sha256 = hashlib.sha256()
    filename = '/Desktop/testfile.txt'
    with open(filename, 'rb') as f:
        for block in iter(lambda: f.read(block_size), b''):
            sha256.update(block)
    return sha256.hexdigest()

def main():
    for f in sys.argv[1:]:
        checksum = sha256_checksum(f)
        print(f + '\t' + checksum)

if __name__ == '__main__':
    main()
def main():
    for f in sys.argv[1:]:

The script expects arguments. If you run it without any arguments, you won't see any output.
The main body assumes that you provide a list of files for hashing, but in the hashing function you hardcoded
filename = '/Desktop/testfile.txt'
So, if you want to pass files for hashing as script arguments, remove the line
filename = '/Desktop/testfile.txt'
and run
python scriptname.py '/Desktop/testfile.txt'
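For reference, this is roughly what the script looks like with the hardcoded line removed (a sketch of the fix described above):

#!/usr/bin/env python
import hashlib
import sys

def sha256_checksum(filename, block_size=65536):
    sha256 = hashlib.sha256()
    with open(filename, 'rb') as f:
        # Read in fixed-size blocks so large files don't have to fit in memory.
        for block in iter(lambda: f.read(block_size), b''):
            sha256.update(block)
    return sha256.hexdigest()

def main():
    for f in sys.argv[1:]:
        print(f + '\t' + sha256_checksum(f))

if __name__ == '__main__':
    main()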
I've got the following code:
with open("test.txt", "r") as test, open("table.txt", "w") as table:
reader = csv.reader(test, delimiter="\t")
writer = csv.writer(table, delimiter="\t")
for row in reader:
if all(field not in keywords for field in row):
writer.writerow(row)
How can I make it a .py script that lets you specify test.txt and table.txt when you run it? So that one would write:
Script.py test.txt output.txt
Just use sys.argv:
import sys
import csv

with open(sys.argv[1], "r") as test, open(sys.argv[2], "w") as table:
    # more here
Note that sys.argv[0] contains the script name (in your case, Script.py). To get the first argument, use sys.argv[1]; for the second argument, sys.argv[2]; and so on.
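Putting that together with the loop from the question, the full script might look like this (a sketch; keywords is assumed to be defined elsewhere in your code, as in the original):

import sys
import csv

keywords = set()  # assumed to be filled in elsewhere, as in the original code

with open(sys.argv[1], "r") as test, open(sys.argv[2], "w") as table:
    reader = csv.reader(test, delimiter="\t")
    writer = csv.writer(table, delimiter="\t")
    for row in reader:
        # keep only rows where no field matches a keyword
        if all(field not in keywords for field in row):
            writer.writerow(row)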
I have a Python script which modifies a CSV file to add the filename as the last column:
import sys
import glob

for filename in glob.glob(sys.argv[1]):
    file = open(filename)
    data = [line.rstrip() + "," + filename for line in file]
    file.close()
    file = open(filename, "w")
    file.write("\n".join(data))
    file.close()
Unfortunately, it also adds the filename to the header (first) row of the file. I would like the string "ID" added to the header instead. Can anybody suggest how I could do this?
Have a look at the official csv module.
Here are a few minor notes on your current code:
It's a bad idea to use file as a variable name, since that shadows the built-in type.
You can close the file objects automatically by using the with syntax.
Don't you want to add an extra column in the header line, called something like Filename, rather than just omitting a column in the first row?
If your filenames have commas (or, less probably, newlines) in them, you'll need to make sure that the filename is quoted - just appending it won't do.
That last consideration would incline me to use the csv module instead, which will deal with the quoting and unquoting for you. For example, you could try something like the following code:
import glob
import csv
import sys

for filename in glob.glob(sys.argv[1]):
    data = []
    with open(filename) as finput:
        for i, row in enumerate(csv.reader(finput)):
            to_append = "Filename" if i == 0 else filename
            data.append(row + [to_append])
    with open(filename, 'wb') as foutput:
        writer = csv.writer(foutput)
        for row in data:
            writer.writerow(row)
That may quote the data slightly differently from your input file, so you might want to play with the quoting options for csv.reader and csv.writer described in the documentation for the csv module.
As a further point, you might have good reasons for taking a glob as a parameter rather than just the files on the command line, but it's a bit surprising - you'll have to call your script as ./whatever.py '*.csv' rather than just ./whatever.py *.csv. Instead, you could just do:
for filename in sys.argv[1:]:
... and let the shell expand your glob before the script knows anything about it.
One last thing - the current approach you're taking is slightly dangerous, in that if anything fails when writing back to the same filename, you'll lose data. The standard way of avoiding this is to instead write to a temporary file, and, if that was successful, rename the temporary file over the original. So, you might rewrite the whole thing as:
import csv
import sys
import tempfile
import shutil

for filename in sys.argv[1:]:
    tmp = tempfile.NamedTemporaryFile(delete=False)
    with open(filename) as finput:
        with open(tmp.name, 'wb') as ftmp:
            writer = csv.writer(ftmp)
            for i, row in enumerate(csv.reader(finput)):
                to_append = "Filename" if i == 0 else filename
                writer.writerow(row + [to_append])
    shutil.move(tmp.name, filename)
You can try:
data = [file.readline().rstrip() + ",ID"]  # handle the header line separately
data += [line.rstrip() + "," + filename for line in file]
You can try changing your code, but using the csv module is recommended. This should give you the result you want:
import sys
import glob
import csv

filename = glob.glob(sys.argv[1])[0]
yourfile = csv.reader(open(filename, 'r'))

csv_output = []
for row in yourfile:
    if len(csv_output) != 0:
        row.append(filename)
    else:
        row.append('ID')  # the header row gets the "ID" label instead
    csv_output.append(row)

yourfile = csv.writer(open(filename, 'w'), delimiter=',')
yourfile.writerows(csv_output)
Use the CSV module that comes with Python.
import csv
import sys

def process_file(filename):
    # Read the contents of the file into a list of lines.
    f = open(filename, 'r')
    contents = f.readlines()
    f.close()

    # Use a CSV reader to parse the contents.
    reader = csv.reader(contents)

    # Open the output and create a CSV writer for it.
    f = open(filename, 'wb')
    writer = csv.writer(f)

    # Process the header.
    header = reader.next()
    header.append('ID')
    writer.writerow(header)

    # Process each row of the body.
    for row in reader:
        row.append(filename)
        writer.writerow(row)

    # Close the file and we're done.
    f.close()

# Run the function on all command-line arguments. Note that this does no
# checking for things such as file existence or permissions.
map(process_file, sys.argv[1:])
You can run this as follows:
blair#blair-eeepc:~$ python csv_add_filename.py file1.csv file2.csv
You can use fileinput to do in-place editing:
import sys
import glob
import fileinput

for filename in glob.glob(sys.argv[1]):
    for line in fileinput.FileInput(filename, inplace=1):
        if fileinput.lineno() == 1:
            print line.rstrip() + ",ID"
        else:
            print line.rstrip() + "," + filename
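As noted in an earlier answer, the glob pattern has to be quoted so the shell passes it through to the script unexpanded, e.g.:

python whatever.py '*.csv'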