reading csv file without for - python

I need to read a CSV file in Python.
Because I get a 'NULL byte' error on the last row, I would like to use a while loop instead of a for loop.
Do you know how to do that?
reader = csv.reader(file)
for row in reader:  # I get an error at this line
    # do whatever with row
    pass
I want to substitute the for-loop with a while-loop so that I can check if the row is NULL or not.
What is the function for reading a single row in the CSV module?
Thanks.
P.S. The traceback is below:
Traceback (most recent call last):
  File "FetchNeuro_TodayTrades.py", line 189, in
    for row in reader:
_csv.Error: line contains NULL byte

Maybe you could catch the exception raised by the CSV reader. Something like this:
import csv
import sys
filename = "my.csv"
reader = csv.reader(open(filename))
try:
    for row in reader:
        print 'Row read with success!', row
except csv.Error as e:  # "except csv.Error, e:" before Python 2.6
    sys.exit('file %s, line %d: %s' % (filename, reader.line_num, e))
Or you could use next():
while True:
    try:
        print reader.next()  # Python 2; in Python 3 this is next(reader)
    except csv.Error:
        print "Error"
    except StopIteration:
        print "Iteration End"
        break

You always need to say EXACTLY what error message you got. Please edit your question.
It is probably this:
>>> import csv; csv.reader("\x00").next()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
_csv.Error: line contains NULL byte
>>>
The csv module is not 8-bit clean; see the docs: """Also, there are currently some issues regarding ASCII NUL characters."""
The error message is itself in error: it should say "NUL", not "NULL" :-(
If the last line in the file is empty, you won't get an exception; you'll merely get row == [].
Assuming the problem is one or more NULs in your file(s), you'll need to (1) speak earnestly to the creator(s) of your file(s), or (2) failing that, read the whole file in (mode="rb"), strip out the NUL(s), and feed fixed_text.splitlines() to the csv reader.
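Option (2) above could be sketched roughly as follows in Python 3. This is only an illustration: the function name read_csv_without_nuls and the utf-8 default are assumptions, not something from the question, and splitlines() will break any quoted field that contains an embedded line break.

```python
import csv


def read_csv_without_nuls(path, encoding="utf-8"):
    """Return all rows of a CSV file after dropping NUL bytes.

    Hypothetical helper: the name and the default encoding are assumptions.
    """
    # Read the raw bytes so the NULs can be removed before any decoding.
    with open(path, "rb") as f:
        raw = f.read()
    fixed_text = raw.replace(b"\x00", b"").decode(encoding)
    # csv.reader accepts any iterable of strings, so a list of lines is fine.
    return list(csv.reader(fixed_text.splitlines()))
```

Reading the whole file at once keeps the code short, but it holds the entire file in memory, so it suits small and medium files best.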

The Django community has addressed Python CSV import issues, so it might be worth searching for CSV import there, or posting a question. Also, you could edit the offending line directly in the CSV file before trying the import.

You could try cleaning the file as you read it:
def nonull(stream):
    for line in stream:
        yield line.replace('\x00', '')
f = open(filename)
reader = csv.reader(nonull(f))
Assuming, of course, that simply ignoring NUL characters will work for you!

If your problem is specific to the last line being empty, you can use numpy.genfromtxt (or the old matplotlib.mlab.csv2rec)
$: cat >csv_file.txt
foo,bar,baz
yes,no,0
x,y,z
$:
$: ipython
>>> from numpy import genfromtxt
>>> genfromtxt("csv_file.txt", dtype=None, delimiter=',')
array([['foo', 'bar', 'baz'],
       ['yes', 'no', '0'],
       ['x', 'y', 'z']],
      dtype='|S3')

I'm not really sure what you mean, but you can always skip empty rows with an if:
>>> reader = csv.reader(open("file"))
>>> for r in reader:
...     if r: print r
...
If this is not what you want, you should describe your problem more clearly by showing examples of things that don't work for you, including a sample of the file format and the output you want.

I don't have an answer, but I can confirm the problem, and that most of the answers posted don't work. You cannot catch this exception inside the loop body, and you cannot test with if line, because the error is raised while the reader fetches the row, before the body runs. Maybe you could check for the NUL byte directly, but I'm not swift enough to do that... If it is always on the last line, you could of course skip that.
import csv
FH = open('data.csv','wb')
line1 = [97,44,98,44,99,10]
line2 = [100,44,101,44,102,10]
for n in line1 + line2:
    FH.write(chr(n))
FH.write(chr(0))
FH.close()
FH = open('data.csv')
reader = csv.reader(FH)
for line in reader:
    if '\0' in line: continue
    if not line: continue
    print line
$ python script.py
['a', 'b', 'c']
['d', 'e', 'f']
Traceback (most recent call last):
  File "script.py", line 11, in <module>
    for line in reader:
_csv.Error: line contains NULL byte

Preprocess the CSV file to replace each NUL ('\0') with an empty string, and then you can read it.
The actual code looks like this:
data_initial = open(csv_file, "rU")
reader = csv.reader((line.replace('\0','') for line in data_initial))
It works for me.
The original answer is here: csv-contain null byte
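For Python 3, the same line-by-line cleaning might look like the sketch below. The name rows_without_nuls and the utf-8 default are assumptions made for illustration; newline="" is what the csv documentation recommends when opening a file for the reader.

```python
import csv


def rows_without_nuls(path, encoding="utf-8"):
    """Yield CSV rows from path, dropping NUL characters line by line.

    Hypothetical helper: the name and the default encoding are assumptions.
    """
    with open(path, newline="", encoding=encoding) as f:
        # Clean each line lazily so the whole file never sits in memory.
        cleaned = (line.replace("\0", "") for line in f)
        for row in csv.reader(cleaned):
            yield row
```

A line that held nothing but NULs becomes an empty line, so callers may want to skip empty rows, for example with `[row for row in rows_without_nuls(path) if row]`.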

Related

Remove a JSON file if an exception occurs

I am writing a program which stores some JSON-encoded data in a file, but sometimes the resulting file is blank (because no new data was found). When the program finds data and stores it, I do this:
with open('data.tmp') as f:
    data = json.load(f)
os.remove('data.tmp')
Of course, if the file is blank this will raise an exception, which I can catch, but that does not let me remove the file. I have tried:
try:
    with open('data.tmp') as f:
        data = json.load(f)
except:
    os.remove('data.tmp')
And I get this error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "MyScript.py", line 50, in run
    os.remove('data.tmp')
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process
How could I delete the file when the exception occurs?
How about separating file reading from JSON loading? json.loads behaves exactly the same as json.load but takes a string.
with open('data.tmp') as f:
    dataread = f.read()
os.remove('data.tmp')
# handle exceptions as needed here...
data = json.loads(dataread)
I am late to the party, but the json dump and load functions seem to keep using files even after writing or reading data from them. What you can do is use dumps or loads to get the string representation and then use a normal file.write() or file.read() on the result.
For example:
with open('file_path.json', 'w') as file:
    file.write(json.dumps(json_data))
os.remove('file_path.json')
Not the best alternative, but it saves me a lot of trouble, especially when using a temp dir.
You need to edit the remove part so that it handles the non-existing case gracefully.
import errno
import json
import os
try:
    fn = 'data.tmp'
    with open(fn) as f:
        data = json.load(f)
except:
    try:
        if os.stat(fn).st_size > 0:
            os.remove(fn) if os.path.exists(fn) else None
    except OSError as e: # this would be "except OSError, e:" before Python 2.6
        if e.errno != errno.ENOENT:
            raise
See also: Most pythonic way to delete a file which may not exist
You could extract the silent removal into a separate function.
Also, from the same SO question:
# python3.4 and above
import contextlib, json, os
try:
    fn = 'data.tmp'
    with open(fn) as f:
        data = json.load(f)
except:
    with contextlib.suppress(FileNotFoundError):
        if os.stat(fn).st_size > 0:
            os.remove(fn)
I personally like the latter approach better - it's explicit.
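One way to combine the ideas in this thread is to read the text inside the with block, parse it afterwards, and do the removal in a finally clause so it runs whether or not parsing succeeds. This is a sketch under assumptions: the helper name load_and_remove and the choice to return None for blank or malformed JSON are illustrative, not something the question specifies.

```python
import contextlib
import json
import os


def load_and_remove(path):
    """Parse JSON from path, then delete the file whether or not parsing worked.

    Hypothetical helper: returning None on bad JSON is an assumed policy.
    """
    try:
        with open(path) as f:
            text = f.read()  # the file is closed as soon as this block exits
        return json.loads(text)
    except ValueError:
        # json.JSONDecodeError is a subclass of ValueError (blank or bad JSON).
        return None
    finally:
        # Runs on success and on failure; a missing file is silently ignored.
        with contextlib.suppress(FileNotFoundError):
            os.remove(path)
```

Because the file handle is already closed by the time os.remove runs, this avoids removing a file that the same process still has open.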

ValueError: must have exactly one of create/read/write/append mode

I have a file that I open, and I want to search through it until I find a specific text phrase at the beginning of a line. I then want to overwrite that line with 'sentence'.
sentence = "new text"
with open(main_path,'rw') as file: # Use file to refer to the file object
    for line in file.readlines():
        if line.startswith('text to replace'):
            file.write(sentence)
I'm getting:
Traceback (most recent call last):
  File "setup_main.py", line 37, in <module>
    with open(main_path,'rw') as file: # Use file to refer to the file object
ValueError: must have exactly one of create/read/write/append mode
How can I get this working?
You can open a file for simultaneous reading and writing, but it won't work the way you expect:
with open('file.txt', 'w') as f:
    f.write('abcd')
with open('file.txt', 'r+') as f: # The mode is r+ instead of r
    print(f.read()) # prints "abcd"
    f.seek(0) # Go back to the beginning of the file
    f.write('xyz')
    f.seek(0)
    print(f.read()) # prints "xyzd", not "xyzabcd"!
You can overwrite bytes or extend a file, but you cannot insert or delete bytes without rewriting everything past your current position.
Since lines aren't all the same length, it's easiest to do it in two separate steps:
lines = []
# Parse the file into lines
with open('file.txt', 'r') as f:
    for line in f:
        if line.startswith('text to replace'):
            line = 'new text\n'
        lines.append(line)
# Write them back to the file
with open('file.txt', 'w') as f:
    f.writelines(lines)
    # Or: f.write(''.join(lines))
Reading and rewriting the same file in one pass is awkward. It is simpler to read from main_path and write to another file, e.g.
sentence = "new text"
with open(main_path,'rt') as file: # Use file to refer to the file object
    with open('out.txt','wt') as outfile:
        for line in file.readlines():
            if line.startswith('text to replace'):
                outfile.write(sentence + '\n') # keep the line break that the replaced line had
            else:
                outfile.write(line)
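If the goal is to end up with the original file updated, the read-then-write-elsewhere idea can be finished by swapping the new file into place. The sketch below is one possible way, assuming a hypothetical helper named replace_matching_lines; it writes to a temporary file in the same directory and then uses os.replace.

```python
import os
import tempfile


def replace_matching_lines(path, prefix, new_line):
    """Rewrite path so every line starting with prefix becomes new_line.

    Hypothetical helper: the name and parameters are assumptions.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives next to the original so the final rename
    # stays on the same filesystem.
    with open(path) as src, tempfile.NamedTemporaryFile(
            "w", dir=directory, delete=False) as dst:
        for line in src:
            dst.write(new_line + "\n" if line.startswith(prefix) else line)
    # Both files are closed here; swap the rewritten copy into place.
    os.replace(dst.name, path)
```

For example, `replace_matching_lines(main_path, 'text to replace', 'new text')` would cover the case in the question. Writing to a separate file first means a crash midway leaves the original untouched.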
Not the problem with the example code, but wanted to share as this is where I wound up when searching for the error.
I was getting this error due to the chosen file name (con.txt for example) when appending to a file on Windows. Changing the extension to other possibilities resulted in the same error, but changing the file name solved the problem. Turns out the file name choice caused a redirect to the console, which resulted in the error (must have exactly one of read or write mode): Why does naming a file 'con.txt' in windows make Python write to console, not file?

create valid json object in python

Each line is valid JSON, but I need the file as a whole to be valid JSON.
I have some data which is aggregated from a web service and dumped to a file, so it's JSON-esque but not valid JSON. It can't be processed in the simple and intuitive way that JSON files can, which is a major pain in the neck. It looks (more or less) like this:
{"record":"value0","block":"0x79"}
{"record":"value1","block":"0x80"}
I've been trying to reinterpret it as valid JSON. My latest attempt looks like this:
with open('toy.json') as inpt:
    lines = []
    for line in inpt:
        if line.startswith('{'): # block starts
            lines.append(line)
However, as you can likely deduce from the fact that I'm posing this question, that doesn't work. Any ideas about how I might tackle this problem?
EDIT:
Tried this:
with open('toy_two.json', 'rb') as inpt:
    lines = [json.loads(line) for line in inpt]
print(lines['record'])
but got the following error:
Traceback (most recent call last):
  File "json-ifier.py", line 38, in <module>
    print(lines['record'])
TypeError: list indices must be integers, not str
Ideally I'd like to interact with it as I can with normal JSON, i.e. data['value']
EDIT II
with open('transactions000000000029.json', 'rb') as inpt:
    lines = [json.loads(line) for line in inpt]
for line in lines:
    records = [item['hash'] for item in lines]
for item in records:
    print item
This looks like NDJSON, which I've been working with recently. The specification is here, though I'm not sure how useful it is. Does the following work?
with open('the file.json', 'rb') as infile:
    data = infile.readlines()
data = [json.loads(item.replace('\n', '')) for item in data]
This should give you a list of dictionaries.
Each line looks like a valid JSON document.
That's the "JSON Lines" format (http://jsonlines.org/).
Try to process each line independently (json.loads(line)) or use a specialized library (https://jsonlines.readthedocs.io/en/latest/).
def process(oneline):
    # do what you want with each line
    print(oneline['record'])
with open('toy_two.json', 'rb') as inpt:
    for line in inpt:
        process(json.loads(line))
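Since the question asks for the file as a whole to be valid JSON, one option is to parse every line and write the resulting list back out as a single JSON array. This is a minimal Python 3 sketch; the function names jsonl_to_records and write_json_array are made up for illustration.

```python
import json


def jsonl_to_records(path):
    """Parse a file with one JSON document per line into a list of objects."""
    with open(path, encoding="utf-8") as f:
        # Blank lines are skipped so a trailing newline does not break parsing.
        return [json.loads(line) for line in f if line.strip()]


def write_json_array(records, out_path):
    """Write the records as one JSON array, which is valid JSON as a whole."""
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(records, f, indent=2)
```

After `records = jsonl_to_records('toy_two.json')`, each entry is an ordinary dictionary, so `records[0]['record']` works, and the file written by write_json_array can be loaded with a single json.load call.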

Reading file error in Python

I am brand new to Python and am having a terrible time trying to read in a .csv file to work with. The code I am using is the following:
>>> dat = open('blue.csv','r')
>>> print dat()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'file' object is not callable
Could anyone help me diagnose this error or lend any suggestions on how to read the file in? Sorry if there is an answer to this question already, but I couldn't seem to find it.
You need to call read() in order to read a file:
dat = open('blue.csv','r')
print dat.read()
Alternatively, you can use a with statement, which closes the file for you:
with open('blue.csv','r') as o:
    data = o.read()
You can read the file:
dat = open('blue.csv', 'r').read()
Or you can open the file as a csv and read it row by row:
import csv
infile = open('blue.csv', 'r')
csvfile = csv.reader(infile)
for row in csvfile:
    print row
    column1 = row[0]
    print column1
Check out the csv docs for more options for working with csv files.
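The answers above use Python 2 print statements. A rough Python 3 equivalent of the row-by-row version is sketched below; the helper names read_rows and first_column are assumptions, and newline="" follows the recommendation in the csv documentation.

```python
import csv


def read_rows(path):
    """Return every row of a CSV file as a list of strings."""
    # The with statement closes the file even if reading fails.
    with open(path, newline="") as f:
        return list(csv.reader(f))


def first_column(rows):
    """Return the first field of each non-empty row."""
    return [row[0] for row in rows if row]
```

For the file in the question this would be used as `rows = read_rows('blue.csv')` followed by `print(first_column(rows))`.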

Python: data to file then data from text file to list - TypeError: must be str, not bytes

I'm a beginner in programming and have decided to teach myself Python. After a few days, I've decided to code a little piece. It's pretty simple. It records:
the date of today
the page I am at (I'm reading a book)
how I feel
Then I add the data to a file. Every time I launch the program, it adds a new line of data to the file.
Then I extract the data to make a list of lists.
The truth is, I wanted to rewrite my program to pickle a list and then unpickle the file. However, as I'm dealing with an error I can't handle, I really want to understand how to solve this, so I hope you will be able to help me out :)
I've been struggling for the past few hours with this apparently simple and stupid problem, and I can't find the solution. Here are the error and the code:
ERROR:
Traceback (most recent call last):
  File "dailyshot.py", line 25, in <module>
    SaveData(todaysline)
  File "dailyshot.py", line 11, in SaveData
    mon_pickler.dump(datatosave)
TypeError: must be str, not bytes
CODE:
import pickle
import datetime
def SaveData(datatosave):
    with open('journey.txt', 'wb') as thefile:
        my_pickler = pickle.Pickler(thefile)
        my_pickler.dump(datatosave)
    thefile.close()
todaylist = []
today = datetime.date.today()
todaylist.append(today)
page = input('Page Number?\n')
feel = input('How do you feel?\n')
todaysline = today.strftime('%d, %b %Y') + "; " + page + "; " + feel + "\n"
print('Thanks and Good Bye!')
SaveData(todaysline)
print('let\'s make a list now...')
thefile = open('journey.txt','rb')
thelist = [line.split(';') for line in thefile.readlines()]
thefile.close()
print(thelist)
Thanks a looot!
OK, so there are a few things to comment on here:
When you use a with statement, you don't have to explicitly close the file. Python will do that for you at the end of the with block (line 8).
You don't use todaylist for anything. You create it, add an element and then just discard it. So it's probably useless :)
Why are you pickling a string object? If you have strings, just write them to the file as is.
If you pickle data on write, you have to unpickle it on read. You shouldn't write pickled data and then just read the file as a plain text file.
Use mode 'a' (append) when you are just adding items to the file; 'w' will overwrite your whole file.
What I would suggest is just writing a plain text file, where every line is one entry.
import datetime
def save(data):
    with open('journey.txt', 'a') as f:
        f.write(data + '\n')
today = datetime.date.today()
page = input('Page Number: ')
feel = input('How do you feel: ')
todaysline = ';'.join([today.strftime('%d, %b %Y'), page, feel])
print('Thanks and Good Bye!')
save(todaysline)
print('let\'s make a list now...')
with open('journey.txt','r') as f:
    for line in f:
        print(line.strip().split(';'))
Are you sure you posted the right code? That error can occur if you leave out the "b" when you open the file,
e.g.
with open('journey.txt', 'w') as thefile:
>>> with open('journey.txt', 'w') as thefile:
...     pickler = pickle.Pickler(thefile)
...     pickler.dump("some string")
...
Traceback (most recent call last):
  File "<stdin>", line 3, in <module>
TypeError: must be str, not bytes
The file should be opened in binary mode:
>>> with open('journey.txt', 'wb') as thefile:
...     pickler = pickle.Pickler(thefile)
...     pickler.dump("some string")
...
>>>
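The asker mentions wanting to pickle a list and unpickle it later. A minimal sketch of that round trip is shown below, assuming hypothetical helper names save_entries and load_entries; the key points from the answers are that the file is opened in binary mode both ways and that pickled data is read back with pickle, not as text.

```python
import pickle


def save_entries(entries, path):
    """Pickle the whole list of entries, replacing any previous file."""
    with open(path, "wb") as f:  # binary mode is required for pickle
        pickle.dump(entries, f)


def load_entries(path):
    """Unpickle the list of entries, or return an empty list on first run."""
    try:
        with open(path, "rb") as f:
            return pickle.load(f)
    except FileNotFoundError:
        return []
```

A run of the program would then load the existing list, append today's `[date, page, feel]` entry, and save the list again, for example with a separate file such as 'journey.pkl' so it is not confused with the plain text version.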
