So I am tasked with creating a function that returns the number of times a substring appears in a given string, along with the index of each occurrence.
But when I run my code, I get an "I/O operation on closed file" error. Does anyone know how to fix this?
# 1. Import the text.csv file
import csv

with open('text.csv') as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')

# 2. Complete function counter. The function should return the number
#    of times the substring appears & their index
def counter(substring):
    substring_counter = 0
    string = csv_reader
    for substring in csv_file:
        substring_counter = substring_counter + 1
        print('Counter = ', substring_counter)
        print(string.find(substring))

# do not edit the code below
counter("TCA")
This is happening because csv_reader is created at module level, not inside your function. The reader object returned by the csv module keeps a reference to a file handle (the one returned by open). On the first pass, that file handle is read all the way to the end, i.e. it gets exhausted; on the next pass, you are asking the csv reader object to read from the end of an already consumed file, which causes that error.
Before or after every iteration of the loop, you can reset the file pointer to the beginning with something like csv_file.seek(0).
A better solution would be to store all of the file's contents in a buffer and access that repeatedly, without re-reading from the file handle each time.
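For instance, a rough sketch of that buffering approach, assuming the same text.csv (the index-reporting logic here is only illustrative, not the assignment's exact required output):
import csv

# Read everything into memory once; the file can be closed afterwards.
with open('text.csv') as csv_file:
    rows = list(csv.reader(csv_file, delimiter=','))

def counter(substring):
    substring_counter = 0
    for row in rows:                  # rows is a plain list, so it can be
        for field in row:             # iterated as many times as needed
            index = field.find(substring)
            while index != -1:        # report every (possibly overlapping) hit
                substring_counter += 1
                print('Index =', index)
                index = field.find(substring, index + 1)
    print('Counter =', substring_counter)

counter("TCA")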
Since you are using with (which is good), you have explicitly limited the lifetime of the open file, yet as the previous response pointed out, the csv reader object is used outside that scope. The reader in this case is just a thin wrapper around the file and does not read everything up front. You either need to read the whole file inside the with block, or move the with (and everything that references the file) inside the function.
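A minimal sketch of the second option, with the with moved inside the function (the counting here uses str.count and skips the index reporting, purely for illustration):
import csv

def counter(substring):
    substring_counter = 0
    # The file is opened, read, and closed entirely inside the function,
    # so the reader is never used after the file has been closed.
    with open('text.csv') as csv_file:
        csv_reader = csv.reader(csv_file, delimiter=',')
        for row in csv_reader:
            for field in row:
                substring_counter += field.count(substring)
    print('Counter =', substring_counter)

counter("TCA")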
Related
Is there a way, in the code below, to access the variable utterances_dict outside of the with-block? The code below obviously returns the error: ValueError: I/O operation on closed file.
from csv import DictReader

utterances_dict = {}
utterance_file = 'toy_utterances.csv'
with open(utterance_file, 'r') as utt_f:
    utterances_dict = DictReader(utt_f)

for line in utterances_dict:
    print(line)
I am not an expert on the DictReader implementation, but its documentation leaves it open for the reader to parse the file lazily, after construction. That means the underlying file may have to remain open until you are done using the reader. In that case it is problematic to use utterances_dict outside of the with block, because the underlying file will already be closed by then.
Even if the current implementation of DictReader did parse the whole csv on construction, that wouldn't mean the implementation can't change in the future.
DictReader returns a lazy view over the open csv file. Convert the result to a list of dictionaries while the file is still open.
from csv import DictReader

utterances = []
utterance_file = 'toy_utterances.csv'
with open(utterance_file, 'r') as utt_f:
    utterances = [dict(row) for row in DictReader(utt_f)]

for line in utterances:
    print(line)
with open(file, 'rb') as readerfile:
    reader = csv.reader(readerfile)
In the above syntax, can I combine the first and second lines? It seems unnecessary to use two variables ('readerfile' and 'reader' above) if I only need to use the latter.
Is the former variable ('readerfile') ever used?
Can I use the same variable name for both, or is that bad form?
You can do:
reader = csv.reader(open(file, 'rb'))
but that would mean you are not closing your file explicitly.
with open(file, 'rb') as readerfile:
The first line opens the file and stores the file object in readerfile. The with statement ensures that the file is closed when you exit the block by any means, including exceptions.
reader = csv.reader(readerfile)
The second line creates a CSV reader object using the file object. It needs the file object (otherwise where would it read the data from?). Of course you could conceivably store it in the same variable
readerfile = csv.reader(readerfile)
if you wanted to (and don't plan on using the file object again), but this will likely lead to confusion for readers of your code.
Note that you haven't read anything yet! You still need to iterate over the reader object in order to get the data that you're interested in, and if you close the file before that happens then the reader object won't work. The file object is used behind the scenes by the reader object, even if you "hide" it by overwriting the readerfile variable.
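A tiny self-contained illustration of that point (the file name and contents here are made up):
import csv

# Create a small throwaway csv file so the example is self-contained.
with open('data.csv', 'w') as f:
    f.write('a,b,c\n1,2,3\n')

f = open('data.csv', 'r')
reader = csv.reader(f)    # nothing has actually been read yet
f.close()                 # close the file before iterating...
# next(reader)            # ...uncommenting this raises
                          # ValueError: I/O operation on closed file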
Lastly, if you really want to do everything on one line, you could conceivably define a function that abstracts the with statement:
def with1(context, func):
    with context as x:
        return func(x)
Now you can write this as one line:
data = with1(open(file, 'rb'), lambda readerfile: list(csv.reader(readerfile)))
It's by no means clearer, however.
This is not recommended at all
Why is it important to use one line?
Most Python programmers know the benefits of using the with statement well. Keep in mind that readers may be lazy (that is, they read line by line) in some cases. You want to handle the file with the correct statement, ensuring it is closed correctly even if errors arise.
Nevertheless, you can use a one-liner for this, as stated in other answers:
reader = csv.reader(open(file, 'rb'))
So basically you want a one-liner?
reader = csv.reader(open(file, 'rb'))
As said before, the problem with that is that with open() lets you do the following steps in one go:
Open the file
Do what you want with the file (inside your open block)
Close the file (that is implicit and you don't have to specify it)
If you don't use with open but call open directly, your file stays open until the object is garbage collected, and that can lead to unpredictable behaviour in some cases.
Plus, your original code (two lines) is much more readable than a one-liner.
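For comparison, this is roughly what the with form saves you from writing by hand (a sketch, not a recommendation):
import csv

f = open(file, 'rb')         # 'file' is assumed to already hold the filename
try:
    reader = csv.reader(f)
    for row in reader:
        pass                 # do something with each row
finally:
    f.close()                # runs even if an exception was raised above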
If you put them together, then the file won't be closed automatically -- but that often doesn't really matter, since it will be closed automatically when the script terminates.
It's not common to need to reference the raw file once a csv.reader instance has been created from it (except possibly to explicitly close it if you're not using a with statement).
If you use the same variable name for both, it will probably work because the csv.reader instance will still hold a reference to the file object, so it won't be garbage collected until the program ends. It's not a common idiom, however.
Since csv files are often processed sequentially, the following can be a fairly concise way to do it: the csv.reader instance frequently doesn't need a variable name of its own, and the with statement will close the file properly even if an exception occurs:
with open(file, 'rb') as readerfile:
    for row in csv.reader(readerfile):
        pass  # process the data...
I have two functions. The first creates a new CSV file (from an existing CSV). The second appends the same data to the new CSV, but in a slightly different order of the rows.
When I run this all in one file, the first function works but the second does not. However, when I put the second function in a separate file and called it from the first script, it did work, although I had to enter the input twice.
What do I need to change to get the second function to run properly?
import csv

export = raw_input('>')
new_file = raw_input('>')

ynabfile = open(export, 'rb')
reader = csv.reader(ynabfile)

def create_file():
    with open(new_file, 'wb') as result:
        writer = csv.writer(result)
        for r in reader:
            writer.writerow((r[3], r[5], r[6], r[7], r[7],
                             r[8], r[8], r[9], r[10]))

def append():
    with open(new_file, 'ab') as result2:
        writer2 = csv.writer(result2)
        for i in reader:
            writer.writerow((r[3], r[5], r[6], r[7], r[7],
                             r[8], r[8], r[10], r[9]))

create_file()
append()
I'm new to Python and programming in general, so if there is an all around better way to do this, I'm all ears.
The csv reader has already read the entire file pointed to by ynabfile, so the second call (or any subsequent call) to either create_file or append will not be able to fetch any more data using the reader until the file pointer is sent back to the beginning. In your case, a quick fix would be this:
create_file()
ynabfile.seek(0)
append()
I recommend restructuring your code a bit to avoid pitfalls like this. A few recommendations:
Read all of the contents of ynabfile into a list instead, if the entire file fits in memory (see the sketch after this list)
Have create_file and append take the input and output file names as parameters
Alternatively, have those two functions take the file pointer (ynabfile in this case), seek it back to the beginning, and then create a new csv.reader instance from it.
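A rough sketch of the first two suggestions combined, keeping the Python 2 style (raw_input, binary file modes) and the column indexes from the question:
import csv

export = raw_input('>')
new_file = raw_input('>')

# Read everything into memory once; the file is read only here.
with open(export, 'rb') as ynabfile:
    rows = list(csv.reader(ynabfile))

def create_file(rows, filename):
    with open(filename, 'wb') as result:
        writer = csv.writer(result)
        for r in rows:
            writer.writerow((r[3], r[5], r[6], r[7], r[7],
                             r[8], r[8], r[9], r[10]))

def append(rows, filename):
    with open(filename, 'ab') as result:
        writer = csv.writer(result)
        for r in rows:
            writer.writerow((r[3], r[5], r[6], r[7], r[7],
                             r[8], r[8], r[10], r[9]))

create_file(rows, new_file)
append(rows, new_file)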
If I've done the following:
import codecs
lines = codecs.open(somefile, 'r','utf8').readlines()
Is there a way to close the file object that I never assigned to a variable? If so, how? Normally, I could have done:
import codecs
reader = codecs.open(somefile, 'r','utf8')
lines = reader.readlines()
reader.close()
In CPython, the file object will close on its own once the reference count drops to 0, which is right after .readlines() returns. For other Python implementations it may take a little longer depending on the garbage collection algorithm used. The file is certainly going to be closed no later than program exit.
You should really use the file object as a context manager and have the with statement call close on it:
with codecs.open(somefile, 'r','utf8') as reader:
lines = reader.readlines()
As soon as the block of code indented under the with statement exits (be it with an exception, a return, continue or break statement, or simply because all code in the block finished executing), the reader file object will be closed.
Bonus tip: file objects are iterables, so the following also works:
with codecs.open(somefile, 'r','utf8') as reader:
lines = list(reader)
for the exact same result.
The code I have
i = 0
while i < len(newsymbolslist):
    time = 102030
    data = 895.233
    array = [time], [data]
    with open('StockPrice.csv', 'wb') as file:
        file_writer = csv.writer(file)
        file_writer.writerow(array)
        file.close()
    i += 1
I'm fairly new to Python, so I'm not 100% sure why the previous code only writes data to the top row. My guess is that because I'm opening and closing the file on each iteration, it doesn't know that it's not supposed to overwrite. I know how to fix it in theory (if that is the problem); I'm just having trouble with the syntax.
My guess: use the iteration counter (the variable i) to track how many rows down the file it should write.
with open('StockPrice.csv', 'wb') as f:
    file_writer = csv.writer(f)
    for s in newsymbolslist:
        time = 102030
        data = 895.233
        array = [time], [data]
        file_writer.writerow(array)
Your first guess is correct: every time you open the file in 'wb' mode, the file is effectively deleted (if it existed) and a new empty file is created. So only the data written during the last iteration of the while-loop ends up in the file.
The solution is to open the file once (before the loop begins).
Note that opening the file with the with-statement guarantees that the file will be closed when Python leaves the with-block. So there is no need to call f.close() yourself.
From the documentation:
The most commonly-used values of mode are 'r' for reading, 'w' for writing (truncating the file if it already exists), and 'a' for appending (...)
If you want to write to the end of an existing file, open it in append mode, with 'a'. (Though in this case, yes, restructuring your loop is the better answer.)
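For completeness, a sketch of what the append-mode variant could look like, keeping the question's row format (still less efficient than opening the file once, shown only to illustrate append mode):
import csv

newsymbolslist = ['AAPL', 'GOOG']      # hypothetical symbols for the sketch

for s in newsymbolslist:
    time = 102030
    data = 895.233
    # 'ab' appends to the end of the file instead of truncating it,
    # so rows written in earlier iterations are kept.
    with open('StockPrice.csv', 'ab') as f:
        csv.writer(f).writerow(([time], [data]))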