I have a txt file and I want to read its lines in Python. Basically I am using the following method:
f = open(description, 'r')
out = f.readlines()
for line in out:
    line
What I want is to have access to every line of the text after the for loop, i.e. to store the lines in a matrix or something list-like.
Instead of readlines you could use
lines = list(open(description, 'r'))
The opened file is an iterator that yields lines. By calling list on it, you create a list of all of them. There's no real need to keep the open file around in a variable; written this way, the file object is closed as soon as it is garbage-collected (in CPython, right after the list is built), though an explicit with block is the more robust way to guarantee that.
But using readlines() to get a list is perfectly good as well.
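For completeness, here is a minimal sketch of the same idea with a with block, assuming description holds the path to your text file:

with open(description, 'r') as f:
    lines = f.readlines()   # or: lines = list(f)

# The file is closed here, but the lines are still available and indexable.
for line in lines:
    print(line, end='')
print(lines[0])   # first line of the file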
I'm studying the file section and I'm confused by the code below.
def printAllLines(fileObject):
    for line in fileObject:
        print(line, end="")
In this case, does one iteration of a line equal one line of the original text file?
Is there any index in a text file?
Can I think of a pure text file as a list that contains multiple items?
And each item contains a line of text?
A file object created through the open() function in Python is an object that yields the lines of the file one at a time. For a text file it is an io.TextIOWrapper instance (annotated as typing.TextIO); if this were a binary file, it would be an io.BufferedReader (typing.BinaryIO). This object is iterable but cannot be indexed, as it does not define __getitem__.
TL;DR: You can think of iterating over a file with a for loop as syntactic sugar for repeatedly reading the next line; it cuts down on code, but don't read too much into it.
To answer each of your questions:
Yes, each iteration of line in fileObject is one line of the text file.
No, you cannot index the file object directly; fileObject[1], for example, raises an error. You first need readlines() (or some other call that collects the lines into a list).
Don't think of it like that; treat iterating over a file this way as a useful trick rather than the file literally being a list of lines.
Hopefully this fully answers your question.
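To illustrate the difference, here is a small sketch; example.txt is a hypothetical file used only for demonstration:

with open('example.txt') as fileObject:
    for line in fileObject:        # iterating works: one pass of the loop per line
        print(line, end="")

with open('example.txt') as fileObject:
    lines = fileObject.readlines() # collect the lines into a real list
    print(lines[1])                # now indexing works
    # fileObject[1] would raise TypeError, because the file object has no __getitem__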
I have a couple of files inside a folder that I am trying to pull out text from, reason being that I eventually want to add this text into a newly created separate file. My tactic is to initialize a list and populate it with the text from each file, one by one. I have called this list myLines.
myLines = []
for line in f:
    myLines.append(line)

for element in myLines:
    f.write(myLines)
I get an error, and I know that it has something to do with .write() not accepting myLines because it's a list rather than a string. How would I go about turning the content of myLines into an acceptable argument for the write() method?
Thanks
IDK why you chose myLines as the variable name. Given what you described, it should be a list of texts, not a list of lines.
my_texts = []

# populate my_texts
for filename in input_files:
    with open(filename) as f:
        my_texts.append(f.read())

# write new file
with open('new_file_path.txt', 'w') as f:
    f.write('\n'.join(my_texts))
This assumes you want a newline separating the texts from each file.
A more straightforward method would be to 1) open the output file in append mode (open('out_file_path', 'a')), and 2) read each input file in a loop, writing the content to the output file.
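That could look roughly like this, assuming input_files is the same list of paths used above and out_file_path is the destination:

with open('out_file_path', 'a') as fout:
    for filename in input_files:
        with open(filename) as fin:
            fout.write(fin.read())
            fout.write('\n')   # newline between the texts of different files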
Try this -
myLines = []
for line in f:
    myLines.append(line)

for element in myLines:
    f.write(str(element))
Just convert each element to a string before writing it. You could use classes as well if you want to preserve the list type, but that would be a bit lengthy.
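For what it's worth, the more common idioms are writelines(), which accepts a list of strings directly, or joining the list yourself. A sketch, assuming out.txt is the destination file:

with open('out.txt', 'w') as fout:
    fout.writelines(myLines)        # writes each string in the list; no separators are added

# equivalent, joining the list into one string first:
with open('out.txt', 'w') as fout:
    fout.write(''.join(myLines))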
I keep encountering an index error when trying to print a line from a text file. I'm new to Python and still learning, so I'd appreciate it if you can be patient with me; if anything else is needed from me, please let me know!
The traceback reads as
...
print(f2.readlines()[1])
IndexError: list index out of range
When trying to print line 2 (...[1]), I am getting this out of range error.
Here's the current script.
with open("f2.txt", "r") as f2:
print(f2.readlines()[1])
There are 3 lines with text in the file.
contents of f2.txt
peaqwenasd
lasnebsat
kikaswmors
It seems that f2.seek(0) was necessary here to solve the issue.
with open("f2.txt", "r") as f2:
f2.seek(0)
print(f2.readlines()[1])
You haven't given all the code needed to solve your problem, but your given symptoms point to multiple calls to readlines.
Read the documentation: readlines() reads the entire file and returns a list of its contents. As a consequence, the file pointer is then at the end of the file. If you call readlines() again at that point, it returns an empty list.
You apparently have a readlines() call before the code you gave us. seek(0) resets the file pointer to the start of the file, and you're reading the entire file a second time.
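A quick sketch of that behaviour, using the same three-line f2.txt:

with open("f2.txt", "r") as f2:
    first = f2.readlines()    # reads all three lines; the file pointer is now at the end
    second = f2.readlines()   # nothing left to read, so this is an empty list
    f2.seek(0)                # move the file pointer back to the start
    third = f2.readlines()    # reads all three lines again

print(len(first), len(second), len(third))   # 3 0 3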
There are many tutorials that show you canonical ways to iterate through the contents of a file. I strongly recommend that you use one of those. For instance:
with open("f2.txt", "r") as f2:
for line in f2.readlines():
# Here you can work with the lines in sequence
If you need to deal with the lines in non-sequential order, then
with open("f2.txt", "r") as f2:
content = list(f2.readlines())
# Now you can access content[2], content[1], etc.
I have three python lists:
filePaths
textToFind
textToReplace
The lists are always equal lengths and in the correct order.
I need to open each file in filePaths, find the line in textToFind, and replace the line with textToReplace. I have all the code that populates the lists. I am stuck on making the replacements. I have tried:
for line in fileinput.input(filePath[i], inplace=1):
    sys.stdout.write(line.replace(find[i], replace[i]))
How do I iterate over each file to make the text replacements on each line that matches find?
When you need to use the indices of the items in a sequence while iterating over that sequence, use enumerate.
for i, path in enumerate(filePath):
    for line in fileinput.input(path, inplace=1):
        sys.stdout.write(line.replace(find[i], replace[i]))
Another option would be to use zip, which will give you one item from each sequence in order.
for path, find_text, replace_text in zip(filePath, textToFind, textToReplace):
    for line in fileinput.input(path, inplace=1):
        sys.stdout.write(line.replace(find_text, replace_text))
Note that in Python 2.x, zip produces a new list that can be iterated, so if the sequences you are zipping are huge it will consume memory. Python 3.x zip produces an iterator, so it doesn't have that problem.
With a normal file object you could read the entire file into a variable and perform the string replacement on the whole file at once.
Without more information, I might do something like this:
for my_file in file_paths:
    with open(my_file, 'r') as cin:
        lines = cin.readlines()  # store the file in memory so I can overwrite it
    with open(my_file, 'w') as cout:
        for line in lines:
            line = line.replace(find, replace)  # change as needed
            cout.write(line)
Iterate over all the file paths; read each file's lines into memory first, since the original file is about to be overwritten, then reopen the same path for writing. Do your replace; remember that if there is nothing to replace, Python just leaves the line alone. Then write each line back to the file.
You can read the file into a temporary variable, make changes, and then write it back:
with open('file', 'r') as f:
    text = f.read()

with open('file', 'w') as f:
    f.write(text.replace('aaa', 'bbb'))
I have a text file called urldata.txt which I am opening and reading line by line. I wrote a for loop to read it line by line, but I want to save the output I receive as a list.
Here is what I have:
textdata = open("urldata.txt", "r")
for line in textdata:
    print(line)
This prints:
http://www.google.com
https://twitter.com/search?q=%23ASUcis355
https://github.com/asu-cis-355/course-info
I want to save these lines above as a list. Any suggestions?
I have tried appending and such; however, being new to Python, I'm not sure how to go about this.
You just want a list of every line of the file?
urls = open("urldata.txt").read().splitlines()
If you just want the lines as a list, that's trivial:
with open("urldata.txt") as textdata:
lines = list(textdata)
If you want newlines stripped, use a list comprehension to do it:
with open("urldata.txt") as textdata:
lines = [line.rstrip('\r\n') for line in textdata]
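And if you prefer the explicit append loop you were already trying, that works too; a minimal sketch using the same urldata.txt:

lines = []
with open("urldata.txt") as textdata:
    for line in textdata:
        lines.append(line.rstrip('\n'))   # strip the trailing newline before storing

print(lines)   # a list of the three URLs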