How to read specific parts of an input file in Python 3?

Let's say I have an input file with the following data:
50 50
A
B
C
D
I know that I can extract the first line using the map function as follows:
x,y= map(int, input().split())
But I am unsure how I can retrieve the next 4 lines and put them into a list. I tried using the splitlines() function, since each value is on a separate line, but that only returns the first value.
strings = input().splitlines()
How can I choose what parts of the input file I want to read and then store them in respective variables?

Open the file, read all the lines into a list, and then do what you want with them:
with open("[filename]", "r") as f:
    lines = list(map(lambda l: l.strip(), f.readlines()))
# do whatever with the lines here
# use lines.pop(0) if you want to remove the line from the list
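As a minimal sketch of how this applies to the sample data in the question: the function below reads the first line as two integers and collects the remaining lines in a list. The function name `parse_input` is hypothetical, and `io.StringIO` stands in for a real file purely so the example is self-contained.

```python
import io

def parse_input(stream):
    """Read 'x y' from the first line, then the remaining non-empty lines."""
    # First line: two whitespace-separated integers.
    x, y = map(int, stream.readline().split())
    # Remaining lines: strip the newline and skip blank lines.
    rest = [line.strip() for line in stream if line.strip()]
    return x, y, rest

# Stand-in for the input file from the question (an assumption for the demo).
sample = io.StringIO("50 50\nA\nB\nC\nD\n")
x, y, strings = parse_input(sample)
print(x, y, strings)
```

With a real file, pass the object returned by open(...) instead; when the data arrives on standard input, sys.stdin should work the same way.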

Related

Reading an nth line of a textfile in python determined from a list

I have a function gen_rand_index that generates a random group of numbers in list format, such as [3,1] or [3,2,1]
I also have a text file that reads something like this:
red $1
green $5
blue $6
How do I write a function so that once python generates this list of numbers, it automatically reads that # line in the text file? So if it generated [2,1], instead of printing [2,1] I would get "green $5, red $1" aka the second line in the text file and the first line in the text file?
I know that you can do print(line[2]) and commands like that, but this won't work in my case because each time I am getting a different random number of a line that I want to read, it is not a set line I want to read each time.
row = str(result[gen_rand_index]) #result[gen_rand_index] gives me the random list of numbers
file = open("Foodinventory.txt", 'r')
for line in file:
    print(line[row])
file.close()
I have this so far, but I am getting this
error: invalid literal for int() with base 10: '[4, 1]'
I also have gotten
TypeError: string indices must be integers
but I have tried replacing str with int and many things like that, and I'm thinking the way I'm approaching this is just wrong. Can anyone help me? (I have only been coding for a couple of days now, so I apologize in advance if this question is really basic.)
Okay, let us first get some stuff out of the way.
Whenever you access something from a list, the thing you put inside the square brackets [] should be an integer, e.g. [5]. This tells Python that you want the element at index 5 (the sixth element, since indexing starts at 0). It cannot be ["5"], because "5" in this case would be treated as a string.
Therefore the line row = str(result[gen_rand_index]) should actually just be row = ... without the call to str. This is why you got the TypeError about indices needing to be integers.
Secondly, as per your description gen_rand_index would return a list of numbers.
So going by that, why don't you try this:
indices_to_pull = gen_rand_index()
with open("Foodinventory.txt", 'r') as file_handle:
    file_contents = file_handle.readlines() # If the file is small and simple this would work fine
answer = []
for index in indices_to_pull:
    answer.append(file_contents[index-1])
Explanation
We get the indices of the file lines from gen_rand_index.
We read the entire file into memory using readlines().
Then we get the lines we want. Remember to subtract 1, as the list is indexed from 0.
The error you are getting is because you're trying to index a string variable (line) with a string index (row). Presumably row will contain something like '[2,3,1]'.
However, even if row was a numerical index, you're not indexing what you think you're indexing. The variable line is a string, and it contains (on any given iteration) one line of the file. Indexing this variable will give you a single character. For example, if line contains green $5, then line[2] will yield 'e'.
It looks like your intent is to index into a list of strings, which represent all the lines of the file.
If your file is not overly large, you can read the entire file into a list of lines, and then just index that array:
with open('file.txt') as fp:
    lines = fp.readlines()
print(lines[2])
In this case, lines[2] will yield the string 'blue $6\n'.
To discard the trailing newline, use lines[2].strip() instead.
I'll go line by line and raise some issues.
row = str(result[gen_rand_index]) #result[gen_rand_index] gives me the random list of numbers
Are you sure it is gen_rand_index and not gen_rand_index()? If gen_rand_index is a function, you should call the function. In the code you have, you are not calling the function, instead you are using the function directly as an index.
file = open("Foodinventory.txt", 'r')
for line in file:
print(line[row])
file.close()
The correct Python idiom for opening a file and reading it line by line is
with open("Foodinventory.txt", "r") as f:
    for line in f:
        ...
This way you do not have to close the file; the with clause does this for you automatically.
Now, what you want to do is to print the lines of the file that correspond to the elements in your variable row. So what you need is an if statement that checks if the line number you just read from the file corresponds to the line number in your array row.
with open("Foodinventory.txt", "r") as f:
    for i, line in enumerate(f):
        if i == row[i]:
            print(line)
But this is wrong: it would work only if your list's elements are ordered. That is not the case in your question. So let's think a little bit. You could iterate over your file multiple times, and each time you iterate over it, print out one line. But this will be inefficient: it will take time O(nm) where n==len(row) and m == number of lines in your file.
A better solution is to read all the lines of the file and save them to an array, then print the corresponding indices from this array:
arr = []
with open("Foodinventory.txt", "r") as f:
    arr = list(f)
for i in row:
    print(arr[i - 1]) # arrays are zero-indexed
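Putting the pieces from the answers above together, a small self-contained sketch might look like the following. The helper name `pick_lines` is hypothetical, the list [2, 1] stands in for whatever gen_rand_index() returns, and a temporary file replaces the real Foodinventory.txt so the example can run anywhere.

```python
import os
import tempfile

def pick_lines(path, line_numbers):
    """Return the lines at the given 1-based line numbers, in the order requested."""
    with open(path, "r") as f:
        # Strip the trailing newline from every line up front.
        all_lines = [line.rstrip("\n") for line in f]
    # Line numbers start at 1, list indices at 0, hence the "- 1".
    return [all_lines[n - 1] for n in line_numbers]

# Build a throwaway copy of the inventory file from the question.
with tempfile.TemporaryDirectory() as tmp:
    inventory = os.path.join(tmp, "Foodinventory.txt")
    with open(inventory, "w") as f:
        f.write("red $1\ngreen $5\nblue $6\n")
    picked = pick_lines(inventory, [2, 1])  # stand-in for gen_rand_index()
    print(", ".join(picked))
```

Because the whole file is read once and then indexed, the order of the requested line numbers does not matter, and repeated numbers are fine too.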

Python list in lists from datafile with different layouts

I'm a little stuck here. I'm trying to read a data file in Python 3.
I want to make a list of lists
*The first 36 lines:
each line is a list that's appended to the main list
f = open("a.data","r")
h = []
a = []
for word in range(0,797):
    g = f.readline()
    h.append(g.strip())
    a.append(h)
    h = []
But from the 37th line onward,
I need a loop where this happens:
If the new line is a blank line, skip it.
The next 4 lines should go into a new list 'h', and 'h' should then be appended to 'a'.
The thing is that readline() acts crazy with everything I tried.
Any suggestions?
Thanks in advance.
PS: the strings in the 4 lines are separated by a ;
Try this:
import re
with open('a.data', 'r') as f:
    lst = re.split(';|\n{1,2}', f.read())
length = 36
lstoflst = [lst[i:i+length] for i in range(0, len(lst)-1, length)]
print(lstoflst)
I read the whole file, split at newlines and semicolons, and build a list of lists with a list comprehension.
Please consider a better data format for your next report, like csv if possible.
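An alternative sketch that follows the two layouts described in the question more literally: the first lines each become their own list, and after that every blank-line-separated block becomes one list of ';'-separated fields. This is only one reading of the question. The function name `read_mixed_layout` and its `header_lines` parameter are hypothetical, and treating a blank line as the block delimiter (rather than counting exactly 4 lines) is an assumption.

```python
import io

def read_mixed_layout(stream, header_lines=36):
    """Parse a stream whose first lines are single records and whose
    remainder consists of blank-line-separated blocks."""
    result = []
    block = []
    for number, raw in enumerate(stream, start=1):
        line = raw.strip()
        if number <= header_lines:
            # Header part: every line becomes its own one-element list.
            result.append([line])
        elif line:
            # Block part: split the line on ';' and collect the fields.
            block.extend(line.split(";"))
        elif block:
            # Blank line after some content: the current block is complete.
            result.append(block)
            block = []
    if block:
        # The file may end without a trailing blank line.
        result.append(block)
    return result

# Tiny stand-in: 2 header lines instead of 36, then two blocks.
sample = io.StringIO("h1\nh2\n\na;b\nc\n\nd;e\nf\n")
parsed = read_mixed_layout(sample, header_lines=2)
print(parsed)
```

For the real file, open a.data and pass the file object with the default header_lines=36.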

reading from a particular tuple onwards from a file in python

Using seek and tell is not working properly, as tell returns the current position in bytes; I need the line number rather than the position of the file pointer to proceed.
I have a file glass.csv and I need to cluster the datasets. Each line in the file contains a number 1,2,3... like the below:
65,1.52172,13.48,3.74,0.90,72.01,0.18,9.61,0.00,0.07,1
66,1.52099,13.69,3.59,1.12,71.96,0.09,9.40,0.00,0.00,1
67,1.52152,13.05,3.65,0.87,72.22,0.19,9.85,0.00,0.17,1
68,1.52152,13.05,3.65,0.87,72.32,0.19,9.85,0.00,0.17,1
69,1.52152,13.12,3.58,0.90,72.20,0.23,9.82,0.00,0.16,1
70,1.52300,13.31,3.58,0.82,71.99,0.12,10.17,0.00,0.03,1
71,1.51574,14.86,3.67,1.74,71.87,0.16,7.36,0.00,0.12,2
72,1.51848,13.64,3.87,1.27,71.96,0.54,8.32,0.00,0.32,2
73,1.51593,13.09,3.59,1.52,73.10,0.67,7.83,0.00,0.00,2
74,1.51631,13.34,3.57,1.57,72.87,0.61,7.89,0.00,0.00,2
142,1.51851,13.20,3.63,1.07,72.83,0.57,8.41,0.09,0.17,2
143,1.51662,12.85,3.51,1.44,73.01,0.68,8.23,0.06,0.25,2
144,1.51709,13.00,3.47,1.79,72.72,0.66,8.18,0.00,0.00,2
145,1.51660,12.99,3.18,1.23,72.97,0.58,8.81,0.00,0.24,2
146,1.51839,12.85,3.67,1.24,72.57,0.62,8.68,0.00,0.35,2
147,1.51769,13.65,3.66,1.11,72.77,0.11,8.60,0.00,0.00,3
148,1.51610,13.33,3.53,1.34,72.67,0.56,8.33,0.00,0.00,3
149,1.51670,13.24,3.57,1.38,72.70,0.56,8.44,0.00,0.10,3
150,1.51643,12.16,3.52,1.35,72.89,0.57,8.53,0.00,0.00,3
I need to take some inputs from those tuples having 1 as the last number and save it in another file, (train.txt), and the remaining in another file, (test.txt). Likewise I need to take certain lines from those having 2 as the last number and append to the first file i.e. train.txt and remaining to test.txt.
With my current approach I cannot get the second group; it just appends the first result again.
The easiest way, assuming that you have a large file that you cannot simply load into memory, would be to use one file per class to do your sorting. If it is a small(ish) input file, then just load it as a comma-separated file using the csv module.
As a quick and dirty method (assuming smallish files):
data = []
with open('glass.csv', 'r') as infile:
    for line in infile:
        linedata = [float(val) for val in line.strip().split(',')]
        data.append(linedata)
adata = sorted(data, key=lambda items: items[-1])
## Then open both your output files and write them in the required fields.
The default behavior for reading a text file is line by line. You can just do something like this:
with open('input.csv', 'r') as f, open('output_1.csv', 'w') as output_1, open('output_2.csv', 'w') as output_2:
    for line in f:
        line_fields = line.strip().split(',')
        if line_fields[-1] == '1':
            output_1.write(line)
            continue
        if line_fields[-1] == '2':
            output_2.write(line)
Or you can use the csv module, which makes this much easier: https://docs.python.org/2/library/csv.html
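A rough sketch of the csv-module route for the train/test split described in the question. It assumes the class label is the last column and that the first `train_per_class` rows of each class go to train.txt while the rest go to test.txt. The function name `split_by_class` and that parameter are hypothetical, since the question does not say how many rows of each class belong in the training set.

```python
import csv
import os
import tempfile
from collections import Counter

def split_by_class(src, train_path, test_path, train_per_class):
    """Send the first `train_per_class` rows of every class to train_path,
    and all remaining rows to test_path."""
    seen = Counter()  # rows seen so far, per class label
    with open(src, newline="") as infile, \
         open(train_path, "w", newline="") as train, \
         open(test_path, "w", newline="") as test:
        train_writer = csv.writer(train)
        test_writer = csv.writer(test)
        for row in csv.reader(infile):
            if not row:
                continue  # skip empty lines
            label = row[-1]
            seen[label] += 1
            target = train_writer if seen[label] <= train_per_class else test_writer
            target.writerow(row)

# Demo on a shortened, made-up stand-in for glass.csv.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "glass.csv")
    train_path = os.path.join(tmp, "train.txt")
    test_path = os.path.join(tmp, "test.txt")
    with open(src, "w") as f:
        f.write("65,1.52172,1\n66,1.52099,1\n71,1.51574,2\n72,1.51848,2\n")
    split_by_class(src, train_path, test_path, train_per_class=1)
    with open(train_path) as f:
        train_rows = f.read().splitlines()
    with open(test_path) as f:
        test_rows = f.read().splitlines()
    print(train_rows, test_rows)
```

Counting rows per label in a single pass avoids seek and tell entirely, so the byte-offset problem from the question never comes up.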

Writing from one file to another

I've been stuck on this Python homework problem for a while now: "Write a complete python program that reads 20 real numbers from a file inner.txt and outputs them in sorted order to a file outter.txt."
Alright, so what I do is:
f=open('inner.txt','r')
n=f.readlines()
n.replace('\n',' ')
n.sort()
x=open('outter.txt','w')
x.write(print(n))
So my thought process is: Open the text file, n is the list of read lines in it, I replace all the newline prompts in it so it can be properly sorted, then I open the text file I want to write to and print the list to it. First problem is it won't let me replace the new line functions, and the second problem is I can't write a list to a file.
I just tried this:
>>> x= "34\n"
>>> print(int(x))
34
So, you shouldn't have to filter out the "\n" like that, but can just put it into int() to convert it into an integer. This is assuming you have one number per line and they're all integers.
You then need to store each value into a list. A list has a .sort() method you can use to then sort the list.
EDIT:
Forgot to mention: as others have already said, you need to iterate over the values in n, as it's a list, not a single item.
Here's a step by step solution that fixes the issues you have :)
Opening the file, nothing wrong here.
f=open('inner.txt','r')
Don't forget to close the file once you have finished reading from it:
f.close()
n is now a list of each line:
n=f.readlines()
Lists have no replace method, so I suggest changing the above line to n = f.read(). Then this will work (don't forget to reassign n, as strings are immutable):
n = n.replace('\n','')
You still only have a string full of numbers. However, instead of replacing the newline character, I suggest splitting the string into lines (splitlines() avoids the empty trailing entry that split('\n') produces when the file ends with a newline):
n = n.splitlines()
Then, convert these strings to integers:
n = [int(x) for x in n]
Now, these two will work:
n.sort()
x=open('outter.txt','w')
You want to write the numbers themselves, so use this:
x.write('\n'.join(str(i) for i in n))
Finally, close the file:
x.close()
Using a context manager (the with statement) is good practice as well, when handling files:
with open('inner.txt', 'r') as f:
    # do stuff with f
# automatically closed at the end
I guess real means float. So you have to convert your values to float to sort them properly.
raw_lines = f.readlines()
floats = map(float, raw_lines)
Then you have to sort them. To write the result back, you have to convert to strings and join with line endings:
sorted_floats = sorted(floats)
sorted_as_string = map(str, sorted_floats)
result = '\n'.join(sorted_as_string)
Finally, you have to write result to the destination.
OK, let's look step by step at what you want to do.
First: Read some integers out of a textfile.
Pythonic Version:
fileNumbers = [int(line) for line in open(r'inner.txt', 'r').readlines()]
Easy to get version:
fileNumbers = list()
with open(r'inner.txt', 'r') as fh:
    for singleLine in fh.readlines():
        fileNumbers.append(int(singleLine))
What it does:
Open the file
Read each line, convert it to int (because readlines returns string values) and append it to the list fileNumbers
Second: Sort the list
fileNumbers.sort()
What it does:
The sort method sorts the list by its values, e.g. [5,3,2,4,1] -> [1,2,3,4,5]
Third: Write it to a new textfile
with open(r'outter.txt', 'w') as fh:
    for entry in fileNumbers:
        fh.write('{0}\n'.format(entry))
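For reference, the three steps can be combined into one short program. This is a sketch under a few assumptions: one number per line, "real" meaning float, and temporary files standing in for inner.txt and outter.txt so the example runs anywhere. The function name `sort_numbers_file` is hypothetical.

```python
import os
import tempfile

def sort_numbers_file(src, dst):
    """Read one real number per line from src and write them sorted to dst."""
    with open(src, "r") as infile:
        # float() tolerates surrounding whitespace, including the newline;
        # blank lines are skipped.
        numbers = [float(line) for line in infile if line.strip()]
    numbers.sort()
    with open(dst, "w") as outfile:
        for value in numbers:
            outfile.write(f"{value}\n")
    return numbers

# Demo with three made-up numbers instead of the 20 from the assignment.
with tempfile.TemporaryDirectory() as tmp:
    inner = os.path.join(tmp, "inner.txt")
    outter = os.path.join(tmp, "outter.txt")
    with open(inner, "w") as f:
        f.write("3.5\n-1\n2.25\n")
    sort_numbers_file(inner, outter)
    with open(outter) as f:
        written = f.read()
    print(written)
```

Converting to float before sorting matters: sorting the raw strings would compare them character by character, so "10" would come before "9".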

How can I append to the new line of a file while using write()?

In Python:
Let's say I have a loop, during each cycle of which I produce a list with the following format:
['n1','n2','n3']
After each cycle I would like to append the produced entry to a file (which contains all the outputs from the previous cycles). How can I do that?
Also, is there a way to make a list whose entries are the outputs of this cycle? i.e.
[[],[],[]] where each internal []=['n1','n2','n3'] etc
Writing single list as a line to file
Sure, you can write it to a file after converting it to a string:
with open('some_file.dat', 'w') as f:
    for x in range(10): # assume 10 cycles
        line = []
        # ... (here is your code, appending data to line) ...
        f.write('%r\n' % line) # here you write representation to separate line
Writing all lines at once
When it comes to the second part of your question:
Also, is there a way to make a list whose entries are the outputs of this cycle? i.e. [[],[],[]] where each internal []=['n1','n2','n3'] etc
it is also pretty basic. Assuming you want to save it all at once, just write:
lines = [] # container for a list of lines
for x in range(10): # assume 10 cycles
    line = []
    # ... (here is your code, appending data to line) ...
    lines.append('%r\n' % line) # here you add line to the list of lines
# here "lines" is your list of cycle results
with open('some_file.dat', 'w') as f:
    f.writelines(lines)
Better way of writing a list to file
Depending on what you need, you should probably use a more specialized format rather than a plain text file. Instead of writing list representations (which are okay, but not ideal), you could use e.g. the csv module (the format is similar to an Excel spreadsheet): http://docs.python.org/3.3/library/csv.html
f = open(file, 'a'): the first parameter is the path of the file and the second is the mode; 'a' is append, 'w' is write, 'r' is read, and so on.
In my opinion, you can use f.write(str(list) + '\n') to write one line per loop iteration; alternatively you can use f.writelines(list), which also works as long as the list items are strings.
Hope this can help you:
lVals = []
with open(filename, 'a') as f:
    for x,y,z in zip(range(10), range(5, 15), range(10, 20)):
        lVals.append([x,y,z])
        f.write(str(lVals[-1]) + '\n')
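To tie both parts of the question together, here is a sketch that appends one row per cycle with the csv module and also keeps every cycle's output in a list of lists. The function name `run_cycles` and the placeholder entries are hypothetical; the real loop body would produce its own three values.

```python
import csv
import os
import tempfile

def run_cycles(path, cycles):
    """Append one CSV row per cycle to path and return all rows as a list of lists."""
    results = []
    for i in range(cycles):
        entry = [f"n{i}_1", f"n{i}_2", f"n{i}_3"]  # placeholder for real output
        results.append(entry)
        # Mode "a" appends, so earlier rows in the file are preserved.
        with open(path, "a", newline="") as f:
            csv.writer(f).writerow(entry)
    return results

with tempfile.TemporaryDirectory() as tmp:
    out = os.path.join(tmp, "some_file.csv")
    all_entries = run_cycles(out, 3)
    # Read the file back to confirm each cycle landed on its own line.
    with open(out, newline="") as f:
        stored = list(csv.reader(f))
    print(stored == all_entries)
```

Reopening the file in append mode on every cycle is slower than keeping it open, but it means the file is complete up to the last finished cycle even if a later cycle fails.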
