Python reading from file into multiple lists - python

Could someone point me in the right direction?
I'm wondering how best to read values from a text file, split them up, and put them back into lists at the same index as their corresponding values.
Sorry if this isn't clear; maybe this will help. This is the code that writes the file:
# inside a while loop
with open('values', 'a') as o:
    o.write("{}, {}, {}, {}, {}\n".format(FirstName[currentRow], Surname[currentRow], AnotherValue[currentRow], numberA, numberB))
currentRow += 1
I would like to do the opposite: take the values, formatted as above, and put them back into lists at the same positions. Something like:
# inside a while loop
with open('values', 'r') as o:
    o.read("{}, {}, {}, {}, {}\n".format(FirstName[currentRow], Surname[currentRow], AnotherValue[currentRow], numberA, numberB))
currentRow += 1
Thanks

I think the closest equivalent is to call split on each line that is read in:
FirstName[currentRow], Surname[currentRow], AnotherValue[currentRow], numberA, numberB = o.readline().strip().split(", ")
There is no real equivalent of formatted input, like scanf in C.

You should be able to do something like the following:
first_names = []
surnames = []
another_values = []
number_as = []
number_bs = []
with open('values', 'r') as f:
    for line in f:
        fn, sn, av, na, nb = line.strip().split(', ')
        first_names.append(fn)
        surnames.append(sn)
        another_values.append(av)
        number_as.append(float(na))
        number_bs.append(float(nb))
The for line in f loop iterates over each line in the file, and fn, sn, av, na, nb = line.strip().split(', ') strips the newline \n off the end of each line and splits it on the comma-plus-space separator that the writing code uses.
In practice, though, I would probably use the csv module or something like pandas, which handle edge cases better. For example, the above approach will break if a name or some other value has a comma in it!
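As a rough sketch of the csv-module approach mentioned above: this assumes the same five-column layout with a comma followed by a space as the separator. The sample rows and the in-memory file are made up for illustration; for a real file you would pass open('values', newline='') to csv.reader instead.

```python
import csv
import io

# Hypothetical sample data in the same layout as the question's file.
# A quoted field may contain a comma without breaking the parse.
sample = io.StringIO(
    'Ada, Lovelace, x1, 1.5, 2.0\n'
    'Alan, "Turing, A.", x2, 3.0, 4.5\n'
)

first_names, surnames, another_values = [], [], []
number_as, number_bs = [], []

# skipinitialspace=True drops the space that follows each comma.
for fn, sn, av, na, nb in csv.reader(sample, skipinitialspace=True):
    first_names.append(fn)
    surnames.append(sn)
    another_values.append(av)
    number_as.append(float(na))
    number_bs.append(float(nb))

print(surnames)
print(number_as)
```

The quoted surname survives as a single value here, which is the edge case a plain split cannot handle.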

with open('values.txt', 'r') as f:
    first_names, last_names, another_values, a_values, b_values = \
        (list(tt) for tt in zip(*[l.rstrip().split(', ') for l in f]))
Unless you need to modify the results, the list conversion (list(tt) for tt in ...) is unnecessary; zip alone gives you tuples.
On Python 2, izip from itertools can be used instead of zip; on Python 3, zip is already lazy.
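To make the zip(*...) step less opaque, here is a small sketch of what it does to rows that have already been split. The sample rows are invented for illustration.

```python
# Each inner list is one line of the file after splitting (hypothetical data).
rows = [
    ["Ada", "Lovelace", "x1"],
    ["Alan", "Turing", "x2"],
]

# zip(*rows) groups the i-th item of every row together,
# which turns a list of rows into a sequence of columns.
first_names, surnames, another_values = zip(*rows)

print(first_names)
print(surnames)
```

Each resulting column is a tuple; wrap it in list() only if you need to append to it or modify it later.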

If you are free to choose the file format, saving and loading the data as JSON may be useful.
import json
# test data (list(...) is needed on Python 3, where range objects are not JSON serializable)
FirstName = list(range(5))
Surname = list(range(5, 11))
AnotherValue = list(range(11, 16))
numberAvec = list(range(16, 21))
numberBvec = list(range(21, 26))
# save
all_values = [FirstName, Surname, AnotherValue, numberAvec, numberBvec]
with open("values.txt", "w") as fp:
    json.dump(all_values, fp)
# load
with open("values.txt", "r") as fp:
    FirstName, Surname, AnotherValue, numberAvec, numberBvec = json.load(fp)
values.txt:
[[0, 1, 2, 3, 4], [5, 6, 7, 8, 9, 10], [11, 12, 13, 14, 15], [16, 17, 18, 19, 20], [21, 22, 23, 24, 25]]

Related

How to store data from text files into lists on integers [duplicate]

Hi, hope everyone is okay.
I am trying to find the simplest way to take data from a text file and store it in different variables. Below is the format of the text file:
TEXT FILE:
min:1,2,3,4,5,7,8,9
avg:1,2,3,4
max:1,2,3,4,5,1,2,3,44,55,32,12
I want to take each of these lines, remove the part before the numbers start (min, avg, max and the ':'), and store the numbers in separate variables under the appropriate names.
NOTE: the amount of numbers in each line may differ and shouldn't affect the code.
Desired result in Python:
min = [1,2,3,4,5,7,8,9]
avg = [1,2,3,4]
max = [1,2,3,4,5,1,2,3,44,55,32,12]
The code I have tried:
with open('input.txt', 'r') as input:
    input = input.read()
    input = input.strip().split(',')
After this part I am unsure which method would be best to achieve what I am trying to do.
Any help is appreciated!
There's no reasonable way to generate variables (by name) dynamically. Better to use a dictionary. Something like this:
my_dict = {}
with open('input.txt') as data:
    for line in map(str.strip, data):
        try:
            key, vals = line.split(':')
            my_dict[key.rstrip()] = list(map(int, vals.split(',')))
        except ValueError:
            pass
print(my_dict)
Output:
{'min': [1, 2, 3, 4, 5, 7, 8, 9], 'avg': [1, 2, 3, 4], 'max': [1, 2, 3, 4, 5, 1, 2, 3, 44, 55, 32, 12]}
You can use exec to evaluate a string as code. Only do this with trusted data, to avoid injection attacks.
with open('input.txt', 'r') as fd:
    data = fd.read()
# list of lines
lines = data.split('\n')
# build Python source such as "min = [1,2,3]"
code_format = '\n'.join("{} = [{}]".format(*line.partition(':')[::2]) for line in lines if line)
# execute the string as Python code
exec(code_format)
print(avg)
#[1, 2, 3, 4]
Note that this has a further side effect: the identifiers min and max shadow the built-in functions of the same name. If you try to call those built-in functions after executing the code, you will get TypeError: 'list' object is not callable.
Another way to approach the problem would be to pickle the objects: use pickle.dump to save an object to a file and pickle.load to retrieve it; see the pickle documentation.
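A minimal sketch of the pickle round trip, assuming the data is kept in a dictionary. The sample values and the temporary file name data.pkl are made up for illustration.

```python
import os
import pickle
import tempfile

# Hypothetical data, stored as a dictionary rather than separate variables.
data = {"min": [1, 2, 3], "avg": [1, 2], "max": [4, 5, 6]}

# A throwaway location for the demo; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), "data.pkl")

# pickle works on binary files, hence the "wb" / "rb" modes.
with open(path, "wb") as fp:
    pickle.dump(data, fp)

with open(path, "rb") as fp:
    restored = pickle.load(fp)

print(restored["avg"])
```

As with exec, only unpickle files you trust, because loading a pickle can execute arbitrary code.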
This is how you store it in a Python dictionary:
txtdict = {}
with open('input.txt', 'r') as f:
    for line in f:
        if line.strip():
            name = line.split(':')[0]
            txtdict[name] = [int(i) for j in line.strip().split(':')[1:] for i in j.split(',')]
Output:
{'min': [1, 2, 3, 4, 5, 7, 8, 9],
'avg': [1, 2, 3, 4],
'max': [1, 2, 3, 4, 5, 1, 2, 3, 44, 55, 32, 12]}

How to extract data from a text file and add it to a list?

Python noob here. I have a text file with data arranged in a particular way, shown below.
x = 2,4,5,8,9,10,12,45
y = 4,2,7,2,8,9,12,15
I want to extract the x values and y values from this and put them into their respective arrays for plotting graphs. I looked into some sources but could not find a particular solution as they all used the "readlines()" method that returns as a list with 2 strings. I can convert the strings to integers but the problem that I face is how do I only extract the numbers and not the rest?
I did write some code:
# lists for storing values of x and y
x_values = []
y_values = []
# opening the file and reading the lines
file = open('data.txt', 'r')
lines = file.readlines()
# splitting the first element of the list into parts
x = lines[0].split()
# this is a temporary variable to remove the "," from the string
temp_x = x[2].replace(",","")
# adding the values to the list and converting them to integers
for i in temp_x:
    x_values.append(int(i))
This gets the job done, but I think the method is too crude. Is there a better way to do this?
You can use read().splitlines() and str.removeprefix() (available since Python 3.9):
with open('data.txt') as file:
    lines = file.read().splitlines()
x_values = [int(x) for x in lines[0].removeprefix('x = ').split(',')]
y_values = [int(y) for y in lines[1].removeprefix('y = ').split(',')]
print(x_values)
print(y_values)
# output:
# [2, 4, 5, 8, 9, 10, 12, 45]
# [4, 2, 7, 2, 8, 9, 12, 15]
Since you're new to Python, here's a tip: never open a file without closing it. It is common practice to use with to ensure that. As for your solution, you can do this:
with open('data.txt', 'r') as file:
    # extract the lines
    lines = file.readlines()
# extract the x and y values; strip() removes the trailing newline so the last number is kept
x_values = [
    int(el) for el in lines[0].strip().replace('x = ', '').split(',') if el.isnumeric()
]
y_values = [
    int(el) for el in lines[1].strip().replace('y = ', '').split(',') if el.isnumeric()
]
# the final output
print(x_values, y_values)
Output:
[2, 4, 5, 8, 9, 10, 12, 45] [4, 2, 7, 2, 8, 9, 12, 15]
This version uses a dictionary to store the data.
# read data from file
with open('data.txt', 'r') as fd:
    lines = fd.readlines()
# store in a (x,y)-dictionary
out = {}
for label, coord in zip(('x', 'y'), lines):
    # drop the "x = " prefix, then cast the strings to integers
    out[label] = list(map(int, coord.split('=')[1].split(',')))
# display data
print(out)
#{'x': [2, 4, 5, 8, 9, 10, 12, 45], 'y': [4, 2, 7, 2, 8, 9, 12, 15]}
print(out['y'])
#[4, 2, 7, 2, 8, 9, 12, 15]
If you want the output as lists instead, just replace the main part with
out = []
for coord in lines:
    # drop the prefix, then cast the strings to integers
    out.append(list(map(int, coord.split('=')[1].split(','))))
X, Y = out
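If the labels are not known in advance, one option is to split each line on the '=' sign and use whatever comes before it as the key. This is only a sketch: the helper name parse_named_lists is hypothetical, and it assumes every data line has the form "name = 1,2,3".

```python
def parse_named_lists(lines):
    """Map each 'name = 1,2,3' line to an entry {'name': [1, 2, 3]}."""
    result = {}
    for line in lines:
        name, sep, values = line.partition("=")
        if not sep:  # no '=' found: skip blank or malformed lines
            continue
        # int() tolerates surrounding whitespace, including the trailing newline
        result[name.strip()] = [int(v) for v in values.split(",")]
    return result


# Sample lines as readlines() would return them for the question's file.
sample = ["x = 2,4,5,8,9,10,12,45\n", "y = 4,2,7,2,8,9,12,15\n"]
data = parse_named_lists(sample)
print(data["x"])
print(data["y"])
```

With a real file, pass the open file object directly: parse_named_lists(open_file) iterates over its lines in the same way.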

using while loop to read files

I have 200 files, and I want to take the second column from each. I want to store the second column of each file in a list called "colv", so that colv[0] is the second column of the first file, colv[1] is the second column of the second file, and so on.
I wrote this code, but it does not work: colv[0] ends up being the first number of the second column of the first file. Can anyone help me fix this?
colv = []
i = 1
colvar = "step7_1.colvar"
while os.path.isfile(colvar):
    with open(colvar, "r") as f_in:
        line = next(f_in)
        for line in f_in:
            a = line.split()[1]
            colv.append(a)
    i+=1
    colvar = "step7_%d.colvar" %i
How about using pandas' read_csv(), since you mention that the data has a table-like structure? In particular, you could use
import os
import pandas as pd
colv = []
i = 1
colvar = "step7_1.colvar"
while os.path.isfile(colvar):
    df = pd.read_csv(colvar, sep=',')
    colv.append(list(df[df.columns[1]]))
    i+=1
    colvar = "step7_%d.colvar" %i
It returned
>colv
[[5, 6, 7, 8], [8, 9, 10, 11], [12, 13, 14, 15]]
for my vanilla sample files step7_%d.colvar.
You might need to adjust the separator character with sep.
Use a list comprehension to collect the second element of every line into a list.
import os
colv = []
i = 1
colvar = "step7_1.colvar"
while os.path.isfile(colvar):
    with open(colvar, "r") as f_in:
        next(f_in)  # skip 1st line
        data = [line.split()[1] for line in f_in]
    colv.append(data)
    i+=1
    colvar = "step7_%d.colvar" %i
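The same idea can be sketched as a self-contained script. It assumes whitespace-separated columns and one header line per file, as in the question's code. The helper name read_second_column and the two tiny demo files are invented for illustration.

```python
import os
import tempfile


def read_second_column(path):
    """Return the second whitespace-separated field of every data line."""
    with open(path) as f_in:
        next(f_in, None)  # skip the header line, if there is one
        return [line.split()[1] for line in f_in if line.strip()]


# Create two small demo files in a temporary directory (hypothetical data).
tmpdir = tempfile.mkdtemp()
demo_rows = [("1 5", "2 6"), ("1 8", "2 9")]
for n, rows in enumerate(demo_rows, start=1):
    with open(os.path.join(tmpdir, "step7_%d.colvar" % n), "w") as f_out:
        f_out.write("#header\n" + "\n".join(rows) + "\n")

# Collect one list per file, stopping at the first missing file number.
colv = []
i = 1
path = os.path.join(tmpdir, "step7_%d.colvar" % i)
while os.path.isfile(path):
    colv.append(read_second_column(path))
    i += 1
    path = os.path.join(tmpdir, "step7_%d.colvar" % i)

print(colv)
```

The values stay as strings here; convert them with float() inside the list comprehension if numbers are needed.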

Defining a function to read from a file and count the number of words in each line

I'm pretty new to python. I'm trying to define a function to read from a given file and count the number of words in each line and output the result as a list.
Here's my code:
def nWs(filename):
    with open(filename,'r') as f:
        k=[]
        for line in f:
            num_words=0
            words=line.split()
            num_words +=len(words)
            k.append(num_words)
        print (k)
print( nWs('random_file.txt') )
The expected output is something like:
[1, 22, 15, 10, 11, 13, 10, 10, 6, 0]
But it returns:
[1, 22, 15, 10, 11, 13, 10, 10, 6, 0]
None
I don't understand why None is printed. There's nothing wrong with the text file; it's just random text, and I'm only trying to count the words in one file. Can anyone explain why this happens, and how I can get rid of the None?
I assume the indenting is correct when you tried to run it as it wouldn't run otherwise.
The None is the result of you calling the function in the print statement. Because nWs doesn't return anything, the print statement prints None. You could either call the function without the print statement or instead of using print in the function, use return and then print.
def nWs(filename):
    with open(filename,'r') as f:
        k=[]
        for line in f:
            num_words=0
            words=line.split()
            num_words +=len(words)
            k.append(num_words)
        print (k)
nWs('random_file.txt')
or
def nWs(filename):
    with open(filename,'r') as f:
        k=[]
        for line in f:
            num_words=0
            words=line.split()
            num_words +=len(words)
            k.append(num_words)
        return k
print(nWs('random_file.txt'))
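The difference between printing inside a function and returning from it can be seen in isolation with two toy functions. Their names are made up for this illustration.

```python
def prints_only():
    # Prints, but has no return statement, so the call evaluates to None.
    print("hello")


def returns_value():
    # Hands the value back to the caller instead of printing it.
    return "hello"


result_a = prints_only()
result_b = returns_value()

print(result_a)  # None
print(result_b)  # hello
```

Wrapping prints_only() in print() therefore shows "hello" followed by None, which is the same pattern as in the question.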

write zip array vertical in csv

Is there a way to display zipped data vertically in a CSV file? I tried many different variations of \n and ',' but still can't get the arrays to be vertical.
if __name__ == '__main__': #start of program
    master = Tk()
    newDirRH = "C:/VSMPlots"
    FileName = "J123"
    TypeName = "1234"
    Field = [1,2,3,4,5,6,7,8,9,10]
    Court = [5,4,1,2,3,4,5,1,2,3]
    for field, court in zip(Field, Court):
        stringText = ','.join((str(FileName), str(TypeName), str(Field), str(Court)))
    newfile = newDirRH + "/Try1.csv"
    text_file = open(newfile, "w")
    x = stringText
    text_file.write(x)
    text_file.close()
    print "Done"
This is the method I am looking for. With your code, though, I can't seem to add new columns, as all the columns will repeat 10x.
You are not writing CSV data. You are writing Python string representations of lists. You are writing the whole Field and Court lists each iteration of your loop, instead of writing field and court, and Excel sees the comma in the Python string representation:
J123,1234,[1, 2, 3, 4, 5, 6, 7, 8, 9, 10],[5, 4, 1, 2, 3, 4, 5, 1, 2, 3]
J123,1234,[1, 2, 3, 4, 5, 6, 7, 8, 9, 10],[5, 4, 1, 2, 3, 4, 5, 1, 2, 3]
etc.
while you wanted to write:
J123,1234,1,5
J123,1234,2,4
etc.
Use the csv module to produce CSV files:
import csv
# "wb" is the Python 2 mode; on Python 3 use open(newfile, "w", newline="")
with open(newfile, "wb") as csvfile:
    writer = csv.writer(csvfile)
    for field, court in zip(Field, Court):
        writer.writerow([FileName, TypeName, field, court])
Note the with statement; it takes care of closing the open file object for you. The csv module also makes sure everything is converted to strings.
If you want to write something only on the first row, keep a counter with your items; enumerate() makes that easy:
with open(newfile, "wb") as csvfile:
    writer = csv.writer(csvfile)
    # row of headers
    writer.writerow(['FileName', 'TypeName', 'field', 'court'])
    for i, (field, court) in enumerate(zip(Field, Court)):
        row = [FileName, TypeName] if i == 0 else ['', '']
        writer.writerow(row + [field, court])
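For Python 3, a sketch of the same row-per-pair approach could look like this. It writes to an in-memory buffer so the result can be inspected directly, and the shortened Field and Court lists are sample data. With a real file you would use open(newfile, "w", newline="") in place of the buffer.

```python
import csv
import io

FileName = "J123"
TypeName = "1234"
Field = [1, 2, 3]   # shortened sample data
Court = [5, 4, 1]

# io.StringIO stands in for a file opened with open(newfile, "w", newline="").
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")

# One header row, then one row per (field, court) pair.
writer.writerow(["FileName", "TypeName", "field", "court"])
for field, court in zip(Field, Court):
    writer.writerow([FileName, TypeName, field, court])

print(buffer.getvalue())
```

Each pair from zip ends up on its own line, which is what makes the columns read vertically when the file is opened in a spreadsheet.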
