Python load list in list from text file - python

I am new to Python and want to load a list of lists from a text file. I went through many examples but found no solution. This is what I have so far:
def main():
    mainlist = [[]]
    infile = open('listtxt.txt','r')
    for line in infile:
        mainlist.append(line)
    infile.close()
    print mainlist
`[[],['abc','def', 1],['ghi','jkl',2]]`
However, what I want is something like this:
[['abc','def',1],['ghi','jkl',2]]
My file contains:
'abc','def',1
'ghi','jkl',2
'mno','pqr',3
What I want is that when I access the list,
print mainlist[0]
it returns
'abc','def',1
Any help will be highly appreciated.
Thanks,

It seems to me that you could do this as:
from ast import literal_eval
with open('listtxt.txt') as f:
    mainlist = [list(literal_eval(line)) for line in f]
This is the easiest way to make sure that the types of the elements are preserved. For example, a line like:
"foo","bar",3
will be transformed into two strings and an integer. Of course, each line needs to be formatted as a Python tuple, and this probably isn't the fastest approach due to its generality and simplicity.
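A minimal sketch of the literal_eval approach, using io.StringIO as a stand-in for the real file so it runs on its own. The sample contents are assumed to match the format shown in the question.

```python
from ast import literal_eval
from io import StringIO

# Stand-in for open('listtxt.txt'); the contents are assumed sample data.
sample = StringIO("'abc','def',1\n'ghi','jkl',2\n'mno','pqr',3\n")

# Each line parses as a Python tuple literal; blank lines are skipped.
mainlist = [list(literal_eval(line)) for line in sample if line.strip()]

print(mainlist[0])
```

literal_eval only accepts Python literals, so a line that is not a valid tuple raises an error instead of being silently misread.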

Maybe something like this.
mainlist = []
infile = open('listtxt.txt','r')
for line in infile:
    mainlist.append(line.strip().split(','))
infile.close()
print mainlist

You're initializing mainlist with an empty list as first element, rather than as an empty list itself. Change:
mainlist = [[]]
to
mainlist = []
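A small sketch of the difference between the two initializations; the names with_stray and clean are only illustrative.

```python
# Starting from [[]] leaves a stray empty list at index 0.
with_stray = [[]]
with_stray.append(['abc', 'def', 1])

# Starting from [] contains only the rows that were appended.
clean = []
clean.append(['abc', 'def', 1])

print(with_stray)  # [[], ['abc', 'def', 1]]
print(clean)       # [['abc', 'def', 1]]
```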

I'd try something like:
with open('listtxt.txt', 'r') as f:
    mainlist = [line for line in f]

mainlist = []
infile = open('filelist.txt', 'r')
for line in infile:
    line = line.replace('\n', '').replace('[', '').replace(']', '').replace("'", "").replace(' ', '')
    mainlist.append(line.split(','))
infile.close()

You can use the json module like below (Python 3.x):
import json
def main():
    mainlist = [[]]
    infile = open('listtxt.txt','r')
    data = json.loads(infile.read())
    mainlist.append(data)
    infile.close()
    print(mainlist)
>>> [[],['abc','def', 1],['ghi','jkl',2]]
Your "listtxt.txt" file should look like this:
[["abc","def", 1],["ghi","jkl",2]]
To export your list, do this:
def export():
    with open("listtxt.txt", 'w') as export_file:
        json.dump(mainlist, export_file)
The json module can load and dump nested lists directly.
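As a rough illustration of the round trip, here is a sketch that uses json.dumps and json.loads on a string instead of a real file, so it runs without touching the disk. The sample data is assumed.

```python
import json

mainlist = [["abc", "def", 1], ["ghi", "jkl", 2]]

# dumps produces the same text that json.dump would write to the file.
text = json.dumps(mainlist)

# loads parses the same text that json.load would read back from the file.
restored = json.loads(text)

print(restored[0])
```

Note that JSON requires double-quoted strings, so a file written with single quotes (as in the question) would not parse with json.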

I had a list of surnames - one per line - in a text file which I wanted to read into a list. Here's how I did it (remembering to strip the newline character).
surnames = [name.strip("\n") for name in open("surnames.txt", "r")]

Related

read text file in python and extract specific value in each line?

I have a text file in which each line looks like this:
n:1 mse_avg:8.46 mse_y:12.69 mse_u:0.00 mse_v:0.00 psnr_avg:38.86 psnr_y:37.10 psnr_u:inf psnr_v:inf
n:2 mse_avg:12.20 mse_y:18.30 mse_u:0.00 mse_v:0.00 psnr_avg:37.27 psnr_y:35.51 psnr_u:inf psnr_v:inf
I need to read each line and extract psnr_y and its value into a matrix. Does Python have functions for reading a text file like this? I have MATLAB code for this, but I need Python code and I am not familiar with Python's functions. Could you please help me with this?
This is the MATLAB code:
opt = {'Delimiter',{':',' '}};
fid = fopen('data.txt','rt');
nmc = nnz(fgetl(fid)==':');
frewind(fid);
fmt = repmat('%s%f',1,nmc);
tmp = textscan(fid,fmt,opt{:});
fclose(fid);
fnm = [tmp{:,1:2:end}];
out = cell2struct(tmp(:,2:2:end),fnm(1,:),2)
You can use regex like below:
import re
with open('textfile.txt') as f:
    a = f.readlines()
pattern = r'psnr_y:([\d.]+)'
for line in a:
    print(re.search(pattern, line)[1])
This code prints only the value of psnr_y. Replace [1] with [0] to get the full match, such as "psnr_y:37.10".
If you want to assign it into a list, the code would look like this:
import re
a_list = []
with open('textfile.txt') as f:
    a = f.readlines()
pattern = r'psnr_y:([\d.]+)'
for line in a:
    a_list.append(re.search(pattern, line)[1])
Use the regular expression
r'psnr_y:([\d.]+)'
on each line read, and extract match.group(1) from the result.
If needed, convert it to a float: float(match.group(1))
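These steps can be sketched as a small function. The name extract_psnr_y and the compiled PATTERN constant are illustrative, and the sample lines are copied from the question.

```python
import re

PATTERN = re.compile(r'psnr_y:([\d.]+)')

def extract_psnr_y(lines):
    """Return the psnr_y value of each line as a float, skipping lines without a match."""
    values = []
    for line in lines:
        match = PATTERN.search(line)
        if match:
            values.append(float(match.group(1)))
    return values

sample = [
    "n:1 mse_avg:8.46 mse_y:12.69 mse_u:0.00 mse_v:0.00 psnr_avg:38.86 psnr_y:37.10 psnr_u:inf psnr_v:inf",
    "n:2 mse_avg:12.20 mse_y:18.30 mse_u:0.00 mse_v:0.00 psnr_avg:37.27 psnr_y:35.51 psnr_u:inf psnr_v:inf",
]
print(extract_psnr_y(sample))
```

Because the pattern only matches digits and dots, a line with psnr_y:inf would be skipped here rather than converted.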
Since I hate regex, I would suggest:
s = 'n:1 mse_avg:8.46 mse_y:12.69 mse_u:0.00 mse_v:0.00 psnr_avg:38.86 psnr_y:37.10 psnr_u:inf psnr_v:inf \nn:2 mse_avg:12.20 mse_y:18.30 mse_u:0.00 mse_v:0.00 psnr_avg:37.27 psnr_y:35.51 psnr_u:inf psnr_v:inf'
lst = s.split('\n')
out = []
for line in lst:
    psnr_y_pos = line.index('psnr_y:')
    next_key = line[psnr_y_pos:].index(' ')
    psnr_y = line[psnr_y_pos+7:psnr_y_pos+next_key]
    out.append(psnr_y)
print(out)
out is a list of the values of psnr_y in each line.
For a simple answer with no need to import additional modules, you could try:
rows = []
with open("my_file", "r") as f:
    for row in f.readlines():
        value_pairs = row.strip().split(" ")
        print(value_pairs)
        values = {pair.split(":")[0]: pair.split(":")[1] for pair in value_pairs}
        print(values["psnr_y"])
        rows.append(values)
print(rows)
This gives you a list of dictionaries (basically JSON structure but with python objects).
This probably won't be the fastest solution but the structure is nice and you don't have to use regex
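If numeric values are wanted instead of strings, the same key:value idea can be sketched as follows. The helper name parse_line is hypothetical; float() accepts the string "inf", so those fields convert as well.

```python
def parse_line(line):
    """Turn 'key:value key:value ...' into a dict with float values."""
    pairs = (item.split(":", 1) for item in line.split())
    return {key: float(value) for key, value in pairs}

# Sample line copied from the question.
row = parse_line(
    "n:1 mse_avg:8.46 mse_y:12.69 mse_u:0.00 mse_v:0.00 "
    "psnr_avg:38.86 psnr_y:37.10 psnr_u:inf psnr_v:inf"
)
print(row["psnr_y"])
```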
import fileinput
import re
for line in fileinput.input():
    row = dict([s.split(':') for s in re.findall(r'[\S]+:[\S]+', line)])
    print(row['psnr_y'])
To verify,
python script_name.py < /path/to/your/dataset.txt

Reading from text file and storing in array [duplicate]

If I have a text file containing the following numbers:
5.078780 5.078993
7.633073 7.633180
2.919274 2.919369
3.410284 3.410314
How can I read it and store it in an array, so that it becomes:
[[5.078780,5.078993],[7.633073,7.633180],[2.919274,2.919369],[3.410284,3.410314]]
with open('test.txt', 'r') as file:
    output = [line.strip().split(' ') for line in file.readlines()]
# Cast strings to floats
output = [[float(j) for j in i] for i in output]
print(output)
should give the desired output:
[[5.07878, 5.078993], [7.633073, 7.63318], [2.919274, 2.919369], [3.410284, 3.410314]]
Approach:
Create a result list, result = [].
Split the text by newlines (\n).
Then, in a for loop:
  split each line on a space character and assign the parts to a tuple;
  append the tuple to the result list.
I'm refraining from writing code here to let you work it out.
This should do
with open("data.txt", "r") as myfile:
    data = myfile.readlines()
for i in range(len(data)):
    data[i] = data[i].split()
First, retrieve the file content as a list of strings (each string is one line of the file):
with open("myfile.txt", 'r') as f:
    file_content = f.readlines()
Refer to the open documentation for more: https://docs.python.org/3/library/functions.html#open
Then create a list:
content_list = []
Then fill it in a for loop: split each line on the space with split(), which gives a list of the two values, and append that list to content_list:
for line in file_content:
    values = line.split(' ') # split the line at the space
    content_list.append(values)
By the way, this can be simplified with a list comprehension:
content_list = [s.split(' ') for s in file_content]
This should work,
with open('filepath') as f:
    array = [line.split() for line in f.readlines()]
Python provides the perfect module for this, it's called csv:
import csv
def csv_to_array(file_name, **kwargs):
    with open(file_name) as csvfile:
        reader = csv.reader(csvfile, **kwargs)
        return [list(map(float, row)) for row in reader]
print(csv_to_array('test.csv'))
If you later have a file with a different field separator, say ";", then you'll just have to change the call to:
print(csv_to_array('test.csv', delimiter=';'))
Note that if you don't care about importing numpy then this solution is even better.
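For the space-separated data in the question, the delimiter would be a single space. Here is a sketch of that case, using io.StringIO in place of a real file and assuming exactly one space between the two values:

```python
import csv
from io import StringIO

# Stand-in for the file; the first two rows are taken from the question.
sample = StringIO("5.078780 5.078993\n7.633073 7.633180\n")

reader = csv.reader(sample, delimiter=' ')

# Empty rows (from blank lines) are skipped before converting to float.
rows = [[float(x) for x in row] for row in reader if row]
print(rows)
```

With runs of several spaces between values, csv.reader would yield empty fields, in which case line.split() is the simpler tool.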
To convert to this exact format:
with open('filepath', 'r') as f:
    raw = f.read()
arr = [[float(j) for j in i.split(' ')] for i in raw.splitlines()]
print(arr)
Output:
[[5.07878, 5.078993], [7.633073, 7.63318], [2.919274, 2.919369], [3.410284, 3.410314]]
with open('blah.txt', 'r') as file:
    a = [[l.split(' ')[0], l.split(' ')[1]] for l in file.readlines()]

Nested lists in python containing a single string and not single letters

I need to load text from a file into a 2-dimensional list. The file contains several lines, and each line contains letters separated by commas. When I run this, I get a 2-dimensional list, but each nested list contains a single string instead of separate values, so I cannot iterate over the letters. How do I solve this?
def read_matrix_file(filename):
    matrix = []
    with open(filename, 'r') as matrix_letters:
        for line in matrix_letters:
            line = line.split()
            matrix.append(line)
    return matrix
result:
[['a,p,p,l,e'], ['a,g,o,d,o'], ['n,n,e,r,t'], ['g,a,T,A,C'], ['m,i,c,s,r'], ['P,o,P,o,P']]
I need each letter in the nested lists to be a separate string so I can use them.
Thanks in advance.
The split() method splits on whitespace by default. You can fix this by passing the string you want to split on. In this case, that would be a comma. The code below should work.
def read_matrix_file(filename):
    matrix = []
    with open(filename, 'r') as matrix_letters:
        for line in matrix_letters:
            line = line.split(',')
            matrix.append(line)
    return matrix
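To see why the separator matters, here is a small sketch on one sample line. The line content is taken from the question's output, and the variable names are only illustrative.

```python
line = "a,p,p,l,e\n"

# No argument: split on whitespace, so the whole line stays one item.
by_whitespace = line.split()

# Split on commas: separate letters, but the trailing newline is kept.
by_comma = line.split(',')

# Strip first, then split: separate letters without the newline.
stripped = line.strip().split(',')

print(by_whitespace)  # ['a,p,p,l,e']
print(by_comma)       # ['a', 'p', 'p', 'l', 'e\n']
print(stripped)       # ['a', 'p', 'p', 'l', 'e']
```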
The input format you described conforms to CSV format. Python has a library just for reading CSV files. If you just want to get the job done, you can use this library to do the work for you. Here's an example:
Input(test.csv):
a,string,here
more,strings,here
Code:
>>> import csv
>>> lines = []
>>> with open('test.csv') as file:
...     reader = csv.reader(file)
...     for row in reader:
...         lines.append(row)
...
>>>
Output:
>>> lines
[['a', 'string', 'here'], ['more', 'strings', 'here']]
Using the strip() function will get rid of the new line character as well:
def read_matrix_file(filename):
    matrix = []
    with open(filename, 'r') as matrix_letters:
        for line in matrix_letters:
            line = line.split(',')
            line[-1] = line[-1].strip()
            matrix.append(line)
    return matrix

Python - Read in Comma Separated File, Create Two lists

New to Python here and I'm trying to learn/figure out the basics. I'm trying to read in a file in Python that has comma separated values, one to a line. Once read in, these values should be separated into two lists, one list containing the value before the "," on each line, and the other containing the value after it.
I've played around with it for quite a while, but I just can't seem to get it.
Here's what I have so far...
with open ("mid.dat") as myfile:
    data = myfile.read().replace('\n',' ')
print(data)
list1 = [x.strip() for x in data.split(',')]
print(list1)
list2 = ?
List 1 creates a list, but it's not correct. List 2, I'm not even sure how to tackle.
PS - I have searched other similar threads on here, but none of them seem to address this properly. The file in question is not a CSV file, and needs to stay as a .dat file.
Here's a sample of the data in the .dat file:
113.64,889987.226
119.64,440987774.55
330.43,446.21
Thanks.
Use str.split():
list1 = []
list2 = []
with open("mid.dat") as myfile:
    for line in myfile:
        line = line.rstrip().split(",")
        list1.append(line[0])
        list2.append(line[1])
Python's rstrip() method strips all kinds of trailing whitespace by default, so it removes the newline character "\n" too.
If you want to use only builtin packages, you can use csv.
import csv
with open("mid.dat") as myfile:
    csv_records = csv.reader(myfile)
    list1 = []
    list2 = []
    for row in csv_records:
        list1.append(row[0])
        list2.append(row[1])
You could try this, which creates lists of floats rather than strings:
from ast import literal_eval
with open("mid.dat") as f:
    list1, list2 = map(list, zip(*map(literal_eval, f.readlines())))
This can be simplified if you don't mind list1 and list2 being tuples.
The list(zip(*my_2d_list)) pattern is a pretty common way of transposing 2D lists using only built-in functions. It's useful in this scenario because it's easy to obtain a list (call this result) of tuples, one per line in the file (where result[0] would be the first tuple, and result[n] would be the nth), and then transpose result (call this resultT) such that resultT[0] would be all the 'left values' and resultT[1] would be the 'right values'.
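A minimal sketch of that transposition on data already parsed into tuples. The values come from the question's sample, and the variable names are illustrative.

```python
# One tuple per line of the file, as literal_eval would produce.
result = [(113.64, 889987.226), (119.64, 440987774.55), (330.43, 446.21)]

# zip(*result) pairs up the first items, then the second items.
resultT = list(zip(*result))

# Convert each column tuple to a list if lists are required.
list1, list2 = (list(column) for column in resultT)

print(list1)  # [113.64, 119.64, 330.43]
print(list2)  # [889987.226, 440987774.55, 446.21]
```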
Let's keep it very simple.
list1 = []
list2 = []
with open ("mid.dat") as myfile:
    for line in myfile:
        x1,x2 = map(float,line.split(','))
        list1.append(x1)
        list2.append(x2)
print(list1)
print(list2)
You could do this with pandas.
import pandas as pd
df = pd.read_csv('data.csv', names=['List 1','List 2'])
read_csv reads any delimited text file, so the .csv extension is not required. Pandas is a very powerful tool for data such as yours.
After doing so you can split your data into two independent columns (each a pandas Series).
list1 = df['List 1']
list2 = df['List 2']
I would stick to a dataframe because data manipulation and analysis is much easier within the pandas framework.
Here is my suggestion to be short and readable, without any additional packages to install:
with open ("mid.dat") as myfile:
    listOfLines = [line.rstrip().split(',') for line in myfile]
list1 = [line[0] for line in listOfLines]
list2 = [line[1] for line in listOfLines]
Note: I used rstrip() to remove the end of line character.
Following is a solution obtained by correcting your own attempt:
with open("test.csv", "r") as myfile:
    datastr = myfile.read().replace("\n",",")
datalist = datastr.split(",")
list1 = []; list2=[]
for i in range(len(datalist)-1): # ignore empty last item of list
    if i%2 ==0:
        list1.append(datalist[i])
    else:
        list2.append(datalist[i])
print(list1)
print(list2)
Output:
['113.64', '119.64', '330.43']
['889987.226', '440987774.55', '446.21']

How do I get my list to print out every line?

I'm a little confused as to why this is not working. I'm trying to get my program to read every line of a csv file, change it from a string to a float, and then print it out line by line.
csv_list = open('example_data.csv','rb')
lists= csv_list.readlines()
csv_list.close()
for lines in lists:
    lists_1 = lists.strip().split()
    list_2 = [float(x) for x in lists_1]
print list_2
Any help would be appreciated.
First, don't use readlines. Simply iterate over the file:
for lines in csv_list:
    ...
Second, use the csv library for reading: http://docs.python.org/2/library/csv.html
In your example the file is CSV, so don't split on whitespace but on a comma or semicolon.
Try this:
import pprint
with open('example_data.csv','rb') as csv_list:
    lists = csv_list.readlines()
list_2 = []
for lines in lists:
    lists_1 = lines.strip().split()
    list_2.append([float(x) for x in lists_1])
pprint.pprint(list_2)
for lines in lists:
    lists_1 = lines.strip().split() # 'lines' here
    list_2 = [float(x) for x in lists_1]
    print list_2 # print your list in a loop
print list_2 needs to be indented to the same level as the rest of the loop and it should be lines.strip().split()
print list_2 is outside of the for loop. You need to indent it.
Judging by the file name, I assume that the fields in your file are separated by commas. If that is the case, you need to split the line using the comma:
lists_1 = lines.strip().split(',')
Better yet, use the csv module. Here is an example:
import csv
with open('example_data.csv', 'rb') as f:
    csvreader = csv.reader(f)
    for line in csvreader:
        line = [float(x) for x in line] # line is now a list of floats
        print line
