Want a deeper understanding of lists - python

Here's my sample code for a programming problem asking to split a string and sort the individual words to avoid duplicates. I know that this code is 100% correct, but I'm not really sure what the purpose of lst = list() line of code is?
How does the program know to put the file romeo in the list?
fname = input("Enter file name: ")
romeo = open(fname)
lst = list()
for line in romeo:
    line = line.rstrip()
    line = line.split()
    for e in line:
        if e not in lst:
            lst.append(e)
lst.sort()
print(lst)

Maybe you are confused with iteration over the file. Iteration allows us to treat the file as a container which can be iterated just like we do for any other container like list or set or dict.items().
Also lst = list() means lst = []. This has got nothing to do with file iteration.
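A small check of both points. This is only a sketch: io.StringIO stands in for the opened file, and the sample text is made up for illustration.

```python
import io

# list() and [] both produce a new, empty list
lst = list()
print(lst == [])

# io.StringIO stands in for open(fname); like a real file object,
# iterating over it yields one line at a time, newline included
fake_file = io.StringIO("But soft what light\nthrough yonder window\n")
lines = [line for line in fake_file]
print(lines)
```

The list is filled only by the explicit append calls in the loop; creating it with list() does not connect it to the file in any way.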

See below for more insights:
# store the user's input in fname as a str
fname = input("Enter file name: ")
# open the file named fname and store the file object in romeo
romeo = open(fname)
# create an empty list through the built-in function list()
lst = list()
# go through all the lines in the file romeo;
# each line is assigned to the variable line in turn
for line in romeo:
    # strip any trailing whitespace (including the newline) from the line
    line = line.rstrip()
    # split the string on whitespace; line is now a list
    # containing every word of the line
    line = line.split()
    # go through all the elements in the list line;
    # each word is assigned to e in turn
    for e in line:
        # if the word is not already contained in the list lst
        if e not in lst:
            # add the word at the last position of lst
            lst.append(e)
# sort the list alphabetically
lst.sort()
print(lst)
Some notes:
You would probably want to add romeo.close() at the end of the script to close the file.
It is important to note that not every word in the file ends up in lst: each distinct word is stored only once, thanks to if e not in lst:
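Both notes can be folded into one sketch. A with block closes the file automatically, and a set (swapped in here for the if e not in lst test) keeps each word only once. The function name unique_sorted_words is hypothetical, and io.StringIO stands in for a real file.

```python
import io

def unique_sorted_words(file_obj):
    """Return the distinct whitespace-separated words in file_obj, sorted."""
    seen = set()
    for line in file_obj:
        # split() with no argument already ignores leading/trailing whitespace
        seen.update(line.split())
    return sorted(seen)

# With a real file this would read: with open(fname) as romeo: ...
with io.StringIO("the sun the moon\nthe sun\n") as romeo:
    words = unique_sorted_words(romeo)
print(words)
```

The result is the same as with the list-based loop; the set only changes how duplicates are detected.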

list is a built-in Python class. Type help(list) in your interpreter to see its documentation.
In many programming languages, calling ClassName() creates an object of that class. For example, in C++:
class MyClass {
    // variable declarations
    // method definitions
};
MyClass myObj = MyClass();
Here myObj is an object of the class MyClass. The same applies to your code: lst is an object of the list class, which is predefined in Python as a built-in data structure.
So your lst definition above initializes lst to an empty list.
The help output shows two constructors for the list class, which are used in different ways. The second form,
list(iterable)
creates a list from an existing sequence. For example,
tuple1 = (1, 'mars')
new_list = list(tuple1)
print(new_list)
creates a new list, new_list, from the tuple, which is an iterable.
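The same constructor accepts any iterable, not just tuples. A few illustrative cases (the variable names are made up for this sketch):

```python
# list(iterable) copies the items of any iterable into a new list
from_tuple = list((1, 'mars'))
from_string = list('abc')            # a string iterates character by character
from_range = list(range(3))
from_dict = list({'a': 1, 'b': 2})   # iterating a dict yields its keys

print(from_tuple, from_string, from_range, from_dict)
```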

The purpose of lst = list() is to create an instance of list called lst.
You could also replace it by
lst = []
It's exactly the same.
The line lst.append(e) is what fills it.

Related

How to add numbers from a file into a list?

I am trying to read a file that has a list of numbers in each line. I want to take only the list of numbers and not the corresponding ID number and put it into a single list to later sort by frequencies in a dictionary.
I've tried to add the numbers into the list and I am able to get just the numbers that I need but I can not get it to add to the list correctly.
I have the function to read the file and to find just the location that I want to read from the line. I then try to add it to the list but it continues to come up like:
['23,43,56,', '67,87,34',]
And I want it to look like this:
[23, 43, 56, 67, 87, 34]
Here is my Code
def frequency():
    f = open('Loto4.txt', "r")
    list = []
    for line in f:
        line.strip('\n')
        start = line.find("[")
        end = line.find("]")
        line = line[start+1:end-1]
        list.append(line)
        print(line)
    print(list)
frequency()
This is the file that I am reading:
1:[36,37,38,9]
2:[3,5,28,25]
3:[10,14,15,9]
4:[23,9,31,41]
5:[5,2,21,9]
Try using a list comprehension on the line with append (I changed it to extend). Also, please do not name variables after Python built-ins; list is one, so I renamed it to l. I also assigned the result of strip() back to line and changed the slice end from end-1 to end, since a slice already excludes its end index. See also @MichaelButscher's comment:
def frequency():
    f = open('Loto4.txt', "r")
    l = []
    for line in f:
        line = line.strip('\n')
        start = line.find("[")
        end = line.find("]")
        line = line[start + 1:end]
        l.extend([int(i) for i in line.split(',')])
        print(line)
    print(l)
frequency()
The literal_eval method of ast module can be used in this case.
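from ast import literal_eval
def frequency():
    result_list = list()
    with open('Loto4.txt') as f:
        for line in f:
            # each line looks like 1:[36,37,38,9]; keep only the part after the ID
            _, _, numbers = line.partition(':')
            if numbers.strip():
                result_list.extend(literal_eval(numbers.strip()))
    print(result_list)
    return result_list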
The literal_eval method of ast (abstract syntax tree) module is used to safely evaluate an expression node or a Unicode or Latin-1 encoded string containing a Python literal or container display. The string or node provided may only consist of the following Python literal structures: strings, numbers, tuples, lists, dicts, booleans, and None.
This can be used for safely evaluating strings containing Python values from untrusted sources without the need to parse the values oneself. It is not capable of evaluating arbitrarily complex expressions, for example involving operators or indexing.
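A short illustration of that behaviour. The input strings are made up for this sketch: a list literal is accepted, while a function call is rejected.

```python
from ast import literal_eval

# A plain literal is parsed into the corresponding Python object
numbers = literal_eval("[36, 37, 38, 9]")

# Anything beyond a literal, such as a function call, is rejected
try:
    literal_eval("__import__('os').getcwd()")
    rejected = False
except ValueError:
    rejected = True

print(numbers, rejected)
```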
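def frequency():
    f = open('Loto4.txt', "r")
    retval = []
    for line in f:
        # strip() returns a new string, so the result must be assigned
        line = line.strip('\n')
        start = line.find("[")
        end = line.find("]")
        # a slice excludes its end index, so end (not end-1) keeps the last digit
        line = line[start+1:end]
        retval.extend([int(x) for x in line.split(',')])
        print(line)
    print(retval)
frequency()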
I changed the name of the list to retval - since list is a builtin class.

Replacing characters in a string loop

I have a template txt file. This txt file is to be written as 10 new files but each with some characters changed according to a list of arbitrary values:
with open('template.txt') as template_file:
    template = template_file.readlines()
    for i in range(10):
        with open('output_%s.txt' % i, 'w') as new_file:
            new_file.writelines(template_file)
The length of the list is the same as the number of new files (10).
I am trying to replace part of the 2nd line of each new file with the value in my list.
So for example, I want line 2, positions [5:16] in each new file replaced with the respective value in the list..
File 0 will have element 0 of the list
File 1 will have element 1 of the list
etc..
I tried using the replace() method:
list = [element0, element1, etc...element9]
for i in template_file:
    i.replace(template_file[2][5:16], list_element)
But it only ever puts the first list element into every file; it won't loop over the list.
Any help appreciated.
There are a couple of problems I can find which prevent your code from working:
You should write template out, which is a list of lines, not template_file, which is a file object
In Python, strings are immutable, meaning they cannot be changed. The replace function does not change the string; it returns a new copy of the string. Furthermore, replace will replace a substring with new text, regardless of where that substring is. If you want to replace at a specific index, I suggest slicing the string yourself. For example:
line2 = '0123456789ABCDEFG'
element = '-ho-ho-ho-'
line2 = line2[:5] + element + line2[16:]
# line2 now is '01234-ho-ho-ho-G'
Please do not use list as a variable name. It is a type, which can be used to construct a new list as such:
empty = list() # ==> []
letters = list('abc') # ==> ['a', 'b', 'c']
The expression template_file[2][5:16] is incorrect: first, it should be template, not template_file. Second, the second line is template[1], not template[2], since Python lists are zero-based
The list_element variable is not declared in your code
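To make the immutability and slicing points concrete, here is a minimal sketch. The helper name replace_span is hypothetical, not something from your code.

```python
def replace_span(text, start, end, new):
    """Return a copy of text with text[start:end] replaced by new."""
    return text[:start] + new + text[end:]

original = '0123456789ABCDEFG'

# str.replace returns a new string; the original is left untouched
replaced = original.replace('01234', 'xxxxx')

# Slicing targets positions rather than content
spliced = replace_span(original, 5, 16, '-ho-ho-ho-')
print(original, replaced, spliced)
```

Because neither call modifies original, the result has to be assigned to a name to be of any use.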
Solution 1
That being said, I find that it is easier to structure your template file as a real template with placeholders. I'll talk about that later. If you still insist on replacing index 5-16 of line 2 with something, here is a solution I tested and it works:
with open('template.txt') as template_file:
    template = template_file.readlines()
elements = ['ABC', 'DEF', 'GHI', 'JKL']
for i, element in enumerate(elements):
    with open('output_%02d.txt' % i, 'w') as out_file:
        line2 = template[1]
        line2 = line2[:5] + element + line2[16:]
        for line_number, line in enumerate(template, 1):
            if line_number == 2:
                line = line2
            out_file.write(line)
Notes
The code writes out all lines, applying the special replacement only to line 2
The code is clunky and deeply nested
I don't like having to hard-code the index numbers (5, 16), because if the template changes, I have to change the code as well
Solution 2
If you have control of the template file, I suggest using the string.Template class to make search and replace easier. Since I don't know what your template file looks like, I am going to make up my own template file:
line #1
This is my ${token} to be replaced
line #3
line #4
Note that I intend to replace ${token} with one of the elements in the code. Now on to the code:
import string
with open('template.txt') as template_file:
    template = string.Template(template_file.read())
elements = ['ABC', 'DEF', 'GHI', 'JKL']
for i, element in enumerate(elements):
    with open('output_%02d.txt' % i, 'w') as out_file:
        out_file.write(template.substitute(token=element))
Notes
I read the whole file in at once with template_file.read(). This could be a problem if the template file is large, but the previous solution also ran into the same performance issue as this one
I use the string.Template class to make search/replace easier
Search and replace is done by substitute(token=element), which says: replace all the $token or ${token} instances in the template with element.
The code is much cleaner and, dare I say, easier to read.
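For a quick feel of how string.Template behaves without touching any files, here is a small in-memory sketch. The template text is made up, and safe_substitute is shown only as an aside.

```python
import string

# Both $token and ${token} refer to the same placeholder
template = string.Template("line #1\nThis is my ${token}, again: $token\n")

filled = template.substitute(token='ABC')

# substitute() raises KeyError when a placeholder has no value;
# safe_substitute() leaves the placeholder in place instead
partial = template.safe_substitute()
print(filled)
print(partial)
```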
Solution 3
If the template file is too large to fit in memory at once, you can modify the first solution to read it line-by-line instead of reading all lines in at once. I am not going to present that solution here, just a suggestion.
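One possible shape for that line-by-line variant, as a rough sketch rather than a tested drop-in: the function name write_outputs and the out_dir parameter are hypothetical, and the demo builds a throwaway template in a temporary directory.

```python
import os
import tempfile

def write_outputs(template_path, elements, out_dir):
    """Write one output file per element, streaming the template line by line."""
    paths = []
    for i, element in enumerate(elements):
        out_path = os.path.join(out_dir, 'output_%02d.txt' % i)
        # Re-open the template for each output so only one line is held at a time
        with open(template_path) as template_file, open(out_path, 'w') as out_file:
            for line_number, line in enumerate(template_file, 1):
                if line_number == 2:
                    line = line[:5] + element + line[16:]
                out_file.write(line)
        paths.append(out_path)
    return paths

# Demo with a throwaway template in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    template_path = os.path.join(tmp, 'template.txt')
    with open(template_path, 'w') as f:
        f.write('line #1\n0123456789ABCDEFG\nline #3\n')
    paths = write_outputs(template_path, ['ABC', 'DEF'], tmp)
    with open(paths[1]) as f:
        second_output = f.read()

print(second_output)
```

The trade-off is that the template is read once per output file instead of once overall.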
Looks like you need
list = [element0, element1, etc...element9]
for i in list:
    template_file = template_file.replace(template_file[2][5:16], i)

how to turn string from csv file into list in python

I have a CSV file that contains matrix:
1,9,5,78
4.9,0,24,7
6,2,3,8
10,21.4,8,7
I want to create a function that returns list of lists:
[[1.0,9.0,5.0,78.0],[4.9,0.0,24.0,7.0],[6.0,2.0,3.0,8.0],[10.0,21.4,8.0,7.0]]
this is my attempt:
fileaname=".csv"
def get_csv_matrix(fileaname):
    mat=open(fileaname,'r')
    mat_list=[]
    for line in mat:
        line=line.strip()
        mat_line=[line]
        mat_list.append(mat_line)
    return mat_list
but I get list of lists with one string:
[['1,9,5,78'], ['4.9,0,24,7'], ['6,2,3,8'], ['10,21.4,8,7']]
how can i turn the lists of strings to lists of floats?
mat_line = [line]
This line just takes the line as a single string and makes it into a one element list. If you want to separate it by commas, instead do:
mat_line = line.split(',')
If you want to also turn them into numbers, you'll have to do:
mat_line = [float(i) for i in line.split(',')]
I find it easier to read a list comprehension than a for loop.
def get_csv_matrix(filename):
    with open(filename) as input_file:
        return [[float(i) for i in line.split(',')] for line in input_file]
print(get_csv_matrix("data.csv"))
The above function opens a file (I use with to avoid leaking open file descriptors), iterates over the lines, splits each line, and converts each item into a floating-point number.
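As an alternative that is not part of the answer above, the standard-library csv module can do the splitting. This is a sketch under assumptions: the function name read_matrix is hypothetical, and io.StringIO stands in for the opened file.

```python
import csv
import io

def read_matrix(file_obj):
    """Parse comma-separated rows of numbers into a list of lists of floats."""
    reader = csv.reader(file_obj)
    # skip rows that are completely empty, e.g. a trailing blank line
    return [[float(cell) for cell in row] for row in reader if row]

sample = io.StringIO("1,9,5,78\n4.9,0,24,7\n")
matrix = read_matrix(sample)
print(matrix)
```

With a real file, the call would sit inside a with open(filename, newline='') as input_file: block.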
Try
fileaname=".csv"
def get_csv_matrix(fileaname):
    mat=open(fileaname,'r')
    mat_list=[]
    for line in mat:
        line=line.strip()
        mat_line=line.split(",")
        # convert each field in place; enumerate gives the position directly,
        # whereas index() would find the wrong one when a value repeats
        for i_position, i in enumerate(mat_line):
            mat_line[i_position] = float(i)
        mat_list.append(mat_line)
    mat.close()
    return mat_list
If any item in mat_line can't be parsed as a number, float() raises a ValueError, so I suggest you create a validation method to be absolutely sure that each item is numeric.
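Such a validation step could look roughly like this. The helper name parse_row and the choice to return None for a bad row are assumptions made for the sketch, not the only reasonable design.

```python
def parse_row(line):
    """Convert one comma-separated line to floats, or return None if a field is not numeric."""
    try:
        return [float(field) for field in line.strip().split(',')]
    except ValueError:
        return None

good = parse_row("10,21.4,8,7\n")
bad = parse_row("10,abc,8,7\n")
print(good, bad)
```

The caller can then decide whether to skip a row that came back as None or to stop with an error message.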

Reading a file into a list on python. How to take out words

I am reading a file into a list and splitting it so that every word is an item in the list. However, I do not want specific words to end up in the list; I would like to skip them. The list of words to skip is called filterList in the code below.
this is my code:
with open('USConstitution.txt') as f:
    lines = f.read().split() #read everything into the list
    filterList = ["a","an","the","as","if","and","not"] #define a filterList
    for word in lines:
        if word.lower() not in filterList:
            word.append(aList) #place them in a new list called aList that does not contain anything in filterList
    print(aList) #print that new list
I am getting this error:
AttributeError: 'str' object has no attribute 'append'
Can someone help ? thanks
You need to write
aList.append(word)
append is a method of list objects; strings do not have it, which is what the error says. You also need to create the list before you can append items to it.
That is:
with open('USConstitution.txt') as f:
    lines = f.read().split() #read everything into the list
    filterList = ["a","an","the","as","if","and","not"] #define a filterList
    aList = []
    for word in lines:
        if word.lower() not in filterList:
            aList.append(word) #place them in a new list called aList that does not contain anything in filterList
    print(aList)
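The same filtering can also be written as a list comprehension. In this sketch the function name remove_stop_words is hypothetical, the sample sentence is made up, and a set is swapped in for the filter list so that each membership test stays fast.

```python
def remove_stop_words(words, stop_words):
    """Return the words whose lowercase form is not in stop_words."""
    # a set makes each membership test fast regardless of how many stop words there are
    stop_set = set(stop_words)
    return [word for word in words if word.lower() not in stop_set]

filterList = ["a", "an", "the", "as", "if", "and", "not"]
sample = "We the People of the United States and".split()
aList = remove_stop_words(sample, filterList)
print(aList)
```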

Create a List that contain each Line of a File [duplicate]

I'm trying to open a file and create a list with each line read from the file.
i=0
List=[""]
for Line in inFile:
    List[i]=Line.split(",")
    i+=1
print(List)
But this sample code gives me an error because of the i+=1 saying that index is out of range.
What's my problem here? How can I write the code in order to increment my list with every new Line in the InFile?
It's a lot easier than that:
List = open("filename.txt").readlines()
This returns a list of each line in the file.
I did it this way
lines_list = open('file.txt').read().splitlines()
Every line comes with its end-of-line characters (\n or \r\n); this way the characters are removed.
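The difference between the two approaches is easy to see on an in-memory example (io.StringIO stands in for the opened file, and the text is made up):

```python
import io

text = "first\nsecond\nthird\n"

# readlines() keeps the trailing newline on every line
with_newlines = io.StringIO(text).readlines()

# read().splitlines() drops the line endings
without_newlines = io.StringIO(text).read().splitlines()

print(with_newlines)
print(without_newlines)
```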
my_list = [line.split(',') for line in open("filename.txt")]
Please read PEP8. You're swaying pretty far from python conventions.
If you want a list of lists of each line split by comma, I'd do this:
l = []
for line in in_file:
    l.append(line.split(','))
You'll get a newline on each record. If you don't want that:
l = []
for line in in_file:
    l.append(line.rstrip().split(','))
A file is almost a list of lines. You can trivially use it in a for loop.
myFile= open( "SomeFile.txt", "r" )
for x in myFile:
    print(x)
myFile.close()
Or, if you want an actual list of lines, simply create a list from the file.
myFile= open( "SomeFile.txt", "r" )
myLines = list( myFile )
myFile.close()
print(len(myLines), myLines)
You can't assign to someList[i] to put a new item at the end of a list; that index does not exist yet. You must use someList.append(item).
Also, never start a simple variable name with an uppercase letter. List confuses folks who know Python.
Also, never use a built-in name as a variable. list is an existing data type, and using it as a variable confuses folks who know Python.
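A minimal reproduction of the error and the fix, with made-up sample lines in place of the file:

```python
rows = [""]          # one element, so only index 0 exists

try:
    rows[1] = "x"    # assigning past the end does not grow the list
    grew = True
except IndexError:
    grew = False

# append() is what adds a new element at the end
rows = []
for line in ["a,b\n", "c,d\n"]:
    rows.append(line.rstrip().split(","))
print(grew, rows)
```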
f.readlines() returns a list that contains each line as an item in the list.
If you want each line to be split(","), you can use a list comprehension:
[line.split(",") for line in file]
Assuming you also want to strip whitespace at the beginning and end of each line, you can map the string strip function over the list returned by readlines (in Python 3, wrap the call in list() because map returns an iterator):
list(map(str.strip, open('filename').readlines()))
... Also, if you want to get rid of \n
In case the items in your list end with \n and you want to get rid of them:
with open('your_file.txt') as f:
    lines = f.read().splitlines()
I am not sure about Python but most languages have push/append function for arrays.
