I am trying to read a file that has a list of numbers in each line. I want to take only the list of numbers and not the corresponding ID number and put it into a single list to later sort by frequencies in a dictionary.
I've tried to add the numbers into the list and I am able to get just the numbers that I need but I can not get it to add to the list correctly.
I have the function to read the file and to find just the location that I want to read from the line. I then try to add it to the list but it continues to come up like:
['23,43,56,', '67,87,34',]
And I want it to look like this:
[23, 43, 56, 67, 87, 34]
Here is my Code
def frequency():
    f = open('Loto4.txt', "r")
    list = []
    for line in f:
        line.strip('\n')
        start = line.find("[")
        end = line.find("]")
        line = line[start+1:end-1]
        list.append(line)
        print(line)
    print(list)
frequency()
This is the file that I am reading:
1:[36,37,38,9]
2:[3,5,28,25]
3:[10,14,15,9]
4:[23,9,31,41]
5:[5,2,21,9]
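Split each extracted line on commas and convert the pieces to int with a list comprehension, then use extend instead of append so the numbers are added one by one. The slice also has to stop at end rather than end-1, and the result of strip has to be assigned back to line. Finally, please do not name variables after a Python builtin such as list; I renamed it to l here. See also @MichaelButscher's comment: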
def frequency():
    f = open('Loto4.txt', "r")
    l = []
    for line in f:
        line = line.strip('\n')
        start = line.find("[")
        end = line.find("]")
        line = line[start + 1:end]
        l.extend([int(i) for i in line.split(',')])
        print(line)
    print(l)
frequency()
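For the later step the question mentions (sorting by frequency), one possible sketch uses collections.Counter from the standard library. The numbers below are copied from the question's sample file, and count_frequencies is a hypothetical helper name, not part of the answer above.

```python
from collections import Counter

def count_frequencies(numbers):
    """Return (number, count) pairs, most frequent first."""
    return Counter(numbers).most_common()

# Numbers flattened from the five sample lines in the question.
numbers = [36, 37, 38, 9, 3, 5, 28, 25, 10, 14, 15, 9,
           23, 9, 31, 41, 5, 2, 21, 9]

frequencies = count_frequencies(numbers)
print(frequencies[:2])  # the two most frequent numbers with their counts
```

A Counter is a dict subclass, so it also covers the "frequencies in a dictionary" part if the counts are needed by key.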
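The literal_eval function of the ast module can be used in this case, once the ID prefix before the colon has been removed.
from ast import literal_eval
def frequency():
    result_list = list()
    with open('Loto4.txt') as f:
        for line in f:
            # drop the "ID:" prefix, then parse the bracketed list
            result_list.extend(literal_eval(line.split(':', 1)[1].strip()))
    print(result_list)
    return result_list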
The literal_eval method of ast (abstract syntax tree) module is used to safely evaluate an expression node or a Unicode or Latin-1 encoded string containing a Python literal or container display. The string or node provided may only consist of the following Python literal structures: strings, numbers, tuples, lists, dicts, booleans, and None.
This can be used for safely evaluating strings containing Python values from untrusted sources without the need to parse the values oneself. It is not capable of evaluating arbitrarily complex expressions, for example involving operators or indexing.
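As a small illustration of that restriction, the sketch below feeds literal_eval a list literal, a function call, and one raw line from the question's file. try_literal is a hypothetical helper introduced only for this example.

```python
from ast import literal_eval

def try_literal(text):
    """Return the parsed literal, or None if text is not a plain literal."""
    try:
        return literal_eval(text)
    except (ValueError, SyntaxError):
        return None

print(try_literal("[36, 37, 38, 9]"))   # a list literal is accepted
print(try_literal("len([1, 2, 3])"))    # a function call is rejected
print(try_literal("1:[36,37,38,9]"))    # the "ID:" prefix is not valid syntax
```

The last case is why the ID prefix has to be stripped before a line from the file is passed to literal_eval.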
def frequency():
    f = open('Loto4.txt', "r")
    retval = []
    for line in f:
        line = line.strip('\n')
        start = line.find("[")
        end = line.find("]")
        # the slice end is exclusive, so stopping at end keeps the last digit
        line = line[start+1:end]
        retval.extend([int(x) for x in line.split(',')])
        print(line)
    print(retval)
frequency()
I changed the name of the list to retval - since list is a builtin class.
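A short sketch of why the rename matters: once a variable is called list, the builtin is hidden in that scope and can no longer be called. shadowing_demo is a hypothetical function used only to keep the shadowing local.

```python
def shadowing_demo():
    list = []              # this local name hides the builtin inside the function
    try:
        list("abc")        # tries to call the empty list object, not the builtin
    except TypeError as exc:
        return str(exc)
    return None

message = shadowing_demo()
print(message)
print(list("abc"))         # outside the function the builtin is untouched
```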
Here's my sample code for a programming problem that asks me to split each line into words and build a sorted list of the words without duplicates. I know the code is correct, but I'm not sure what the purpose of the lst = list() line is.
How does the program know to put the contents of the file romeo into the list?
fname = input("Enter file name: ")
romeo = open(fname)
lst = list()
for line in romeo:
    line = line.rstrip()
    line = line.split()
    for e in line:
        if e not in lst:
            lst.append(e)
lst.sort()
print(lst)
Maybe you are confused about iteration over the file. A file object can be iterated line by line, just like any other container such as a list, a set, or dict.items().
Also lst = list() means lst = []. This has got nothing to do with file iteration.
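Both points can be checked in a few lines. In this sketch io.StringIO stands in for the opened file (an assumption made so the example runs without a file on disk); a real file object behaves the same way in a for loop.

```python
import io

# io.StringIO stands in for open(fname) here.
fake_file = io.StringIO("But soft what light\nthrough yonder window breaks\n")

lines = []
for line in fake_file:          # a file object yields one line per iteration
    lines.append(line.rstrip())

print(lines)
print(list() == [])             # both spellings build an empty list
```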
See below for more insights:
# the following line stores your input in fname as a str
fname = input("Enter file name: ")
# the following line opens the file named fname and stores the file object in romeo
romeo = open(fname)
# next line creates an empty list through the built-in function list()
lst = list()
# now you go through all the lines in the file romeo
# each line is assigned to the variable line in turn
for line in romeo:
    # strip any whitespace from the end of the string
    line = line.rstrip()
    # split the string on whitespace and store the pieces
    # in the list line, which will then contain every
    # word of the line.
    line = line.split()
    # now you go through all the elements in the list line
    # each word is assigned to e in turn
    for e in line:
        # if the word is not already contained in the list lst
        if e not in lst:
            # append the word at the end of the list lst.
            lst.append(e)
# sort the list alphabetically
lst.sort()
print(lst)
Some notes:
You would probably want to add romeo.close() at the end of the script to close the file.
Note that lst will not hold the whole file: each word is stored there only once, thanks to if e not in lst:
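Both notes can be combined in one sketch: a with block closes the file automatically, and a set keeps each word only once. unique_sorted_words is a hypothetical helper, and io.StringIO stands in for the real file so the example runs on its own.

```python
import io

def unique_sorted_words(lines):
    """Collect each distinct word once and return the words sorted."""
    seen = set()
    for line in lines:
        seen.update(line.split())
    return sorted(seen)

# With a real file this would be `with open(fname) as romeo:`;
# the with block closes the file when it ends.
with io.StringIO("the sun is the east\nand the sun\n") as romeo:
    words = unique_sorted_words(romeo)

print(words)
```

A set lookup is also faster than `e not in lst` on a long list, although for a short text file the difference is unlikely to matter.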
list is a built-in Python class. Type help(list) in your interpreter to see its documentation.
In many programming languages, calling ClassName() creates an object of that class. For example, in C++:
class MyClass {
    // member variable declarations
    // method definitions
};
MyClass myObj = MyClass();
Here myObj is an object of the class MyClass. The same applies to your code: lst is an object of the list class, which is predefined in Python as a built-in data structure.
So the lst definition above initializes lst to an empty list.
The help output shows two constructor forms for the list class, which are used in different ways. The second form,
list(iterable)
creates a list from an existing iterable. For example,
tuple1=(1,'mars')
new_list=list(tuple1)
print(new_list)
creates the new list new_list from the tuple, which is an iterable.
The purpose of lst = list() is to create an instance of list called lst.
You could also replace it by
lst = []
It's exactly the same.
The line lst.append(e) is what fills it, one word at a time.
I have a CSV file that contains matrix:
1,9,5,78
4.9,0,24,7
6,2,3,8
10,21.4,8,7
I want to create a function that returns list of lists:
[[1.0,9.0,5.0,78.0],[4.9,0.0,24.0,7.0],[6.0,2.0,3.0,8.0],[10.0,21.4,8.0,7.0]]
this is my attempt:
fileaname=".csv"
def get_csv_matrix(fileaname):
    mat=open(fileaname,'r')
    mat_list=[]
    for line in mat:
        line=line.strip()
        mat_line=[line]
        mat_list.append(mat_line)
    return mat_list
but I get a list of lists that each hold a single string:
[['1,9,5,78'], ['4.9,0,24,7'], ['6,2,3,8'], ['10,21.4,8,7']]
How can I turn the lists of strings into lists of floats?
mat_line = [line]
This line just takes the line as a single string and makes it into a one element list. If you want to separate it by commas, instead do:
mat_line = line.split(',')
If you want to also turn them into numbers, you'll have to do:
mat_line = [float(i) for i in line.split(',')]
I find it easier to read a list comprehension than a for loop.
def get_csv_matrix(filename):
    with open(filename) as input_file:
        return [[float(i) for i in line.split(',')] for line in input_file]
print(get_csv_matrix("data.csv"))
The above function opens a file (I use with to avoid leaking open file descriptors), iterates over the lines, splits each line, and converts each item into a floating-point number.
Try
fileaname=".csv"
def get_csv_matrix(fileaname):
    mat=open(fileaname,'r')
    mat_list=[]
    for line in mat:
        line=line.strip()
        mat_line=line.split(",")
        # replace each string in the list with its float value
        for i_position, i in enumerate(mat_line):
            mat_line[i_position] = float(i)
        mat_list.append(mat_line)
    return mat_list
If any item in mat_line isn't a valid number, float() will raise a ValueError, so I suggest you add a validation step to make sure every item is numeric.
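One way such a validation step might look is sketched below. is_number and parse_row are hypothetical names, and the choice to raise an error that lists the bad fields (rather than skipping them) is an assumption about what the caller wants.

```python
def is_number(text):
    """Return True if float() can parse text."""
    try:
        float(text)
    except ValueError:
        return False
    return True

def parse_row(line):
    """Convert one comma-separated line to a list of floats."""
    fields = line.strip().split(",")
    bad = [field for field in fields if not is_number(field)]
    if bad:
        # report every offending field at once instead of failing on the first
        raise ValueError("non-numeric fields: %r" % bad)
    return [float(field) for field in fields]

print(parse_row("4.9,0,24,7\n"))
```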
In Python, I'm reading a large file with many many lines. Each line contains a number and then a string such as:
[37273738] Hello world!
[83847273747] Hey my name is James!
And so on...
After I read the txt file and put it into a list, I was wondering how I would be able to extract the number and then sort that whole line of code based on the number?
file = open("info.txt","r")
myList = []
for line in file:
    line = line.split()
    myList.append(line)
What I would like to do:
since the number in message one falls between 37273700 and 38000000, I'll sort that (along with all other lines that follow that rule) into a separate list
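This handles the sorting part. Each entry of myList is a list of words whose first item looks like "[37273738]", so the slice [1:-1] drops the two brackets before converting to int:
my_sorted_list = sorted(myList, key=lambda line: int(line[0][1:-1]))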
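Use a tuple whose first element is the number:
for line in file:
    line = line.split()
    # convert the bracketed number to int so the sort is numeric, not alphabetical
    keyval = (int(line[0].replace('[','').replace(']','')), line[1:])
    print(keyval)
    myList.append(keyval)
Then sort:
my_sorted_list = sorted(myList, key=lambda line: line[0])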
How about:
import re
# ---
# Function which gets a number from a line like so:
# - searches for the pattern: start_of_line, [, sequence of digits, ]
# - if that's not found (e.g. empty line) return 0
# - if it is found, try to convert it to a number type
# - return the number, or 0 if that conversion fails
def extract_number(line):
    search_result = re.findall(r'^\[(\d+)\]', line)
    if not search_result:
        num = 0
    else:
        try:
            num = int(search_result[0])
        except ValueError:
            num = 0
    return num
# ---
# Read all the lines into a list
with open("info.txt") as f:
    lines = f.readlines()
# Sort them using the number function above, and print them
lines = sorted(lines, key=extract_number)
print(''.join(lines))
It's more resilient when some lines have no number, and the pattern is easy to adjust if the numbers might appear in different places (e.g. after spaces at the start of the line).
(Obligatory suggestion not to use file as a variable name, because in Python 2 it is already the name of a builtin, and that's confusing.)
Now that there's an extract_number() function, it's easier to filter:
lines2 = [L for L in lines if 37273700 < extract_number(L) < 38000000]
print(''.join(lines2))
I'm a complete beginner in Python.
I have a file with lists of coordinates. It looks like this:
[-122.661927,45.551161], [-98.51377733,29.655474], [-84.38042879, 33.83919567].
I'm trying to put this into a list with:
with open('file.txt', 'r') as f:
    for line in f:
        list.append(line)
The result I get is
['[-122.661927,45.551161], [-98.51377733,29.655474], [-84.38042879, 33.83919567]']
Could somebody help me get rid of the "'" marks at the beginning and the end of the list?
Try using ast.literal_eval.
Example -
import ast
lst = []
with open('file.txt', 'r') as f:
    for line in f:
        # strip the newline and a trailing '.', if the file has one, before parsing
        lst.extend(ast.literal_eval(line.strip().rstrip('.')))
From documentation -
ast.literal_eval(node_or_string)
Safely evaluate an expression node or a Unicode or Latin-1 encoded string containing a Python literal or container display. The string or node provided may only consist of the following Python literal structures: strings, numbers, tuples, lists, dicts, booleans, and None.
Also, please note that it's bad practice to use list as a variable name, as it shadows the built-in list type.
Use ast.literal_eval to convert the string into list objects. You can also use a list comprehension to loop over the file object, which is usually faster than an explicit for loop and directly returns a list (with one parsed entry per line):
import ast
with open('file.txt', 'r') as f:
    # strip the newline and a trailing '.', if the file has one, before parsing
    my_list=[ast.literal_eval(line.strip().rstrip('.')) for line in f]
answer = []
with open('file.txt') as infile:
    for line in infile:
        line = line.strip().rstrip('.').replace('[', ' ').replace(']', ' ').replace(',', ' ')
        # passing the same iterator to zip twice yields consecutive pairs
        parts = iter(map(float, line.split()))
        answer.extend(zip(parts, parts))
Output:
In [83]: answer
Out[83]:
[(-122.661927, 45.551161),
(-98.51377733, 29.655474),
(-84.38042879, 33.83919567)]
I have a one line txt file, file1.txt, that has a series of 10 numbers as such;
10,45,69,85,21,32,11,71,20,30
I want to take these numbers from the txt file and then add them to a list and then sort the numbers in ascending order.
I have tried
myfile1 = open('file1.txt', 'r').readlines()
but this seems to give me a list of length 1, which obviously can't be sorted.
In [101]: myfile1
Out[101]: ['10,45,69,85,21,32,11,71,20,30']
I'm guessing there is something wrong with how I am reading the text file however I can't seem to find a suitable way.
.readlines() does what it says: it reads the file in line by line. In your example, there is only one line, so the length is 1.
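To make that concrete, the sketch below calls readlines() on a one-line and a three-line input. io.StringIO stands in for the files here, which is an assumption made so the example runs without file1.txt on disk.

```python
import io

# Stand-ins for a one-line file and a three-line file.
one_line = io.StringIO("10,45,69\n")
three_lines = io.StringIO("10\n45\n69\n")

first = one_line.readlines()
second = three_lines.readlines()

print(first)    # a single string holding the whole comma-separated line
print(second)   # one string per line, each still ending in a newline
```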
With that one line, you need to split on commas:
with open('file1.txt','r') as myfile:
    for line in myfile:
        print(sorted(map(int, line.split(','))))
Or, if you have multiple lines with lots of numbers:
data = []
with open('file1.txt','r') as myfile:
    for line in myfile:
        data.extend(map(int, line.split(',')))
print(sorted(data))
Here I use the with keyword to open the file, which can then be iterated over line by line. Next, I call the split method of strings on each line, which returns a list of strings. Then I use map to convert these strings into integers by applying int to each item in the list. The resulting numbers can then be sorted. Make sure to take a look at the string methods page in the Python documentation.
A test without the input file:
numbers = "10,45,69,85,21,7,32,11,71,20,30"
data = []
data.extend(map(int, numbers.split(',')))
print(sorted(data))
prints
[7, 10, 11, 20, 21, 30, 32, 45, 69, 71, 85]
A little obfuscated to do as a 1-liner, but basically:
with open('file1.txt', 'r') as f:
    data = sorted(map(int, f.readline().split(',')))
What this does:
Read 1 line: f.readline()
Split that line on ',' characters: .split(',')
Map the list of string to int values: map(int, list)
Sort the list of int: sorted(list)