I have a sample file called 'scores.txt' which holds the following values:
10,0,6,3,7,4
I want to be able to somehow take each value from the line, and append it to a list so that it becomes sampleList = [10,0,6,3,7,4].
I have tried doing this with the code below:
score_list = []
opener = open('scores.txt','r')
for i in opener:
    score_list.append(i)
print (score_list)
which partially works, but for some reason, it doesn't do it properly. It just sticks all the values into one index instead of separate indexes. How can I make it so all the values get put into their own separate index?
You have CSV data (comma separated). Easiest is to use the csv module:
import csv
all_values = []
with open('scores.txt', newline='') as infile:
    reader = csv.reader(infile)
    for row in reader:
        all_values.extend(row)
Otherwise, split the values. Each line you read is a string with the ',' character between the digits:
all_values = []
with open('scores.txt', newline='') as infile:
    for line in infile:
        all_values.extend(line.strip().split(','))
Either way, all_values ends up as a list of strings. If all your values consist only of digits, you can convert them to integers as you go:
all_values.extend(map(int, row))
or
all_values.extend(map(int, line.strip().split(',')))
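Putting the pieces together, here is a minimal self-contained sketch of the csv approach with the integer conversion applied. It writes the sample line to a temporary file first so that it can be run anywhere; the temporary path and the helper name read_scores are assumptions made for this illustration and are not part of the question.

```python
import csv
import os
import tempfile


def read_scores(path):
    """Return every comma-separated value in the file as one flat list of ints."""
    values = []
    with open(path, newline='') as infile:
        for row in csv.reader(infile):
            # Each row is a list of strings such as ['10', '0', '6'].
            values.extend(map(int, row))
    return values


# Recreate the sample file from the question in a temporary directory.
with tempfile.TemporaryDirectory() as tmpdir:
    sample_path = os.path.join(tmpdir, 'scores.txt')
    with open(sample_path, 'w') as f:
        f.write('10,0,6,3,7,4\n')
    score_list = read_scores(sample_path)

print(score_list)  # [10, 0, 6, 3, 7, 4]
```

Note that int() raises ValueError on anything that is not a whole number, so this only suits files that really contain nothing but integers.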
Here is an efficient way to do it without any external package:
with open('tmp.txt','r') as f:
    score_list = f.readline().rstrip().split(",")
# Convert to list of int
score_list = [int(v) for v in score_list]
print(score_list)
Just use split on comma on each line and add the returned list to your score_list, like below:
opener = open('scores.txt','r')
score_list = []
for line in opener:
    score_list.extend(map(int,line.rstrip().split(',')))
print( score_list )
If I have a text file containing the following numbers:
5.078780 5.078993
7.633073 7.633180
2.919274 2.919369
3.410284 3.410314
How can I read it and store it in an array, so that it becomes:
[[5.078780,5.078993],[7.633073,7.633180],[2.919274,2.919369],[3.410284,3.410314]]
with open('test.txt', 'r') as file:
    output = [ line.strip().split(' ') for line in file.readlines()]
# Cast strings to floats
output = [[float(j) for j in i] for i in output]
print(output)
should give the desired output:
[[5.07878, 5.078993], [7.633073, 7.63318], [2.919274, 2.919369], [3.410284, 3.410314]]
Approach:
Have a result list = []
Split the text by newlines \n.
Now in a for-loop:
    split each line by a space char and assign to a tuple
    append tuple to the result list
I'm refraining from writing code here to let you work it out.
This should do it:
with open ("data.txt", "r") as myfile:
    data = myfile.readlines()
for i in range(len(data)):
    data[i] = data[i].split()
You first want to read the file content into a list of strings (each string is one line of the file):
with open("myfile.txt", 'r') as f:
    file_content = f.readlines()
Refer to open doc for more: https://docs.python.org/3/library/functions.html#open
Then you want to create a list:
content_list = []
And then you want to fill it: use a for loop to split each string at the space (using the split() function), which gives a list with the two values, and append that list to content_list:
for line in file_content:
    values = line.split(' ') # split the line at the space
    content_list.append(values)
By the way, this can be simplified with a List Comprehension:
content_list = [s.split(' ') for s in file_content]
This should work:
with open('filepath') as f:
    array = [line.split() for line in f.readlines()]
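The answers above use both split(' ') and split() with no argument, and the two are not interchangeable. Here is a small sketch of the difference, using a sample line from the question with the trailing newline that a line read from a file would still carry:

```python
line = '5.078780 5.078993\n'

# With an explicit separator the newline stays attached to the last field.
by_space = line.split(' ')

# With no argument, split() separates on any run of whitespace and
# discards leading and trailing whitespace, including the newline.
by_whitespace = line.split()

print(by_space)       # ['5.078780', '5.078993\n']
print(by_whitespace)  # ['5.078780', '5.078993']

# float() tolerates surrounding whitespace, so both convert cleanly,
# but only split() copes with lines that contain repeated spaces.
print([float(v) for v in by_space])  # [5.07878, 5.078993]
print('1.0  2.0'.split(' '))         # ['1.0', '', '2.0']
```

If the values stay as strings, the no-argument form is usually the safer choice because nothing needs to be stripped afterwards.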
Python provides the perfect module for this, it's called csv:
import csv
def csv_to_array(file_name, **kwargs):
    with open(file_name) as csvfile:
        reader = csv.reader(csvfile, **kwargs)
        return [list(map(float, row)) for row in reader]
print(csv_to_array('test.csv', delimiter=' '))  # the sample data is separated by spaces, not commas
If you later have a file with a different field separator, say ";", then you'll just have to change the call to:
print(csv_to_array('test.csv', delimiter=';'))
Note that if you don't care about importing numpy then this solution is even better.
To convert to this exact format:
with open('filepath', 'r') as f:
    raw = f.read()
arr = [[float(j) for j in i.split(' ')] for i in raw.splitlines()]
print(arr)
outputs:
[[5.07878, 5.078993], [7.633073, 7.63318], [2.919274, 2.919369], [3.410284, 3.410314]]
with open('blah.txt', 'r') as file:
    # split() with no argument also drops the trailing newline
    a = [[l.split()[0], l.split()[1]] for l in file.readlines()]
I need to load text from a file into a 2-dimensional list. The file contains several lines, and each line contains letters separated by commas. When I run this, I get a 2-dimensional list, but the nested lists contain single strings instead of separated values, and I cannot iterate over them. How do I solve this?
def read_matrix_file(filename):
    matrix = []
    with open(filename, 'r') as matrix_letters:
        for line in matrix_letters:
            line = line.split()
            matrix.append(line)
    return matrix
result:
[['a,p,p,l,e'], ['a,g,o,d,o'], ['n,n,e,r,t'], ['g,a,T,A,C'], ['m,i,c,s,r'], ['P,o,P,o,P']]
I need each letter in the nested lists to be a single string so I can use them.
thanks in advance
The split() function splits on whitespace by default. You can fix this by passing the string you want to split on, which in this case is a comma. The code below should work.
def read_matrix_file(filename):
    matrix = []
    with open(filename, 'r') as matrix_letters:
        for line in matrix_letters:
            line = line.split(',')
            matrix.append(line)
    return matrix
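To see why the separator matters, this short sketch applies both calls to a single line of the matrix file. The sample string is taken from the result shown in the question, with the trailing newline that a line read from a file would carry.

```python
line = 'a,p,p,l,e\n'

# No whitespace separates the letters, so the default split()
# returns the whole line as a single item.
print(line.split())          # ['a,p,p,l,e']

# Splitting on the comma separates the letters, but the newline
# remains attached to the last one.
print(line.split(','))       # ['a', 'p', 'p', 'l', 'e\n']

# Stripping the line first removes the trailing newline.
letters = line.strip().split(',')
print(letters)               # ['a', 'p', 'p', 'l', 'e']
```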
The input format you described conforms to CSV format. Python has a library just for reading CSV files. If you just want to get the job done, you can use this library to do the work for you. Here's an example:
Input(test.csv):
a,string,here
more,strings,here
Code:
>>> import csv
>>> lines = []
>>> with open('test.csv') as file:
...     reader = csv.reader(file)
...     for row in reader:
...         lines.append(row)
...
>>>
Output:
>>> lines
[['a', 'string', 'here'], ['more', 'strings', 'here']]
Using the strip() function will get rid of the new line character as well:
def read_matrix_file(filename):
    matrix = []
    with open(filename, 'r') as matrix_letters:
        for line in matrix_letters:
            line = line.split(',')
            line[-1] = line[-1].strip()
            matrix.append(line)
    return matrix
I'm working on a script that removes bad characters from a csv file and then stores the result in a list.
The script runs fine but doesn't remove the bad characters, so I'm a bit puzzled. Any pointers or help on why it's not working would be appreciated.
def remove_bad(item):
    item = item.replace("%", "")
    item = item.replace("test", "")
    return item
raw = []
with open("test.csv", "rb") as f:
    rows = csv.reader(f)
    for row in rows:
        raw.append((remove_bad(row[0].strip()),
                    row[1].strip().title()))
print raw
If I have a csv-file with one line:
tst%,testT
Then your script, slightly modified, should indeed filter the "bad" characters. I changed it to pass both items separately to remove_bad (because you mentioned you had to "remove bad characters from a csv", not only from the first column):
import csv
def remove_bad(item):
    item = item.replace("%","")
    item = item.replace("test","")
    return item
raw = []
with open("test.csv", newline="") as f:  # Python 3: the csv module expects a text-mode file
    rows = csv.reader(f)
    for row in rows:
        raw.append((remove_bad(row[0].strip()), remove_bad(row[1].strip()).title()))
print(raw)
Also, I put title() after the function call (else, "test" wouldn't get filtered out).
Output (the rows will get stored in a list as tuples, as in your example):
[('tst', 'T')]
Feel free to ask questions
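The point about calling title() after remove_bad can be checked in isolation. This is a small sketch that reuses the remove_bad function from the answer above on the sample value "testT"; the variable names title_first and filter_first are made up for the illustration.

```python
def remove_bad(item):
    item = item.replace("%", "")
    item = item.replace("test", "")
    return item


value = "testT"

# title() first turns "testT" into "Testt", which no longer contains
# the lowercase substring "test", so nothing is removed.
title_first = remove_bad(value.title())

# Filtering first removes "test" and leaves "T" for title() to format.
filter_first = remove_bad(value).title()

print(title_first)   # Testt
print(filter_first)  # T
```

The replacement is case-sensitive, which is why the order of the two calls changes the result.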
import re
import csv
p = re.compile('(test|%|anyotherchars)')  # insert your bad characters instead of anyotherchars
def remove_bad(item):
    item = p.sub('', item)
    return item
raw = []
with open("test.csv", newline="") as f:  # Python 3: the csv module expects a text-mode file
    rows = csv.reader(f)
    for row in rows:
        raw.append((remove_bad(row[0].strip()),
                    row[1].strip().title()  # do you really need strip() without args?
                    )  # this creates a tuple, which is appended to the list
                   )
print(raw)
I have a csv file with the following structure:
1234,5678,"text1"
983453,2141235,"text2"
I need to convert each line to a tuple and create a list. Here is what I did
with open('myfile.csv') as f1:
    mytuples = [tuple(line.strip().split(',')) for line in f1.readlines()]
However, I want the first 2 columns to be integers, not strings. I was not able to figure out how to continue with this, except by reading the file line by line once again and parsing it. Can I add something to the code above so that I transform str to int as I convert the file to list of tuples?
This is a csv file. Treat it as such.
import csv
with open("test.csv") as csvfile:
    reader = csv.reader(csvfile)
    result = [(int(a), int(b), c) for a,b,c in reader]
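One concrete reason to treat the file as CSV is the quoted third column. The sketch below compares plain splitting with csv.reader on the two sample rows from the question; the rows are held in an io.StringIO object instead of a file purely so that the example is self-contained, and the names naive and parsed are made up for the illustration.

```python
import csv
import io

# The two sample rows from the question, held in memory instead of a file.
sample = io.StringIO('1234,5678,"text1"\n983453,2141235,"text2"\n')

# Plain splitting leaves the quotation marks in the third field.
naive = [tuple(line.strip().split(',')) for line in sample]

sample.seek(0)  # rewind so the same data can be read again

# csv.reader removes the quotes; the first two fields become ints.
parsed = [(int(a), int(b), c) for a, b, c in csv.reader(sample)]

print(naive[0])   # ('1234', '5678', '"text1"')
print(parsed[0])  # (1234, 5678, 'text1')
```

csv.reader also keeps a quoted field intact when it contains a comma, which a plain split(',') would break apart.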
If there's a chance your input may not be what you think it is:
import csv
with open('test.csv') as csvfile:
    reader = csv.reader(csvfile)
    result = []
    for line in reader:
        this_line = []
        for col in line:
            try:
                col = int(col)
            except ValueError:
                pass
            this_line.append(col)
        result.append(tuple(this_line))
Instead of trying to cram all of the logic in a single line, just spread it out so that it is readable.
with open('myfile.csv') as f1:
    mytuples = []
    for line in f1:
        tokens = line.strip().split(',')
        mytuples.append( (int(tokens[0]), int(tokens[1]), tokens[2]) )
Real python programmers aren't afraid of using multiple lines.
You can use isdigit() to check whether every character in an element is a digit and, if so, convert that element to int. Replace the following:
tuple(line.strip().split(','))
with:
tuple(int(i) if i.isdigit() else i for i in line.strip().split(','))
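One caveat with the isdigit() check, sketched here under the assumption that the data might contain negative numbers (the sample in the question does not): a leading minus sign is not a digit, so such values would stay as strings. The two helper functions are hypothetical and exist only to compare the check with a try/except conversion.

```python
def to_int_if_digits(token):
    """Hypothetical helper: convert using the isdigit() check."""
    return int(token) if token.isdigit() else token


def to_int_if_possible(token):
    """Hypothetical helper: convert with try/except instead."""
    try:
        return int(token)
    except ValueError:
        return token


tokens = ['1234', '-5678', '"text1"']

print([to_int_if_digits(t) for t in tokens])    # [1234, '-5678', '"text1"']
print([to_int_if_possible(t) for t in tokens])  # [1234, -5678, '"text1"']
```

For columns that only ever hold non-negative integers, the two behave the same.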
You can cram this all into one line if you really want, but god help me I don't know why you'd want to. Try giving yourself room to breathe:
def get_tuple(token_list):
    return (int(token_list[0]), int(token_list[1]), token_list[2])
mytuples = []
with open('myfile.csv') as f1:
    for line in f1.readlines():
        token_list = line.strip().split(',')
        mytuples.append(get_tuple(token_list))
Isn't that way easier to read? I like list comprehension as much as the next guy, but I also like knowing what a block of code does when I sit down three weeks later and start reading it!
I'm writing a program that reads names and statistics related to those names from a file. Each line of the file is another person and their stats. For each person, I'd like to make their last name a key and everything else linked to that key in the dictionary. The program first stores data from the file in an array and then I'm trying to get those array elements into the dictionary, but I'm not sure how to do that. Plus I'm not sure if each time the for loop iterates, it will overwrite the previous contents of the dictionary. Here's the code I'm using to attempt this:
f = open("people.in", "r")
tmp = None
people
l = f.readline()
while l:
    tmp = l.split(',')
    print tmp
    people = {tmp[2] : tmp[0])
    l = f.readline()
people['Smith']
The error I'm currently getting is that the syntax is incorrect, however I have no idea how to transfer the array elements into the dictionary other than like this.
Use key assignment:
people = {}
for line in f:
    tmp = line.rstrip('\n').split(',')
    people[tmp[2]] = tmp[0]
This loops over the file object directly, no need for .readline() calls here, and removes the newline.
You appear to have CSV data; you could also use the csv module here:
import csv
people = {}
with open("people.in", newline="") as f:  # Python 3: the csv module expects a text-mode file
    reader = csv.reader(f)
    for row in reader:
        people[row[2]] = row[0]
or even a dict comprehension:
import csv
with open("people.in", newline="") as f:
    reader = csv.reader(f)
    people = {r[2]: r[0] for r in reader}
Here the csv module takes care of the splitting and removing newlines.
The syntax error stems from trying to close the opening { with a ) instead of a }:
people = {tmp[2] : tmp[0]) # should be }
If you need to collect multiple entries per row[2] value, collect these in a list; a collections.defaultdict instance makes that easier:
import csv
from collections import defaultdict
people = defaultdict(list)
with open("people.in", newline="") as f:
    reader = csv.reader(f)
    for row in reader:
        people[row[2]].append(row[0])
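On the question of whether each iteration overwrites earlier contents: key assignment only replaces the value stored under the same key. The sketch below contrasts a plain dict with defaultdict(list) on a few made-up rows. The names and the column layout (first name, a statistic, last name) are assumptions for illustration only, since the question does not show the actual file.

```python
import csv
import io
from collections import defaultdict

# Made-up rows in the assumed layout: first name, a statistic, last name.
sample = io.StringIO('John,12,Smith\nJane,7,Smith\nAnn,3,Jones\n')
rows = list(csv.reader(sample))

# Plain key assignment keeps only the last row seen for each last name.
people = {}
for row in rows:
    people[row[2]] = row[0]

# defaultdict(list) creates an empty list on first access, so every
# first name with the same last name is collected.
people_multi = defaultdict(list)
for row in rows:
    people_multi[row[2]].append(row[0])

print(people)              # {'Smith': 'Jane', 'Jones': 'Ann'}
print(dict(people_multi))  # {'Smith': ['John', 'Jane'], 'Jones': ['Ann']}
```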
In response to Generalkidd's comment above about multiple people with the same last name, here is an addition to Martijn Pieters' solution, posted as an answer for better formatting:
import csv
people = {}
with open("people.in", newline="") as f:
    reader = csv.reader(f)
    for row in reader:
        if row[2] not in people:
            people[row[2]] = []
        people[row[2]].append(row[0])