I have 200 files, and I want to extract the second column from each of them. I want to store the second column of each file in a list called "colv", so that colv[0] is the second column of the first file, colv[1] is the second column of the second file, and so on.
I wrote this code, but it does not work: colv[0] ends up being the first number of the second column of the first file. Can anyone help me fix this?
import os

colv = []
i = 1
colvar = "step7_1.colvar"
while os.path.isfile(colvar):
    with open(colvar, "r") as f_in:
        line = next(f_in)
        for line in f_in:
            a = line.split()[1]
            colv.append(a)
    i+=1
    colvar = "step7_%d.colvar" %i
How about using Pandas' read_csv() since you mention that the data has a table-like structure. In particular, you could use
import os
import pandas as pd

colv = []
i = 1
colvar = "step7_1.colvar"
while os.path.isfile(colvar):
    df = pd.read_csv(colvar, sep=',')
    # take the second column of this file as a plain list
    colv.append(list(df[df.columns[1]]))
    i+=1
    colvar = "step7_%d.colvar" %i
It returned
>colv
[[5, 6, 7, 8], [8, 9, 10, 11], [12, 13, 14, 15]]
for my vanilla sample files step7_%d.colvar.
You might need to adjust the separator character with sep.
Use a list comprehension to get the 2nd element of all the lines into a list.
import os

colv = []
i = 1
colvar = "step7_1.colvar"
while os.path.isfile(colvar):
    with open(colvar, "r") as f_in:
        next(f_in) # skip 1st line
        data = [line.split()[1] for line in f_in]
        colv.append(data)
    i+=1
    colvar = "step7_%d.colvar" %i
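The same idea can be wrapped in a small function and tried on throwaway files. This is only a sketch: the function name read_second_columns, the header line "#! FIELDS time cv" and the sample numbers are made up for illustration, and it assumes whitespace-separated columns with a single header line.

```python
import os
import tempfile


def read_second_columns(template, start=1, skip_header=True):
    """Collect column 2 of template % 1, template % 2, ... until a file is missing."""
    columns = []
    i = start
    while os.path.isfile(template % i):
        with open(template % i) as f_in:
            if skip_header:
                next(f_in, None)  # skip the first line, if there is one
            # one inner list per file; blank lines are ignored
            columns.append([line.split()[1] for line in f_in if line.strip()])
        i += 1
    return columns


# Demo on two hypothetical sample files written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for n, rows in enumerate([["1 5", "2 6"], ["1 8", "2 9"]], start=1):
        with open(os.path.join(tmp, "step7_%d.colvar" % n), "w") as f_out:
            f_out.write("#! FIELDS time cv\n" + "\n".join(rows) + "\n")
    colv = read_second_columns(os.path.join(tmp, "step7_%d.colvar"))

print(colv)
```

The values stay strings here; wrap line.split()[1] in float(...) if numbers are needed.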
I'm converting a .txt file with annotations into another annotation format in a .csv file. The annotation format is as follows: filepath,x1,y1,x2,y2,classname. For pictures that contain no instance of any class, the annotation looks like this: filepath,,,,,.
The problem is that the .writerow method of the csv.writer class doesn't write more than one comma in a row.
My code looks like this:
import csv

with open(annotation_file, 'r') as file:
    lines = file.readlines()
splitted_lines = [line.split(' ') for line in lines]
with open(out_file, 'w', newline = '') as out:
    csv_writer = csv.writer(out,delimiter= ';' )
    for l in splitted_lines:
        if len(l) == 1:
            # indicate empty images
            csv_writer.writerow([l[0] + ',,,,,'])
l is a list that contains a single string, so by l[0] + ',,,,,' I want to concatenate l with five commas.
Thank you in advance
Set the missing values to empty strings and pad the list:
import csv

with open(annotation_file, 'r') as file:
    lines = file.readlines()
splitted_lines = [line.split(' ') for line in lines]
with open(out_file, 'w', newline='') as out:
    csv_writer = csv.writer(out, delimiter=';')
    for l in splitted_lines:
        if len(l) == 1:
            # indicate empty images
            csv_writer.writerow(l + ['' for _ in range(5)])
        else:
            csv_writer.writerow(l)
Given sample data:
data = [
[1, 2, 3, 4, 5, 6],
[1, 2, 3, 4, 5, 6],
[1, 2, 3, 4, 5, 6],
[1],
]
it outputs:
1;2;3;4;5;6
1;2;3;4;5;6
1;2;3;4;5;6
1;;;;;
which is in line with what you want.
I found my problem: l[0] is a string that contained a '\n' at the end. Because of this, the writer was not able to write the five commas as intended. Changing the code as shown below fixed the problem.
with open(annotation_file, 'r') as file:
    lines = file.readlines()
splitted_lines = [line.split(' ') for line in lines]
with open(out_file, 'w', newline = '') as out:
    csv_writer = csv.writer(out,delimiter= ';' )
    for l in splitted_lines:
        if len(l) == 1:
            # indicate empty images
            l[0] = l[0].replace('\n', '')
            csv_writer.writerow([l[0] + ',,,,,'])
        else:
            csv_writer.writerow(['something else'])
Thanks anyway, @DelphiX
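The two observations in this thread (pad short rows with empty strings, and get rid of the trailing newline before writing) can be combined. The following is a minimal sketch, not the original poster's code: the function name write_annotations, the n_fields parameter and the sample lines are hypothetical, and an in-memory buffer stands in for the output file.

```python
import csv
import io


def write_annotations(lines, out, n_fields=6):
    """Write one ';'-separated row per input line, padding short rows."""
    writer = csv.writer(out, delimiter=';')
    for line in lines:
        # split() with no argument splits on any whitespace,
        # so the trailing '\n' never ends up inside a field
        fields = line.split()
        # pad with empty strings up to n_fields columns
        fields += [''] * (n_fields - len(fields))
        writer.writerow(fields)


buf = io.StringIO()
write_annotations(["a.jpg 1 2 3 4 cat\n", "b.jpg\n"], buf)
print(buf.getvalue())
```

Note that csv.writer terminates rows with '\r\n' by default, which is why the output file is opened with newline='' in the code above.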
I have a function called new_safe(), and I made a list lst with new safe numbers. Now I need to read the safe.txt file with a for loop. Every number that is in safe.txt needs to be deleted from the list I made.
def nieuwe_kluis():
    lst = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
    for lijst in lst:
        print(lijst)
    file = open('kluizen.txt', 'r')
    for line in file:
        if lst == file:
            lst.remove(file)
        print(line)
You get the lines from the file, but you use file in the comparison and the remove rather than line.
You also don't check if the line (which I assume is one number per line) is a number within the list. Instead you just try to compare the list itself.
Assuming that each line in the file contains one number, you should try something along the lines of
for line in file:
    number = int(line)
    if number in lst:
        lst.remove(number)
    print(line)
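As a self-contained illustration of that approach, here is a sketch that builds a new list instead of removing items one by one. The function name remove_used and the sample lines are made up; in the real program the lines would come from iterating over the open file.

```python
def remove_used(numbers, lines):
    """Return the numbers that do not appear in lines (one number per line)."""
    used = set()
    for line in lines:
        line = line.strip()
        if line:  # ignore blank lines
            used.add(int(line))
    return [n for n in numbers if n not in used]


# Stand-in for the lines of the file; assumed to hold one number each.
sample_lines = ["3\n", "7\n", "\n", "12\n"]
available = remove_used(list(range(1, 13)), sample_lines)
print(available)
```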
Basically, I have a list of line numbers in a CSV file, and I want to concatenate the rows according to that list.
For instance, my list is [0, 7, 10, 11, 27, 31].
This means I want to concatenate lines 1 to 7 into a single row,
lines 8 to 10 into a single row,
line 11 to 11 (the same line, so it will simply do nothing),
lines 12 to 27,
and lines 28 to 31.
I have tried using a while loop and islice from itertools. However, I only get the output of lines 1 to 7.
Here is my code.
import csv
from itertools import islice

with open('csvtest.csv', 'rb') as f:
    reader = csv.reader(f)
    #row1 = next(reader)
    merged = []
    list = [0, 7, 10, 11, 27, 31]
    x=0
    while x < len(list):
        for line in islice(f, list[x], list[x+1]):
            #print line1
            line = line.rstrip()
            merged.append(line)
        x += 1
    print merged #gives ['fsfs', 'sf', '1231', 'afsa', '', '', 'asfasfsaf;0'] which is lines 1 to 7
Would anyone let me know what happened to my while loop? Or is it a problem with the append list part?
I have fixed the code; basically, you need to change how you use islice.
I updated the answer on the basis of the new information.
import csv
from itertools import islice

with open('output2.csv','wb') as w:
    writer = csv.writer(w)
    delimiter_list = []
    merged = []
    with open('csvtest.csv', 'rb') as f:
        reader = csv.reader(f)
        for num, line in enumerate(reader, 1):
            line = (" ".join(line))
            if line.endswith(';0'):
                #print 'found at line:', num
                delimiter_list.append(num)
    with open('csvtest.csv', 'rb') as f:
        x=0
        while x < len(delimiter_list)-1:
            row = []
            # islice(f,N) returns next N lines
            for line in islice(f, delimiter_list[x+1]-delimiter_list[x]):
                line = line.rstrip()
                row.append(line)
            x += 1
            # add each row to final list
            merged.append(row)
    print merged
    writer.writerows(merged)
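The key point is that islice consumes the underlying iterator, so each call should ask for the next N lines rather than for absolute start and stop positions. A minimal Python 3 sketch of that idea follows; the function name group_by_boundaries and the sample data are hypothetical, and the boundaries are assumed to be cumulative line counts starting at 0.

```python
from itertools import islice


def group_by_boundaries(lines, boundaries):
    """Split lines into consecutive groups; boundaries are cumulative counts."""
    it = iter(lines)  # a single shared iterator, consumed group by group
    groups = []
    for start, end in zip(boundaries, boundaries[1:]):
        # islice(it, n) yields the next n items from wherever `it` currently is
        groups.append([line.rstrip() for line in islice(it, end - start)])
    return groups


lines = ["l%d\n" % i for i in range(1, 11)]
groups = group_by_boundaries(lines, [0, 3, 4, 10])
print(groups)
```

With an open file, the file object itself can be passed as lines, since it is already an iterator over its lines.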
I'm a Python noob and I'm stuck on a problem.
filehandler = open("data.txt", "r")
alist = filehandler.readlines()

def insertionSort(alist):
    for line in alist:
        line = list(map(int, line.split()))
        print(line)
        for index in range(2, len(line)):
            currentvalue = line[index]
            position = index
            while position>1 and line[position-1]>currentvalue:
                line[position]=line[position-1]
                position = position-1
            line[position]=currentvalue
        print(line)

insertionSort(alist)
for line in alist:
    print line
Output:
[4, 19, 2, 5, 11]
[4, 2, 5, 11, 19]
[8, 1, 2, 3, 4, 5, 6, 1, 2]
[8, 1, 1, 2, 2, 3, 4, 5, 6]
4 19 2 5 11
8 1 2 3 4 5 6 1 2
I am supposed to sort lines of values from a file. The first value in the line represents the number of values to be sorted. I am supposed to display the values in the file in sorted order.
The print calls in insertionSort are just for debugging purposes.
The top four lines of output show that the insertion sort seems to be working. I can't figure out why when I print the lists after calling insertionSort the values are not sorted.
I am new to Stack Overflow and Python so please let me know if this question is misplaced.
for line in alist:
    line = list(map(int, line.split()))
line starts out as, e.g., "4 19 2 5 11". You split it and convert to int, i.e. [4, 19, 2, 5, 11].
You then assign this new value to line, but line is a local variable; the new value never gets stored back into alist.
Also, avoid using list as a variable name, because there is already a list data type (and the variable name will keep you from being able to use the data type).
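A short demonstration of that rebinding behavior, using made-up sample rows: assigning to the loop variable leaves the list untouched, while assigning through the index changes it.

```python
rows = ["4 19 2 5 11", "8 1 2"]

# Rebinding the loop variable only changes the local name `row`.
for row in rows:
    row = [int(i) for i in row.split()]
after_rebinding = list(rows)  # snapshot: still the original strings

# Writing through the index stores the new value in the list itself.
for k, row in enumerate(rows):
    rows[k] = [int(i) for i in row.split()]

print(after_rebinding)
print(rows)
```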
Let's reorganize your program:
def load_file(fname):
    with open(fname) as inf:
        # -> list of list of int
        data = [[int(i) for i in line.split()] for line in inf]
    return data

def insertion_sort(row):
    # `row` is a list of int
    #
    # your sorting code goes here
    #
    return row

def save_file(fname, data):
    with open(fname, "w") as outf:
        # list of list of int -> list of str
        lines = [" ".join(str(i) for i in row) for row in data]
        outf.write("\n".join(lines))

def main():
    data = load_file("data.txt")
    data = [insertion_sort(row) for row in data]
    save_file("sorted_data.txt", data)

if __name__ == "__main__":
    main()
Actually, with your data - where the first number in each row isn't actually data to sort - you would be better to do
data = [row[:1] + insertion_sort(row[1:]) for row in data]
so that the logic of insertion_sort is cleaner.
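For completeness, here is one way the placeholder could be filled in. This is a sketch of a plain insertion sort that returns a sorted copy, applied with the row[:1] + insertion_sort(row[1:]) pattern; the sample rows are the two lines from the question's output.

```python
def insertion_sort(values):
    """Return a sorted copy of values using insertion sort."""
    result = list(values)  # copy, so the caller's list is left alone
    for index in range(1, len(result)):
        current = result[index]
        position = index
        # shift larger elements one slot to the right
        while position > 0 and result[position - 1] > current:
            result[position] = result[position - 1]
            position -= 1
        result[position] = current
    return result


rows = [[4, 19, 2, 5, 11], [8, 1, 2, 3, 4, 5, 6, 1, 2]]
# keep the leading count in place and sort only the values after it
sorted_rows = [row[:1] + insertion_sort(row[1:]) for row in rows]
print(sorted_rows)
```

Because the count is sliced off before sorting, the loop can start at index 1 and compare down to index 0, without the special-casing of position > 1 in the original code.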
As @Barmar mentioned above, you are not modifying the input to the function. You could do the following:
def insertionSort(alist):
    blist = []
    for line in alist:
        line = list(map(int, line.split()))
        for index in range(2, len(line)):
            currentvalue = line[index]
            position = index
            while position>1 and line[position-1]>currentvalue:
                line[position]=line[position-1]
                position = position-1
            line[position]=currentvalue
        blist.append(line)
    return blist

blist = insertionSort(alist)
print(blist)
Alternatively, modify alist "in-place":
def insertionSort(alist):
    for k, line in enumerate(alist):
        line = list(map(int, line.split()))
        for index in range(2, len(line)):
            currentvalue = line[index]
            position = index
            while position>1 and line[position-1]>currentvalue:
                line[position]=line[position-1]
                position = position-1
            line[position]=currentvalue
        alist[k] = line

insertionSort(alist)
print(alist)
I don't suppose someone could point me in the right direction?
I'm wondering how best to pull values out of a text file, then break them up and put them back into lists at the same positions as their corresponding values.
I'm sorry if this isn't clear; maybe this will make it clearer. This is the code that writes the file:
# while loop
with open('values', 'a') as o:
    o.write("{}, {}, {}, {}, {}\n".format(FirstName[currentRow], Surname[currentRow], AnotherValue[currentRow], numberA, numberB))
currentRow += 1
I would like to do the opposite: take the values, formatted as above, and put them back into lists at the same positions. Something like:
# while loop
with open('values', 'r') as o:
    o.read("{}, {}, {}, {}, {}\n".format(FirstName[currentRow], Surname[currentRow], AnotherValue[currentRow], numberA, numberB))
currentRow += 1
Thanks
I think the best corresponding way to do it is calling split on the text read in:
FirstName[currentRow], Surname[currentRow], AnotherValue[currentRow], numberA, numberB = o.readline().strip().split(", ")
There is no real equivalent of formatted input, like scanf in C.
You should be able to do something like the following:
first_names = []
surnames = []
another_values = []
number_as = []
number_bs = []
for line in open('values', 'r'):
    # the file was written with ", " between the fields
    fn, sn, av, na, nb = line.strip().split(', ')
    first_names.append(fn)
    surnames.append(sn)
    another_values.append(av)
    number_as.append(float(na))
    number_bs.append(float(nb))
The for line in open() part iterates over each line in the file, and the fn, sn, av, na, nb = line.strip().split(', ') bit strips the newline \n off the end of each line and splits it on the ", " separators.
In practice though I would probably use the CSV Module or something like Pandas which handle edge cases better. For example the above approach will break if a name or some other value has a comma in it!
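A minimal sketch of the csv-module variant, assuming the ", " separated layout from the question. The sample rows are invented, an in-memory buffer stands in for the 'values' file, and the quoted name only parses correctly because it is quoted in the input, which the writing code in the question does not do by itself.

```python
import csv
import io

# Stand-in for the 'values' file; the second row has a comma inside a quoted name.
sample = io.StringIO('Ann, Lee, x, 1.5, 2\n"Smith, Jr.", Bob, y, 3, 4.25\n')

first_names, surnames, another_values = [], [], []
number_as, number_bs = [], []

# skipinitialspace=True drops the blank that follows each comma
reader = csv.reader(sample, skipinitialspace=True)
for fn, sn, av, na, nb in reader:
    first_names.append(fn)
    surnames.append(sn)
    another_values.append(av)
    number_as.append(float(na))
    number_bs.append(float(nb))

print(first_names, surnames, number_as, number_bs)
```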
with open('values.txt', 'r') as f:
    first_names, last_names, another_values, a_values, b_values = \
        (list(tt) for tt in zip(*[l.rstrip().split(', ') for l in f]))
Unless you need to modify the results, the list(tt) for tt in conversion is unnecessary, since zip already yields tuples.
In Python 2, you may use izip from itertools instead of zip.
If you are allowed to choose the file format, saving and loading the data as JSON may be useful.
import json

# test data (list() is needed in Python 3, where range objects are not JSON serializable)
FirstName = list(range(5))
Surname = list(range(5, 11))
AnotherValue = list(range(11, 16))
numberAvec = list(range(16, 21))
numberBvec = list(range(21, 26))
# save
columns = [FirstName, Surname, AnotherValue, numberAvec, numberBvec]
with open("values.txt", "w") as fp:
    json.dump(columns, fp)
# load
with open("values.txt", "r") as fp:
    FirstName, Surname, AnotherValue, numberAvec, numberBvec = json.load(fp)
values.txt:
[[0, 1, 2, 3, 4], [5, 6, 7, 8, 9, 10], [11, 12, 13, 14, 15], [16, 17, 18, 19, 20], [21, 22, 23, 24, 25]]