When I run this function on the file with 10 lines, it prints out the lines, but returns the length of 0. If I reverse the order, it prints out the length, but not the file content. I suppose this is a scope related issue, but not sure how to fix it
def read_samples(name):
with open ( '../data/samples/' + name + '.csv', encoding='utf-8', newline='') as file:
data = csv.reader(file)
for row in data:
print (row)
lines = len ( list ( data ) )
print(lines)
You are getting 0 because you have already looped over data.So now it is 0.It is an iterator which is consumed.
def read_samples(name):
with open ( '../data/samples/' + name + '.csv', encoding='utf-8', newline='') as file:
data = csv.reader(file)
x=0
for row in data:
x+=1
print (row)
lines = x
print(lines)
The file reader remembers its place in the file like a bookmark
Printing each line and getting the length both move the bookmark all the way to the end of the file
Add data.seek(0) between the loop and getting the length. This moves the bookmark back to the start of the file
Try
def read_samples(name):
with open ( '../data/samples/' + name + '.csv', encoding='utf-8', newline='') as file:
my_list = []
data = csv.reader(file)
for row in data:
print (row)
my_list.append(row)
lines = len (my_list)
print(lines)
The csv.reader() function returns an iterator (which are "consumed" when you use them) so you can only use data once. You can coerce convert it to a list before you do any operations on it to do what you want, e.g. data = list(data).
Related
I created a program to create a csv where every number from 0 to 1000000
import csv
nums = list(range(0,1000000))
with open('codes.csv', 'w') as f:
writer = csv.writer(f)
for val in nums:
writer.writerow([val])
then another program to remove a number from the file taken as input
import csv
import os
while True:
members= input("Please enter a number to be deleted: ")
lines = list()
with open('codes.csv', 'r') as readFile:
reader = csv.reader(readFile)
for row in reader:
if all(field != members for field in row):
lines.append(row)
else:
print('Removed')
os.remove('codes.csv')
with open('codes.csv', 'w') as writeFile:
writer = csv.writer(writeFile)
writer.writerows(lines)
The above code is working fine on any other device except my pc, in the first program it creates the csv file with empty rows between every number, in the second program the number of empty rows multiplies and the file size also multiples.
what is wrong with my device then?
Thanks in advance
I think you shouldn't use a csv file for single column data. Use a json file instead.
And the code that you've written for checking which value to not remove, is unnecessary. Instead you could write a list of numbers to the file, and read it back to a variable while removing a number you desire to, using the list.remove() method.
And then write it back to the file.
Here's how I would've done it:
import json
with open("codes.json", "w") as f: # Write the numbers to the file
f.write(json.dumps(list(range(0, 1000000))))
nums = None
with open("codes.json", "r") as f: # Read the list in the file to nums
nums = json.load(f)
to_remove = int(input("Number to remove: "))
nums.remove(to_remove) # Removes the number you want to
with open("codes.json", "w") as f: # Dump the list back to the file
f.write(json.dumps(nums))
Seems like you have different python versions.
There is a difference between built-in python2 open() and python3 open(). Python3 defaults to universal newlines mode, while python2 newlines depends on mode argument open() function.
CSV module docs provides a few examples where open() called with newline argument explicitly set to empty string newline='':
import csv
with open('some.csv', 'w', newline='') as f:
writer = csv.writer(f)
writer.writerows(someiterable)
Try to do the same. Probably without explicit newline='' your writerows calls add one more newline character.
CSV file from English - Comma-Separated Values, you have a record with spaces.
To remove empty lines - when opening a file for writing, add newline="".
Since this format is tabular data, you cannot simply delete the element, the table will go. It is necessary to insert an empty string or "NaN" instead of the deleted element.
I reduced the number of entries and made them in the form of a table for clarity.
import csv
def write_csv(file, seq):
with open(file, 'w', newline='') as f:
writer = csv.writer(f)
for val in seq:
writer.writerow([v for v in val])
nums = ((j*10 + i for i in range(0, 10)) for j in range(0, 10))
write_csv('codes.csv', nums)
nums_new = []
members = input("Please enter a number, from 0 to 100, to be deleted: ")
with open('codes.csv', 'r') as f:
reader = csv.reader(f)
for row in reader:
rows_new = []
for elem in row:
if elem == members:
elem = ""
rows_new.append(elem)
nums_new.append(rows_new)
write_csv('codesdel.csv', nums_new)
I've got a csv file such as:
cutsets
x1
x3,x5
x2
x4,x6
x5,x7
x6,x8
x7,x9
x6,x8,x10
I run the following Py script:
import csv
# Reads Boolean expression from cutsets file
expr = []
with open("MCS_overlap.csv", "r") as csv_file:
csv_reader = csv.reader(csv_file)
# skip the first row
next(csv_reader)
for lines in csv_reader:
expr = expr + lines + ['|']
del expr[-1]
final_expr=str(''.join(expr)).replace(",","&")
print("The Boolean expression is")
print(final_expr)
and get the output:
The Boolean expression is
x1|x3x5|x2|x4x6|x5x7|x6x8|x7x9|x6x8x10
With final_expr=str(''.join(expr)).replace(",","&") I was hoping to get a "&" between any two variables enclosed by a "|", e.g. "x4&x6","x6&x8&x10". But as can be seen the variables were simply concatenated. How do I accomplish insert "&" given I cannot change the format of the input file?
Thanks
Gui
Here you go:
expr =[]
f = open('MCS_overlap.csv')
expr.append(f.read())
final_expr = expr[0].replace('\n', '|').replace(',', '&')
print(final_expr)
Prints:
'x1|x3&x5|x2|x4&x6|x5&x7|x6&x8|x7&x9|x6&x8&x10'
Because you are using csv module, lines is a list and as a result expr is a list with elements being all x-es and some pipes |. You can print to see for yourself. When you do ''.join(expr) it just concatenates all elements, no commas (i.e. nothing to replace).
this should do
import csv
# Reads Boolean expression from cutsets file
with open("MCS_overlap.csv", "r") as csv_file:
csv_reader = csv.reader(csv_file)
# skip the first row
next(csv_reader)
lines = ('&'.join(line) for line in csv_reader)
final_expr = '|'.join(lines)
print(final_expr)
of course, you can do without csv module
with open("MCS_overlap.csv", "r") as csv_file:
next(csv_file)
lines = (line.strip().replace(',', "&") for line in csv_file)
final_expr = '|'.join(lines)
print(final_expr)
Note, both snippets not tested, but I expect to do the task for you.
This is my code i am able to print each line but when blank line appears it prints ; because of CSV file format, so i want to skip when blank line appears
import csv
import time
ifile = open ("C:\Users\BKA4ABT\Desktop\Test_Specification\RDBI.csv", "rb")
for line in csv.reader(ifile):
if not line:
empty_lines += 1
continue
print line
If you want to skip all whitespace lines, you should use this test: ' '.isspace().
Since you may want to do something more complicated than just printing the non-blank lines to the console(no need to use CSV module for that), here is an example that involves a DictReader:
#!/usr/bin/env python
# Tested with Python 2.7
# I prefer this style of importing - hides the csv module
# in case you do from this_file.py import * inside of __init__.py
import csv as _csv
# Real comments are more complicated ...
def is_comment(line):
return line.startswith('#')
# Kind of sily wrapper
def is_whitespace(line):
return line.isspace()
def iter_filtered(in_file, *filters):
for line in in_file:
if not any(fltr(line) for fltr in filters):
yield line
# A dis-advantage of this approach is that it requires storing rows in RAM
# However, the largest CSV files I worked with were all under 100 Mb
def read_and_filter_csv(csv_path, *filters):
with open(csv_path, 'rb') as fin:
iter_clean_lines = iter_filtered(fin, *filters)
reader = _csv.DictReader(iter_clean_lines, delimiter=';')
return [row for row in reader]
# Stores all processed lines in RAM
def main_v1(csv_path):
for row in read_and_filter_csv(csv_path, is_comment, is_whitespace):
print(row) # Or do something else with it
# Simpler, less refactored version, does not use with
def main_v2(csv_path):
try:
fin = open(csv_path, 'rb')
reader = _csv.DictReader((line for line in fin if not
line.startswith('#') and not line.isspace()),
delimiter=';')
for row in reader:
print(row) # Or do something else with it
finally:
fin.close()
if __name__ == '__main__':
csv_path = "C:\Users\BKA4ABT\Desktop\Test_Specification\RDBI.csv"
main_v1(csv_path)
print('\n'*3)
main_v2(csv_path)
Instead of
if not line:
This should work:
if not ''.join(line).strip():
my suggestion would be to just use the csv reader who can delimite the file into rows. Like this you can just check whether the row is empty and if so just continue.
import csv
with open('some.csv', 'r') as csvfile:
# the delimiter depends on how your CSV seperates values
csvReader = csv.reader(csvfile, delimiter = '\t')
for row in csvReader:
# check if row is empty
if not (row):
continue
You can always check for the number of comma separated values. It seems to be much more productive and efficient.
When reading the lines iteratively, as these are a list of comma separated values you would be getting a list object. So if there is no element (blank link), then we can make it skip.
with open(filename) as csv_file:
csv_reader = csv.reader(csv_file, delimiter=",")
for row in csv_reader:
if len(row) == 0:
continue
You can strip leading and trailing whitespace, and if the length is zero after that the line is empty.
import csv
with open('userlist.csv') as f:
reader = csv.reader(f)
user_header = next(reader) # Add this line if there the header is
user_list = [] # Create a new user list for input
for row in reader:
if any(row): # Pick up the non-blank row of list
print (row) # Just for verification
user_list.append(row) # Compose all the rest data into the list
This example just prints the data in array form while skipping the empty lines:
import csv
file = open("data.csv", "r")
data = csv.reader(file)
for line in data:
if line: print line
file.close()
I find it much clearer than the other provided examples.
import csv
ifile=csv.reader(open('C:\Users\BKA4ABT\Desktop\Test_Specification\RDBI.csv', 'rb'),delimiter=';')
for line in ifile:
if set(line).pop()=='':
pass
else:
for cell_value in line:
print cell_value
This is my code i am able to print each line but when blank line appears it prints ; because of CSV file format, so i want to skip when blank line appears
import csv
import time
ifile = open ("C:\Users\BKA4ABT\Desktop\Test_Specification\RDBI.csv", "rb")
for line in csv.reader(ifile):
if not line:
empty_lines += 1
continue
print line
If you want to skip all whitespace lines, you should use this test: ' '.isspace().
Since you may want to do something more complicated than just printing the non-blank lines to the console(no need to use CSV module for that), here is an example that involves a DictReader:
#!/usr/bin/env python
# Tested with Python 2.7
# I prefer this style of importing - hides the csv module
# in case you do from this_file.py import * inside of __init__.py
import csv as _csv
# Real comments are more complicated ...
def is_comment(line):
return line.startswith('#')
# Kind of sily wrapper
def is_whitespace(line):
return line.isspace()
def iter_filtered(in_file, *filters):
for line in in_file:
if not any(fltr(line) for fltr in filters):
yield line
# A dis-advantage of this approach is that it requires storing rows in RAM
# However, the largest CSV files I worked with were all under 100 Mb
def read_and_filter_csv(csv_path, *filters):
with open(csv_path, 'rb') as fin:
iter_clean_lines = iter_filtered(fin, *filters)
reader = _csv.DictReader(iter_clean_lines, delimiter=';')
return [row for row in reader]
# Stores all processed lines in RAM
def main_v1(csv_path):
for row in read_and_filter_csv(csv_path, is_comment, is_whitespace):
print(row) # Or do something else with it
# Simpler, less refactored version, does not use with
def main_v2(csv_path):
try:
fin = open(csv_path, 'rb')
reader = _csv.DictReader((line for line in fin if not
line.startswith('#') and not line.isspace()),
delimiter=';')
for row in reader:
print(row) # Or do something else with it
finally:
fin.close()
if __name__ == '__main__':
csv_path = "C:\Users\BKA4ABT\Desktop\Test_Specification\RDBI.csv"
main_v1(csv_path)
print('\n'*3)
main_v2(csv_path)
Instead of
if not line:
This should work:
if not ''.join(line).strip():
my suggestion would be to just use the csv reader who can delimite the file into rows. Like this you can just check whether the row is empty and if so just continue.
import csv
with open('some.csv', 'r') as csvfile:
# the delimiter depends on how your CSV seperates values
csvReader = csv.reader(csvfile, delimiter = '\t')
for row in csvReader:
# check if row is empty
if not (row):
continue
You can always check for the number of comma separated values. It seems to be much more productive and efficient.
When reading the lines iteratively, as these are a list of comma separated values you would be getting a list object. So if there is no element (blank link), then we can make it skip.
with open(filename) as csv_file:
csv_reader = csv.reader(csv_file, delimiter=",")
for row in csv_reader:
if len(row) == 0:
continue
You can strip leading and trailing whitespace, and if the length is zero after that the line is empty.
import csv
with open('userlist.csv') as f:
reader = csv.reader(f)
user_header = next(reader) # Add this line if there the header is
user_list = [] # Create a new user list for input
for row in reader:
if any(row): # Pick up the non-blank row of list
print (row) # Just for verification
user_list.append(row) # Compose all the rest data into the list
This example just prints the data in array form while skipping the empty lines:
import csv
file = open("data.csv", "r")
data = csv.reader(file)
for line in data:
if line: print line
file.close()
I find it much clearer than the other provided examples.
import csv
ifile=csv.reader(open('C:\Users\BKA4ABT\Desktop\Test_Specification\RDBI.csv', 'rb'),delimiter=';')
for line in ifile:
if set(line).pop()=='':
pass
else:
for cell_value in line:
print cell_value
I just want the last number of each line.
with open(home + "/Documents/stocks/" + filePath , newline='') as f:
stockArray = (line.split(',') for line in f.readlines())
for line in stockArray:
List = line.pop()
#print(line.pop())
#print(', '.join(line))
else:
print("Finished")
I tried using the line.pop() to take the last element but it only takes it from one line? How can I get it from each line and store it in list?
You probably just want something like:
last_col = [line.split(',')[-1] for line in f]
For more complicated csv files, you might want to look into the csv module in the standard library as that will properly handle quoting of fields, etc.
my_list = []
with open(home + "/Documents/stocks/" + filePath , newline='') as f:
for line in f:
my_list.append(line[-1]) # adds the last character to the list
That should do it.
If you want to add the last element of a list from the file:
my_list = []
with open(home + "/Documents/stocks/" + filePath , newline='') as f:
for line in f:
my_list.append(line.split(',')[-1]) # adds the last character to the list