I'm trying to print the rows of a .prn file whose value falls between two heights.
The problem is that the delimiter only consumes a single whitespace character, not the whole run of spaces between columns.
The code itself works, because it handles .csv and .txt files correctly. I get the error on the first line of the loop because it picks up the empty field after the second space instead of the first number. I also tried a tab as the delimiter, but then the error I get is "list index out of range".
This is the error I got:
    Traceback (most recent call last):
      File "c:/Users/name/Desktop/script.py", line 86, in
        col = float(row[4].replace(',','.'))
    ValueError: could not convert string to float: ''
    with open(fileName) as File:
        reader = csv.reader(File, delimiter=delimiterOption)
        next(reader)
        for row in reader:
            col = float(row[4].replace(',','.'))
            if col>=float(height1) and col<=float(height2) or col==float(height1) and col==float(height2) or col<=float(height1) and col>=float(height2):
                print(row)
This is the .prn file
Example: if I enter the values 133 and 135, it should print the first line (name1).
You're having problems with your file because the csv module expects CSV files to have their values separated by a single character. Your file's values are separated by several space characters, not just one. So when it sees a string like "foo    bar" (with four spaces between the real words), it thinks it's looking at five values, the middle three of which are blank.
Fortunately, there is a way around this issue in this specific case. The reader class accepts a parameter named skipinitialspace which tells it to skip any extra whitespace immediately after a separator character. This works even if the separator is whitespace itself!
So try:
    reader = csv.reader(File, delimiter=" ", skipinitialspace=True)
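As a rough, self-contained illustration of the suggestion above (Python 3): the sample data below is hypothetical, since the original .prn file is not shown, and the column layout is an assumption. Note that this only collapses runs of spaces between fields; trailing spaces at the end of a line may still produce an empty last field.

```python
import csv
import io

# Hypothetical .prn-style sample: columns padded with runs of spaces.
sample = io.StringIO(
    "name  a    b   c   height\n"
    "name1 1    2   3   133,5\n"
    "name2 4    5   6   140,2\n"
)

reader = csv.reader(sample, delimiter=" ", skipinitialspace=True)
next(reader)  # skip the header row

low, high = sorted((133.0, 135.0))  # accept the two heights in either order
matches = []
for row in reader:
    height = float(row[4].replace(",", "."))  # decimal comma -> decimal point
    if low <= height <= high:
        matches.append(row)

print(matches)
```

Sorting the two bounds first replaces the three-way condition in the question with a single range check.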
This is how my data looks when opened in Microsoft Excel.
As can be seen, all the cell contents except 218 (aligned to the right of the cell) are parsed as strings (aligned to the left of the cell). This is because they start with whitespace (it is " 4,610" instead of "4610").
I would like to remove the whitespace at the beginning and also remove those commas (not the ones that separate the CSV fields), because if the comma stays, 4 and 610 may be read into different cells.
Here's what I tried, with inspiration from this Stack Overflow answer:
    import csv
    import string

    with open("old_dirty_file.csv") as bad_file:
        reader = csv.reader(bad_file, delimiter=",")
        with open("new_clean_file.csv", "w", newline="") as clean_file:
            writer = csv.writer(clean_file)
            for rec in reader:
                writer.writerow(map(str.replace(__old=',', __new='').strip, rec))
But, I get this error:
    Traceback (most recent call last):
      File "C:/..,,../clean_files.py", line 9, in <module>
        writer.writerow(map(str.replace(__old=',', __new='').strip, rec))
    TypeError: descriptor 'replace' of 'str' object needs an argument
How do I clean those files?
You just need to separate the replacement from the stripping, because Python doesn't know which string the replacement should be made in.

    for rec in reader:
        # str.replace takes its old/new arguments positionally, not as keywords
        rec = (i.replace(',', '') for i in rec)
        writer.writerow(map(str.strip, rec))

Or combine them into a single function:

    repstr = lambda string, old=',', new='': string.replace(old, new).strip()
    for rec in reader:
        writer.writerow(map(repstr, rec))
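For reference, a minimal end-to-end sketch of the same idea in Python 3, using in-memory files so it runs on its own. The sample rows and the helper name clean_cell are made up for illustration; the real old_dirty_file.csv is not shown in the question.

```python
import csv
import io


def clean_cell(cell):
    """Drop thousands-separator commas, then strip surrounding whitespace."""
    return cell.replace(",", "").strip()


# Hypothetical stand-in for old_dirty_file.csv (quoted cells keep their inner commas).
dirty = io.StringIO('name," 4,610",218\nother," 12,005"," 7"\n')
out = io.StringIO()

writer = csv.writer(out, lineterminator="\n")
for rec in csv.reader(dirty):
    writer.writerow([clean_cell(c) for c in rec])

cleaned = out.getvalue()
print(cleaned)
```

With real files, the two StringIO objects would be replaced by open("old_dirty_file.csv", newline="") and open("new_clean_file.csv", "w", newline="").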
I want to find the delimiter in a text file.
The text looks like this:
ID; Name
1; John Mak
2; David H
4; Herry
The file contains tabs along with the delimiter.
I tried the following:
    with open(filename, 'r') as f1:
        dialect = csv.Sniffer().sniff(f1.read(1024), "\t")
        print 'Delimiter:', dialect.delimiter
The result shows: Delimiter:
Expected result: Delimiter: ;
sniff can only settle on a single character as the delimiter. Since your CSV file uses two characters between fields, sniff will simply pick one of them. But since you also pass the optional second argument to sniff, it will only consider the characters contained in that value as possible delimiters, which in your case is '\t' (which is not visible in your print output).
From sniff's documentation:
If the optional delimiters parameter is given, it is interpreted as a
string containing possible valid delimiter characters.
Sniffing is not guaranteed to work.
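A small sketch of how the delimiters argument steers the result (Python 3). The sample text is an assumption reconstructed from the question, with a tab after each semicolon; as noted above, sniffing is heuristic, so treat this as illustrative only.

```python
import csv

# Assumed reconstruction of the question's file: ';' followed by a tab.
sample = "ID;\tName\n1;\tJohn Mak\n2;\tDavid H\n4;\tHerry\n"

sniffer = csv.Sniffer()

# Restricting the candidates to a tab reproduces the question's result.
tab_dialect = sniffer.sniff(sample, delimiters="\t")

# Restricting the candidates to ';' gives the expected delimiter.
semicolon_dialect = sniffer.sniff(sample, delimiters=";")

print(repr(tab_dialect.delimiter), repr(semicolon_dialect.delimiter))
```

Printing with repr() makes an otherwise invisible tab delimiter show up as '\t'.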
Here is one approach that works for whichever delimiters you choose to handle.
You start with what you assume is the most common delimiter (;). If that fails, you try others until you manage to parse the row.
    import csv

    with open('sample.csv') as f:
        reader = csv.reader(f, delimiter=';')
        for row in reader:
            try:
                a,b = row
            except ValueError:
                try:
                    a,b = row[0].split(None, 1)
                except ValueError:
                    a,b = row[0].split('\t', 1)
            print('{} - {}'.format(a.strip(), b.strip()))
You can play around with this at this repl.it link; edit the sample.csv file there if you want to try out different delimiters.
You can combine sniffing with this to catch any odd delimiters that are not known to you.
I'm new to Python and I have the following csv file (let's call it out.csv):
DATE,TIME,PRICE1,PRICE2
2017-01-15,05:44:27.363000+00:00,0.9987,1.0113
2017-01-15,13:03:46.660000+00:00,0.9987,1.0113
2017-01-15,21:25:07.320000+00:00,0.9987,1.0113
2017-01-15,21:26:46.164000+00:00,0.9987,1.0113
2017-01-16,12:40:11.593000+00:00,,1.0154
2017-01-16,12:40:11.593000+00:00,1.0004,
2017-01-16,12:43:34.696000+00:00,,1.0095
and I want to truncate the second column so the csv looks like:
DATE,TIME,PRICE1,PRICE2
2017-01-15,05:44:27,0.9987,1.0113
2017-01-15,13:03:46,0.9987,1.0113
2017-01-15,21:25:07,0.9987,1.0113
2017-01-15,21:26:46,0.9987,1.0113
2017-01-16,12:40:11,,1.0154
2017-01-16,12:40:11,1.0004,
2017-01-16,12:43:34,,1.0095
This is what I have so far..
    with open('out.csv','r+b') as nL, open('outy_3.csv','w+b') as nL3:
        new_csv = []
        reader = csv.reader(nL)
        for row in reader:
            time = row[1].split('.')
            new_row = []
            new_row.append(row[0])
            new_row.append(time[0])
            new_row.append(row[2])
            new_row.append(row[3])
            print new_row
            nL3.writelines(new_row)
I can't seem to get a new line in after writing each line to the new csv file.
This definitely doesn't look or feel Pythonic.
Thanks
The missing newlines issue is because the file.writelines() method doesn't automatically add line separators to the elements of the argument it's passed, which it expects to be a sequence of strings. If these elements represent separate lines, then it's your responsibility to ensure each one ends in a newline.
However, your code tries to use it to output only a single line at a time. To fix that, you should use file.write() instead, because it expects its argument to be a single string. If you want that string to be a separate line in the file, it must end with a newline or have one manually added to it.
Below is code that does what you want. It modifies, in place, one element of the list of strings that csv.reader returns, joins the modified list back into a single string with join() (stored in new_row), and then writes that string to the output file with a newline manually added to the end.
    import csv

    # Python 2 file modes; in Python 3, open the input with open('out.csv', newline='') instead of 'rb'
    with open('out.csv','rb') as nL, open('outy_3.csv','wt') as nL3:
        for row in csv.reader(nL):
            time_col = row[1]
            try:
                period_location = time_col.index('.')
                row[1] = time_col[:period_location]  # only keep characters in front of period
            except ValueError:  # no period character found
                pass  # leave row unchanged
            new_row = ','.join(row)
            print(new_row)
            nL3.write(new_row + '\n')
Printed (and file) output:
DATE,TIME,PRICE1,PRICE2
2017-01-15,05:44:27,0.9987,1.0113
2017-01-15,13:03:46,0.9987,1.0113
2017-01-15,21:25:07,0.9987,1.0113
2017-01-15,21:26:46,0.9987,1.0113
2017-01-16,12:40:11,,1.0154
2017-01-16,12:40:11,1.0004,
2017-01-16,12:43:34,,1.0095
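An alternative sketch for Python 3, not part of the answer above: it lets csv.writer handle the joining and line endings instead of building each line by hand. The helper name truncate_time is hypothetical, and in-memory files stand in for out.csv and outy_3.csv so the snippet runs on its own.

```python
import csv
import io


def truncate_time(rows, col=1):
    """Yield rows with everything from the first '.' onward removed from one column."""
    for row in rows:
        row = list(row)
        # partition() returns the whole string unchanged if there is no '.'
        row[col] = row[col].partition(".")[0]
        yield row


src = io.StringIO(
    "DATE,TIME,PRICE1,PRICE2\n"
    "2017-01-15,05:44:27.363000+00:00,0.9987,1.0113\n"
    "2017-01-16,12:40:11.593000+00:00,,1.0154\n"
)
dst = io.StringIO()

csv.writer(dst, lineterminator="\n").writerows(truncate_time(csv.reader(src)))
result = dst.getvalue()
print(result)
```

Because partition() never raises, the try/except around index() is not needed in this variant.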
I am trying to remove a row from a csv file if the 2nd column matches a string. My csv file has the following information:
Name
15 Dog
I want the row with "Name" in it removed. The code I am using is:
    import csv
    reader = csv.reader(open("info.csv", "rb"), delimiter=',')
    f = csv.writer(open("final.csv", "wb"))
    for line in reader:
        if "Name" not in line:
            f.writerow(line)
            print line
But the "Name" row isn't removed. What am I doing wrong?
EDIT: I was using the wrong delimiter. Changing it to \t worked. Below is the code that works now.
    import csv
    reader = csv.reader(open("info.csv", "rb"), delimiter='\t')
    f = csv.writer(open("final.csv", "wb"))
    for line in reader:
        if "Name" not in line:
            f.writerow(line)
            print line
It seems that you are specifying the wrong delimiter (comma) in csv.reader.
Each line yielded by reader is a list, split by your delimiter, which, by the way, you specified as a comma. Are you sure that is the delimiter you want? Your sample is delimited by tabs.
Anyway, you want to check if 'Name' is in any element of a given line. So this will still work, regardless of whether your delimiter is correct:
    for line in reader:
        if any('Name' in x for x in line):
            continue  # skip rows that contain 'Name' in any field
        f.writerow(line)
Notice the difference. This version checks for 'Name' in each list element, yours checks if 'Name' is in the list. They are semantically different because 'Name' in ['blah blah Name'] is False.
I would recommend first fixing the delimiter error. If you still have issues, use if any(...) as it is possible that the exact token 'Name' is not in your list, but something that contains 'Name' is.
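To make the difference concrete, here is a small sketch with made-up rows (not the asker's actual file). The first filter drops only rows where some field is exactly 'Name'; the second also drops rows where 'Name' appears inside a longer field.

```python
# Hypothetical parsed rows, as csv.reader would yield them.
rows = [["Name"], ["15", "Dog"], ["Pet Name"]]

# Exact-element test: 'Name' must equal one of the fields.
kept_exact = [row for row in rows if "Name" not in row]

# Substring test: 'Name' may appear anywhere inside any field.
kept_substring = [row for row in rows if not any("Name" in field for field in row)]

print(kept_exact)
print(kept_substring)
```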
I am trying to read a csv file in python. The csv file has 1400 rows. I opened the csv file using the following command:
    import csv
    import sys
    f=csv.reader(open("/Users/Brian/Desktop/timesheets_9_1to10_5small.csv","rU"),
                 dialect=csv.excel_tab)

Then I tried to loop through the file to pull the first name from each row using the following commands:
    for row in f:
        g=row
        s=g[0]
        end_of_first_name=s.find(",")
        first_name=s[0:end_of_first_name]
I got the following error message:
    Traceback (most recent call last):
      File "", line 3, in <module>
        s=g[0]
    IndexError: list index out of range
Does anyone know why I would get this error message and how I can correct it?
You should not open the file in universal newline mode (U). In Python 2, open the file in binary mode instead (in Python 3, open it in text mode with newline=''):

    f=csv.reader(open("/Users/Brian/Desktop/timesheets_9_1to10_5small.csv","rb"),
                 dialect=csv.excel_tab)

The csv module does its own newline handling, including managing newlines inside quoted fields.
Next, print your rows with print repr(row) to verify that you are getting the output you are expecting. Using repr instead of the regular string representation shows you much more about the type of objects you are handling, highlighting such differences as strings versus integers ('1' vs. 1).
Thirdly, if you want to select part of a string up to a delimiter such as a comma, use .split(delimiter, 1) or .partition(delimiter)[0]:
    >>> 'John,Jack,Jill'.partition(',')[0]
    'John'
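A short sketch of why partition() or split() is safer than slicing with find(), as the question's loop does. The function name first_field and the sample strings are illustrative only. When the delimiter is missing, find() returns -1, so the slice silently drops the last character.

```python
def first_field(s, delimiter=","):
    """Return the text before the first delimiter, or all of s if there is none."""
    return s.partition(delimiter)[0]


samples = ["John,Jack,Jill", "Brian", ""]

with_partition = [first_field(s) for s in samples]
with_split = [s.split(",", 1)[0] for s in samples]

# The slicing approach from the question, for comparison.
with_find = [s[0:s.find(",")] for s in samples]

print(with_partition, with_split, with_find)
```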
row and g point to an empty list. I don't know whether that necessarily means there is an empty line in the file, as the CSV may have other issues.

    line_counter = 0
    for row in f:
        line_counter = line_counter + 1
        g=row
        if len(g) == 0:
            print "line",line_counter,"may be empty or malformed"
            continue
Or, as Martijn points out, the Pythonic way is using enumerate:
    for line_counter, row in enumerate(f,start=1):
        g=row
        if len(g) == 0:
            print "line",line_counter,"may be empty or malformed"
            continue
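For readers on Python 3, a minimal sketch of the same check, combined with extracting the text before the first comma. The tab-delimited sample below is invented, since the real timesheet file is not shown, and the "Last, First" layout of the first column is an assumption.

```python
import csv
import io

# Hypothetical tab-delimited sample with a blank line in the middle.
sample = io.StringIO("Smith, John\t8\n\nDoe, Jane\t7\n")

names = []
skipped = []
for line_counter, row in enumerate(csv.reader(sample, dialect=csv.excel_tab), start=1):
    if not row:  # csv.reader yields an empty list for a blank line
        skipped.append(line_counter)
        print("line", line_counter, "may be empty or malformed")
        continue
    names.append(row[0].partition(",")[0])

print(names)
```

With a real file, the StringIO object would be replaced by open(path, newline="").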