Trouble converting strings to float from inputfile - python

I'm trying to read some float values from a file into a tuple with the following Python code:
with open(filename, 'r') as file:
    i = 0
    lines = file.readlines()
    for line in lines:
        if (line != None) and (i != 0) and (i != 1):  # ignore first 2 lines
            splitted = line.split(";")
            s = splitted[3].replace(",", ".")
            lp1.heating_list[i-2] = float(s)
        i += 1
The values originate from a .csv-file, where lines look like this:
MFH;0:15;0,007687511;0,013816233;0,023092447;
The problem is I get:
lp1.heating_list[i-2] = float(s)
ValueError: could not convert string to float:
And I have no idea what's wrong. Please illuminate me.

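This probably means what it says: your variable s is a string that cannot be parsed as a float. Nothing is shown after the colon in the message, so s may well be empty or whitespace-only.
You could try adding this snippet to print and find your problematic string: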
try:
    lp1.heating_list[i-2] = float(s)
except ValueError:
    print("tried to convert {} on line {}".format(s, i))
See also a similar question: https://stackoverflow.com/a/8420179/4295853

from io import StringIO
txt = """ligne1
ligne2
MFH;0:15;0,007687511;0,013816233;0,023092447;
MFH;0:15;0,007687511;0,013816233;0,023092447;
MFH;0:15;0,007687511;0,013816233;0,023092447;
MFH;0:15;0,007687511;0,013816233;0,023092447;
"""
lp1_heating = {}
with StringIO(txt) as file:
    lines = file.readlines()[2:]  # ignore first 2 lines
    for i, line in enumerate(lines):
        splitted = line.split(";")
        s = splitted[3].replace(",", ".")
        lp1_heating[i] = float(s)
print(lp1_heating)
{0: 0.013816233, 1: 0.013816233, 2: 0.013816233, 3: 0.013816233}
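If the error comes from blank lines or empty fields in the real file, one option is to let the csv module do the splitting and skip anything that is empty. This is only a sketch under assumptions: the helper names parse_decimal_comma and read_column are made up for illustration, and the sample text (including the blank line and the empty field) is invented to mimic the kind of input that might trigger the error.

```python
import csv
from io import StringIO

def parse_decimal_comma(text):
    """Convert '0,0138' to 0.0138; return None for empty or whitespace-only fields."""
    text = text.strip()
    if not text:
        return None
    return float(text.replace(",", "."))

def read_column(file_obj, column=3, skip_lines=2):
    """Collect the floats found in one column of a ';'-separated file."""
    values = []
    reader = csv.reader(file_obj, delimiter=";")
    for row_number, row in enumerate(reader):
        # Skip the header lines and rows that are too short (e.g. blank lines).
        if row_number < skip_lines or len(row) <= column:
            continue
        value = parse_decimal_comma(row[column])
        if value is not None:
            values.append(value)
    return values

# Invented sample: two header lines, one good row, one blank line, one row with an empty field.
sample = (
    "header1\n"
    "header2\n"
    "MFH;0:15;0,007687511;0,013816233;0,023092447;\n"
    "\n"
    "MFH;0:30;0,01;;0,02;\n"
)
heating = read_column(StringIO(sample))
print(heating)
```

With a real file you would pass the object returned by open(filename, newline="") instead of the StringIO.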

Related

Python: How to split a dictionary value in float number

My text file looks like so: comp_339,9.93/10
I want to extract just rating point as a float value. I couldn't solve this with 'if' statement.
Is there any simple solution for this?
d2 = {}
with open('ra.txt', 'r') as r:
    for line in r:
        s = line.strip().split(',')
        d2[s[0]] = "".join(s[1:])
print(d2)
You can do it like this:
with open('ra.txt', 'r') as r:
    for line in r:
        s = line.strip().split(',')
        rating, _ = s[1].split("/", 1)
        print(float(rating))
First you split the line into "comp_339" and "9.93/10".
Then you keep the latter, split it again on "/", keep the first part and convert it to float.
line = "comp_339,9.93/10"
s = float(line.split(',')[1].split('/')[0])
# s is now 9.93
You may use a simple regular expression:
import re
line = "comp_339,9.93/10"
before, rating, highscore = re.split(r'[/,]+', line)
print(rating)
Which yields
9.93
You should try this code:
ratings = {}
with open('ra.txt', 'r') as f:
    for l in f:
        if l.strip():  # skip blank lines
            c, r = l.split(',')
            ratings[c.strip()] = float(r.strip().split('/')[0].strip())
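The answers above all assume every line has the shape name,score/max. A slightly more defensive variant, sketched here with a hypothetical helper parse_rating_line and made-up sample lines, returns None for lines that do not fit instead of raising:

```python
def parse_rating_line(line):
    """Return (name, rating) for 'name,9.93/10', or None if the line is malformed."""
    name, sep, score = line.strip().partition(",")
    if not sep:
        return None  # no comma at all (blank or broken line)
    rating_text = score.partition("/")[0].strip()
    try:
        return name.strip(), float(rating_text)
    except ValueError:
        return None  # the part before '/' is not a number

# Made-up input lines, including a blank one and a broken one.
lines = ["comp_339,9.93/10\n", "\n", "broken line\n", "comp_340,7/10\n"]
ratings = dict(p for p in map(parse_rating_line, lines) if p is not None)
print(ratings)
```

Using str.partition rather than split keeps the unpacking safe when a separator is missing.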

Extract text in string until certain new line ("\n") [duplicate]

We have a large raw data file that we would like to trim to a specified size.
How would I go about getting the first N lines of a text file in python? Will the OS being used have any effect on the implementation?
Python 3:
with open(path_to_file) as input_file:
    head = [next(input_file) for _ in range(lines_number)]
print(head)
Python 2:
with open(path_to_file) as input_file:
    head = [next(input_file) for _ in xrange(lines_number)]
print head
Here's another way (both Python 2 & 3):
from itertools import islice
with open(path_to_file) as input_file:
    head = list(islice(input_file, lines_number))
print(head)
N = 10
with open("file.txt", "r") as file:  # "r" opens the file for reading
    for i in range(N):
        line = next(file).strip()
        print(line)
If you want to read the first lines quickly and you don't care about performance, you can use .readlines(), which returns a list, and then slice the list.
E.g. for the first 5 lines:
with open("pathofmyfileandfileandname") as myfile:
    firstNlines = myfile.readlines()[0:5]  # put here the interval you want
print(firstNlines)
Note: the whole file is read, so this is not the best from a performance point of view, but it is easy to use, fast to write and easy to remember, so it is very convenient if you just want to perform some one-time calculation.
One advantage compared to the other answers is that you can easily select a range of lines, e.g. lines 10 to 29 with [10:30], the last 10 lines with [-10:], or every second line with [::2].
What I do is to call the N lines using pandas. I think the performance is not the best, but for example if N=1000:
import pandas as pd
yourfile = pd.read_csv('path/to/your/file.csv',nrows=1000)
There is no specific method to read number of lines exposed by file object.
I guess the easiest way would be following:
lines = []
with open(file_name) as f:
    lines.extend(f.readline() for i in xrange(N))
The two most intuitive ways of doing this would be:
Iterate on the file line-by-line, and break after N lines.
Iterate on the file line-by-line using the next() method N times. (This is essentially just a different syntax for what the top answer does.)
Here is the code:
# Method 1:
with open("fileName", "r") as f:
    counter = 0
    for line in f:
        print line
        counter += 1
        if counter == N: break
# Method 2:
with open("fileName", "r") as f:
    for i in xrange(N):
        line = f.next()
        print line
The bottom line is, as long as you don't use readlines() or otherwise load the whole file into memory, you have plenty of options.
Based on gnibbler's top-voted answer (Nov 20 '09 at 0:27): this class adds head() and tail() methods to the file object.
class File(file):
    def head(self, lines_2find=1):
        self.seek(0)  # Rewind file
        return [self.next() for x in xrange(lines_2find)]
    def tail(self, lines_2find=1):
        self.seek(0, 2)  # go to end of file
        bytes_in_file = self.tell()
        lines_found, total_bytes_scanned = 0, 0
        while (lines_2find+1 > lines_found and
               bytes_in_file > total_bytes_scanned):
            byte_block = min(1024, bytes_in_file-total_bytes_scanned)
            self.seek(-(byte_block+total_bytes_scanned), 2)
            total_bytes_scanned += byte_block
            lines_found += self.read(1024).count('\n')
        self.seek(-total_bytes_scanned, 2)
        line_list = list(self.readlines())
        return line_list[-lines_2find:]
Usage:
f = File('path/to/file', 'r')
f.head(3)
f.tail(3)
The most convenient way, in my opinion:
LINE_COUNT = 3
print [s for (i, s) in enumerate(open('test.txt')) if i < LINE_COUNT]
Solution based on List Comprehension
The object returned by open() supports the iteration interface. enumerate() wraps it and returns (index, item) tuples; we then check that we're inside the accepted range (if i < LINE_COUNT) and simply print the result.
Enjoy the Python. ;)
For first 5 lines, simply do:
N = 5
with open("data_file", "r") as file:
    for i in range(N):
        print file.next()
If you want something that obviously (without looking up esoteric stuff in manuals) works without imports and try/except and works on a fair range of Python 2.x versions (2.2 to 2.6):
def headn(file_name, n):
    """Like *x head -N command"""
    result = []
    nlines = 0
    assert n >= 1
    for line in open(file_name):
        result.append(line)
        nlines += 1
        if nlines >= n:
            break
    return result
if __name__ == "__main__":
    import sys
    rval = headn(sys.argv[1], int(sys.argv[2]))
    print rval
    print len(rval)
If you have a really big file, and assuming you want the output to be a numpy array, using np.genfromtxt can freeze your computer. This works much better in my experience:
import numpy as np
def load_big_file(fname, maxrows):
    '''only works for well-formed text file of space-separated doubles'''
    rows = []  # unknown number of lines, so use list
    with open(fname) as f:
        j = 0
        for line in f:
            if j == maxrows:
                break
            else:
                line = [float(s) for s in line.split()]
                rows.append(np.array(line, dtype=np.double))
                j += 1
    return np.vstack(rows)  # convert list of vectors to array
This worked for me
f = open("history_export.csv", "r")
line = 5
for x in range(line):
    a = f.readline()
    print(a)
To handle a file with fewer than n lines, I fall back to reading the whole file:
def head(filename: str, n: int):
    try:
        with open(filename) as f:
            head_lines = [next(f).rstrip() for x in range(n)]
    except StopIteration:
        with open(filename) as f:
            head_lines = f.read().splitlines()
    return head_lines
Credit goes to John La Rooy and Ilian Iliev. Use the function for the best performance with exception handling.
Revision 1: Thanks to FrankM for the feedback. To handle file existence and read permission, we can further add:
import errno
import os
def head(filename: str, n: int):
    if not os.path.isfile(filename):
        raise FileNotFoundError(errno.ENOENT, os.strerror(errno.ENOENT), filename)
    if not os.access(filename, os.R_OK):
        raise PermissionError(errno.EACCES, os.strerror(errno.EACCES), filename)
    try:
        with open(filename) as f:
            head_lines = [next(f).rstrip() for x in range(n)]
    except StopIteration:
        with open(filename) as f:
            head_lines = f.read().splitlines()
    return head_lines
You can either go with second version or go with the first one and handle the file exception later. The check is quick and mostly free from performance standpoint
Starting at Python 2.6, you can take advantage of more sophisticated functions in the IO base class. Note, however, that the argument to readlines() is a size hint (roughly how many bytes or characters to read), not a line count, so the following returns whole lines until about N bytes have been read rather than exactly N lines:
with open("datafile") as myfile:
    head = myfile.readlines(N)
print head
(No StopIteration exception is thrown if the file is shorter than that.)
This works for Python 2 & 3:
from itertools import islice
with open('/tmp/filename.txt') as inf:
    for line in islice(inf, N, N+M):
        print(line)
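Several answers above need extra handling for files shorter than N lines. islice stops quietly when the iterator runs out, so a small wrapper around it covers that case without try/except. This is a sketch only; the function name head and the temporary demo file are illustrative assumptions, not part of any answer above.

```python
from itertools import islice
import os
import tempfile

def head(path, n):
    """Return up to the first n lines of a text file, without trailing newlines."""
    with open(path) as f:
        # islice simply stops early if the file has fewer than n lines.
        return [line.rstrip("\n") for line in islice(f, n)]

# Demo on a throwaway three-line file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("a\nb\nc\n")
    demo_path = tmp.name

first_two = head(demo_path, 2)
all_lines = head(demo_path, 10)  # asks for more lines than exist
os.remove(demo_path)
print(first_two, all_lines)
```

Only the requested lines are read, so this also avoids loading a large file into memory.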
fname = input("Enter file name: ")
num_lines = 0
with open(fname, 'r') as f:  # lines count
    for line in f:
        num_lines += 1
num_lines_input = int(input("Enter line numbers: "))
if num_lines_input <= num_lines:
    f = open(fname, "r")
    for x in range(num_lines_input):
        a = f.readline()
        print(a)
else:
    f = open(fname, "r")
    for x in range(num_lines_input):
        a = f.readline()
        print(a)
    print("Don't have", num_lines_input, " lines print as much as you can")
    print("Total lines in the text", num_lines)
Here's another decent solution with a list comprehension:
file = open('file.txt', 'r')
lines = [next(file) for x in range(3)] # first 3 lines will be in this list
file.close()
An easy way to get first 10 lines:
with open('fileName.txt', mode='r') as file:
    list = [line.rstrip('\n') for line in file][:10]
print(list)
#!/usr/bin/python
import subprocess
p = subprocess.Popen(["tail", "-n 3", "passlist"], stdout=subprocess.PIPE)
output, err = p.communicate()
print output
This method worked for me.
Simply convert your CSV reader object to a list using list(file_data):
import csv
with open('your_csv_file.csv') as file_obj:
    file_data = csv.reader(file_obj)
    file_list = list(file_data)
for row in file_list[:4]:
    print(row)

how to find the sum of a list when there is a empty element

if I have a file which contains
10,20,,30
10,20,5,20
3,20,19,50
10,20,10,30
how to get the sum of each line and ignore that the first line has a blank element
this is what I've got, but it only gives me this error message
TypeError: unsupported operand type(s) for +: 'int' and 'str'
file = open(inputFile, 'r')
for line in file:
    result = sum(line)
    print(result)
Version 1:
with open("input.txt") as f:
    for line in f:
        result = sum(int(x or 0) for x in line.split(","))
        print(result)
This handles the empty string case (since int("" or 0) == int(0) == 0), and is fairly short, but is otherwise not exceedingly robust.
Version 2:
with open("input.txt") as f:
    for line in f:
        total = 0
        for item in line.split(","):
            try:
                total += int(item)
            except ValueError:
                pass
        print(total)
This is more robust to malformed inputs (it will skip all invalid items rather than throwing an exception). This may be useful if you're parsing messy (e.g. hand-entered) data.
It's not working because you try to sum string like this
'10,20,,30'
or this
'10,20,5,20'
My solution:
import re
regex = re.compile(r',,|,')
with open(file, 'r') as f:
    for line in f:
        s = 0
        for x in regex.split(line):
            s += int(x)
        print(s)
My code looks clumsy, but it works. You can probably find another way that makes the code look smarter and more professional.
def my_function():
    file = open("exampleValues", "r")
    total = 0
    for line in file:
        a, b, c, d = line.split(",")
        if a == '':
            a = 0
        if b == '':
            b = 0
        if c == '':
            c = 0
        if d == '':
            d = 0
        a = int(a)
        b = int(b)
        c = int(c)
        d = int(d)
        print(a+b+c+d)
my_function()

Python - delete strings that are float numbers

I have a file with lots of lines, each containing numbers separated by commas.
Some lines contain float numbers, e.g. ['9.3'].
I have been trying to delete those numbers from the list for about an hour, but with no luck: whenever the code calls int on a float number, it raises an error. I'm not sure how to remove these floating-point numbers from the lines.
the numbers: https://pastebin.com/7vveTxjW
this is what i've done so far:
with open('planets.txt','r') as f:
    lst = []
    temp = []
    for line in f:
        l = line.strip()
        l = l.split(',')
        for x in l:
            if x == '':
                l[l.index(x)] = 0
            elif x == '\n' or x == '0':
                print "removed value", l[l.index(x)]
                del l[l.index(x)]
        try:
            temp.append([int(y) for y in l])
        except ValueError:
            pass
First off, modifying the list you are iterating over is a bad idea. A better idea might be to construct a new list from the elements that can be converted to an int.
def is_int(element):
    try:
        int(element)
    except ValueError:
        return False
    return True
with open('planets.txt','r') as f:
    lst = []
    temp = []
    for line in f:
        l = line.strip().split(',')
        temp.append([int(y) for y in l if is_int(y)])
If you want to include the float values' integral component, you can do:
def is_float(element):
    try:
        float(element)
    except ValueError:
        return False
    return True
with open('planets.txt','r') as f:
    lst = []
    temp = []
    for line in f:
        l = line.strip().split(',')
        temp.append([int(float(y)) for y in l if is_float(y)])
It looks like you're overcomplicating it. Once you have all your numbers from the file in a list, just run this:
numbers = ["8", "9", "10.5", "11.1"]
intList = []
for num in numbers:
    try:
        int_test = int(num)
        intList.append(int_test)
    except ValueError:
        pass
Just replace my numbers list with your own list.
You can just match digits followed by a point followed by more digits:
import re
output_list = []
input = open('input.txt', 'r')
for line in input:
    if '.' not in line:
        output_list.append(line)
    else:
        output_list.append(re.sub(r"(\d+\.\d+)", '', line))
        print("Removed", re.search(r"(\d+\.\d+)", line).group(1))
I would keep the numbers as string and simply check if there is a 'dot' in each 'splitted item' to identify floats.
with open('planets.txt','r') as f:
    for line in f:
        line = ','.join([item for item in line.strip().split(',') if not '.' in item])
        print(line)
        # ... write to file ...
When you get a list that contains digits (for instance, ['1','2.3','4.5','1.0']) you can use the following
intDigits = [ i for i in digits if float(i) - int(float(i)) == 0]
Doing int() on a float shouldn't give an error. Maybe your 'float' is actually a string as you're reading from the file, because
int('9.3')
doesn't work but
int(9.3)
does.
Edit:
How about applying this function to every number
def intify(n):
    if n == '':
        return n
    try:
        return int(n)
    except ValueError:
        return int(float(n))
If you just need to remove floats this was the simplest method for me.
mixed = [1,float(1),2,3,4.3,5]
ints = [i for i in mixed if type(i) is not float]
Results in: [1, 2, 3, 5]
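The distinction made above between int() on a float and int() on a string that looks like a float can be checked directly. In this small sketch the helper try_int is a made-up name that just turns the exception into a marker string so all cases can be compared side by side:

```python
def try_int(value):
    """Return int(value), or the string 'ValueError' if the conversion fails."""
    try:
        return int(value)
    except ValueError:
        return "ValueError"

results = {
    "int(9.3)": try_int(9.3),                    # a real float is truncated
    "int('9')": try_int("9"),                    # an integer-looking string parses
    "int('9.3')": try_int("9.3"),                # a float-looking string does not
    "int(float('9.3'))": try_int(float("9.3")),  # parse as float first, then truncate
}
for expression, outcome in results.items():
    print(expression, "->", outcome)
```

Values read from a text file are always strings, so the third case is the one the question runs into.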

Error: Could not convert string to float

This is my code. I want to read the data from "data.txt" and convert it to type float, but there is a problem converting the data. The error says " could not convert string to float: ".
def CrearMatriz():
    archi = open("data.txt", "r")
    num = archi.readlines()
    for lines in num:
        nums = [float(x) for x in lines.strip().split(",")]
    return nums
Numero = CrearMatriz()
The file contains numbers separated by commas.
The only way I can reproduce this is by having a blank line at the end of the file. So my data file looks like this:
12.34 , 56.78 , 90.12 , 34.56
There's a blank line after it, but I can't display that! So here it is in another format:
od -xc data.txt
0000000 3231 332e 2034 202c 3635 372e 2038 202c
1 2 . 3 4 , 5 6 . 7 8 ,
0000020 3039 312e 2032 202c 3433 352e 0a36 000a
9 0 . 1 2 , 3 4 . 5 6 \n \n
0000037
Note the \n\n near end-of file.
Try this code:
def CrearMatriz():
    archi = open("data.txt", "r")
    num = archi.readlines()
    for line in num:
        line = line.strip()
        if line:
            nums = [float(x) for x in line.split(",")]
    return nums
Numero = CrearMatriz()
However, that does not work if you have more than one line of numbers! It will only return the last line of numbers. So this might be better (depending on what you need):
def CrearMatriz():
    nums = []
    for line in open("data.txt", "r"):
        line = line.strip()
        if line:
            nums.extend([float(x) for x in line.split(",")])
    return nums
Numero = CrearMatriz()
If you want a list of lists, then change the extend to append.
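As a sketch of the list-of-lists variant just mentioned, the following reads from any file-like object and skips blank lines. The name read_matrix and the in-memory sample (which imitates the data file with its trailing blank line, plus a second invented row) are assumptions for illustration:

```python
from io import StringIO

def read_matrix(file_obj):
    """Return one list of floats per non-blank line of comma-separated numbers."""
    rows = []
    for line in file_obj:
        line = line.strip()
        if not line:
            continue  # a blank line would otherwise reach float('') and fail
        rows.append([float(x) for x in line.split(",")])
    return rows

# Stand-in for data.txt: spaces around the commas and blank lines, as in the od dump above.
sample = StringIO("12.34 , 56.78 , 90.12 , 34.56\n\n1,2,3,4\n\n")
matrix = read_matrix(sample)
print(matrix)
```

float() tolerates surrounding whitespace, which is why the spaces around the commas need no extra stripping.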
I am not saying it's the perfect answer, but I suggest you do your parsing outside of the list comprehension proper and append to nums on success, at least until your formatting fixes everything.
From looking at the errors I am getting, one possibility is that it looks like lines.strip() doesn't account for blank lines in your file.
def CrearMatriz():
    archi = open("data.txt", "r")
    num = archi.readlines()
    for lines in num:
        nums = [float(x) for x in lines.strip().split(",")]
    return nums
def CrearMatriz2():
    with open("data.txt", "r") as archi:
        # having it in a separate function allows more robust formatting
        # and error handling. append to nums on success
        def tofloat(nums, x, lines, cntr):
            try:
                res = float(x)
                assert isinstance(res, float)
                nums.append(float(x))
            except Exception, e:
                msg = "exception %s on field:%s: for line #%s:%s:" % (e, x, cntr, lines)
                print (msg)
                return
        nums = []
        num = archi.readlines()
        for cntr, lines in enumerate(num):
            [tofloat(nums, x, lines, cntr) for x in lines.strip().split(",")]
        return nums
data = """
0.12,30.2,30.5,22
3,abc,4
"""
with open("data.txt", "w") as f_test:
    f_test.write(data)
try:
    print "\n\ncalling CrearMatriz"
    Numero = CrearMatriz()
    print "\nCrearMatriz()=>", Numero
except Exception, e:
    print e
try:
    print "\n\ncalling CrearMatriz2"
    Numero = CrearMatriz2()
    print "\nCrearMatriz2()=>", Numero
except Exception, e:
    print e
and this gives...
calling CrearMatriz
could not convert string to float:
calling CrearMatriz2
exception could not convert string to float: on field:: for line #0:
:
exception could not convert string to float: abc on field:abc: for line #2:3,abc,4
:
exception could not convert string to float: on field:: for line #3:
:
CrearMatriz2()=> [0.12, 30.2, 30.5, 22.0, 3.0, 4.0]
I'm interested in the type of
lines.strip().split(",")
It seems to me that strip() should work on a string, but lines would be a list of strings. You might need a double comprehension, something like:
[[float(x) for x in line.strip().split(",")] for line in lines]
Without your data, I can't verify this, though.
You might try removing the float(x) and just using
nums= [x for x in lines.strip().split(",")]
print(nums)
to see what the exact values are that float(x) is choking on.
