Float iteration needed for calculation from .csv file - python

I am trying to calculate the standard deviation for each column of a .csv file. Everything works up to this point, but I keep getting the message "TypeError: 'float' object is not iterable". However, I need two of these values to be floats in order to retain the decimal places for accuracy. Is there a way to do this calculation with floats without iterating over them? Or is there an exception to the rule that lets me use floats?
Here is the needed part of my code:
import math
fileChoice = input("Enter the file name: ")
inputFile = open(fileChoice)
headers = inputFile.readline().strip().split(',')
dataColumns = []
for i in headers:
    dataColumns.append([])
rowCount = 0
for row in inputFile:
    rowCount = rowCount + 1
    comps = row.strip().split(',')
    for j in range(len(comps)):
        dataColumns[j].append(float(comps[j]))
l = 0
for entry in dataColumns:
    mean = sum(dataColumns[l])/rowCount
    stdDevSum = 0
    for x in dataColumns:
        stdDevSum = float(stdDevSum) + (((float(comps[row]) - float(mean))** 2) for row in range(rowCount))
    stdDev = math.sqrt(stdDevSum / rowCount)
    print(l + 1, headers[l], max(dataColumns[l]), min(dataColumns[l]), "{0:0.2f}".format(mean), stdDev)
    l = l + 1
inputFile.close()
Edit:
Solution has been found

There's an error in this line:
stdDev = math.sqrt((sum((float(comps[l]) - float(mean)) ** 2) in range(rowCount)) / rowCount)
The error comes specifically from the expression sum((float(comps[l]) - float(mean)). When you do sum(something), Python tries to iterate over something. But in this case, the thing it's trying to iterate on is float(comps[l]) - float(mean), which is just a number. Hence the error: 'float' object is not iterable.
Also note that your use of in range(rowCount) is wrong. a in b means "return true if a is in b, return false otherwise". You were probably looking for the for i in iterable syntax.
Solution
I'm assuming that you want the sum of (comps[row] - mean)**2 over all rows. Try this:
stdDev = math.sqrt(sum( (float(comps[row]) - float(mean)) **2 for row in range(rowCount) ) / rowCount)
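One thing to watch in the line above: in the question's code, comps holds only the fields of the last row read, so indexing it by row number may not give the values you expect. A minimal sketch of the same generator-expression idea, assuming the values are already stored per column as in dataColumns (the function name and sample data here are hypothetical), could look like this:

```python
import math

def column_std_devs(data_columns):
    """Return the population standard deviation of each column."""
    results = []
    for column in data_columns:
        n = len(column)
        mean = sum(column) / n
        # sum() receives a generator of squared differences, not a single float
        squared_diffs = sum((value - mean) ** 2 for value in column)
        results.append(math.sqrt(squared_diffs / n))
    return results

# Hypothetical sample data standing in for the parsed .csv columns
sample_columns = [[2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0], [1.5, 1.5, 1.5]]
print(column_std_devs(sample_columns))
```

Dividing by n gives the population standard deviation, matching the division by rowCount in the question; divide by n - 1 instead if you want the sample standard deviation.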

Related

Error when passing list of float values to function

Unsupported operand type(s) for -: 'str' and 'str'
I am passing two lists to a function to find the starting distance, ending distance, starting time and ending time using sensor data. When the lists contain only integer values, it works fine and doesn't throw any error, but when I tried to convert the lists to floating-point values, it shows an error.
x = ["%.2f"%(b*1000) for b in t] # t is a list of time values
y = [c*0.002 for c in values]# values is a list of sensor values
z = ["%.2f"%(d*48.484) for d in y]
p1,t1 = min_distance(z,x)
p2,t2 = max_distance(z,x)
def min_distance(self,z,x):
    count = True
    i = 0
    while count and (i+1) !=len(z):
        if abs(z[i] - z[i+1]) >= 1:
            count = False
        else:
            i +=1
    min_value = z[i]
    min_time = x[i]
    return min_value,min_time
def max_distance(self,z,x):
    count = 0
    j = 1
    while count<20:
        if abs(z[-j] - z[-j-1]) >=1:
            count +=1
        else:
            j +=1
    max_value = z[-j+20]
    max_time = x[-j+20]
    return max_value,max_time
x and z are lists of strings, not integers or floats (y holds floats, but z is built from it with string formatting). You are creating them with the % string-formatting operator, which formats a float value but results in a string.
To ensure you get a float, ensure either that b and c are floats, or force the integer to a float. For example, if b is an integer, b*1000 will result in an integer, but b*1000.0 or float(b*1000) will give a float.
Try, for example: x = [b*1000 for b in t] and so on, then format any resulting string (as you had done) only when you need output visible to the user.
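As a rough illustration of that advice, here is a small sketch that keeps everything numeric and formats only when printing. The sample lists t and values are made up for the example; the scale factors are the ones from the question.

```python
# Hypothetical sample inputs; the real t and values come from the sensor data
t = [0, 1, 2]
values = [100, 600, 1200]

# Keep the lists numeric so subtraction and abs() work
x = [float(b) * 1000 for b in t]
z = [c * 0.002 * 48.484 for c in values]

# Arithmetic on neighbouring elements now works without a TypeError
differences = [abs(z[i] - z[i + 1]) for i in range(len(z) - 1)]

# Format to two decimal places only at the point of display
for time_ms, distance in zip(x, z):
    print("%.2f ms -> %.2f" % (time_ms, distance))
```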

Rounding to 2 decimal places in Python

This isn't a duplicate because I have checked everything before this post on this site. I think I have managed to do the first two bullet points. The first one I will do through a string, but I am willing to change that if you know another way. The second one is using comma separators for the $'s, so I will use a float, but once again I am willing to change if a better way is found.
But I am stuck.
The print("%.2f") % str line is something I found, but I need help with rounding to two decimal places and with the last bullet point.
Code:
import random
def random_number():
    random_dollars = random.uniform(1.00, 10000.00)
    print(round(random_dollars, 2))
    print("%.2f") % str
print(random_number())
Shell:
C:\Users\jacke\PycharmProjects\ASLevelHomeworkWeek18\venv\Scripts\python.exe C:/Users/jacke/PycharmProjects/ASLevelHomeworkWeek18/ASLevelHomeworkWeek18.py
6567.62
%.2f
Traceback (most recent call last):
  File "C:/Users/jacke/PycharmProjects/ASLevelHomeworkWeek18/ASLevelHomeworkWeek18.py", line 10, in <module>
    print(random_number())
  File "C:/Users/jacke/PycharmProjects/ASLevelHomeworkWeek18/ASLevelHomeworkWeek18.py", line 7, in random_number
    print("%.2f") % str
TypeError: unsupported operand type(s) for %: 'NoneType' and 'type'
Process finished with exit code 1
You can format currency like this:
def random_number():
    random_dollars = random.uniform(1, 10000)
    result = '$ {:0>9}'.format('{:,.2f}'.format(random_dollars))
    print(result)
{:0>9} means: right-align the string and pad it on the left with 0's to a width of 9.
{:,.2f} rounds to two decimal places (.2f) and adds a comma as thousands-separator.
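To see what each of the two format specs contributes, here is a small sketch applied to a fixed amount instead of a random one (the value 1234.5 is just an example):

```python
amount = 1234.5  # example value chosen for illustration

# Inner format: two decimal places plus a thousands separator
with_commas = '{:,.2f}'.format(amount)

# Outer format: right-align in a field of width 9, filling with '0'
padded = '{:0>9}'.format(with_commas)

print(with_commas)    # 1,234.50
print('$ ' + padded)  # $ 01,234.50
```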
Just one side note: by using random.uniform(1, 10000) most of your numbers will be large (>1000), if you want to test your script with small amounts you could use random_dollars = 10**random.uniform(0, 4) instead:
def random_number():
    random_dollars = 10**random.uniform(0, 4)
    result = '$ {:0>9}'.format('{:,.2f}'.format(random_dollars))
    print(result)
If I get what you are saying you want to round a number to 2 decimal places. Here is how I would do it.
import random
def random_number():
    random_dollars = random.uniform(1, 10000)
    split = str(random_dollars).split(".")
    if (len(split) == 2 ):
        if (len(split[1]) == 1 ):# checks if 1 digit after decimal place
            answer = split[0] + ".0"
        split[1] = str(int(int(split[1]) / (10 ** (len(split[1]) - 3) )))
        # Gets 3 decimal places
        if int(split[1][-1:]) >= 5: #Checks if last digit is above or equal to 5
            split[1] = int(split[1][:-1])
            split[1] += 1
        else:
            split[1] = int(split[1][:-1])
        answer = split[0] + '.' + str(split[1])
    else:
        answer = split[0] + ".00"
    print(answer)
random_number()
This makes it so if the random number is somehow 100 it will add 2 zeros. If the number is like 100.1 it will add one zero. It will also round it.
def random_number():
    random_dollars = random.uniform (1.00, 10000.00)
    n = round(random_dollars,2)
    bd, d = str(n).split('.')
    if len(d) == 1:
        n = bd + "." + d + '0'
        return n
    else:
        return n
for i in range(1, 20):
    print(random_number())
7340.55
7482.70
3956.81
3044.50
4108.57
4864.90
235.00
9831.98
960.97
1172.28
5221.31
3663.50
5410.50
3448.52
8288.13
293.48
1390.68
9216.15
6493.65
TL;DR: you have to put the % directly after the string, and you have to put a real variable there, not the type str.
From the last line of your error message you can see that the problem is the % operator. You can also see that it tried to do the operation with two objects of types 'NoneType' and 'type'. Since you put the entire print call in front of the %, and print returns None (which is of type NoneType), the first operand is of type NoneType. The second operand is str, which is itself a type. You can fix this by moving the % operator directly after the string and replacing str with your variable random_dollars, since that is what you want to insert into the string.
import random
def random_number():
    random_dollars = random.uniform(1.00, 10000.00)
    print(round(random_dollars, 2))
    # this:
    print("%.2f" % random_dollars)
print(random_number())

Error with user input for standard deviation program

My program is meant to calculate the standard deviation for 5 values given by the users. There is an issue with my code when getting the input in a for loop. Why is that?
givenValues = []
def average(values):
    for x in range(0, 6):
        total = total + values[x]
        if(x==5):
            average = total/x
    return average
def sqDiff(values):
    totalSqDiff = 0
    sqDiff = []
    av = average(values)
    for x in range(0,6):
        sqDiff[x] = (values[x] - av)**2
        totalSqDiff = totalSqDiff + sqDiff[x]
    avSqDiff = totalSqDiff / 5
    SqDiffSquared = avSqDiff**2
    return SqDiffSquared
for counter in range(0,6):
    givenValues[counter] = float(input("Please enter a value: "))
    counter = counter + 1
sqDiffSq = sqDiff(givenValues)
print("The standard deviation for the given values is: " + sqDiffSq)
There are several errors in your code, which you can easily find by reading the error messages it produces:
In the function average, insert the line total = 0 before the loop. You are using total before assigning it.
List appending: do not write, for example,
sqDiff[x] = (values[x] - av)**2
You can assign to a new key like this with dicts, but not with lists: assigning to a list index that does not exist yet raises an IndexError. Use sqDiff.append(...) instead.
Do not concatenate strings with floats. I recommend reading PEP 498
(https://www.python.org/dev/peps/pep-0498/), which gives you an idea of how strings could or should be formatted in Python.
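Putting those three points together, a corrected version might look roughly like the sketch below. It goes a little beyond the points above: it loops over the values directly instead of using range(0, 6), and it takes the square root of the averaged squared differences, on the assumption that a population standard deviation is what was intended. The fixed list stands in for the five user inputs.

```python
import math

def average(values):
    total = 0  # initialise before accumulating
    for value in values:
        total = total + value
    return total / len(values)

def std_dev(values):
    av = average(values)
    sq_diffs = []
    for value in values:
        # append() grows the list; index assignment would raise IndexError
        sq_diffs.append((value - av) ** 2)
    # Square root of the mean squared difference (population form)
    return math.sqrt(sum(sq_diffs) / len(values))

# Stand-in for five values read with float(input(...))
given_values = [2.0, 4.0, 4.0, 5.0, 5.0]
print(f"The standard deviation for the given values is: {std_dev(given_values)}")
```

The f-string in the last line avoids concatenating a string with a float.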

Bulk updating a slice of a Python list

I have written a simple implementation of the Sieve of Eratosthenes, and I would like to know if there is a more efficient way to perform one of the steps.
def eratosthenes(n):
    primes = [2]
    is_prime = [False] + ((n - 1)/2)*[True]
    for i in xrange(len(is_prime)):
        if is_prime[i]:
            p = 2*i + 1
            primes.append(p)
            is_prime[i*p + i::p] = [False]*len(is_prime[i*p + i::p])
    return primes
I am using Python's list slicing to update my list of booleans is_prime. Each element is_prime[i] corresponds to an odd number 2*i + 1.
is_prime[i*p + i::p] = [False]*len(is_prime[i*p + i::p])
When I find a prime p, I can mark all elements corresponding to multiples of that prime False, and since all multiples smaller than p**2 are also multiples of smaller primes, I can skip marking those. The index of p**2 is i*p + i.
I'm worried about the cost of computing [False]*len(is_prime[i*p + i::p]), and I have tried to compare it to two other strategies that I couldn't get to work.
For some reason, the formula (len(is_prime) - (i*p + i))/p (if positive) is not always equal to len(is_prime[i*p + i::p]). Is it because I've calculated the length of the slice wrong, or is there something subtle about slicing that I haven't caught?
When I use the following lines in my function:
print len(is_prime[i*p + i::p]), ((len(is_prime) - (i*p + i))/p)
is_prime[i*p + i::p] = [False]*((len(is_prime) - (i*p + i))/p)
I get the following output (case n = 50):
>>> eratosthenes2(50)
7 7
3 2
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 9, in eratosthenes2
ValueError: attempt to assign sequence of size 2 to extended slice of size 3
I also tried replacing the bulk updating line with the following:
for j in xrange(i*p + i, len(is_prime), p):
    is_prime[j] = False
But this fails for large values of n because xrange doesn't take anything bigger than a long. I gave up on trying to wrestle itertools.count into what I needed.
Are there faster and more elegant ways to bulk-update the list slice? Is there anything I can do to fix the other strategies that I tried, so that I can compare them to the working one? Thanks!
Use itertools.repeat():
is_prime[i*p + i::p] = itertools.repeat(False, len(is_prime[i*p + i::p]))
The slicing syntax will iterate over whatever you put on the right-hand side; it doesn't need to be a full-blown sequence.
So let's fix that formula. I'll just borrow the Python 3 formula since we know that works:
1 + (hi - 1 - lo) / step
Since step > 0, hi = stop and lo = start, so we have:
1 + (len(is_prime) - 1 - (i*p + i))//p
(// is integer division; this future-proofs our code for Python 3, and it also works in Python 2, where // has been available since 2.2).
Now, put it all together:
slice_len = 1 + (len(is_prime) - 1 - (i*p + i))//p
is_prime[i*p + i::p] = itertools.repeat(False, slice_len)
Python 3 users: Please do not use this formula directly. Instead, just write len(range(start, stop, step)). That gives the same result with similar performance (i.e. it's O(1)) and is much easier to read.
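For reference, here is a sketch of how the whole sieve might look in Python 3 with the len(range(...)) approach. The function name sieve_odd is made up for this example, and the start index i*p + i is the one derived in the question for p**2.

```python
import itertools

def sieve_odd(n):
    """Return the primes up to n; is_prime[i] stands for the odd number 2*i + 1."""
    if n < 2:
        return []
    primes = [2]
    is_prime = [False] + [True] * ((n - 1) // 2)
    for i in range(len(is_prime)):
        if is_prime[i]:
            p = 2 * i + 1
            primes.append(p)
            start = i * p + i  # index of p**2
            # len(range(...)) is O(1) and matches the extended slice length
            count = len(range(start, len(is_prime), p))
            is_prime[start::p] = itertools.repeat(False, count)
    return primes

print(sieve_odd(50))
```

Because range computes its length arithmetically, no temporary slice is built just to measure it.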

Python: "long int too large to convert to float" when calculating pi

I get this error when using a python script that calculates pi using the Gauss-Legendre algorithm. You can only use up to 1024 iterations before getting this:
C:\Users\myUsernameHere>python Desktop/piWriter.py
End iteration: 1025
Traceback (most recent call last):
  File "Desktop/piWriter.py", line 15, in <module>
    vars()['t' + str(sub)] = vars()['t' + str(i)] - vars()['p' + str(i)] * math.pow((vars()['a' + str(i)] - vars()['a' + str(sub)]), 2)
OverflowError: long int too large to convert to float
Here is my code:
import math
a0 = 1
b0 = 1/math.sqrt(2)
t0 = .25
p0 = 1
finalIter = input('End iteration: ')
finalIter = int(finalIter)
for i in range(0, finalIter):
    sub = i + 1
    vars()['a' + str(sub)] = (vars()['a' + str(i)] + vars()['b' + str(i)])/ 2
    vars()['b' + str(sub)] = math.sqrt((vars()['a' + str(i)] * vars()['b' + str(i)]))
    vars()['t' + str(sub)] = vars()['t' + str(i)] - vars()['p' + str(i)] * math.pow((vars()['a' + str(i)] - vars()['a' + str(sub)]), 2)
    vars()['p' + str(sub)] = 2 * vars()['p' + str(i)]
    n = i
pi = math.pow((vars()['a' + str(n)] + vars()['b' + str(n)]), 2) / (4 * vars()['t' + str(n)])
print(pi)
Ideally, I want to be able to plug in a very large number as the iteration value and come back a while later to see the result.
Any help appreciated!
Thanks!
Floats can only represent numbers up to sys.float_info.max, or 1.7976931348623157e+308. Once you have an int with more than 308 digits (or so), you are stuck. Your iteration fails when p1024 has 309 digits:
179769313486231590772930519078902473361797697894230657273430081157732675805500963132708477322407536021120113879871393357658789768814416622492847430639474124377767893424865485276302219601246094119453082952085005768838150682342462881473913110540827237163350510684586298239947245938479716304835356329624224137216L
You'll have to find a different algorithm for pi, one that doesn't require such large values.
Actually, you'll have to be careful with floats all around, since they are only approximations. If you modify your program to print the successive approximations of pi, it looks like this:
2.914213562373094923430016933707520365715026855468750000000000
3.140579250522168575088244324433617293834686279296875000000000
3.141592646213542838751209274050779640674591064453125000000000
3.141592653589794004176383168669417500495910644531250000000000
3.141592653589794004176383168669417500495910644531250000000000
3.141592653589794004176383168669417500495910644531250000000000
3.141592653589794004176383168669417500495910644531250000000000
In other words, after only 4 iterations, your approximation has stopped getting better. This is due to inaccuracies in the floats you are using, perhaps starting with 1/math.sqrt(2). Computing many digits of pi requires a very careful understanding of the numeric representation.
As noted in the previous answer, the float type has an upper bound on number size. In typical implementations, sys.float_info.max is 1.7976931348623157e+308, which reflects the 11-bit exponent field (roughly 10 bits of magnitude plus a sign) of a 64-bit floating point number. (Note that 1024*math.log(2)/math.log(10) is about 308.2547155599.)
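The limit is easy to reproduce in isolation. The following sketch (Python 3, where int replaces the old long type) shows that 2**1023 still converts to a float while 2**1024, the value p reaches after 1024 doublings, does not:

```python
import sys

print(sys.float_info.max)   # largest finite float on this platform
print(len(str(2 ** 1024)))  # number of decimal digits in 2**1024

print(float(2 ** 1023))     # still representable as a float

try:
    float(2 ** 1024)        # one doubling too many
except OverflowError as exc:
    print("OverflowError:", exc)
```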
You can add another half dozen decades to the exponent size by using the Decimal number type. Here is an example (snipped from an ipython interpreter session):
In [48]: import decimal, math
In [49]: g=decimal.Decimal('1e12345')
In [50]: g.sqrt()
Out[50]: Decimal('3.162277660168379331998893544E+6172')
In [51]: math.sqrt(g)
Out[51]: inf
This illustrates that decimal's sqrt() function performs correctly with larger numbers than does math.sqrt().
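Building on that, here is a minimal sketch of the Gauss-Legendre iteration done entirely with Decimal from the standard library, so that both the range and the precision problems are avoided. The function name, the guard-digit count, and the choice to keep only the latest values are assumptions of this sketch, not part of the original script.

```python
from decimal import Decimal, localcontext

def gauss_legendre_pi(digits, iterations):
    """Approximate pi with Decimal arithmetic at roughly `digits` digits."""
    with localcontext() as ctx:
        ctx.prec = digits + 10  # a few guard digits (an arbitrary choice)
        a = Decimal(1)
        b = Decimal(1) / Decimal(2).sqrt()
        t = Decimal(1) / Decimal(4)
        p = Decimal(1)
        for _ in range(iterations):
            a_next = (a + b) / 2
            b = (a * b).sqrt()
            t = t - p * (a - a_next) ** 2
            p = 2 * p
            a = a_next
        return (a + b) ** 2 / (4 * t)

print(gauss_legendre_pi(50, 6))
```

The number of correct digits roughly doubles with each iteration, so the working precision, not the iteration count, is usually what limits the result.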
As noted above, getting lots of digits is going to be tricky, but looking at all those vars hurts my eyes. So here's a version of your code after (1) replacing your use of vars with dictionaries, and (2) using ** instead of the math functions:
a, b, t, p = {}, {}, {}, {}
a[0] = 1
b[0] = 2**-0.5
t[0] = 0.25
p[0] = 1
finalIter = 4
for i in range(finalIter):
    sub = i + 1
    a[sub] = (a[i] + b[i]) / 2
    b[sub] = (a[i] * b[i])**0.5
    t[sub] = t[i] - p[i] * (a[i] - a[sub])**2
    p[sub] = 2 * p[i]
    n = i
pi_approx = (a[n] + b[n])**2 / (4 * t[n])
Instead of playing games with vars, I've used dictionaries to store the values (the link there is to the official Python tutorial) which makes your code much more readable. You can probably even see an optimization or two now.
As noted in the comments, you really don't need to store all the values, only the last, but I think it's more important that you see how to do things without dynamically creating variables. Instead of a dict, you could also have simply appended the values to a list, but lists are always zero-indexed and you can't easily "skip ahead" and set values at arbitrary indices. That can occasionally be confusing when working with algorithms, so let's start simple.
Anyway, the above gives me
>>> print(pi_approx)
3.141592653589794
>>> print(pi_approx-math.pi)
8.881784197001252e-16
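The remark about only needing the last values can be sketched as follows. This is a variation on the dictionary version, not a replacement for it: it uses plain floats, keeps one variable per quantity, and evaluates the final formula with the most recent values.

```python
import math

a, b, t, p = 1.0, 2 ** -0.5, 0.25, 1.0

for _ in range(4):
    a_next = (a + b) / 2        # arithmetic mean
    b = math.sqrt(a * b)        # geometric mean, using the old a
    t -= p * (a - a_next) ** 2
    p *= 2
    a = a_next

pi_approx = (a + b) ** 2 / (4 * t)
print(abs(pi_approx - math.pi) < 1e-12)
```

With floats, more iterations do not improve the result, which is consistent with the stalled approximations shown earlier.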
A simple solution is to install and use the arbitrary-precision mpmath module, which now supports Python 3. However, since I completely agree with DSM that your use of vars() to create variables on the fly is an undesirable way to implement the algorithm, I've based my answer on his rewrite of your code and [trivially] modified it to make use of mpmath to do the calculations.
If you insist on using vars(), you could probably do something similar, although I suspect it might be more difficult and the result would definitely be harder to read, understand, and modify.
from mpmath import mpf # arbitrary-precision float type
a, b, t, p = {}, {}, {}, {}
a[0] = mpf(1)
b[0] = mpf(2**-0.5)
t[0] = mpf(0.25)
p[0] = mpf(1)
finalIter = 10000
for i in range(finalIter):
    sub = i + 1
    a[sub] = (a[i] + b[i]) / 2
    b[sub] = (a[i] * b[i])**0.5
    t[sub] = t[i] - p[i] * (a[i] - a[sub])**2
    p[sub] = 2 * p[i]
    n = i
pi_approx = (a[n] + b[n])**2 / (4 * t[n])
print(pi_approx) # 3.14159265358979
