KeyError with a poisson process using pandas

I am trying to write a function that simulates a compound Poisson process for an adjustable time step dt and total time T, and have the following:
def compound_poisson(lamda,mu,sigma,dt,T):
    points = pd.Series(0)
    out = pd.Series(0)
    inds = simple_poisson(lamda,dt,T)
    for ind in inds.index:
        if inds[ind+dt] > inds[ind]:
            points[ind+dt] = np.random.normal(mu,sigma)
        else:
            points[ind+dt] = 0
    out = out.append(np.cumsum(points),ignore_index=True)
    out.index = np.linspace(0,T,int(T/dt + 1))
    return out
However, I receive a "KeyError: 0.010000000000000002", which should not be in the index at all. Is this a result of being lax with float objects?

In short, yes, it's a floating point error. It's quite hard to know how you got there, but probably something like this:
>>> 0.1 * 0.1
0.010000000000000002
Maybe use round?
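A minimal sketch of the rounding idea, assuming the index labels are built as multiples of dt (the names dt, labels and s are only for illustration, they are not from the question). The labels and the lookup key are rounded to the same number of decimals, so they compare equal:

```python
import numpy as np
import pandas as pd

dt = 0.01
# Build the labels from integer step counts and round them once.
labels = np.round(np.arange(5) * dt, 10)
s = pd.Series(np.zeros(5), index=labels)

key = 0.1 * 0.1                            # carries a tiny binary rounding error
found_raw = key in s.index                 # exact float comparison against the labels
found_rounded = round(key, 10) in s.index  # key rounded the same way as the labels
print(found_raw, found_rounded)
```

Looping over integer step numbers and computing the time only when you need to display it would sidestep the problem as well.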

Related

How to convert voltage (or frequency) floating number read backs to mV (or kHz)?

I can successfully read data back from an instrument:
When the readback is a voltage, I typically get values such as 5.34e-02 volts.
When the readback is a frequency, I typically get values like 2.95e+04 or 1.49e+05, in Hz.
I would like to convert the voltage readback of 5.34e-02 to an exponent of e-3 (i.e. millivolts), giving 53.4e-3. Then I would like to extract the mantissa 53.4, because all my data needs to be in millivolts.
Similarly, I would like to convert frequencies such as 2.95e+04 (or 1.49e+05) to kilohertz, i.e. 29.5e+03 or 149e+03, and then extract the mantissa 29.5 or 149, since all my data needs to be in kHz.
Can someone suggest how to do this?

Well, to convert volts to millivolts, you multiply by 1000. To convert Hz to kHz, you divide by 1000.
>>> reading = 5.34e-02
>>> millivolts = reading * 1000
>>> print(millivolts)
53.400000000000006
>>> hz = 2.95e+04
>>> khz = hz / 1000
>>> khz
29.5
FOLLOW-UP
OK, assuming your real goal is to keep the units the same but adjust the exponent to a multiple of 3, see if this meets your needs.
def convert(val):
    if isinstance(val,int):
        return str(val)
    cvt = f"{val:3.2e}"
    if 'e' not in cvt:
        return cvt
    # a will be #.##
    # b will be -##
    a,b = cvt.split('e')
    exp = int(b)
    if exp % 3 == 0:
        return cvt
    if exp % 3 == 1:
        a = a[0]+a[2]+a[1]+a[3]
        exp = abs(exp-1)
        return f"{a}e{b[0]}{exp:02d}"
    a = a[0]+a[2]+a[3]+a[1]
    exp = abs(exp-2)
    return f"{a}e{b[0]}{exp:02d}"
for val in (5.34e-01, 2.95e+03, 5.34e-02, 2.95e+04, 5.34e-03, 2.95e+06):
    print( f"{val:3.2e} ->", convert(val) )
Output:
5.34e-01 -> 534.e-03
2.95e+03 -> 2.95e+03
5.34e-02 -> 53.4e-03
2.95e+04 -> 29.5e+03
5.34e-03 -> 5.34e-03
2.95e+06 -> 2.95e+06
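If you would rather get the mantissa and exponent as numbers instead of shuffling characters in a string, here is a rough numeric sketch of the same idea. The function name to_engineering is made up for this example, and it assumes plain finite floats:

```python
import math

def to_engineering(value):
    """Return (mantissa, exponent) where exponent is a multiple of 3."""
    if value == 0:
        return 0.0, 0
    # Round the base-10 exponent down to the nearest multiple of 3.
    exponent = 3 * math.floor(math.log10(abs(value)) / 3)
    mantissa = value / 10.0 ** exponent
    return mantissa, exponent

for reading in (5.34e-02, 2.95e+04, 1.49e+05):
    mantissa, exponent = to_engineering(reading)
    print(f"{reading:.2e} -> {mantissa:.3g}e{exponent:+03d}")
```

The mantissa is still a float, so format it (as with .3g above) before displaying it.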

Here, I think multiplying or dividing by 1000 is enough to move between SI prefixes. But when units get more complicated, it can help to use a library like Pint to keep track of things and make sure you're calculating what you think you are.
With Pint you might do:
import pint
ureg = pint.UnitRegistry()
Q = ureg.Quantity
reading_v = Q(5.34e-02, 'volts')
reading_mv = reading_v.to('millivolts')
print(reading_mv.magnitude)
but it seems overkill here.

Python, different rounding from variable, and hardcoded value

I have a simple formula that produces a decimal number (0.97745), which I want to round to 4 decimal places.
When I round my computed variable I get 0.9774, but when I hardcode the number into round() I get 0.9775.
Here is the code:
zero = 0.9700
effective_beta = 0.00745
loan = {}
loan['beta2'] = 0.0
loan['beta3'] = 0.0
mrktdiff_2 =0.08880400
mrktdiff_3 = 0.026463592000
forecasted_pt = (float(zero) + float(effective_beta) + float(loan['beta2'] or 0.) * float(mrktdiff_2) +
                 float(loan['beta3'] or 0.) * float(mrktdiff_3))
print("before rounding forecastedpt is ")
print(forecasted_pt)
print("after rounding")
print(round(forecasted_pt,4))
print("Dont get this part")
print(round(0.97745,4))
The reason I wrap everything in float() is that these variables are dynamic and can sometimes be strings or null values.
Also, when I run the same code in PHP I get 0.9775.
Edit:
I ran the code in katacoda.com editor, and got the following:
before rounding forecastedpt is
0.97745
after rounding
0.9774
Dont get this part
0.9775
But running it on repl.com I get the first value as 0.97744999999999, so I guess the difference could come from the precision of the expression itself.

Try this:
zero = 0.9700
effective_beta = 0.00745
loan = {}
loan['beta2'] = 0.0
loan['beta3'] = 0.0
mrktdiff_2 =0.08880400
mrktdiff_3 = 0.026463592000
forecasted_pt = (float(zero) + float(effective_beta) + float(loan['beta2'] or 0.) * float(mrktdiff_2) + float(loan['beta3'] or 0.) * float(mrktdiff_3))
print("before rounding forecastedpt is ")
print(forecasted_pt)
print("after rounding")
print(round(forecasted_pt,5))
numb = round(forecasted_pt,5)
print(round(numb,4))
print("Dont get this part")
print(round(0.97745,4))
the output:
before rounding forecastedpt is
0.9774499999999999
after rounding
0.97745
0.9775
Dont get this part
0.9775
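round() works on the binary value that is actually stored, not on the digits you typed. Here the computed sum is stored as 0.9774499999999999, which lies just below the halfway point, so rounding to 4 places gives 0.9774. Rounding to 5 places first gives 0.97745, and that then rounds to 0.9775, the same as the hardcoded value.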

From the python documentation:
Note: The behavior of round() for floats can be surprising: for example, round(2.675, 2) gives 2.67 instead of the expected 2.68. This is not a bug: it’s a result of the fact that most decimal fractions can’t be represented exactly as a float. See Floating Point Arithmetic: Issues and Limitations for more information.
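To see what the note means, and as one possible way around it, here is a small sketch using the standard decimal module. Building the Decimal values from strings and using ROUND_HALF_UP are assumptions about what you want, not something the documentation prescribes:

```python
from decimal import Decimal, ROUND_HALF_UP

# The float nearest to 2.675 is slightly below it, so round() goes down.
print(Decimal(2.675))    # exact value of the stored binary float
print(round(2.675, 2))

# Decimals built from strings keep the decimal digits exactly.
total = Decimal("0.9700") + Decimal("0.00745")
rounded = total.quantize(Decimal("0.0001"), rounding=ROUND_HALF_UP)
print(rounded)
```

This only helps if the inputs reach you as strings or Decimals. Once a value has passed through float, the error is already baked in.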

Try this:
I remember from my statistics class 10 years ago that my professor always advised us to round calculations to 6 decimal places first. Statistics is all about estimation, so it counts a lot.
zero = 0.9700
effective_beta = 0.00745
loan = {}
loan['beta2'] = 0.0
loan['beta3'] = 0.0
mrktdiff_2 =0.08880400
mrktdiff_3 = 0.026463592000
forecasted_pt = round((float(zero) + float(effective_beta) + float(loan['beta2'] or 0.) * float(mrktdiff_2) + float(loan['beta3'] or 0.) * float(mrktdiff_3)),6)
print(round(forecasted_pt,4))

format() function is always returning "0.00"

I have some calculation that I am running that would provide a double/float back:
a = float(4)
b = float(56100)
c = a / b
Now when I run the script, I get this:
7.1301e-05
I just need to format this response so that I get 7.13. But when I try to do this I get 0.00:
percentage_connections_used = float(a) / float(b)
percentage_float = float(percentage_connections_used)
print(format(percentage_float, '.2f'))
I can't figure out why it returns 0 when I try to format it. Can someone tell me what is going on? This is Python 2.7.

I think your format is correct, but when you round to 2 decimal places, the value actually rounds to 0.00.
7.8125e-05 = 0.000078125
When rendered as 2 decimals, you get 0.00.
You could do a little string manipulation to parse out the 7.8125 figure by using:
d = float(str(c).split('e')[0])
It's a little verbose, though, and maybe someone in the community can do better.
By the way, I get 7.1301...e-05 when I run a/b.

7.8125e-05 is the same as 0.000078125 so formatting it with only two decimal points gives you 0.00. You could do '.7f' which would get you 0.0000713. If you want it to output in scientific notation, you should do that explicitly. Try this:
a = float(4)
b = float(56100)
c = a / b
print("{:.2e}".format(c))
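If what you actually want is the 7.13 on its own, one option (a sketch, not the only way) is to format in scientific notation first and then split off the exponent. The names mantissa and exponent are just for illustration:

```python
a = float(4)
b = float(56100)
c = a / b

# Format to two decimals in scientific notation, then split at the 'e'.
mantissa, exponent = "{:.2e}".format(c).split("e")
print(mantissa)        # digits before the exponent, as a string
print(int(exponent))   # power of ten, as an integer
```

Unlike splitting str(c), this also works for values that Python would print without an exponent, because the e format always includes one.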

Python/Numpy: Division gives me an unexpected deprecation-Warning

I'm reading data from a CSV, looping over it, and then dividing by the mean value to normalize it, but I'm getting a warning. The code is:
import numpy as np
from numpy import genfromtxt
A = genfromtxt("train.txt", delimiter=';', skip_header=1)
lowid = A[:,1].min(axis=0)
highid = A[:,1].max(axis=0)
X = []
Y = []
for i in np.arange(lowid, highid):
    I = A[A[:,1] == i][:, [0,2,3]]
    meanp = np.mean(I[:,1])
    meanq = np.mean(I[:,2])
    for j in np.arange(I[:,0].min(axis=0)+2, I[:,0].max(axis=0)):
        weekday = int(I[j,0]) % 7
        # NORMALIZE:
        P = I[j,1] / meanp
        pP = I[j-1,1] / meanp
        ppP = I[j-2,1] / meanp
        X.append([weekday, P, pP, ppP])
        Y.append(I[j,2])
the train.txt looks like this:
day;itemID;price;quantity
1;1;4.73;6
1;2;7.23;0
1;3;10.23;1
1;4;17.9;0
1;5;1.81;1
1;6;12.39;1
1;7;7.17;1
1;8;7.03;0
1;9;13.61;0
1;10;36.45;1
1;11;24.67;0
1;12;12.04;0
1;13;11.85;0
The warnings:
weekday = int(I[j,0]) % 7
DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
P = I[j,1] / meanp
DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
pP = I[j-1,1] / meanp
DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
ppP = I[j-2,1] / meanp
DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
Y.append(I[j,2])
DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
What is the problem?
Thanks
EDIT: Okay, I found a pretty fast fix myself:
j has to be of integer type. I fixed it like this:
for j in range(int(I[:,0].min(axis=0))+2, int(I[:,0].max(axis=0))):
Is this a good solution? I'm new to Python...

Okay, I found a pretty fast fix myself: j has to be of integer type.
I fixed it like this:
for j in range(int(I[:,0].min(axis=0))+2, int(I[:,0].max(axis=0))):
using the Python range function, or by explicitly defining the data type for arange like this (thanks @Davidmh):
for j in np.arange(I[:,0].min(axis=0)+2, I[:,0].max(axis=0), dtype=int):
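For anyone who wants to see the difference in isolation, here is a tiny sketch. The array days is made-up data standing in for the first column of the file, which genfromtxt returns as floats:

```python
import numpy as np

days = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # floats, like a genfromtxt column

# Float bounds give float steps, which are not valid array indices.
float_steps = np.arange(days.min() + 2, days.max())
# Either of these gives integer steps instead.
int_steps = np.arange(days.min() + 2, days.max(), dtype=int)
range_steps = list(range(int(days.min()) + 2, int(days.max())))

print(float_steps.dtype, int_steps.dtype)
print(days[int_steps])   # integer positions index the array without a warning
```

Both integer variants produce the same positions, so which one to use is mostly a matter of taste.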

Getting part of R object from python using rpy2

I can get the output I need in R, but I cannot reproduce it with Python's rpy2 module.
In R:
> wilcox.test(c(1,2,3), c(100,200,300), alternative = "less")$p.value
gives
[1] 0.05
In python:
import rpy2.robjects as robjects
rwilcox = robjects.r['wilcox.test']
x = robjects.IntVector([1,2,3,])
y = robjects.IntVector([100,200,300])
z = rwilcox(x,y, alternative = "less")
print z
gives:
Wilcoxon rank sum test
data: 1:3 and c(100L, 200L, 300L)
W = 0, p-value = 0.05
alternative hypothesis: true location shift is less than 0
And:
z1 = z.rx('p.value')
print z1
gives:
$p.value
[1] 0.05
I'm still trying to get the final value of 0.05 stored as a variable, but this seems closer to a final answer.
I am unable to figure out what my Python code needs to be to store the p-value in a new variable.

z1 is a ListVector containing one FloatVector with one element:
>>> z1
<ListVector - Python:0x4173368 / R:0x36fa648>
[FloatVector]
p.value: <class 'rpy2.robjects.vectors.FloatVector'>
<FloatVector - Python:0x4173290 / R:0x35e6b38>
[0.050000]
You can extract the float itself with z1[0][0] or just float(z1[0]):
>>> z1[0][0]
0.05
>>> type(z1[0][0])
<type 'float'>
>>> float(z1[0])
0.05
In general, you will have an easier time figuring out what is going on in an interactive session if you just type the name of the object you want a representation of. The print x statement passes things through str(x), whereas the repr(x) representation used implicitly by the interactive loop is much more helpful. If you are working in a script, use print repr(x) instead.

How about just using list()?
pval = z.rx2('p.value')
print list(pval) # [0.05]
rpy2 also works well with numpy:
import numpy
pval = numpy.array(pval)
print pval # [ 0.05]
http://rpy.sourceforge.net/rpy2/doc-2.3/html/numpy.html#from-rpy2-to-numpy
