Problems with bnlearn as library regarding float numbers


I'm trying this notebook, but with float numbers:
https://github.com/erdogant/bnlearn/blob/master/notebooks/bnlearn.ipynb
Has anyone used "structure_learning.fit()" from bnlearn with float numbers?
My chart is blank. When I run a simple correlation on my dataframe I get results, so it is not a dataframe problem.
Another hint supporting my hypothesis: when I convert my floats to binary, it works.

bnlearn in Python only works with binary values, not with continuous values. This library is an adaptation of an R library, so not everything has been implemented yet. Currently P(A|B) can only be computed for binary problems in this library. Please check the math behind P(A|B) to understand why.
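A minimal sketch of the workaround described in the question (converting floats to binary before structure learning), assuming pandas is available. The column names and the median threshold are made up for illustration, and the bnlearn call is only shown as a comment, following the linked notebook.

```python
import pandas as pd

# Hypothetical float data; the column names are illustrative only.
df = pd.DataFrame({
    "temperature": [20.1, 25.3, 19.8, 30.2, 27.5, 18.4],
    "humidity": [0.45, 0.30, 0.50, 0.20, 0.25, 0.55],
})

# Binarize each column: 1 if the value is above that column's median, else 0.
# The median is an arbitrary threshold chosen for this sketch.
df_binary = (df > df.median()).astype(int)

print(df_binary)

# The binary frame can then be passed to structure learning, as in the notebook:
# import bnlearn as bn
# model = bn.structure_learning.fit(df_binary)
```

Which threshold (or how many bins) makes sense depends on the data, so treat the median split as a starting point only.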

Related

Is there a Python function or extension that is similar to Matlab's format short?

The command format short in Matlab makes all the printouts in the command window use "Short, fixed-decimal format with 4 digits after the decimal point."
I know there is np.round, but I would like to have the functionality that Matlab offers in Python so I don't have to write round every time. The goal is to get a better overview of arrays/dataframes when they are printed.
I am interested in automatic rounding of numbers/floats printed in the terminal without using np.round.
Ideally I would also like to be able to choose the number of digits (4).
Thanks

You can use numpy.set_printoptions. From the documentation:
>>> import numpy as np
>>> np.set_printoptions(precision=4)
>>> print(np.array([1.123456789]))
[1.1235]
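Since the question also mentions dataframes, here is a short sketch that sets the display precision for both NumPy and pandas. It assumes both libraries are installed; the sample values are made up. Both settings only change how numbers are printed, not the stored values.

```python
import numpy as np
import pandas as pd

# Display precision for NumPy arrays (global setting for the session).
np.set_printoptions(precision=4)
arr = np.array([1.123456789])
print(arr)

# Display precision for pandas objects (also a global setting).
pd.set_option("display.precision", 4)
s = pd.Series([1.123456789, 2.5])
print(s)

# The underlying data keeps its full precision.
print(arr[0] == 1.123456789)
```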

QBASIC and Python : floating point number formatting/rounding off issue

We are trying to convert some qbasic scripts into python scripts.
The scripts are used to generate some reports. Generally the reports generated by qbasic and python scripts should be exactly same.
While generating a report we need to format a floating point number in a particular format.
We use the following commands for formatting the number.
For QBASIC, we use
PRINT USING "########.###"; VAL(MYNUM$)
For Python, we use
print('{:12.3f}'.format(mynum))
where MYNUM$ and mynum hold the floating point value.
But in certain cases, the formatted value differs between Python and QBASIC.
The results are as follows:
Can anyone help me sort out this problem and make the Python formatting work like QBASIC?

This seems to be related to the datatype used (maybe a 32-bit float in QBASIC and a 64-bit float in Python) and to how rounding is implemented. For example, when you use:
from ctypes import c_float
from math import floor
print(floor(c_float(mynum).value*1000+.5)/1000)
c_float converts the Python float into a C float (single precision).
This gives me exactly the same numbers in Python as in QBASIC.
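The same idea can be sketched with the standard struct module instead of ctypes. This rests on the answer's assumption that QBASIC stores the value as a 32-bit float; the helper names to_single and format_like_single are hypothetical and not part of any library.

```python
import struct
from math import floor


def to_single(x):
    """Round a 64-bit Python float to the nearest 32-bit float value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def format_like_single(x, width=12, decimals=3):
    """Round half up on the single-precision value, then format it.

    Assumption: this mimics PRINT USING on a single-precision number.
    """
    scale = 10 ** decimals
    rounded = floor(to_single(x) * scale + 0.5) / scale
    return "{:{w}.{d}f}".format(rounded, w=width, d=decimals)


# 0.5 is exactly representable in both precisions, 0.1 is not.
print(to_single(0.5) == 0.5)
print(to_single(0.1) == 0.1)
print(format_like_single(1.5))
```

Whether this reproduces every QBASIC result depends on how the original scripts declare their variables, so it is worth checking against a few known report values.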

Python Panda.read_csv rounds to get import errors?

I have a 10000 x 250 dataset in a csv file. When I use the command
data = pd.read_csv('pool.csv', delimiter=',',header=None)
while I am in the correct path I actually import the values.
First I get the DataFrame. Since I want to work with the numpy package, I need to convert it to its values using
data = data.values
And this is where it gets weird. At position [9999,0] the file contains the value -0.3839. However, after importing and calculating with it, I noticed that Python (or numpy) does something strange while importing.
Calling the value of data[9999,0] SHOULD give the expected -0.3839, but gives something like -0.383899892....
I have already imported the file in other languages such as Matlab, and there was no issue with rounding those values. I also tried the .to_csv command from the pandas package instead of .values, but the problem is exactly the same.
The last 10 elements of the first column are
-0.2716
0.3711
0.0487
-1.518
0.5068
0.4456
-1.753
-0.4615
-0.5872
-0.3839
Is there any import routine, which does not have those rounding errors?

Passing float_precision='round_trip' should solve this issue:
data = pd.read_csv('pool.csv',delimiter=',',header=None,float_precision='round_trip')
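A self-contained sketch of the same call, reading from an in-memory string instead of pool.csv so it can be run without the file. The two rows of sample data are taken from the values listed in the question; the layout is made up.

```python
import io

import pandas as pd

# Stand-in for pool.csv: a tiny CSV held in memory.
csv_text = "-0.3839,0.5068\n-1.753,0.4456\n"

data = pd.read_csv(
    io.StringIO(csv_text),
    delimiter=",",
    header=None,
    float_precision="round_trip",  # parse floats the same way Python's float() does
)
values = data.values

# The parsed value compares equal to the Python literal.
print(values[0, 0] == -0.3839)
```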

That's a floating point error. Computers store floats in binary, and most decimal fractions (such as -0.3839) cannot be represented exactly in binary, so the nearest representable value is stored instead. Don't be bothered by it; the difference is very small.
If you really need exact precision (because you are testing for exact values) you can look at Python's decimal module, but your program will be a lot slower (probably something like 100 times slower).
You can read more here: https://docs.python.org/3/tutorial/floatingpoint.html
You should know that all languages have this problem; some are just better at hiding it. (Also note that in Python 3 this "hiding" of the floating point error has been improved.)
Since there is no ideal solution to this problem, it is up to you to choose the most appropriate solution for your situation.
I don't know about 'round_trip' and its limitations, but it can probably help you. Another option would be to use float_format in the to_csv method, which takes a format string. (https://docs.python.org/3/library/string.html#format-specification-mini-language)
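To make the point about representation concrete, here is a small standard-library sketch. It uses the value -0.3839 from the question; nothing here is specific to pandas.

```python
from decimal import Decimal

x = -0.3839

# Decimal(float) shows the exact binary value that is actually stored.
exact = Decimal(x)
print(exact)

# repr() gives the shortest string that converts back to the same float.
print(repr(x))
print(float(repr(x)) == x)

# Formatting with a fixed number of decimals hides the representation error.
formatted = "{:.4f}".format(x)
print(formatted)
```

The stored value is not exactly -0.3839, but it is the closest double to it, which is why it prints back as -0.3839 when formatted.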

python not able to convert very small decimal to log

I am using a function that multiplies probabilities, thereby creating very small values. I am using the decimal.Decimal module to handle this, and when the computation is complete I convert that decimal to log odds using the math.log function. But below a certain probability, Python cannot convert these very small probabilities to the log2 or log10 of the likelihood ratio.
I am getting ValueError: math domain error
So, I printed the value before the traceback started and it seems to be this number:
2.4876626750969332485460767406646530276378975654773588506772125620858727319570054153525540357327805722211631386444621446226193195409521079089382667946955357511114536197822067973513019098983691433561051610219726750413489309980667312714519374641433925197450250314924925500181809328656811236486523523785835600132361529950090E-366
Other small numbers like this are getting handled by math.log though in the same program:
5.0495856951184114023890172277484001329118412629157526209503867218204386939259819037402424581363918720565886924655927609161379229574865468595907661385853201472751861413845827437245978577896538019445515183910587509474989069747817303700894727201121392323641965506674606552182934813779310061601566189062725979740753305935661E-31
Is that right? Is there any way to fix this? I know I can take the log of the probabilities and sum them along the way, but when I tried to do that, it seemed I would have to update several places in my program, which could take many hours or days, and there would be another step to convert the result back to decimal.
Thanks,

If you want to take logarithms of Decimal objects, use the ln or log10 methods. Aside from a weird special case for huge ints, math.log casts its inputs to float, and a value as small as the one you printed is too small to be represented as a float.
whatever_decimal.ln()
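A short sketch of why math.log fails and how the Decimal methods avoid it. The value is a shortened version of the number from the question; the variable names are illustrative.

```python
import math
from decimal import Decimal

# Shortened version of the probability printed in the question.
p = Decimal("2.4876626750969332E-366")

# Converting to float underflows to 0.0, which is why math.log fails.
as_float = float(p)
try:
    math.log(p)
    math_log_failed = False
except ValueError:
    math_log_failed = True

# The Decimal methods work directly on the Decimal value.
ln_p = p.ln()
log10_p = p.log10()

# Decimal has no log2 method; a base-2 log can be derived from ln.
log2_p = p.ln() / Decimal(2).ln()

print(as_float, math_log_failed)
print(log10_p)
print(log2_p)
```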

Exact calculations in python [duplicate]

>>> float(str(0.65000000000000002))
0.65000000000000002
>>> float(str(0.47000000000000003))
0.46999999999999997 ???
What is going on here?
How do I convert 0.47000000000000003 to string and the resultant value back to float?
I am using Python 2.5.4 on Windows.

str(0.47000000000000003) gives '0.47', and float('0.47') can be 0.46999999999999997.
This is due to the way floating point numbers are represented (see this wikipedia article).
Note: float(repr(0.47000000000000003)) or eval(repr(0.47000000000000003)) will give you the expected result, but you should use Decimal if you need precision.
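A small sketch of the difference, written for Python 3 (an assumption, since the question uses Python 2.5.4). In Python 2, str() of a float kept only 12 significant digits, which the '%.12g' format reproduces here; in Python 3, str() and repr() both return the shortest string that round-trips.

```python
from decimal import Decimal

x = 0.47000000000000003

# 12 significant digits, as Python 2's str() used for floats.
short = "%.12g" % x
print(short)
print(float(short) == x)

# repr() round-trips (and so does str() in Python 3).
print(float(repr(x)) == x)

# Decimal works on the decimal digits directly, so no binary rounding occurs.
total = Decimal("0.47") + Decimal("0.03")
print(total)
```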

float (and double) do not have infinite precision. Naturally, rounding errors occur when you operate on them.

This is a Python FAQ
The same question comes up quite regularly in comp.lang.python also.
I think the reason it is a FAQ is that, because Python is perfect in all other respects ;-), we expect it to perform arithmetic perfectly, just like we were taught at school. However, as anyone who has done a numerical methods course will tell you, floating point numbers are a very long way from perfect.
Decimal is a good alternative and if you want more speed and more options gmpy is great too.

Take this example.
I think this is an error in Python when you divide:
>>> print(int(((48/5.0)-9)*5))
2
The easy way I solve this problem is this:
>>> print(int(round(((48/5.0)-9)*5,2)))
3
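A sketch of what happens in this example: the intermediate result is slightly below 3 because 48/5.0 cannot be stored exactly, and int() truncates toward zero. Using fractions.Fraction for exact arithmetic is one alternative shown here for comparison; it is not part of the original answer.

```python
from fractions import Fraction

value = ((48 / 5.0) - 9) * 5
print(repr(value))  # slightly less than 3

# int() truncates toward zero, round() goes to the nearest integer.
truncated = int(value)
rounded = int(round(value))
print(truncated, rounded)

# Exact rational arithmetic avoids the representation error entirely.
exact = (Fraction(48, 5) - 9) * 5
print(exact)
```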
