I ran into an interesting problem. I have a MySQL database that contains some doubles with very precise decimal values (for example, 0.00895406607247756, 17 decimal places). This is scientific data so this high level of precision is very important.
I'm using MySQLdb in Python to select data from the database:
cursor.execute("SELECT * FROM " ...etc...)
for n in range(cursor.rowcount):
    row = cursor.fetchone()
    print row
For some reason, when it gets to my very precise decimals, they've been truncated to a maximum of 14 decimal places. (i.e. the previous decimal becomes 0.00895406607248)
Is there any way to get the data from MySQLdb in its original form without truncating?
What MySQL datatype are you using to store the data? Is it DECIMAL(18,17)? DECIMALs have up to 65 digits of precision.
If you set the MySQL data type to use DECIMAL(...), then MySQLdb will convert the data to a Python decimal.Decimal object, which should preserve the precision.
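As a rough illustration of the difference (a minimal sketch, not MySQLdb's actual conversion code), the snippet below compares a decimal.Decimal built from the value in the question with the same value rendered as a float using 12 significant digits. The 12-digit format is an assumption chosen because it reproduces the shortened value reported in the question.

```python
from decimal import Decimal

# The value from the question, kept as text so no digits are lost up front.
raw = "0.00895406607247756"

# A Decimal keeps every digit it was given.
as_decimal = Decimal(raw)

# Assumption: the shortened value comes from formatting a double with
# about 12 significant digits.
as_short_text = "%.12g" % float(raw)

print(as_decimal)     # 0.00895406607247756
print(as_short_text)  # 0.00895406607248
```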
Related
I queried my postgres database to retrieve information from a table:
SQLResults = cursor.execute('SELECT x.some_12_long_integer from test as x;')
Now when I run this query in database, I get 1272140958198, but when I dump this in a dataframe:
The x.some_12_long_integer is int8.
I am using xlsxwriter:
excel_writer = ExcelWriter(
    self.fullFilePath, engine="xlsxwriter",
    engine_kwargs={'options': {'strings_to_numbers': True}}
)
frame = DataFrame(SQLResults[0])
frame.to_excel(excel_writer, sheet_name="Sheet1", index=False)
When it is converted to Excel it produces 1.27214E+12 but when I format the cell in the Excel file I get 1272140958198.
How can I make it just stay as 1272140958198 instead of 1.27214E+12?
The value itself is not changed. 1.27214E+12 is only how the number is displayed; the cell still holds 1272140958198, which is why reformatting the cell shows all the digits. There are a few ways to work around this:
Convert the numeric values to strings, for example with .astype(str), before writing them. This removes the scientific notation, but the cells then contain text rather than numbers, and the strings_to_numbers option would convert them back to numbers, so it has to be turned off. Alternatively, apply an explicit number format (such as 0) to the cells so that every digit is displayed.
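A small sketch of the difference between the stored value and the way it is displayed. It uses plain Python formatting as a stand-in for the spreadsheet's display (an assumption; it does not call pandas or xlsxwriter):

```python
value = 1272140958198  # the int8 value from the question

# General-purpose "g" formatting keeps 6 significant digits by default.
short_display = format(float(value), "g")

# Converting to a string or using a fixed format keeps every digit.
as_text = str(value)
fixed_display = format(float(value), ".0f")

print(short_display)  # 1.27214e+12
print(as_text)        # 1272140958198
print(fixed_display)  # 1272140958198
```

The number is the same in all three cases; only the chosen format decides how many digits are shown.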
I have an Oracle table with columns of type VARCHAR2 (i.e. string) and of type NUMBER (i.e. a numeric value with a fractional part). The numeric columns do contain values with decimal points, not integer values.
However, when I read this table into a Pandas dataframe via pandas.read_sql, I receive the numeric columns in the data frame as int64. How can I avoid this and get float columns with the full decimal values instead?
I'm using the following versions
python : 3.7.4.final.0
pandas : 1.0.3
Oracle : 18c Enterprise Edition / Version 18.9.0.0.0
I have encountered the same thing. I am not sure if this is the reason but I assume that NUMBER type without any size restrictions is too big for pandas and it is automatically truncated to int64 or the type is improperly chosen by pandas – default NUMBER might be treated as an integer. You can limit the type of the column to e.g. NUMBER(5,4) and pandas should recognise it correctly as a float.
I also found out that using pd.read_sql gives me proper types in contrast to pd.read_sql_table.
I have a Pandas DataFrame that I'm sending to MySQL via to_sql with sqlalchemy. My floats in SQL sometimes show decimal places that are slightly off (compared to the df) and result in an error: "Warning: (1265, "Data truncated for column 'Dividend' at row 1")". How do I round the floats so that they match the value in the DataFrame?
The values are pulled from a CSV and converted from strings to floats. They appear fine when written to Excel, but when sent to SQL, the numbers are slightly off.
I've looked into the issues with floats when it comes to binary, but I can't figure out how to override that during the transfer from DataFrame to SQL.
from sqlalchemy import create_engine
import pandas as pd

def str2float(val):
    return float(val)

data = pd.read_csv(
    filepath_or_buffer=filename,
    converters={'col1': str2float}
)
db = create_engine('mysql://user:pass@host/database')
data.to_sql(con=db, name='tablename', if_exists='append', index=False)
db.dispose()
Most floats come through as something like 0.0222000000, but every once in a while one will appear as 0.0221999995. Ideally I would like it to automatically truncate all the 0s at the end, but I would settle for the first example. However, I need it to round up to match the float that was stored in the DataFrame.
I had a similar problem. The number I imported into the data-frame had 3 decimal places. But when inserted into the SQL table, it had 12 digits.
I just used .round() method and it worked for me.
df["colname"] = df["colname"].round(3)
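One possible cause of a value such as 0.0221999995 is a single-precision FLOAT column. This is an assumption, since the question does not state the column type. The sketch below uses the standard struct module to round-trip a Python float through 4 bytes, which shows the same digits as in the question. The helper name as_float32 is made up for this illustration.

```python
import struct

def as_float32(value):
    """Round-trip a Python float through 4-byte single precision."""
    return struct.unpack("<f", struct.pack("<f", value))[0]

original = 0.0222
stored = as_float32(original)  # what a single-precision column could hold

print("%.10f" % original)  # 0.0222000000
print("%.10f" % stored)    # 0.0221999995

# Rounding after reading back recovers the DataFrame value.
print(round(stored, 4) == original)  # True
```

If this is the cause, a DOUBLE or DECIMAL column would keep the value as written, so rounding on the pandas side alone may not be enough.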
I am facing the following issue when working with a sqlite database and python.
This also happens with an external program such as SQLiteStudio, so it is not specific to Python.
Suppose you have a table containing a column of type string.
If you enter an entry to that column such as 43234e4324 (so, all the entries of the kind ####e##### give the issue), the value stored in the database is converted to inf!
For sure, since e may be interpreted as the exponent, it would make sense if the column is of type float, but it is string!
Here is a working example:
import sqlite3
a = sqlite3.connect('test.sqlite')
a.execute('CREATE TABLE test (p STRING)')
a.execute('INSERT INTO test (p) VALUES (03242e4444)')
a.commit()
b = a.execute('SELECT p FROM test')
b.fetchall()
and you get [(inf,)]...
If you insert a different value without the 'e' you would get the correct string.
I want to enter a string like 43243e3423 without it being converted. How can I do this?
A number with an e in it is a floating point number in exponential notation. 43234e3423 is the notation for 43234 × 10^3423. Since that's far too big to be stored as a floating point number, you get inf.
To enter strings, you should put them in quotes. Single quotes are the standard SQL string delimiter.
a.execute("INSERT INTO test (p) VALUES ('03242e4444')")
The other problem is that there's no STRING datatype. An unrecognized datatype is treated as NUMERIC, so even if you put quotes around the value, it gets converted as if it were a number. Use TEXT instead.
a.execute('CREATE TABLE test (p TEXT)')
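Putting the two fixes together, here is a minimal sketch that uses an in-memory database. It swaps in parameter binding (the ? placeholder) instead of quoting the value inside the SQL text. That is a different technique from the one above, but it also keeps the value from being parsed as a numeric literal.

```python
import sqlite3

# In-memory database so the sketch leaves no file behind.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (p TEXT)")

# Parameter binding passes the value as a Python str, so SQLite never
# parses it as a numeric literal.
conn.execute("INSERT INTO test (p) VALUES (?)", ("03242e4444",))
conn.commit()

rows = conn.execute("SELECT p FROM test").fetchall()
print(rows)  # [('03242e4444',)]
conn.close()
```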
I have a column in MySQL which is intended to take decimal values (e.g. 0.00585431).
In Python, I have a function that gets this value from a webpage, then returns it into a variable. When I print this variable I get [u'0.00585431'] (which is strange).
I then try to insert this into the MySQL column, which is set to take a decimal(10,0) value. However, the database stores it as just 0.
The code to insert is nothing special and works for other things:
cur.execute("""INSERT INTO earnings VALUES (%s)""", (variable))
If I change the column to a string type, then it stores the whole [u'0.00585431']. So I imagine that when I try to store it as a decimal, it's not actually taking a proper decimal value and stores a 0 instead?
Any thoughts on how to fix this?
DECIMAL(10,0) gives 0 digits to the right of the decimal point, so the fractional part is lost.
The declaration syntax for a DECIMAL column remains DECIMAL(M,D),
although the range of values for the arguments has changed somewhat:
M is the maximum number of digits (the precision). It has a range of 1
to 65. This introduces a possible incompatibility for older
applications, because previous versions of MySQL permit a range of 1
to 254. (The precision of 65 digits actually applies as of MySQL
5.0.6. From 5.0.3 to 5.0.5, the precision is 64 digits.)
D is the number of digits to the right of the decimal point (the
scale). It has a range of 0 to 30 and must be no larger than M.
Try changing your column datatype to DECIMAL(10,8).
If your values will always be in the same format as 0.00585431, then DECIMAL(9,8) would suffice.
https://dev.mysql.com/doc/refman/5.0/en/precision-math-decimal-changes.html
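To see what the scale does to the value from the question, here is a minimal sketch using Python's decimal module as a stand-in for the column's rounding. This is an assumption for illustration only; it does not talk to MySQL. It also unpacks the one-element list that the question prints as [u'0.00585431'].

```python
from decimal import Decimal, ROUND_HALF_UP

# The scraped value arrives as a one-element list, so take the string out first.
variable = [u"0.00585431"]
value = Decimal(variable[0])

# Scale 0 (as in DECIMAL(10,0)): no digits after the decimal point.
scale_0 = value.quantize(Decimal("1"), rounding=ROUND_HALF_UP)

# Scale 8 (as in DECIMAL(10,8)): eight digits after the decimal point.
scale_8 = value.quantize(Decimal("0.00000001"), rounding=ROUND_HALF_UP)

print(scale_0)  # 0
print(scale_8)  # 0.00585431
```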