I am trying to read in some currency data from an Excel sheet whose values have 10 decimal places, but when I read it into the DataFrame everything drops to 2 decimal places, which I don't want. Any help on how to get all the decimal places back would be great.
import pandas as pd
import numpy as np
import xlrd
import xlsxwriter
df = pd.read_excel('Currency Data.xls','PPU')
df.head()
I would suggest looking at Python's decimal module:
https://docs.python.org/2/library/decimal.html
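The key point is that Decimal built from the cell's text keeps every digit exactly, while a binary float can only approximate most 10-decimal values. A minimal sketch (the converter line is an assumption for illustration, not tested against your file):

```python
from decimal import Decimal

# A Decimal built from a string keeps all 10 decimal places exactly
exact = Decimal("740.1234567891")
assert str(exact) == "740.1234567891"

# With pandas you could build Decimals while reading, e.g. (hypothetical
# column name 'Price'; the converter receives the raw cell value):
# df = pd.read_excel('Currency Data.xls', 'PPU',
#                    converters={'Price': lambda v: Decimal(str(v))})
```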
I need to analyze some data with Python. I created the CSV file in Excel beforehand, but when I display the table with Python, one column shows all its numbers as decimals. The file looks fine in Excel; it only appears as decimals in Python. Please help!
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
df = pd.read_csv("covıd.csv", sep=';', encoding='latin-1')
df
The Total Case column shows up as decimals no matter what I try. It wasn't like this before.
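A common reason a count column turns into floats is that it contains missing values, which plain integer dtypes cannot hold. One sketch of a fix, assuming the column is named "Total Case", is pandas' nullable integer dtype:

```python
import pandas as pd

# Stand-in for the CSV column: floats because of a missing value
df = pd.DataFrame({"Total Case": [740.0, 12.0, None]})

# "Int64" (capital I) is pandas' nullable integer dtype: it keeps
# whole numbers while still allowing missing entries
df["Total Case"] = df["Total Case"].astype("Int64")
print(df["Total Case"].tolist())
```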
Before reading it into pandas, the data is stored in a SAS dataset. My data looks like:
SNYDJCM -- integer
740.19999981
After reading it into pandas, my data changes as below:
SNYDJCM -- converted to float
740.200000
How can I get the same value after reading it into a pandas DataFrame?
Steps followed:
1) import pandas as pd
2) pd.read_sas(path, format='sas7bdat', encoding='iso-8859-1')
Need your help.
Try importing SAS7BDAT and opening your file before reading it:
from sas7bdat import SAS7BDAT
SAS7BDAT('FILENAME.sas7bdat')
df = pd.read_sas('FILENAME.sas7bdat',format='sas7bdat')
or use it to directly read the file:
from sas7bdat import SAS7BDAT
sas_file = SAS7BDAT('FILENAME.sas7bdat')
df = sas_file.to_data_frame()
or use pyreadstat to read the file:
import pyreadstat
df, meta = pyreadstat.read_sas7bdat('FILENAME.sas7bdat')
First, 740.19999981 is not an integer; 740 would be the nearest integer. Also, when you round 740.19999981 to 6 decimal places you get 740.200000, so the value may only look changed. I would suggest printing it with higher precision and checking whether it really changed:
print("%.12f"%(x,))
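To illustrate the point above: the stored value is unchanged, only the default display rounds it. A quick check:

```python
x = 740.19999981

print("%.6f" % x)   # 740.200000 -> looks rounded at 6 decimal places
print("%.12f" % x)  # 740.199999810000 -> the digits are still there
```

In pandas itself, `pd.set_option('display.precision', 12)` widens the default DataFrame display in the same way.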
I'm loading some excel files with pandas.read_excel() and converting to a numpy array with .to_numpy().
Fast forward: I solved the problem by specifying dtype=object, but I'm still curious about what triggers this behaviour.
Here is a simplified version. The Excel file contains a single column with the values true, 0 and 4.4. Then I use:
import pandas as pd
import numpy as np
data = pd.read_excel('test_file2.xlsx', sheet_name='other').to_numpy()
print(data)
And the outcome is a string, an integer, and a float.
[['true']
[0]
[4.4]]
However, if both numbers in the Excel sheet are entered as floats, then this is the outcome:
[[True]
[nan]
[nan]]
Can anyone explain why in the second case there is such a conversion which basically leads to losing the number values?
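Part of the mechanism (NumPy's ordinary type promotion, independent of this particular file) can be seen without Excel at all: when values are handed to NumPy without an explicit dtype, it picks one common dtype for every cell, so a bool mixed into a numeric array is coerced; dtype=object keeps each cell's original Python type:

```python
import numpy as np

mixed = [True, 0.0, 4.4]

# Without an explicit dtype NumPy promotes everything to float64,
# so the bool True silently becomes 1.0
print(np.array(mixed))

# dtype=object stores the Python objects themselves, unchanged
print(np.array(mixed, dtype=object))
```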
Background: The following code works to export a pandas df as an excel file:
import pandas as pd
import xlsxwriter
writer = pd.ExcelWriter('Excel_File.xlsx', engine='xlsxwriter')
df.to_excel(writer, sheet_name='Sheet1')
writer.save()
Problem:
My ID column in the Excel file shows up as
8.96013E+17 instead of 896013350764773376
I tried to alter it in Excel using the cell number format, but it still gives the wrong ID: 896013350764773000.
Question: Using Excel or Python code, how do I keep my original 896013350764773376 ID format?
Excel uses IEEE754 doubles to represent numbers and they have 15 digits of precision. So you are not going to be able to represent an 18 digit id as a number in Excel. You will need to convert it to a string to maintain all the digits.
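Following that advice, casting the column to str before exporting keeps all 18 digits, since Excel leaves text cells alone. A sketch using the ID from the question:

```python
import pandas as pd

df = pd.DataFrame({"ID": [896013350764773376]})

# Cast to text before writing; Excel will not round a string cell
df["ID"] = df["ID"].astype(str)
print(df["ID"].iloc[0])  # 896013350764773376

# then export as before:
# df.to_excel(writer, sheet_name='Sheet1')
```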
Is there a way to export data from SPSS to CSV including a zero before the decimal point?
Currently, I have ".41" and I would like to export "0.41" into my CSV file.
Any suggestion?
It seems difficult to do it directly in SPSS.
One possible answer: using python + pandas.
import pandas as pd

def add_leading_zero_to_csv(path_to_csv_file):
    # open the file
    df_csv = pd.read_csv(path_to_csv_file)
    # you can possibly specify the format of a column if needed
    df_csv['specific_column'] = df_csv['specific_column'].map(lambda x: '%.2f' % x)
    # save the file (with a precision of 3 for all the floats)
    df_csv.to_csv(path_to_csv_file, index=False, float_format='%.3g')
More info about the "g" formatting: Format Specification Mini-Language.
And be careful with floating point representation issues (e.g., see the answer to this question).
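The leading zero itself comes from Python's float formatting, which always writes a digit before the decimal point:

```python
# Both the old %-style and str.format specs emit the leading zero
print('%.2f' % .41)          # 0.41
print('{:.3g}'.format(.41))  # 0.41
```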