I'm working with a binary file "uint64.bin" whose entire contents can be represented by: 0x0f2d1e6002df9000
My python code is as follows:
import numpy as np
import pandas as pd
my_dtype = np.dtype([('mytag','>u8')])
with open("uint64.bin", 'rb') as fh:
    data = np.fromfile(file=fh, dtype=my_dtype)
df = pd.DataFrame(data, columns=data.dtype.names)
print(df.get_values()[0])
This prints 1093563682234798080, whereas I expected 1093563682234798100 (a difference of 20, i.e. 0x14). What's going on?
I'm on Windows 64-bit and using Python 3.7.
Just do print(0x0f2d1e6002df9000) in Python. It gives you:
1093563682234798080
So the answer you're getting is correct; your assumption about what it should be is what's incorrect.
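You can verify the decode with nothing but the standard library; the same eight big-endian bytes give the same integer. (As it happens, 1093563682234798100 is the true value rounded to 17 significant digits, so the expected value likely came from a tool that displays large numbers at limited precision.)

```python
import struct

# The same eight big-endian bytes, decoded by hand: this is exactly what
# np.fromfile with '>u8' produces.
raw = bytes.fromhex('0f2d1e6002df9000')
(value,) = struct.unpack('>Q', raw)
print(value)                         # 1093563682234798080
print(value == 0x0f2d1e6002df9000)   # True
```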
I'm using pandas to apply some format-level changes to a CSV and storing the result in a target file. The source file has some integers, but after the pandas operation the integers are converted to decimals, e.g. 3 in the source file becomes 3.0. I'd like the integers to remain integers.
Any pointers on how to get this working? Will be really helpful, thank you!
import pandas as pd
# reading the csv file
df = pd.read_csv(source)
# updating the column value/data
df['Regular'] = df['Regular'].replace({',': '_,'})
# writing into the file
df.to_csv(target, index=False)
You can specify the data type for pandas read_csv(), e.g.:
df = pd.read_csv(source, dtype={'column_name_a': 'Int32', 'column_name_b': 'Int32'})
See the docs: https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html
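A minimal sketch with made-up data showing why this helps: a blank cell would normally force the whole column to float64 (so 3 becomes 3.0), while the nullable 'Int32' dtype keeps it integer.

```python
import io
import pandas as pd

# Hypothetical CSV; the blank cell in 'Regular' would otherwise coerce the column to float.
csv = io.StringIO("Regular,Other\n3,a\n,b\n7,c\n")

# 'Int32' (capital I) is pandas' nullable integer dtype: blanks become <NA>, not NaN.
df = pd.read_csv(csv, dtype={'Regular': 'Int32'})
print(df['Regular'].dtype)  # Int32
df.to_csv('target.csv', index=False)  # writes 3 and 7, not 3.0 and 7.0
```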
Before being read into pandas, the data is used in a SAS dataset. My data looks like:
SNYDJCM--integer
740.19999981
After reading into pandas, my data changes to:
SNYDJCM--converting to float
740.200000
How can I get the same value after reading into a pandas DataFrame?
Steps followed:
1) import pandas as pd
2) pd.read_sas(path, format='sas7bdat', encoding='iso-8859-1')
Need your help.
Try importing SAS7BDAT and opening your file with it before reading:
from sas7bdat import SAS7BDAT
sas_file = SAS7BDAT('FILENAME.sas7bdat')
df = pd.read_sas('FILENAME.sas7bdat', format='sas7bdat')
or use it to directly read the file:
from sas7bdat import SAS7BDAT
sas_file = SAS7BDAT('FILENAME.sas7bdat')
df = sas_file.to_data_frame()
or use pyreadstat to read the file:
import pyreadstat
df, meta = pyreadstat.read_sas7bdat('FILENAME.sas7bdat')
First, 740.19999981 is not an integer; 740 would be the nearest integer. Also, when you round 740.19999981 to 6 decimal places you get exactly 740.200000, so the value may not actually have changed. I suggest printing with higher precision and checking whether it really changed:
print("%.12f"%(x,))
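To illustrate the point without needing a SAS file, here is a sketch with the value from the question: the stored double is untouched, and only pandas' default display precision (6 digits) makes it look like 740.200000.

```python
import pandas as pd

# The value from the question, stored in a pandas Series.
s = pd.Series([740.19999981], name='SNYDJCM')

# Full precision: the underlying double has not changed.
print("%.12f" % s.iloc[0])   # 740.199999810000

# Raising the display precision makes pandas itself show more digits.
pd.set_option('display.precision', 12)
print(s)
```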
I have a binary file that I could read it using this Matlab code:
fid = fopen(path);
a = fread(fid,10e6,"float");
What is the equivalent code in Python using Numpy or Pandas libraries?
Looks like you are looking for this:
https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.fromfile.html
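A sketch of the equivalent call, assuming the file holds 4-byte floats (MATLAB's "float" is single precision, and fread returns doubles by default); 'demo.bin' here is a stand-in for the real path:

```python
import numpy as np

# Write a small stand-in for the real binary file.
np.arange(5, dtype=np.float32).tofile('demo.bin')

# MATLAB's fread(fid, 10e6, "float") reads up to 10e6 single-precision values;
# np.fromfile with dtype=np.float32 is the closest equivalent (pass count=N to
# cap the number of items, or leave the default count=-1 to read the whole file).
a = np.fromfile('demo.bin', dtype=np.float32)
a = a.astype(np.float64)  # optional: fread returns doubles by default
print(a)  # [0. 1. 2. 3. 4.]
```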
I work for a company and I recently switched from a spreadsheet package to Python. Since I am very new to Python, there are a lot of things I have difficulty grasping. Using Python, I am trying to extract data from a large CSV file (37791 rows and 316 columns). Here is a piece of code I wrote:
Solution 1
import numpy as np
import pandas as pd
df = pd.read_csv('C:\\Users\\Maxwell\\Desktop\\Test.data.csv', skiprows=1)
data=df.loc[:,['Steps','Parameter']]
This command generates a warning: DtypeWarning: Columns (0,1,2,3,...,81) have mixed types. Specify dtype option on import or set low_memory=False.
So, I found a workaround.
Solution 2
import pandas as pd
import numpy as np
df = pd.read_csv('C:\\Users\\Maxwell\\Desktop\\Test.data.csv', skiprows=1, error_bad_lines=False, index_col=False, dtype='unicode')
data=df.loc[:,['Steps','Parameter']]
Two questions:
i) I was able to get around the error, but now the columns I want (Steps & Parameter) have been converted to objects (probably due to dtype='unicode'). How can I convert the Steps column to an integer type and Parameter to a float?
ii) Some people say the dtype warning isn't really an error, but I found that when I use Solution 1 and read the CSV file, the Steps column contains some floats. The original CSV file doesn't have any floats in the Steps column. It looks as if some floats have been introduced by Python itself! Why does this happen?
(I am not able to upload the original csv file, because my company doesn't allow it!)
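For question (i), here is a hedged sketch (column names taken from the question, data made up): after reading everything as strings, pd.to_numeric converts each column, and the nullable 'Int64' dtype handles rows that fail to parse.

```python
import io
import pandas as pd

# Stand-in data; the stray string in 'Steps' mimics a mixed-type column.
csv = io.StringIO("Steps,Parameter\n1,0.5\noops,1.5\n3,2.5\n")
df = pd.read_csv(csv, dtype='unicode')  # everything arrives as strings (object)

# errors='coerce' turns unparseable entries into NaN instead of raising.
df['Parameter'] = pd.to_numeric(df['Parameter'], errors='coerce')
df['Steps'] = pd.to_numeric(df['Steps'], errors='coerce').astype('Int64')

print(df.dtypes['Steps'], df.dtypes['Parameter'])  # Int64 float64
```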
I am trying to read a file with complex numbers in the form :
data.dat
1.5795219122457646E-11-3.852906516379872E-15i -3.5949335665378405E-12-1.626143709108086E-15i
-6.720365121161621E-15-5.377186331212649E-17i -3.736251476362349E-15-3.0190920417856674E-17i
I use the following code to read the file :
import numpy as np
c_complex = np.loadtxt('data.dat', delimiter='\t', dtype=np.complex128)
But it gives me the following error :
TypeError: complex() argument must be a string or a number, not 'bytes'
What can I do to solve this problem?
Thank you very much for your help.
This seems to have been a bug in older versions of numpy (Issue). Either update your numpy to the latest version from their GitHub repository or use the function numpy.genfromtxt():
c_complex = np.genfromtxt('data.dat', delimiter='\t', dtype=np.complex128)
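Note that Python's complex() only accepts a 'j' suffix, so the 'i' suffix in the file may still trip up either loader. A possible workaround, sketched here with a stand-in file, is a converter that rewrites each field before parsing:

```python
import numpy as np

# Stand-in file in the same format as the question's data.dat.
with open('data.dat', 'w') as f:
    f.write('1.5795219122457646E-11-3.852906516379872E-15i\t'
            '-3.5949335665378405E-12-1.626143709108086E-15i\n')

def parse_complex(field):
    # Older numpy passes bytes to converters, newer passes str; handle both.
    if isinstance(field, bytes):
        field = field.decode()
    return complex(field.replace('i', 'j'))  # complex() needs 'j', not 'i'

c_complex = np.loadtxt('data.dat', delimiter='\t', dtype=np.complex128,
                       converters={0: parse_complex, 1: parse_complex})
print(c_complex.shape)  # (2,)
```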