When using the feather package (http://blog.cloudera.com/blog/2016/03/feather-a-fast-on-disk-format-for-data-frames-for-r-and-python-powered-by-apache-arrow/) to try and write a simple 20x20 dataframe, I keep getting an error stating that strided data isn't yet supported. I don't believe my data is strided (or out of the ordinary), and I can replicate the sample code given on the website, but can't seem to get it to work with my own. Here is some sample code:
import feather
import numpy as np
import pandas as pd
tempArr = np.reshape(np.arange(400), (20,20))
df = pd.DataFrame(tempArr)
feather.write_dataframe(df, 'test.feather')
The last line returns the following error:
FeatherError: Invalid: no support for strided data yet
I am running this on Ubuntu 14.04. Am I perhaps misunderstanding something about how pandas dataframes are stored?
Please come to GitHub: https://github.com/wesm/feather/issues/97
Bug reports do not belong on StackOverflow
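For readers wondering what "strided" means here, the following is a minimal NumPy-only sketch. It shows that a single column taken from a row-major 20x20 array is a strided (non-contiguous) view, and that np.ascontiguousarray produces a packed copy. Whether this is exactly the condition feather rejects is an assumption; the sketch does not call feather at all.

```python
import numpy as np

# Same array as in the question.
arr = np.reshape(np.arange(400), (20, 20))

# A column is a view that skips one full row (20 items) between elements.
col = arr[:, 0]
print(col.strides, arr.itemsize)
print(col.flags["C_CONTIGUOUS"])  # False

# A packed copy places the elements next to each other in memory.
packed = np.ascontiguousarray(col)
print(packed.strides == (packed.itemsize,))
print(np.array_equal(packed, col))
```

The values are identical in both arrays; only the memory layout differs.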
I intended to write code that displays a table/DataFrame in a GUI (Kivy). I found a solution here; it uses an unofficial package from a GitHub repo, dfgui.
The problem occurred when I ran the code as described in the StackOverflow link. It returned this error:
wx._core.PyAssertionError: C++ assertion "!items.IsEmpty()" failed at
/usr/include/wx-3.0/wx/ctrlsub.h(154) in InsertItems(): need something
to insert
I broke down the problem by selective execution, in the following way:
import dfgui
import pandas as pd
xls = pd.read_excel('Res.xls')
df = pd.DataFrame(xls)
dfgui.show(df)
#dfgui.show(xls) Apparently the same as df
which then returned
TypeError: String or Unicode type required
and led me to this link, which I couldn't understand much of.
Please point me in the right direction; a different solution would be great too.
I work for a company and recently switched from a spreadsheet package to Python. Since I am very new to Python, there are a lot of things I have difficulty grasping. Using Python, I am trying to extract data from a large CSV file (37791 rows and 316 columns). Here is a piece of code I wrote:
Solution 1
import numpy as np
import pandas as pd
df=pd.read_csv('C:\\Users\\Maxwell\\Desktop\\Test.data.csv',skiprows=1)
data=df.loc[:,['Steps','Parameter']]
This command generates an error, i.e., it gives a DtypeWarning: Columns (0,1,2,3........81) have mixed types. Specify dtype option on import or set low_memory=False
So, I found a workaround.
Solution 2
import pandas as pd
import numpy as np
df=pd.read_csv('C:\\Users\\Maxwell\\Desktop\\Test.data.csv',skiprows=1,error_bad_lines=False, index_col=False, dtype='unicode')
data=df.loc[:,['Steps','Parameter']]
Two questions:
i) I was able to get around the error, but now the columns that I want (Steps and Parameter) have been converted to objects (probably due to the dtype='unicode' argument). How can I convert the Steps column to an integer type and the Parameter column to a float?
ii) Some people say that the dtype warning isn't really an error. But I found that when I read the CSV file with Solution 1, the Steps column contains some floats. The original CSV file doesn't have any floats in the Steps column. It looks as if some floats have been inserted by Python itself! Why does this happen?
(I am not able to upload the original csv file, because my company doesn't allow it!)
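As a hedged sketch for both questions: pd.to_numeric with errors="coerce" converts text columns to numbers and turns unparseable entries into NaN. The small CSV below is a hypothetical stand-in for the real file, with one bad value and one missing value. It also shows one possible reason for floats appearing in an integer column: NaN is a float, so an integer column with a missing entry is read as float64. Whether that is what happened in the actual file is an assumption.

```python
import io
import pandas as pd

# Hypothetical stand-in for the real file: one bad value, one missing value.
csv_text = "Steps,Parameter\n1,0.5\n2,oops\n,0.9\n4,1.25\n"

# Question ii): with type inference, the missing Steps entry becomes NaN,
# which forces the whole column to a float dtype.
inferred = pd.read_csv(io.StringIO(csv_text))
print(inferred["Steps"].dtype)

# Question i): read everything as text (as in Solution 2), then convert.
raw = pd.read_csv(io.StringIO(csv_text), dtype=str)
steps = pd.to_numeric(raw["Steps"], errors="coerce")          # bad values -> NaN
parameter = pd.to_numeric(raw["Parameter"], errors="coerce")  # "oops" -> NaN

# Integers cannot hold NaN, so drop rows without a Steps value before casting.
converted = pd.DataFrame({"Steps": steps, "Parameter": parameter})
clean = converted.dropna(subset=["Steps"]).astype({"Steps": "int64"})
print(clean.dtypes)
```

Dropping rows is only one choice; filling the gaps with a sentinel value before casting would also work, depending on what a missing step means in the data.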
I'm trying to use the pandas read_sas() function.
First, I create a SAS dataset by running this code in SAS:
libname tmp 'c:\temp';
data tmp.test;
do i=1 to 100;
x=rannor(0);
output;
end;
run;
Now, in IPython, I do this:
import numpy as np
import pandas as pd
%cd C:\temp
pd.read_sas('test.sas7bdat')
Pretty straightforward and seems like it should work. But I just get this error:
TypeError: read() takes at most 1 argument (2 given)
What am I missing here? I'm using pandas version 0.18.0.
According to the issue report linked below, this bug will be fixed in 0.18.1.
https://github.com/pydata/pandas/issues/12647
I am trying to pickle a DataFrame with
import pandas as pd
from pandas import DataFrame
data = pd.read_table('Purchases.tsv',index_col='coreuserid')
data.to_pickle('Purchases.pkl')
I have been running on "data" for a while and have had no issues so I know it is not a data corruption issue. I am thinking likely syntax but I have tried a number of variants. I hesitate to give the whole error message but it ends with:
\pickle.pyc in to_pickle(obj, path)
13 """
14 with open(path, 'wb') as f:
15 pkl.dump(obj, f, protocol=pkl.HIGHEST_PROTOCOL)
SystemError: error return without exception set
The Purchases.pkl file is created but if I call
data = pd.read_pickle('Purchases.pkl')
I get EOFError. I am using Canopy 1.4 so pandas 0.13.1 which should be recent enough to have this functionality.
Fast forward a few years, and now it works fine. Thanks pandas ;)
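For reference, a minimal sketch of the same round trip on a recent pandas. The small frame and the temporary directory are stand-ins for the real Purchases.tsv data, which is not available here.

```python
import os
import tempfile
import pandas as pd

# Hypothetical stand-in for the data read from Purchases.tsv.
data = pd.DataFrame(
    {"coreuserid": [101, 102, 103], "amount": [9.99, 25.0, 3.5]}
).set_index("coreuserid")

# Write the pickle to a temporary directory, then read it back.
path = os.path.join(tempfile.mkdtemp(), "Purchases.pkl")
data.to_pickle(path)
restored = pd.read_pickle(path)

# True when index, columns, dtypes and values all survived the round trip.
print(restored.equals(data))
```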
You can try creating a class from your DataFrame and pickling it afterwards.
This can help you:
Pass pandas dataframe into class
I came across a DF file that is encoded in a binary format. When I open it in Vim, I can still see strings such as "pandas.core.frame" and "numpy.core.multiarray", so I guess it is related to Python. However, I know little about Python. I tried using the pandas and numpy modules, but I failed to read the file. Could you give any suggestions on this issue? Thank you in advance. Here is the Dropbox link to the DF file: https://www.dropbox.com/s/b22lez3xysvzj7q/flux.df
This looks like a DataFrame stored with pickle; use read_pickle() to read it:
import pandas as pd
df = pd.read_pickle('flux.df')