I am trying to convert a pandas dataframe into an rda file. I found this code to do this:
'''
import rpy2
from rpy2 import robjects
from rpy2.robjects import pandas2ri
pandas2ri.activate()
r_data = pandas2ri.py2rpy(df)
robjects.r.assign("df", r_data)
robjects.r("save(df, file='test.rda')")
'''
But I keep getting the following error:
AttributeError: 'list' object has no attribute 'encode'
I'm not sure what this means. Is it because I need to convert the df to UTF-8? If so, I'm not sure how to go about doing that.
Note: I am using Python 3.10.
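One possible cause, offered as an assumption because the DataFrame itself is not shown: when an object-dtype column holds Python lists (or other containers) rather than plain strings, the conversion to an R character vector may try to call .encode on each cell and fail with this kind of error. In that case UTF-8 is not the issue. A minimal sketch for spotting such columns with pandas alone; the helper find_non_scalar_columns and the sample data are hypothetical:

```python
import pandas as pd


def find_non_scalar_columns(df):
    """Return names of object-dtype columns that hold list, tuple, dict, or set values."""
    flagged = []
    for name in df.columns:
        col = df[name]
        # Only object-dtype columns can carry arbitrary Python containers.
        if not pd.api.types.is_object_dtype(col):
            continue
        if col.map(lambda v: isinstance(v, (list, tuple, dict, set))).any():
            flagged.append(name)
    return flagged


# Hypothetical data: the "tags" column stores a list in every cell.
df = pd.DataFrame({"id": [1, 2], "tags": [["a", "b"], ["c"]], "name": ["x", "y"]})
bad_columns = find_non_scalar_columns(df)
print(bad_columns)

# One way to make the frame convertible: turn each container into a plain string.
for name in bad_columns:
    df[name] = df[name].map(str)
```

If the helper reports any columns, flattening them (as in the last loop) or dropping them before calling py2rpy is worth trying.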
Related
I'm using a Jupyter notebook and trying to open a data file, but I keep getting this error: AttributeError: 'pandas._libs.properties.AxisProperty' object has no attribute 'unique'. This is my first time using Jupyter, so I am not familiar with errors like this.
import pandas as pd
df = pd.DataFrame
df - pd.read_csv("C:/Users/yusra/OneDrive/Documents/vgsales.csv")
df
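You are not using pd.DataFrame correctly. The line df = pd.DataFrame binds df to the DataFrame class itself rather than to any data, and the next line uses - (subtraction) where = (assignment) was intended, so the result of read_csv is never stored. Corrected code: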
import pandas as pd
df=pd.read_csv("C:/Users/yusra/OneDrive/Documents/vgsales.csv")
df
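To see the difference between the class and an actual table of data, here is a small sketch. It reads from an in-memory string instead of vgsales.csv, and the column names and rows are made up for illustration:

```python
import io

import pandas as pd

# Hypothetical stand-in for the contents of vgsales.csv.
csv_text = "Name,Genre\nGame A,Sports\nGame B,Racing\nGame C,Sports\n"

# pd.DataFrame is the class, not a table of data.
print(isinstance(pd.DataFrame, type))  # True

# pd.read_csv returns a DataFrame instance, which must be assigned with "=".
df = pd.read_csv(io.StringIO(csv_text))
print(isinstance(df, pd.DataFrame))  # True

# Column methods such as unique() work on the instance.
genres = df["Genre"].unique()
print(list(genres))
```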
I've just started a python for finance course and am brand new to programming. I'm trying to import a csv file into "fb" dataframe but it keeps giving me the following error: type object 'DataFrame' has no attribute 'read_csv'.
Here is my code:
import pandas
import pandas as pd
fb=pd.DataFrame.read_csv('data/facebook.csv')
Instead of
fb=pd.DataFrame.read_csv('data/facebook.csv')
Try
fb=pd.read_csv('data/facebook.csv')
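The reason is that read_csv is a function of the pandas module, while the DataFrame class only carries the writer, to_csv. A short sketch that checks this and does a round trip through an in-memory buffer; the sample prices are invented and do not come from facebook.csv:

```python
import io

import pandas as pd

# The reader lives on the module; the writer lives on the DataFrame class.
print(hasattr(pd, "read_csv"))            # True
print(hasattr(pd.DataFrame, "read_csv"))  # False
print(hasattr(pd.DataFrame, "to_csv"))    # True

# Hypothetical data standing in for data/facebook.csv.
original = pd.DataFrame({"Date": ["2019-01-02", "2019-01-03"], "Close": [135.5, 131.25]})

# Write to a buffer with the DataFrame method, read back with the module function.
buffer = io.StringIO()
original.to_csv(buffer, index=False)
buffer.seek(0)
fb = pd.read_csv(buffer)
print(fb)
```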
I'm trying to save a pandas DataFrame in a binary data format. My book says that all pandas objects have a save method that writes the data to disk as a pickle, but when I run the code I get an error. Is there a save method for pandas objects in newer versions of pandas? I'm using pandas 0.25.3.
import pandas as pd
frame = pd.read_csv('PandasTest.csv')
frame.save('PandasTest_Pickle')
The error is:
AttributeError: 'DataFrame' object has no attribute 'save'
As others suggested in the comments, use the DataFrame method 'to_pickle' and the function 'pd.read_pickle'. For example:
import pandas as pd
frame=pd.read_csv('data.csv')
frame.to_pickle('frame_pickle')
pd.read_pickle('frame_pickle')
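A self-contained version of the same round trip, as a sketch: it builds a small made-up frame instead of reading data.csv, writes the pickle into a temporary directory, and checks that the restored frame matches the original.

```python
import os
import tempfile

import pandas as pd

# Hypothetical data in place of data.csv.
frame = pd.DataFrame({"a": [1, 2, 3], "b": ["x", "y", "z"]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "frame_pickle")
    frame.to_pickle(path)            # replaces the old frame.save(...)
    restored = pd.read_pickle(path)  # replaces the old pd.load(...)

# Raises an AssertionError if values, dtypes, or index differ.
pd.testing.assert_frame_equal(frame, restored)
print(restored)
```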
I am trying to call the function 'Nelson.Siegel' in the 'YieldCurve' package using rpy2. 'Nelson.Siegel' takes an xts object (rates) and a list (Maturity) as inputs. It seems that I have to convert the pandas data frame into xts format, and I am not sure how to achieve that. I am also not sure whether I am calling the Nelson.Siegel function correctly. Any help will be appreciated.
I tried using pandas2ri.activate() to convert the data from pandas to R, but it seems that I need to convert it further into xts format. I tried to import as.xts from the xts package, but it doesn't work together with rpy2.
import pandas as pd
import numpy as np
from rpy2.robjects.packages import importr
import rpy2.robjects as robjects
base = importr('base')
utils = importr('utils')
utils.install_packages('YieldCurve', repos="http://cran.us.r-project.org")
Yieldcurve= importr('YieldCurve')
NelsonSiegel = robjects.r('Nelson.Siegel')
from rpy2.robjects import pandas2ri
pandas2ri.activate()
Maturity=[0.5,1,2]
df = pd.DataFrame(np.random.randint(0,30,size=(10,3)),
                  columns=["1","2","3"],
                  index=pd.date_range("20190101", periods=10))
NSParam= NelsonSiegel(df, Maturity)
Error message: Error in is.finite(if (is.character(from)) from <- as.numeric(from) else from) :
default method not implemented for type 'list'
Specifying that Maturity should be a numeric vector, rather than letting the converter assume that an R list is wanted, might solve this. Since the values include 0.5, use a FloatVector (an IntVector cannot hold non-integer values):
Maturity=robjects.vectors.FloatVector([0.5,1,2])
Otherwise, first check whether your pandas data frame is safely converted to an R data frame:
df = pd.DataFrame(np.random.randint(0,30,size=(10,3)),
                  columns=["1","2","3"],
                  index=pd.date_range("20190101", periods=10))
base.print(df)
I have some .rda files that I need to access with Python.
My code looks like this:
import rpy2.robjects as robjects
from rpy2.robjects import r, pandas2ri
pandas2ri.activate()
df = robjects.r.load("datafile.rda")
df2 = pandas2ri.ri2py_dataframe(df)
where df2 is a pandas dataframe. However, it only contains the header of the .rda file! I have searched back and forth. None of the solutions proposed seem to be working.
Does anyone have an idea how to efficiently convert an .rda dataframe to a pandas dataframe?
Thank you for your useful question. I tried the two ways proposed above to handle my problem.
For feather, I faced this issue:
pyarrow.lib.ArrowInvalid: Not a Feather V1 or Arrow IPC file
For rpy2, as mentioned by @Orange: "pandas2ri.ri2py_dataframe does not seem to exist any longer in rpy2 version 3.0.3" or later.
I searched for another workaround and found pyreadr, which worked for me and may help others facing the same problems: https://github.com/ofajardo/pyreadr
Usage: https://gist.github.com/LeiG/8094753a6cc7907c716f#gistcomment-2795790
pip install pyreadr
import pyreadr
result = pyreadr.read_r('/path/to/file.RData') # also works for .rds and .rda files
# done! let's see what we got
# result is a dictionary where the keys are the names of the objects and the
# values are Python objects
print(result.keys()) # let's check what objects we got
df1 = result["df1"] # extract the pandas data frame for object df1
You could try using the feather library, a language-agnostic data frame file format that can be used from either R or Python. In R:
# Install feather
devtools::install_github("wesm/feather/R")
library(feather)
path <- "your_file_path"
write_feather(datafile, path)
Then install it for Python:
$ pip install feather-format
And load your data file:
import feather
path = 'your_file_path'
datafile = feather.read_dataframe(path)
As mentioned, consider converting the .rda file into individual .rds objects using R's mget or eapply to build a Python dictionary of data frames.
RPy2
import os
import pandas as pd
import rpy2.robjects as robjects
from rpy2.robjects import pandas2ri
from rpy2.robjects.packages import importr
pandas2ri.activate()
base = importr('base')
base.load("datafile.rda")
rdf_List = base.mget(base.ls())
# ITERATE THROUGH LIST OF R DFs
pydf_dict = {}
for i,f in enumerate(base.names(rdf_List)):
    pydf_dict[f] = pandas2ri.ri2py_dataframe(rdf_List[i])
for k,v in pydf_dict.items():
    print(v.head())