If I supply ("pandas", "pd"), is it possible to obtain the equivalent of import pandas as pd programmatically?
I do know that I can do:
import importlib
pd = importlib.import_module('pandas')
But can I also supply the alias pd as a string?
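Technically, yes, and in more than one way:
locals()['pd'] = importlib.import_module('pandas')
Of course, this is rather bad practice. Assigning through locals() is only reliable at module level; inside a function, use globals() or a plain dictionary instead.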
Yes.
name = 'pd'
exec(f'import pandas as {name}')
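A less fragile variant of the same idea is to bind the module into an explicit dictionary, so that nothing is written into a live namespace behind the reader's back. The sketch below is only an illustration: import_as and namespace are hypothetical names, and the standard-library json module stands in for pandas so that it runs without third-party packages.

```python
import importlib


def import_as(module_name, alias, namespace):
    """Import module_name and bind it to alias in the given namespace dict."""
    module = importlib.import_module(module_name)
    namespace[alias] = module
    return module


# Hypothetical usage: "json" stands in for "pandas", "js" for "pd".
namespace = {}
import_as("json", "js", namespace)
print(namespace["js"].dumps({"a": 1}))
```

Passing globals() as the namespace argument at module level would make the alias available as an ordinary name, with the same caveats as the approaches above.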
I am running this in a Jupyter notebook. With just the file path it works fine, but when I try to specify a sheet it raises an error. What is the right syntax to make this parameter (and, I guess, other parameters) work?
import pandas as pd
import datetime
import numpy as np
df = pd.DataFrame(pd.read_excel(the file path's name would be here), sheet_name='the sheets name'
)
df
As Yuca said in the comments, you need to remove the pd.DataFrame wrapper: sheet_name is a parameter of pd.read_excel, which already returns a DataFrame.
Your code should look like this:
import pandas as pd
import datetime
import numpy as np
df = pd.read_excel('C:/path/to/file.xlsx', sheet_name='the sheets name')
import pandas as pd
tbl1 = pd.import_csv('sample_prices.csv')
tbl1.print()
I run the code above and still do not get any output. It does not even raise an error.
The code might be written this way.
import pandas as pd
tbl1 = pd.read_csv('sample_prices.csv')
print(tbl1)
pd.import_csv doesn't exist. You probably meant to use pd.read_csv instead.
import pandas as pd
tbl1 = pd.read_csv('sample_prices.csv')
print(tbl1)  # a DataFrame has no .print() method; use the built-in print
That said, I'm not sure why it wouldn't raise an error...
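For reference, a quick check (a sketch, assuming a standard pandas installation) shows that the misspelled call normally fails with an AttributeError as soon as the attribute is looked up, so no CSV file is needed to reproduce it:

```python
import pandas as pd

try:
    # The attribute lookup fails before the file name is ever used.
    pd.import_csv("sample_prices.csv")
except AttributeError as exc:
    message = str(exc)
    print(message)
```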
If you have a custom function called import_csv, you'll want to call it like this:
import pandas as pd
tbl1 = import_csv('sample_prices.csv')
print(tbl1)
...without the pd. prefix.
It should be written like this:
import pandas as pd
tbl1 = pd.read_csv('sample_prices.csv')
print(tbl1)
I have several scripts in a project folder, most of which use the same handful of standard libraries and modules. Instead of having to reiterate in every single script
import pandas as pd
import numpy as np
import datetime
import re
etc
etc
is it possible for me to place all import statements in a masterImports.py file and simply import masterImports at the top of each script?
Yes, you can.
The basic idea is to import all the libraries in one file, then import that file.
An example:
masterImports.py
import pandas as pd
import numpy as np
import datetime
import re
etc
etc
otherFile.py
import masterImports as mi
print(mi.datetime.datetime(2021,7,20))
Or you could use wildcard imports:
from masterImports import * # OR from masterImports import important_package
print(datetime.datetime(2021,7,20))
Avoid wildcard (asterisk) imports, though, because they can cause name clashes.
Try this, and you will see that there is no error
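The name-clash risk of wildcard imports can be seen in a small sketch. The two modules here, imports_a and imports_b, are hypothetical and are built in memory purely for illustration; exec is used only so that the wildcard imports run inside an isolated dictionary.

```python
import sys
import types

# Two hypothetical "master import" modules that both export the name `datetime`.
mod_a = types.ModuleType("imports_a")
exec("import datetime", mod_a.__dict__)                # binds the module
mod_b = types.ModuleType("imports_b")
exec("from datetime import datetime", mod_b.__dict__)  # binds the class
sys.modules["imports_a"] = mod_a
sys.modules["imports_b"] = mod_b

# Run both wildcard imports into one isolated namespace.
namespace = {}
exec("from imports_a import *", namespace)
first = namespace["datetime"]
exec("from imports_b import *", namespace)
second = namespace["datetime"]

# The second wildcard import silently replaced the first binding.
print(type(first).__name__, type(second).__name__)
```

No error is raised at any point, which is exactly why such clashes are easy to miss: code written against the module (datetime.datetime(...)) breaks only later, once the name refers to the class.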
It's possible, although it's not really the done thing.
To use it, you'd need to put this at the top of each script:
from master_imports import *
"test.csv" has columns "col_a", "col_b" and "col_c".
# import pandas
import pandas as pd
df = pd.read_csv('./data/test.csv',header=0,dtype={'col_a':object,'col_b':object,'col_c':object})
This code works well. However, I would like to pass the dtype mapping through a variable, "key_word", as follows, and that version does not work. Why? How should I modify this code?
# import pandas
import pandas as pd
key_word='col_a':object,'col_b':object,'col_c':object
df = pd.read_csv('./data/test.csv',header=0,dtype={key_word})
Make key_word a dictionary by initializing it like this:
key_word={'col_a':object,'col_b':object,'col_c':object}
Then pass it as dtype=key_word, without the extra curly brackets around it. That should do the trick. Right now it cannot possibly work, because a list of key:value pairs without curly brackets is a syntax error.
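Put together, a minimal sketch of the corrected call could look as follows. The inline CSV text is made-up sample data standing in for ./data/test.csv, so that the snippet runs on its own.

```python
import io

import pandas as pd

# Made-up sample rows standing in for the contents of ./data/test.csv.
csv_text = "col_a,col_b,col_c\n001,x,1.5\n002,y,2.5\n"

# The dtype mapping is an ordinary dictionary held in a variable.
key_word = {"col_a": object, "col_b": object, "col_c": object}

# Pass the dictionary itself: dtype=key_word, not dtype={key_word}.
df = pd.read_csv(io.StringIO(csv_text), header=0, dtype=key_word)
print(df)
```

Because every column is read as object, values such as 001 keep their leading zeros instead of being parsed as integers.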
I have some .rda files that I need to access with Python.
My code looks like this:
import rpy2.robjects as robjects
from rpy2.robjects import r, pandas2ri
pandas2ri.activate()
df = robjects.r.load("datafile.rda")
df2 = pandas2ri.ri2py_dataframe(df)
where df2 is a pandas dataframe. However, it only contains the header of the .rda file! I have searched back and forth. None of the solutions proposed seem to be working.
Does anyone have an idea how to efficiently convert an .rda dataframe to a pandas dataframe?
Thank you for your useful question. I tried the two ways proposed above to handle my problem.
For feather, I faced this issue:
pyarrow.lib.ArrowInvalid: Not a Feather V1 or Arrow IPC file
For rpy2, as mentioned by #Orange: "pandas2ri.ri2py_dataframe does not seem to exist any longer in rpy2 version 3.0.3" or later.
I searched for another workaround and found pyreadr useful for me and maybe for those who are facing the same problems as I am: https://github.com/ofajardo/pyreadr
Usage: https://gist.github.com/LeiG/8094753a6cc7907c716f#gistcomment-2795790
pip install pyreadr
import pyreadr
result = pyreadr.read_r('/path/to/file.RData') # also works for Rds, rda
# done! let's see what we got
# result is a dictionary where the keys are the names of the objects
# and the values are Python objects
print(result.keys()) # let's check what objects we got
df1 = result["df1"] # extract the pandas data frame for object df1
You could try the feather library, which was developed as a language-agnostic data frame format that can be used from either R or Python. First, in R:
# Install feather
devtools::install_github("wesm/feather/R")
library(feather)
path <- "your_file_path"
write_feather(datafile, path)
Then install it for Python:
$ pip install feather-format
And load your data file:
import feather
path = 'your_file_path'
datafile = feather.read_dataframe(path)
As mentioned, consider converting the .rda file into individual .rds objects using R's mget or eapply to build a Python dictionary of DataFrames.
RPy2
import os
import pandas as pd
import rpy2.robjects as robjects
from rpy2.robjects import pandas2ri
from rpy2.robjects.packages import importr
pandas2ri.activate()
base = importr('base')
base.load("datafile.rda")
rdf_List = base.mget(base.ls())
# ITERATE THROUGH LIST OF R DFs
pydf_dict = {}
for i, f in enumerate(base.names(rdf_List)):
    pydf_dict[f] = pandas2ri.ri2py_dataframe(rdf_List[i])
for k, v in pydf_dict.items():
    print(v.head())