Where I did wrong?
import pandas as pd
import numpy as np
msft = pd.read_csv("week_51.csv")
print(msft.head())
Step 1
Test your pandas installation:
import pandas as pd
pd.test()
Note: for this you need pytest, which comes with most popular distributions.
Step 2
If the test fails, install pandas. There are several methods to choose from depending on your setup.
Related
I am following a tutorial online to use Python to determine financial returns. I am getting an error "TypeError: line() got an unexpected keyword argument 'reuse_plot'" when I am following the video content. I am very new to Python and I really do not understand why.
Here is the total code:
import datetime as dt
import pandas as pd
import numpy as np
import pylab
import seaborn as sns
import scipy.stats as stats
from pandas_datareader import data as pdr
import plotly.offline as pyo
pyo.init_notebook_mode(connected=True)
pd.options.plotting.backend = 'plotly'
end = dt.datetime.now()
start = dt.datetime(2018,1,1)
df = pdr.get_data_yahoo('CBA.AX', start, end)
df.Close.plot()
This error occurs due to conflicts between pandas and plotly packages. I could solve it by upgrading the pandas to the latest version (I can confirm that for pandas==1.3.5 and plotly==5.7.0, the error is not generated anymore).
I realized that there may be something wrong in my local dev env just now.
I tried my code on colab.
it worked well.
import pandas as pd
df = pd.read_excel('hurun-2018-top50.xlsx')
thank u all.
please close this session.
------- following is original description ---------
I am trying to import excel with python and pandas.
I already pip installed "xlrd" module.
I googled a lot and tried several different methods, none of them worked.
Here is my code.
import pandas as pd
from pandas import ExcelFile
from pandas import ExcelWriter
df = pd.read_excel('hurun-2018-top50.xlsx', index_col=0)
df = pd.read_excel('hurun-2018-top50.xlsx', sheetname='Sheet1')
df = pd.read_excel('hurun-2018-top50.xlsx')
Any response will be appreciated.
Pandas has worked fine for me for years. All of a sudden, today, I am getting this error:
File "C:\Users\Excel\Anaconda3\lib\site-packages\dautil\data.py", line 3, in <module>
from pandas.io import wb
ImportError: cannot import name 'wb'
It seems like the error is coming form data.py. Here is a screen shot.
This seemed to happen all of a sudden, and the error is triggered when I run a few different processes that call this process. I uninstalled and re-installed pandas. I am still getting the same error.
The documentation says
Starting in 0.19.0, pandas no longer supports pandas.io.data or
pandas.io.wb, so you must replace your imports from pandas.io with
those from pandas_datareader:
So, as per documentation, you should be doing this:
from pandas.io import data, wb # becomes
from pandas_datareader import data, wb
Even with pandas_datareader, the same error may happen, if this your case, then you have two solutions
for Pandas >=0.23 make sure that your pandas_datareader is > = 0.7, if for some reason you don't want to upgrade pandas_datareader to 0.7, or downgrading the pandas_datareader, then alternavly, you can do:
import pandas as pd
pd.core.common.is_list_like = pd.api.types.is_list_like
import pandas_datareader as web
I am trying to work with Dask because my dataframe has become large and that pandas by itself can't simply process it. I read my dataset in as follows and get the following result that looks odd, not sure why its not outputting the dataframe:
import pandas as pd
import numpy as np
import seaborn as sns
import scipy.stats as stats
import matplotlib.pyplot as plt
import dask.bag as db
import json
%matplotlib inline
Leads = db.read_text('Leads 6.4.18.txt')
Leads
This returns (instead of my pandas dataframe):
dask.bag<bag-fro..., npartitions=1>
Then when I try to rename a few columns:
Leads_updated = Leads.rename(columns={'Business Type':'Business_Type','Lender
Type':'Lender_Type'})
Leads_updated
I get:
AttributeError: 'Bag' object has no attribute 'rename'
Can someone please explain what I am not doing correctly. The ojective is to just use Dask for all these steps since it is too big for regular Python/Pandas. My understanding is the syntax used under Dask should be the same as Pandas.
I have a problem having pandas and sklearn work together. Importing any module from sklearn, makes pandas run havoc.
This is a minimal example of my problem:
#!/usr/bin/env python
import pandas as pd
import sklearn.metrics as sk
df_train = pd.DataFrame()
print df_train
Which prints:
/usr/local/lib/python2.7/site-packages/pandas/core/config.py:570: DeprecationWarning: height has been deprecated.
warnings.warn(d.msg, DeprecationWarning)
If I comment the line where I import sklearn.metrics, everything works correctly
Help? :}
Jose
You can ignore the warning message with:
import warnings
warnings.filterwarnings("ignore", category=DeprecationWarning,
module="pandas", lineno=570)
which should be safe for now. As #Jeff notes, it'll be fixed in pandas 0.13.