scikitlearn breaks pandas installation

I have a problem getting pandas and sklearn to work together. Importing any module from sklearn makes pandas wreak havoc.
This is a minimal example of my problem:
#!/usr/bin/env python
import pandas as pd
import sklearn.metrics as sk
df_train = pd.DataFrame()
print df_train
Which prints:
/usr/local/lib/python2.7/site-packages/pandas/core/config.py:570: DeprecationWarning: height has been deprecated.
warnings.warn(d.msg, DeprecationWarning)
If I comment out the line where I import sklearn.metrics, everything works correctly.
Help? :}
Jose

You can ignore the warning message with:
import warnings
warnings.filterwarnings("ignore", category=DeprecationWarning,
module="pandas", lineno=570)
which should be safe for now. As @Jeff notes, it'll be fixed in pandas 0.13.
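If you would rather not change the warning filters for the whole process, a scoped variant is possible with the standard library's warnings.catch_warnings context manager. The following is only a minimal sketch: noisy() and call_quietly() are hypothetical names, and noisy() is a stand-in for whichever pandas call triggers the warning, not real pandas code.

```python
import warnings


def noisy():
    """Hypothetical stand-in for a library call that emits a DeprecationWarning."""
    warnings.warn("height has been deprecated.", DeprecationWarning)
    return "ok"


def call_quietly(func):
    """Run func with DeprecationWarnings ignored, then restore the previous filters."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", category=DeprecationWarning)
        return func()


# The warning is suppressed only for the duration of this call.
result = call_quietly(noisy)
print(result)
```

Outside of call_quietly() the original filters apply again, so other deprecation warnings remain visible.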

Related

error No module named 'xlrd'. how to import excel with python and pandas properly? please close this

I just realized that something may be wrong in my local dev environment.
I tried my code on Colab and it worked well:
import pandas as pd
df = pd.read_excel('hurun-2018-top50.xlsx')
Thank you all.
Please close this question.
------- The original description follows ---------
I am trying to import an Excel file with Python and pandas.
I already pip installed the "xlrd" module.
I googled a lot and tried several different methods, but none of them worked.
Here is my code:
import pandas as pd
from pandas import ExcelFile
from pandas import ExcelWriter
df = pd.read_excel('hurun-2018-top50.xlsx', index_col=0)
df = pd.read_excel('hurun-2018-top50.xlsx', sheetname='Sheet1')
df = pd.read_excel('hurun-2018-top50.xlsx')
Any response will be appreciated.
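Since the same code worked on Colab, one plausible explanation (an assumption, not confirmed in the question) is that pip installed xlrd into a different interpreter than the one running the script. A minimal sketch for checking this is shown below; can_import() is a hypothetical helper. openpyxl is included in the check because newer pandas releases read .xlsx files through openpyxl rather than xlrd.

```python
import importlib
import sys


def can_import(name):
    """Return True if the module can be imported by the running interpreter."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


# Show which interpreter is running, then whether it can see each Excel engine.
print("interpreter:", sys.executable)
for module_name in ("xlrd", "openpyxl"):
    print(module_name, "importable:", can_import(module_name))
```

If the interpreter path is not the one you expected, installing with that interpreter's own pip (python -m pip install xlrd) targets the right environment.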

Error in data.py module "cannot import name 'wb'"

Pandas has worked fine for me for years. All of a sudden, today, I am getting this error:
File "C:\Users\Excel\Anaconda3\lib\site-packages\dautil\data.py", line 3, in <module>
from pandas.io import wb
ImportError: cannot import name 'wb'
It seems like the error is coming from data.py. Here is a screenshot.
This seemed to happen all of a sudden, and the error is triggered when I run a few different processes that call this process. I uninstalled and re-installed pandas. I am still getting the same error.

The documentation says
Starting in 0.19.0, pandas no longer supports pandas.io.data or
pandas.io.wb, so you must replace your imports from pandas.io with
those from pandas_datareader:
So, as per documentation, you should be doing this:
from pandas.io import data, wb # becomes
from pandas_datareader import data, wb

Even with pandas_datareader, an import error may still occur. If this is your case, you have two solutions.
For pandas >= 0.23, make sure that your pandas_datareader is >= 0.7. If for some reason you don't want to change your pandas_datareader version, then alternatively you can do:
import pandas as pd
pd.core.common.is_list_like = pd.api.types.is_list_like
import pandas_datareader as web
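To find out which of the two solutions applies, you can compare the installed versions first. This is a rough sketch using only the standard library (importlib.metadata, Python 3.8+); installed_version(), version_tuple() and needs_datareader_upgrade() are hypothetical helper names, and the thresholds are simply the ones quoted in this answer.

```python
import re
from importlib import metadata


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_tuple(version):
    """Turn '0.23.4' into (0, 23, 4); stop at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def needs_datareader_upgrade(pandas_version, datareader_version):
    """True for pandas >= 0.23 combined with pandas_datareader < 0.7."""
    return (version_tuple(pandas_version) >= (0, 23)
            and version_tuple(datareader_version) < (0, 7))


pandas_v = installed_version("pandas")
reader_v = installed_version("pandas-datareader")
print("pandas:", pandas_v, "pandas-datareader:", reader_v)
if pandas_v and reader_v:
    print("upgrade needed:", needs_datareader_upgrade(pandas_v, reader_v))
```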

Using Dask with Python causes issues when running Pandas code

I am trying to work with Dask because my dataframe has become large and pandas by itself simply can't process it. I read my dataset in as follows and get a result that looks odd. I'm not sure why it's not outputting the dataframe:
import pandas as pd
import numpy as np
import seaborn as sns
import scipy.stats as stats
import matplotlib.pyplot as plt
import dask.bag as db
import json
%matplotlib inline
Leads = db.read_text('Leads 6.4.18.txt')
Leads
This returns (instead of my pandas dataframe):
dask.bag<bag-fro..., npartitions=1>
Then when I try to rename a few columns:
Leads_updated = Leads.rename(columns={'Business Type':'Business_Type','Lender Type':'Lender_Type'})
Leads_updated
I get:
AttributeError: 'Bag' object has no attribute 'rename'
Can someone please explain what I am not doing correctly? The objective is to use Dask for all these steps, since the data is too big for regular Python/pandas. My understanding is that the syntax used under Dask should be the same as in pandas.

AttributeError: module 'pandas' has no attribute 'read_csv'

Where did I go wrong?
import pandas as pd
import numpy as np
msft = pd.read_csv("week_51.csv")
print(msft.head())

Step 1
Test your pandas installation:
import pandas as pd
pd.test()
Note: for this you need pytest, which comes with most popular distributions.
Step 2
If the test fails, install pandas. There are several methods to choose from depending on your setup.
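Before reinstalling, it may be worth checking where the pandas module is actually being loaded from. A frequent cause of this kind of AttributeError (an assumption here, the question does not confirm it) is a local file or folder named pandas that shadows the installed package. The helpers below are a hypothetical sketch using only the standard library.

```python
import importlib.util
import os


def module_origin(name):
    """Return the file a module would be loaded from, or None if it is not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


def shadowed_by_local_file(name, directory="."):
    """True if directory contains name.py or a name/ folder that could shadow a package."""
    return (os.path.exists(os.path.join(directory, name + ".py"))
            or os.path.isdir(os.path.join(directory, name)))


# A path inside your project instead of site-packages suggests shadowing.
print("pandas would be loaded from:", module_origin("pandas"))
print("local pandas file or folder present:", shadowed_by_local_file("pandas"))
```

If the printed path points into your own project, renaming that file or folder (and removing any stale pandas.pyc next to it) is the usual remedy.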

Scatter_Matrix Will Not Display Using Pandas and

I am working through the following machine learning tutorial:
http://machinelearningmastery.com/machine-learning-in-python-step-by-step/
Specifically, Section 4.2. Unfortunately, my code is throwing an error:
NameError: name 'scatter_matrix' is not defined
Here is my code:
import pandas
import pandas as pd
import matplotlib
import matplotlib.pyplot as plt
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"
names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'class']
dataset = pandas.read_csv(url, names=names)
scatter_matrix(dataset)
plt.show()
There's at least one Stack Overflow question on scatter_matrix, but I haven't been able to figure out what's missing:
Pandas scatter_matrix - plot categorical variables

You will have to import it like this:
from pandas.plotting import scatter_matrix

Because you've imported pandas as pd, you could also call it like this:
pd.scatter_matrix(dataset)
However, pandas.scatter_matrix() is deprecated. Use pandas.plotting.scatter_matrix() instead.
