How can I fix this Import Error Seaborn Heatmap? - python

I am new to Python and I am trying to create a heatmap from a pivot table. Below is the code I am using. There were NaN values in my df, but I replaced them with 0 and I still get the same error. I have also read that it may have something to do with naming my file the same as a module in Python's standard library, but I do not recall doing this, nor do I know how to change it.
sns.heatmap(pivot, annot = True)
plt.yticks(rotation=0)
plt.xticks(rotation=90)
plt.show()
The error looks like this:
ImportError: cannot import name 'roperator'
These are the imports I have done:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori
from mlxtend.frequent_patterns import association_rules
import matplotlib.pyplot as plt
import seaborn as sns
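One way to check the file-naming theory mentioned above is to ask Python where each module would be loaded from. This is only a minimal sketch using the standard library; the helper name module_origin is made up for illustration, and it does not by itself explain the roperator error. If a printed path points into your own project folder instead of the site-packages directory, a local file is probably shadowing the real library.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be imported from, or None.

    `module_origin` is a hypothetical helper for this sketch, not part of
    pandas or seaborn. It only locates the module; it does not import it.
    """
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# Print where each library from the question would be loaded from.
for name in ("pandas", "numpy", "matplotlib", "seaborn"):
    print(name, "->", module_origin(name))
```

A result of None means the module could not be found at all in the current environment.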

Related

sns.regplot does not show the fitted regression line

I am new to Python and have a problem that I want to solve.
I used the following code:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
path='https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/DA0101EN/automobileEDA.csv'
df = pd.read_csv(path)
sns.regplot(x="engine-size", y="price", data=df)
plt.ylim(0,)
When I run the code, I don't get the fitted regression line; only the scatterplot shows up.
I also get the following error:
TypeError: Cannot cast array data from dtype('int64') to dtype('int32') according to the rule 'safe'
Can someone help?

How should I reduce the computing time in pandas on Kaggle?

I am working on the 2019 Data Science Bowl. Reading the training and testing data with pandas takes a long time, and I want to reduce that time so that the machine can run the analysis efficiently.
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib.pyplot as plt
import seaborn as sns
import plotly as py
import plotly.express as px
import plotly.graph_objs as go
from plotly.subplots import make_subplots
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
init_notebook_mode(connected=True)
import warnings
warnings.filterwarnings('ignore')
%matplotlib inline
keep_cols = ['event_id', 'game_session', 'installation_id', 'event_count', 'event_code', 'title', 'game_time', 'type', 'world']
specs_df = pd.read_csv('/kaggle/input/data-science-bowl-2019/specs.csv')
train_df = pd.read_csv('/kaggle/input/data-science-bowl-2019/train.csv',usecols=keep_cols)
test_df = pd.read_csv('/kaggle/input/data-science-bowl-2019/test.csv')
train_labels_df = pd.read_csv('/kaggle/input/data-science-bowl-2019/train_labels.csv')
The pandas read_csv method has a chunksize argument that makes it return an iterator yielding the given number of rows at a time. This is useful for very large data sets, where you can process or train on smaller subsets of the data iteratively.
More information on iterating through files is available in the pandas I/O documentation.
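As a rough illustration of the chunksize idea, here is a minimal sketch. It reads a small in-memory CSV instead of the Kaggle files, and the function name total_game_time and the sample data are assumptions made up for this example; only the event_id and game_time column names come from the question.

```python
import io

import pandas as pd

# Tiny stand-in for a large CSV file (hypothetical sample data).
csv_text = "event_id,game_time\n" + "\n".join(
    f"e{i},{i * 10}" for i in range(10)
)


def total_game_time(source, chunksize):
    """Sum the game_time column without loading the whole file at once."""
    total = 0
    n_chunks = 0
    # With chunksize set, read_csv yields DataFrames of up to `chunksize` rows.
    for chunk in pd.read_csv(source, usecols=["game_time"], chunksize=chunksize):
        total += int(chunk["game_time"].sum())
        n_chunks += 1
    return total, n_chunks


total, n_chunks = total_game_time(io.StringIO(csv_text), chunksize=4)
print(total, n_chunks)
```

Combining chunksize with usecols, as in the question's train.csv call, keeps each chunk small in both rows and columns.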

Issues importing pandas tool scatter_matrix

I am currently facing an import issue with pandas.tools.plotting. I try to import the scatter matrix via
from pandas.tools.plotting import scatter_matrix
But I get the following error message from visual studio code:
[pylint] E0611:No name 'scatter_matrix' in module 'pandas.tools.plotting'
I also tried
from pandas.tools import scatter_matrix
but it didn't work either. Why can't I import the scatter matrix?
I am using
python 3.6.4
pandas 0.22.0
Use the following line to import scatter_matrix, as shown in the pandas visualization docs:
from pandas.plotting import scatter_matrix
e.g.
scatter = pd.plotting.scatter_matrix(X, c = y, marker = 'o', s=40, hist_kwds={'bins':15}, figsize=(9,9), cmap = cmap)
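Since X, y and cmap are not defined in the line above, here is a self-contained sketch of the same import on random data. The column names and the Agg backend (chosen so it runs without a display) are assumptions for this example, not part of the original answer.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook

import numpy as np
import pandas as pd
from pandas.plotting import scatter_matrix

# Hypothetical demo data: 50 rows, three numeric columns.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(50, 3)), columns=["a", "b", "c"])

# Returns a 2-D array of matplotlib Axes, one per pair of columns.
axes = scatter_matrix(df, figsize=(6, 6), diagonal="hist")
print(axes.shape)
```

In an interactive session you would follow this with plt.show() after importing matplotlib.pyplot as plt.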

Plotting an ETF price for longer time period

I have the code below. If you run it, the graph shows the price history for just one year. Can someone tell me how I can plot the SPY instrument for the whole time period from 01.01.2008 until now?
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from pandas_datareader import data as web
from pylab import plt
plt.style.use('ggplot')
%matplotlib inline
spy=web.DataReader("SPY",data_source="google",start="2008-1-1")
spy["Close"].plot()

Scatter_Matrix Will Not Display Using Pandas and

I am working through the following machine learning tutorial:
http://machinelearningmastery.com/machine-learning-in-python-step-by-step/
Specifically, Section 4.2. Unfortunately, my code is throwing an error:
NameError: name 'scatter_matrix' is not defined
Here is my code:
import pandas
import pandas as pd
import matplotlib
import matplotlib.pyplot as plt
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"
names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'class']
dataset = pandas.read_csv(url, names=names)
scatter_matrix(dataset)
plt.show()
There's at least one Stack Overflow question on scatter_matrix, but I haven't been able to figure out what's missing.
Pandas scatter_matrix - plot categorical variables
You will have to import it like this:
from pandas.plotting import scatter_matrix
Because you have already imported pandas as pd, you could also call it like this:
pd.scatter_matrix(dataset)
However, pandas.scatter_matrix() is deprecated; use pandas.plotting.scatter_matrix() instead.
