I'm running this code to build a scatter matrix. The problem is that the plot looks like a mess, because it's impossible to see the names of variables (see image below). Is there any way to change the orientation of titles and switch off the ticks with numbers?
import pandas as pd
import matplotlib.pyplot as plt
train = pd.read_csv('data/train.csv', parse_dates=[0])
plt.figure()
a = pd.scatter_matrix(train, alpha=0.05, figsize=(10,10), diagonal='hist')
plt.show()
As a minimal scatter_matrix example to switch off axis ticks and rotate the labels,
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
try:
    from pandas.tools.plotting import scatter_matrix
except ImportError:
    #Fix suggested by #Raimundo Jimenez as tools is deprecated
    from pandas.plotting import scatter_matrix
df = pd.DataFrame(np.random.randn(1000, 4), columns=['long label', 'testing', 'another label', 'something else'])
sm = scatter_matrix(df, alpha=0.2, figsize=(6, 6), diagonal='kde')
#Change label rotation
[s.xaxis.label.set_rotation(45) for s in sm.reshape(-1)]
[s.yaxis.label.set_rotation(0) for s in sm.reshape(-1)]
#May need to offset label when rotating to prevent overlap of figure
[s.get_yaxis().set_label_coords(-0.3,0.5) for s in sm.reshape(-1)]
#Hide all ticks
[s.set_xticks(()) for s in sm.reshape(-1)]
[s.set_yticks(()) for s in sm.reshape(-1)]
plt.show()
and similarly, you can adjust labels, resize, etc. with any of the axes objects contained in the array returned by scatter_matrix.
pandas.tools.plotting.scatter_matrix is now deprecated. Use pandas.plotting.scatter_matrix instead.
Updated code from the one proposed by Ed Smith (#ed-smith):
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
df = pd.DataFrame(np.random.randn(1000, 4), columns=['long label', 'testing', 'another label', 'something else'])
sm = pd.plotting.scatter_matrix(df, alpha=0.2, figsize=(6, 6), diagonal='kde')
#Change label rotation
[s.xaxis.label.set_rotation(45) for s in sm.reshape(-1)]
[s.yaxis.label.set_rotation(0) for s in sm.reshape(-1)]
#May need to offset label when rotating to prevent overlap of figure
[s.get_yaxis().set_label_coords(-0.3,0.5) for s in sm.reshape(-1)]
#Hide all ticks
[s.set_xticks(()) for s in sm.reshape(-1)]
[s.set_yticks(()) for s in sm.reshape(-1)]
plt.show()
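The same steps can also be wrapped in a small helper that uses ordinary loops rather than list comprehensions run only for their side effects. This is a sketch under a few assumptions: the helper name tidy_scatter_matrix and its parameters are made up for illustration, the Agg backend is selected so it runs without a display, and diagonal='hist' is used so that SciPy is not required.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import numpy as np
import pandas as pd
from pandas.plotting import scatter_matrix


def tidy_scatter_matrix(axes, x_rotation=45, y_rotation=0, y_label_x=-0.3):
    """Rotate axis labels and hide ticks on every subplot (hypothetical helper)."""
    for ax in axes.ravel():
        ax.xaxis.label.set_rotation(x_rotation)
        ax.yaxis.label.set_rotation(y_rotation)
        # Shift the y label left so the horizontal text does not overlap the plot
        ax.yaxis.set_label_coords(y_label_x, 0.5)
        ax.set_xticks([])
        ax.set_yticks([])
    return axes


rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((200, 4)),
                  columns=['long label', 'testing', 'another label', 'something else'])
sm = scatter_matrix(df, alpha=0.2, figsize=(6, 6), diagonal='hist')
tidy_scatter_matrix(sm)
```

The offset passed as y_label_x will probably need tuning for your own label lengths and figure size.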
Related
When I run the code below, the heatmap does not come out square, even though I passed square=True. Any idea how I can draw the heatmap in a square format? Thank you!
The code:
from datetime import datetime
import numpy as np
import pandas as pd
import matplotlib as plt
import os
import seaborn as sns
temp_hourly_A5_A7_AX_ASHRAE=pd.read_csv('C:\\Users\\cvaa4\\Desktop\\projects\\s\\temp_hourly_A5_A7_AX_ASHRAE.csv',index_col=0, parse_dates=True, dayfirst=True, skiprows=2)
sns.heatmap(temp_hourly_A5_A7_AX_ASHRAE,cmap="YlGnBu", vmin=18, vmax=27, square=True, cbar=False, linewidth=0.0001);
The result:
square=True should give you square cells. Below is a working example:
import pandas as pd
import numpy as np
import seaborn as sns
df = pd.DataFrame(np.tile([0,1], 15*15).reshape(-1,15))
sns.heatmap(df, square=True)
If you want the plot as a whole to be square, however, you can use set_aspect together with the shape of the data:
ax = sns.heatmap(df)
ax.set_aspect(df.shape[1]/df.shape[0]) # here 0.5 Y/X ratio
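To see why that ratio produces a square plot, here is a minimal sketch that checks the drawn size of the axes. It makes two assumptions: plain Matplotlib pcolormesh is used as a stand-in for sns.heatmap (so seaborn is not needed to run it), and make_axes_square is a hypothetical helper name. The same set_aspect call should apply to the Axes that sns.heatmap returns.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np


def make_axes_square(ax, n_rows, n_cols):
    """Set the aspect so an n_rows x n_cols grid fills a square box (hypothetical helper)."""
    ax.set_aspect(n_cols / n_rows)
    return ax


data = np.tile([0, 1], 15 * 15).reshape(-1, 15)  # 30 rows x 15 columns
fig, ax = plt.subplots(figsize=(6, 4))
ax.pcolormesh(data)  # stand-in for sns.heatmap(data), which also returns an Axes
make_axes_square(ax, *data.shape)

fig.canvas.draw()  # the aspect is applied when the figure is drawn
box = ax.get_window_extent()
print(round(box.width), round(box.height))  # width and height of the axes in pixels
```

Each cell becomes twice as wide as it is tall here, which is what compensates for having twice as many rows as columns.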
You can use matplotlib and set a figsize before plotting heatmap.
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
rnd = np.random.default_rng(12345)
data = rnd.uniform(-100, 100, [100, 50])
plt.figure(figsize=(6, 5))
sns.heatmap(data, cmap='viridis');
Note that I used figsize=(6, 5) rather than a square figsize=(5, 5). This is because on a given figsize, seaborn also puts the colorbar, which might cause the heatmap to be squished a bit. You might want to change those figsizes too depending on what you need.
I have this bar graph but the X labels that are long keep overflowing into the other label. Is there a way I can create more space or cause a line break when it is doing this?
Below is the part of the code that accounts for the graph
import pandas as pd
import matplotlib.pyplot as plt
ax = tweets_df.plot(kind='bar', x='name', y='tweet_volume', fontsize=7, width=.5)
ax.set_xlabel('Hastag')
ax.set_ylabel('Tweets w/ Hashtag')
plt.xticks(rotation='horizontal')
plt.show()
IMHO you can use rotation=90 instead of rotation='horizontal'. If you want to keep the labels horizontal, you can truncate them to their first N characters instead:
import pandas as pd
import matplotlib.pyplot as plt
N = 5
ax = tweets_df.plot(kind='bar', x='name', y='tweet_volume', fontsize=7, width=.5)
ax.set_xlabel('Hastag')
ax.set_ylabel('Tweets w/ Hashtag')
plt.xticks(rotation='horizontal')
labels = [item.get_text() for item in ax.get_xticklabels()]
ax.set_xticklabels([label[:N] for label in labels])
plt.show()
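Since the question also asks about line breaks, another option is to wrap the labels instead of truncating them. The sketch below uses textwrap.fill from the standard library; the sample tweets_df, the helper name wrap_labels, and the width of 10 characters are all assumptions chosen for illustration.

```python
import textwrap

import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd


def wrap_labels(labels, width=10):
    """Insert line breaks so no line of a label exceeds `width` characters."""
    return [textwrap.fill(label, width=width) for label in labels]


# Hypothetical sample data standing in for tweets_df
tweets_df = pd.DataFrame({
    'name': ['#short', 'some long hashtag label', '#AnotherQuiteLongHashtag'],
    'tweet_volume': [120, 80, 45],
})

ax = tweets_df.plot(kind='bar', x='name', y='tweet_volume', fontsize=7, width=.5)
labels = [item.get_text() for item in ax.get_xticklabels()]
ax.set_xticklabels(wrap_labels(labels), rotation='horizontal')
plt.tight_layout()  # leave room for the multi-line labels
```

Labels with spaces are broken between words; a single long word such as a hashtag is split mid-word once it exceeds the width.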
I want this plot's y-axis to be centered at 38, and the y-axis scaled such that the 'humps' disappear. How do I accomplish this?
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
s=['05/02/2019', '06/02/2019', '07/02/2019', '08/02/2019',
'09/02/2019', '10/02/2019', '11/02/2019', '12/02/2019',
'13/02/2019', '20/02/2019', '21/02/2019', '22/02/2019',
'23/02/2019', '24/02/2019', '25/02/2019']
df[0]=['38.02', '33.79', '34.73', '36.47', '35.03', '33.45',
'33.82', '33.38', '34.68', '36.93', '33.44', '33.55',
'33.18', '33.07', '33.17']
# Data for plotting
fig, ax = plt.subplots(figsize=(17, 2))
for i,j in zip(s,df[0]):
    ax.annotate(str(j),xy=(i,j+0.8))
ax.plot(s, df[0])
ax.set(xlabel='Dates', ylabel='Latency',
title='Hongkong to sing')
ax.grid()
#plt.yticks(np.arange(min(df[p]), max(df[p])+1, 2))
fig.savefig("test.png")
plt.show()
I'm not entirely certain if this is what you're looking for but you can adjust the y-limits explicitly to change the scale, i.e.
ax.set_ylim([ax.get_ylim()[0], 42])
This sets only the upper bound and leaves the lower limit unchanged.
You can supply any values you find appropriate, e.g.
ax.set_ylim([22, 52])
will give you a wider y-range, so the 'humps' look much flatter.
Also note that the tick labels and general appearance of your plot will differ from what is shown here.
Edit - Here is the complete code as requested:
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
df = pd.DataFrame()
s=['05/02/2019', '06/02/2019', '07/02/2019', '08/02/2019',
'09/02/2019', '10/02/2019', '11/02/2019', '12/02/2019',
'13/02/2019', '20/02/2019', '21/02/2019', '22/02/2019',
'23/02/2019', '24/02/2019', '25/02/2019']
df[0]=['38.02','33.79','34.73','36.47','35.03','33.45',
'33.82','33.38','34.68','36.93','33.44','33.55',
'33.18','33.07','33.17']
# Data for plotting
fig, ax = plt.subplots(figsize=(17, 3))
#for i,j in zip(s,df[0]):
# ax.annotate(str(j),xy=(i,j+0.8))
ax.plot(s, pd.to_numeric(df[0]))
ax.set(xlabel='Dates', ylabel='Latency',
title='Hongkong to sing')
# The dates are day-first (dd/mm/yyyy), so state the format explicitly
ax.set_xticklabels(pd.to_datetime(s, format='%d/%m/%Y').strftime('%m.%d'), rotation=45)
ax.set_ylim([22, 52])
plt.show()
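If the goal is specifically to center the y-axis on a value such as 38, the limits can be computed symmetrically around it. This is a minimal sketch: center_ylim is a hypothetical helper, and the half range of 15 is an assumed value that you would adjust until the curve looks as flat as you want.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt


def center_ylim(ax, center, half_range):
    """Set y-limits symmetric around `center` (hypothetical helper)."""
    ax.set_ylim(center - half_range, center + half_range)
    return ax.get_ylim()


latency = [38.02, 33.79, 34.73, 36.47, 35.03, 33.45, 33.82, 33.38,
           34.68, 36.93, 33.44, 33.55, 33.18, 33.07, 33.17]

fig, ax = plt.subplots(figsize=(17, 3))
ax.plot(range(len(latency)), latency)
limits = center_ylim(ax, center=38, half_range=15)  # y-axis now spans 23 to 53
```

A larger half_range flattens the line further; a smaller one exaggerates the variation.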
UPDATED
I have written the code given below.
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
df = pd.read_csv("data_1.csv",index_col="Group")
print(df)
fig,ax = plt.subplots(1)
heatmap = ax.pcolor(df)########
ax.pcolor(df,edgecolors='k')
cbar = plt.colorbar(heatmap)##########
plt.ylim([0,12])
ax.invert_yaxis()
locs_y, labels_y = plt.yticks(np.arange(0.5, len(df.index), 1), df.index)
locs_x, labels_x = plt.xticks(np.arange(0.5, len(df.columns), 1), df.columns)
ax.set_xticklabels(labels_x, rotation=10)
ax.set_yticklabels(labels_y,fontsize=10)
plt.show()
It takes input like the data given below and plots a heatmap with labels on two sides, left and bottom.
GP1,c1,c2,c3,c4,c5
S1,21,21,20,69,30
S2,28,20,20,39,25
S3,20,21,21,44,21
I further want to add labels on the right side, as shown in the data below, and plot a heatmap with labels on three sides: right, left, and bottom.
GP1,c1,c2,c3,c4,c5
S1,21,21,20,69,30,V1
S2,28,20,20,39,25,V2
S3,20,21,21,44,21,V3
What changes should I make to the code?
Please help.
You can create a second axis on the right side of the plot with twinx. You then adjust this axis in essentially the same way as you already did with the first one.
u = u"""GP1,c1,c2,c3,c4,c5
S1,21,21,20,69,30
S2,28,20,20,39,25
S3,20,21,21,44,21"""
import io
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
df= pd.read_csv(io.StringIO(u),index_col="GP1")
fig,ax = plt.subplots(1)
heatmap = ax.pcolor(df, edgecolors='k')
cbar = plt.colorbar(heatmap, pad=0.1)
bx = ax.twinx()
ax.set_yticks(np.arange(0.5, len(df.index), 1))
ax.set_xticks(np.arange(0.5, len(df.columns), 1), )
ax.set_xticklabels(df.columns, rotation=10)
ax.set_yticklabels(df.index,fontsize=10)
bx.set_yticks(np.arange(0.5, len(df.index), 1))
bx.set_yticklabels(["V1","V2","V3"],fontsize=10)
ax.set_ylim([0,12])
bx.set_ylim([0,12])
ax.invert_yaxis()
bx.invert_yaxis()
plt.show()
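Rather than hard-coding the right-hand labels, they could also be read from the extra column in the data. The sketch below assumes that column has been given a header, here called "side" (a made-up name, since the data in the question has no header for it), and it copies the y-limits from the left axis instead of setting them by hand.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Assumption: the extra column has been given a header, here called "side"
csv_text = """GP1,c1,c2,c3,c4,c5,side
S1,21,21,20,69,30,V1
S2,28,20,20,39,25,V2
S3,20,21,21,44,21,V3"""

df = pd.read_csv(io.StringIO(csv_text), index_col="GP1")
right_labels = df.pop("side").tolist()  # remove the text column before plotting

fig, ax = plt.subplots()
heatmap = ax.pcolor(df.to_numpy(), edgecolors='k')
fig.colorbar(heatmap, ax=ax, pad=0.1)

row_ticks = np.arange(0.5, len(df.index), 1)
ax.set_yticks(row_ticks)
ax.set_yticklabels(df.index, fontsize=10)
ax.set_xticks(np.arange(0.5, len(df.columns), 1))
ax.set_xticklabels(df.columns, rotation=10)
ax.invert_yaxis()

bx = ax.twinx()              # second y-axis on the right, sharing x
bx.set_ylim(ax.get_ylim())   # copy the (inverted) limits from the left axis
bx.set_yticks(row_ticks)
bx.set_yticklabels(right_labels, fontsize=10)
```

Copying the limits keeps the right-hand ticks aligned with the rows even if the number of rows changes.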
I am trying to set a background image for a line plot that I made in matplotlib. Even though I import the image and use the zorder argument, I get two separate images instead of a single combined one. Please suggest a way out. My code is:
import quandl
import pandas as pd
import sys, os
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import itertools
def flip(items, ncol):
    return itertools.chain(*[items[i::ncol] for i in range(ncol)])
df = pd.read_pickle('neer.pickle')
rows = list(df.index)
countries = ['USA','CHN','JPN','DEU','GBR','FRA','IND','ITA','BRA','CAN','RUS']
x = range(len(rows))
df = df.pct_change()
fig, ax = plt.subplots(1)
for country in countries:
    ax.plot(x, df[country], label=country)
plt.xticks(x, rows, size='small', rotation=75)
#legend = ax.legend(loc='upper left', shadow=True)
plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)
plt.show(1)
plt.figure(2)
im = plt.imread('world.png')
ax1 = plt.imshow(im, zorder=1)
ax1 = df.iloc[:,:].plot(zorder=2)
handles, labels = ax1.get_legend_handles_labels()
plt.legend(flip(handles, 2), flip(labels, 2), loc=9, ncol=12)
plt.show()
So the problem is in figure(2): I get two separate plots.
To overlay a plot on a background image, we need imshow and its extent parameter from matplotlib.
Here is a condensed version of your code. I didn't have time to clean it up much.
First, sample data is created for the 11 countries listed in your code. It is then pickled and saved to a file (since the original pickle file is not available).
import quandl
import pandas as pd
import sys, os
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import itertools
from matplotlib.pyplot import imread  # scipy.misc.imread was removed in SciPy 1.2
countries = ['USA','CHN','JPN','DEU','GBR','FRA','IND','ITA','BRA','CAN','RUS']
df_sample = pd.DataFrame(np.random.randn(10, 11), columns=list(countries))
df_sample.to_pickle('c:\\temp\\neer.pickle')
Next, the pickle file is read and we create a bar plot directly from pandas.
df = pd.read_pickle('c:\\temp\\neer.pickle')
my_plot = df.plot(kind='bar',stacked=True,title="Plot Over Image")
my_plot.set_xlabel("countries")
my_plot.set_ylabel("some_number")
Next, we use imread to read the image and imshow to draw it on the plot.
img = imread("c:\\temp\\world.png")
plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)
plt.imshow(img,zorder=0, extent=[0.1, 10.0, -10.0, 10.0])
plt.show()
The output is a plot with the image as its background.
As stated, this is crude and can be improved further.
You're creating two separate figures in your code: the first with fig, ax = plt.subplots(1) and the second with plt.figure(2).
If you delete that second figure, you should get closer to your goal.
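As a rough illustration of keeping everything on one figure, the sketch below draws the lines and the image on the same Axes. The random DataFrame and the random RGB array are hypothetical stand-ins for neer.pickle and world.png, and only three of the countries are used to keep it short.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

countries = ['USA', 'CHN', 'JPN']
rng = np.random.default_rng(0)
# Hypothetical stand-ins for neer.pickle and world.png
df = pd.DataFrame(rng.standard_normal((10, len(countries))), columns=countries)
img = rng.random((50, 100, 3))  # random RGB image

fig, ax = plt.subplots()  # one figure, one Axes for both image and lines
for country in countries:
    ax.plot(df.index, df[country], label=country, zorder=2)

# Stretch the image over the current data limits and keep it behind the lines
x0, x1 = ax.get_xlim()
y0, y1 = ax.get_ylim()
ax.imshow(img, extent=[x0, x1, y0, y1], aspect='auto', zorder=1)
ax.legend(loc='upper left')
```

Passing aspect='auto' stops imshow from forcing square pixels, which would otherwise distort the line plot's proportions.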