I want to plot a curve on an image. I would like to see the curve only in a certain range. So:
plt.figure()
plt.imshow(img)
plt.plot(x, my_curve)
plt.axis([0, X, Y, 0])
But this way the image is also shown only in that range, which I don't want. I would like to see the whole image with only a portion of the curve. How can I apply the axis limits only to the second plot?
Note that I can't use a slice of the arrays. I am in this situation:
x = [0, 0, 0, 10, 10, 10, 30, 30, 30, 40, 40, 40]
my_curve = [0, 0, 0, 10, 10, 10, 30, 30, 30, 40, 40, 40]
I need to see the straight line on the image, but only between pixels 25 and 35. If I delete every element outside that range, I am left with only the point (30, 30) and cannot draw the straight line.
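If your data is sparse, you can interpolate it and keep only the samples inside the range you want to show:
x2 = np.linspace(x[0], x[-1], 1000)
x2 = x2[x2 <= X]  # keep samples up to the limit X (slicing with [0:X] would keep the first X samples instead)
my_curve2 = np.interp(x2, x, my_curve)
plt.plot(x2, my_curve2)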
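As a rough, self-contained sketch of this idea using the sample data from the question: the image `img` is a random placeholder array, and the limits 25 and 35 come from the question text. Duplicate x values are dropped before interpolating, which is an extra step not in the answer, because `np.interp` expects increasing sample points.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in an interactive session
import matplotlib.pyplot as plt

# Placeholder image (assumption); replace with your own array
img = np.random.default_rng(0).random((50, 50))

x = np.array([0, 0, 0, 10, 10, 10, 30, 30, 30, 40, 40, 40], dtype=float)
my_curve = np.array([0, 0, 0, 10, 10, 10, 30, 30, 30, 40, 40, 40], dtype=float)

# np.interp expects increasing x values, so keep the first sample of each duplicate
xu, first = np.unique(x, return_index=True)
yu = my_curve[first]

# Resample only inside the range that should be visible (pixels 25 to 35)
x2 = np.linspace(25, 35, 101)
my_curve2 = np.interp(x2, xu, yu)

fig, ax = plt.subplots()
ax.imshow(img)          # the whole image is drawn
ax.plot(x2, my_curve2)  # only the 25..35 portion of the curve is drawn
```

Because only the resampled points are passed to `plot`, no axis limits need to be changed and the image keeps its full extent.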
The issue
I have a contourf plot I made with a pandas dataframe that plots some 2-dimensional value with time on the x-axis and vertical pressure level on the y-axis. The field, time, and pressure data I'm pulling are all from a netCDF file. I can plot it fine, but I'd like to scale the y-axis to better represent the real atmosphere. (The default scaling is linear, but the pressure levels in the file imply a different kind of scaling.) Basically, it should look something like the plot below on the y-axis. It's like a log scale, but compressing the bottom part of the axis instead of the top. (I don't know the term for this... like a log scale but inverted?) It doesn't need to be exact.
Working example (written in Jupyter notebook)
#modules
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib import ticker, colors
#data
time = np.arange(0,10)
lev = np.array([900,800,650,400,100])
df = pd.DataFrame(np.arange(50).reshape(5,10),index=lev,columns=time)
df.index.name = 'Level'
print(df)
        0   1   2   3   4   5   6   7   8   9
Level
900     0   1   2   3   4   5   6   7   8   9
800    10  11  12  13  14  15  16  17  18  19
650    20  21  22  23  24  25  26  27  28  29
400    30  31  32  33  34  35  36  37  38  39
100    40  41  42  43  44  45  46  47  48  49
#lists for plotting
levtick = np.arange(len(lev))
clevels = np.arange(0,55,5)
#Main plot
fig, ax = plt.subplots(figsize=(10, 5))
im = ax.contourf(df,levels=clevels,cmap='RdBu_r')
#x-axis customization
plt.xticks(time)
ax.set_xticklabels(time)
ax.set_xlabel('Time')
#y-axis customization
plt.yticks(levtick)
ax.set_yticklabels(lev)
ax.set_ylabel('Pressure')
#title and colorbar
ax.set_title('Some mean time series')
cbar = plt.colorbar(im,values=clevels,pad=0.01)
tick_locator = ticker.MaxNLocator(nbins=11)
cbar.locator = tick_locator
cbar.update_ticks()
The Question
How can I scale the y-axis such that values near the bottom (900, 800) are compressed while values near the top (200) are expanded and given more plot space, like in the sample above my code? I tried using ax.set_yscale('function', functions=(forward, inverse)) but didn't understand how it works. I also tried simply ax.set_yscale('log'), but log isn't what I need.
You can use a custom scale transformation with ax.set_yscale('function', functions=(forward, inverse)) as you suggested. From the documentation:
forward and inverse are callables that return the scale transform
and its inverse.
In this case, define in forward() the function you want, such as the inverse of the log function (an exponential), or a more customized one that fits your needs. Call set_yscale before your y-axis customization.
def forward(x):
    return 2**x
def inverse(x):
    return np.log2(x)
ax.set_yscale('function', functions=(forward, inverse))
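Put together with the data from the question, a minimal sketch could look like the following. It assumes, as in the question, that the levels are plotted at index positions 0 to 4 and only labelled with the pressure values. The clipping inside `inverse` is an added safeguard against non-positive inputs, not something the answer requires.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in an interactive session
import matplotlib.pyplot as plt

time = np.arange(0, 10)
lev = np.array([900, 800, 650, 400, 100])
field = np.arange(50).reshape(5, 10)
levtick = np.arange(len(lev))  # index positions 0..4 used as y coordinates

def forward(y):
    # Exponential spacing: higher index positions get more room
    return 2.0 ** np.asarray(y, dtype=float)

def inverse(y):
    # Inverse of forward; clip to avoid log2 of non-positive values
    y = np.asarray(y, dtype=float)
    return np.log2(np.clip(y, 1e-12, None))

fig, ax = plt.subplots(figsize=(10, 5))
im = ax.contourf(time, levtick, field, levels=np.arange(0, 55, 5), cmap="RdBu_r")

# Set the custom scale first, then customize ticks and labels
ax.set_yscale("function", functions=(forward, inverse))
ax.set_yticks(levtick)
ax.set_yticklabels(lev)
ax.set_xlabel("Time")
ax.set_ylabel("Pressure")
fig.colorbar(im, ax=ax, pad=0.01)
```

With `2**x`, the gap between consecutive levels doubles at each step, so 900 and 800 sit close together at the bottom while 400 and 100 are spread out at the top. A different base or function changes how strong the stretching is.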
I am trying to draw a frequency bar plot and a cumulative "ogive" in the same plot. If I draw them separately, both are shown correctly, but when they are drawn in the same figure, the cumulative curve is shifted. The code used is below.
df = pd.DataFrame({'Correctas': [4,6,5,4,7,2,8,3,5,6,9,6,6,7,5,5,8,10,4,8,3,6,9,5,11,5,12,7,7,5,4,6]});
df['Correctas'].value_counts(sort = False).plot.bar();
df['Correctas'].value_counts(sort = False).cumsum().plot();
plt.show()
The frequency data is
2 1
3 3
4 7
5 14
6 20
7 24
8 27
9 29
10 30
11 31
12 32
So the cumulative curve should start at 2 on the x axis, but it starts at 4.
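This happens because the bar chart treats the x-axis as categorical: the bars are drawn at positions 0, 1, 2, ... while the line is drawn at the numeric index values. Here is a quick fix that converts the index to strings so that both plots use the same categorical positions: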
df = pd.DataFrame({'Correctas': [4,6,5,4,7,2,8,3,5,6,9,6,6,7,5,5,8,10,4,8,3,6,9,5,11,5,12,7,7,5,4,6]});
df_counts = df['Correctas'].value_counts(sort = False)
df_counts.index = df_counts.index.astype('str')
df_counts.plot.bar(alpha=.8);
df_counts.cumsum().plot(color='k', kind='line');
plt.show();
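The same idea can be sketched with matplotlib directly, drawing both the bars and the cumulative line at explicit positions 0..n-1 and labelling the ticks afterwards. The `sort_index()` call is an added assumption here: it makes the order of the counts independent of how `value_counts` orders its result in a given pandas version.

```python
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in an interactive session
import matplotlib.pyplot as plt

df = pd.DataFrame({'Correctas': [4, 6, 5, 4, 7, 2, 8, 3, 5, 6, 9, 6, 6, 7, 5, 5,
                                 8, 10, 4, 8, 3, 6, 9, 5, 11, 5, 12, 7, 7, 5, 4, 6]})

# Frequencies ordered by value, then the running total
counts = df['Correctas'].value_counts().sort_index()
cumulative = counts.cumsum()

# Shared x positions for both artists
positions = list(range(len(counts)))

fig, ax = plt.subplots()
ax.bar(positions, counts.to_numpy(), alpha=0.8)
ax.plot(positions, cumulative.to_numpy(), color='k')

# Label each position with the value it represents
ax.set_xticks(positions)
ax.set_xticklabels(counts.index.astype(str))
```

Since both calls receive the same `positions`, the line starts over the first bar (value 2) instead of being shifted.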
How can I make a distplot with seaborn to only have whole numbers?
My data is an array of numbers between 0 and ~18. I would like to plot the distribution of the numbers.
Impressions
0 210
1 1084
2 2559
3 4378
4 5500
5 5436
6 4525
7 3329
8 2078
9 1166
10 586
11 244
12 105
13 51
14 18
15 5
16 3
dtype: int64
Code I'm using:
sns.distplot(Impressions,
# bins=np.arange(Impressions.min(), Impressions.max() + 1),
# kde=False,
axlabel=False,
hist_kws={'edgecolor':'black', 'rwidth': 1})
plt.xticks = range(current.Impressions.min(), current.Impressions.max() + 1, 1)
Plot looks like this:
What I'm expecting:
The xlabels should be whole numbers
Bars should touch each other
The kde line should simply connect the top of the bars. By the looks of it, the current one assumes to have 0s between (x, x + 1), hence why the downward spike (This isn't required, I can turn off kde)
Am I using the correct tool for the job, or should distplot not be used for whole numbers?
Your problem can be solved with the code below:
import seaborn as sns # for data visualization
import numpy as np # for numeric computing
import matplotlib.pyplot as plt # for data visualization
arr = np.array([1,2,3,4,5,6,7,8,9])
sns.distplot(arr, bins = arr, kde = False)
plt.xticks(arr)
plt.show()
This way you can plot a histogram with the seaborn sns.distplot() function.
Note: whatever data you pass to bins and plt.xticks() should be in ascending order.
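For integer data, one common variation is to put the bin edges at half-integers so that each bar is centred on a whole number and adjacent bars touch. The sketch below uses plain matplotlib rather than `sns.distplot`, and the sample data is made up for illustration (it is not the `Impressions` data from the question).

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in an interactive session
import matplotlib.pyplot as plt

# Made-up integer data standing in for the real values
data = np.random.default_rng(0).poisson(5, size=1000)

# Edges at -0.5, 0.5, 1.5, ... so each bin covers exactly one integer
edges = np.arange(0, data.max() + 2) - 0.5

fig, ax = plt.subplots()
counts, _, _ = ax.hist(data, bins=edges, edgecolor='black')

# One tick per whole number, placed at the centre of its bar
ax.set_xticks(np.arange(0, data.max() + 1))
```

Each bar then shows the count of exactly one integer value, which matches what `np.bincount(data)` would return.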
CH Gayle 17
YK Pathan 16
AB de Villiers 15
DA Warner 14
SK Raina 13
RG Sharma 13
MEK Hussey 12
AM Rahane 12
MS Dhoni 12
G Gambhir 12
I have a series like this. I want to plot the player on the x axis and their respective value on the y axis. I tried this code:
man_of_match=(matches['player_of_match'].value_counts())
sns.countplot(x=(man_of_match),data=matches,color='B')
sns.plt.show()
But this code plots the frequency of the numeric values: 12 is plotted on the x axis with a count of 4 on the y axis, and 13 on the x axis shows 2 on the y axis.
How do I make the x axis show the name of the player and the y axis the corresponding value for that player?
sns.countplot is meant to do the counting for you. You are counting yourself with value_counts and then plotting the counts of counts. Pass the column from matches directly to sns.countplot:
ax = sns.countplot(x=matches['player_of_match'], color='b')
plt.sca(ax)
plt.xticks(rotation=90);
If you want to limit it to the top 10 players, use value_counts as you did, but plot the result directly with the pandas bar plot (which draws with matplotlib):
ax = matches['player_of_match'].value_counts().head(10).plot.bar(width=.8, color='r')
ax.set_xlabel('player_of_match')
ax.set_ylabel('count')
You can get it to look a lot like the seaborn plot:
kws = dict(width=.8, color=sns.color_palette('pastel'))
ax = matches['player_of_match'].value_counts().head(10).plot.bar(**kws)
ax.set_xlabel('player_of_match')
ax.set_ylabel('count')
ax.grid(False, axis='x')
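If you already have the counted series, as in the question, it can also be plotted as-is. In this sketch the series is hard-coded from the values shown in the question as a stand-in for `matches['player_of_match'].value_counts().head(10)`, since the `matches` data itself is not available here.

```python
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in an interactive session
import matplotlib.pyplot as plt

# Stand-in for matches['player_of_match'].value_counts().head(10)
man_of_match = pd.Series({
    'CH Gayle': 17, 'YK Pathan': 16, 'AB de Villiers': 15, 'DA Warner': 14,
    'SK Raina': 13, 'RG Sharma': 13, 'MEK Hussey': 12, 'AM Rahane': 12,
    'MS Dhoni': 12, 'G Gambhir': 12,
})

fig, ax = plt.subplots()
# Names on the x axis, the already-computed counts on the y axis
ax.bar(list(man_of_match.index), man_of_match.to_numpy(), width=0.8)
ax.set_xlabel('player_of_match')
ax.set_ylabel('count')
ax.tick_params(axis='x', labelrotation=90)
```

No counting happens at plot time here; the bar heights are exactly the values in the series.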
Suppose I have a table of data:
No. 200 400 600 800
1 13 14 17 18
2 16 18 20 21
3 20 15 18 19
and so on...
where each column represents a y-value for a given x-value. The first line is the x-value and the first column is the number of each dataset.
How can I read in and plot each row separately?
For an idea of how I would like my results to be for the table I have quoted above see the following images. I have plotted each plot individually.
http://postimg.org/image/yw46zw7er/92d01c08/
http://postimg.org/image/c1kf2nqwp/29a8b1c8/
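Matplotlib plots 2d arrays by plotting each column, so here you just need to transpose your data. The code below assumes the data is in a whitespace-delimited text file called data.csv that looks like the table in the question.
import numpy as np
import matplotlib.pyplot as plt
# skip the header row, which holds the x-values and is not numeric data
data = np.loadtxt('data.csv', skiprows=1)
x = [200, 400, 600, 800]
# column 0 is the dataset number; transpose the rest so each row becomes one line
plt.plot(x, data[:, 1:].T)
plt.legend((1, 2, 3))
plt.show()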
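If each row should go into its own plot, as in the linked images, one possible sketch is to loop over the rows and give each its own subplot. The table is read here from an in-memory string holding the three rows quoted in the question (an assumption in place of a real file), and the x-values are parsed from the header line.

```python
import io
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in an interactive session
import matplotlib.pyplot as plt

# In-memory stand-in for the data file, using the rows from the question
text = """No. 200 400 600 800
1 13 14 17 18
2 16 18 20 21
3 20 15 18 19
"""

# x-values come from the header line, skipping the "No." label
x = np.array(text.splitlines()[0].split()[1:], dtype=float)

# Remaining lines: first column is the dataset number, the rest are y-values
table = np.loadtxt(io.StringIO(text), skiprows=1)
labels = table[:, 0].astype(int)
y = table[:, 1:]

# One subplot per row
fig, axes = plt.subplots(1, len(y), figsize=(12, 3), sharey=True)
for ax, label, row in zip(axes, labels, y):
    ax.plot(x, row, marker='o')
    ax.set_title(f"Dataset {label}")
```

To read from disk instead, pass the file name to `np.loadtxt` and read the header line from the same file.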