I have a bunch of numbers (let's say all less than 500) and some 'inf' values in my data. I want to plot them as integers, so I represent 'inf' with 1000. However, on the y axis of the plot I want to show 'inf' instead of 1000. If I use 'plt.yticks', I can add labels to all y ticks, which is not what I want. How can I add a label to one specific point?
You can override both the data positions of the ticks and the labels of the ticks. Here is an example of a scatter plot with 3 extra points at "infinity". It does not look great because the extra points are at 1000, and there are ticks showing in the white space.
from matplotlib import pyplot as plt
import numpy as np
# create some data, plot it.
x = np.random.random(size=300)
y = np.random.randint(0,500, 300)
x_inf = [0,.1,.2]
y_inf = [1000,1000,1000]
plt.scatter(x,y)
plt.scatter(x_inf, y_inf)
First grab the axis. Then we can overwrite what data positions should have ticks, in this case 0 to 500 in steps of 100, and then 1000. Then we can also overwrite the labels of the ticks themselves.
ax = plt.gca()
# set the positions where we want ticks to appear
ax.yaxis.set_ticks([0,100,200,300,400,500,1000])
# set what will actually be displayed at each tick.
ax.yaxis.set_ticklabels([0,100,200,300,400,500,'inf'])
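As a variation on the snippet above, here is a minimal sketch that starts from data containing real `np.inf` values and derives the tick list from it. The sample values and the placeholder position `inf_pos = 600` are assumptions made up for illustration; pick any position above your largest finite value.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Made-up sample data: finite values below 500 plus two infinities.
y = np.array([12, 480, np.inf, 250, np.inf, 33])
x = np.arange(len(y))

inf_pos = 600  # hypothetical plotting position that stands in for infinity
y_plot = np.where(np.isfinite(y), y, inf_pos)

fig, ax = plt.subplots()
ax.scatter(x, y_plot)

# Regular ticks every 100 up to 500, then one extra tick at the placeholder.
ticks = list(range(0, 501, 100)) + [inf_pos]
labels = [str(t) for t in ticks[:-1]] + ["inf"]
ax.set_yticks(ticks)
ax.set_yticklabels(labels)
```

Putting the placeholder just above the finite range (rather than at 1000) keeps most of the empty space out of the plot.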
Check this out:
plt.plot(np.arange(5), np.arange(5))
plt.yticks(np.arange(5), [200, 400, 600, 800, 'inf'])
plt.show()
Related
The figure attached here has too many ticks on the y axis and is congested. I don't want to change the y axis to any other scale. I want to hide every second number on the y axis. Is this possible? That is, [5, 15, 25, ...] would be hidden.
Is there any other approach to avoid congestion in y axis?
pl.figure(figsize=(10, 8))
pl.scatter(x=T1['Current_Sim_rcs_obj1'], y=T1['Final Mean_Range'])
pl.xlabel('Truth RCS [dBsm]')
pl.xlim(-40, 5)
pl.ylim(0, 280)
pl.grid()
pl.ylabel('RUT_Range[m]')
pl.xticks(np.arange(-45, 15, 5))
pl.yticks(np.arange(0, 281, 10))
pl.show()
I tried adding the line below to the above code, but it didn't work as expected.
pl.axes().yaxis.set_minor_locator(MultipleLocator(5))
You can use matplotlib's yticks() function to get all the ticks (locations) and labels. Then you can modify the two lists based on some criteria. For the criteria you give, we can ignore every second tick and label:
from matplotlib import pyplot as plt
plt.figure() # Create a figure
locs, labels = plt.yticks() # Get the current locations and labels
locs = locs[0::2] # Choose every other location
labels = labels[0::2] # Choose every other label
plt.yticks(locs, labels) # Set new yticks and labels
I could imagine other cases where the criteria are based on the length of locs. In fact, that's probably what matplotlib is doing behind the scenes already. It has some heuristic which tries to choose a good length for locs based on the size of the plot and the scale of the data.
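Another way to thin out the labels, sketched here with random stand-in data since the original DataFrame is not available: put labelled major ticks every 20 and unlabelled minor ticks every 10, using `MultipleLocator` for both. The step sizes are assumptions chosen to match the 0 to 280 range in the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.ticker import MultipleLocator

# Random stand-in data (the original T1 DataFrame is not available here).
rng = np.random.default_rng(0)
x = rng.uniform(-40, 5, 200)
y = rng.uniform(0, 280, 200)

fig, ax = plt.subplots(figsize=(10, 8))
ax.scatter(x, y)
ax.set_xlim(-40, 5)
ax.set_ylim(0, 280)

# Major ticks carry labels; minor ticks are drawn without labels by default.
ax.yaxis.set_major_locator(MultipleLocator(20))
ax.yaxis.set_minor_locator(MultipleLocator(10))
ax.grid(which="both")  # keep a grid line at every 10
```

Because only major ticks get labels by default, every second number disappears while the tick marks and grid lines stay at every 10.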
I'd like to remove the vertical grid line corresponding to the custom xtick (displayed at x = 71 in the below picture). I could remove a horizontal grid line corresponding to the ytick 701 in the below picture by using a hack : since I have no minor tick on the y axis, I defined the custom ytick corresponding to the line that points toward the maximum and crosses the y axis as a minor tick, and then I disabled grid lines for minor ticks on the y axis. Unfortunately I cannot use the same hack on the x axis without disabling the grid lines of the minor ticks and that's something I'd like to avoid at all costs.
Below is a not-so-minimal but still working example.
There are many things I don't understand; the two main ones are why
locs, labels = plt.xticks()
not return the locs and labels that are plotted and why I don't get xticks labels displayed as 10^x where x = 0, 1, 2 and 3 but that's outside the scope of the original question.
import matplotlib.pyplot as plt
plt.grid(True)
import numpy as np
# Generate data
x_data = np.arange(1, 1000 , 10)
y_data = np.random.lognormal(1e-5, 3, len(x_data))
y_max = max(y_data)
# plot
plt.xscale('log')
import math
ratio_log = math.log(x_data[np.argmax(y_data)]) / math.log(max(x_data)) # I need to do this in order to plot a horizontal red dashed line that points to the max and do not extend any further.
plt.axhline(y=y_max, xmin=0, xmax = ratio_log, color='r', linestyle='--') # horizontal line pointing to the max y value.
axes = plt.gca()
axes.set_xlim([1, max(x_data)]) # Limits for the x axis.
# custom ticks and labels
# First yticks because I'm able to achieve what I seek
axes.set_yticks([int(y_max)], minor=True) # Sets the custom ytick as a minor one.
from matplotlib.ticker import FormatStrFormatter
axes.yaxis.set_minor_formatter(FormatStrFormatter("%.0f"))
axes.yaxis.grid(False, which='minor') # Removes minor yticks grid. Since I only have my custom yticks as a minor one, this will disable only the grid line corresponding to that ytick. That's a hack.
import matplotlib.ticker as plticker
loc = plticker.MultipleLocator(base=y_max / 3.3) # this locator puts ticks at regular intervals. I ensure the y axis ticks look ok.
axes.yaxis.set_major_locator(loc)
# Now xticks. I'm having a lot of difficulty here, unable to remove the grid of a particular custom xticks.
locs, labels = plt.xticks() # Strangely, this doesn't return the locs and labels that are plotted. There are indeed 2 values that aren't displayed in the plot, here 1.00000000e-01 and 1.00000000e+04. I've got to remove them before I can append my custom loc and label.
# This means that if I do: plt.xticks(locs, labels) right here, it would enlarge both the lower and upper limits on the x axis... I fail to see how that's intuitive or useful at all. Might this be a bug?
locs = np.append(locs[1:-1], np.asarray(x_data[np.argmax(y_data)])) # One of the ugliest hack I have ever seen... to get correct ticks and labels.
labels = (str(int(loc)) for loc in locs) # Just visuals to get integers on the axis.
plt.xticks(locs, labels) # updates the xticks and labels.
plt.plot((x_data[np.argmax(y_data)], x_data[np.argmax(y_data)]), (0, y_max), 'r--') # vertical line that points to the max. Non OO way to do it, so a bad way.
plt.plot(x_data, y_data)
plt.savefig('grid_prob.png')
plt.close()
Example picture below (the code outputs a different picture each time it is executed, but the problem appears in all pictures).
Credit for the idea goes to @ImportanceOfBeingErnest, to whom I am extremely grateful.
I removed the grid with
axes.xaxis.grid(False, which='both')
and then added a grid line corresponding to each xtick except the custom one, using the following loop:
for loc in locs[1:-1]:
    if loc != x_data[np.argmax(y_data)]:
        plt.axvline(x=loc, color = 'grey', linestyle = '-', linewidth = 0.4)
Insert this code just before the line
plt.xticks(locs, labels) # updates the xticks and labels.
Example of output picture below.
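For reference, the same workaround as a self-contained sketch in the object-oriented style. The helper name `draw_grid_except` and the fixed tick list are my own assumptions for illustration; the comparison uses `np.isclose` rather than `!=` so that float rounding cannot bring the unwanted line back.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

def draw_grid_except(ax, locs, skip):
    """Draw a thin vertical line at every loc except those close to `skip`."""
    lines = []
    for loc in locs:
        if not np.isclose(loc, skip):
            lines.append(ax.axvline(x=loc, color="grey", linestyle="-", linewidth=0.4))
    return lines

rng = np.random.default_rng(0)
x_data = np.arange(1, 1000, 10)
y_data = rng.lognormal(1e-5, 3, len(x_data))
x_peak = x_data[np.argmax(y_data)]  # position of the custom tick

fig, ax = plt.subplots()
ax.set_xscale("log")
ax.plot(x_data, y_data)
ax.set_xlim(1, x_data.max())

# Explicit tick positions: the decades plus the custom tick at the peak.
locs = np.array([1, 10, 100, float(x_peak)])
ax.set_xticks(locs)
ax.set_xticklabels([str(int(loc)) for loc in locs])

ax.xaxis.grid(False, which="both")  # turn off the built-in x grid
ax.yaxis.grid(True)
grid_lines = draw_grid_except(ax, locs, x_peak)  # redraw it by hand
```

Note that this sketch only redraws lines at the listed tick positions; if you also want lines at the minor ticks, pass those positions to the helper as well.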
I currently use the align='edge' parameter and positive/negative widths in pyplot.bar() to plot the bar data of one metric to each axis. However, if I try to plot a second set of data to one axis, it covers the first set. Is there a way for pyplot to automatically space this data correctly?
lns3 = ax[1].bar(bucket_df.index,bucket_df.original_revenue,color='c',width=-0.4,align='edge')
lns4 = ax[1].bar(bucket_df.index,bucket_df.revenue_lift,color='m',bottom=bucket_df.original_revenue,width=-0.4,align='edge')
lns5 = ax3.bar(bucket_df.index,bucket_df.perc_first_priced,color='grey',width=0.4,align='edge')
lns6 = ax3.bar(bucket_df.index,bucket_df.perc_revenue_lift,color='y',width=0.4,align='edge')
This is what it looks like when I show the plot:
The data shown in yellow completely covers the data in grey. I'd like it to be shown next to the grey data.
Is there any easy way to do this? Thanks!
The first argument to the bar() plotting method is an array of the x-coordinates for your bars. Since you pass the same x-coordinates they will all overlap. You can get what you want by staggering the bars by doing something like this:
x = np.arange(10) # define your x-coordinates
width = 0.1 # set a width for your plots
offset = 0.15 # define an offset to separate each set of bars
fig, ax = plt.subplots() # define your figure and axes objects
ax.bar(x, y1, width=width) # plot the first set of bars
ax.bar(x + offset, y2, width=width) # plot the second set of bars, shifted by the offset
Since you have a few sets of data to plot, it makes more sense to make the code a bit more concise (assume y_vals is a list containing the y-coordinates you'd like to plot, bucket_df.original_revenue, bucket_df.revenue_lift, etc.). Then your plotting code could look like this:
for i, y in enumerate(y_vals):
    ax.bar(x + i * offset, y, width=width)
If you want to plot more sets of bars you can decrease the width and offset accordingly.
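Here is a minimal sketch of that idea where the width and offsets are computed from the number of series instead of being tuned by hand. The series names and values are made up, and the 0.8 group width is an assumption (it leaves a small gap between groups).

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(5)
series = {  # made-up numbers standing in for the DataFrame columns
    "original": np.array([3, 5, 2, 6, 4]),
    "lift": np.array([1, 2, 1, 3, 2]),
    "share": np.array([2, 2, 3, 1, 5]),
}

group_width = 0.8                      # total width taken up by one group
width = group_width / len(series)      # width of a single bar

fig, ax = plt.subplots()
containers = []
for i, (name, y) in enumerate(series.items()):
    # Centre the whole group on x: offsets are symmetric around zero.
    offset = (i - (len(series) - 1) / 2) * width
    containers.append(ax.bar(x + offset, y, width=width, label=name))

ax.set_xticks(x)
ax.legend()
```

With this layout the bars of each group sit side by side and the tick stays under the middle of the group, however many series you add.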
I wrote the following program in Python to obtain equi-width histograms, but when I plot it I get a single line in the figure instead of a histogram. Can someone please help me figure out where I am going wrong?
import numpy as np
import matplotlib.pyplot as plt
for num in range(0,5):
    hist, bin_edges = np.histogram([1000, 98,99992,8474,95757,958574,97363,97463,1,4,5], bins = 5)
    plt.bar(bin_edges[:-1], hist, width = 1000)
    plt.xlim(min(bin_edges), max(bin_edges))
    plt.show()
Additionally, I want to label each plot with its "num" value, which ranges from 0 to 4. In the example above I have kept my data constant, but I intend to change the data for different "num" values.
Look at your bin edges:
>>> bin_edges
array([ 1.00000000e+00, 1.91715600e+05, 3.83430200e+05,
5.75144800e+05, 7.66859400e+05, 9.58574000e+05])
Your bin positions range from 1 to approximately 1 million, but you only gave the bars a width of 1000. Your bars, where they exist at all, are too skinny to be seen. Also, most of the bars have zero height, because most of the bins are empty:
>>> hist
array([10, 0, 0, 0, 1])
The "line" you see is the last bin, with one element. This bin covers a span of approximately 200000, but the bar width is only 1000, so it is very thin relative to the amount of space it is supposed to cover. The bar of height 10 is also there, but it's also very skinny, and jammed up against the left edge of the plot, so it's basically invisible.
It doesn't make sense to try to use constant-width bars while also placing them at x-coordinates that correspond to their size. By putting the bars at those x-coordinates, you are already spacing them out proportional to the bin widths; making the bars skinnier doesn't bring them closer together, it just makes them invisible.
If you want to use constant-width bars, you should put them at sequential X positions and use labels on the axis to show the values the bins represent. Here's a simple example with your data:
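plt.bar(np.arange(len(bin_edges)-1), hist, width=1, align='edge') # align='edge' makes bar i span [i, i+1]; the default is 'center' since matplotlib 2.0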
plt.xticks((np.arange(len(bin_edges))-0.5)[1:], bin_edges[:-1])
You'll have to decide how you want to format those labels.
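One possible way to format the labels, and to tag each plot with its "num" value as the question asks: label every bar with its bin range and put "num" in the title. This is only a sketch; the `lo-hi` label format and the rotation are my own choices, and the data is kept constant as in the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

data = [1000, 98, 99992, 8474, 95757, 958574, 97363, 97463, 1, 4, 5]

for num in range(5):
    hist, bin_edges = np.histogram(data, bins=5)
    positions = np.arange(len(hist))  # sequential x positions, one per bin

    # One label per bin, showing the range of values the bin covers.
    labels = [f"{lo:.0f}-{hi:.0f}" for lo, hi in zip(bin_edges[:-1], bin_edges[1:])]

    fig, ax = plt.subplots()
    ax.bar(positions, hist, width=1, edgecolor="black")
    ax.set_xticks(positions)
    ax.set_xticklabels(labels, rotation=30, ha="right")
    ax.set_title(f"num = {num}")
    # Call plt.show() or fig.savefig(...) here to actually see each figure.
    plt.close(fig)
```

Here the bars are centred on their positions, so each tick sits under the middle of its bar.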
So I have a graph that runs on the order of 10000 time steps, and thus I have a lot of data points and the xticks are spaced pretty far apart, which is cool, but I would like to show on the x axis the point at which the data is being plotted. In this case the xtick I want to show is 271. So is there a way to just "insert" a 271 tick onto the x axis given that I already know what tick I want to display?
If it's not important that the ticks update when panning/zooming (i.e. if the plot is not meant for interactive use), then you can manually set the tick locations with the axes.set_xticks() method. In order to append one location (e.g. 271), you can first get the current tick locations with axes.get_xticks(), and then append 271 to this array.
A short example:
import matplotlib.pyplot as plt
import numpy as np
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(np.arange(300))
# Get current tick locations and append 271 to this array
x_ticks = np.append(ax.get_xticks(), 271)
# Set xtick locations to the values of the array `x_ticks`
ax.set_xticks(x_ticks)
plt.show()
This produces
As you can see from the image, a tick has been added for x=271.
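One caveat, sketched below under the assumption that you want the axis limits to stay exactly as they were: `get_xticks()` can return locations that lie outside the visible range, and passing those back to `set_xticks()` can widen the axis. Filtering the ticks to the current limits first, and restoring the limits afterwards, avoids that.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots()
ax.plot(np.arange(300))

lo, hi = ax.get_xlim()  # remember the limits matplotlib chose

# Keep only the automatic ticks that are actually inside the visible range.
ticks = ax.get_xticks()
ticks = ticks[(ticks >= lo) & (ticks <= hi)]

# Add the extra tick and keep the array sorted.
new_ticks = np.sort(np.append(ticks, 271))
ax.set_xticks(new_ticks)
ax.set_xlim(lo, hi)  # restore the original limits explicitly
```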