I'm trying to draw a bar plot, but nothing shows up. I have imported matplotlib correctly and the function returns the values I expect, but when I pass those values to the function that should draw the plot, it displays a blank image. To check that the values are correct, I added a print call next to the plotting commands: the values get printed, but the plot does not appear. Here's the code. I was able to get the plot correct using
def GetCounts(data):
    return (data['Sex'].value_counts())

def Get_Plot(Points):
    _x1 = Points[0]
    _x2 = Points[1]
    _y = (_x1 + _x2) - 200
    print('male' + ' ' + str(_x1) + '\n','female' + ' '+ str(_x2), _y)
    plt.bar(height = _x1, tick_label = 'Male', left = _x1)
    plt.xlabel('Counts of people according to Sex')
    plt.ylabel('Number of people')
    plt.show()

Counts = GetCounts(titanic)
Get_Plot(Counts)
I'm trying to get two bars in the plot, 'Male' and 'Female', and I'm not sure how to do that. With the code above I can only draw one of them.
Please help. Thanks in advance.
You may want to revisit the plt.bar documentation, where it says:
pyplot.bar(left, height, width=0.8, bottom=None, hold=None, data=None, **kwargs)
[...]
left : sequence of scalars
the x coordinates of the left sides of the bars
height : scalar or sequence of scalars
the height(s) of the bars
You may thus position the bars at the indices 0 and 1, and their heights will be given by Points:
plt.bar(range(len(Points)),Points)
plt.xticks(range(len(Points)), Points.index)
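For reference, here is a minimal self-contained sketch of the same idea. The `counts` Series is a made-up stand-in for what `titanic['Sex'].value_counts()` would return, and its numbers are placeholders, not real results. Note that newer Matplotlib releases name the first parameter of `bar` `x` rather than `left`; passing the positions positionally, as above, works either way.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for titanic['Sex'].value_counts(); placeholder numbers.
counts = pd.Series({"male": 577, "female": 314})

positions = list(range(len(counts)))  # bar positions 0 and 1

fig, ax = plt.subplots()
bars = ax.bar(positions, counts.values)  # one bar per entry in counts
ax.set_xticks(positions)
ax.set_xticklabels(counts.index)         # label the bars 'male' and 'female'
ax.set_xlabel("Counts of people according to Sex")
ax.set_ylabel("Number of people")
```

In an interactive session you would finish with `plt.show()` as in the question.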
I'm working with the lifelines package to make Kaplan-Meier curves. I'd like to add the censored data, but also have a legend that only mentions the two lines.
I'm calling the fit and plot functions twice in a loop to draw two separate lines, like so:
def plot_km(col,months,dpi):
    ax = plt.subplot(111)
    clrIdx = 0
    for r in df[col].unique():
        ix = df[col] == r
        plt.rcParams['figure.dpi'] = dpi
        plt.rcParams['savefig.dpi'] = dpi
        kmf.fit(T[ix], C[ix],label=r)
        kmf.plot(ax=ax, color=colorsKM[clrIdx],show_censors=True,censor_styles={'ms': 6, 'marker': 's','label':'_out_'})
        if clrIdx == 1:
            plt.legend(handles=[],labels=['test1', 'test2'])
        clrIdx += 1
The output is a KM curve together with the censored data points. However, I can't figure out how to work with the handles/labels to get the desired legend. The code above seems to keep the censor markers out of the legend by using 'label':'_out_', but it also ignores my custom labels in the plt.legend() call. If I pass anything for handles, e.g. plt.legend(handles=[line0]), it throws "NameError: name 'line0' is not defined".
I tried playing around with h, l = ax.get_legend_handles_labels(), but this always returns empty lists. I believe my issue is that I don't understand how each of these "artists"(?) is stored, and how to retrieve them again.
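A small Matplotlib-only sketch of how legend handles are collected may help here. It does not use lifelines: the step lines and square markers below are hypothetical stand-ins for the KM curves and censor marks, and whether `kmf.plot` registers its artists the same way is an assumption. In plain Matplotlib, `legend(handles=[], labels=[...])` pairs each label with a handle, so an empty handle list yields an empty legend, and artists whose label starts with an underscore are skipped by `get_legend_handles_labels()`.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

# Made-up survival-like curves; names and values are placeholders.
curves = [("group A", "C0", [1.0, 0.8, 0.5]), ("group B", "C1", [1.0, 0.6, 0.3])]
for name, color, ys in curves:
    # The labelled line is picked up by the legend machinery.
    ax.step([0, 5, 10], ys, where="post", color=color, label=name)
    # Marker-only artist standing in for censor marks; the leading
    # underscore in its label keeps it out of the legend.
    ax.plot([5], [ys[1]], linestyle="none", marker="s", ms=6,
            color=color, label="_censors")

# Collect handles after all plotting is done, then relabel them.
handles, labels = ax.get_legend_handles_labels()
legend = ax.legend(handles=handles, labels=["test1", "test2"])
```

The key point of the sketch is the ordering: the handles are fetched from the same Axes only after both curves have been drawn, and those handles (not an empty list) are passed to `legend`.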
I want to fix/set (not increase) the distance between the plotting area and the x-axis label in plotnine/ggplot.
library(ggplot2)
ggplot(diamonds)
ggplot(diamonds) + geom_point(aes(x=carat, y=price, color=cut)) + geom_smooth(aes(x=carat, y=price, color=cut))
I want to fix the distance between the two red bars on the plot. I would like to be able to have x-tick labels that take up more space (rotated, larger font, etc.) without affecting where the x-axis label is located relative to the plot. I have found many examples that adjust the spacing, but none that set it manually.
This might be an R specific solution, I don't know how plotnine works under the hood. In R, the height of the x-axis label is determined dynamically by the dimensions of the text, and there is no convenient way of setting this manually (afaik).
Instead, one can edit the height of that row in the gtable and then plot the result.
library(ggplot2)
library(grid)
p <- ggplot(diamonds) +
  geom_point(aes(x=carat, y=price, color=cut)) +
  geom_smooth(aes(x=carat, y=price, color=cut))
# Convert plot to gtable
gt <- ggplotGrob(p)
#> `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'
# Find row in gtable where the bottom axis is located
axis_row <- with(gt$layout, t[grep("axis-b", name)])
# Manually set the height of that row
gt$heights[axis_row] <- unit(2, "cm")
# Display new plot
grid.newpage(); grid.draw(gt)
Created on 2021-08-17 by the reprex package (v1.0.0)
I have a bar chart like this:
and this is the code that I use to generate it:
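import pandas
from matplotlib import pyplot

def performance_plot_builder(data: str, ax: pyplot.Axes):
    df = pandas.read_csv(data, header=0, sep=';')
    # Split e.g. 'etl_copy_if' into library 'etl' and function 'copy_if'
    df[['library', 'function']] = df.name.str.split('_', expand=True, n=1)
    # Keyword arguments, since newer pandas releases no longer accept positional ones here
    df = df.pivot(index='function', columns='library', values='elapsed')
    # Scale each row by its own maximum so the larger bar has height 1
    normalized = df.div(df.max(axis=1), axis=0)
    normalized.plot(ax=ax, kind='bar', color=[c.value for c in Color])
    ax.set_ylabel('execution time (normalized)')
    for p in ax.patches:
        ax.annotate(str(p.get_height()), (p.get_x() * 1.005, p.get_height() * 1.005))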
The data is first normalized relative to the maximum value between the two series for each item and then plotted. I've been able to annotate the value on each bar, however I would like several modifications:
I only want the values displayed on the maximum of each of the two values. For example, for array_access, only the stl bar's value will be shown since it is greater than etl.
The biggest thing I need is for the non-normalized values to be displayed instead of the normalized values, as is the case now (so the values from the df dataframe instead of the normalized dataframe).
I would also like the labels to be rotated 90 degrees so that the labels display on the bars themselves.
This is an example dataframe I have:
library               etl           stl
function
copy         6.922975e-06  6.319098e-06
copy_if      1.369602e-04  1.423410e-04
count        6.135367e-05  1.179409e-04
count_if     1.332942e-04  1.908408e-04
equal        1.099963e-05  1.102448e-05
fill         5.337406e-05  9.352984e-05
fill_n       6.412923e-05  9.354095e-05
find         4.354274e-08  7.804437e-08
find_if      4.792641e-08  9.206846e-08
iter_swap    4.898631e-08  4.911048e-08
rotate       2.816952e-04  5.219732e-06
swap         2.832723e-04  2.882649e-04
swap_ranges  3.492764e-04  3.576686e-04
transform    9.739075e-05  1.080187e-04
I'm really not sure how to go about this since as far as I can tell, the data is retrieved from the Axes object, however this contains the normalized values.
Edit
I was able to somewhat accomplish all the modifications with this code:
interleaved = [val for pair in zip(df['etl'], df['stl']) for val in pair]
for v, p in zip(interleaved, ax.patches):
    if p.get_height() == 1:
        ax.text(x=p.get_x() + 0.01, y=0.825, s=f'{v:.1E}', rotation=90, color='white')
However, this is somewhat hard coded and only works if the bar chart values are normalized, which they are most likely to be, but not necessarily, so I would like a solution that is generic and is independent from the normalized values.
I was able to figure it out:
size = len(ax.patches) // 2
for v_etl, v_stl, p_etl, p_stl in zip(df['etl'], df['stl'], ax.patches[:size], ax.patches[size:]):
    p, v = (p_etl, v_etl) if v_etl > v_stl else (p_stl, v_stl)
    ax.text(x=p.get_x() + 0.18 * p.get_width(), y=p.get_height() - 0.175, s=f'{v:.1E}', rotation=90, color='white')
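The same idea can be written without slicing `ax.patches` in half. The sketch below is one possible variant, not the only way to do it: it uses three rows copied from the example dataframe above, walks `ax.containers` (one bar container per plotted column), and labels only the bar whose raw value is the row maximum. The names `winners` and `labelled` are introduced here for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

# Three rows taken from the example dataframe in the question.
df = pd.DataFrame(
    {"etl": [6.922975e-06, 6.135367e-05, 2.816952e-04],
     "stl": [6.319098e-06, 1.179409e-04, 5.219732e-06]},
    index=pd.Index(["copy", "count", "rotate"], name="function"),
)
normalized = df.div(df.max(axis=1), axis=0)

fig, ax = plt.subplots()
normalized.plot(ax=ax, kind="bar")

# Column holding the larger raw value for each function.
winners = df.idxmax(axis=1)

labelled = []
for container, column in zip(ax.containers, normalized.columns):
    for patch, function in zip(container.patches, normalized.index):
        if winners[function] != column:
            continue  # only annotate the larger of the two bars
        text = f"{df.loc[function, column]:.1E}"  # raw, non-normalized value
        ax.annotate(
            text,
            xy=(patch.get_x() + patch.get_width() / 2, patch.get_height()),
            xytext=(0, -4), textcoords="offset points",  # just below the bar top
            ha="center", va="top", rotation=90, color="white",
        )
        labelled.append((function, column, text))
```

Because the label position is an offset in points from the top of each bar, it does not depend on the bars being normalized to a height of 1.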
I'm trying to make the bars of the plot of the same size with this code:
my_plot_replicas = (ggplot(df)
    + aes(x='algorithm',y='replicas',fill='algorithm')
    + geom_col(position=position_dodge2(preserve='single'))
    + geom_errorbar(aes(ymin='replicas-error', ymax='replicas+error'),
                    width=.2,position=position_dodge(.9))
    + facet_grid('mobility ~ time_elapsed',scales = "free_x")
    + scale_fill_manual(["darkgray", "gray"])
)
But I get this plot, where the bars that are alone take the whole width of the grid:
I would like to have the bars from columns 0 and 43200 of the same size as the others, is that possible?
As per plotnine documentation
space : str in ['fixed', 'free', 'free_x', 'free_y']
Whether the x or y sides of the panels should have the size. It also depends to the scales parameter. Default is 'fixed'. This setting is not yet supported.
plotnine.facets.facet_grid
Thanks for reading my question!
I created a plot using Pyplot. This is my data:
Length of the "point" array: 114745
Length of the "id_item" array: 114745
Length of the "sessions" array: 92128
And this is my code:
point = []
id_item = []
sessions = [] # temp_dict.keys()
for item in cursor_fromCompanyDB:
    sessions.append(item['sessionId'])
    for object_item in item['objects']:
        point.append(object_item['point'])
        id_item.append(object_item['id'])
plt.figure()
plt.title('Scatter Point and Id of SessionId', fontsize=20)
plt.xlabel('point', fontsize=15)
plt.ylabel('Item', fontsize=15)
plt.scatter(point, id_item, marker = 'o')
plt.autoscale(enable=True, axis=u'both', tight=False)
for label, x, y in zip(sessions, point, id_item):
    plt.annotate(label, xy = (x, y))
plt.show()
And this is the result:
As you can see, the values are very close together and hard to read.
I want the id_item values to be shown in full, and the values in the center (sessions) to be easy to read.
Thanks very much for your help.
There are two ways to fix your plot:
1. Make the plot so large that you have to scroll down pages to see every session ID.
2. Reduce your data / don't display everything.
Personally, I'd take option 2. At a certain point it becomes impossible or just really ugly to display a certain amount of points, especially with labels assigned to them. You will have to make sacrifices somewhere.
Edit: If you really want to change your figure size, look here for a thread explaining how to do that.
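As a rough sketch of both options combined, assuming made-up records in place of `cursor_fromCompanyDB`: the figure is enlarged with `figsize`, and only every `step`-th point gets a label. The sketch also appends the session ID once per object rather than once per item. In the question's code the "sessions" list is shorter than "point" (92128 vs 114745 entries), so `zip` pairs labels with the wrong points and silently stops early. The values of `step` and `figsize` are arbitrary choices for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt

# Hypothetical stand-in for the database cursor: 50 sessions, 2 objects each.
records = [
    {"sessionId": f"s{i}",
     "objects": [{"point": i * 0.5 + j, "id": 100 + i + j} for j in range(2)]}
    for i in range(50)
]

point, id_item, sessions = [], [], []
for item in records:
    for object_item in item["objects"]:
        point.append(object_item["point"])
        id_item.append(object_item["id"])
        sessions.append(item["sessionId"])  # one label per plotted point

step = 10  # label only every 10th point (arbitrary choice)
fig, ax = plt.subplots(figsize=(12, 8))  # larger canvas, size in inches
ax.set_title("Scatter Point and Id of SessionId", fontsize=20)
ax.set_xlabel("point", fontsize=15)
ax.set_ylabel("Item", fontsize=15)
ax.scatter(point, id_item, marker="o")

annotated = list(zip(sessions, point, id_item))[::step]
for label, x, y in annotated:
    ax.annotate(label, xy=(x, y))
```

With real data you would probably pick the labelled subset more deliberately (for example, only outliers or only sessions of interest) rather than by a fixed stride.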