Thanks for reading my question!
I created a scatter plot using pyplot. This is my data:
Length of "point" array: 114745
Length of "id_item" array: 114745
Length of "sessions" array: 92128
And this is my code:
point = []
id_item = []
sessions = []  # temp_dict.keys()
for item in cursor_fromCompanyDB:
    sessions.append(item['sessionId'])
    for object_item in item['objects']:
        point.append(object_item['point'])
        id_item.append(object_item['id'])
plt.figure()
plt.title('Scatter Point and Id of SessionId', fontsize=20)
plt.xlabel('point', fontsize=15)
plt.ylabel('Item', fontsize=15)
plt.scatter(point, id_item, marker='o')
plt.autoscale(enable=True, axis=u'both', tight=False)
for label, x, y in zip(sessions, point, id_item):
    plt.annotate(label, xy=(x, y))
plt.show()
And this is the result:
As you can see, the values are very close together and hard to read.
I want the values in id_item to be shown in full, and the labels in the center (sessions) to be easy to see.
Thanks very much for your help.
There are two ways to fix your plot:
1. Make the plot so large that you have to scroll down pages to see every session ID.
2. Reduce your data / don't display everything.
Personally, I'd take option 2. At a certain point it becomes impossible or just really ugly to display a certain amount of points, especially with labels assigned to them. You will have to make sacrifices somewhere.
Edit: If you really want to change your figure size, look here for a thread explaining how to do that.
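A minimal sketch of option 2, assuming it is acceptable to draw every point but label only a random subset. The data here is randomly generated as a stand-in for the real point, id_item, and sessions arrays, and max_labels and the figure size are arbitrary choices to tune:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in data; the real values come from the database cursor.
n_points = 5000
point = rng.random(n_points)
id_item = rng.integers(0, 1000, size=n_points)
labels = [f"s{i}" for i in range(n_points)]

# Pick a small random subset of indices to label; all points are still drawn.
max_labels = 30
chosen = rng.choice(n_points, size=max_labels, replace=False)

fig, ax = plt.subplots(figsize=(16, 9))  # a larger canvas also helps
ax.scatter(point, id_item, marker='o', s=4, alpha=0.3)
for i in chosen:
    ax.annotate(labels[i], xy=(point[i], id_item[i]), fontsize=8)
ax.set_xlabel('point')
ax.set_ylabel('Item')
# Use fig.savefig("scatter.png") or plt.show() to view the result.
```

Random sampling is only one way to thin the labels; labelling the points of a single session, or only outliers, may suit the data better.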
I'm working with the lifelines package to make Kaplan-Meier curves. I'd like to add the censored data, but also have a legend that only mentions the two lines.
I'm calling the function iteratively twice to plot two separate lines, as so:
def plot_km(col, months, dpi):
    ax = plt.subplot(111)
    clrIdx = 0
    for r in df[col].unique():
        ix = df[col] == r
        plt.rcParams['figure.dpi'] = dpi
        plt.rcParams['savefig.dpi'] = dpi
        kmf.fit(T[ix], C[ix], label=r)
        kmf.plot(ax=ax, color=colorsKM[clrIdx], show_censors=True, censor_styles={'ms': 6, 'marker': 's', 'label': '_out_'})
        if clrIdx == 1:
            plt.legend(handles=[], labels=['test1', 'test2'])
        clrIdx += 1
The output is a KM curve together with the censored data points. However, I can't figure out how to work with the handles/labels to get the desired output. The code above seems to ignore the censored objects because of 'label':'_out_', but it also ignores my custom labels in the plt.legend() call. If I enter anything for handles, e.g. plt.legend(handles=[line0]), it throws "NameError: name 'line0' is not defined".
I tried playing around with h, l = ax.get_legend_handles_labels() but this always returns empty. I believe my issue is with not understanding how each of these "artists"(?) are getting stored, and how to get them again.
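As a rough illustration of how the artists are stored, here is a sketch in plain Matplotlib, without lifelines. The step curves and square markers are made-up stand-ins for the survival curves and censor marks, so it only shows the general mechanism: artists whose label starts with an underscore are skipped by get_legend_handles_labels(), and the handles that remain can be passed to legend() with replacement text. The call has to be made on the same Axes the curves were drawn on.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

# Two step curves standing in for the survival curves (made-up numbers).
ax.step([0, 5, 10], [1.0, 0.8, 0.6], where='post', color='C0', label='group A')
ax.step([0, 4, 9], [1.0, 0.7, 0.4], where='post', color='C1', label='group B')

# Marker-only artists standing in for censor marks; the underscore label hides them.
ax.plot([5], [0.8], linestyle='', marker='s', ms=6, color='C0', label='_out_')
ax.plot([4], [0.7], linestyle='', marker='s', ms=6, color='C1', label='_out_')

# Only artists with a non-underscore label are returned here.
handles, labels = ax.get_legend_handles_labels()

# Reuse those handles, but override the text shown in the legend.
legend = ax.legend(handles=handles, labels=['test1', 'test2'])
```

Whether lifelines labels its artists in exactly this way is an assumption; inspecting ax.get_lines() after plotting shows which artists and labels are actually present.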
Using Matplotlib, I am trying to create a line chart, but I am facing the issue below. Here is the code. Can someone help me with any suggestions?
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from matplotlib.dates import SATURDAY

Head = ['Date','Count1','Count2','Count3']
df9 = pd.DataFrame(Data, columns=Head)
df9.set_index('Date',inplace=True)
fig,ax = plt.subplots(figsize=(15,10))
df9.plot(ax=ax)
ax.xaxis.set_major_locator(mdates.WeekdayLocator(SATURDAY))
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b %d'))
plt.legend()
plt.xticks(fontsize= 15)
plt.yticks(fontsize= 15)
plt.savefig('Line.png')  # the file name must be a string
I am getting the error below:
Error: Matplotlib UserWarning: Attempting to set identical left == right == 737342.0 results in singular transformations; automatically expanding (ax.set_xlim(left, right))
Sample Data:
01-10-2010, 100, 0 , 100
X axis: I am trying to display the date, with a tick on every Saturday.
Y axis: the other 3 counts.
Can someone please help me understand what this issue is and how I can fix it?
The issue is caused by the fact that, somehow, pandas.DataFrame.plot explicitly sets the x- and y-limits of your plot to the limits of your data. With a single row of data, the left and right x-limits are the same date, which triggers the warning. This is normally fine, and no one notices. In fact, I had a lot of trouble finding references to your warning anywhere at all, much less in the Pandas bug list.
The workaround is to set your own limits manually in your call to DataFrame.plot:
if len(df9) == 1:
    delta = pd.Timedelta(days=1)
    lims = [df9.index[0] - delta, df9.index[0] + delta]
else:
    lims = [None, None]
df9.plot(ax=ax, xlim=lims)
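The same idea can be wrapped in a small helper. This is only a sketch: padded_xlim is a hypothetical name, the one-day padding is arbitrary, and the single row assumes the sample date means 1 October 2010. It uses plain Matplotlib calls rather than DataFrame.plot so that the limits are easy to inspect:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

def padded_xlim(index, pad=pd.Timedelta(days=1)):
    """Return (left, right) limits, widened by `pad` if they would be identical."""
    left, right = index.min(), index.max()
    if left == right:
        return left - pad, right + pad
    return left, right

# Single-row frame modelled on the sample data (date interpretation is assumed).
df = pd.DataFrame(
    {'Count1': [100], 'Count2': [0], 'Count3': [100]},
    index=pd.DatetimeIndex([pd.Timestamp('2010-10-01')], name='Date'),
)

fig, ax = plt.subplots()
ax.plot(df.index.to_pydatetime(), df['Count1'].to_numpy(), marker='o')
left, right = padded_xlim(df.index)
ax.set_xlim(left.to_pydatetime(), right.to_pydatetime())
```

With more than one distinct date the helper returns the data range unchanged, so it can be applied unconditionally.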
This issue can also arise in a trickier situation: you have more than one point, but only one of them can be drawn on your plot. Typically this happens when only one point is > 0 and the y-scale of your plot is logarithmic.
One should always set limits on a log scale when there are 0 values, because there is no way for the program to decide on a good lower limit for the scale.
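A minimal sketch of setting the limits by hand on a log scale. The data is made up, and placing the limits one decade below and above the positive values is an arbitrary choice, not a rule:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Made-up data: several points, but only one of them is positive.
y = np.array([0.0, 0.0, 250.0, 0.0])
x = np.arange(len(y))

fig, ax = plt.subplots()
ax.plot(x, y, marker='o')
ax.set_yscale('log')

# Derive explicit limits from the positive values only.
positive = y[y > 0]
bottom = positive.min() / 10  # one decade below the smallest positive value
top = positive.max() * 10     # one decade above the largest value
ax.set_ylim(bottom, top)
```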
This is a plot I generated using pyplot; I (attempted to) adjust the text using the adjustText library, which I also found here.
As you can see, it gets pretty crowded where 0 < x < 0.1. I was thinking that there's still ample space in 0.8 < y < 1.0, so the labels could all fit there and mark the points pretty well.
My attempt was:
plt.plot(df.fpr,df.tpr,marker='.',ls='-')
texts = [plt.text(df.fpr[i],df.tpr[i], str(df.thr1[i])) for i in df.index]
adjust_text(texts,
            expand_text=(2,2),
            expand_points=(2,2),
            expand_objects=(2,2),
            force_objects = (2,20),
            force_points = (0.1,0.25),
            lim=150000,
            arrowprops=dict(arrowstyle='-',color='red'),
            autoalign='y',
            only_move={'points':'y','text':'y'}
            )
where df is a pandas DataFrame, which can be found here.
From what I understood in the docs, I tried making the bounding boxes and the y-force larger, thinking that it would push the labels further up, but that does not seem to be the case.
I'm the author of adjustText; sorry, I only just noticed this question. You are having this problem because you have a lot of overlapping texts with exactly the same y-coordinate. It's easy to solve by adding a tiny random shift along y to the labels (and you do need to increase the force for texts, otherwise it works very slowly along one dimension), like so:
np.random.seed(0)
f, ax = plt.subplots(figsize=(12, 6))
plt.plot(df.fpr,df.tpr,marker='.',ls='-')
texts = [plt.text(df.fpr[i], df.tpr[i]+np.random.random()/100, str(df.thr1[i])) for i in df.index]
plt.margins(y=0.125)
adjust_text(texts,
            force_text=(2, 2),
            arrowprops=dict(arrowstyle='-',color='red'),
            autoalign='y',
            only_move={'points':'y','text':'y'},
            )
Also notice that I increased the margins along the y-axis; it helps a lot with the corners. The result is not quite perfect, since limiting the algorithm to just one axis makes life more difficult, but it's OK-ish already.
I have to mention that the size of the figure is very important; I don't know what yours was.
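The tie-breaking step can be checked in isolation, without adjustText or a figure. This is just a sketch with made-up tpr values; the jitter scale of 1/100 mirrors the snippet above:

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up values with many exact ties, as on the flat parts of a ROC curve.
tpr = np.array([0.9, 0.9, 0.9, 1.0, 1.0])

# Add a tiny random upward shift, smaller than 0.01, to each label position.
jittered = tpr + rng.random(tpr.size) / 100

n_distinct_before = len(np.unique(tpr))
n_distinct_after = len(np.unique(jittered))
```

After the shift every label has its own y-coordinate, while each one moves by less than 0.01 in data units.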
I'm trying to get the plot to show on a graph, but for some reason it does not show anything. I have properly imported the matplotlib library, and I am getting the correct values from the function, but when I pass those values to the function where I want the plot to be shown, it displays a blank image. To show that I am getting the correct values, I print them alongside the plotting commands: the values get printed, but the plot does not appear. Here's the code. I was able to get the plot correct using
def GetCounts(data):
    return (data['Sex'].value_counts())

def Get_Plot(Points):
    _x1 = Points[0]
    _x2 = Points[1]
    _y = (_x1 + _x2) - 200
    print('male' + ' ' + str(_x1) + '\n','female' + ' '+ str(_x2), _y)
    plt.bar(height = _x1, tick_label = 'Male', left = _x1)
    plt.xlabel('Counts of people according to Sex')
    plt.ylabel('Number of people')
    plt.show()

Counts = GetCounts(titanic)
Get_Plot(Counts)
I'm trying to place 2 bars in there, 'Male' and 'Female', and I'm not sure how to do that. With the code above I am only able to draw one of them.
Please help. Thanks in advance.
You may want to revisit the plt.bar documentation where it says
pyplot.bar(left, height, width=0.8, bottom=None, hold=None, data=None, **kwargs)
[...]
left : sequence of scalars
the x coordinates of the left sides of the bars
height : scalar or sequence of scalars
the height(s) of the bars
You may thus position the bars at the indices 0 and 1, and their heights will be given by Points:
plt.bar(range(len(Points)),Points)
plt.xticks(range(len(Points)), Points.index)
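A self-contained sketch of the same approach. The Series here is made up as a stand-in for titanic['Sex']. Note that in current Matplotlib the first parameter of bar is named x rather than left, so passing the positions positionally works across versions:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up stand-in for titanic['Sex'].
sex = pd.Series(['male', 'female', 'male', 'male', 'female'], name='Sex')
counts = sex.value_counts()  # index holds the categories, values hold the counts

positions = list(range(len(counts)))
fig, ax = plt.subplots()
bars = ax.bar(positions, counts.to_numpy())  # one bar per category
ax.set_xticks(positions)
ax.set_xticklabels(counts.index)             # label each bar with its category
ax.set_xlabel('Sex')
ax.set_ylabel('Number of people')
# Use plt.show() or fig.savefig("counts.png") to view the result.
```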
I'm using pylab.plot() in a for loop, and for some reason the legend has 6 entries, even though the for loop is only executed 3 times
#Plot maximum confidence
pylab.figure()
for numPeers in sorted(peers.keys()):
    percentUni, maxes = peers[numPeers]
    labels = list(set([i[1] for i in sorted(maxes,
                                            key=itemgetter(1))]))
    percentUni = [i[0] for i in sorted(maxes, key=itemgetter(1))]
    x = []
    y = []
    ci = []
    for l in xrange(len(labels)):
        x.append(l+1)
        y.append(max(maxes[l*3:l*3+3]))
    pylab.plot(x, y, marker='o', label = "N=%d"%numPeers)
pylab.title('Maximal confidence in sender')
pylab.xlabel('Contribute Interval')
pylab.ylabel('Percent confident')
pylab.ylim([0,1])
pylab.xlim([0.5, 7.5])
pylab.xticks(xrange(1,8), labels)
pylab.legend(loc='upper right')
The plot looks like this, with each legend entry having exactly 2 copies.
I know the loop only runs 3x, because if I put in a print statement to debug, it only prints the string 3x.
I did see this in my search, but didn't find it helpful:
Duplicate items in legend in matplotlib?
I had a similar problem. What I ended up doing is add plt.close() at the beginning of my loop. I suspect you're seeing 6 because you have a nested loop where you're changing the x and y.
It ended up being a bug/typo on my part, where I was supposed to write
maxes = [i[0] for i in sorted(maxes, key=itemgetter(1))]
instead of
percentUni = [i[0] for i in sorted(maxes, key=itemgetter(1))]
This mistake meant that maxes remained a list of 2-tuples instead of a list of integers, which is why things were plotted twice. And because I restricted the y-axis, I never saw that there were additional data elements plotted.
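The effect can be reproduced in a few lines. This is a sketch with made-up numbers: when y is a list of 2-tuples, Matplotlib treats it as two columns and draws one line per column, and each line carries the same label, so the legend gets two identical entries per plot call:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

x = [1, 2, 3]
y_tuples = [(0.2, 1), (0.5, 2), (0.9, 3)]   # list of 2-tuples (made-up values)
y_scalars = [pair[0] for pair in y_tuples]  # first element only, as intended

fig, (ax_bad, ax_good) = plt.subplots(1, 2)
lines_bad = ax_bad.plot(x, y_tuples, marker='o', label='N=3')
lines_good = ax_good.plot(x, y_scalars, marker='o', label='N=3')

# Collect the legend labels each Axes would show.
_, labels_bad = ax_bad.get_legend_handles_labels()
_, labels_good = ax_good.get_legend_handles_labels()
print(len(lines_bad), labels_bad)
print(len(lines_good), labels_good)
```

Checking the length of the list returned by plot is a quick way to spot this kind of accidental 2D input.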
Thanks for your help, those who did answer!