I have these two lists:
**l1** = ['100.00', '120.33', '140.21', '159.81', '179.25', '183.13', '202.49', '202.89', '204.18', '205.35', '206.44', '207.45', '208.40', '209.30', '210.15', '210.96', '211.73', '212.47', '213.18', '213.87', '214.53', '215.17', '215.79', '216.39', '216.98', '217.54', '218.10', '218.63', '219.16', '219.67', '220.18', '220.67', '221.15']
**l2** = ['13.14', '13.37', '13.53', '13.66', '13.76', '13.77', '20.70', '21.51', '23.85', '26.39', '29.13', '32.06', '35.17', '38.47', '41.95', '45.63', '49.50', '53.59', '57.90', '62.45', '67.25', '72.33', '77.70', '83.40', '89.43', '95.83', '102.65', '109.90', '117.65', '125.95', '134.84', '144.40', '154.71']
I plot l1 against l2, and this is what it is supposed to look like:
[image: should_be]
I use the following code:
fig, ax = plt.subplots(1)
ax.plot(l1, l2)
plt.show()
and this is what comes out instead:
[image: it_is]
It looks as if the steps are regular even though the values are not equally spaced. Thanks.
The problem here is that the lists contain strings, but they need to be lists of integer/float values. matplotlib treats strings as category labels and spaces them evenly, which is why the steps look regular. I also do not think you need subplots unless you want to have more than one plot. Below I have modified and run the same code, and it works:
import matplotlib.pyplot as plt
l1 = [100.00, 120.33, 140.21, 159.81, 179.25, 183.13, 202.49, 202.89, 204.18, 205.35, 206.44, 207.45, 208.40, 209.30, 210.15, 210.96, 211.73, 212.47, 213.18, 213.87, 214.53, 215.17, 215.79, 216.39, 216.98, 217.54, 218.10, 218.63, 219.16, 219.67, 220.18, 220.67, 221.15]
l2 = [13.14, 13.37, 13.53, 13.66, 13.76, 13.77, 20.70, 21.51, 23.85, 26.39, 29.13, 32.06, 35.17, 38.47, 41.95, 45.63, 49.50, 53.59, 57.90, 62.45, 67.25, 72.33, 77.70, 83.40, 89.43, 95.83, 102.65, 109.90, 117.65, 125.95, 134.84, 144.40, 154.71]
plt.plot(l1, l2)
plt.show()
Output:
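If the values arrive as strings (for example, read from a text file), they do not have to be retyped by hand. A minimal sketch, assuming every entry parses as a decimal number; the shortened lists and the names `x` and `y` are only for illustration:

```python
# Shortened versions of the lists from the question, still stored as strings.
l1 = ['100.00', '120.33', '140.21', '159.81']
l2 = ['13.14', '13.37', '13.53', '13.66']

# Convert each entry to float so matplotlib receives numbers, not text.
x = [float(v) for v in l1]
y = [float(v) for v in l2]

print(x)
print(y)
```

Passing `x` and `y` to `ax.plot` in place of the string lists then spaces the points by value.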
Have you tried limiting the view of the graph?
plt.xlim([25, 50])
or maybe setting the range of both axes?
You can read more about it at this link:
https://stackabuse.com/how-to-set-axis-range-xlim-ylim-in-matplotlib/.
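For the object-oriented interface used in the question, the limits can also be set on the `Axes` itself. A small sketch with made-up data and limits; the `Agg` backend line is only there so it runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line in a notebook
import matplotlib.pyplot as plt

x = [100.0, 150.0, 200.0, 221.15]  # illustrative values only
y = [13.14, 13.6, 20.7, 154.71]

fig, ax = plt.subplots()
ax.plot(x, y)
ax.set_xlim(180, 225)  # same effect as plt.xlim([180, 225])
ax.set_ylim(0, 160)    # same effect as plt.ylim([0, 160])

print(ax.get_xlim(), ax.get_ylim())
```

Note that limits only crop the view; they do not change how string values are spaced along an axis.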
Have you tried using this?
%matplotlib inline
In a Jupyter notebook, graphs typically don't show up unless that line appears above the cell where they are rendered.
I'm trying to use a Sankey chart to show some user segmentation changes using PySankey, but the class order is the opposite of what I want. Is there a way for me to specify the order in which the classes are placed?
Here is the code I'm using (a dummy version):
test_df = pd.DataFrame({
'curr_seg':np.repeat(['A','B','C','D'],4),
'new_seg':['A','B','C','D']*4,
'num_users':np.random.randint(low=10, high=20, size=16)
})
sankey(
left=test_df["curr_seg"], right=test_df["new_seg"],
leftWeight= test_df["num_users"], rightWeight=test_df["num_users"],
aspect=20, fontsize=20
)
Which produces this chart:
I want to have the A class first and the D class last on both the left and right axes. Does anybody know how I can set this up? Thank you very much.
There is a bug in the first line of the check_data_matches_labels function; you need to change it to the following:
if len(labels) > 0:
Then you can use leftLabels and rightLabels to control order.
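With that fix in place, a call with an explicit label order might look like the sketch below. It assumes the package exposes `sankey` in `pySankey.sankey` and that labels are stacked from the bottom up, so the list is reversed to put A at the top; if the order comes out inverted on your version, pass `reverse=False` instead. The helper names `label_order` and `plot_segments` are made up for the example.

```python
def label_order(values, reverse=True):
    """Return the unique labels, sorted in the order handed to pySankey."""
    # Assumption: pySankey draws the first label at the bottom, so the
    # alphabetical order is reversed to get 'A' at the top.
    return sorted(set(values), reverse=reverse)


def plot_segments(df):
    """Draw the chart from the question with the same order on both sides."""
    # Imported here so label_order can be used without pySankey installed.
    from pySankey.sankey import sankey

    sankey(
        left=df["curr_seg"], right=df["new_seg"],
        leftWeight=df["num_users"], rightWeight=df["num_users"],
        leftLabels=label_order(df["curr_seg"]),
        rightLabels=label_order(df["new_seg"]),
        aspect=20, fontsize=20,
    )


print(label_order(["A", "B", "C", "D"] * 4))
```

Calling `plot_segments(test_df)` with the data frame from the question would then draw the chart; every class present in the data has to appear in the label lists.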
I am doing a multiple regression in Stan.
I want a trace plot of the beta vector parameter for the regressors/design matrix.
When I do the following:
fit = model.sampling(data=data, iter=2000, chains=4)
fig = fit.plot('beta')
I get a pretty horrid image:
I was after something a little more user friendly. I have managed to hack the following which is closer to what I am after.
My hack plugs into the back of pystan as follows.
r = fit.extract() # r for results
from pystan.external.pymc import plots
param = 'beta'
beta = r[param]
name = df.columns.values.tolist()
(rows, cols) = beta.shape
assert(len(df.columns) == cols)
values = {param+'['+str(k+1)+'] '+name[k]:
beta[:,k] for k in range(cols)}
fig = plots.traceplot(values, values.keys())
for a in fig.axes:
    # shorten the y-labels
    l = a.get_ylabel()
    if l == 'frequency':
        a.set_ylabel('freq')
    if l == 'sample value':
        a.set_ylabel('val')
fig.set_size_inches(8, 12)
fig.tight_layout(pad=1)
fig.savefig(g_dir+param+'-trace.png', dpi=125)
plt.close()
My question - surely I have missed something - but is there an easier way to get the kind of output I am after from pystan for a vector parameter?
Discovered that the ArviZ module does this pretty well.
ArviZ can be found here: https://arviz-devs.github.io/arviz/
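A minimal sketch of the ArviZ route, assuming `fit` is the PyStan 2 fit object from the question and that an ArviZ 0.x release is installed; the helper name `trace_plot_beta` is made up for the example:

```python
def trace_plot_beta(fit, var_name="beta"):
    """Draw density and trace panels for one parameter of a PyStan fit."""
    # Imported inside the function so the sketch can be defined even
    # where ArviZ is not installed.
    import arviz as az

    # Convert the PyStan fit to ArviZ's InferenceData container.
    idata = az.from_pystan(posterior=fit)
    # One row of panels per element of the vector parameter.
    return az.plot_trace(idata, var_names=[var_name])
```

Calling `trace_plot_beta(fit)` after sampling should give one density panel and one trace panel per element of `beta`.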
I also struggled with this and just found a way to extract the parameters for the traceplot (the betas, I already knew).
When you do your fit, you can save it to a dataframe:
fit_df = fit.to_dataframe()
Now you have a new variable, your dataframe. Yes, it took me a while to find that pystan had a straightforward way to save the fit to a dataframe.
With that at hand you can check your dataframe. You can see its header by printing the keys:
fit_df.keys()
the output is something like this:
Index([u'chain', u'chain_idx', u'warmup', u'accept_stat__', u'energy__',
u'n_leapfrog__', u'stepsize__', u'treedepth__', u'divergent__',
u'beta[1,1]', ...
u'eta05[892]', u'eta05[893]', u'eta05[894]', u'eta05[895]',
u'eta05[896]', u'eta05[897]', u'eta05[898]', u'eta05[899]',
u'eta05[900]', u'lp__'],
dtype='object', length=9037)
Now you have everything you need! The betas are stored in columns, and so are the chain ids. That's all you need to plot the betas and the traceplot, so you can manipulate the data in any way you want and customize your figures as you wish. I'll show you an example of how I did it:
chain_idx = fit_df['chain_idx']
beta11 = fit_df['beta[1,1]']
beta12 = fit_df['beta[1,2]']
plt.figure(figsize=(15,3))
plt.subplot(1,4,1)
sns.kdeplot(beta11)
plt.subplot(1,4,2)
plt.plot(chain_idx, beta11)
plt.subplot(1,4,3)
sns.kdeplot(beta12)
plt.subplot(1,4,4)
plt.plot(chain_idx, beta12)
plt.tight_layout()
plt.show()
The image from the above plot!
I hope it helps (if you still need it) ;)
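The same idea can be extended to draw one line per chain, which makes it easier to see whether the chains agree. The sketch below uses a small synthetic frame in place of `fit.to_dataframe()`; the column names `chain`, `chain_idx` and `beta[1,1]` follow the output shown above, and everything else (the random draws, the `Agg` backend) is an assumption for the example:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; not needed in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for fit.to_dataframe(): 4 chains of 100 draws each.
rng = np.random.default_rng(0)
n_chains, n_draws = 4, 100
fit_df = pd.DataFrame({
    "chain": np.repeat(np.arange(n_chains), n_draws),
    "chain_idx": np.tile(np.arange(n_draws), n_chains),
    "beta[1,1]": rng.normal(size=n_chains * n_draws),
})

fig, ax = plt.subplots(figsize=(8, 3))
# One line per chain, so the traces do not run into each other.
for chain, grp in fit_df.groupby("chain"):
    ax.plot(grp["chain_idx"], grp["beta[1,1]"], lw=0.8, label=f"chain {chain}")
ax.set_xlabel("draw")
ax.set_ylabel("beta[1,1]")
ax.legend()
fig.tight_layout()
```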
Been unable to figure this one out so was hoping someone here could point me in the right direction...
I am basically trying to store the color that was used from my colormap such that I can use it later on in the code.
color_map = cm.get_cmap('Spectral')
for grp, frame in x.groupby('time'):
    ax.scatter(x, y, cmap=color_map)
    <other code>
    ax.axvline(x=magic_number, color=<???>)
plt.show()
Pretty much, I want to use the same color from my map in the for loop. I believe this is pretty simple to do, but I can't seem to find the right combination of things to search for to get the answer.
I couldn't completely understand what you are trying to achieve, so I'm not sure the following will be helpful.
Your code should be something like this:
ax.axvline(x=magic_number, color=color_map(float(magic_number)/float(max_magic_number)))
It works quite simply: float(magic_number)/float(max_magic_number) gives a float in the range from zero to one, and color_map(scaled_number) returns the required color as a tuple of R, G, B and transparency (alpha):
>>> c = get_cmap('Spectral')
>>> c(0.5)
(0.998077662437524, 0.9992310649750096, 0.7460207612456747, 1.0)
>>>
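One way to reuse the color is to pick it explicitly per group and pass the same value to both calls. A sketch with made-up data; the column names `time`, `x` and `y`, and the use of the group mean as the vertical line position, are assumptions, since the question elides that part:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; not needed in a notebook
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "time": [0, 0, 1, 1, 2, 2],
    "x": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "y": [2.0, 1.0, 4.0, 3.0, 6.0, 5.0],
})

color_map = plt.get_cmap("Spectral")
groups = list(df.groupby("time"))
fig, ax = plt.subplots()
colors = []
for i, (grp, frame) in enumerate(groups):
    # Map the group index onto [0, 1] and look up an RGBA tuple.
    color = color_map(i / max(len(groups) - 1, 1))
    colors.append(color)
    ax.scatter(frame["x"], frame["y"], color=color)
    # Reuse exactly the same color for the vertical line.
    ax.axvline(x=frame["x"].mean(), color=color)
```

Keeping the tuples in `colors` also makes them available after the loop has finished.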
As the title hints, I'm struggling to create a plotly chart that has multiple lines that are functions of the same slider variable.
I hacked something together using bits and pieces from the documentation: https://pastebin.com/eBixANqA. This works for one line.
Now I want to add more lines to the same chart, but this is where I'm struggling. https://pastebin.com/qZCMGeAa.
I'm getting a PlotlyListEntryError: Invalid entry found in 'data' at index, '0'
Path To Error: ['data'][0]
Can someone please help?
It looks like you were using https://plot.ly/python/sliders/ as a reference. Unfortunately I don't have time to test with your code, but this should be easily adaptable. Create each trace you want to plot in the same way that you have been:
trace1 = [dict(
type='scatter',
visible = False,
name = "trace title",
mode = 'markers+lines',
x = x[0:step],
y = y[0:step]) for step in range(len(x))]
Note that in my example the data comes from pre-defined lists, whereas you are using a function. That is probably the only change you will really need to make, besides your own step size etc.
If you create a second trace in the same way, for example
trace2 = [dict(
type='scatter',
visible = False,
name = "trace title",
mode = 'markers+lines',
x = x2[0:step],
y = y2[0:step]) for step in range(len(x2))]
Then you can put all your data together with the following
all_traces = trace1 + trace2
then you can just go ahead and plot it provided you have your layout set up correctly (it should remain unchanged from your single trace example):
fig = py.graph_objs.Figure(data=all_traces, layout=layout)
py.offline.iplot(fig)
Your slider should control both traces provided you were following https://plot.ly/python/sliders/ to get the slider working. You can combine multiple data dictionaries this way in order to have multiple plots controlled by the same slider.
I do note that if your lists of dictionaries containing data are of different length, that this gets topsy-turvy.
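For the slider to drive both lines, each slider step has to switch on one entry from each trace list. A sketch of how the steps could be built, assuming both lists have the same length and using plain dictionaries so it runs without Plotly; the data and the names `make_traces` and `n_steps` are made up:

```python
x = [0, 1, 2, 3, 4]
y = [v ** 2 for v in x]
y2 = [2 * v for v in x]


def make_traces(xs, ys, name):
    """One hidden scatter trace per slider position."""
    return [dict(type="scatter", visible=False, name=name,
                 mode="markers+lines", x=xs[:step + 1], y=ys[:step + 1])
            for step in range(len(xs))]


trace1 = make_traces(x, y, "squares")
trace2 = make_traces(x, y2, "doubles")
all_traces = trace1 + trace2
n_steps = len(trace1)

steps = []
for i in range(n_steps):
    visible = [False] * len(all_traces)
    visible[i] = True            # i-th entry of the first line
    visible[n_steps + i] = True  # matching entry of the second line
    steps.append(dict(method="restyle", args=["visible", visible]))

# Show the first position of each line before the slider is touched.
all_traces[0]["visible"] = True
all_traces[n_steps]["visible"] = True

layout = dict(sliders=[dict(active=0, steps=steps)])
# To render: plotly.graph_objects.Figure(data=all_traces, layout=layout)
```

Because the second line's entries are found by the offset `n_steps`, lists of different lengths would need padding or a different index mapping.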