I'm using OpenTURNS to find the best-fit distribution for my data. I managed to plot it, but the X-axis range is far wider than I'd like. My code is:
import statsmodels.api as sm
import openturns as ot
import openturns.viewer as otv
data = in_seconds
sample = ot.Sample(data, 1)
tested_factories = ot.DistributionFactory.GetContinuousUniVariateFactories()
best_model, best_bic = ot.FittingTest.BestModelBIC(sample, tested_factories)
print(best_model)
graph = ot.HistogramFactory().build(sample).drawPDF()
bestPDF = best_model.drawPDF()
bestPDF.setColors(["blue"])
graph.add(bestPDF)
name = best_model.getImplementation().getClassName()
graph.setLegends(["Histogram",name])
graph.setXTitle("Latências (segundos)")
graph.setYTitle("Frequência")
otv.View(graph)
I'd like to set the X limits with something like "graph.setXLim", as we'd do in matplotlib, but I'm stuck as I'm new to OpenTURNS.
Thanks in advance.
Any OpenTURNS graph has a getBoundingBox method which returns the bounding box as a dimension 2 Interval. We can get the lower and upper bounds of this interval with getLowerBound and getUpperBound, and set them with setLowerBound and setUpperBound. Each of these bounds is a Point with dimension 2. Hence, we can set the bounds of the graph before using the View class.
In the following example, I create a simple graph containing the PDF of the gaussian distribution.
import openturns as ot
import openturns.viewer as otv
n = ot.Normal()
graph = n.drawPDF()
_ = otv.View(graph)
Suppose that I want to set the lower bound of the X axis to -1.
The script:
boundingBox = graph.getBoundingBox()
lb = boundingBox.getLowerBound()
print(lb)
produces:
[-4.10428,-0.0195499]
The first value in the Point is the X lower bound and the second is the Y lower bound. The following script sets the first component of the lower bound to -1, wraps the lower bound into the bounding box and sets the bounding box into the graph.
lb[0] = -1.0
boundingBox.setLowerBound(lb)
graph.setBoundingBox(boundingBox)
_ = otv.View(graph)
This produces the following graph.
The advantage of these methods is that they configure the graphics from the library, before the rendering is done by Matplotlib. The drawback is that they are a little more verbose than the Matplotlib counterpart.
Here is a minimal example adapted from the OpenTURNS examples (see http://openturns.github.io/openturns/latest/examples/graphs/graphs_basics.html) that changes the x range from the initial [-4,4] to [-2,2]:
import openturns as ot
import openturns.viewer as viewer
from matplotlib import pylab as plt
n = ot.Normal()
# To configure the look of the plot, we can first observe the type
# of graphics returned by the `drawPDF` method: it is a `Graph`.
graph = n.drawPDF()
# The `Graph` class provides several methods to configure the legends,
# the title and the colors. Since a graphic can contain several sub-graphics,
# the `setColors` method takes a list of colors as its input argument: each item of
# the list corresponds to one sub-graphic.
graph.setXTitle("N")
graph.setYTitle("PDF")
graph.setTitle("Probability density function of the standard gaussian distribution")
graph.setLegends(["N"])
graph.setColors(["blue"])
# Combine several graphics
# In order to combine several graphics, we can use the `add` method.
# Let us create an empirical histogram from a sample.
sample = n.getSample(100)
histo = ot.HistogramFactory().build(sample).drawPDF()
# Then we add the histogram to the `graph` with the `add` method.
# The `graph` then contains two plots.
graph.add(histo)
# Using openturns.viewer
view = viewer.View(graph)
# Get the matplotlib.axes.Axes member with getAxes()
# Similarly, there is a getFigure() method as well
axes = view.getAxes() # axes is a matplotlib object
_ = axes[0].set_xlim(-2.0, 2.0)
plt.show()
You can read the definition of the View object here :
https://github.com/openturns/openturns/blob/master/python/src/viewer.py
As you will see, the View class contains matplotlib objects such as axes and figure. Once you have accessed them with getAxes (or getFigure), you can use the matplotlib methods.
I am plotting some data from cdasws.datarepresentation. I have made the plots using matplotlib, but I cannot figure out how to flip the axes and I couldn't find it in the documentation.
Here is the code:
%pip install xarray
%pip install cdflib
%pip install cdasws
from cdasws import CdasWs
from cdasws.datarepresentation import DataRepresentation
import matplotlib.pyplot as plt
cdas = CdasWs()
datasets = cdas.get_datasets(observatoryGroup='Wind')
for index, dataset in enumerate(datasets):
    print(dataset['Id'], dataset['Label'])
    if index == 5:
        break
variables = cdas.get_variables('WI_H1_WAV')
for variable_1 in variables:
    print(variable_1['Name'], variable_1['LongDescription'])
data_1 = cdas.get_data('WI_H1_WAV', ['E_VOLTAGE_RAD1'],
                       '2020-07-11T02:00:00Z', '2020-07-11T03:00:00Z',
                       dataRepresentation = DataRepresentation.XARRAY)[1]
print(data_1)
### this is a bit of code to obtain the first part of the lower frequency data
print(data_1.E_VOLTAGE_RAD1)
data_1['E_VOLTAGE_RAD1'].plot()
The plot looks like this.
Is there a way to flip the axes?
I tried
plt.gca().invert_xaxis()
but that didn't help.
When you plot a 2-dimensional array, the default plotting handler is xr.DataArray.plot.pcolormesh, so arguments will be handled by that function. To see the full set of available plotting methods and the default handler for 1, 2, and 3 dimensions, see xr.DataArray.plot and the Plotting section of the user guide.
The first two (optional) arguments to DataArray.plot.pcolormesh are the dims that get interpreted as the plot's x and y axes. From the API documentation:
x (str, optional) – Coordinate for x axis. If None, use darray.dims[1].
y (str, optional) – Coordinate for y axis. If None, use darray.dims[0].
So in your case, just provide the dimension names. I'm not sure what they're called in your data, but it should be something like this:
data_1['E_VOLTAGE_RAD1'].plot(x='epoch', y='frequency')
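Since the real dimension names are not shown in the question, here is a small self-contained sketch with synthetic data. The dimension names `epoch` and `frequency` and the coordinate values are made up for illustration, and the `Agg` backend is selected only so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import numpy as np
import xarray as xr

# Synthetic stand-in for data_1['E_VOLTAGE_RAD1']; dimension names are assumptions
voltage = xr.DataArray(
    np.random.rand(4, 6),
    dims=("epoch", "frequency"),
    coords={"epoch": np.arange(4), "frequency": np.linspace(20.0, 1040.0, 6)},
    name="E_VOLTAGE_RAD1",
)

# By default, 2-D data is drawn with x = dims[1] and y = dims[0]
print(voltage.dims)

# Swap the axes by naming the dimensions explicitly
mesh = voltage.plot(x="epoch", y="frequency")
```

Printing `data_1['E_VOLTAGE_RAD1'].dims` on the real data should reveal the names to pass as `x` and `y`.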
I am having difficulties accessing (the right) data when using holoviews/bokeh, either for connected plots showing a different aspect of the dataset, or just customising a plot with dynamic access to the data as plotted (say a tooltip).
TLDR: How to add a projection plot of my dataset (different set of dimensions and linked to main plot, like a marginal distribution but, you know, not restricted to histogram or distribution) and probably with a similar solution a related question I asked here on SO
Let me exemplify (straight from a ipynb, should be quite reproducible):
import numpy as np
import random, pandas as pd
import bokeh
import datashader as ds
import holoviews as hv
from holoviews import opts
from holoviews.operation.datashader import datashade, shade, dynspread, spread, rasterize
hv.extension('bokeh')
With imports set up, let's create a dataset (N target 10e12 ;) to use with datashader. Beside the key dimensions, I really need some value dimensions (here z and z2).
import numpy as np
import pandas as pd
N = int(10e6)
x_r = (0,100)
y_r = (100,2000)
z_r = (0,10e8)
x = np.random.randint(x_r[0]*1000,x_r[1]*1000,size=(N, 1))
y = np.random.randint(y_r[0]*1000,y_r[1]*1000,size=(N, 1))
z = np.random.randint(z_r[0]*1000,z_r[1]*1000,size=(N, 1))
z2 = np.ones((N,1)).astype(int)
df = pd.DataFrame(np.column_stack([x,y,z,z2]), columns=['x','y','z','z2'])
df[['x','y','z']] = df[['x','y','z']].div(1000, axis=0)
df
Now I plot the data, rasterised, and also activate the tooltip to see the defaults. Sure, x/y is trivial, but as I said, I care about the value dimensions. It shows z2 as x_y z2. I have a question related to tooltips with the same sort of data here on SO for value dimension access for the tooltips.
from matplotlib.cm import get_cmap
palette = get_cmap('viridis')
# palette_inv = palette.reversed()
p=hv.Points(df,['x','y'], ['z','z2'])
P=rasterize(p, aggregator=ds.sum("z2"),x_range=(0,100)).opts(cmap=palette)
P.opts(tools=["hover"]).opts(height=500, width=500,xlim=(0,100),ylim=(100,2000))
Now I can add a histogram or a marginal distribution which is pretty close to what I want, but there are issues with this soon past the trivial defaults. (E.g.: P << hv.Distribution(p, kdims=['y']) or P.hist(dimension='y',weight_dimension='x_y z',num_bins = 2000,normed=True))
Both approaches are close, but they do not give me the other value dimension I'd like to visualise. If I try to access the other value dimension ('x_y z'), this fails. Also, the 'x_y z2' way seems very clumsy; is there a better way?
When I do something like this, my browser/notebook-extension blows up, of course.
transformed = p.transform(x=hv.dim('z'))
P << hv.Curve(transformed)
So how do I access all my data in the right way?
Plotting a fairly large point cloud in python using plotly produces a graph with axes (not representative of the data range) and no data points.
The code:
import pandas as pd
import plotly.express as px
import numpy as np
all_res = np.load('fullshelf4_11_2019.npy' )
all_res.shape
(3, 6742382)
np.max(all_res[2])
697.5553566696478
np.min(all_res[2])
-676.311654692491
frm = pd.DataFrame(data=np.transpose(all_res[0:, 0:]),columns=["X", "Y", "Z"])
fig = px.scatter_3d(frm, x='X', y='Y', z='Z')
fig.update_traces(marker=dict(size=4))
fig.update_layout(margin=dict(l=0, r=0, b=0, t=0))
fig.show()
Alternatively you could generate random data and follow the process through
all_res = np.random.rand(3, 6742382)
This also produces a blank graph with incorrect axis scales.
So -- what am I doing wrong, and is there a better way to plot such a moderately large data set?
Thanks for your help!
Try plotting using ipyvolume. It can handle large point cloud datasets.
It seems like that's too much data for WebGL to handle. I managed to plot 100k points, but 1M points already caused Jupyter to crash. However, a 3D scatterplot of 6.7 million points is of questionable value anyway. You probably won't be able to make any sense out of it (except for data boundaries maybe) and it will be super slow to rotate etc.
I would try to think of alternative approaches, depending on what you want to do. Maybe pick a representative subset of points and plot those.
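One way to pick such a subset is uniform random sampling without replacement. The sketch below uses NumPy only; the random `all_res` array is a smaller stand-in for the real file, and `n_keep = 100_000` is an assumption based on the 100k points that still rendered above. Uniform sampling is only one option and may not be "representative" for every dataset.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Stand-in for np.load('fullshelf4_11_2019.npy'): shape (3, number_of_points)
all_res = rng.random((3, 1_000_000))

# Number of points to keep (assumption: about 100k still renders)
n_keep = 100_000

# Draw distinct column indices, then keep the same columns for X, Y and Z
idx = rng.choice(all_res.shape[1], size=n_keep, replace=False)
subset = all_res[:, idx]

print(subset.shape)
```

The subset can then go through the same `pd.DataFrame(data=np.transpose(subset), columns=["X", "Y", "Z"])` and `px.scatter_3d` calls as in the question.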
I would suggest using pythreejs for a point cloud. It has very good performance, even for a large number of points.
import pythreejs as p3
import numpy as np
N = 1_000_000
# Positions centered around the origin
positions = np.random.normal(loc=0.0, scale=100.0, size=(N, 3)).astype('float32')
# Create a buffer geometry with random color for each point
geometry = p3.BufferGeometry(
attributes={'position': p3.BufferAttribute(array=positions),
'color': p3.BufferAttribute(
array=np.random.random((N, 3)).astype('float32'))})
# Create a points material
material = p3.PointsMaterial(vertexColors='VertexColors', size=1)
# Combine the geometry and material into a Points object
points = p3.Points(geometry=geometry, material=material)
# Create the scene and the renderer
view_width = 700
view_height = 500
camera = p3.PerspectiveCamera(position=[800.0, 0, 0], aspect=view_width/view_height)
scene = p3.Scene(children=[points, camera], background="#DDDDDD")
controller = p3.OrbitControls(controlling=camera)
renderer = p3.Renderer(camera=camera, scene=scene, controls=[controller],
width=view_width, height=view_height)
renderer
I'm trying to overplot two arrays with different shapes but I'm unable to project one on top of the other. For example:
#importing the relevant packages
import numpy as np
import matplotlib.pyplot as plt
def overplot(data1,data2):
    '''
    This function should make a contour plot
    of data2 over the data1 plot.
    '''
    #creating the figure
    fig = plt.figure()
    #adding an axe
    ax = fig.add_axes([1,1,1,1])
    #making the plot for the
    #first dataset
    ax.imshow(data1)
    #overplotting the contours
    #for the second dataset
    ax.contour(data2, projection = data2,
               levels = [0.5,0.7])
    #showing the figure
    plt.show(fig)
    return
if __name__ == '__main__':
    '''
    testing zone
    '''
    #creating two mock datasets
    data1 = np.random.rand(3,3)
    data2 = np.random.rand(9,9)
    #using the overplot
    overplot(data1,data2)
Currently, my output is something like:
While what I actually would like is to project the contours of the second dataset into the first one. This way, if I got images of the same object but with different resolution for the cameras I would be able to do such plots. How can I do that?
Thanks for your time and attention.
It's generally best to make the data match, and then plot it. This way you have complete control over how things are done.
In the simple example you give, you could use repeat along each axis to expand the 3x3 data to match the 9x9 data. That is, you could use, data1b = np.repeat(np.repeat(data1, 3, axis=1), 3, axis=0) to give:
But for the more interesting case of images, like you mention at the end of your question, the sizes probably won't be integer multiples of each other, and you'll be better served by a spline or another type of interpolation. This difference is an example of why it's better to have control over this yourself, since there are many ways to do this type of mapping.
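Both cases can be sketched with NumPy alone. `upsample_repeat` is the `np.repeat` approach from above; `resize_linear` is a hypothetical helper that swaps in separable linear interpolation (via `np.interp`) rather than a spline, simply to show resampling to a shape that is not an integer multiple of the original.

```python
import numpy as np


def upsample_repeat(data, factor):
    """Nearest-neighbour upsampling by an integer factor along both axes."""
    return np.repeat(np.repeat(data, factor, axis=1), factor, axis=0)


def resize_linear(data, new_shape):
    """Resample a 2-D array to new_shape with separable linear interpolation."""
    rows, cols = data.shape
    new_rows, new_cols = new_shape
    # Target sample positions, expressed in the index coordinates of the input
    row_pos = np.linspace(0, rows - 1, new_rows)
    col_pos = np.linspace(0, cols - 1, new_cols)
    # Interpolate within each row first, then within each resulting column
    tmp = np.array([np.interp(col_pos, np.arange(cols), row) for row in data])
    out = np.array([np.interp(row_pos, np.arange(rows), col) for col in tmp.T]).T
    return out


data1 = np.arange(9.0).reshape(3, 3)
data1b = upsample_repeat(data1, 3)       # 3x3 -> 9x9, blocky
data1c = resize_linear(data1, (7, 10))   # 3x3 -> 7x10, smooth
print(data1b.shape, data1c.shape)
```

Once the two arrays have the same shape, `ax.imshow` and `ax.contour` can be called on the same axes and will share pixel coordinates.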
I am trying to create four gabor patches, very similar to those below.
I don't need them to be identical to the pictures below, but similar.
Despite a bit of tinkering, I have been unable to reproduce these images...
I believe they were created in MATLAB originally. I don't have access to the original MATLAB code.
I have the following code in python (2.7.10):
import numpy as np
from scipy.misc import toimage # One can also use matplotlib*
data = gabor_fn(sigma = ???, theta = 0, Lambda = ???, psi = ???, gamma = ???)
toimage(data).show()
*graphing a numpy array with matplotlib
gabor_fn, from here, is defined below:
def gabor_fn(sigma,theta,Lambda,psi,gamma):
    sigma_x = sigma;
    sigma_y = float(sigma)/gamma;
    # Bounding box
    nstds = 3;
    xmax = max(abs(nstds*sigma_x*numpy.cos(theta)),abs(nstds*sigma_y*numpy.sin(theta)));
    xmax = numpy.ceil(max(1,xmax));
    ymax = max(abs(nstds*sigma_x*numpy.sin(theta)),abs(nstds*sigma_y*numpy.cos(theta)));
    ymax = numpy.ceil(max(1,ymax));
    xmin = -xmax; ymin = -ymax;
    (x,y) = numpy.meshgrid(numpy.arange(xmin,xmax+1),numpy.arange(ymin,ymax+1 ));
    (y,x) = numpy.meshgrid(numpy.arange(ymin,ymax+1),numpy.arange(xmin,xmax+1 ));
    # Rotation
    x_theta=x*numpy.cos(theta)+y*numpy.sin(theta);
    y_theta=-x*numpy.sin(theta)+y*numpy.cos(theta);
    gb= numpy.exp(-.5*(x_theta**2/sigma_x**2+y_theta**2/sigma_y**2))*numpy.cos(2*numpy.pi/Lambda*x_theta+psi);
    return gb
As you may be able to tell, the only difference (I believe) between the images is contrast. So, gabor_fn would likely need to be altered to allow for this (unless I misunderstand one of the params)... I'm just not sure how.
UPDATE:
from math import pi
from matplotlib import pyplot as plt
data = gabor_fn(sigma=5.,theta=pi/2.,Lambda=12.5,psi=90,gamma=1.)
unit = #From left to right, unit was set to 1, 3, 7 and 9.
bound = 0.0009/unit
fig = plt.imshow(
data
,cmap = 'gray'
,interpolation='none'
,vmin = -bound
,vmax = bound
)
plt.axis('off')
The problem you are having is a visualization problem (although I think the parameters you are choosing are too large).
By default, both matplotlib and scipy's toimage use bilinear (or trilinear) interpolation, depending on your matplotlib configuration. That's why your image looks so smooth: your pixel values are being interpolated, so you are not displaying the raw kernel you have just calculated.
Try using matplotlib with no interpolation:
from matplotlib import pyplot as plt
plt.imshow(data, 'gray', interpolation='none')
plt.show()
For the following parameters:
data = gabor_fn(sigma=5.,theta=pi/2.,Lambda=25.,psi=90,gamma=1.)
You get this output:
If you reduce Lambda to 15, you get something like this:
Additionally, the sigma you choose changes the strength of the smoothing. Adding the parameters vmin=-1 and vmax=1 to imshow (similar to what @kazemakase suggested) will give you the desired contrast.
Check this guide for sensible values for (and ways to use) Gabor kernels:
http://scikit-image.org/docs/dev/auto_examples/plot_gabor.html
It seems like toimage scales the input data so that the min/max values are mapped to black/white.
I do not know what amplitudes to reasonably expect from gabor patches, but you should try something like this:
toimage(data, cmin=-1, cmax=1).show()
This tells toimage what range your data is in. You can try to play around with cmin and cmax, but make sure they are symmetric (i.e. cmin=-x, cmax=x) so that a value of 0 maps to grey.
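To see why the symmetric range matters, here is a NumPy-only sketch of a linear mapping from `[-bound, bound]` to 8-bit grey levels. It illustrates the idea only and is not scipy's actual `toimage` implementation; `to_gray_uint8` is a hypothetical helper name.

```python
import numpy as np


def to_gray_uint8(data, bound):
    """Map values in [-bound, bound] linearly onto grey levels 0..255."""
    clipped = np.clip(data, -bound, bound)      # values outside the range saturate
    scaled = (clipped + bound) / (2.0 * bound)  # now in [0, 1], with 0 -> 0.5
    return np.round(scaled * 255).astype(np.uint8)


# With a symmetric range, -bound is black, 0 is mid-grey and +bound is white
levels = to_gray_uint8(np.array([-1.0, 0.0, 1.0]), bound=1.0)
print(levels)

# A larger bound for the same data compresses it towards grey (lower contrast)
low_contrast = to_gray_uint8(np.array([-1.0, 0.0, 1.0]), bound=4.0)
print(low_contrast)
```

Shrinking `bound` raises the contrast and enlarging it lowers the contrast, which is the same role `cmin`/`cmax` (or `vmin`/`vmax` in `imshow`) play above.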