I am stuck with Python and matplotlib's imshow(). The aim is to show a two-dimensional color map which represents three dimensions.
My x-axis is represented by an array 'TG' (93 entries). My y-axis is a set of arrays dependent on 'TG'; to be precise, there are 93 different arrays, each of length 340. My z-axis is also a set of arrays dependent on 'TG', equally sized as y (93x340).
Basically what I have is a set of two-dimensional measurements which I want to plot in color depending on a third array. Is there a clever way to do that? I tried to find out on my own first, but all I found addresses the more common problem of a single z-plane (a two-dimensional plot). So I have two matrices of the order of (93x340) and one array (93). Can you offer any helpful advice?
Without more detail on your specific problem, it's hard to guess the best way to represent your data. I am going to give an example; hopefully it is relevant.
Suppose we are collecting the height and weight of a group of people. Maybe the index of the person is your first dimension, and the height and weight depend on who it is. Then one way to represent this data is to use height and weight as the x and y axes, and to plot each person as a dot in that two-dimensional space.
In this example, the person index doesn't really have much meaning, thus no color is needed.
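If the third array should drive the color, though, a scatter plot with the c= keyword may already be enough. Here is a minimal sketch, assuming TG, Y and Z shaped as in the question (the variable names and random stand-in data are mine):

import numpy as np
import matplotlib.pyplot as plt

# Hypothetical stand-ins shaped like the question's data
TG = np.linspace(0.0, 92.0, 93)      # x-axis, 93 entries
Y = np.random.rand(93, 340)          # y-values, one row of 340 per TG entry
Z = np.random.rand(93, 340)          # third dimension, drives the color

# Broadcast TG so every (x, y) point has a matching color value from Z
X = np.repeat(TG[:, None], Y.shape[1], axis=1)

sc = plt.scatter(X.ravel(), Y.ravel(), c=Z.ravel(), s=4, cmap='viridis')
plt.colorbar(sc, label='z')
plt.xlabel('TG')
plt.ylabel('y')
plt.show()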
I have a numpy array with this shape: (109, 256). Every row is a frame and every column is a value of the frame's histogram (8 bits).
With k-means I cluster the histograms to get a summary of the frames. I want something like this:
Where every cluster should be a "scene" with similar histograms.
But how can I plot a representative graphic of the k-means process with 256 columns?
I'm trying with this typical example:
plt.scatter(X[:, 0], X[:, 1], c=kmeans.labels_, cmap='rainbow')
But yeah, it shows only 2 columns and doesn't represent the problem. Any help? I'm really new to Python and machine learning.
PS: my k-means code works well and clusters the way I want; I just don't know how to represent it correctly.
You always represent k-means clustering results on two axes, and those axes can be picked arbitrarily. The only way to include more attributes is to scale the size of your points by another variable (for example, the higher the income, the bigger the point) or to use different color shades.
Otherwise, you seem to have done everything correctly; you have to stick to two variables for your axes and can't integrate more.
You could also create more plots with different axes and arrange them in a grid (often this doesn't add much information, though).
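For example, a minimal sketch of the two-columns-plus-color approach, assuming data shaped like the question's (the data here is random and the column choice is arbitrary):

import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

# Random stand-in for the real data: 109 frames x 256 histogram bins
X = np.random.rand(109, 256)

kmeans = KMeans(n_clusters=5, n_init=10).fit(X)

# Pick any two of the 256 columns as the plot axes; the cluster label
# supplies the color, exactly as in the snippet from the question
i, j = 0, 128
plt.scatter(X[:, i], X[:, j], c=kmeans.labels_, cmap='rainbow')
plt.xlabel('histogram bin %d' % i)
plt.ylabel('histogram bin %d' % j)
plt.show()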
I don't think the title is precise enough; if anyone can improve it, please do.
I am using numpy and matplotlib to draw a distribution diagram. As far as I know, np.histogram can only set the range with a bottom and a top value, but I'd like to give it three values: bottom, top, and infinity.
For example
MW = [121, 131, ..., 976, 1400]  # hundreds of out-of-order items
b, bins = np.histogram(MW, bins=10, range=(0, 1000))
ax.bar(bins[:-1] + 50, b, align='center', facecolor='grey', alpha=0.5, width=100)
With this code I can draw a distribution diagram in which ten bins sit at (0-100, 100-200, ..., 900-1000). But there are a few numbers higher than 1000, and I want to put them in a "(1000, +∞)" bin. It seems the range parameter would need to become (0, 1000, infinity) or a number big enough, but that is not available.
An ugly way to do it is with a trick such as:
MW = [x if x < 1000 else 1001 for x in MW]
b, bins = np.histogram(MW, bins=11, range=(0, 1100))
and then change the x tick labels of the plot.
Is there a better way to implement this?
If a trick is the only way, is it possible to quickly change the x tick labels?
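One possible approach (a sketch, not from the original thread): make the clipping trick explicit with np.clip, then relabel the ticks so the last bin reads as the overflow bin. The data here is a random stand-in:

import numpy as np
import matplotlib.pyplot as plt

MW = np.random.randint(0, 2000, 500)  # random stand-in for the real data

# Fold everything above 1000 into one extra overflow bin (1000-1100)
clipped = np.clip(MW, 0, 1001)
b, bins = np.histogram(clipped, bins=11, range=(0, 1100))

fig, ax = plt.subplots()
ax.bar(bins[:-1] + 50, b, align='center', facecolor='grey', alpha=0.5, width=100)

# Relabel the ticks so the overflow bin reads as (1000, +∞)
ticks = np.arange(0, 1101, 100)
labels = [str(t) for t in range(0, 1001, 100)] + ['∞']
ax.set_xticks(ticks)
ax.set_xticklabels(labels)
plt.show()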
I am currently working on a project where I have to bin up to 10-dimensional data. This works totally fine with numpy.histogramdd; however, I have one serious obstacle:
My parameter space is pretty large, but only a fraction of it is actually inhabited by data (say, maybe a few % or so...). In those regions the data is quite rich, so I would like to use relatively small bin widths. The problem, however, is that the RAM usage totally explodes. I see 20GB+ of usage for only 5 dimensions, which is already completely impractical. I tried defining the grid myself, but the problem persists...
My idea would be to manually specify the bin edges, using very large bin widths for empty regions of the data space; only in regions where I actually have data would I need to go to a finer scale.
I was wondering if anyone here knows of an existing implementation of this that works in arbitrary numbers of dimensions.
thanks 😊
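For what it's worth, numpy.histogramdd does accept an explicit list of bin-edge arrays, one per dimension, so manually chosen non-uniform edges can be passed directly. A minimal sketch with made-up data and edges:

import numpy as np

# Hypothetical 3-D data, dense only near the origin
data = np.random.normal(0.0, 1.0, size=(10000, 3))

# Per-dimension edges: fine where the data lives, one wide bin elsewhere
fine = np.linspace(-3.0, 3.0, 61)                  # fine bins in the populated region
edges = np.concatenate([[-100.0], fine, [100.0]])  # coarse outer bins

hist, all_edges = np.histogramdd(data, bins=[edges] * 3)
print(hist.shape)  # (62, 62, 62) instead of a huge uniform grid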
I think you should first remap your data, then create the histogram, and then interpret the histogram knowing the values have been transformed. One possibility would be to tweak the histogram tick labels so that they display mapped values.
One possible way of doing it would be:
Sort one dimension of the data as a one-dimensional array;
Integrate this array, so you have a cumulative distribution;
Find the steepest part of this distribution, and choose a horizontal interval corresponding to a "good" bin size for the peak of your histogram - that is, a size that gives you good resolution;
Find the size of this same interval along the vertical axis. That will give you a bin size to apply along the vertical axis;
Create the bins using the vertical span of that bin - that is, "draw" horizontal, equidistant lines to create your bins, instead of the most common way of drawing vertical ones;
That way, you'll have many bins where the data is dense, and fewer bins where it is sparse (see the sketch after the considerations below).
Two things to consider:
The mapping function is the cumulative distribution of the sorted values along that dimension. This can be quite arbitrary. If the distribution resembles some well known algebraic function, you could define it mathematically and use it to perform a two-way transform between actual value data and "adaptive" histogram data;
This applies to only one dimension. Care must be taken as how this would work if the histograms from multiple dimensions are to be combined.
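A minimal one-dimensional sketch of the idea, using quantiles (i.e., equal-height steps on the cumulative distribution) to place the bin edges; the data and bin count here are made up:

import numpy as np

# Made-up 1-D data: a dense peak plus a sparse uniform background
x = np.concatenate([np.random.normal(5.0, 0.1, 10000),
                    np.random.uniform(0.0, 100.0, 500)])

# Equal-height steps on the cumulative distribution = quantiles of the
# data, so each bin contains roughly the same number of points
n_bins = 50
edges = np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1))

counts, edges = np.histogram(x, bins=edges)
# Bins come out narrow around the peak at 5 and wide in the sparse tail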
I am doing the very simple task of plotting a 2D numpy histogram and displaying it with
mayavi.mlab.imshow(my2dhistogram, interpolate=False)
For a 5x5 array the output is the following,
I would like the bins along the border to be the same size as the ones in the center. I understand the logic of what mayavi is doing but for this application I absolutely need the bins to be equal size. This is for a scientific visualization where each bin represents a measurement on a detector surface.
Any suggestions?
I don't know how to do this the right way (it seems like it would be very difficult to get right from what I know about imshow), but I have a conceptual suggestion.
Represent your NxN matrix of items on the surface with an (N+2)x(N+2) matrix and set the border entries to -1. Then make a customized colormap such that your desired colormap is contained between 0 and 1, with all other entries as (0,0,0,0). I'm not exactly sure how to do that -- if I recall correctly, mayavi modules don't let you set up discontinuous color tables, but you could still hack it this way. Let me know if the part about the color table is confusing, and I can provide some code to make it work.
Also, is there a reason you need to use mayavi's imshow as opposed to say matplotlib for this essentially 2D problem?
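For what it's worth, here is how the border-sentinel idea might look in matplotlib rather than mayavi; this is only a sketch of the concept, using set_under to draw the -1 border transparently (Colormap.copy() needs matplotlib 3.4+):

import numpy as np
import matplotlib.pyplot as plt

hist = np.random.rand(5, 5)  # stand-in for the 5x5 detector histogram

# Pad to (N+2)x(N+2) and use -1 as a border sentinel
padded = np.pad(hist, 1, constant_values=-1)

# Everything below vmin is drawn with the "under" color: fully transparent
cmap = plt.get_cmap('viridis').copy()
cmap.set_under((0, 0, 0, 0))

plt.imshow(padded, cmap=cmap, vmin=0.0, interpolation='nearest')
plt.colorbar()
plt.show()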
For a math fair project I want to make a program that generates a Julia set fractal. To do this I need to plot complex numbers on a graph. Does anyone know how to do this? Remember, I am using complex numbers, not regular coordinates. Thank you!
You could plot the real portion of the number along the X axis and plot the imaginary portion of the number along the Y axis. Plot the corresponding pixel with whatever color makes sense for the output of the Julia function for that point.
Julia set renderings are generally 2D color plots, with [x y] representing a complex starting point and the color usually representing an iteration count.
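A minimal sketch of that mapping, with an arbitrarily chosen Julia constant c and the iteration count as the color:

import numpy as np
import matplotlib.pyplot as plt

# Arbitrary (hypothetical) Julia constant; each choice of c gives a
# different set
c = complex(-0.7, 0.27015)

re = np.linspace(-1.5, 1.5, 600)
im = np.linspace(-1.5, 1.5, 600)
Z = re[None, :] + 1j * im[:, None]    # real part -> x, imaginary part -> y

counts = np.zeros(Z.shape, dtype=int)
alive = np.ones(Z.shape, dtype=bool)  # points that have not diverged yet
for i in range(100):
    Z[alive] = Z[alive] ** 2 + c      # iterate only the surviving points
    alive &= np.abs(Z) < 2
    counts[alive] = i                 # color = iterations survived

plt.imshow(counts, extent=(-1.5, 1.5, -1.5, 1.5), cmap='inferno')
plt.xlabel('Re')
plt.ylabel('Im')
plt.show()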