The smallest valid alpha value in matplotlib? - python

Some of my plots have several million lines. I dynamically adjust the alpha value based on the number of lines, so that outliers more or less disappear while the most prominent features stay clearly visible. But for some alpha values, the lines just disappear.
What is the smallest valid alpha value for line plots in matplotlib? And why is there a lower limit?

As @ImportanceOfBeingErnest suggested in the comments, the lower limit seems to be 1/255.
I did not have time to go through the source code, but I did test it, and my assumption is that the input alpha value needs to be represented as an int between 0 and 255:
int(alpha*255)
When the input alpha value is smaller than 1/255, e.g. 1/256, it is therefore represented by a 0, and the plot lines disappear. Whereas when the alpha is 1/255 (or slightly larger), it is converted to 1, and the plot lines can be seen.
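In code, the suspected mapping looks like this (the helper name is mine, and I have not checked matplotlib's actual source; this is just a sketch of the truncation hypothesis):

```python
def alpha_to_byte(alpha):
    """Map a float alpha in [0, 1] to an 8-bit integer by truncation
    (the hypothesized behavior, not verified against matplotlib)."""
    return int(alpha * 255)

assert alpha_to_byte(1 / 256) == 0     # below the threshold: truncates to 0, lines vanish
assert alpha_to_byte(1.01 / 255) == 1  # just above 1/255: survives as 1, lines visible
```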

My guess would also have been 1./255, such that the maximum 8-bit RGB color multiplied by alpha still makes a non-zero contribution to the image. However, that would only allow drawing lines with fully saturated colors, and in reality it does not hold.
Changing the relevant value in
import matplotlib.pyplot as plt

for i in range(1000):
    plt.plot((0, 100), (0, 100), alpha=1/510, color="g")
I found that the limit is actually 510 (i.e., the minimum alpha is 1./510). This also holds for non-saturated colors (e.g., "wheat"), so reality is obviously more complicated than the naive assumption described above.
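One plausible explanation for 510 (my guess, not verified against the renderer's source): the alpha is rounded rather than truncated, and round-half-up of alpha*255 first becomes nonzero at alpha = 0.5/255 = 1/510. A sketch, deliberately testing just off the exact boundary to dodge floating-point edge cases:

```python
def alpha_to_byte_rounded(alpha):
    """Round-half-up mapping of alpha to an 8-bit value, as a renderer might do it
    (a hypothesis, not confirmed from the matplotlib/Agg source)."""
    return int(alpha * 255 + 0.5)

assert alpha_to_byte_rounded(1 / 511) == 0  # just below 1/510: rounds to 0
assert alpha_to_byte_rounded(1 / 509) == 1  # just above 1/510: rounds to 1
```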
My matplotlib version is 3.4.2.
I haven't investigated this further -- for very large numbers of overlaid images one would have to come up with a different approach anyway. There is a related ticket on github that suggests using mplcairo. Another option would be to export the output images as numpy arrays so that one can manually add and normalize them.

There's no lower limit; the lines just appear to be invisible for very small alpha values.
If you draw one line with alpha=0.01, the difference in color is too small for your screen / eyes to discern. If you draw 100 lines with alpha=0.01 on top of each other, you will see them.
As for your problem, you can just add a small number to the alpha value of each draw call so that lines that would otherwise have alpha < 0.1 still appear.
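A minimal sketch of that idea, implemented as a clamp rather than an addition so that large alphas stay unchanged (the function name and the 1/255 floor are my own choices; tune the floor for your backend):

```python
def visible_alpha(n_lines, floor=1/255):
    """Scale alpha inversely with the number of lines, but never below a visible floor."""
    return max(1.0 / n_lines, floor)

assert visible_alpha(10) == 0.1            # few lines: normal scaling
assert visible_alpha(5_000_000) == 1/255   # millions of lines: clamped, stays visible
```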

Related

Plotting a large number of "lines" as shading or color intensity

I have various pandas dataframes with up to 2000 time series in them. Obviously, a simple df.plot() doesn't really show anything useful (and takes a few minutes to plot). But at least I can easily get (and plot) a (rolling) mean. Simple example:
import numpy as np
import pandas as pd

ts = pd.Series(np.random.randn(1000), index=pd.date_range('1/1/2000', periods=1000))
df = pd.DataFrame(np.random.randn(1000, 14), index=ts.index, columns=list('ABCDEFGHIJKLMN'))
mean_df = df.mean(axis=1)
rolling_mean = mean_df.rolling(window=60, center=True)
ax = df.plot(style=':')
rolling_mean.mean().plot(ax=ax)
With this small example, one can probably see how the underlying data "behaves":
It's pretty symmetric around zero and most of the data is between +1 and -1; quite a bit is between +1 and +2, as well as -1 and -2, with some stuff happening in the 2-to-3 bracket and some outliers going up (or down) to almost +(-)4.
Why can one easily grasp this? Obviously, it's due to the number of lines per area, and thus the intensity or shading of the area. This becomes even clearer when I go monochrome:
However, this lacks a quantification for density or number of lines.
How can I turn this into something quantitative?
I.e., the 2-4 bracket should have various shades of light grey, 1-2 medium greys, and 0-1 dark greys, so that the mean ends up on top of an almost black zone in the graph, leaving me with 50 shades of grey and maybe a color bar to boot.
I could probably play around with various shades of grey as a base color and see what setting different alphas does for a better visual effect, but this seems hacky.
Another option would be to do something like max_df = df.max(1) and min_df = df.min(1) and then use matplotlib to fill between (plt.fill_between(df.index, min_df, max_df)) and rig some way to repeat this for various levels (i.e. 1, 2 and 3 standard deviations away from the mean) so that I would end with some kind of continuous box-plot.
But I am wondering if there is a better way to do this.
Also: I'm not really sure how to best describe what I want/need, so please, if you have any questions/comments about the question, please comment and I'll see what I can edit to make myself clearer.
Another way to visualize the data for density is with the Kernel Density Estimate:
df.plot.kde()
plt.xlim(-4,4)
plt.grid()
plt.show()
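For an explicitly quantitative, time-resolved view of density (my own sketch, separate from the KDE approach; bin counts and colormap are arbitrary choices): bin all (time, value) samples into a 2D histogram, so the grey level directly encodes how many series pass through each cell and the colorbar provides the quantification asked for.

```python
import numpy as np
import matplotlib
matplotlib.use('Agg')  # headless backend
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
data = rng.standard_normal((1000, 14))            # rows: time steps, cols: series
t = np.repeat(np.arange(len(data)), data.shape[1])  # time coordinate of every sample
v = data.ravel()                                    # value coordinate of every sample

fig, ax = plt.subplots()
h = ax.hist2d(t, v, bins=(100, 50), cmap='Greys')   # counts per (time, value) cell
fig.colorbar(h[3], ax=ax, label='number of samples')  # h[3] is the QuadMesh artist
fig.savefig('density.png')
```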

How is it possible to detect multiple attached images, and cut them off?

Good evening :)
I would like to create a script capable of cutting out, one by one, the images present in a large image; so here I want to cut out the 9 different images.
It is not so difficult to cut out these images, because it is really just a matter of cutting along the right lines and the job is done. So where is my problem?
It becomes complicated because there are several variants of these large images; for example, my script might encounter a large image such as this one
(Here, you can see 8 different images, so the lines to be cut are different).
So how do I get there?
I am still a beginner and I do not understand everything :(
I hope I have been as clear as possible.
Thank you very much in advance!
Create two arrays, one the length of the image width and one the length of the image height, and initialize them with zeros.
Since you are trying to test the similarity of horizontal and vertical lines, you should iterate through the image pixels and compare each pixel at position (x,y) to the pixels at (x+1,y) and (x,y+1); that is, each pixel is compared to the pixel to its right and the pixel below it.
The result of each comparison should result with a similarity percentage.
To calculate similarity percentage of two pixels you can follow the answer to this question:
Algorithm to check similarity of colors
The percentage result of each horizontal comparison of the pixel at location (x,y) should be added to the value at index x of the horizontal array.
So for example if:
pixel(3,0) compare to pixel(4,0) = 80%
horizontalArray[3] += 80% (80)
pixel(3,1) compare to pixel(4,1) = 72%
horizontalArray[3] += 72% (152)
pixel(3,2) compare to pixel(4,2) = 95%
horizontalArray[3] += 95% (247)
...
And in a similar way calculate the values for the vertical comparison:
pixel(0,3) compare to pixel(0,4) = 22%
verticalArray[3] += 22% (22)
pixel(1,3) compare to pixel(1,4) = 10%
verticalArray[3] += 10% (32)
pixel(2,3) compare to pixel(2,4) = 76%
verticalArray[3] += 76% (108)
...
After iterating through all of the image pixels, divide the values in the horizontalArray by the image height and the values in the verticalArray by the image width. This leaves you with two arrays containing the average similarity percentage of each horizontal and each vertical line in your picture. Now choose an arbitrary magic number, let's say 15%, and say that each line that got less than 15% similarity is a line you will cut along.
Test the script and see how accurately it finds the correct lines.
If it's not sensitive enough, increase the value of your "magic number"; if it is too sensitive and finds lines where it shouldn't, decrease the value of this magic number and try again.
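The steps above can be sketched as follows. Note the simplifications: the per-channel absolute-difference similarity here is a crude stand-in for the linked color-distance formula, the array names are mine, and the 15% threshold and the toy image are placeholders:

```python
import numpy as np

def cut_lines(img, threshold=15.0):
    """img: (H, W, 3) uint8 array. Returns (column_cuts, row_cuts): boundary
    indices where the average similarity to the adjacent column/row falls
    below threshold (in percent)."""
    f = img.astype(float)
    # similarity (%) between horizontally / vertically adjacent pixels
    sim_x = 100.0 * (1.0 - np.abs(f[:, 1:] - f[:, :-1]).mean(axis=2) / 255.0)
    sim_y = 100.0 * (1.0 - np.abs(f[1:, :] - f[:-1, :]).mean(axis=2) / 255.0)
    col_avg = sim_x.mean(axis=0)  # average over image height, per column boundary
    row_avg = sim_y.mean(axis=1)  # average over image width, per row boundary
    return np.where(col_avg < threshold)[0], np.where(row_avg < threshold)[0]

# toy image: two white panels separated by a black column at x = 10
img = np.full((20, 21, 3), 255, dtype=np.uint8)
img[:, 10] = 0
cols, rows = cut_lines(img)  # the boundaries on both sides of the black column
```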
EDIT
I edited out my suggested color-similarity formula, as it had mistakes that would have led to inaccurate results.
Instead, I added a reference to another answer dealing with the question of color comparison; use it with the rest of the algorithm and it should work.

Set specific aspect ratio for narrow matrix - Matplotlib

I have a 13x1340 matrix that I usually plot correctly without the need to specify an aspect ratio.
However, I would now like to tweak that aspect ratio so that two matrices whose 13 rows correspond to different scales are plotted as rectangles of equal length but different height, proportionally to the corresponding axis scale.
I have tried to use the get_aspect() method to obtain the numerical value being used, but it returns 'auto'. I have tried to guess the value and found that it is close to 4.5/(1340*180), which looks like a completely absurd value to me. I expected it to be something closer to 13/1340, but perhaps I don't quite understand how aspect ratios are calculated.
Setting the aspect ratio to 1 gives me an incredibly thin figure with the proper vertical size. As the value decreases, the figure becomes longer, until it reaches ~ 4.5/(1340*180). After that, it starts losing height while keeping a fixed length.
The figure size is set to 3 inches high by 7 inches wide, and the dpi is set to 300 in the savefig() method.
The get_data_ratio() method returns a value slightly larger than 13/1340, although it is clear that this value is not the aspect ratio used to construct the figure.

How to stack multiple images on top of each other using python or matlab?

How can I stack multiple images and save the new output image using python (or matlab)?
I need to set the alpha of each image and do a little translation, e.g.:
Here's an example (in MATLAB) based on my comment:
mask = zeros(50,50,5);
for n = 1:size(mask,3)
    mask(randi(20):randi(20)+20, randi(20):randi(20)+20, n) = 1;
    mask(:,:,n) = bwperim(mask(:,:,n), 8);
end
A = permute(mask, [3 2 1]);
% plotting
h = slice(A, [], 1:5, []);
set(h, 'EdgeColor', 'none', 'FaceColor', 'interp');
alpha(0.3);
colormap(flipud(flag))
You could make such a stack of translated (shifted) images with Python, using the numpy and matplotlib modules. Pillow (another Python module) by itself could probably do it as well, but I would have to look up how to ensure values of overlapping pixels get added rather than overwritten.
So, here's a numpy + matplotlib solution, that starts off with a test image:
import numpy as np
import matplotlib.pyplot as plt
img1 = plt.imread('img.png')
For those following along, a very simple test image is shown at the end of this post, which will also serve to show the different options available for stacking (overwriting, or additive, which is weighted opacity with equal weights).
layers = 5  # How many images should be stacked.
x_offset, y_offset = 40, 20  # Number of pixels to offset each image.
new_shape = ((layers - 1)*y_offset + img1.shape[0],
             (layers - 1)*x_offset + img1.shape[1],
             4)  # the last number, i.e. 4, refers to the 4 channels: RGB + alpha
stacked = np.zeros(new_shape, dtype=float)
for layer in range(layers):
    stacked[layer*y_offset:layer*y_offset + img1.shape[0],
            layer*x_offset:layer*x_offset + img1.shape[1],
            ...] += img1*1./layers
plt.imsave('stacked.png', stacked, vmin=0, vmax=1)
It's very simple really: you precalculate the size of the output image, initialize it to full transparency, and then "drop" the base image into that array, each time offset by a certain offset vector. The interesting part comes when parts overlap. You then have some choices:
overwrite what was there before. In this case, change the += operator to simply =. Also, don't scale by the number of layers.
add in a weighted fashion. You rescale all the intensity values in each channel by a certain weight (equal weights were taken in the example above) and then add those values. Depending, among other things, on the weights, pixels may saturate. You then have the option to clip the array (thereby losing some information) or to rescale everything by the newly obtained maximum value. The example above uses clipping, by specifying vmin and vmax in the call to imsave.
The test image shown here contains 4 transparent squares, but those are not easily distinguished from the 2 white ones in the top left row. They were added to illustrate the transparency addition and effect of rescaling (white becomes gray).
After running the above code, you end up with something like this (change your offsets though) ("add")
or like this ("overwrite")
There are a few more ways you can think of that reflect what you want to do when pixels overlap. The 2 situations here are probably the most common ones though. In any case, the approach laid out here should give you a good start.

Selecting best range of values from histogram curve

Scenario :
I am trying to track two differently colored objects. At the beginning, the user is prompted to hold the first colored object (say, a RED one) at a particular position in front of the camera (marked on screen by a rectangle) and press any key; my program then takes that portion of the frame (ROI) and analyzes its color to find what color to track. The same goes for the second object. Then, as usual, I use the cv.inRange function in the HSV color plane and track the object.
What is done :
I took the ROI of the object to be tracked, converted it to HSV, and checked the Hue histogram. I got the two cases below:
(Here there is only one major central peak. But in some cases I get two such peaks: one bigger peak with a cluster of pixels around it, and a second peak, smaller than the first but of significant size, also with a small cluster around it. I don't have a sample image of it now, but it looks roughly like the one below (created in Paint).)
Question :
How can I get best range of hue values from these histograms?
By best range I mean, may be around 80-90% of the pixels in ROI lie in that range.
Or is there any better method than this to track different colored objects ?
If I understand correctly, the only thing you need here is to find a maximum in a graph, where the maximum is not necessarily the highest peak but the area with the largest density.
Here's a very simple, not too scientific, but fast O(n) approach: run the histogram through a low-pass filter, e.g. a moving average. The length of your average can be, let's say, 20. In that case the 10th value of your new, smoothed histogram would be:
mh10 = (h1 + h2 + ... + h20) / 20
where h1, h2... are values from your histogram. The next value:
mh11 = (h2 + h3 + ... + h21) / 20
which can be calculated much more easily using the previously calculated mh10, by dropping its first component and adding a new one at the end:
mh11 = mh10 - h1/20 + h21/20
Your only problem is how to handle values at the edges of your histogram. You could shrink the moving average's length to the length available, or you could pad with values before and after what you already have. Either way, you can't properly handle peaks right at the edge.
And finally, when you have this smoothed histogram, just take its maximum. This works because every value in the smoothed histogram now incorporates not only itself but its neighbors as well.
A more sophisticated approach is to weight your average, for example with a Gaussian curve. But then the constant-time update trick above no longer applies: it would be O(k*n), where k is the length of your average, which is also the length of the Gaussian.
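The moving-average approach above in a few lines of numpy (window length 20 as in the example; mode='same' pads with zeros at the edges, which is one crude way of handling the edge problem mentioned):

```python
import numpy as np

def smoothed_peak(hist, window=20):
    """Return the bin index of the maximum of the moving-average-smoothed histogram."""
    kernel = np.ones(window) / window                 # uniform moving-average weights
    smoothed = np.convolve(hist, kernel, mode='same')  # zero-padded at the edges
    return int(np.argmax(smoothed))

# toy hue histogram: a dense cluster around bin 100, plus a lone taller spike at bin 30
hist = np.zeros(180)
hist[95:106] = 50
hist[30] = 80
assert 95 <= smoothed_peak(hist) <= 105  # the dense cluster wins over the isolated spike
```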
