Using the open-source library pylibdmtx, I am able to detect a Data Matrix barcode inside an image. The processing is slower when the barcode occupies only a small portion of a large image. The library takes a few arguments to shrink the image and detect the barcode.
Here is part of the code in the library:
with libdmtx_decoder(img, shrink) as decoder:
    properties = [
        (DmtxProperty.DmtxPropScanGap, gap_size),
        (DmtxProperty.DmtxPropSymbolSize, shape),
        (DmtxProperty.DmtxPropSquareDevn, deviation),
        (DmtxProperty.DmtxPropEdgeThresh, threshold),
        (DmtxProperty.DmtxPropEdgeMin, min_edge),
        (DmtxProperty.DmtxPropEdgeMax, max_edge)
    ]
My question is: is there any other library I could use besides pylibdmtx? Or any suggestions on how to increase the processing speed without affecting accuracy? By the way, pylibdmtx was last updated on 18/1/2017, so it is a maintained library.
One option is to pre-locate the code by image filtering.
A Data Matrix has high contrast (in theory) and a given cell size. If you shrink the image so that the cells become one or two pixels wide, the Data Matrix will stand out as a highly textured area, and the gradient will respond strongly there.
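A rough sketch of that idea with OpenCV (the file name, shrink factor, and gradient threshold below are assumptions you would tune to your cell size):

import cv2
import numpy as np

img = cv2.imread('label.png', cv2.IMREAD_GRAYSCALE)  # hypothetical file name

# Shrink so that one Data Matrix cell is roughly one or two pixels wide
shrink = 8  # assumed factor
small = cv2.resize(img, None, fx=1.0 / shrink, fy=1.0 / shrink,
                   interpolation=cv2.INTER_AREA)

# The gradient magnitude responds strongly in the highly textured code area
gx = cv2.Sobel(small, cv2.CV_32F, 1, 0)
gy = cv2.Sobel(small, cv2.CV_32F, 0, 1)
mag = cv2.magnitude(gx, gy)

# Threshold the response and take the bounding box of the strong pixels
_, binary = cv2.threshold(np.clip(mag, 0, 255).astype(np.uint8), 60, 255,
                          cv2.THRESH_BINARY)
x, y, w, h = cv2.boundingRect(cv2.findNonZero(binary))

# Map the box back to the full-resolution image and decode only that crop
roi = img[y * shrink:(y + h) * shrink, x * shrink:(x + w) * shrink]

Passing only roi to the decoder avoids scanning the whole image.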
I am also using this library for decoding Data Matrix codes, and here is what I found about these arguments. timeout is an int value in milliseconds, which really helps for quick decoding. gap_size is the number of pixels between two Data Matrix codes; it is used when you have more than one code in a sequence to decode, with an equal gap between them. With threshold you can pass a threshold value between 0 and 100 directly to the function, without using OpenCV functions. max_count is the number of Data Matrix codes to decode in one image, and shape is the Data Matrix size index: for 10x10 it is 0, for 12x12 it is 1, and so on.
Using all of these together, we can have quick and effective decoding of Data Matrix codes.
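Putting those arguments together, a decode call might look like the sketch below (the values are only illustrative, and you should check the pylibdmtx documentation for the exact keyword names in your version):

from pylibdmtx.pylibdmtx import decode
from PIL import Image

img = Image.open('label.png')  # hypothetical file name

results = decode(
    img,
    timeout=300,    # milliseconds before giving up
    shrink=2,       # shrink the image internally before scanning
    threshold=50,   # edge threshold between 0 and 100
    max_count=1,    # stop after the first code is found
)
print(results)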
Goal:
For the past two weeks I've been trying to figure out how to convert the following image:
To one that looks like this (may not match exactly, as this image was taken at a different time):
Lens Correction (necessary?):
The first thing I noticed is that simply slicing the image and overlaying the four parts wouldn't work perfectly, as the curvature of certain lines does not match. For instance, the mid-court line bends left in the second slice and bends right in the third slice. This bending looks like a barrel distortion so I tried using both a parameterized lens correction function (passing k1, k2, and k3 to OpenCV) and using lensfun. Since the lensfun database does not include my camera make or model (it's an AXIS camera) and I do not know the make or model of the lens (it's manufactured as part of the camera), I wrote a small script to dump test images using various lenses with various parameters, then skimmed through the thousands of output images until I found one that looked like it had relatively straight lines:
This correction was done using the "Samyang 12mm f/2.8 Fish-Eye ED AS NCS" lens with a "Canon EOS 10D" camera in lensfun. It's probably not perfect, but I figured it was close enough to move on to step two.
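For reference, the parameterized OpenCV route looks roughly like this (the camera matrix and distortion coefficients below are placeholders, not values I actually settled on):

import cv2
import numpy as np

img = cv2.imread('slice_2.png')  # hypothetical file name
h, w = img.shape[:2]

# Guessed intrinsics: optical centre at the image centre, focal length ~ image width
K = np.array([[w, 0, w / 2.0],
              [0, w, h / 2.0],
              [0, 0, 1.0]])

# Distortion coefficients (k1, k2, p1, p2, k3) -- placeholder values
dist = np.array([-0.30, 0.10, 0.0, 0.0, -0.02])

undistorted = cv2.undistort(img, K, dist)
cv2.imwrite('slice_2_corrected.png', undistorted)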
Once the lens distortion was corrected, the second issue is that the same line in two slices was pointing in different directions, which should be corrected with a simple perspective transform. So I began a long quest to figure out the proper parameters for this perspective transform.
Failed Attempts:
1. Using SciPy
I started by writing a cost function to judge the "quality" of a given set of parameters (overlapped pixels should match) and applying SciPy's solver to figure it out. I made several tweaks to my cost function (applying a Gaussian blur, scaling down the image, gray scaling the image, using the Sobel operator to get a gradient, looking only at the pixels on either side of a "seam" after overlapping instead of the whole overlap region, etc) but it always failed to find a good solution. The results looked worse than the original camera image most of the time:
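Stripped of those tweaks, the cost-function setup was essentially this (a simplified sketch; file names are placeholders and the real version added the blurring, Sobel gradients, seam-only comparison, etc.):

import cv2
import numpy as np
from scipy.optimize import minimize

def overlap_cost(params, left, right):
    """Cost of a candidate perspective transform: sum of squared differences
    where the warped right slice overlaps the left slice."""
    H = np.append(params, 1.0).reshape(3, 3)   # 8 free parameters, H[2,2] fixed at 1
    warped = cv2.warpPerspective(right, H, (left.shape[1], left.shape[0]))
    mask = warped > 0
    diff = left.astype(np.float64) - warped.astype(np.float64)
    return float(np.sum(diff[mask] ** 2))

left = cv2.imread('slice_2_corrected.png', cv2.IMREAD_GRAYSCALE)   # hypothetical names
right = cv2.imread('slice_3_corrected.png', cv2.IMREAD_GRAYSCALE)
x0 = np.eye(3).ravel()[:8]                     # start from the identity transform
result = minimize(overlap_cost, x0, args=(left, right), method='Nelder-Mead')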
2. Using math
When that failed I tried applying math to compute the proper perspective transform. I know the FOV of the camera (from the spec sheet), I know the image width and height, I know the sensor size (from the spec sheet), and using a protractor I measured the angles between the lenses. Using the pinhole model I then calculated the expected (x,y) values of points on the image plane and what transform would be necessary to correct them. The results looked better than SciPy, but were still dismal.
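For concreteness, the pinhole-model calculation was along these lines (the FOV, width, and inter-lens angle below are placeholder numbers, not the actual spec-sheet values):

import numpy as np

fov_deg = 90.0            # horizontal field of view of one lens (placeholder)
width_px = 1280           # image width of one slice (placeholder)
angle_between_deg = 70.0  # measured angle between adjacent lenses (placeholder)

# Focal length in pixels from the pinhole model: tan(fov/2) = (width/2) / f
f_px = (width_px / 2.0) / np.tan(np.radians(fov_deg) / 2.0)

def project(theta_deg):
    """x position in pixels, measured from the optical axis, of a ray at angle theta."""
    return f_px * np.tan(np.radians(theta_deg))

# Where a point halfway between two lenses should land on each image plane
seam_x = project(angle_between_deg / 2.0)
print(f_px, seam_x)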
3. Using OpenCV's Stitcher
After this I tried using OpenCV's built-in Stitcher class. However it failed to stitch together slices 2 and 3 due to insufficient overlap between the images (and about 10% of the time it even failed to stitch together slices 1 and 2, presumably because of the non-deterministic nature of RANSAC). Even when it did succeed, the stitch wasn't that great:
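For reference, the Stitcher attempt amounted to this (file names are placeholders):

import cv2

slices = [cv2.imread(p) for p in
          ('slice_1.png', 'slice_2.png', 'slice_3.png', 'slice_4.png')]

stitcher = cv2.Stitcher_create()
status, panorama = stitcher.stitch(slices)

if status == cv2.Stitcher_OK:
    cv2.imwrite('stitched.png', panorama)
else:
    print('Stitching failed with status', status)  # often on the slice 2/3 pair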
4. Using ORB and OpenCV's findHomography
Most recently I tried using ORB with a mask (only looking for features in the overlap region) and OpenCV's findHomography function to create a custom version of the Stitcher. While the matches seemed promising, the resulting stitch was still sub-optimal:
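The masked ORB + findHomography version was along these lines (simplified; the 200-pixel overlap width, RANSAC threshold, and file names are assumed values):

import cv2
import numpy as np

left = cv2.imread('slice_1_corrected.png', cv2.IMREAD_GRAYSCALE)   # hypothetical names
right = cv2.imread('slice_2_corrected.png', cv2.IMREAD_GRAYSCALE)

overlap = 200  # assumed overlap width in pixels
mask_left = np.zeros(left.shape, np.uint8)
mask_left[:, -overlap:] = 255                  # only look at the right edge of the left slice
mask_right = np.zeros(right.shape, np.uint8)
mask_right[:, :overlap] = 255                  # only look at the left edge of the right slice

orb = cv2.ORB_create(nfeatures=2000)
kp1, des1 = orb.detectAndCompute(left, mask_left)
kp2, des2 = orb.detectAndCompute(right, mask_right)

matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des1, des2)
src = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
dst = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)

H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
warped = cv2.warpPerspective(right, H, (left.shape[1] + right.shape[1], left.shape[0]))
warped[:, :left.shape[1]] = left               # naive overlay of the left slice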
I'm beginning to suspect that my methodology (slice -> lens correct -> perspective transform -> overlay) is flawed and there's a better way to do this.
5. Updated ORB / findHomography
I updated my feature detection to eliminate any matches where the Y coordinates differed drastically (e.g. matching the white of the table to the white of the lights). After doing this my number of matched features fell from ~110 to ~55, but the homography was improved significantly. Here's the stitch that results for slices 1/2 and 2/3 with the update:
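The extra filter is essentially one check on the match list from the sketch above, applied before findHomography (the 20-pixel tolerance is an assumed value):

# Keep only matches whose keypoints sit at roughly the same height in both slices,
# which discards e.g. table-white-to-light-white matches.
max_dy = 20  # assumed tolerance in pixels
filtered = [m for m in matches
            if abs(kp1[m.queryIdx].pt[1] - kp2[m.trainIdx].pt[1]) < max_dy]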
Until someone can tell me that I'm going about this all wrong, I'm going to keep pursuing this strategy with the following added step:
1. Slice image
2. Lens correct each slice
3. Perspective transform slice 2 or 3 so that the side line is horizontal and the mid-court line is vertical
4. Use ORB + match filtering + findHomography to iteratively align and then stitch adjacent slices
Ultimately, when it's all said and done, I want to compute a mapping from input pixels to output pixels so that we're not doing all of this complex work (lens correction, ORB, findHomography, etc.) for every frame. We'll do it once per camera, save the mapping to a file somewhere, and then map the input video to the output video frame-by-frame in real time using cv2.remap.
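The precomputed-mapping idea would look something like this (a sketch: map_x/map_y are per-pixel lookup tables; here an identity mapping and a made-up frame size stand in for the real mapping built from the lens correction and homographies):

import cv2
import numpy as np

# map_x[i, j] / map_y[i, j] give the input-image coordinates that should land at
# output pixel (i, j). Build them once per camera by composing the lens-correction
# and homography transforms, then save them; an identity mapping is used here.
out_h, out_w = 1080, 3840  # assumed output frame size
map_x, map_y = np.meshgrid(np.arange(out_w, dtype=np.float32),
                           np.arange(out_h, dtype=np.float32))

cap = cv2.VideoCapture('input_video.mp4')  # hypothetical source
while True:
    ok, frame = cap.read()
    if not ok:
        break
    stitched = cv2.remap(frame, map_x, map_y, interpolation=cv2.INTER_LINEAR)
    # ... write or display `stitched` here, frame by frame at 30 fps
cap.release()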
Note:
The second image I posted showing the "expected output" comes directly from the camera in question. It can be configured to return the first image at 30 fps, or the second image at 10 fps. We wish to perform the stitching off-camera on a more powerful computer so we can get 30 fps but still have the single image.
AXIS provides an SDK for doing the stitching off-camera, but this SDK is Windows-only, while most of our tech stack is Linux and most of our development machines are Macs. I used a Windows computer to try out the stitching SDK they provide; however, I had no luck getting it to compile and run. Their sample code kept throwing errors, and I've never had any luck getting Visual Studio or C++ to play nicely for me.
My suggestion is to train an autoencoder. Use the first image as the input and the second one as the target output, as in a denoising autoencoder:
Note that you may lose resolution if you make the bottleneck in the middle layer too small.
Variational autoencoders use a latent vector instead, but work on the same principle.
You can adapt this code:
# Updated to the current tf.keras API; input_shape, learning_rate, momentum,
# decay_rate, x_train_noisy and x_train are assumed to be defined elsewhere.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Conv2D, BatchNormalization, Activation,
                                     UpSampling2D, MaxPooling2D, Reshape)
from tensorflow.keras.optimizers import SGD

denoise = Sequential()
denoise.add(Conv2D(20, (3, 3), padding='valid', input_shape=input_shape))
denoise.add(BatchNormalization())
denoise.add(Activation('relu'))
denoise.add(UpSampling2D(size=(2, 2)))
denoise.add(Conv2D(20, (3, 3), kernel_initializer='glorot_uniform'))
denoise.add(BatchNormalization())
denoise.add(Activation('relu'))
denoise.add(Conv2D(20, (3, 3), kernel_initializer='glorot_uniform'))
denoise.add(BatchNormalization())
denoise.add(Activation('relu'))
denoise.add(MaxPooling2D(pool_size=(3, 3)))
denoise.add(Conv2D(4, (3, 3), kernel_initializer='glorot_uniform'))
denoise.add(BatchNormalization())
denoise.add(Activation('relu'))
denoise.add(Reshape((28, 28, 1)))  # the layer sizes above assume 28x28x1 inputs

sgd = SGD(learning_rate=learning_rate, momentum=momentum,
          decay=decay_rate, nesterov=False)  # on newer Keras, use a LR schedule instead of decay
denoise.compile(loss='mean_squared_error', optimizer=sgd, metrics=['accuracy'])
denoise.summary()

denoise.fit(x_train_noisy, x_train,
            epochs=50,
            batch_size=30, verbose=1)
I know the basic flow of image registration/alignment, but what happens at the pixel level when two images are registered/aligned? The matching pixels of the moving image, which is transformed onto the fixed image, are kept intact, but what happens to the pixels that are not matched: are they averaged, or something else?
And how is the correct transformation estimated, i.e. how will I know whether to apply translation, scaling, rotation, etc., and how much of each (what angle of rotation, what translation values, etc.)?
Also, in the initial step, how are the similar pixel values identified and matched?
I've implemented the python code given in https://simpleitk.readthedocs.io/en/master/Examples/ImageRegistrationMethod1/Documentation.html
Input images are of prostate MRI scans:
(Figures: fixed image, moving image, output image, and console output.)
The difference can be seen in the output images at the top right and top left. But I can't interpret the console output or understand how things actually work internally.
It would be very helpful to get a deeper explanation of this. Thank you.
A transformation is applied to all pixels. You might be confusing rigid transformations, which will only translate, rotate and scale your moving image to match the fixed image, with elastic transformations, which will also allow some morphing of the moving image.
Any pixel that the transformation cannot place in the fixed image is interpolated from the pixels that it is able to place. A registration is not really intelligent, though.
What it attempts to do is simply reduce a cost function, where a high cost is associated with a large difference between the images and a low cost with a small difference. Cost functions can be intensity-based (pixel values) or feature-based (shapes). The optimizer will (semi-)randomly shift the image around until a preset criterion is met, generally a maximum number of iterations.
What that might look like can be seen in the following gif:
http://insightsoftwareconsortium.github.io/SimpleITK-Notebooks/registration_visualization.gif
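The flow described above (a cost metric plus an optimizer iterating until a stopping criterion) is essentially what the linked SimpleITK example sets up, and the numbers it prints to the console are the metric value as the optimizer iterates. A minimal sketch of that flow, assuming a mean-squares metric, a gradient-descent optimizer, and a translation-only transform (file names are placeholders):

import SimpleITK as sitk

fixed = sitk.ReadImage('fixed.nii', sitk.sitkFloat32)    # hypothetical file names
moving = sitk.ReadImage('moving.nii', sitk.sitkFloat32)

reg = sitk.ImageRegistrationMethod()
reg.SetMetricAsMeanSquares()                              # intensity-based cost
reg.SetOptimizerAsRegularStepGradientDescent(
    learningRate=2.0, minStep=1e-4, numberOfIterations=200)
reg.SetInitialTransform(sitk.TranslationTransform(fixed.GetDimension()))
reg.SetInterpolator(sitk.sitkLinear)

transform = reg.Execute(fixed, moving)
print(reg.GetOptimizerStopConditionDescription(), reg.GetMetricValue())

# Resampling: every moving-image pixel is mapped through the transform and
# interpolated onto the fixed-image grid.
resampled = sitk.Resample(moving, fixed, transform, sitk.sitkLinear, 0.0,
                          moving.GetPixelID())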
I'm looking for an image format with a very low intensity resolution (e.g. 0-25 possible intensity levels) in OpenCV for my real-time image processing.
The problem
I'm developing a machine-learning algorithm based on decision trees. The algorithm takes as input a dataset with multiple features, where each feature can have a value from 0 up to the maximum intensity level. As I am currently working with 8-bit OpenCV formats, this results in a range of 0-255 levels. This resolution is way too fine because I'm working with a very sparse dataset (i.e. it ends up with, say, 70% undefined cases). But if I rescale these values by a factor of 10 (so the range is 0-25), the predictions are still fine and the number of undefined cases goes towards 0.
My current solution
I manually downscale every pixel by a particular value, using loops, before running the algorithm, and afterwards upscale it again so it can be shown to the user. This takes too much time.
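For reference, the preprocessing I describe is essentially the following (a sketch; the factor of 10 matches the example above, and the per-pixel Python loops are the slow part):

import numpy as np

FACTOR = 10  # quantization factor from the example above

def quantize(img):
    """Map 0-255 values down to 0-25 levels for the decision trees."""
    out = np.empty_like(img)
    for y in range(img.shape[0]):      # per-pixel Python loops: this is what is slow
        for x in range(img.shape[1]):
            out[y, x] = img[y, x] // FACTOR
    return out

def dequantize(img):
    """Scale back up to the 0-255 range so the image can be shown to the user."""
    out = np.empty_like(img)
    for y in range(img.shape[0]):
        for x in range(img.shape[1]):
            out[y, x] = img[y, x] * FACTOR
    return out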
It would be nice if OpenCV provided a format with an even smaller resolution, so I could work with such images right away and skip this preprocessing. If 8-bit greyscale images are already the smallest format, what is the fastest way to rescale them?
I have a series of images which serve as my raw data which I am trying to prepare for publication. These images have a series of white specks randomly throughout which I would like to replace with the average of some surrounding pixels.
I cannot post images, but the following code should produce a PNG that approximates the issue that I'm trying to correct:
import numpy as np
import imageio  # scipy.misc.imsave was removed in SciPy >= 1.2

random_array = np.random.random_sample((512, 512))
random_array[random_array < 0.999] *= 0.25  # dim 99.9% of pixels, leaving bright specks
imageio.imwrite('white_specs.png', (random_array * 255).astype(np.uint8))
While this should produce an image with a similar distribution of the specks present in my raw data, my images do not have specks uniform in intensity, and some of the specks are more than a single pixel in size (though none of them are more than 2). Additionally, there are spots on my image that I do not want to alter that were intentionally saturated during data acquisition for the purpose of clarity when presented: these spots are approximately 10 pixels in diameter.
In principle, I could write something to look for pixels whose value exceeds a certain threshold then check them against the average of their nearest neighbors. However, I assume what I'm ultimately trying to achieve is not an uncommon action in image processing, and I very much suspect that there is some SciPy functionality that will do this without having to reinvent the wheel. My issue is that I am not familiar enough with the formal aspects/vocabulary of image processing to really know what I should be looking for. Can someone point me in the right direction?
You could simply try a median filter with a small kernel size,
from scipy.ndimage import median_filter
filtered_array = median_filter(random_array, size=3)
which will remove the specks without noticeably changing the original image.
A median filter is well suited for such tasks since it will better preserve features in your original image with high spatial frequency, when compared for instance to a simple moving average filter.
By the way, if your images are experimental (i.e. noisy), applying a non-aggressive median filter such as the one above never hurts, as it helps attenuate the noise as well.
I would like to know how I can make a density plot in Python. I'm using the following code:
plt.hist2d(x[:,1], x[:,2], weights=log(y), bins=100)
where x is an array and y is the amount of energy in each pixel (I'm working with images of galaxies, but not FITS images). There is a problem with this code: if I choose a small number of bins, for example 240, I can see the structure of the galaxy well, although distorted. If I choose 3000 bins, the image loses a lot of information; many values of y are not plotted. I show the two examples below.
I tried to use plt.imshow, but it does not work; I get TypeError: Invalid dimensions for image data. The data I'm working with comes from HDF5 files.
I would like to be able to plot the image at high resolution, so that the structures of the galaxy can be seen better. Is that possible?
Here are the images:
With the system you describe, you should set the bin size according to the size of the pixel. That way you would have the maximum resolution.
I suspect however, that the maximum number of levels you can represent is 256.
If you would like more resolution, you may have to calculate the image yourself. According to this article, you can save grayscale images with up to 32-bit precision. That is probably overkill; 16-bit is nice, though. The calculation isn't that difficult, and PIL (the Python Imaging Library) has the tools to do the formatting work.
Of course, much depends on the resolution of the data you have available!
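A rough sketch of that do-it-yourself route, using NumPy for the weighted 2D histogram and PIL for a 16-bit grayscale PNG (the data below is synthetic, and the bin count is arbitrary):

import numpy as np
from PIL import Image

# Synthetic stand-ins for the question's x (coordinates) and y (per-pixel energies)
rng = np.random.default_rng(0)
x = rng.normal(size=(100000, 3))
y = np.abs(rng.normal(size=100000)) + 1e-6

# Weighted 2D histogram, the same quantity plt.hist2d computes
H, xedges, yedges = np.histogram2d(x[:, 1], x[:, 2], bins=1024, weights=np.log(y))

# Scale to the full 16-bit range and save as a grayscale PNG
H16 = (65535 * (H - H.min()) / (H.max() - H.min() + 1e-12)).astype(np.uint16)
Image.fromarray(H16).save('density_16bit.png')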