Converting GeoJSON to a raster image using rasterio

Converting GeoJSON to a raster image using rasterio - python

I am hoping for advice as to how I can convert GeoJSON geometry into a tif image using rasterio. I have tried a lot of things, but all of them do not rasterize all the shapes found in the GeoJSON (more like 80% of the file is rasterized). How can I ensure all the geometry is rasterized and is of adequate size? Let me know if my question is unclear.

Your problem stems from the fact that when you rasterize a shape, using gdal_translate for example, you must determine the resolution of the raster, and it must be chosen in accordance to the "size" of your feature vectors if you want your raster to retain enough information.
If you do not want to lose too many details in the rasterization process, I guess a good rule of thumb would be to set the resolution to be lower than the typical size of your individual features. For example, if your features look like squares ~1km large, a "good" resolution would be in the 10-100m range.
A way to programmatically get the typical size of your features would be to compute their minimum_rotated_rectangle for example: https://shapely.readthedocs.io/en/stable/manual.html#object.minimum_rotated_rectangle

Related

Is it possible to turn a low quality image into a high quality one with Python?

I made a tif image based on a 3d model of a woodsheet. (x, y, z) represents a point in a 3d space. I simply map (x, y) to a pixel position in the image and (z) to the greyscale value of that pixel. It worked as I have imagined. Then I ran into a low-resolution problem when I tried to print it. The tif image would get pixilated badly as soon as it zooms out. My research suggests that I need to increase the resolution of the image. So I tried a few super-resolution algos found from online sources, including this one https://learnopencv.com/super-resolution-in-opencv/
The final image did get a lot bigger in resolution (10+ times larger in either dimension) but the same problem persists - it gets pixilated as soon as it zooms out, just about the same as the original image.
Looks like quality of an image has something to do not only with resolution of it but also something else. When I say quality of image, I mean how clear the wood texture is in the image. And when I enlarge it, how sharp/clear the texture remains in the image. Can anyone shed some light on this? Thank you.
original tif
The algo generated tif is too large to be included here (32M)
Gigapixel enhanced tif
Update - Here is a recently achieved result: with a GAN-based solution
It has restored/invented some of the wood grain details. But the models need to be retrained.

In short, it is possible to do this via deep learning reconstruction like the Super Resolution package you referred to, but you should understand what something like this is trying to do and whether it is fit for purpose.
Generic algorithms like the Super Resolution is trained on variety of images to "guess" at details that is not present in the original image, typically using generative training methods like using the low vs high resolution version of the same image as training data.
Using a contrived example, let's say you are trying to up-res a picture of someone's face (CSI Zoom-and-Enhance style!). From the algorithm's perspective, if a black circle is always present inside a white blob of a certain shape (i.e. a pupil in an eye), then next time it the algorithm sees the same shape it will guess that there should be a black circle and fill in a black pupil. However, this does not mean that there is details in the original photo that suggests a black pupil.
In your case, you are trying to do a very specific type of up-resing, and algorithms trained on generic data will probably not be good for this type of work. It will be trying to "guess" what detail should be entered, but based on a very generic and diverse set of source data.
If this is a long-term project, you should look to train your algorithm on your specific use-case, which will definitely yield much better results. Otherwise, simple algorithms like smoothing will help make your image less "blocky", but it will not be able to "guess" details that aren't present.

calculate particle size distribution from AFM measurements

I am trying to obtain a radius and diameter distribution from some AFM (Atomic force microscopy) measurements. So far I am trying out Gwyddion, ImageJ and different workflows in Matlab.
At the moment the best results I have found is to use Gwyddion and to take the Phase image, high pass filter it and then try an edge detection with 'Laplacian of Gaussian'. The result is shown in figure 3. However this image is still too noisy and doesnt really capture the edges of all the particles. (some are merged together others do not have a clear perimeter).
In the end I need an image which segments each of the spherical particles which I can use for blob detection/analysis to obtain size/radius information.
Can anyone recommend a different method?
[

I would definitely try a Granulometry, it was designed for something really similar. There is a good explanation of granulometry here starting page 158.
The granulometry will perform consecutive / increasing openings that will erase the different patterns according to their dimensions. The bigger the pattern, the latter it will be erased. It will give you a curve that represent the pattern dimension distributions in your image, so exactly what you want.
However, it will not give you any information about the position inside the image. If you want to have a rough modeling of the blobs present in your image, you can take a look to the Ultimate Opening.

Maybe you can use Avizo, it's a powerful software for dealing with image issues, especially for three D data (CT)

Converting an AutoCAD model to a matrix of points/volumes with the mass density specified at each location

I am an experimental physicist (grad student) that is trying to take an AutoCAD model of the experiment I've built and find the gravitational potential from the whole instrument over a specified volume. Before I find the potential, I'm trying to make a map of the mass density at each point in the model.
What's important is that I already have a model and in the end I'll have a something that says "At (x,y,z) the value is d". If that's an crazy csv file, a numpy array, an excel sheet, or... whatever, I'll be happy.
Here's what I've come up with so far:
Step 1: I color code the AutoCAD file so that color associates with material.
Step 2: I send the new drawing/model to a slicer (made for 3D printing). This takes my 3D object and turns it into equally spaced (in z-direction) 2d objects... but then that's all output as g-code. But hey! G-code is a way of telling a motor how to move.
Step 3: This is the 'hard part' and the meat of this question. I'm thinking that I take that g-code, which is in essence just a set of instructions on how to move a nozzle and use it to populate a numpy array. Basically I have 3D array, each level corresponds to one position in z, and the grid left is my x-y plane. It reads what color is being put where, and follows the nozzle and puts that mass into those spots. It knows the mass because of the color. It follows the path by parsing the g-code.
When it is done with that level, it moves to the next grid and repeats.
Does this sound insane? Better yet, does it sound plausible? Or maybe someone has a smarter way of thinking about this.
Even if you just read all that, thank you. Seriously.

Does this sound insane? Better yet, does it sound plausible?
It's very reasonable and plausible. Using the g-code could do that, but it would require a g-code interpreter that could map the instructions to a 2D path. (Not 3D, since you mentioned that you're taking fixed z-slices.) That could be problematic, but, if you found one, it could work, but may require some parser manipulation. There are several of these in a variety of languages, that could be useful.
SUGGESTION
From what you describe, it's akin to doing a MRI scan of the object, and trying to determine its constituent mass profile along a given axis. In this case, and unlike MRI, you have multiple colors, so that can be used to your advantage in region selection / identification.
Even if you used a g-code interpreter, it would reproduce an image whose area you'll still have to calculate, so noting that and given that you seek to determine and classify material composition by path (in that the path defines the boundary of a particular material, which has a unique color), there may be a couple ways to approach this without resorting to g-code:
1) If the colors of your material are easily (or reasonably) distinguishable, you can create a color mask which will quantify the occupied area, from which you can then determine the mass.
That is, if you take a photograph of the slice, load the image into a numpy array, and then search for a specific value (say red), you can identify the area of the region. Then, you apply a mask on your array. Once done, you count the occupied elements within your array, and then you divide it by the array size (i.e. rows by columns), which would give you the relative area occupied. Since you know the mass of the material, and there is a constant z-thickness, this will give you the relative mass. An example of color masking using numpy alone is shown here: http://scikit-image.org/docs/dev/user_guide/numpy_images.html
As such, let's define an example that's analogous to your problem - let's say we have a picture of a red cabbage, and we want to know which how much of the picture contains red / purple-like pixels.
To simplify our life, we'll set any pixel above a certain threshold to white (RGB: 255,255,255), and then count how many non-white pixels there are:
from copy import deepcopy
import numpy as np
import matplotlib.pyplot as plt
def plot_image(fname, color=128, replacement=(255, 255, 255), plot=False):
# 128 is a reasonable guess since most of the pixels in the image that have the
# purplish hue, have RGB's above this value.
data = imread(fname)
image_data = deepcopy(data) # copy the original data (for later use if need be)
mask = image_data[:, :, 0] < color # apply the color mask over the image data
image_data[mask] = np.array(replacement) # replace the match
if plot:
plt.imshow(image_data)
plt.show()
return data, image_data
data, image_data = plot_image('cabbage.jpg') # load the image, and apply the mask
# Find the locations of all the pixels that are non-white (i.e. 255)
# This returns 3 arrays of the same size)
indices = np.where(image_data != 255)
# Now, calculate the area: in this case, ~ 62.04 %
effective_area = indices[0].size / float(data.size)
The selected region in question is shown here below:
Note that image_data contains the pixel information that has been masked, and would provide the coordinates (albeit in pixel space) of where each occupied (i.e. non-white) pixel occurs. The issue with this of course is that these are pixel coordinates and not a physical one. But, since you know the physical dimensions, extrapolating those quantities are easily done.
Furthermore, with the effective area known, and knowledge of the physical dimension, you have a good estimate of the real area occupied. To obtain better results, tweak the value of the color threshold (i.e. color). In your real-life example, since you know the color, search within a pixel range around that value (to offset noise and lighting issues).
The above method is a bit crude - but effective - and, it may be worth exploring using it in tandem with edge-detection, as that could help improve the region identification, and area selection. (Note that isn't always strictly true!) Also, color deconvolution may be useful: http://scikit-image.org/docs/dev/auto_examples/color_exposure/plot_ihc_color_separation.html#sphx-glr-auto-examples-color-exposure-plot-ihc-color-separation-py
The downside to this is that the analysis requires a high quality image, good lighting; and, most importantly, it's likely that you'll lose some of the more finer details of the edges, which would impact your masses.
2) Instead of resorting to camera work, and given that you have the AutoCAD model, you can use that and the software itself in addition to the above prescribed method.
Since you've colored each material in the model differently, you can use AutoCAD's slicing tool, and can do something similar to what the first method suggests doing physically: slicing the model, and taking pictures of the slice to expose the surface. Then, using a similar method described above of color masking / edge detection / region determination through color selection, you should obtain a much better and (arguably) very accurate result.
The downside to this, is that you're also limited by the image quality used. But, as it's software, that shouldn't be much of an issue, and you can get extremely high accuracy - close to its actual result.
The last suggestion to improve these results would be to script numerous random thin slicing of the AutoCAD model along a particular directional vector shared by every subsequent slice, exporting each exposed surface, analyzing each image in the manner described above, and then collecting those results to given you a Monte Carlo-like and statistically quantifiable determination of the mass (to correct for geometry effects due to slicing along one given axis).

How to make a density plot in python without loss information?

I would like to know how I can make a density plot in python. I'm using the following code plt.hist2d(x[:,1],x[:,2],weights=log(y),bins=100)
where the x values are an array, and y is how much energie there are in the respective pixel (I'm working with galaxies's images, but not fits images). But there is a problem with this code, if I choose a little value of bins, for example 240, I can see well the structures of the galaxy, however distorced. If I choose a bin's value of 3000, the image loss an amount of information, many values of y do not are plotted. I will show the two examples below.
I tried to use plt.imshow but does not work, appears the problem TypeError: Invalid dimensions for image data. The data that I'm working comes from hdf5 files.
I would like to have the possibility to plot the image, with high resolution, to be possible see the structures of the galaxy better. It's possible?
Here is the images:

With the system you describe, you should set the bin size according to the size of the pixel. That way you would have the maximum resolution.
I suspect however, that the maximum number of levels you can represent is 256.
If you would like more resolution, you may have to calculate the image yourself. According to this article, you can save images with up to 32 bit precision in grayscale. That would probably be exaggerating. 16-bit is nice though. This calculation isn't that difficult, and the PIL (Python Imaging Library) has the tools to do the formatting work.
Of course, much depends on the resolution of the data you have available!

Detect the location of an image within a larger image

How do you detect the location of an image within a larger image? I have an unmodified copy of the image. This image is then changed to an arbitrary resolution and placed randomly within a much larger image which is of an arbitrary size. No other transformations are conducted on the resulting image. Python code would be ideal, and it would probably require libgd. If you know of a good approach to this problem you'll get a +1.

There is a quick and dirty solution, and that's simply sliding a window over the target image and computing some measure of similarity at each location, then picking the location with the highest similarity. Then you compare the similarity to a threshold, if the score is above the threshold, you conclude the image is there and that's the location; if the score is below the threshold, then the image isn't there.
As a similarity measure, you can use normalized correlation or sum of squared differences (aka L2 norm). As people mentioned, this will not deal with scale changes. So you also rescale your original image multiple times and repeat the process above with each scaled version. Depending on the size of your input image and the range of possible scales, this may be good enough, and it's easy to implement.
A proper solution is to use affine invariants. Try looking up "wide-baseline stereo matching", people looked at that problem in that context. The methods that are used are generally something like this:
Preprocessing of the original image
Run an "interest point detector". This will find a few points in the image which are easily localizable, e.g. corners. There are many detectors, a detector called "harris-affine" works well and is pretty popular (so implementations probably exist). Another option is to use the Difference-of-Gaussians (DoG) detector, it was developed for SIFT and works well too.
At each interest point, extract a small sub-image (e.g. 30x30 pixels)
For each sub-image, compute a "descriptor", some representation of the image content in that window. Again, many descriptors exist. Things to look at are how well the descriptor describes the image content (you want two descriptors to match only if they are similar) and how invariant it is (you want it to be the same even after scaling). In your case, I'd recommend using SIFT. It is not as invariant as some other descriptors, but can cope with scale well, and in your case scale is the only thing that changes.
At the end of this stage, you will have a set of descriptors.
Testing (with the new test image).
First, you run the same interest point detector as in step 1 and get a set of interest points. You compute the same descriptor for each point, as above. Now you have a set of descriptors for the target image as well.
Next, you look for matches. Ideally, to each descriptor from your original image, there will be some pretty similar descriptor in the target image. (Since the target image is larger, there will also be "leftover" descriptors, i.e. points that don't correspond to anything in the original image.) So if enough of the original descriptors match with enough similarity, then you know the target is there. Moreover, since the descriptors are location-specific, you will also know where in the target image the original image is.

You probably want cross-correlation. (Autocorrelation is correlating a signal with itself; cross correlating is correlating two different signals.)
What correlation does for you, over simply checking for exact matches, is that it will tell you where the best matches are, and how good they are. Flip side is that, for a 2-D picture, it's something like O(N^3), and it's not that simple an algorithm. But it's magic once you get it to work.
EDIT: Aargh, you specified an arbitrary resize. That's going to break any correlation-based algorithm. Sorry, you're outside my experience now and SO won't let me delete this answer.

http://en.wikipedia.org/wiki/Autocorrelation is my first instinct.

Take a look at Scale-Invariant Feature Transforms; there are many different flavors that may be more or less tailored to the type of images you happen to be working with.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.