On Creating Random Walkers in 3D elements of 3 different 4D arrays

On Creating Random Walkers in 3D elements of 3 different 4D arrays - python

I am currently working with 2 different 4D arrays (x,y,z,l) containing voxel data of a brain scan (x,y,z) and a label (l). Every voxel holds the probability of that voxel to contain that label.
The first array, represents the brain tissue in 3D space, has 3 labels, white matter, grey matter, and the CSF. My only area of interest is grey matter.
The second aray, represents a pre defined probability distribution of 360 labels, all of which correspond to a portion of the grey matter.
The "shape" of the probability distributions are in 3D is similar, but not exactly the same. But they are already aligned in 3D space with a combination of rigid and affine transformations, and it is "as similar as possible".
What I want to do, is to map this 360 3D elements, onto the second 3D element of first array and get a best fit.
I am not an expert in machine learning, so I tried to come up with my own idea;
Pick every voxel from grey matter probability map
Generate 1000 random walks from each point.
Which territory is walked most? - Label the point
What I explained above, I havent been able to code yet, therefore I am unable to offer any code.
I am still trying to create "the random walking sampler" but I quickly realized I need a lot of nested for loops work on a lot of different 3D arrays (1363 to be exact).
Is there a better way to do it?
Is there a better way to do this?

Related

How can I fit a perfect square onto a ndarray matrix containing the edges of a fuzzy image of an imperfect square?

Context: Microscopy.
I'm trying to stitch together small images (can refer to as sub-images or tiles) of a fuzzy grid to make a larger "mosaic" or "atlas" of the overall grid, such that these sub-images are lined up perfectly. The orientation is random overall, but consistent from tile to tile, and the four sides of a tile typically cut through several squares in the grid. I've used scikit-image to get the edges of each square separately as an Nx2 matrix (ndarray). What I think would be great is if I could fit a proper square (I do mean the kind with four sides and four right angles -- not a number multiplied by itself in case that's unclear) over each edge/silhouette of a square (I do have unrelated edges detected that pick up noise so need to get rid of those), but a) not sure how to do that and b) the squares themselves have noise--for example a piece of sample or other contaminant falling right on the edge of a square distorts the shape a bit.
In the images above, I'm showing one such sub-image/tile (I have others which line up as neighbors to this, with slight (~1%) overlap. I figure if I have the geometric centers of all of the squares, I'll have no trouble finding the correct dimensions for creating the larger mosaic by average distance between neighboring squares in the grid is consistent. Please feel free to berate me mercilessly if I'm going about this poorly, or ask any follow-up questions if I failed to mention anything relevant.
Also maybe worth mentioning: The script I'm attempting to write needs to do this in an automated, fast (no human interaction) way behind-the-scenes, and for other grid images not just the one you see here, and that's important because other images will likely be fuzzier, as this is what's called a "bare" grid, but others will have "nanowires" (imagine this but the black part is almost furry/hairy).
Thanks in advance!!

Calculate 3D Plane that Rests on a 3D Surface

I have about 300,000 points defining my 3D surface. I would like to know if I dropped a infinitely stiff sheet onto my 3D surface, what the equation of that plane would be. I know I need to find the 3 points the sheet would rest on as that defines a plane, but I'm not sure how to find my 3 points out of the ~300,000. You can assume this 3D surface is very bumpy and that this sheet will most likely lie on 3 "hills".
Edit: Some more background knowledge. This is point cloud data for a scan of a 3D surface which is nearly flat. What I would like to know is how this object would rest if I flipped it over and put it on a completely flat surface. I realize that this surface may be able to rest on the table in various different ways depending on the density and thickness of the object but you can assume the number of ways is finite and I would like to know all of the different ways just in case.
Edit: After looking at some point cloud libraries I'm thinking of doing something like computing the curvature using a kd tree (using SciPy) and only looking at regions that have a negative curvature and then there should be 3+ regions with negative curvature so some combinatorics + iterations should give the correct 3 points for the plane(s).

Get 3D coordinates from two 2D frames

I couldn't find the proper answer to my problem on the Web, so I'll ask it here. Let's say we're given two 2D photos of the same place taken from slightly different angles. I've chosen the set of points (edge detection), found correspondences between them (which point is which on other photo). Now I need to somehow find out world coordinates of these points in 3D.
For the last 5 hours I've read a lot about it but I still can't understand what steps should I follow. I've tried to estimate motion of a camera using the function recoverPose applied to an essential matrix and two sets of points on each frame. I can't understand what it gives me when I know rotation and translation matrices (thatrecoverPose returned). What should I do in order to achieve my goal?
I also know the calibration matrix of my camera (I use KITTI dataset). I've read opencv documentation but still don't understand.
It's monocular vision.

Converting an AutoCAD model to a matrix of points/volumes with the mass density specified at each location

I am an experimental physicist (grad student) that is trying to take an AutoCAD model of the experiment I've built and find the gravitational potential from the whole instrument over a specified volume. Before I find the potential, I'm trying to make a map of the mass density at each point in the model.
What's important is that I already have a model and in the end I'll have a something that says "At (x,y,z) the value is d". If that's an crazy csv file, a numpy array, an excel sheet, or... whatever, I'll be happy.
Here's what I've come up with so far:
Step 1: I color code the AutoCAD file so that color associates with material.
Step 2: I send the new drawing/model to a slicer (made for 3D printing). This takes my 3D object and turns it into equally spaced (in z-direction) 2d objects... but then that's all output as g-code. But hey! G-code is a way of telling a motor how to move.
Step 3: This is the 'hard part' and the meat of this question. I'm thinking that I take that g-code, which is in essence just a set of instructions on how to move a nozzle and use it to populate a numpy array. Basically I have 3D array, each level corresponds to one position in z, and the grid left is my x-y plane. It reads what color is being put where, and follows the nozzle and puts that mass into those spots. It knows the mass because of the color. It follows the path by parsing the g-code.
When it is done with that level, it moves to the next grid and repeats.
Does this sound insane? Better yet, does it sound plausible? Or maybe someone has a smarter way of thinking about this.
Even if you just read all that, thank you. Seriously.

Does this sound insane? Better yet, does it sound plausible?
It's very reasonable and plausible. Using the g-code could do that, but it would require a g-code interpreter that could map the instructions to a 2D path. (Not 3D, since you mentioned that you're taking fixed z-slices.) That could be problematic, but, if you found one, it could work, but may require some parser manipulation. There are several of these in a variety of languages, that could be useful.
SUGGESTION
From what you describe, it's akin to doing a MRI scan of the object, and trying to determine its constituent mass profile along a given axis. In this case, and unlike MRI, you have multiple colors, so that can be used to your advantage in region selection / identification.
Even if you used a g-code interpreter, it would reproduce an image whose area you'll still have to calculate, so noting that and given that you seek to determine and classify material composition by path (in that the path defines the boundary of a particular material, which has a unique color), there may be a couple ways to approach this without resorting to g-code:
1) If the colors of your material are easily (or reasonably) distinguishable, you can create a color mask which will quantify the occupied area, from which you can then determine the mass.
That is, if you take a photograph of the slice, load the image into a numpy array, and then search for a specific value (say red), you can identify the area of the region. Then, you apply a mask on your array. Once done, you count the occupied elements within your array, and then you divide it by the array size (i.e. rows by columns), which would give you the relative area occupied. Since you know the mass of the material, and there is a constant z-thickness, this will give you the relative mass. An example of color masking using numpy alone is shown here: http://scikit-image.org/docs/dev/user_guide/numpy_images.html
As such, let's define an example that's analogous to your problem - let's say we have a picture of a red cabbage, and we want to know which how much of the picture contains red / purple-like pixels.
To simplify our life, we'll set any pixel above a certain threshold to white (RGB: 255,255,255), and then count how many non-white pixels there are:
from copy import deepcopy
import numpy as np
import matplotlib.pyplot as plt
def plot_image(fname, color=128, replacement=(255, 255, 255), plot=False):
# 128 is a reasonable guess since most of the pixels in the image that have the
# purplish hue, have RGB's above this value.
data = imread(fname)
image_data = deepcopy(data) # copy the original data (for later use if need be)
mask = image_data[:, :, 0] < color # apply the color mask over the image data
image_data[mask] = np.array(replacement) # replace the match
if plot:
plt.imshow(image_data)
plt.show()
return data, image_data
data, image_data = plot_image('cabbage.jpg') # load the image, and apply the mask
# Find the locations of all the pixels that are non-white (i.e. 255)
# This returns 3 arrays of the same size)
indices = np.where(image_data != 255)
# Now, calculate the area: in this case, ~ 62.04 %
effective_area = indices[0].size / float(data.size)
The selected region in question is shown here below:
Note that image_data contains the pixel information that has been masked, and would provide the coordinates (albeit in pixel space) of where each occupied (i.e. non-white) pixel occurs. The issue with this of course is that these are pixel coordinates and not a physical one. But, since you know the physical dimensions, extrapolating those quantities are easily done.
Furthermore, with the effective area known, and knowledge of the physical dimension, you have a good estimate of the real area occupied. To obtain better results, tweak the value of the color threshold (i.e. color). In your real-life example, since you know the color, search within a pixel range around that value (to offset noise and lighting issues).
The above method is a bit crude - but effective - and, it may be worth exploring using it in tandem with edge-detection, as that could help improve the region identification, and area selection. (Note that isn't always strictly true!) Also, color deconvolution may be useful: http://scikit-image.org/docs/dev/auto_examples/color_exposure/plot_ihc_color_separation.html#sphx-glr-auto-examples-color-exposure-plot-ihc-color-separation-py
The downside to this is that the analysis requires a high quality image, good lighting; and, most importantly, it's likely that you'll lose some of the more finer details of the edges, which would impact your masses.
2) Instead of resorting to camera work, and given that you have the AutoCAD model, you can use that and the software itself in addition to the above prescribed method.
Since you've colored each material in the model differently, you can use AutoCAD's slicing tool, and can do something similar to what the first method suggests doing physically: slicing the model, and taking pictures of the slice to expose the surface. Then, using a similar method described above of color masking / edge detection / region determination through color selection, you should obtain a much better and (arguably) very accurate result.
The downside to this, is that you're also limited by the image quality used. But, as it's software, that shouldn't be much of an issue, and you can get extremely high accuracy - close to its actual result.
The last suggestion to improve these results would be to script numerous random thin slicing of the AutoCAD model along a particular directional vector shared by every subsequent slice, exporting each exposed surface, analyzing each image in the manner described above, and then collecting those results to given you a Monte Carlo-like and statistically quantifiable determination of the mass (to correct for geometry effects due to slicing along one given axis).

Examples on N-D arrays usage

I was surprised when I started learning numpy that there are N dimensional arrays. I'm a programmer and all I thought that nobody ever use more than 2D array. Actually I can't even think beyond a 2D array. I don't know how think about 3D, 4D, 5D arrays or more. I don't know where to use them.
Can you please give me examples of where 3D, 4D, 5D ... etc arrays are used? And if one used numpy.sum(array, axis=5) for a 5D array would what happen?

A few simple examples are:
A n x m 2D array of p-vectors represented as an n x m x p 3D matrix, as might result from computing the gradient of an image
A 3D grid of values, such as a volumetric texture
These can even be combined in the case of a gradient of a volume in which case you get a 4D matrix
Staying with the graphics paradigm, adding time adds an extra dimension, so a time-variant 3D gradient texture would be 5D
numpy.sum(array, axis=5) is not valid for a 5D-array (as axes are numbered starting at 0)

Practical applications are hard to come up with but I can give you a simple example for 3D.
Imagine taking a 3D world (a game or simulation for example) and splitting it into equally sized cubes. Each cube could contain a specific value of some kind (a good example is temperature for climate modelling). The matrix can then be used for further operations (simple ones like calculating its Transpose, its Determinant etc...).
I recently had an assignment which involved modelling fluid dynamics in a 2D space. I could have easily extended it to work in 3D and this would have required me to use a 3D matrix instead.
You may wish to also further extend matrices to cater for time, which would make them 4D. In the end, it really boils down to the specific problem you are dealing with.
As an end note however, 2D matrices are still used for 3D graphics (You use a 4x4 augmented matrix).

There are so many examples... The way you are trying to represent it is probably wrong, let's take a simple example:
You have boxes and a box stores N items in it. You can store up to 100 items in each box.
You've organized the boxes in shelves. A shelf allows you to store M boxes. You can identify each box by a index.
All the shelves are in a warehouse with 3 floors. So you can identify any shelf using 3 numbers: the row, the column and the floor.
A box is then identified by: row, column, floor and the index in the shelf.
An item is identified by: row, column, floor, index in shelf, index in box.
Basically, one way (not the best one...) to model this problem would be to use a 5D array.

For example, a 3D array could be used to represent a movie, that is a 2D image that changes with time.
For a given time, the first two axes would give the coordinate of a pixel in the image, and the corresponding value would give the color of this pixel, or a grey scale level. The third axis would then represent time. For each time slot, you have a complete image.
In this example, numpy.sum(array, axis=2) would integrate the exposure in a given pixel. If you think about a film taken in low light conditions, you could think of doing something like that to be able to see anything.

They are very applicable in scientific computing. Right now, for instance, I am running simulations which output data in a 4D array: specifically
| Time | x-position | y-position | z-position |.
Almost every modern spatial simulation will use multidimensional arrays, along with programming for computer games.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.