Numpy function over matrix - python

So my question is quite similar to this post: Most efficient way to map function over numpy array, but I have some additional questions.
Right now, I'm taking in an image represented by a 2-D array, and for each pixel in the image I am doing a computation that involves convolving its n x n neighborhood with a Gaussian kernel to find a "weight" for that pixel. My end goal is to return a 2-D array of the same size as the input, with the calculated weight in place of each pixel.
So what I did was to first create a function getWeight that, given a pixel, does the necessary computation using its neighbors and a Gaussian kernel to find its corresponding weight.
So my question is: given getWeight, is using a for loop (or numpy.fromiter) to apply this function to every pixel in the 2-D array the best way to go about solving this problem?
Or could there be a way to use built-in numpy functions to apply this sort of operation to the entire array at once? (This question is kind of vague, but what I am trying to get at is that, since numpy operations on arrays are not actually implemented as a Python for loop over every pixel, there might be something I could use to optimize my problem.)
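Since the exact computation inside getWeight isn't shown, here is a minimal sketch assuming the weight is a Gaussian-weighted sum of the n x n neighborhood; in that case the whole image can be processed in a single scipy.ndimage.convolve call, with no Python loop over pixels (gaussian_kernel and all sizes below are my own illustrative choices):

import numpy as np
from scipy.ndimage import convolve

def gaussian_kernel(n, sigma=1.0):
    # Build an n x n Gaussian kernel, normalized to sum to 1.
    ax = np.arange(n) - (n - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return k / k.sum()

img = np.random.rand(480, 640)          # stand-in for the input image
kernel = gaussian_kernel(5, sigma=1.0)  # hypothetical 5x5 neighborhood

# One call convolves every pixel's neighborhood with the kernel at once.
weights = convolve(img, kernel, mode='reflect')
assert weights.shape == img.shape

If the weight computation cannot be expressed as a convolution, scipy.ndimage.generic_filter applies an arbitrary Python function to each neighborhood, though it still loops in Python internally and is correspondingly slower.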


Efficient way to draw line with specific falloff (e.g. blurry line)

Not completely sure what to call this problem, but I will try my best to explain it here.
I have the coordinates of a line I want to draw onto a numpy array. However, I don't just want a simple line, but a thick line where I can specify the falloff (brightness as a function of distance from the line) with a curve or mathematical function. For example, I might want a Gaussian falloff, which would look similar to the example below, where a Gaussian blur was applied to the image.
However, using blur filters does not allow the flexibility in falloff functions I would like, and does not enable precise control of the falloff (for example, when I want points on the line to have exactly the value 1.0 and points further than, say, 10 pixels away to be 0.0).
I have attempted to solve this problem by creating the falloff pattern for a single point, drawing that pattern into a new numpy channel for every point of the line, and then merging the channels via the max function. This works but is too slow.
Is there a more efficient way to draw such a line from my input coordinates?
The solution I came up with is to make use of dilations. This method is more general and can be applied to any polygonal shape or binary mask.
Rasterize the geometry the simple way first: for points, set the corresponding pixel; for lines, draw 1-pixel-thick lines with a library function from OpenCV or similar; for polygons, draw the boundary or fill the polygon with OpenCV functions. This results in an initial mask with value 1 on the geometry.
Iteratively apply dilations to this mask. This grows the mask pixel by pixel; set the strength of the newly added pixels according to an arbitrary falloff function.
The dilation operation is available in OpenCV. Alternatively, it can be implemented efficiently as a simple convolution with boolean matrices, which can then run on GPU devices.
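A minimal sketch of the dilation loop, using scipy.ndimage.binary_dilation so it is self-contained (cv2.dilate works analogously); draw_with_falloff and the falloff function below are my own illustrative names:

import numpy as np
from scipy.ndimage import binary_dilation

def draw_with_falloff(mask, falloff, n_steps=10):
    # Grow the binary mask ring by ring; each new ring of pixels gets
    # the value falloff(d), where d is the number of dilation steps.
    out = np.where(mask, falloff(0), 0.0)
    current = mask.copy()
    for d in range(1, n_steps + 1):
        grown = binary_dilation(current)
        ring = grown & ~current          # pixels added at this step
        out[ring] = falloff(d)
        current = grown
    return out

# Example: a 1-pixel line with a Gaussian falloff that is exactly 1.0
# on the line and 0.0 beyond 10 pixels.
mask = np.zeros((64, 64), dtype=bool)
mask[32, 10:50] = True
img = draw_with_falloff(mask,
                        lambda d: np.exp(-d**2 / 18.0) if d <= 10 else 0.0,
                        n_steps=12)

Note that the default structuring element grows the mask in 4-connectivity, so d approximates Manhattan distance; a disk-shaped structuring element gives a rounder falloff.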
An example of the results for a polygonal input, with exponential and sinusoidal falloffs (result images not reproduced here).

Rotate 4x4 image patch by multiples of 15 degrees

I want to find an efficient way to rotate 4x4 image patches from a larger image by angles that are multiples of 15 degrees. I am currently extracting a 6x6 patch, e.g. patch = img[x-3:x+3, y-3:y+3], and then running scipy.ndimage.interpolation.rotate(patch, -15*o, reshape=False)[1:5, 1:5]. However, I essentially need to do this at every location (x, y) in the image. I have a "stacked" version of the image in an array of size (m, n, 6, 6), where m and n are the dimensions of the original image, but even if I run interpolation.rotate on the stacked version, it looks like it internally just iterates, and it takes a long time.
Since I only need fixed angles, I am trying to pre-compute some constants and vectorize the implementation so that I can process all the patches at once. I have tried digging into the implementation of SciPy's rotate, but it did not help much.
Is there a sensible way to do this?
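One possible approach, sketched under the assumption that bilinear interpolation (order=1) is acceptable rather than scipy.ndimage.rotate's default cubic spline: precompute the rotated sampling offsets once per angle, then gather the samples for every location in a single scipy.ndimage.map_coordinates call (rotated_patches is my own illustrative name):

import numpy as np
from scipy.ndimage import map_coordinates

def rotated_patches(img, angle_deg, size=4):
    m, n = img.shape
    t = np.deg2rad(angle_deg)
    # Patch-local offsets around the centre (-1.5..1.5 for size=4).
    off = np.arange(size) - (size - 1) / 2.0
    u, v = np.meshgrid(off, off, indexing='ij')
    # Rotate the sampling grid; du, dv depend only on the fixed angle,
    # so they can be precomputed once per angle.
    du = u * np.cos(t) - v * np.sin(t)
    dv = u * np.sin(t) + v * np.cos(t)
    # Broadcast the offsets against every patch centre in the image.
    xs = np.arange(m)[:, None, None, None] + du   # (m, 1, size, size)
    ys = np.arange(n)[None, :, None, None] + dv   # (1, n, size, size)
    xs, ys = np.broadcast_arrays(xs, ys)          # (m, n, size, size)
    samples = map_coordinates(img, [xs.ravel(), ys.ravel()],
                              order=1, mode='nearest')
    return samples.reshape(m, n, size, size)

patches = rotated_patches(np.random.rand(100, 100), angle_deg=15)

The intermediate coordinate arrays are of size m*n*size*size, so for large images it may be necessary to process the image in chunks of rows.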

Compute distance between combinations of points in a grid

I am looking for an efficient solution to the following problem. It should work with Python, but the solution does not have to be in Python.
I have a 2D matrix; each element of the matrix represents a point in a 2D orthogonal grid. I want to compute the shortest distance between pairs of points in the grid. This would be trivial if there were no "obstacles" in the grid.
A figure helps to explain:
Each cell in the figure is one element of the matrix (the matrix here is square, but it could be rectangular). Gray cells are obstacles; any path between two points must go around them. The green cells are those I am interested in. I am not interested in the red cells, but a path can go through them.
The distance between points like A and B is trivial to compute, but how do I compute the path between A and C as shown in the figure?
I have read about the A* algorithm, but since I am working with a rather big grid, generally (a few hundred) x (a few hundred), I was wondering if there is a smarter alternative. Remember: I have to find the distance between all pairs of "green cells", not just between two of them. If I have n green cells, I will have a number of combinations equal to the binomial coefficient C(n, 2) = n(n-1)/2.
The grid is fixed; I have to compute all the distances once and then use them in further calculations, say accessing them based on the relevant indices in the matrix.
Note: the problem is NOT this one, where the coordinates are in a list. My 2D coordinates are organised in a 2D grid, and the question is about exploiting this aspect to obtain a more efficient algorithm.
I suppose the most straightforward solution would be the Floyd-Warshall algorithm, which computes the shortest distances between all pairs of nodes in a graph. This doesn't necessarily exploit the fact that you happen to have a 2D grid (it could work on other kinds of graphs too), but it should work fine. The fact that you do have a 2D grid may enable you to implement it more efficiently than if you had to write an implementation for any arbitrary graph (e.g. you can just store distances as they're computed in a matrix, instead of some less efficient data structure).
The regular version only produces the distances of all shortest paths as output, not the paths themselves. There's additional info on the Wikipedia page on how to modify the algorithm so that you can efficiently reconstruct paths if necessary.
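A minimal numpy sketch of Floyd-Warshall on the free cells of a grid, with the two inner loops vectorized; be aware that the running time is O(V^3) in the number of free cells, so this is only practical for modest grids (the function and variable names are mine):

import numpy as np

def all_pairs_grid_distances(free):
    # free: boolean grid, True = walkable cell.
    cells = np.argwhere(free)
    V = len(cells)
    idx = -np.ones(free.shape, dtype=int)
    idx[tuple(cells.T)] = np.arange(V)
    dist = np.full((V, V), np.inf)
    dist[np.arange(V), np.arange(V)] = 0.0
    # Unit-cost edges between 4-connected free neighbours.
    for i, (r, c) in enumerate(cells):
        for dr, dc in ((0, 1), (1, 0)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < free.shape[0] and 0 <= cc < free.shape[1] and free[rr, cc]:
                j = idx[rr, cc]
                dist[i, j] = dist[j, i] = 1.0
    # Floyd-Warshall; the k-th pass relaxes all pairs through node k.
    for k in range(V):
        np.minimum(dist, dist[:, k:k+1] + dist[k:k+1, :], out=dist)
    return dist, cells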
Intuitively, I suspect more efficient implementations may be possible which do exploit the fact that you have a 2D grid, probably using ideas from Rectangular Symmetry Reduction and/or Jump Point Search. Both of those ideas are traditionally used with A* for single-pair pathfinding queries, though; I'm not aware of any work using them for all-pairs shortest path computations. My intuition says they can be exploited there too, but in the time it would take to figure that out exactly and implement it correctly, you can probably much more easily implement and run Floyd-Warshall.

Examples of N-D array usage

I was surprised when I started learning numpy that there are N-dimensional arrays. I'm a programmer, and I thought that nobody ever uses more than a 2D array. Actually, I can't even think beyond a 2D array: I don't know how to think about 3D, 4D, 5D arrays or more, and I don't know where to use them.
Can you please give me examples of where 3D, 4D, 5D, etc. arrays are used? And what would happen if one used numpy.sum(array, axis=5) on a 5D array?
A few simple examples are:
An n x m 2D array of p-vectors represented as an n x m x p 3D matrix, as might result from computing the gradient of an image
A 3D grid of values, such as a volumetric texture
These can even be combined for the gradient of a volume, in which case you get a 4D matrix
Staying with the graphics paradigm, adding time adds an extra dimension, so a time-variant 3D gradient texture would be 5D
numpy.sum(array, axis=5) is not valid for a 5D array (as axes are numbered starting at 0)
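To make the axis-numbering point concrete, a quick check (the shape here is arbitrary):

import numpy as np

a = np.random.rand(2, 3, 4, 5, 6)  # a 5D array; valid axes are 0..4
print(np.sum(a, axis=4).shape)     # (2, 3, 4, 5): sums over the last axis
# np.sum(a, axis=5)                # raises an AxisError: axis out of bounds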
Practical applications are hard to come up with, but I can give you a simple example for 3D.
Imagine taking a 3D world (a game or simulation, for example) and splitting it into equally sized cubes. Each cube could contain a specific value of some kind (a good example is temperature for climate modelling). The matrix can then be used for further operations (simple ones like calculating its transpose, its determinant, etc.).
I recently had an assignment which involved modelling fluid dynamics in a 2D space. I could have easily extended it to work in 3D and this would have required me to use a 3D matrix instead.
You may wish to also further extend matrices to cater for time, which would make them 4D. In the end, it really boils down to the specific problem you are dealing with.
As an end note, however, 2D matrices are still used for 3D graphics (you use a 4x4 augmented matrix).
There are so many examples... The way you are trying to represent it is probably wrong. Let's take a simple example:
You have boxes and a box stores N items in it. You can store up to 100 items in each box.
You've organized the boxes in shelves. A shelf allows you to store M boxes. You can identify each box by an index.
All the shelves are in a warehouse with 3 floors. So you can identify any shelf using 3 numbers: the row, the column and the floor.
A box is then identified by: row, column, floor and the index in the shelf.
An item is identified by: row, column, floor, index in shelf, index in box.
Basically, one way (not the best one...) to model this problem would be to use a 5D array.
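As a quick sketch of that model (all dimension sizes here are made up):

import numpy as np

# 10 rows x 8 columns of shelves, 3 floors, 20 boxes per shelf,
# up to 100 items per box.
warehouse = np.zeros((10, 8, 3, 20, 100))

# One item is addressed by five indices:
warehouse[2, 5, 0, 7, 42] = 1.0    # row, column, floor, box, item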
For example, a 3D array could be used to represent a movie, that is, a 2D image that changes with time.
For a given time, the first two axes give the coordinates of a pixel in the image, and the corresponding value gives the color of that pixel, or a greyscale level. The third axis then represents time: for each time slot, you have a complete image.
In this example, numpy.sum(array, axis=2) would integrate the exposure at a given pixel. If you think about a film taken in low-light conditions, you could imagine doing something like that to be able to see anything.
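A small sketch of that idea (the shape is a made-up example):

import numpy as np

movie = np.random.rand(480, 640, 100)  # 100 frames of a 480x640 greyscale film
exposure = np.sum(movie, axis=2)       # integrate each pixel over time
assert exposure.shape == (480, 640)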
They are very applicable in scientific computing. Right now, for instance, I am running simulations which output data in a 4D array: specifically (time, x-position, y-position, z-position).
Almost every modern spatial simulation will use multidimensional arrays, as will programming for computer games.

Using Python to find distance and angle of surrounding data points in 3D. Finding straight lines

I am using IPython Notebook. I am working on a project where I need to look at about 100 data points in 3D space and figure out the distance between each pair and the angle from one point to another. I want to see correlations between the data points and ultimately see if there is any structure to the data (a straight line hidden somewhere). I have looked into clustering techniques and Hough transforms, but they do not seem to give me the result I need. Any ideas are much appreciated, thanks!
For the first issue of determining the pairwise distance between three-dimensional points, you can use scipy.spatial.distance.pdist(). This will generate n(n-1)/2 distances for n points. For the second issue, finding the angle between points, that's trickier. It seems so tricky that I don't even really want to think about it; however, to that end, you can use scipy.spatial.distance.cosine(), which will determine the cosine distance between two vectors.
Have you looked at scikits? I've found them very helpful in my work. http://scikit-learn.org/stable/
The distance is best found using scipy.spatial.distance.pdist(), as mentioned in cjohnson318's answer. For a small array of points 'a' defined as:
import numpy as np
a = np.array([[0, 0, 0], [1, 1, 1], [4, 2, -2], [3, -1, 2]])
The Euclidean distance matrix 'D' between the points can then be found as:
from scipy.spatial.distance import pdist, squareform

# pdist returns the condensed distances; squareform expands them to an n x n matrix
D = squareform(pdist(a))
In 3D polar notation, you would need two angles to define the direction from one point to another. It seems like a Cartesian unit vector giving the direction would likely serve your purpose just as well. These can be found as:
# unit vectors pointing from point i to point j, shape (n, n, 3)
(a - a[:, np.newaxis, :]) / D[..., np.newaxis]
This will include NaNs in the diagonal elements, as there is no vector from a point to itself. If necessary, these can be changed to zeros using np.nan_to_num().
If you actually do need the angles, you could get them by applying np.arctan to the components of the unit vector.
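For completeness, a sketch of extracting angles from those unit vectors using np.arctan2 (the quadrant-aware form of arctan), repeating the setup above so it runs standalone; the azimuth/elevation convention is my own choice:

import numpy as np
from scipy.spatial.distance import pdist, squareform

a = np.array([[0, 0, 0], [1, 1, 1], [4, 2, -2], [3, -1, 2]])
D = squareform(pdist(a))
with np.errstate(invalid='ignore'):              # diagonal is 0/0 -> NaN
    u = (a - a[:, np.newaxis, :]) / D[..., np.newaxis]

azimuth = np.arctan2(u[..., 1], u[..., 0])       # angle in the x-y plane
elevation = np.arcsin(np.clip(u[..., 2], -1, 1)) # angle above the x-y plane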
