Build a NumPy array where data match their own coordinates - python

I want to write a function that takes a tuple s (of length n) as an argument and returns an array of shape s concatenated with (n,) (thus adding an extra dimension to the array), such that the entry indexed by any tuple of length n (thus ignoring the last dimension) is that tuple itself.
Here is an example of what it should do:
>>> a=f((2,3,4))
>>> a[1,1,1]
array([1, 1, 1])
>>> a.shape
(2, 3, 4, 3)
I managed to do it with the following code (if I am not mistaken), but I am pretty sure it can be achieved in a simpler way:
a = np.transpose(
    np.meshgrid(*(np.arange(0, x) for x in s)),
    axes=(2, 1) + tuple(range(3, n+1)) + (0,))
(I hope I correctly copied my code since I had to simplify variables here.)

You are looking for np.indices to generate those ranged indices. To bring the result to the same format as the code listed in the question, we need to push the first axis back to the end.
So, we would have an implementation for s of generic length, like so -
s = (2,3,4) # Input
out = np.indices(s).transpose(list(range(1, len(s)+1)) + [0])
Alternatively, this pushing back of dimension could be achieved with np.rollaxis as well -
out = np.rollaxis(np.indices(s),0,len(s)+1)
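As a quick sanity check (my own snippet, just exercising the code above against the example in the question):
import numpy as np

s = (2, 3, 4)
out = np.indices(s).transpose(list(range(1, len(s) + 1)) + [0])

print(out.shape)      # (2, 3, 4, 3)
print(out[1, 1, 1])   # [1 1 1]
print(out[0, 2, 3])   # [0 2 3]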

Related

Numpy/Pytorch generate mask based on varying index values

I've been trying to do the following as a batch operation in numpy or torch (no looping). Is this possible?
Suppose I have:
indices: [[3],[2]] (2x1)
output: [[0,0,0,0,1], [0,0,0,1,1]] (2xfixed_num) where fixed_num is 5 here
Essentially, for each row I want the entries up to and including that index value to be 0 and the rest to be 1.
OK, I actually assume this is some sort of homework assignment, but maybe it's not; either way it was fun to do. Here's a solution for your specific example; maybe you can generalize it to arrays of any shape:
def fill_ones(arr, idxs):
    x = np.where(np.arange(arr.shape[1]) <= idxs[0], 0, 1)  # This is the important logic.
    y = np.where(np.arange(arr.shape[1]) <= idxs[1], 0, 1)
    return np.array([x, y])
So where the comment is located, we use a condition to assign 0 to all indices up to and including some index value, and 1 after it. This actually creates a new array, as opposed to a mask we could apply to the original array, so maybe it's "dirtier".
Also, I suspect it's possible to generalize this to arrays with more than 2 dimensions, but the solution I'm imagining now uses a for-loop. Hope this helps!
Note: arr is just a NumPy array of whatever shape you want the output to be, and idxs is a tuple of the indices past which the array elements should turn into 1s; hope that is clear.
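As a possible generalization (my own sketch, not part of the answer above): broadcasting a row of positions against the column vector of indices builds the whole mask in one vectorized comparison, with no per-row calls:
import numpy as np

indices = np.array([[3], [2]])   # shape (2, 1), as in the question
fixed_num = 5

# compare positions 0..fixed_num-1 against each row's threshold;
# broadcasting (2, 1) against (fixed_num,) yields a (2, fixed_num) mask
mask = (np.arange(fixed_num) > indices).astype(int)
print(mask)
# [[0 0 0 0 1]
#  [0 0 0 1 1]]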

expand numpy array in n dimensions

I am trying to 'expand' an array (generate a new array with proportionally more elements in all dimensions). I have an array with known numbers (let's call it X) and I want to make it j times bigger (in each dimension).
So far I generated a new array of zeros with more elements, then I used broadcasting to insert the original numbers in the new array (at fixed intervals).
Finally, I used linspace to fill the gaps, but this part is actually not directly relevant to the question.
The code I used (for n=3) is:
import numpy as np
new_shape = (np.array(X.shape) - 1) * ratio + 1
new_array = np.zeros(shape=new_shape)
new_array[::ratio,::ratio,::ratio] = X
My problem is that this is not general; I would have to modify the third line based on ndim. Is there a way to use such broadcasting for any number of dimensions in my array?
Edit: to be more precise, the third line would have to be:
new_array[::ratio,::ratio] = X
if ndim=2
or
new_array[::ratio,::ratio,::ratio,::ratio] = X
if ndim=4
and so on. I want to avoid having to write separate code for each value of ndim.
P.S. If there is a better tool for the entire process (such as some 'inner-padding' routine I am not aware of), I will be happy to learn about it.
Thank you
array = array[..., np.newaxis] will add another dimension
You can use slice notation -
slicer = tuple(slice(None,None,ratio) for i in range(X.ndim))
new_array[slicer] = X
Build the slicing tuple manually. ::ratio is equivalent to slice(None, None, ratio):
new_array[(slice(None, None, ratio),)*new_array.ndim] = ...
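Putting the pieces together, here is a dimension-agnostic sketch of the approach from the question (variable names follow the question; the linspace gap-filling step is omitted, as it is not part of the question):
import numpy as np

def expand(X, ratio):
    # new shape with ratio times the spacing in every dimension
    new_shape = (np.array(X.shape) - 1) * ratio + 1
    new_array = np.zeros(shape=new_shape)
    # one slice(None, None, ratio) per dimension, so this works for any ndim
    slicer = tuple(slice(None, None, ratio) for _ in range(X.ndim))
    new_array[slicer] = X
    return new_array

X = np.arange(8).reshape(2, 2, 2)
print(expand(X, 3).shape)   # (4, 4, 4)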

How to get arrays that output results in brackets like [1][2][3] to [1 2 3]

The title kind of says it all. I have this (excerpt):
import numpy as np
import matplotlib.pyplot as plt
number_of_particles = 1000
phi = np.arccos(1 - 2*np.random.uniform(0.0, 1., (number_of_particles, 1)))
vc = 2*np.pi
mux = -vc*np.sin(phi)
and I get out
[[-4.91272413]
[-5.30620302]
[-5.22400513]
[-5.5243784 ]
[-5.65050497]...]
which is correct, but I want it to be in the format
[-4.91272413 -5.30620302 -5.22400513 -5.5243784 -5.65050497....]
Feel like there should be a simple solution, but I couldn't find it.
Suppose your array is represented by the variable arr.
You can do:
l = ''
for i in arr:
    l = l + str(i[0]) + ' '
arr = [l]
Use this command:
new_mux = [i[0] for i in mux]
But I need it in an array, so then I add this
new_mux = np.array(new_mux)
and I get the desired output.
There's a transpose method on NumPy's array object:
mux.transpose()[0]
(I just noticed that this is a very old question, but since I have typed up this answer, and I believe it is simpler and more efficient than the existing ones, I'll post it...)
Notice that when you do
np.random.uniform(0.0,1.,(number_of_particles, 1))
you are creating a two-dimensional array with number_of_particles rows and one column. If you want a one-dimensional array throughout, you could do
np.random.uniform(0.0,1.,(number_of_particles,))
instead.
If you want to keep the 2d version but flatten mux afterwards, you can... well, reshape it:
mux_1d = mux.reshape(-1)
-1 here means "reshape it to one axis (because there's just one number) and figure out automatically how many elements there should be along that axis (because the number is -1)."
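A quick illustration of the shapes involved (my own snippet, using a small number_of_particles):
import numpy as np

number_of_particles = 5
mux_2d = np.random.uniform(0.0, 1., (number_of_particles, 1))
print(mux_2d.shape)              # (5, 1)

mux_1d = np.random.uniform(0.0, 1., (number_of_particles,))
print(mux_1d.shape)              # (5,)

print(mux_2d.reshape(-1).shape)  # (5,)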

Filtered Numpy Array Changes Number of Dimensions

I'm having trouble getting used to Numpy arrays (I'm a Matlab user). When I try to select just a range of values from an array, I see the resulting array has an extra dimension:
ioi = np.nonzero((self.data_array[0,:] >= range_start) & (self.data_array[0,:] <= range_end))
print("self.data_array.shape = {0}".format(self.data_array.shape))
print("self.data_array.shape[:,ioi] = {0}".format(self.data_array[:,ioi].shape))
The result is:
self.data_array.shape = (5, 50000)
self.data_array.shape[:,ioi] = (5, 1, 408)
I also see that ioi is a tuple. I don't know if that has anything to do with it.
What is happening here to create that extra dimension and what should I do, in the most direct way, to get an array shape of (5,408) in this case?
The simplest and most efficient thing would be to get rid of the np.nonzero call and use logical indexing, just as one would in Matlab. Here's an example (using random data of a similar shape, FYI).
>>> data = np.random.randn(5, 5000)
>>> start, end = -0.5, 0.5
>>> ioi = (data[0] > start) & (data[0] < end)
>>> print(ioi.shape)
(5000,)
>>> print(ioi.sum())
1900
>>> print(data[:, ioi].shape)
(5, 1900)
The np.nonzero call is not usually needed. Just like Matlab's find function, it's slow compared with logical indexing, and usually one's goal can be more efficiently accomplished with logical indexing. np.nonzero, just like find, should mostly be used only when you need the actual index values themselves.
As you suspected, the reason for the extra dimensions is that tuples are handled differently from other types of indexing arrays in NumPy. This is to allow more flexible indexing, such as with slices, ellipses, etc. See this useful page for in-depth explanation, especially the last section.
There are at least two other options to solve the problem. One is to pull the index array out of the tuple that np.nonzero returns and use that, as in self.data_array[:, ioi[0]], which gives the desired (5, 408). Part of why you get the extra dimension is that your call mixes two kinds of indices: the slice (:) and the tuple ioi, and the tuple is interpreted as a nested sequence of indices, adding a length-1 axis. np.nonzero is guaranteed to return a tuple exactly so that its output can directly index the array it was called on.
The last option is to call np.squeeze on the returned array, but I'd opt for one of the above first.
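For completeness, a minimal sketch of those two alternatives, using the same kind of random data as above rather than the original self.data_array:
import numpy as np

data = np.random.randn(5, 5000)
ioi = np.nonzero((data[0] > -0.5) & (data[0] < 0.5))

# pull the index array out of the tuple before mixing it with a slice
print(data[:, ioi[0]].shape)                   # (5, n_matches)

# or drop the spurious length-1 axis afterwards with np.squeeze
print(np.squeeze(data[:, ioi], axis=1).shape)  # (5, n_matches)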

How to return an array of at least 4D: efficient method to simulate numpy.atleast_4d

numpy provides three handy routines to turn an array into at least a 1D, 2D, or 3D array, e.g. through numpy.atleast_3d
I need the equivalent for one more dimension: atleast_4d. I can think of various ways using nested if statements, but I was wondering whether there is a more efficient and faster method of returning the array in question. In your answer, I would be interested to see an estimate (in O(n) terms) of the speed of execution, if you can.
The np.array method has an optional ndmin keyword argument that:
Specifies the minimum number of dimensions that the resulting array should have. Ones will be pre-pended to the shape as needed to meet this requirement.
If you also set copy=False you should get close to what you are after.
As a do-it-yourself alternative, if you want extra dimensions trailing rather than leading:
arr.shape += (1,) * (4 - arr.ndim)
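A brief sketch of both variants (my own example, not taken from the answer):
import numpy as np

a = np.ones((3, 4))

# leading dimensions via ndmin; copy=False avoids a copy
# (prepending length-1 axes does not require one)
b = np.array(a, copy=False, ndmin=4)
print(b.shape)   # (1, 1, 3, 4)

# trailing dimensions by padding the shape in place
c = a.copy()
c.shape += (1,) * (4 - c.ndim)
print(c.shape)   # (3, 4, 1, 1)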
Why couldn't it just be something as simple as this:
import numpy as np
def atleast_4d(x):
    if x.ndim < 4:
        y = np.expand_dims(np.atleast_3d(x), axis=3)
    else:
        y = x
    return y
i.e. if the number of dimensions is less than four, call atleast_3d and append an extra dimension at the end; otherwise just return the array unchanged.
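A quick check of what this version produces for lower-dimensional inputs (my own example; note where atleast_3d and expand_dims place the new axes):
import numpy as np

# atleast_4d as defined in the answer above
def atleast_4d(x):
    if x.ndim < 4:
        y = np.expand_dims(np.atleast_3d(x), axis=3)
    else:
        y = x
    return y

print(atleast_4d(np.ones(5)).shape)             # (1, 5, 1, 1)
print(atleast_4d(np.ones((2, 3))).shape)        # (2, 3, 1, 1)
print(atleast_4d(np.ones((2, 3, 4, 5))).shape)  # (2, 3, 4, 5)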
