I'm trying to learn the ArrayFire idioms by translating some vectorised NumPy code.
For example, this is valid row-wise addition and multiplication in NumPy:
>>> a = np.array([1,2,3])
>>> a
array([1, 2, 3])
>>> b = np.arange(9).reshape((3, 3))
>>> b
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
>>> a + b
array([[ 1,  3,  5],
       [ 4,  6,  8],
       [ 7,  9, 11]])
>>> a * b
array([[ 0,  2,  6],
       [ 3,  8, 15],
       [ 6, 14, 24]])
Do all arrays in arrayfire need to be the same shape? This generates an error.
>>> a = af.from_ndarray(a)
>>> b = af.from_ndarray(b)
>>> a + b
Invalid dimension for argument 1
Expected: ldims == rdims
>>> a * b
Invalid dimension for argument 1
Expected: ldims == rdims
From @pradeep's answer below, you can do this using tile until broadcasting is added as a feature.
>>> a = np.array([1,2,3])
>>> a = af.tile(af.transpose(af.from_ndarray(a)),3,1)
>>> af.display(a)
[3 3 1 1]
1 2 3
1 2 3
1 2 3
>>> b = np.arange(9).reshape((3, 3))
>>> b = af.from_ndarray(b)
>>> af.display(a + b)
[3 3 1 1]
1 3 5
4 6 8
7 9 11
>>> af.display(a * b)
[3 3 1 1]
0 2 6
3 8 15
6 14 24
As of writing this response, yes that is correct. Arrays need to be of the same shape. But I would like to point out that we are already working on a broadcasting feature for binary operations - here is the PR - we will try to get this feature into a release as soon as we can.
However, even with the current release, this limitation can easily be worked around using the tile function. Since tile is a JIT operation for such broadcast operations, it won't allocate any additional memory; the arithmetic and the tiling will be combined into a single, efficient kernel launch.
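For reference, here is the workaround wrapped in a small helper (a sketch only - broadcast_row is a made-up name, not part of the ArrayFire API; it uses the same af.tile / af.transpose / af.from_ndarray calls as the question above):

import arrayfire as af
import numpy as np

def broadcast_row(vec, nrows):
    # replicate a length-n vector as nrows identical rows of an (nrows x n) array
    return af.tile(af.transpose(vec), nrows, 1)

a = af.from_ndarray(np.array([1, 2, 3]))
b = af.from_ndarray(np.arange(9).reshape(3, 3))
af.display(broadcast_row(a, 3) + b)   # row-wise addition, matching the numpy example
af.display(broadcast_row(a, 3) * b)   # row-wise multiplication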
Related
I am trying to find a solution to the following problem: I have an index in C-order and I need to convert it into F-order.
To explain my problem simply, here is an example.
Let's say we have a matrix x:
x = np.arange(1,5).reshape(2,2)
print(x)
[[1 2]
 [3 4]]
Then the flattened matrix in C order is:
flat_c = x.ravel()
print(flat_c)
[1 2 3 4]
Now, the value 3 is at index 2 of the flat_c vector, i.e. flat_c[2] is 3.
If I would flatten the matrix x using the F-order, I would have:
flat_f = x.ravel(order='f')
print(flat_f)
[1 3 2 4]
Now, the value 3 is at index 1 of the flat_f vector, i.e. flat_f[1] is 3.
I am trying to find a way to get the F-order index knowing the dimension of the matrix and the corresponding index in C-order.
I tried using np.unravel_index but this function returns the matrix positions...
We can use a combination of np.ravel_multi_index and np.unravel_index for an ndarray-supported solution. Hence, given the shape s of the input array a and the C-order index c_idx, it would be -
s = a.shape
f_idx = np.ravel_multi_index(np.unravel_index(c_idx,s)[::-1],s[::-1])
So, the idea is pretty simple. Use np.unravel_index to get the C-order indices in n dimensions, then get the flattened linear index in Fortran order by using np.ravel_multi_index on the flipped shape and the flipped n-dim indices, to simulate Fortran behaviour.
Sample runs on 2D -
In [321]: a
Out[321]:
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])
In [322]: s = a.shape
In [323]: c_idx = 6
In [324]: np.ravel_multi_index(np.unravel_index(c_idx,s)[::-1],s[::-1])
Out[324]: 4
In [325]: c_idx = 12
In [326]: np.ravel_multi_index(np.unravel_index(c_idx,s)[::-1],s[::-1])
Out[326]: 8
Sample run on 3D array -
In [336]: a
Out[336]:
array([[[ 0,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14]],

       [[15, 16, 17, 18, 19],
        [20, 21, 22, 23, 24],
        [25, 26, 27, 28, 29]]])
In [337]: s = a.shape
In [338]: c_idx = 21
In [339]: np.ravel_multi_index(np.unravel_index(c_idx,s)[::-1],s[::-1])
Out[339]: 9
In [340]: a.ravel('F')[9]
Out[340]: 21
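Wrapped as a small helper (the name c_to_f_index is just for illustration, not part of NumPy), the same expression can be reused directly:

import numpy as np

def c_to_f_index(c_idx, shape):
    # C-order flat index -> F-order flat index for an array of the given shape
    return np.ravel_multi_index(np.unravel_index(c_idx, shape)[::-1], shape[::-1])

a = np.arange(30).reshape(2, 3, 5)
print(c_to_f_index(21, a.shape))                 # 9
print(a.ravel('F')[c_to_f_index(21, a.shape)])   # 21, same element as a.ravel()[21]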
Suppose your matrix is of shape (nrow,ncol). Then the 1D index when unraveled in C style for the (irow,icol) entry is given by
idxc = ncol*irow + icol
In the above equation, you know idxc. Then,
icol = idxc % ncol
Now you can find irow
irow = (idxc - icol) / ncol
Now you know both irow and icol. You can use them to get the F index. I think the F index will be given by
idxf = nrow*icol + irow
Please double-check my math, I might have got something wrong...
For the 3D case, if your array has dimensions [n1][n2][n3], then the unraveled C-index for [i1][i2][i3] is
idxc = n2*n3*i1 + n3*i2+i3
Using modulo operations similar to the 2D case, we can recover i1, i2, i3 and then convert to the unraveled F index. Writing r = idxc % (n2*n3) for the remainder n3*i2 + i3:
i3 = r % n3
i2 = (r - i3) / n3
i1 = (idxc - r) / (n2*n3)
F index would be:
idxf = i1 + n1*i2 +n1*n2*i3
Please check my math.
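A quick numerical check of the 2D formulas above against NumPy's own ordering (just a sketch to verify the math, using a small arbitrary matrix):

import numpy as np

nrow, ncol = 3, 5
a = np.arange(nrow * ncol).reshape(nrow, ncol)
for idxc in range(a.size):
    icol = idxc % ncol
    irow = (idxc - icol) // ncol
    idxf = nrow * icol + irow
    # the same element must be found at idxc in C order and idxf in F order
    assert a.ravel()[idxc] == a.ravel(order='F')[idxf]
print("2D formulas check out")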
In simple cases you may also get away with transposing and ravelling the array:
import numpy as np
x = np.arange(2 * 2).reshape(2, 2)
print(x)
# [[0 1]
# [2 3]]
print(x.ravel())
# [0 1 2 3]
print(x.transpose().ravel())
# [0 2 1 3]
x = np.arange(2 * 3 * 4).reshape(2, 3, 4)
print(x)
# [[[ 0  1  2  3]
#   [ 4  5  6  7]
#   [ 8  9 10 11]]
#
#  [[12 13 14 15]
#   [16 17 18 19]
#   [20 21 22 23]]]
print(x.ravel())
# [ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23]
print(x.transpose().ravel())
# [ 0 12 4 16 8 20 1 13 5 17 9 21 2 14 6 18 10 22 3 15 7 19 11 23]
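Why this works (a quick check added here, not part of the original answer): transpose() with no arguments reverses all the axes, so ravelling the transposed array in C order visits elements in the original array's F order. Continuing with the 3D x from above:

print(np.array_equal(x.transpose().ravel(), x.ravel(order='F')))
# True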
The problem is very similar to that in How to evaluate the sum of values within array blocks, where I need to sum up elements in a matrix by blocks. What is different here is that the blocks could have different sizes. For example, given a 4-by-5 matrix
1 1 | 1 1 1
----|------
1 1 | 1 1 1
1 1 | 1 1 1
1 1 | 1 1 1
and block sizes 1 and 3 along the rows and 2 and 3 along the columns, the result should be a 2-by-2 matrix:
2 3
6 9
Is there a way of doing this without loops?
This seems like a good fit for np.add.reduceat, to basically sum along rows and then along columns -
def sum_blocks(a, row_sizes, col_sizes):
    # Sum rows based on row-sizes
    s1 = np.add.reduceat(a, np.r_[0, row_sizes[:-1].cumsum()], axis=0)
    # Sum cols from row-summed output based on col-sizes
    return np.add.reduceat(s1, np.r_[0, col_sizes[:-1].cumsum()], axis=1)
Sample run -
In [45]: np.random.seed(0)
...: a = np.random.randint(0,9,(4,5))
In [46]: a
Out[46]:
array([[5, 0, 3, 3, 7],
       [3, 5, 2, 4, 7],
       [6, 8, 8, 1, 6],
       [7, 7, 8, 1, 5]])
In [47]: row_sizes = np.array([1,3])
...: col_sizes = np.array([2,3])
In [48]: sum_blocks(a, row_sizes, col_sizes)
Out[48]:
array([[ 5, 13],
       [36, 42]])
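For reference (not part of the original answer): np.add.reduceat sums between consecutive start indices, so the np.r_[0, sizes[:-1].cumsum()] expression simply builds the starting offset of each block. With the sample sizes above:

print(np.r_[0, row_sizes[:-1].cumsum()])   # [0 1] -> row blocks start at rows 0 and 1
print(np.r_[0, col_sizes[:-1].cumsum()])   # [0 2] -> col blocks start at cols 0 and 2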
I'm trying to take a set of data that consists of N rows, and expand each row to include the squares/cubes/etc. of each column in that row (which power to go up to is determined by a variable j). The data starts out as a pandas DataFrame but can be turned into a numpy array.
For example:
If the row is [3,2] and j is 3, the row should be transformed to [3, 2, 9, 4, 27, 8]
I currently have a semi working version that consists of a bunch of nested for loops and is pretty ugly. I’m hoping for a cleaner way to make this transformation so things will be a bit easier for me to debug.
The behavior I'm looking for is basically the same as sklearn's PolynomialFeatures, but I'm trying to do it in numpy and/or pandas only.
Thanks!
Use NumPy broadcasting for a vectorized solution -
In [66]: a = np.array([3,2])
In [67]: j = 3
In [68]: a**np.arange(1,j+1)[:,None]
Out[68]:
array([[ 3,  2],
       [ 9,  4],
       [27,  8]])
And there's a NumPy builtin: np.vander -
In [142]: np.vander(a,j+1).T[::-1][1:]
Out[142]:
array([[ 3,  2],
       [ 9,  4],
       [27,  8]])
Or with the increasing flag set to True -
In [180]: np.vander(a,j+1,increasing=True).T[1:]
Out[180]:
array([[ 3,  2],
       [ 9,  4],
       [27,  8]])
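To expand a full N-row dataset into the per-row layout the question asks for (all first powers, then all squares, then all cubes), one possible broadcasting-based sketch (the array X here is made up for illustration):

import numpy as np

X = np.array([[3, 2],
              [1, 4]])       # N rows of features
j = 3

powers = X[None] ** np.arange(1, j + 1)[:, None, None]   # shape (j, N, M)
expanded = powers.transpose(1, 0, 2).reshape(X.shape[0], -1)
print(expanded)
# [[ 3  2  9  4 27  8]
#  [ 1  4  1 16  1 64]]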
Try concat with the ignore_index option to avoid duplicate column names:
df = pd.DataFrame(np.arange(9).reshape(3,3))
j = 3
pd.concat([df**i for i in range(1,j+1)], axis=1,ignore_index=True)
Output:
   0  1  2   3   4   5    6    7    8
0  0  1  2   0   1   4    0    1    8
1  3  4  5   9  16  25   27   64  125
2  6  7  8  36  49  64  216  343  512
I have 4 images, each have width and height of 8. They belong inside a vector with shape [4,8,8]. I reshape the vector of images to become a matrix of images with shape [2,2,8,8].
How can I concatenate the images from inside the matrix to produce a single image so that the shape becomes [16,16]? I want the images to be concatenated so that their x,y position from the matrix are maintained - essentially just stitching separate images together into a single image.
I have a feeling this could easily be done in numpy, maybe even tensorflow, but I'm open to any nice solution in python.
You can use numpy.concatenate with different axes. Here is a reduced example using 4 images of shape [2 2], which produces a [4 4] resulting image:
import numpy as np
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])
c = np.array([[9, 10], [11, 12]])
d = np.array([[13, 14], [15, 16]])
ab = np.concatenate((a, b), axis=1)
cd = np.concatenate((c, d), axis=1)
abcd = np.concatenate((ab, cd), axis=0)
>>> print(abcd)
[[ 1  2  5  6]
 [ 3  4  7  8]
 [ 9 10 13 14]
 [11 12 15 16]]
>>> print(abcd.shape)
(4, 4)
Just adapt this code to your case: instead of using a, b, c, d, concatenate the images taken along the first dimension of your tensor, with something similar to np.concatenate((t[0], t[1]), axis=1), where t is your tensor of shape [4 8 8].
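A sketch of that adaptation (the tensor t below is dummy data standing in for your 4 images):

import numpy as np

t = np.arange(4 * 8 * 8).reshape(4, 8, 8)        # 4 images of shape (8, 8)
top = np.concatenate((t[0], t[1]), axis=1)       # first grid row, shape (8, 16)
bottom = np.concatenate((t[2], t[3]), axis=1)    # second grid row, shape (8, 16)
stitched = np.concatenate((top, bottom), axis=0)
print(stitched.shape)                            # (16, 16)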
Otherwise, as other answers suggest, you can apply the numpy.hstack function twice, but I think its behaviour is not as easily readable, even if it is less code.
You can use np.hstack twice like so (slightly smaller arrays to make them printable):
import numpy as np
original = np.array([[np.arange(16).reshape(4,-1)]*2]*2)
combined = np.hstack(np.hstack(original))
print(combined)
which gives:
[[ 0  1  2  3  0  1  2  3]
 [ 4  5  6  7  4  5  6  7]
 [ 8  9 10 11  8  9 10 11]
 [12 13 14 15 12 13 14 15]
 [ 0  1  2  3  0  1  2  3]
 [ 4  5  6  7  4  5  6  7]
 [ 8  9 10 11  8  9 10 11]
 [12 13 14 15 12 13 14 15]]
I wouldn't recommend doing this, because it's hardly readable, but just for the sake of a numpy exercise you can make a grid of images with just the reshape() and transpose() methods.
import numpy as np
w_img = 3
h_img = 2
n_img = 4
images = (np.array([a + b for a in 'abcd' for b in '123456'])
          .reshape(n_img, h_img, w_img))
# A batch of 4 images 3x2
print(images)
#[[['a1' 'a2' 'a3']
# ['a4' 'a5' 'a6']]
#
# [['b1' 'b2' 'b3']
# ['b4' 'b5' 'b6']]
#
# [['c1' 'c2' 'c3']
# ['c4' 'c5' 'c6']]
#
# [['d1' 'd2' 'd3']
# ['d4' 'd5' 'd6']]]
# Making 2x2 grid
w_grid = 2
h_grid = 2
grid = images.reshape(h_grid, w_grid, h_img, w_img) # axes: h_grid, w_grid, h_img, w_img
grid = grid.transpose([0, 2, 1, 3]) # axes: h_grid, h_img, w_grid, w_img
grid = grid.reshape(4, 6) # axes: (h_grid * h_img), (w_grid * w_img)
print(grid)
#[['a1' 'a2' 'a3' 'b1' 'b2' 'b3']
# ['a4' 'a5' 'a6' 'b4' 'b5' 'b6']
# ['c1' 'c2' 'c3' 'd1' 'd2' 'd3']
# ['c4' 'c5' 'c6' 'd4' 'd5' 'd6']]
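The same idea applied to the question's shape, a (2, 2, 8, 8) grid of 8x8 images stitched into a single (16, 16) image (the array here is just dummy data):

grid4 = np.arange(2 * 2 * 8 * 8).reshape(2, 2, 8, 8)    # axes: h_grid, w_grid, h_img, w_img
stitched = grid4.transpose(0, 2, 1, 3).reshape(16, 16)  # axes: (h_grid * h_img), (w_grid * w_img)
print(stitched.shape)                                   # (16, 16)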
I want to replace elements in a numpy array using a list of old values and new values. See below for a code example (replace_old is the requested method). The method must work for both int, float and string elements. How do I do that?
import numpy as np
dat = np.hstack((np.arange(1,9), np.arange(1,4)))
print(dat) # [1 2 3 4 5 6 7 8 1 2 3]
old_val = [2, 5]
new_val = [11, 57]
new_dat = replace_old(dat, old_val, new_val)
print(new_dat) # [1 11 3 4 57 6 7 8 1 11 3]
You can use np.place:
>>> np.place(dat,np.in1d(dat,old_val),new_val)
>>> dat
array([ 1, 11, 3, 4, 57, 6, 7, 8, 1, 11, 3])
For creating the mask array you can use np.in1d(arr1, arr2), which will give you:
a boolean array the same length as ar1 that is True where an element of ar1 is in ar2 and False otherwise
Edit: Note that the preceding recipe replaces the old values based on their order, and as @ajcr mentioned it won't work for other arrays, so as a more general way for now I suggest the following approach using a loop (which I don't think is the best way):
>>> dat2 = np.array([1, 2, 1, 2])
>>> old_val = [1, 2]
>>> new_val = [33, 66]
>>> z=np.array((old_val,new_val)).T
>>> for i,j in z:
... np.place(dat2,dat2==i,j)
...
>>> dat2
array([33, 66, 33, 66])
In this case you create a new array (z) which contains the relevant pairs from old_val and new_val, and then you can pass them to np.place to replace them.
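If a fully vectorised version is wanted, one common pattern (a sketch, not from the answers above; replace_old is the name the question asked for) is to look up each element's position in the sorted old values with np.searchsorted:

import numpy as np

def replace_old(dat, old_val, new_val):
    old = np.asarray(old_val)
    new = np.asarray(new_val)
    order = old.argsort()                         # sort old values so searchsorted works
    mask = np.in1d(dat, old)                      # which elements need replacing
    idx = np.searchsorted(old[order], dat[mask])  # position of each hit in the sorted old values
    out = dat.copy()
    out[mask] = new[order][idx]
    return out

dat = np.hstack((np.arange(1, 9), np.arange(1, 4)))
print(replace_old(dat, [2, 5], [11, 57]))         # [ 1 11  3  4 57  6  7  8  1 11  3]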