convert 1d array to 2d array with nan values - python

Suppose I have a Series like this:
S1 = Series([[1 , 2 , 3] , [4 , 5 , 6] , np.nan , [0] , [8 ,9 ]])
0 [1, 2, 3]
1 [4, 5, 6]
2 NaN
3 [0]
4 [8, 9]
Then I create a NumPy array from this Series:
arr1d = S1.values # [[1, 2, 3] [4, 5, 6] nan [0] [8, 9]]
print(arr1d.shape) #(5L,)
print(arr1d.ndim) # 1
Is it possible to create a 2D array from arr1d that looks like the following?
arr2d = np.array([[1 , 2 , 3 ] , [4 , 5 , 6] ,
[np.nan , np.nan , np.nan] , [0 , np.nan , np.nan] , [8 , 9 , np.nan]])
This is what the 2D array looks like:
[[ 1. 2. 3.]
[ 4. 5. 6.]
[ nan nan nan]
[ 0. nan nan]
[ 8. 9. nan]]
print(arr2d.ndim) # 2
print(arr2d.shape) # (5L, 3L)
The solution should work dynamically with any number of elements in arr1d; this is just an example of what the data may look like.

I'm not claiming this is efficient, but it should work:
from itertools import zip_longest
arr2d = np.array(list(zip_longest(*np.atleast_1d(*S1), fillvalue=np.nan))).T
print(arr2d)
print(arr2d.shape)
Output:
[[ 1. 2. 3.]
[ 4. 5. 6.]
[ nan nan nan]
[ 0. nan nan]
[ 8. 9. nan]]
(5, 3)
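As an alternative to the zip_longest approach above, here is a minimal sketch that pre-allocates a NaN-filled array and copies each row into it. The helper name pad_ragged is hypothetical, and the sketch assumes that a missing entry is a bare float NaN (as in S1) and that every other entry is a list of numbers. A plain Python list stands in for S1.values here so the example only needs NumPy.

```python
import numpy as np

def pad_ragged(rows, fill=np.nan):
    """Pad a sequence of lists (or bare NaN scalars) into a 2-D float array.

    Hypothetical helper: assumes a missing row is a bare float such as np.nan.
    """
    # Treat a bare float (e.g. NaN) as an empty row; copy real rows into lists.
    lists = [[] if isinstance(r, float) else list(r) for r in rows]
    # The output width is the length of the longest row.
    width = max((len(r) for r in lists), default=0)
    # Start from an array full of the fill value, then overwrite the known part.
    out = np.full((len(lists), width), fill, dtype=float)
    for i, r in enumerate(lists):
        out[i, :len(r)] = r
    return out

# Same data as S1 in the question, as a plain list.
rows = [[1, 2, 3], [4, 5, 6], np.nan, [0], [8, 9]]
arr2d = pad_ragged(rows)
print(arr2d)
print(arr2d.shape)
```

With a Series, pad_ragged(S1.values) should behave the same way under the assumptions stated above.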

Related

Python - Divide each row by a vector

I have a 10x10 matrix and I want to divide each row of the matrix with the elements of a vector.
For example:
Suppose I have a 3x3 matrix
1 1 1
2 2 2
3 3 3
and a vector [1, 2, 3]
Then this is the operation I wish to do:
1/1 1/2 1/3
2/1 2/2 2/3
3/1 3/2 3/3
i.e., divide the elements of each row by the elements of a vector (a Python list).
I can do this using for loops, but is there a better way to do this operation in Python?
You should look into broadcasting in numpy. For your example this is the solution:
a = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3]])
b = np.array([1, 2, 3]).reshape(1, 3)
c = a / b
print(c)
>>> [[1. 0.5 0.33333333]
[2. 1. 0.66666667]
[3. 1.5 1. ]]
The first source array should be created as a Numpy array:
a = np.array([
[ 1, 1, 1 ],
[ 2, 2, 2 ],
[ 3, 3, 3 ]])
You don't need to reshape the divisor array (it can be a 1-D array,
as in your source data sample):
v = np.array([1, 2, 3])
Just divide them:
result = a / v
and the result is:
array([[1. , 0.5 , 0.33333333],
[2. , 1. , 0.66666667],
[3. , 1.5 , 1. ]])
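Both answers rely on NumPy broadcasting. As a small sketch of how the shape of the divisor decides which axis it applies to: a 1-D vector of length 3 is matched against the last axis (so column j is divided by v[j], which is what the question asks for), while v[:, None] has shape (3, 1) and divides row i by v[i] instead. The variable names by_columns and by_rows are only illustrative.

```python
import numpy as np

a = np.array([[1, 1, 1],
              [2, 2, 2],
              [3, 3, 3]], dtype=float)
v = np.array([1, 2, 3], dtype=float)

# Shape (3,) lines up with the last axis: element [i, j] is divided by v[j].
by_columns = a / v

# Shape (3, 1) lines up with the first axis: element [i, j] is divided by v[i].
by_rows = a / v[:, None]

print(by_columns)
print(by_rows)
```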

Matrix element repetition bug

I'm trying to create a matrix that reads:
[0,1,2]
[3,4,5]
[6,7,8]
However, my elements keep repeating. How do I fix this?
import numpy as np
n = 3
X = np.empty(shape=[0, n])
for i in range(3):
    for j in range(1,4):
        for k in range(1,7):
            X = np.append(X, [[(3*i) , ((3*j)-2), ((3*k)-1)]], axis=0)
print(X)
Results:
[[ 0. 1. 2.]
[ 0. 1. 5.]
[ 0. 1. 8.]
[ 0. 1. 11.]
[ 0. 1. 14.]
[ 0. 1. 17.]
[ 0. 4. 2.]
[ 0. 4. 5.]
I'm not really sure how you expected your code to work. You append a row to X on every iteration of the innermost loop, so 3 * 3 * 6 times, and you end up with a 54 x 3 matrix.
I think maybe you meant to do:
for i in range(3):
    X = np.append(X, [[3*i , 3*i+1, 3*i+2]], axis=0)
Just so you know, appending to an array in a loop is usually discouraged, because np.append copies the whole array on every call (just create a list of lists, then convert it to a NumPy array).
You could also do:
>>> np.arange(9).reshape((3,3))
array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
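The "list of lists" suggestion above can be sketched as follows. This is one possible way to write it, not the answerer's exact code: the rows are built in plain Python and converted to an array once at the end.

```python
import numpy as np

n = 3
# Row i holds n consecutive integers starting at n * i.
rows = [[n * i + j for j in range(n)] for i in range(n)]
# Convert once, instead of calling np.append inside the loop.
X = np.array(rows)
print(X)
```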

Python3: Remove array elements with same coordinate (x,y)

I have this array (x,y,f(x,y)):
a=np.array([[ 1, 5, 3],
[ 4, 5, 6],
[ 4, 5, 6.1],
[ 1, 3, 42]])
I want to remove the duplicates with same x,y. In my array I have (4,5,6) and (4,5,6.1) and I want to remove one of them (no criterion).
If I had 2 columns (x,y) I could use
np.unique(a[:,:2], axis = 0)
But my array has 3 columns and I don't see how to do this in a simple way.
I can do a loop but my arrays can be very large.
Is there a way to do this more efficiently?
If I understand correctly, you need this:
a[np.unique(a[:,:2],axis=0,return_index=True)[1]]
output:
[[ 1. 3. 42.]
[ 1. 5. 3.]
[ 4. 5. 6.]]
Please be mindful that it does not keep the original order of rows in a. If you want to keep the order, simply sort the indices:
a[np.sort(np.unique(a[:,:2],axis=0,return_index=True)[1])]
output:
[[ 1. 5. 3.]
[ 4. 5. 6.]
[ 1. 3. 42.]]
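The two one-liners above can be wrapped in a small function. This is a sketch with a hypothetical name, drop_duplicate_xy, and a hypothetical keep_order flag. It relies on return_index giving the index of the first occurrence of each unique (x, y) pair, so the first row of each duplicate group is the one kept.

```python
import numpy as np

def drop_duplicate_xy(a, keep_order=True):
    """Keep one row per (x, y) pair, where x and y are the first two columns."""
    # first[k] is the index in `a` of the first row carrying the k-th unique pair.
    _, first = np.unique(a[:, :2], axis=0, return_index=True)
    if keep_order:
        # Sorting the indices restores the original row order.
        first = np.sort(first)
    return a[first]

a = np.array([[1, 5, 3],
              [4, 5, 6],
              [4, 5, 6.1],
              [1, 3, 42]])
deduped = drop_duplicate_xy(a)
print(deduped)
```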
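I think you want to do this?
np.rint rounds every element to the nearest integer, so rows that differ only slightly in the third column become identical and np.unique removes them. Note that the f(x,y) values in the result are rounded as well.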
import numpy as np
a = np.array([
[ 1, 5, 3],
[ 4, 5, 6],
[ 4, 5, 6.1],
[ 1, 3, 42]
])
a = np.unique(np.rint(a), axis = 0)
print(a)
Result:
[[ 1. 3. 42.]
[ 1. 5. 3.]
[ 4. 5. 6.]]

How to divide an n-dimensional array by the first value from a dimension

Given an array of dimension N, how do I divide all values in the array by the first value along a selected dimension?
Example code:
import numpy as np
A = np.random.randint(1, 10, size=(3,3,3))
B = A[:,:,0]
C = np.divide(A,B)
print(repr(A))
print()
print(repr(B))
print()
print(repr(C))
print()
print(repr(C[:, :, 0]))
Output:
array([[[1, 8, 5],
[3, 6, 5],
[5, 4, 2]],
[[6, 2, 9],
[4, 2, 2],
[5, 6, 8]],
[[3, 3, 1],
[2, 7, 7],
[6, 4, 6]]])
array([[1, 3, 5],
[6, 4, 5],
[3, 2, 6]])
array([[[1. , 2.66666667, 1. ],
[0.5 , 1.5 , 1. ],
[1.66666667, 2. , 0.33333333]],
[[6. , 0.66666667, 1.8 ],
[0.66666667, 0.5 , 0.4 ],
[1.66666667, 3. , 1.33333333]],
[[3. , 1. , 0.2 ],
[0.33333333, 1.75 , 1.4 ],
[2. , 2. , 1. ]]])
array([[1. , 0.5 , 1.66666667],
[6. , 0.66666667, 1.66666667],
[3. , 0.33333333, 2. ]])
I was expecting the final output from C[:,:,0] to be all 1's. I guess it has to do with the broadcasting of B but I don't think I understand why it isn't broadcasting B into a shape (3,3,3) where it is replicated along dimension 2.
To get your expected results you could reshape your B array to:
B = A[:,:,0].reshape(3,-1, 1)
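Then when you divide, the first value along the last axis is 1 everywhere (the values below come from a different random A than the one in the question):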
array([[[1. , 0.11111111, 0.11111111],
[1. , 0.25 , 0.5 ],
[1. , 0.88888889, 0.44444444]],
[[1. , 0.88888889, 1. ],
[1. , 1.8 , 1.6 ],
[1. , 4.5 , 0.5 ]],
[[1. , 0.66666667, 0.5 ],
[1. , 1.125 , 0.75 ],
[1. , 0.5 , 2.25 ]]])
You could also maintain the proper dimension for broadcasting by taking B as:
B = A[:,:,0:1]
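To see why the original division did not give all ones, it may help to compare shapes. Broadcasting aligns shapes starting from the last axis, so a (3, 3) array is matched against the last two axes of a (3, 3, 3) array rather than the first two. The sketch below uses a fixed array instead of random data, and the names B_flat, B_keep, and B_none are only illustrative.

```python
import numpy as np

A = np.arange(1, 28).reshape(3, 3, 3)

B_flat = A[:, :, 0]            # shape (3, 3): integer indexing drops the last axis
B_keep = A[:, :, 0:1]          # shape (3, 3, 1): slicing keeps the last axis
B_none = B_flat[:, :, None]    # shape (3, 3, 1): same result via an explicit new axis

# (3, 3) is treated as (1, 3, 3), so wrong[i, j, k] = A[i, j, k] / B_flat[j, k].
wrong = A / B_flat
# (3, 3, 1) stretches along the last axis, so right[i, j, k] = A[i, j, k] / A[i, j, 0].
right = A / B_keep

print(B_flat.shape, B_keep.shape, B_none.shape)
print(right[:, :, 0])
```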
You need to reshape B such that it aligns with A[:,:,0]:
>>> A
array([[[1, 8, 5],
[3, 6, 5],
[5, 4, 2]],
[[6, 2, 9],
[4, 2, 2],
[5, 6, 8]],
[[3, 3, 1],
[2, 7, 7],
[6, 4, 6]]])
>>> B = A[:, :, 0]
>>> B
array([[1, 3, 5],
[6, 4, 5],
[3, 2, 6]])
# And you need to reorient B as:
>>> B.T[None,:].T
array([[[1],
[3],
[5]],
[[6],
[4],
[5]],
[[3],
[2],
[6]]])
>>> A / B.T[None,:].T
array([[[1. , 8. , 5. ],
[1. , 2. , 1.66666667],
[1. , 0.8 , 0.4 ]],
[[1. , 0.33333333, 1.5 ],
[1. , 0.5 , 0.5 ],
[1. , 1.2 , 1.6 ]],
[[1. , 1. , 0.33333333],
[1. , 3.5 , 3.5 ],
[1. , 0.66666667, 1. ]]])
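For the general N-dimensional case in the question title, one possible sketch uses np.take with a list index, which selects the first entry along the chosen axis while keeping that axis with length 1, so the result broadcasts against the original array. The function name divide_by_first is hypothetical.

```python
import numpy as np

def divide_by_first(a, axis):
    """Divide `a` by its first slice along `axis`, for any number of dimensions."""
    # Indexing with the list [0] (not the scalar 0) keeps `axis` with length 1.
    first = np.take(a, [0], axis=axis)
    return a / first

# Same A as in the question's output.
A = np.array([[[1, 8, 5], [3, 6, 5], [5, 4, 2]],
              [[6, 2, 9], [4, 2, 2], [5, 6, 8]],
              [[3, 3, 1], [2, 7, 7], [6, 4, 6]]])

C_last = divide_by_first(A, axis=2)
C_first = divide_by_first(A, axis=0)
print(C_last[:, :, 0])
print(C_first[0])
```

Note that this divides by zero wherever the first value along the axis is 0; the question's data avoids that by drawing integers from 1 to 9.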

How to broadcast across 3d tensor in theano?

I have a 3D tensor block B and I would like to set some of its "faces" to 0 with probability 0.5. Here axis 1 is the rows, axis 2 is the columns, and axis 3 is the "faces". I have tried
size = (B.shape[1], 1, 1)
noise = self.theano_rng.binomial(size=size, n=1, p=0.5)
return noise * B
But this isn't working: the shapes don't line up and I get an error.
For example, I would like
2 2 2 2 2 2
3 3 3 3 3 3
4 4 4 4 4 4
* [1 0] ->
6 6 6 0 0 0
7 7 7 0 0 0
8 8 8 0 0 0
You can use dimshuffle to add the dimensions necessary to enable broadcasting.
Here's a working example:
import numpy
import theano
import theano.tensor as tt
x = tt.tensor3()
y = tt.bvector()
z = x * y.dimshuffle(0, 'x', 'x')
f = theano.function([x, y], z)
x_value = numpy.array([[[2, 2, 2], [3, 3, 3], [4, 4, 4]],
[[6, 6, 6], [7, 7, 7], [8, 8, 8]]], dtype=theano.config.floatX)
y_value = numpy.array([1, 0], dtype=numpy.int8)
print(f(x_value, y_value))
which prints
[[[ 2. 2. 2.]
[ 3. 3. 3.]
[ 4. 4. 4.]]
[[ 0. 0. 0.]
[ 0. 0. 0.]
[ 0. 0. 0.]]]
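For comparison, the same broadcasting pattern can be sketched in plain NumPy, where y[:, None, None] plays the role of y.dimshuffle(0, 'x', 'x'). This is not Theano code. The random mask at the end is an illustrative assumption about how the 0.5 drop probability from the question could be applied, with one draw per face along the first axis as in the answer's example.

```python
import numpy as np

x = np.array([[[2, 2, 2], [3, 3, 3], [4, 4, 4]],
              [[6, 6, 6], [7, 7, 7], [8, 8, 8]]], dtype=float)
y = np.array([1, 0], dtype=np.int8)

# Shape (2,) becomes (2, 1, 1), so each y[i] scales the whole block x[i].
z = x * y[:, None, None]
print(z)

# Random version: one Bernoulli(0.5) draw per block, already shaped (2, 1, 1).
rng = np.random.default_rng(0)
noise = rng.binomial(n=1, p=0.5, size=(x.shape[0], 1, 1))
dropped = x * noise
```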
