Numpy arrays with arrays - python

Is it possible to fill numpy arrays with arrays?
I want to obtain the following structure without specifying the values by hand:
ves = np.zeros((12,12), dtype=object)
ves[0][0] = np.array([0,0,0])
ves[0][1] = np.array([0,0,0])
ves[0][2] = np.array([0,0,0])
ves[0][3] = np.array([0,0,0])
and so on...
In order to obtain the expected result, I have tried ves = np.zeros((12,12), dtype=array), but it does not work.

import numpy as np
v = np.zeros([12,12,3])
As I understand your question, you want a three-dimensional array in which each of the 12*12 cells holds three zeros. The code above creates exactly that zero-filled ndarray, with shape (12, 12, 3).
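If a true object array is needed, where every cell holds its own small array as in the question's dtype=object attempt, one possible sketch follows. The loop over np.ndindex is an illustrative choice and not part of the answer above; the plain 3-D array is shown next to it for comparison.

```python
import numpy as np

# Object array: each of the 12x12 cells stores its own independent 1-D array.
ves = np.empty((12, 12), dtype=object)
for idx in np.ndindex(ves.shape):
    ves[idx] = np.zeros(3, dtype=int)

# Plain 3-D numeric array: same values, but stored as one contiguous block.
v = np.zeros((12, 12, 3))

print(ves.shape, ves[0, 0])  # shape of the outer array and one cell
print(v.shape)
```

The 3-D array is usually the better fit for numeric work, since vectorized operations apply to it directly; the object array only makes sense when the cells must be separate Python objects.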

Related

Xarray add (as in sum) two rows along same dimension but at different coordinate value

I have used xarray to create two different DataArrays with the same dimensions and coordinates. However I want to add two different coordinates in one of these dimensions. I'm trying to add coordinate 'a' to coordinate 'b' in dimension 'x'. There is an easy workaround if these are the only dimensions of my matrix but more complicated if I have more dimensions and I want to keep the normal xarray behaviour for the other dimensions. Please see the example below that fails on the last line. I know how to manually fix this in numpy but the beauty of xarray is that I shouldn't have to.
Does xarray allow an easy solution for this kind of operation?
import xarray as xr
import numpy as np
# create simple DataArray M and N to show what I would like to do
M = xr.DataArray([1, 2], dims="x",coords={'x':['a','b']})
N = xr.DataArray([3, 4], dims="x",coords={'x':['a','b']})
print(M.sel(x='a')+N.sel(x='b')) # this will NOT give me the value
print(M.sel(x='a').values+N.sel(x='b').values) # this will give me the value
# create more complex DataArrays M and N to show the challenge
m = np.arange(3*2*4)
m = m.reshape(3,2,4)
n = np.arange(4*2*3)
n = n.reshape(4,2,3)
M = xr.DataArray(m, dims=['z1',"x","z2"],coords={'x':['a','b']})
N = xr.DataArray(n, dims=["z2",'x','z1'],coords={'x':['a','b']})
print(M.sel(x='a')+N.sel(x='b')) # this will NOT give me the value
print(M.sel(x='a').values+N.sel(x='b').values) # this will result in an error

Padding multiple arrays to obtain same shape as largest array

I have multiple 2D arrays saved in a list called image_concat. This list will be composed of over a hundred of these arrays, but for now I'm just trying to make my code run for a list with only two of them. These arrays all have different shapes, and I would like to find the largest x-dimension and largest y-dimension out of all the arrays, and then pad all the other ones with enough zeros around the edges so that in the end, they all have the same shape. Note that the largest x-dimension and largest y-dimension might belong to separate arrays, or they might belong to the same one. What I have tried writing so far is not successfully changing the shape of the smaller array for some reason. But I also think that some issues will arise even after changing the shapes, since some arrays might be off by one in the end due to elements in the shape being even or odd.
import astropy
import numpy as np
import math
import matplotlib.pyplot as plt
from astropy.utils.data import download_file
from astropy.io import fits
images = ['http://irsa.ipac.caltech.edu/ibe/data/wise/allsky/4band_p1bm_frm/9a/02729a/148/02729a148-w2-int-1b.fits?center=89.353536,37.643864deg&size=0.6deg', 'http://irsa.ipac.caltech.edu/ibe/data/wise/allsky/4band_p1bm_frm/2a/03652a/123/03652a123-w4-int-1b.fits?center=294.772333,-19.747157deg&size=0.6deg']
image_list = []
for url in images:
    image_list.append(download_file(url, cache=True))
image_concat = [fits.getdata(image) for image in image_list]
# See shapes in the beginning
print(np.shape(image_concat[0]))
print(np.shape(image_concat[1]))
def pad(image_concat):
    # Identify largest x and y dimensions
    xdims, ydims = np.zeros(len(image_concat)), np.zeros(len(image_concat))
    for i in range(len(xdims)):
        x, y = np.shape(image_concat[i])
        xdims[i] = x
        ydims[i] = y
    x_max = int(np.max(xdims))
    y_max = int(np.max(ydims))
    # Pad all arrays except the largest dimensions
    for A in image_concat:
        x_len, y_len = np.shape(A)
        print(math.ceil((y_max-y_len)/2))
        print(math.ceil((x_max-x_len)/2))
        np.pad(A, ((math.ceil((y_max-y_len)/2), math.ceil((y_max-y_len)/2)), (math.ceil((x_max-x_len)/2), math.ceil((x_max-x_len)/2))), 'constant', constant_values=0)
    return image_concat
image_concat = pad(image_concat)
# See shapes afterwards (they haven't changed for some reason)
print(np.shape(image_concat[0]))
print(np.shape(image_concat[1]))
I can't understand why the shape isn't changing for this case. And also, is there a way to easily generalize this so that it will work on many arrays regardless of if they have even or odd dimensions?
np.pad doesn't modify the array in-place, it returns a padded array. So you'd need to do image_concat[i] = np.pad(...), where i is the index of A.
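A minimal sketch of that fix, which also deals with the even/odd concern from the question, could look like the following. The function name pad_to_common_shape and the small demo arrays are made up for illustration. Instead of using math.ceil on both sides, the difference along each axis is split into a "before" and an "after" amount that always sum to the full difference, so odd differences put the extra row or column at the end.

```python
import numpy as np

def pad_to_common_shape(arrays):
    """Zero-pad each 2-D array so all share the largest row and column counts."""
    rows = max(a.shape[0] for a in arrays)
    cols = max(a.shape[1] for a in arrays)
    padded = []
    for a in arrays:
        dr = rows - a.shape[0]  # missing rows (axis 0)
        dc = cols - a.shape[1]  # missing columns (axis 1)
        # before/after amounts always add up to the full difference
        pad_width = ((dr // 2, dr - dr // 2), (dc // 2, dc - dc // 2))
        # np.pad returns a new array, so the result has to be kept
        padded.append(np.pad(a, pad_width, mode="constant", constant_values=0))
    return padded

# Stand-in data instead of the downloaded FITS images
demo = [np.ones((3, 5)), np.ones((4, 2))]
result = pad_to_common_shape(demo)
print([r.shape for r in result])
```

Note that the pad widths for axis 0 are computed from the row difference and those for axis 1 from the column difference; the question's code appears to have these swapped.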

How to generate a number of random vectors starting from a given one

I have an array of values and would like to create a matrix from that, where each row is my starting point vector multiplied by a sample from a (normal) distribution.
The number of rows of this matrix will then depend on the number of samples I want.
%pylab
my_vec = array([1,2,3])
my_rand_vec = my_vec*randn(100)
The last command does not work because the array shapes do not match.
I could use a for loop, but I am trying to take advantage of array operations.
Try this
my_rand_vec = my_vec[None,:]*randn(100)[:,None]
For small numbers I get for example
import numpy as np
my_vec = np.array([1,2,3])
my_rand_vec = my_vec[None,:]*np.random.randn(5)[:,None]
my_rand_vec
# array([[ 0.45422416, 0.90844831, 1.36267247],
# [-0.80639766, -1.61279531, -2.41919297],
# [ 0.34203295, 0.6840659 , 1.02609885],
# [-0.55246431, -1.10492863, -1.65739294],
# [-0.83023829, -1.66047658, -2.49071486]])
Your solution my_vec*randn(100) does not work because * is element-wise multiplication, which requires the two shapes to be identical or broadcastable; shapes (3,) and (100,) are neither.
What you have to do is add an extra dimension using [None,:] and [:,None] so that numpy's broadcasting works.
As a side note, I would recommend not using pylab. Instead, import modules explicitly under a namespace, for example import numpy as np.
It is the outer product of vectors:
my_rand_vec = numpy.outer(randn(100), my_vec)
You can pass the dimensions of the array you require to numpy.random.randn:
my_rand_vec = my_vec*np.random.randn(100,3)
To multiply each vector by the same random number, you need to add an extra axis:
my_rand_vec = my_vec*np.random.randn(100)[:,np.newaxis]
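As a quick check, the broadcasting form and the outer-product form from the answers above should produce the same matrix. The sketch below assumes NumPy's newer Generator interface (np.random.default_rng) with an arbitrary seed, which is a substitution for the randn calls used in the answers.

```python
import numpy as np

rng = np.random.default_rng(0)      # seeded generator, seed chosen arbitrarily
my_vec = np.array([1, 2, 3])
samples = rng.standard_normal(100)  # one random factor per output row

# (1, 3) * (100, 1) broadcasts to (100, 3)
by_broadcast = my_vec[None, :] * samples[:, None]

# Outer product: row i is samples[i] * my_vec
by_outer = np.outer(samples, my_vec)

print(by_broadcast.shape)
print(np.allclose(by_broadcast, by_outer))
```

Each row is the starting vector scaled by a single random number, so the ratio between columns stays 1:2:3 in every row.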

Vectorizing loops in NumPy

I am trying to vectorize a loop iteration using NumPy but am struggling to achieve the desired results. I have an array of pixel values, so 3 dimensions, say (512,512,3) and need to iterate each x,y and calculate another value using a specific index in the third dimension. An example of this code in a standard loop is as follows:
for i in range(width):
    for j in range(height):
        temp = math.sqrt((scalar1-array[j,i,1])**2+(scalar2-array[j,i,2])**2)
What I am currently doing is this:
temp = np.sqrt((scalar1-array[:,:,1])**2+(scalar2-array[:,:,2])**2)
The temp array I get from this is the desired dimensions (x,y) but some of the values differ from the loop implementation. How can I eliminate the loop to compute this example efficiently in NumPy?
Thanks in advance!
Edit:
Here is code that is giving me differing results for temp and temp2, obviously temp2 is just the calculation for one cell
temp = np.sqrt((cb_key-fg_cbcr_array[:,:,1])**2+(cr_key-fg_cbcr_array[:,:,2])**2)
temp2 = np.sqrt((cb_key-fg_cbcr_array[500,500,1])**2+(cr_key-fg_cbcr_array[500,500,2])**2)
print(temp[500, 500])
print(temp2)
The output for the above is
12.039
94.069123521
The scalars are definitely initialized and the array is generated from an image using
fg = PIL.Image.open('fg.jpg')
fg_cbcr = fg.convert("YCbCr")
fg_cbcr_array = np.array(fg_cbcr)
Edit2:
Ok so I have tracked it down to a problem with my array. Not sure why yet but it works when the array is generated with np.random.random but not when loading from a file using PIL as above.
Your vectorized solution is correct.
In your for loop, temp is a scalar that is overwritten on every iteration, so it only holds the last value.
Use np.sqrt instead of math.sqrt for vectorized inputs.
You should not use array as a variable name, since it can shadow numpy's array function.
I checked using the following code, which may give you a hint about where the error is:
import numpy as np
width = 512
height = 512
scalar1 = 1
scalar2 = 2
a = np.random.random((height, width, 3))
tmp = np.zeros((height, width))
for i in range(width):
    for j in range(height):
        tmp[j,i] = np.sqrt((scalar1-a[j,i,1])**2+(scalar2-a[j,i,2])**2)
tmp2 = np.sqrt((scalar1-a[:,:,1])**2+(scalar2-a[:,:,2])**2)
print(np.allclose(tmp, tmp2))
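One plausible explanation for the behaviour described in Edit2, not confirmed anywhere in the thread, is integer wraparound: arrays built from a PIL image normally have dtype uint8, and uint8 arithmetic wraps modulo 256, whereas np.random.random returns floats. The sketch below uses a made-up one-pixel array and made-up integer key values to show the effect and a possible fix (casting to float first).

```python
import numpy as np

# Hypothetical single YCbCr pixel, stored as uint8 like an image-derived array
pixels = np.array([[[16, 200, 100]]], dtype=np.uint8)
cb_key, cr_key = 90, 240  # hypothetical integer key values

# uint8 arithmetic: the subtraction and the squares wrap around modulo 256
wrapped = np.sqrt((cb_key - pixels[:, :, 1]) ** 2 + (cr_key - pixels[:, :, 2]) ** 2)

# Casting to float first avoids the wraparound
as_float = pixels.astype(np.float64)
safe = np.sqrt((cb_key - as_float[:, :, 1]) ** 2 + (cr_key - as_float[:, :, 2]) ** 2)

print(pixels.dtype, wrapped[0, 0], safe[0, 0])
```

If this is the cause, checking fg_cbcr_array.dtype and converting with astype(float) before the calculation should make the vectorized result and the single-cell result agree.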

How to flatten a numpy ndarray along axis?

I have three arrays: longitude (400,600), latitude (400,600), and data (30,400,600). What I am trying to do is extract the values in the data array according to their location (latitude and longitude).
Here is my code:
import numpy
import tables
hdf = "data.hdf5"
h5file = tables.openFile(hdf, mode = "r")
lon = numpy.array(h5file.root.Lonitude)
lat = numpy.array(h5file.root.Latitude)
arr = numpy.array(h5file.root.data)
lon = numpy.array(lon.flat)
lat = numpy.array(lat.flat)
arr = numpy.array(arr.flat)
lonlist=[]
latlist=[]
layer=[]
fre=[]
for i in range(0,len(lon)):
    for j in range(0,30):
        longi = lon[j]
        lati = lat[j]
        layers=[j]
        frequency= arr[i]
        lonlist.append(longi)
        latlist.append(lati)
        layer.append(layers)
        fre.append(frequency)
output = numpy.column_stack((lonlist,latlist,layer,fre))
The problem is that "frequency" is not what I want. I want the data array to be flattened along axis 0, so that "frequency" would be the 30 values at one location. Is there a function in numpy to flatten an ndarray along a particular axis?
You can try np.ravel(your_array), or your_array.shape=-1. The np.ravel function lets you use an optional argument order: choose C for a row-major order or F for a column-major order.
I guess what you actually wanted was just a transpose to change the axis order. Depending on what you do with it, it might be useful to call .copy() after the transpose to optimize the memory layout, since transpose does not create a copy itself.
Just to add, if you want something beyond F and C order, you can use transposed = ndarray.transpose([1,2,0]) to move the first axis to the end and the last into second position, and then call transposed.ravel() (I assumed C order, so I moved axis 0 to the end). You can also use reshape, which is more powerful than the simple ravel (the returned shape can have any number of dimensions).
Note that unless the strides add up exactly, numpy will have to make a copy of the array; in many cases you can avoid that with the very nice transposed.flat iterator (an attribute, not a method).
>>> a = np.random.rand(2,2,2)
>>> a
array([[[ 0.67379148, 0.95508303],
[ 0.80520281, 0.34666202]],
[[ 0.01862911, 0.33851973],
[ 0.18464121, 0.64637853]]])
>>> np.ravel(a)
array([ 0.67379148, 0.95508303, 0.80520281, 0.34666202, 0.01862911,
0.33851973, 0.18464121, 0.64637853])
You are essentially unfolding a high-dimensional tensor. Try tensorly.unfold(arr, mode=the_direction_you_want). For example,
import numpy as np
import tensorly as tl
a = np.zeros((3, 4, 5))
b = tl.unfold(a, mode=1)
b.shape # (4, 15)
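Applied to the layout in the question, the transpose-and-reshape idea could be sketched as follows. The small shapes and the meshgrid-generated coordinates are stand-ins for the real (30, 400, 600) data and the HDF5 longitude/latitude arrays; the result is a wide table with one row per location.

```python
import numpy as np

# Small stand-in for data of shape (layers, rows, cols)
n_layers, n_rows, n_cols = 3, 4, 5
lon, lat = np.meshgrid(np.arange(n_cols), np.arange(n_rows))  # both (4, 5)
data = np.arange(n_layers * n_rows * n_cols).reshape(n_layers, n_rows, n_cols)

# Flatten the two spatial axes, then put the layer axis last:
# row k holds all layer values at flat location k = r * n_cols + c
per_location = data.reshape(n_layers, -1).T  # shape (20, 3)

# One row per location: lon, lat, followed by the values of every layer
table = np.column_stack((lon.ravel(), lat.ravel(), per_location))
print(table.shape)
```

np.moveaxis(data, 0, -1).reshape(-1, n_layers) gives the same per-location matrix and may read more clearly when there are more than three axes.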
