Value error while converting tensor to numpy array - python

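I'm using the following code to extract features from images.
import glob
import numpy as np
from tqdm import tqdm

def ext():
    imgPathList = glob.glob("images/" + "*.JPG")
    features = []
    paths = []
    for path in tqdm(imgPathList):
        # get_vector() is defined elsewhere; judging by the error, it returns a torch tensor
        feature = get_vector(path)
        # normalize the first row to unit length
        feature = feature[0] / np.linalg.norm(feature[0])
        features.append(feature)
        paths.append(path)
    features = np.array(features, dtype=np.float32)  # this line raises the error
    return features, paths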
However, the above code throws the following error:
features = np.array(features, dtype=np.float32)
ValueError: only one element tensors can be converted to Python scalars
How can I fix it?

The error occurs because your features variable is a list of multi-element tensors, and np.array cannot turn such a list into a single array: NumPy ends up trying to convert a tensor to a Python scalar, which only works for one-element tensors. One workaround is to build a single tensor with torch.cat() instead of appending to a list. I tried to replicate the solution with a toy example.
I am assuming that features contains 2D tensors.
import numpy as np
import torch

for i in range(1, 11):
    alpha = torch.rand(2, 2)
    if i < 2:
        beta = alpha  # first sample; later samples are concatenated onto it
    else:
        beta = torch.cat((beta, alpha), 0)

features = np.array(beta, dtype=np.float32)

It seems you have a list of tensors, which cannot be converted directly like that.
You need to convert each tensor in the list to a NumPy array first (use torch.Tensor.numpy) and then convert the list of NumPy arrays to the final array.
features = np.array([item.numpy() for item in features], dtype=np.float32)
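A minimal sketch of both routes, assuming every feature is a 1-D tensor of the same length (the random tensors below are stand-ins, not the real image features). The detach() and cpu() calls are an added precaution for tensors that track gradients or live on a GPU; they are no-ops otherwise.

```python
import numpy as np
import torch

# Stand-in for the real per-image feature vectors (assumed: equal-length 1-D tensors).
features = [torch.rand(4) for _ in range(3)]

# Route 1: convert each tensor first, then build the array from the list.
arr_a = np.array([t.detach().cpu().numpy() for t in features], dtype=np.float32)

# Route 2: stack into one tensor and convert once.
arr_b = torch.stack(features).detach().cpu().numpy().astype(np.float32)

print(arr_a.shape, arr_b.dtype)
```

Both routes should give the same (3, 4) float32 array here; stacking first avoids building an intermediate Python list of arrays.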

Related

Reading h5py files into tensors

I have a training set and a test set, both in HDF5 format (read with h5py). I also have a load_data function that loads the files and returns NumPy arrays. The main problem is that I don't need NumPy arrays, as I am working with tensors. I expect x to be a tensor of size (N, D_in) and y a tensor of size (N, D_out), where N is the batch size, D_in the input size of each image, and D_out the output size.
The problem:
x and y do not get converted to tensors of the dimensions described above. If anything, their types remain numpy.ndarray. Any help is appreciated.
import h5py
import numpy as np
import torch

def load_data(train_file, test_file):
    # Load the training data
    train_dataset = h5py.File(train_file, 'r')
    # Separate features (x) and labels (y) for the training set
    train_set_x_orig = np.array(train_dataset["train_set_x"][:])
    train_set_y_orig = np.array(train_dataset["train_set_y"][:])
    # Load the test data
    test_dataset = h5py.File(test_file, 'r')
    # Separate features (x) and labels (y) for the test set
    test_set_x_orig = np.array(test_dataset["test_set_x"][:])
    test_set_y_orig = np.array(test_dataset["test_set_y"][:])
    classes = np.array(test_dataset["list_classes"][:])  # the list of classes
    # Only the labels are converted to tensors here
    train_set_y_orig = torch.from_numpy(train_set_y_orig.reshape((1, train_set_y_orig.shape[0])))
    test_set_y_orig = torch.from_numpy(test_set_y_orig.reshape((1, test_set_y_orig.shape[0])))
    return train_set_x_orig, train_set_y_orig, test_set_x_orig, test_set_y_orig, classes

x = torch.Tensor(N, D_in)
y = torch.Tensor(N, D_out)
train_file = "data/train_catvnoncat.h5"
test_file = "data/test_catvnoncat.h5"
x, y, _, _, _ = load_data(train_file, test_file)
That is because you did not convert train_set_x_orig to a torch tensor before returning it.
Either call torch.from_numpy() on train_set_x_orig before returning, as you do with train_set_y_orig, or convert it to a tensor before assigning it to x.
y, however, should already be of type torch.Tensor.
Below is a demonstration that explains the issue:
# some sample tensor
In [27]: x = torch.Tensor(3, 2)
# check its type
In [28]: type(x)
Out[28]: torch.Tensor
# some sample ndarray
In [29]: arrx = np.arange(6).reshape(3, -1)
# assign array to tensor
# note that now the object `x` refers to the numpy array object
In [30]: x = arrx
# see that the type() of `x` is now numpy ndarray
In [31]: type(x)
Out[31]: numpy.ndarray
Also, as hpaulj pointed out in the comments, there is no need to wrap the sliced objects from h5py in np.array(), since slicing an h5py dataset already returns a NumPy ndarray. So you can just get rid of the wrappers and the code will look cleaner!
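As a rough sketch of the fix, here is the explicit conversion on a stand-in array. The shape and dtype are made up for illustration; in the real code the array would come from train_dataset["train_set_x"][:].

```python
import numpy as np
import torch

# Stand-in for the array read from the HDF5 file (shape and dtype are assumptions).
train_set_x_orig = np.zeros((5, 8), dtype=np.float32)

# Convert explicitly; the resulting tensor shares memory with the NumPy array.
x = torch.from_numpy(train_set_x_orig)

print(type(x), tuple(x.shape))
```

Because torch.from_numpy() shares memory with its input, later in-place changes to the array are visible through the tensor. Use torch.tensor(train_set_x_orig) instead if an independent copy is wanted.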

Advanced Indexing in 3 Dimensional Numpy ndarray In Python

I have an ndarray of shape (68, 64, 64) called 'prediction'. These dimensions correspond to image_number, height, width. For each image, I have a tuple of length two containing the coordinates of a particular location in that 64x64 image, for example (12, 45). I can stack these coordinates into another NumPy ndarray of shape (68, 2) called 'locations'.
How can I construct a slice object, or the necessary advanced indexing indices, to access these locations without using a loop? I'm looking for help with the syntax. The goal is to use pure NumPy arrays without loops.
Working loop structure
import numpy as np
# example code with just ones... The real arrays have 'real' data.
prediction = np.ones((68, 64, 64), dtype='float32')
locations = np.ones((68, 2), dtype='uint32')
selected_location_values = np.empty(prediction.shape[0], dtype='float32')
for index, (image, coordinates) in enumerate(zip(prediction, locations)):
    # index with a tuple so that (row, col) selects a single element
    selected_location_values[index] = image[tuple(coordinates)]
Desired approach
selected_location_values = np.empty(prediction.shape[0], dtype='float32')
correct_indexing = some_function_here(locations)  # ?????
selected_location_values = prediction[correct_indexing]
Straightforward indexing should work:
img = np.arange(locations.shape[0])
r = locations[:, 0]
c = locations[:, 1]
selected_location_values = prediction[img, r, c]
Fancy indexing selects one element per position in the broadcast index arrays, so the result has the shape of the broadcast indices. Here all three index arrays have shape (68,), and element i of the result is prediction[img[i], r[i], c[i]]. In this case, the indices are quite straightforward: you just need the range to tell you which image each location corresponds to.
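To check that the indexed result matches the loop, here is a small self-contained sketch. The random data and the seed are made up for illustration; only the shapes follow the question.

```python
import numpy as np

rng = np.random.default_rng(0)
prediction = rng.random((68, 64, 64), dtype=np.float32)
locations = rng.integers(0, 64, size=(68, 2))

# One index array per axis; all three have shape (68,).
img = np.arange(locations.shape[0])
selected = prediction[img, locations[:, 0], locations[:, 1]]

# Reference result from an explicit loop over the images.
looped = np.array([image[r, c] for image, (r, c) in zip(prediction, locations)])

print(selected.shape, np.array_equal(selected, looped))
```

Note that prediction[:, r, c] would not give the same thing: a slice on the first axis returns every (r, c) pair for every image, with shape (68, 68).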

When Python code outputs an array or a list of arrays, are those NumPy arrays or something else?

My Python code outputs a list of arrays. My question is:
are those NumPy arrays or something else?
When I try to use those outputs by copying and pasting them into new Python code, I get a type error, which I think arises because they are NumPy arrays.
import numpy as np

class Network(object):
    def __init__(self, sizes):
        """The list ``sizes`` contains the number of neurons in the
        respective layers of the network. """
        self.num_layers = len(sizes)
        self.sizes = sizes
        self.biases = [np.random.randn(y, 1) for y in sizes[1:]]
        self.weights = [np.random.randn(y, x)
                        for x, y in zip(sizes[:-1], sizes[1:])]

    def feedforward(self, a):
        """Return the output of the network if ``a`` is input."""
        for b, w in zip(self.biases, self.weights):
            a = np.dot(w, a)+b
        return a

Network = Network([2,3,1])
print(Network.feedforward([1,5]))
print(Network.weights)
print(Network.biases)
print(type(Network.weights))
Here is the output:
[[-3.29027694 -2.17332051 -0.55471131]]
[array([[-1.06867352, 1.10685543],
[-0.03651884, 0.59706138],
[ 1.35881759, -0.12161689]]), array([[-1.52001116,
0.44110627, 0.34252238]])]
[array([[-0.25784339],
[ 0.50499638],
[-0.00993926]]), array([[-0.61316203]])]
<class 'list'>
The type is shown to be a regular Python list. If you want to convert this list to a NumPy array, you can use the numpy.asarray function. Note that this only produces a regular array when all elements have the same shape; the arrays in Network.weights have shapes (3, 2) and (1, 3), so that list cannot be combined into one rectangular array.
Here are some examples of the usage of the asarray function:
np.asarray([1,5])
my_tuple = ([1, 3, 9], [8, 2, 6])
out_arr = np.asarray(my_tuple)
Source: https://www.geeksforgeeks.org/numpy-asarray-in-python/
np.array can be used the same way: it takes a regular Python list (or tuple) and returns a NumPy array.
The answer is simple; thanks to @gmoshkin for clearing the fog around me.
When Python code outputs an array or a list of arrays, and you have not imported array from the array module but only used NumPy imports and operations, the code cannot output anything but NumPy arrays or lists whose elements are NumPy arrays. If we check the type of an element with
print(type(Network.weights[0])), we get the output: <class 'numpy.ndarray'>
So it is in fact a list of NumPy arrays, not a list of standard arrays.
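The container and its elements can be checked separately. This sketch rebuilds the weights list the same way the Network class does for sizes [2, 3, 1]; the values are random, so only types and shapes are inspected.

```python
import numpy as np

# Same construction as Network.weights for sizes [2, 3, 1].
sizes = [2, 3, 1]
weights = [np.random.randn(y, x) for x, y in zip(sizes[:-1], sizes[1:])]

print(type(weights))                # the container
print(type(weights[0]))             # one element
print([w.shape for w in weights])   # the element shapes differ
```

The container is a plain list while each element is a numpy.ndarray, which is why type(Network.weights) reports list.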

Labels as a matrix in LMDB data using python

I want to create LMDB data in Python where the labels are not scalars; instead, each label is a (1, K) vector, where K is the number of classes. More specifically, the label vector is zero everywhere except at the index of the corresponding class, where it is 1.
I tested the following code in Python:
with env.begin(write=True) as txn:
    for i in range(N):
        datum = caffe.proto.caffe_pb2.Datum()
        datum.channels = X.shape[1]
        datum.height = X.shape[2]
        datum.width = X.shape[3]
        datum.data = X[i].tobytes()  # or .tostring() if numpy < 1.9
        datum.label = int(y[i])  # fails when y[i] is a (1, K) vector
        str_id = '{:08}'.format(i)
        txn.put(str_id.encode('ascii'), datum.SerializeToString())
        print(i + 1)
But I got this error: TypeError: only length-1 arrays can be converted to Python scalars, where y[i] is a NumPy (1, K) vector as described above.
I am also wondering whether caffe would accept such a format of labels.
Any help would be very appreciated.
Yes, caffe supports transforming NumPy arrays to datum objects, which you can then put into LMDB.
Use caffe.io.array_to_datum(numpy_array) to transform a numpy_array to a datum. NOTE that the numpy_array must have 4 axes, so if you want to put a vector into LMDB, you should initialize a numpy_array with shape [1,1,1,M], where M is the length of your vector.
Here is a tool to write image/map pairs to LMDB which can be fed to caffe networks.
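The NumPy side of this can be sketched without caffe. The class count and class ids below are made up for illustration, and the block only builds the one-hot vectors and reshapes them to the [1,1,1,M] layout described above. It makes no caffe calls, and the number of axes array_to_datum accepts should be checked against the installed caffe version.

```python
import numpy as np

K = 5                                # number of classes (made up for illustration)
class_ids = np.array([2, 0, 4])      # hypothetical integer class labels

# Row i of the identity matrix is the one-hot vector for class i.
one_hot = np.eye(K, dtype=np.float32)[class_ids]      # shape (3, K)

# Give each label vector the singleton axes described above: (1, 1, 1, K).
label_arrays = [v.reshape(1, 1, 1, K) for v in one_hot]

print(one_hot.shape, label_arrays[0].shape)
```

Each entry of label_arrays could then be handed to whichever writer is used for the label database.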

Correct way to broadcast a 100x9 to a 100x9x1x1 numpy array for computation in Caffe

I am trying to feed my own data into a caffe model using the Python wrappers. I read the data from HDF5 as a NumPy array with dimensions 100x9. For the input to the model, I use the following code:
input_ = np.zeros((100,9,1,1), dtype=np.float32)
net.forward(**{net.inputs[0]:input_})
So basically I need to fill input_ from a 100x9 array.
Here's how you would convert a 100x9 array to a 100x9x1x1 array:
x = np.zeros((100,9))
y = x[:,:,np.newaxis,np.newaxis]
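For comparison, a short sketch showing that np.newaxis indexing and an explicit reshape give the same 100x9x1x1 result. The arange data is a stand-in for the values read from HDF5.

```python
import numpy as np

x = np.arange(900, dtype=np.float32).reshape(100, 9)

y_newaxis = x[:, :, np.newaxis, np.newaxis]   # append two singleton axes by indexing
y_reshape = x.reshape(100, 9, 1, 1)           # same result via an explicit reshape

print(y_newaxis.shape, y_reshape.shape, np.shares_memory(x, y_newaxis))
```

The np.newaxis form returns a view of x rather than a copy, so no data is duplicated before it is passed on to the network.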
