I want to create an LMDB dataset in Python where each label is not a scalar but a (1,K) vector, where K is the number of classes. More specifically, the label vector is zero everywhere except at the index of the corresponding class, where it is 1.
I tested the following code in python:
with env.begin(write=True) as txn:
    for i in range(N):
        datum = caffe.proto.caffe_pb2.Datum()
        datum.channels = X.shape[1]
        datum.height = X.shape[2]
        datum.width = X.shape[3]
        datum.data = X[i].tobytes()  # or .tostring() if numpy < 1.9
        datum.label = int(y[i])
        str_id = '{:08}'.format(i)
        txn.put(str_id.encode('ascii'), datum.SerializeToString())
        print(i + 1)
But I got this error: TypeError: only length-1 arrays can be converted to Python scalars, where y[i] is a numpy (1,K) vector as described above.
I am also wondering whether Caffe would accept labels in this format.
Any help would be much appreciated.
Yes, Caffe supports converting numpy arrays to a datum, which you can then put into LMDB.
Use caffe.io.array_to_datum(numpy_array) to convert numpy_array to a datum. Note that numpy_array must have 4 axes, so if you want to put a vector into LMDB, you should initialize numpy_array with shape [1,1,1,M], where M is the length of your vector.
Here is a tool to write image/map pairs to LMDB, which can be fed to Caffe networks.
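For the label vectors themselves, here is a minimal NumPy-only sketch of building one-hot rows and giving each one the [1,1,1,M] layout suggested above. It makes no Caffe calls; `K`, `y` and the sample values are made up for illustration, and it is worth checking which array shape the array_to_datum in your Caffe version actually expects.

```python
import numpy as np

K = 4                               # hypothetical number of classes
y = np.array([2, 0, 3])             # hypothetical integer class indices

# Row i of the identity matrix is the one-hot vector for class i.
one_hot = np.eye(K, dtype=np.float32)[y]      # shape (3, K)

# Give every label vector the [1, 1, 1, M] layout (here M == K).
label_arrays = [row.reshape(1, 1, 1, K) for row in one_hot]

print(one_hot.shape, label_arrays[0].shape)
```

Each entry of `label_arrays` is then an ordinary float array that can be stored separately from the image data, for example in a second LMDB with matching keys.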
It's hard to articulate the question I have exactly. I want to subscript all values in the array, but I want the order to be different. The code is below:
# import numpy
import numpy as np
# variables
y = np.zeros([30,120,1440])
composite_events = np.zeros([30,120,1440])
# longitude array
lon = np.linspace(-179.25,179.25,1440)
# index for the starting point
center_lon_index = int(500)
# index for shifting the longitude array
lonindex = (np.arange(start=0,stop=np.size(lon),step=1) + center_lon_index) % np.size(lon)
# set the composite events to be the same as the data, but centering a particular point
composite_events[0,:,:] = y[0,:,lonindex]
The code returns the following error.
ValueError: could not broadcast input array from shape (1440,120) into shape (120,1440)
I understand the error, but as far as I can tell, the shape of y should be the same as the shape of composite_events. This type of code works in other languages I've used. What is python doing here? Thanks!
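A likely explanation, going by NumPy's indexing rules: y[0,:,lonindex] mixes an integer, a slice and an index array. When the integer and the index array are separated by a slice, NumPy places the indexed dimension first, which gives (1440, 120) rather than (120, 1440). The sketch below uses smaller, made-up array sizes; indexing in two steps keeps the expected axis order.

```python
import numpy as np

y = np.arange(2 * 3 * 8, dtype=float).reshape(2, 3, 8)   # small stand-in for the data
lonindex = (np.arange(8) + 5) % 8                         # shifted longitude indices

# Integer and index array separated by a slice: the indexed axis moves to the front.
print(y[0, :, lonindex].shape)      # (8, 3)

# Select the first block, then reorder its last axis: shape stays (3, 8).
shifted = y[0][:, lonindex]
print(shifted.shape)                # (3, 8)

# np.roll gives the same circular shift along the longitude axis.
assert np.array_equal(shifted, np.roll(y[0], -5, axis=1))
```

With the original arrays this would correspond to composite_events[0,:,:] = y[0][:, lonindex].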
I have a task to create a 30x40 feature matrix with random integers between 1 & 100:
import numpy as np
matrix= np.random.randint(1,100,size=(30,40))
Next I need to rescale the elements in the matrix to be between the range 5-10:
from sklearn import preprocessing
scaler = preprocessing.MinMaxScaler()
scaler.fit (5,10)
matrix1 = scaler.fit_transform(matrix)
Which gives me this error:
ValueError: Expected 2D array, got scalar array instead:
array=5.0.
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample
I've tried reshaping the data:
matrix.reshape(-1,1)
but I get the same error.
I think you need to define the feature range when you create an instance of MinMaxScaler like this:
scaler = preprocessing.MinMaxScaler(feature_range=(5, 10))
And then you could fit and transform the data like this:
matrix1 = scaler.fit_transform(matrix)
The last line is a short form for:
scaler.fit(matrix)
matrix1 = scaler.transform(matrix)
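To see what feature_range=(5, 10) does, here is a NumPy-only sketch of the column-wise rescaling that the MinMaxScaler documentation describes: each column is mapped so that its minimum becomes 5 and its maximum becomes 10. The variable names and the seed are just for illustration, and the sketch assumes no column is constant.

```python
import numpy as np

rng = np.random.RandomState(0)
matrix = rng.randint(1, 101, size=(30, 40))   # integers 1..100 (upper bound is exclusive)

lo, hi = 5, 10
col_min = matrix.min(axis=0)
col_max = matrix.max(axis=0)

# Scale each column to [0, 1], then stretch and shift it to [lo, hi].
scaled = (matrix - col_min) / (col_max - col_min) * (hi - lo) + lo

print(scaled.min(), scaled.max())
```

Because the scaling is per column (per feature), every column spans the full 5 to 10 range, not just the matrix as a whole.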
I'm using the following code to extract features from images.
def ext():
    imgPathList = glob.glob("images/" + "*.JPG")
    features = []
    paths = []
    for i, path in enumerate(tqdm(imgPathList)):
        feature = get_vector(path)
        feature = feature[0] / np.linalg.norm(feature[0])
        features.append(feature)
        paths.append(path)
    features = np.array(features, dtype=np.float32)
    return features, paths
However, the above code throws the following error,
features = np.array(features, dtype=np.float32)
ValueError: only one element tensors can be converted to Python scalars
How can I fix it?
The error says that your features variable is a list of multi-element tensors, which np.array cannot convert directly, because it tries to turn each tensor into a single Python scalar. One workaround is to build a single tensor with torch.cat() instead of appending to a list. I tried to replicate the solution with a toy example.
I am assuming that features contains 2D tensors.
import numpy as np
import torch
for i in range(1,11):
    alpha = torch.rand(2,2)
    if i<2:
        beta = alpha  # first sample; later samples are concatenated onto it
    else:
        beta = torch.cat((beta,alpha),0)
features = np.array(beta, dtype=np.float32)
It seems you have a list of tensors, which you cannot convert directly like that.
You need to convert the inner tensors to NumPy arrays first (use torch.Tensor.numpy for that) and then convert the list of NumPy arrays to the final array.
features = np.array([item.numpy() for item in features], dtype=np.float32)
I'm trying to classify images using an Artificial Neural Network and the approach I want to try is:
Get feature descriptors (using SIFT for now)
Classify using a Neural Network
I'm using OpenCV3 and Python for this.
I'm relatively new to Machine Learning and I have the following question -
Each image that I analyse will have a different number of 'keypoints' and hence different dimensions of the 2D 'descriptor' array. How do I decide the input for my ANN? For example, for one sample image the descriptor shape is (12211, 128). Do I flatten this array and use it as an input, in which case I have to worry about varying input sizes for each image, or do I compute something else for the input?
I'm not sure if this is an exact solution but this worked for me. The main idea is as follows:
Divide your image into a MxN grid.
Obtain a set number of feature points for each sub-image.
Concatenate the results for all the sub-images to obtain a feature vector for the entire image.
The supporting code is roughly as follows (see the function "pre_process_images"):
from itertools import product
import numpy as np

def tiles(arr, nrows, ncols):
    """
    If arr is an image array of shape (rows, cols, channels), the returned
    list contains nrows x ncols numpy arrays, each preserving the "physical"
    layout of arr.
    When the array shape (rows, cols) is not divisible by (nrows, ncols),
    some of the tile dimensions can differ, following numpy.array_split.
    """
    rows, cols, channel = arr.shape
    col_arr = np.array_split(range(cols), ncols)
    row_arr = np.array_split(range(rows), nrows)
    return [arr[r[0]: r[-1]+1, c[0]: c[-1]+1]
            for r, c in product(row_arr, col_arr)]
import cv2

def pre_process_images(data, dimensions=(28, 28)):
    images = data['image']
    features = []
    count = 1
    nrows = dimensions[0]
    ncols = dimensions[1]
    sift = cv2.xfeatures2d.SIFT_create(1)
    for arr in images:
        image_feature = []
        cut_image = tiles(arr, nrows, ncols)
        for small_image in cut_image:
            # run SIFT on the current tile, not on the whole image
            (kps, descs) = sift.detectAndCompute(small_image, None)
            image_feature.append(descs.flatten())
        features.append(image_feature)
        print(count)
        count += 1
    data['sift_features'] = features
    return data
However, this is extremely slow. I'm currently working on a way to select features optimally using PCA.
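Why the grid approach yields a fixed-size input can be checked without OpenCV. In this sketch a per-tile mean and standard deviation stand in for the SIFT descriptor (an assumption purely for illustration, not the method above), and `grid_features` is a hypothetical helper name.

```python
import numpy as np

def grid_features(image, nrows, ncols):
    """Concatenate a fixed number of values per tile of a 2D image."""
    feats = []
    for band in np.array_split(image, nrows, axis=0):
        for tile in np.array_split(band, ncols, axis=1):
            feats.extend([tile.mean(), tile.std()])   # stand-in descriptor
    return np.array(feats, dtype=np.float32)

small = grid_features(np.random.rand(50, 70), 4, 4)
large = grid_features(np.random.rand(300, 411), 4, 4)
print(small.shape, large.shape)     # (32,) (32,)
```

Both images give a vector of length nrows * ncols * 2 regardless of their size, which is what lets the result feed a network with a fixed input layer.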
It would help to normalize each image before extracting the features.
I am trying to input my own data into a caffe model using the python wrappers. I read the data from HDF5 as a numpy array with dimension 100x9. But for the input to model, I use the following code:
input_ = np.zeros((100,9,1,1), dtype=np.float32)
net.forward(**{net.inputs[0]:input_})
So basically I need to fill out input_ from a 100x9 array.
Here's how you would convert a 100x9 array to a 100x9x1x1 array:
x = np.zeros((100,9))
y = x[:,:,np.newaxis,np.newaxis]
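To fill the preallocated input_ from the question in place, a small sketch along the same lines (the random `data` array is a stand-in for the values read from HDF5):

```python
import numpy as np

data = np.random.rand(100, 9).astype(np.float32)   # stand-in for the HDF5 data
input_ = np.zeros((100, 9, 1, 1), dtype=np.float32)

# Copy the values into the existing buffer; the two trailing axes have length 1.
input_[...] = data[:, :, np.newaxis, np.newaxis]

# reshape produces the same 4-axis layout.
assert np.array_equal(input_, data.reshape(100, 9, 1, 1))
print(input_.shape)                 # (100, 9, 1, 1)
```

Either form works here; np.newaxis avoids repeating the sizes, while reshape makes the target shape explicit.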