Dynamic dimension in a numpy array - python

I have a list of numpy arrays which are given as input to a neural network model by a generator. However, the arrays can have different shapes (*, 1), where * is a dynamic number that varies per image. How can I handle this in the generator? In the neural network model I declared the input like this:
preds = Input(shape=(None,1),name='preds')
In the generator, I am trying to do something like this:
result = np.zeros((batch_size, None, 1))
for i in range(batch_size):
    result[i, :, :] = predictions
But this gives me the error TypeError: 'NoneType' object cannot be interpreted as an integer at the np.zeros((batch_size, None, 1)) call. What is the correct way to give a numpy array a dynamic shape?
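For reference, NumPy cannot allocate an array with an unknown dimension: unlike Keras' Input(shape=(None, 1)), every dimension passed to np.zeros must be a concrete integer. A minimal sketch of one common workaround, padding each batch to its longest item (predictions_list is a hypothetical list of (*, 1) arrays for one batch):
import numpy as np

def make_batch(predictions_list):
    # np.zeros needs concrete sizes, so compute the dynamic
    # dimension from the data instead of passing None
    max_len = max(p.shape[0] for p in predictions_list)
    # pad shorter items with zeros: a single ndarray cannot hold ragged rows
    result = np.zeros((len(predictions_list), max_len, 1))
    for i, p in enumerate(predictions_list):
        result[i, :p.shape[0], :] = p
    return result
This is still compatible with the Input(shape=(None, 1)) declaration, since the model only requires the second dimension to be consistent within a batch, not across batches.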

Related

Give two inputs to torch::jit::script::Module forward method

I am trying to build and train a network in Python using PyTorch. My forward method takes two inputs as follows:
def forward(self, x1, x2):
I trained this model in Python and saved it using torch.jit.script.
Then I load the model in C++ using torch::jit::load.
How do I now pass the inputs to the model in C++?
If I try passing two separate tensors to the forward method like the following
std::vector<torch::jit::IValue> inputs1{tensor1};
std::vector<torch::jit::IValue> inputs2{tensor2};
at::Tensor output = module.forward(inputs1,inputs2).toTensor();
then I receive an error saying that the method forward expects 1 argument, 2 provided.
I also can't simply concatenate the two tensors, since their shapes differ along every axis.
The problem is solved by concatenating the two tensors and giving the concatenated tensor as input to the model. In the forward method, the two separate tensors can then be recovered from the concatenated tensor and used separately for the output computation.
For the concatenation to work, I padded the tensors with zeros so that they are the same size along every axis except the one along which they are concatenated.
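A minimal sketch of that padding-and-concatenation idea in Python (the shapes here are made up for illustration):
import torch
import torch.nn.functional as F

x1 = torch.randn(3, 5)   # hypothetical first input
x2 = torch.randn(7, 4)   # hypothetical second input

# zero-pad the last axis so both tensors match on every axis
# except dim 0, the axis we concatenate along
width = max(x1.shape[1], x2.shape[1])
x1p = F.pad(x1, (0, width - x1.shape[1]))
x2p = F.pad(x2, (0, width - x2.shape[1]))
combined = torch.cat([x1p, x2p], dim=0)   # shape (10, 5), a single input

# inside forward, recover the two parts; the split sizes must be
# known to the model (e.g. fixed at build time)
part1, part2 = torch.split(combined, [3, 7], dim=0)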

efficiently converting for model.fit

I'm struggling to load data into model.fit efficiently. My code creates a training_data object with samples and values. samples is a standard Python list of tf.Tensor objects, and values is a list of integers.
When running
model.fit(training_data.samples, training_data.values, epochs=10)
I get an error
ValueError: Failed to find data adapter that can handle input: (<class 'list'> containing values of types {"<class 'tensorflow.python.framework.ops.EagerTensor'>"}), (<class 'list'> containing values of types {"<class 'int'>"})
I can get this to work by pre-converting everything to numpy arrays like this:
s, v = np.asarray(training_data.samples), np.asarray(training_data.values)
model.fit(s, v, epochs=10)
However, this is impossibly slow. Loading the data and the heavy preprocessing (signal chunking, FFT, etc.) take about a minute, but the conversion alone hangs for an hour on just 1800 samples, and I lose patience before the actual learning starts. Each tensor's shape is (94, 257), so nothing big.
So what's an efficient way to pass data to model.fit, given that I already have it in memory?
Hi, this is just a suggestion, but try using a generator built on tf.keras.utils.Sequence; whether it fits depends on what type of data you are using.
You can look at the example here:
https://stanford.edu/~shervine/blog/keras-how-to-generate-data-on-the-fly
I implemented my own here; you can check it out:
https://github.com/edwin-19/custom_keras_generator/blob/master/notebooks/Model%20Comparison.ipynb
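A minimal sketch of such a generator for the data described above (class and variable names are illustrative, not from any library):
import numpy as np
import tensorflow as tf

class TrainingSequence(tf.keras.utils.Sequence):
    """Serves (x, y) batches to model.fit one at a time."""

    def __init__(self, samples, values, batch_size=32):
        self.samples = samples          # list of (94, 257) tensors
        self.values = values            # list of ints
        self.batch_size = batch_size

    def __len__(self):
        # number of batches per epoch
        return int(np.ceil(len(self.samples) / self.batch_size))

    def __getitem__(self, idx):
        lo = idx * self.batch_size
        hi = lo + self.batch_size
        x = tf.stack(self.samples[lo:hi])    # (batch, 94, 257)
        y = np.asarray(self.values[lo:hi])
        return x, y

# model.fit(TrainingSequence(training_data.samples, training_data.values), epochs=10)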
For me this is a common error when I use the wrong input format (typically when I pass a list).
Try converting only training_data.values to a numpy array or a tensor if it's a list.
As pointed out in the documentation, the arguments (x, y, validation_data, ...) accept a limited set of input types, i.e. numpy arrays and tensors:
model.fit
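Concretely, since the samples are already EagerTensors, stacking them with tf.stack should stay inside TensorFlow and avoid the slow element-by-element NumPy conversion (a sketch using the names from the question):
import numpy as np
import tensorflow as tf

s = tf.stack(training_data.samples)    # (1800, 94, 257) in one step
v = np.asarray(training_data.values)   # plain ints convert quickly
model.fit(s, v, epochs=10)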

TensorFlow Federated: How can I write an Input Spec for a model with more than one input

I'm trying to make an image captioning model using the federated learning library provided by TensorFlow, but I'm stuck on this error:
Input 0 of layer dense is incompatible with the layer: : expected min_ndim=2, found ndim=1.
This is my input_spec:
input_spec = collections.OrderedDict(
    x=(tf.TensorSpec(shape=(2048,), dtype=tf.float32),
       tf.TensorSpec(shape=(34,), dtype=tf.int32)),
    y=tf.TensorSpec(shape=(None,), dtype=tf.int32),
)
The model takes image features as the first input and a list of vocabulary as a second input, but I can't express this in the input_spec variable. I tried expressing it as a list of lists but it still didn't work. What can I try next?
Great question! It looks to me like this error is coming out of TensorFlow proper, indicating that you probably have the correct nested structure but the leaves may be off. Your input spec looks like it "should work" from TFF's perspective, so it is probably slightly mismatched with the data you actually have.
The first thing I would try: if you have an example tf.data.Dataset that will be passed in to your client computation, you can simply read the input_spec directly off this dataset as its element_spec attribute. This would look something like:
# ds = example dataset
input_spec = ds.element_spec
This is the easiest path. If you have something like lists of lists of numpy arrays, there is still a way to pull this information off the data itself; the following code snippet should get you there:
# data = list of list of numpy arrays
input_spec = tf.nest.map_structure(lambda x: tf.TensorSpec(x.shape, x.dtype), data)
Finally, if you have a list of lists of tf.Tensors, TensorFlow provides a similar function:
# tensor_structure = list of lists of tensors
tf.nest.map_structure(tf.TensorSpec.from_tensor, tensor_structure)
In short, I would recommend not specifying input_spec by hand, but rather letting the data tell you what its input spec should be.
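For instance, reading the spec off a toy dataset shaped like the one in the question (all values here are made up for illustration):
import collections
import tensorflow as tf

# one example: (image features, token ids) -> label
example = collections.OrderedDict(
    x=(tf.zeros([2048], tf.float32), tf.zeros([34], tf.int32)),
    y=tf.zeros([1], tf.int32),
)
ds = tf.data.Dataset.from_tensors(example).repeat(8).batch(2)

# element_spec reflects the batched structure, batch dimension included
input_spec = ds.element_spec
print(input_spec)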

how to correctly use tf.function with a TensorFlow Dataset

I'm trying to use TF Datasets with a @tf.function to perform some preprocessing on a directory of images. Inside the tf.function the image file is read as a raw string tensor, and I'm trying to take a slice from that tensor. The slice, the first 13 characters, represents header info about .ppm images. I get an error:
ValueError: Shape must be rank 1 but is rank 0 for 'Slice' (op: 'Slice') with input shapes: [], [1], [1].
Initially I tried to slice the .numpy() attribute of the tensor (the filepath input parameter to the tf.function) directly, but I think it is semantically wrong to do this inside a tf.function. It also didn't work, as the filepath input tensor does not have a numpy() attribute (I don't understand why). Outside of the tf.function, e.g. in a Jupyter notebook cell, I can iterate over the dataset, get individual items which do have a numpy attribute, and do the slice and all subsequent processing just fine. I realize there may be a gap in my understanding of how TF works (I am using TF 2.0), so I hope someone can clarify what I missed in my readings.
The purpose of the tf.function is to convert the ppm images to png, so the function has a side effect, but I did not get far enough to find out whether that is possible.
Here's the code:
@tf.function
def ppm_to_png(filepath):
    ppm_bytes = tf.io.read_file(filepath)  # .numpy()
    bytes_header = tf.slice(ppm_bytes, [0], [13])
    # bytes_header = ppm_bytes[:13].eval()  # did not work either, with a similar error message
    ...
import glob
import os

files = glob.glob(os.path.join(data_dir, '00000/*.ppm'))
dataset = tf.data.Dataset.from_tensor_slices(files)
png_filepaths = dataset.map(ppm_to_png, num_parallel_calls=tf.data.experimental.AUTOTUNE)
To manipulate string values in TF, have a look at the tf.strings namespace.
In this case, you can use tf.strings.substr:
@tf.function
def ppm_to_png(filepath):
    ppm_bytes = tf.io.read_file(filepath)
    bytes_header = tf.strings.substr(ppm_bytes, 0, 13)
    tf.print(bytes_header)
tf.slice only operates on Tensor objects and doesn't work on their elements. Here, ppm_bytes is a scalar Tensor containing a single element of type tf.string, whose value is the entire string contents of the file. So when you call tf.slice, it only looks at the scalar tensor itself, and is not smart enough to realize that you actually want a slice of that element instead.
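A quick sketch of the distinction (the string value is a made-up ppm header):
import tensorflow as tf

s = tf.constant("P6\n1300 800\n255\n")   # scalar string tensor, shape ()
print(tf.strings.substr(s, 0, 13))       # slices the characters inside the element
# tf.slice(s, [0], [13]) would raise the rank error from the question:
# it slices tensor elements, and a rank-0 tensor has no axis to slice along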

feed composite inputs to model

I need to feed an image and a vector sampled from a normal distribution simultaneously. Because the image dataset I'm using is too large, I create an ImageDeserializer for that part. But I also need to add a random vector (sampled from numpy's normal distribution) to the input map before feeding it to the network. Is there any way to achieve this?
I also tested:
mb_data = reader_train.next_minibatch(mb_size, input_map=input_map)
mb_data[random_input_node] = np.random.normal(size=(mb_size, 100))
but get the following error:
TypeError: cannot convert value of dictionary to N4CNTK13MinibatchDataE
The problem was solved with the following snippet for feeding data to the trainer:
mb_data = reader_train.next_minibatch(mb_size, input_map=input_map)
z = np.random.normal(size=(mb_size, 100))
my_trainer.train_minibatch({feature_image: mb_data[image].data, feature_z: z})
Also thanks to @mewahl: defining a new reader is another suitable way to solve the problem, and I suspect it would be faster than what I have done.
