Tensorflow Split Using Feed Dict Input Dimension - python

I'm trying to tf.split a tensor based on the dimension of an input fed in using feed_dict (dimension of input changes with each batch). Currently I keep getting an error saying that a tensor cannot be split with a "Dimension". Is there a way to get the value of the dimension and split using it?
Thanks!
input_d = tf.placeholder(tf.int32, [None, None], name="input_d")
# toy feed dict
feed = {
input_d: [[20,30,40,50,60],[2,3,4,5,-1]] # document
}
W_embeddings = tf.get_variable(shape=[vocab_size, embedding_dim], \
initializer=tf.random_uniform_initializer(-0.01, 0.01),\
name="W_embeddings")
document_embedding = tf.gather(W_embeddings, input_d)
timesteps_d = document_embedding.get_shape()[1]
doc_input = tf.split(1, timesteps_d, document_embedding)

tf.split takes a python integer for the num_split argument. However, document_embedding.get_shape() returns a TensorShape, and document_embedding.get_shape()[1] gives a Dimension instance, hence you get an error says "can't split with a Dimension".
Try timestep_ds = document_embedding.get_shape().as_list()[1], this statement should give you a python integer.
Here are some relevant documentations for tf.split and tf.Tensor.get_shape

Related

Error when trying to simulate one FMU with 4 inputs from csv files [duplicate]

I have a fmu created in gt-suite. I am trying to work with it in python using python PyFMI package.
My code
from pyfmi import load_fmu
import numpy as np
model = load_fmu('AHUPIv2b.fmu')
t = np.linspace(0.,100.,100)
u = np.linspace(3.5,4.5,100)
v = np.linspace(900,1000,100)
u_traj = np.transpose(np.vstack((t,u)))
v_traj = np.transpose(np.vstack((t,v)))
input_object = (('InputVarI','InputVarP'),(u_traj,v_traj))
res = model.simulate(final_time=500, input=input_object, options={'ncp':500})
res = model.simulate(final_time=10)
model.simulate takes input as one of its parameters, Documentation says
input --
Input signal for the simulation. The input should be a 2-tuple
consisting of first the names of the input variable(s) and then
the data matrix.
'InputVarI','InputVarP' are the input variables and u_traj,v_traj are data matrices.
My code gives an error
gives an error -
TypeError: tuple indices must be integers or slices, not tuple
Is the input_object created wrong? Can someone help with how to create the input tuples correctly as per the documentation?
The input object is created incorrect. The second variable in the input tuple should be a single data matrix, not two data matrices.
The correct input should be:
data = np.transpose(np.vstack((t,u,v)))
input_object = (['InputVarI','InputVarP'],data)
See also pyFMI parameter change don't change the simulation output

RuntimeError: Operation does not have identity in f-string statement

I am evaluating a pytorch model. It gives results in following manner
results = model(batch)
# results is a list of dictionaries with 'boxes', 'labels' and 'scores' keys and torch tensor values
Then I try to print some of the values to check what is happening
print(
(
f"{results[0]['boxes'].shape[0]}\n" # Returns how many boxes there is
f"{results[0]['scores'].mean()}" # Mean credibility score of the boxes
)
)
This results in error
Exception has occurred: RuntimeError: operation does not have identity
To make things more confusing, print only fails some of the time. Why does this fail?
I had the same problem in my code. It turns out when trying to access attributes of empty tensors (e.g. shape, mean, etc.) the outcome is the no identity exception.
Code to reproduce:
import torch
a = torch.arange(12)
mask = a > 100
b = a[mask] # tensor([], dtype=torch.int64) -- empty tensor
b.min() # yields "RuntimeError: operation does not have an identity."
Figure out why your code returns empty tensors and this will solve the problem.

Find input shape from onnx file

How can I find the input size of an onnx model? I would eventually like to script it from python.
With tensorflow I can recover the graph definition, find input candidate nodes from it and then obtain their size. Can I do something similar with ONNX (or even simpler)?
Thank you
Yes, provided the input model has the information. Note that inputs of an ONNX model may have an unknown rank or may have a known rank with dimensions that are fixed (like 100) or symbolic (like "N") or completely unknown. You can access this as below:
import onnx
model = onnx.load(r"model.onnx")
# The model is represented as a protobuf structure and it can be accessed
# using the standard python-for-protobuf methods
# iterate through inputs of the graph
for input in model.graph.input:
print (input.name, end=": ")
# get type of input tensor
tensor_type = input.type.tensor_type
# check if it has a shape:
if (tensor_type.HasField("shape")):
# iterate through dimensions of the shape:
for d in tensor_type.shape.dim:
# the dimension may have a definite (integer) value or a symbolic identifier or neither:
if (d.HasField("dim_value")):
print (d.dim_value, end=", ") # known dimension
elif (d.HasField("dim_param")):
print (d.dim_param, end=", ") # unknown dimension with symbolic name
else:
print ("?", end=", ") # unknown dimension with no name
else:
print ("unknown rank", end="")
print()
Please do NOT use input as a variable name because it's a built-in function.
The first idea that comes to mind is that using the google.protobuf.json_format.MessageToDict() method if I need the name, data_type, or some properties of a protobuf object. For example:
form google.protobuf.json_format import MessageToDict
model = onnx.load("path/to/model.onnx")
for _input in model.graph.input:
print(MessageToDict(_input))
will gives the output like:
{'name': '0', 'type': {'tensorType': {'elemType': 2, 'shape': {'dim': [{'dimValue': '4'}, {'dimValue': '3'}, {'dimValue': '384'}, {'dimValue': '640'}]}}}}
I'm not very clear whether every model.graph.input is a RepeatedCompositeContainer object or not, but it would be necessary to use the for loop when it is a RepeatedCompositeContainer.
Then you need to get the shape information from the dim field.
model = onnx.load("path/to/model.onnx")
for _input in model.graph.input:
m_dict = MessageToDict(_input))
dim_info = m_dict.get("type").get("tensorType").get("shape").get("dim") # ugly but we have to live with this when using dict
input_shape = [d.get("dimValue") for d in dim_info] # [4,3,384,640]
If you need the only dim, please use message object instead.
model = onnx.load("path/to/model.onnx")
for _input in model.graph.input:
dim = _input.type.tensor_ype.shape.dim
input_shape = [MessgeToDict(d).get("dimValue") for d in dim]
# if you prefer the python naming style, using the line below
# input_shape = [MessgeToDict(d, preserving_proto_field_name=True).get("dim_value") for d in dim]
One line version:
model = onnx.load("path/to/model.onnx")
input_shapes = [[d.dim_value for d in _input.type.tensor_type.shape.dim] for _input in model.graph.input]
Refs:
https://github.com/googleapis/python-vision/issues/70
AttributeError: 'google.protobuf.pyext._message.RepeatedCompositeCo' object has no attribute 'append'
If you use onnxruntime instead of onnx for inference.
Try using the below code.
import onnxruntime as ort
model = ort.InferenceSession("model.onnx", providers=['CUDAExecutionProvider', 'CPUExecutionProvider'])
input_shape = model.get_inputs()[0].shape

Convert a tensorflow tf.data.Dataset FlatMapDataset to TensorSliceDataset

I want to pass a list of tf.Strings to the .map(_parse_function) function.
def _parse_function(self, img_path):
img_str = tf.read_file(img_path)
img_decode = tf.image.decode_jpeg(img_str, channels=3)
img_decode = tf.divide(tf.cast(img_decode , tf.float32),255)
return img_decode
When the tf.data.Dataset is of type TensorSliceDataset,
dataset_from_slices = tf.data.Dataset.from_tensor_slices((tensor_with_filenames))
I can simply do
dataset_from_slices.map(_parse_function), which works.
However, dataset_from_generator = tf.data.Dataset.from_generator(...) returns a Dataset which is an instance of FlatMapDataset type and dataset_from_generator.map(_parse_function) gives the following error:
InvalidArgumentError: Input filename tensor must be scalar, but had shape: [32]
If I change the first line to:
img_str = tf.read_file(img_path[0])
that also works but then I only get the first image, which is not what I am looking for. Any suggestions?
It sounds like the elements of your dataset_from_generator are batched. The simplest remedy is to use tf.contrib.data.unbatch() to convert them back into individual elements:
# Each element is a vector of strings.
dataset_from_generator = tf.data.Dataset.from_generator(...)
# Converts each vector of strings into multiple individual elements.
dataset = dataset_from_generator.apply(tf.contrib.data.unbatch())
dataset = dataset.map(_parse_function)

List and Numpy array in Python

I was trying to hot-encode data.
Data is list of vocabulary_size = 17005207.
To hot-encode, I made a list of inputs of num_labels = 100.
Following code:
inputs = []
for i in range(vocabulary_size):
inputs.append(np.arange(num_labels) == data[i]).astype(np.float32)
Throws me an Error:
AttributeError: 'NoneType' object has no attribute 'astype'
I tried dtype = np.float32 inside append function but again erroneous.
When I try this :
inputs = []
for i in range(vocabulary_size):
inputs.append(np.arange(num_labels) == data[i])
inputs = np.array(inputs,dtype=np.float32)
I get correct answer : Hot-Encoded Input Sequence of vocabulary_size x num_labels.
Any Alternative Solution Of One Line Without Using Numpy?
Solved :Can I be done directly using numpy array(input) with list(data)?
Info about data : data = np.ndarray(len(words), dtype=np.int32)
Reformat function:
def reformat(data):
num_labels = vocabulary_size
print (type(data))
data = (np.arange(num_labels) == data[:,None]).astype(np.int32)
return data
print (data,len(data))
return data
New Question : The dimension of data is (vocabulary_size,)...How to convert data using ravel or reshape into dimension of (1,vocabulary_size)?
Not sure whether I've understood correctly what you're asking for, but if what you want is a oneliner, you could transform you're already working code into this:
inputs = np.array([np.arange(num_labels) == data[i] for i in range(vocabulary_size)], dtype=np.float32)

Categories