I'm trying to deploy a simple model on the Triton Inference Server. It loads fine, but I'm having trouble formatting the input to make a proper inference request.
My model's config.pbtxt is set up like this:
max_batch_size: 1
input: [
  {
    name: "examples"
    data_type: TYPE_STRING
    format: FORMAT_NONE
    dims: [ -1 ]
    is_shape_tensor: false
    allow_ragged_batch: false
    optional: false
  }
]
I've tried using fairly straightforward Python code to set up the input data, like this (the outputs are not shown but are set up correctly):
import numpy as np
import tritonclient.http as httpclient

bytes_data = [input_data.encode('utf-8')]
bytes_data = np.array(bytes_data, dtype=np.object_)
bytes_data = bytes_data.reshape([-1, 1])

inputs = [
    httpclient.InferInput('examples', bytes_data.shape, "BYTES"),
]
inputs[0].set_data_from_numpy(bytes_data)
But I keep getting the same error message
tritonclient.utils.InferenceServerException: Could not parse example input, value: '[my text input here]'
[[{{node ParseExample/ParseExampleV2}}]]
I've tried multiple ways of encoding the input, as raw bytes or even in the base64 format TFX Serving used to expect, like this: { "instances": [{"b64": "CjEKLwoJdXR0ZXJhbmNlEiIKIAoecmVuZGV6LXZvdXMgYXZlYyB1biBjb25zZWlsbGVy"}]}
I'm not exactly sure where the problem comes from; does anyone know?
In case anyone runs into the same problem, this is what solved it: I had to create a tf.train.Example() and set the data correctly.
example = tf.train.Example()
example_bytes = str.encode(input_data)
example.features.feature['utterance'].bytes_list.value.extend([example_bytes])
inputs = [
    httpclient.InferInput('examples', [1], "BYTES"),
]
inputs[0].set_data_from_numpy(np.asarray(example.SerializeToString()).reshape([1]), binary_data=False)
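For completeness, here is a minimal end-to-end sketch of sending the request with the HTTP client. The server URL, model name and output tensor name are placeholders for my setup, and the sample utterance is just the text from the b64 example above, so adjust those to your own model:

import numpy as np
import tensorflow as tf
import tritonclient.http as httpclient

# Placeholders: adjust the URL, model name and output name to your deployment.
client = httpclient.InferenceServerClient(url="localhost:8000")

input_data = "rendez-vous avec un conseiller"  # sample utterance

# Wrap the raw text in a tf.train.Example so ParseExampleV2 can decode it.
example = tf.train.Example()
example.features.feature['utterance'].bytes_list.value.extend(
    [input_data.encode('utf-8')])

inputs = [httpclient.InferInput('examples', [1], "BYTES")]
inputs[0].set_data_from_numpy(
    np.asarray(example.SerializeToString()).reshape([1]), binary_data=False)

result = client.infer(model_name="my_model", inputs=inputs)
print(result.as_numpy("output_name"))  # hypothetical output tensor name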
I am trying to fit a TensorFlow model, and one of my features comes in as a comma-separated string of ints (possibly an empty string). The feature appears in the pre-transform schema as
feature {
  name: "csstring"
  type: BYTES
  presence {
    min_fraction: 1.0
  }
  shape {
    dim {
      size: 1
    }
  }
}
and in the preprocessing_fn function it is processed via
splitted = tf.squeeze(tf.strings.split(inputs["csstring"], sep=","), axis=1)
filled = tf.where(splitted=='', 'nan', splitted)
casted = tf.strings.to_number(filled)
meaned = tf.reduce_mean(casted, axis=1)
outputs["csstring"] = meaned
I have managed to load the pre-transformed examples in a notebook and apply these transformation steps to get the processed feature as the average of each list (nan if the list is empty).
However when I run the pipeline as a whole on Kubeflow I am getting this error where the transform component fails:
ValueError: An error occured while trying to apply the transformation: "StringToNumberOp could not correctly convert string:
[[node transform/transform/StringToNumber_1 (defined at venv/lib/python3.8/site-packages/tensorflow_transform/saved/saved_transform_io.py:262) ]]
I can't see any particular string instance that would be problematic to cast, and would appreciate any ideas as to why the pipeline doesn't work.
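For reference, here is a minimal eager-mode sketch of that notebook check, with made-up sample strings; it mimics the transformation one value at a time so the exact string StringToNumber chokes on gets printed:

import tensorflow as tf

# Made-up sample values; in the notebook these come from the pre-transformed examples.
samples = ["1,2,3", "", "4,,5", "1,a,3"]

for s in samples:
    tokens = tf.strings.split([s], sep=",")[0]
    filled = tf.where(tokens == "", "nan", tokens)
    try:
        print(repr(s), "->", float(tf.reduce_mean(tf.strings.to_number(filled))))
    except tf.errors.InvalidArgumentError as err:
        print(repr(s), "fails:", err)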
I am reading a .h5 file with the h5py module. What I am trying to achieve here is to print all the groups and all the datasets within a group, without knowing the content structure of the file. I am using the visititems function to iterate over all the nodes of the file.
My code works fine at first until it gives an error:
TypeError: No NumPy equivalent for TypeBitfieldID exists
I am new to the h5py module, so can anyone tell me why this is happening? This code runs fine for the first iterations of the loop, but later some datasets/nodes of this file cause this error.
As far as I can understand, some items (datasets or groups) from this .h5 file are not being read correctly.
Link to the .h5 file I am using:
https://cernbox.cern.ch/index.php/s/wk7SN1qt2O7jbrl
This is my code:
import csv
import glob

import h5py

AWAKE_csv = open('AWAKE_csv.csv', mode='w')
AWAKE_writer = csv.writer(AWAKE_csv, delimiter=',')
AWAKE_writer.writerow(["GROUP", "DATASET", "SIZE", "SHAPE", "TYPE"])

def visitor_func(name, node):
    if isinstance(node, h5py.Dataset):
        print('Dataset: ' + name)
        out = node.dtype
        AWAKE_writer.writerow([' ', name, node.size, node.shape, out])
    else:
        print('Group: ' + name)
        # node is a group
        AWAKE_writer.writerow([name])

with h5py.File(glob.glob("*.h5")[0], 'r') as f:
    f.visititems(visitor_func)
The line in my code which throws this error is:
out = node.dtype
With this visit function, I can get information on all the datasets that raise this node.dtype error:
def foo1(name, node):
    # print(name)
    if isinstance(node, h5py.Dataset):
        try:
            node.dtype
        except TypeError as err:
            print(name)
            print(node.size, node.shape)
            print(err)
I get a couple of screens worth, with a typical display like:
0 (0,)
No NumPy equivalent for TypeBitfieldID exists
AwakeEventData/GD_BPM.AWAKE.TRIUMF/AcquisitionSPS/posOK
1 (1,)
No NumPy equivalent for TypeBitfieldID exists
AwakeEventData/GD_BPM.AWAKE.TRIUMF/GlobalAcquisition/posOK
So if your goal is just to visit everything, and display the information that you can, add a try/except like this to your visit function.
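For example, the original visitor could be guarded like this (same AWAKE_writer as above; the dtype column just records the failure for the bitfield datasets):

def visitor_func(name, node):
    if isinstance(node, h5py.Dataset):
        try:
            out = node.dtype
        except TypeError as err:
            # e.g. "No NumPy equivalent for TypeBitfieldID exists"
            out = 'unreadable dtype: %s' % err
        AWAKE_writer.writerow([' ', name, node.size, node.shape, out])
    else:
        # node is a group
        AWAKE_writer.writerow([name])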
The h5dump display for one of those datasets is:
2215:~/mypy$ h5dump -d /AwakeEventData/GD_BPM.AWAKE.TRIUMF/AcquisitionSPS/posOK ../Downloads/1541962108935000000_167_838.h5
HDF5 "../Downloads/1541962108935000000_167_838.h5" {
DATASET "/AwakeEventData/GD_BPM.AWAKE.TRIUMF/AcquisitionSPS/posOK" {
   DATATYPE  H5T_STD_B64LE
   DATASPACE  SIMPLE { ( 1 ) / ( H5S_UNLIMITED ) }
   DATA {
   (0): 80:17:00:00:00:00:00:00
   }
   ATTRIBUTE "bitFieldSize" {
      DATATYPE  H5T_STD_I64LE
      DATASPACE  SCALAR
      DATA {
      (0): 14
      }
   }
}
}
Adding print(list(node.attrs.values())) displays that bitFieldSize attribute.
There are other, non-python viewers. I don't know if pytables or pandas could read this file or not.
Yes, this file is an interesting curiosity. HDFView has no trouble opening or viewing the data (even the troublesome ones). I wrote a little pytables code to walk the group hierarchy and report the leaf names. It issues this warning for several datasets:
DataTypeWarning: Unsupported type for attribute 'exception' in node 'BinningSetting'. Offending HDF5 class: 8
When I look at these datasets in HDFView, they show
Name: exception
Type: 8-bit enum (FALSE=0, TRUE=1)
Unfortunately, I don't know enough about HDF5 or pytables to explain what's going on. It's interesting that some of these datasets are different from those mentioned by @hpaulj.
Here's my code (warning: it creates a mountain of output):
import tables as tb

h5f = tb.open_file('1541962108935000000_167_838.h5', mode='r')
for grp in h5f.walk_groups('/'):
    grp_leaves = grp._v_leaves
    if len(grp_leaves) > 0:
        print('Group: ', grp)
        for grp_leaf in grp_leaves:
            print('\tLeaf:', grp_leaf)
The first few offending groups are:
Group: /AwakeEventData/XUCL-SPECTRO/BinningSetting
Group: /AwakeEventData/XUCL-SPECTRO/CameraSettings
Group: /AwakeEventData/XMPP-STREAK/StreakImage
Group: /AwakeEventData/TT43.BPM.430308/Acquisition
Group: /AwakeEventData/TT41.BTV.412426/Image
Does that help?
I'm running NSC ray traces in Zemax, using the Python Zemax DDE module pyZDDE to run through multiple configurations. Ideally I'd like it to run through all model configurations and perform a small amount of analysis, so that I can leave the models processing overnight.
Part of this analysis involves using a filter string to get detector output for a couple of different wavelengths; however, when I try to pass my filter string (in this case 'W2'), I get the error "ValueError: could not convert string to float: W2".
The full error is:
File "C:\ProgramData\Anaconda2\lib\site-packages\pyzdde\zdde.py", line 9397, in zGetDetectorViewer
ret = _zfu.readDetectorViewerTextFile(pyz, textFileName, displayData)
File "C:\ProgramData\Anaconda2\lib\site-packages\pyzdde\zfileutils.py", line 679, in readDetectorViewerTextFile
posX = float(line_list[smoothLineNum + 2].split(':')[1].strip()) # 'Detector X'
ValueError: could not convert string to float: W2
So to me it looks like it's mistaking the filter string for the detector information, but I'm not sure how to fix it!
Solutions I've tried:
Checking the encoding: I'm using ASCII, but running it in UTF-8 hasn't changed the error.
Running a detector .CFG file generated by Zemax that gives the desired output when not run through pyZDDE.
Minimal working example:
import pyzdde.zdde as pyz
#get rid of any non closed connections possibly hanging around
pyz.closeLink()
#Connect to server
ln = pyz.createLink()
status = ln.zDDEInit()
ln.zSetTimeout(1e5)
filename = 'C:\\...\\Zemax\\Samples\\MultiConfigLens.zmx'
# Load a lens file into the ZEMAX DDE server
ln.zLoadFile(filename)
#Generate config files
configFile0 = 'C:\\...\Zemax\\Samples\\MultiConfigLens_Config1.CFG'
configFile1 = 'C:\\...\Zemax\\Samples\\MultiConfigLens_Config1.CFG'
configFile2 = 'C:\\...\Zemax\\Samples\\MultiConfigLens_Config2.CFG'
ln.zSetDetectorViewerSettings(settingsFile=configFile0, surfNum=1, detectNum=10, showAs=0, scale = 1,dType=4)
ln.zSetDetectorViewerSettings(settingsFile=configFile1, surfNum=1, detectNum=10, zrd='RAYS1.ZRD', dfilter='W1',showAs=0, scale=1, dType=4)
ln.zSetDetectorViewerSettings(settingsFile=configFile2, surfNum=1, detectNum=10, zrd='RAYS1.ZRD', dfilter='W2',showAs=0, scale=1, dType=4)
#perform the ray trace
ln.zNSCTrace(1,0,split = 1,scatter = 1,usePolar = 1,ignoreErrors = 1,randomSeed = 0,save = 1,saveFilename = 'RAYS1.ZRD',timeout = 1e5)
#grab that detector data
data0 = ln.zGetDetectorViewer(configFile0,displayData = True)
data1 = ln.zGetDetectorViewer(configFile1,displayData = True)
data2 = ln.zGetDetectorViewer(configFile2,displayData = True)
As soon as the code gets to "data1" it fails and returns the above error message. Any help would be super appreciated!
EDIT: I found the source of the problem, and I'll post the explanation I submitted as a bug report on the pyZDDE GitHub page:
Looking through readDetectorViewerTextFile in zfileutils (around line 678), it looks like the function doesn't account for the fact that the text file output by the detector changes when a ray database and a filter string are included in the detector config, meaning it attempts to read the filter string as the detector X position, causing the ValueError.
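Until that's fixed upstream, one possible workaround (just a sketch of the idea, not the actual pyZDDE patch) is to locate the 'Detector X' line by its label instead of by a fixed offset from the smoothing line:

def find_detector_x(line_list):
    # Scan the detector viewer text dump for the 'Detector X' entry rather than
    # assuming it sits a fixed number of lines after the smoothing setting, so
    # the extra ray-database / filter-string lines don't shift the parse.
    for line in line_list:
        if line.strip().startswith('Detector X'):
            return float(line.split(':')[1].strip())
    raise ValueError("'Detector X' line not found in detector viewer text file")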
So, I'm trying to learn and understand Doc2Vec.
I'm following this tutorial. My input is a list of documents, i.e. a list of lists of words. This is what my code looks like:
input = [["word1","word2",..."wordn"],["word1","word2",..."wordn"],...]
documents = TaggedLineDocument(input)
model = doc2vec.Doc2Vec(documents,size = 50, window = 10, min_count = 2, workers=2)
But I am getting an error (I tried googling it, but with no luck):
TypeError('don\'t know how to handle uri %s' % repr(uri))
Can somebody please help me understand where I am going wrong? Thank you!
TaggedLineDocument should be instantiated with a file path. Make sure the file is set up in the format of one document per line (see the sketch further down for one way to produce such a file from your list).
documents = TaggedLineDocument('myfile.txt')
documents = TaggedLineDocument('compressed_text.txt.gz')
From the source code:
The uri (the thing you are instantiating TaggedLineDocument with) can be either:
1. a URI for the local filesystem (compressed ``.gz`` or ``.bz2`` files handled automatically):
`./lines.txt`, `/home/joe/lines.txt.gz`, `file:///home/joe/lines.txt.bz2`
2. a URI for HDFS: `hdfs:///some/path/lines.txt`
3. a URI for Amazon's S3 (can also supply credentials inside the URI):
`s3://my_bucket/lines.txt`, `s3://my_aws_key_id:key_secret#my_bucket/lines.txt`
4. an instance of the boto.s3.key.Key class.
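So if you want to stick with TaggedLineDocument, one option (a sketch with a made-up file name) is to first write your list of token lists out as one space-separated document per line:

from gensim.models.doc2vec import Doc2Vec, TaggedLineDocument

docs = [["word1", "word2"], ["word3", "word4"]]  # your list of lists of words

# One document per line, tokens separated by spaces.
with open('myfile.txt', 'w') as f:
    for doc in docs:
        f.write(' '.join(doc) + '\n')

documents = TaggedLineDocument('myfile.txt')
model = Doc2Vec(documents, size=50, window=10, min_count=2, workers=2)  # 'size' became 'vector_size' in gensim 4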
For the data, I have the same formatted list as yours:
[['aw', 'wb', 'ce', 'uw', 'qqg'], ['g', 'e', 'ent', 'va'],['a']...]
For the labels, I have a list:
[1, 0, 0 ...]
It indicates the class of the sentences above; you can use any class (tag) here (not only 1 or 0).
Since we already have a list like the above, we can use TaggedDocument directly, instead of TaggedLineDocument:
from gensim.models.doc2vec import TaggedDocument

model = gensim.models.Doc2Vec(self.myDataFlow(data, labels))

def myDataFlow(self, data, labels):
    for i, j in zip(data, labels):
        yield TaggedDocument(i, [j])
I also posted this question in the GIS section of SO. As I'm not sure whether this is rather a 'pure' Python question, I'm asking it here as well.
I was wondering if anyone has experience in getting elevation data from a raster without using ArcGIS, but rather getting the information as a Python list or dict?
I get my XY data as a list of tuples.
I'd like to loop through the list or pass it to a function or class-method to get the corresponding elevation for the xy-pairs.
I did some research on the topic and the GDAL API sounds promising. Can anyone advise me on how to go about things, pitfalls, sample code? Other options?
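To make the goal concrete, this is roughly what I imagine with the GDAL bindings (made-up raster path and coordinates, untested), though I don't know whether it's the right way to go about it:

from osgeo import gdal

def elevations_at(points, raster_path):
    # Look up the raster value for each (x, y) tuple (coordinates must be in the raster's CRS).
    ds = gdal.Open(raster_path)
    origin_x, pixel_w, _, origin_y, _, pixel_h = ds.GetGeoTransform()
    data = ds.GetRasterBand(1).ReadAsArray()
    values = []
    for x, y in points:
        col = int((x - origin_x) / pixel_w)
        row = int((y - origin_y) / pixel_h)  # pixel_h is negative for north-up rasters
        values.append(float(data[row, col]))
    return values

print(elevations_at([(356021.5, 5643255.0)], 'dem.tif'))  # made-up example values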
Thanks for your efforts, LarsVegas
I recommend checking out the Google Elevation API
It's very straightforward to use:
http://maps.googleapis.com/maps/api/elevation/json?locations=39.7391536,-104.9847034&sensor=true_or_false
{
   "results" : [
      {
         "elevation" : 1608.637939453125,
         "location" : {
            "lat" : 39.73915360,
            "lng" : -104.98470340
         },
         "resolution" : 4.771975994110107
      }
   ],
   "status" : "OK"
}
Note that the free version is limited to 2,500 requests per day.
We used this code to get elevation for a given latitude/longitude (NOTE: we only asked to print the elevation, and the rounded lat and long values).
import urllib.request
import json

lati = input("Enter the latitude:")
lngi = input("Enter the longitude:")

# url_params completes the base url with the given latitude and longitude values
ELEVATION_BASE_URL = 'http://maps.googleapis.com/maps/api/elevation/json?'
URL_PARAMS = "locations=%s,%s&sensor=%s" % (lati, lngi, "false")
url = ELEVATION_BASE_URL + URL_PARAMS

with urllib.request.urlopen(url) as f:
    response = json.loads(f.read().decode())

status = response["status"]
result = response["results"][0]
print(float(result["elevation"]))
print(float(result["location"]["lat"]))
print(float(result["location"]["lng"]))
Have a look at altimeter, a wrapper for the Google Elevation API.
Here is another nice API that I've built: https://algorithmia.com/algorithms/Gaploid/Elevation
import Algorithmia

input = {
    "lat": "50.2111",
    "lon": "18.1233"
}

client = Algorithmia.client('YOUR_API_KEY')
algo = client.algo('Gaploid/Elevation/0.3.0')
print(algo.pipe(input))