Uber Ludwig: Issue Making Predictions - Python

I decided to mess with Uber Ludwig again. I wanted to make a simple demo using the Python API that learns to add 1 to the input number. I have successfully produced a model, but the issue arises when predicting. I am running the newest release from GitHub on Pop!_OS 19.10 with CPU TensorFlow.
Thank you for any help.
Edit: I have reproduced the issue on windows as well.
The error is as follows:
Traceback (most recent call last):
  File "predict.py", line 3, in <module>
    x = model.predict({"numberIn":[1]}, return_type='dict')
  File "/home/user/.local/lib/python3.7/site-packages/ludwig/api.py", line 914, in predict
    gpu_fraction=gpu_fraction,
  File "/home/user/.local/lib/python3.7/site-packages/ludwig/api.py", line 772, in _predict
    self.model_definition['preprocessing']
  File "/home/user/.local/lib/python3.7/site-packages/ludwig/data/preprocessing.py", line 159, in build_data
    preprocessing_parameters
  File "/home/user/.local/lib/python3.7/site-packages/ludwig/data/preprocessing.py", line 180, in handle_missing_values
    dataset_df[feature['name']] = dataset_df[feature['name']].fillna(
AttributeError: 'list' object has no attribute 'fillna'
Here is my prediction script:
from ludwig.api import LudwigModel
model = LudwigModel.load("/home/user/Documents/ludwig-test/plus1/results/api_experiment_run_0/model")
x = model.predict({"numberIn":[1]}, return_type='dict')
#x = model.predict({"numberIn":[1]}, return_type=<class 'dict'>) I tried this with no success
print(x)
Here are the contents of my training script:
mydata = {"numberIn": [], "value": []}
for x in range(10000):
    mydata["numberIn"].append(x)
    mydata["value"].append(x + 1)

from ludwig.api import LudwigModel
print("Imported Ludwig")

modelobject = LudwigModel(model_definition_file="modeldef.yaml")
stats = modelobject.train(data_dict=mydata)
modelobject.close()
modeldef.yaml
input_features:
    -
        name: numberIn
        type: numerical

output_features:
    -
        name: value
        type: numerical

Solution: The first argument of predict is not a positional data argument, so the dictionary needs to be passed via the data_dict keyword in this case:
x = model.predict(data_dict={"numberIn": [1]}, return_type='dict')
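Putting that together, a corrected version of the prediction script from the question (same model path and input as above):

from ludwig.api import LudwigModel

model = LudwigModel.load("/home/user/Documents/ludwig-test/plus1/results/api_experiment_run_0/model")
x = model.predict(data_dict={"numberIn": [1]}, return_type='dict')  # data_dict=, not positional
print(x)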

Related

Code error with variable storage from selected feature

I'm using QGIS 3.6 with the built-in Python text editor. I have found a snippet of code that I'm trying to make work, and I've modified it to the best of my abilities to fit my specific needs. I have a point layer called "Regulators" that contains a field called "Town". The idea is that when I select a single feature on the "Regulators" layer, the code will look at the "Town" field and select all other features that match that field's value. I select a feature and run this code:
layer = iface.activeLayer()
field_name = 'Town'
values = []
for feat in layer.selectedFeatures():
    tmp_value = feat[field_name]
    if tmp_value not in values:
        values.append(str(tmp_value))
strings = []
for val in values:
    if val != values[-1]:
        string = field_name + ' = ' + val + ' or '
        strings.append(string)
    else:
        last_string = field_name + ' = ' + val
        strings.append(last_string)
query = ''.join(strings)
request = QgsFeatureRequest().setFlags(QgsFeatureRequest.NoGeometry)
request.setSubsetOfAttributes([]).setFilterExpression(query)
selection = layer.getFeatures(request)
layer.setSelectedFeatures([k.id() for k in selection])
and I get this error:
Traceback (most recent call last):
  File "C:\PROGRA~1\QGIS3~1.6\apps\Python37\lib\code.py", line 90, in runcode
    exec(code, self.locals)
  File "<input>", line 1, in <module>
  File "<string>", line 24, in <module>
AttributeError: 'QgsVectorLayer' object has no attribute 'setSelectedFeatures'
I'm very much new to Python and see nothing wrong with line 1 or line 5. I have found other snippets that do what I'm attempting here, but they also return errors, so I'm wondering if some method or function has changed since this code was posted. The integrated console in QGIS is also much different from what I'm used to.
EDIT: I've updated the code and the error message based on feedback I've received on the post so far. I assume that QgsVectorLayer is the generic term for the vector layer being referenced, in this case the "Regulators" layer, but I don't understand why it reports setSelectedFeatures as a missing attribute.
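The asker's suspicion about API changes is the likely culprit: in the QGIS 3.x Python API the selection call on a vector layer is selectByIds rather than setSelectedFeatures. A minimal sketch of the last two lines under that assumption:

# setSelectedFeatures() belonged to the QGIS 2.x API; in QGIS 3.x the
# equivalent call is selectByIds(), which takes a list of feature ids.
selection = layer.getFeatures(request)
layer.selectByIds([k.id() for k in selection])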

Theano scan function and argument number lstm

I am new to neural networks and I am trying to modify this RNN-Classifier code to use an LSTM instead of the GRU_step.
I added one extra parameter, c_prev:
def lstm_step(x, h_prev, c_prev, W_xz, W_hz, W_xm, W_hm):
and after applying all the LSTM equations I return both h and c.
My hidden vector looks like this:
hidden_vector, _ = theano.scan(
    lstm_step,
    sequences=input_vectors,
    outputs_info=initial_hidden_vector,
    non_sequences=[W_xz, W_hz, W_xm, W_hm]
)
hidden_vector = hidden_vector[-1]
I get the exception below and don't understand why scan does not see c_prev as an existing parameter (or how/where I can feed it with some values so that it's not empty).
python rnnclassifier.py data/sentiment.train.txt data/sentiment.test.txt
Traceback (most recent call last):
  File "rnnclassifier.py", line 167, in <module>
    rnn_classifier = RnnClassifier(word2id_len, n_classes)
  File "rnnclassifier.py", line 110, in __init__
    non_sequences=[W_xz, W_hz, W_xm, W_hm]
  File "/usr/local/lib/python2.7/dist-packages/theano/scan_module/scan.py", line 773, in scan
    condition, outputs, updates = scan_utils.get_updates_and_outputs(fn(*args))
TypeError: lstm_step() takes exactly 7 arguments (6 given)
I am new to this topic and would appreciate any help or advice! Thank you.
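One thing worth checking: theano.scan hands the step function one recurrent argument per entry in outputs_info, so recurring on both h and c requires two initial states, and lstm_step must return the pair in the same order. A minimal sketch, assuming the cell state starts at zero (initial_cell_vector is a name introduced here, not from the original code):

# Assumed initial cell state, shaped like the initial hidden state.
initial_cell_vector = theano.tensor.zeros_like(initial_hidden_vector)

# With two entries in outputs_info, scan passes both h_prev and c_prev,
# giving lstm_step its full 7 arguments; it must return [h, c] in order.
[hidden_vector, cell_vector], _ = theano.scan(
    lstm_step,
    sequences=input_vectors,
    outputs_info=[initial_hidden_vector, initial_cell_vector],
    non_sequences=[W_xz, W_hz, W_xm, W_hm]
)
hidden_vector = hidden_vector[-1]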

scikit-learn: adapt bigram features to SVM

I have a problem. Here's my code:
http://colorscripter.com/s/9vc2ryj
And I made a mistake earlier: evaluate_classifier(bigram_word_feats) is what I want.
I'm trying to do text mining with an SVM.
The feature vectors come from a bigram model.
But I get this error:
Traceback (most recent call last):
  File "C:/Users/LG/Desktop/untitled1/TEST.py", line 184, in <module>
    evaluate_classifier(bigram_word_feats)
  File "C:/Users/LG/Desktop/untitled1/TEST.py", line 90, in evaluate_classifier
    classifier.train(trainfeats)
  File "C:\Users\LG\Anaconda3\lib\site-packages\nltk\classify\scikitlearn.py", line 115, in train
    X = self._vectorizer.fit_transform(X)
  File "C:\Users\LG\Anaconda3\lib\site-packages\sklearn\feature_extraction\dict_vectorizer.py", line 226, in fit_transform
    return self._transform(X, fitting=True)
  File "C:\Users\LG\Anaconda3\lib\site-packages\sklearn\feature_extraction\dict_vectorizer.py", line 190, in _transform
    feature_names.sort()
TypeError: unorderable types: tuple() < str()
Why does this happen and how can I solve it?
Also, what is the process of the NLTK classifier? Do I just give it my feature words, and then it generates the SVM model?
Oh, and I'm using Python 3. Do I need to use Python 2?
New answer:
I think the problem is that the feature dict mixes string keys (single words) and tuple keys (bigrams), and the vectorizer behind nltk's scikit-learn wrapper cannot sort such mixed keys. Can you try to replace the return statement from:
return dict([(ngram, True) for ngram in itertools.chain(words, bigrams)])
to the following:
# Join only the tuple keys; plain word strings stay unchanged.
return dict([('|'.join(ngram) if isinstance(ngram, tuple) else ngram, True)
             for ngram in itertools.chain(words, bigrams)])
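To see why the original keys fail: under Python 3, str and tuple values cannot be ordered against each other, which is exactly what DictVectorizer's feature_names.sort() attempts when the dict mixes word strings and bigram tuples. A standalone reproduction with illustrative values:

# Mixing a word string and a bigram tuple in one sorted() call fails on Python 3:
sorted(['good', ('very', 'good')])
# TypeError: unorderable types: tuple() < str()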
Old answer:
The `train` methods of scikit-learn predictors expect two inputs: features and targets. Something like the following (not tested):
negfeats = [featx(f) for f in word_split(negdata)]
posfeats = [featx(f) for f in word_split(posdata)]
...
trainlabels = [-1,] * negcutoff + [+1,] * poscutoff
classifier.train(trainfeats, trainlabels)
In defining trainlabels, I followed your style of using arithmetic operators on lists, but I wouldn't do that in my own code as it makes it less readable.

graph-tool - 'NestedBlockState' object has no attribute 'get_nonempty_B'

I am trying to replicate a section of code from the graph-tool cookbook to find the marginal probability of the number of groups in a graph when using hierarchical partitioning. However, I get an error telling me that 'NestedBlockState' object has no attribute 'get_nonempty_B', so presumably I have made a mistake somewhere. Does anybody know where I went wrong?
import graph_tool.all as gt
import cPickle as pickle
import numpy as np  # needed for np.zeros below

g = gt.load_graph('graph_no_multi_reac_type.gt')
gt.remove_parallel_edges(g)

state = gt.minimize_nested_blockmodel_dl(g, deg_corr=True)
state = state.copy(sampling=True)

with open('state_mcmc.pkl', 'wb') as state_pkl:
    pickle.dump(state, state_pkl, -1)

print 'equilibrating Markov chain'
gt.mcmc_equilibrate(state, wait=1000, mcmc_args=dict(niter=10))

h = np.zeros(g.num_vertices() + 1)

def collect_num_groups(s):
    B = s.get_nonempty_B()
    h[B] += 1

print 'collecting marginals'
gt.mcmc_equilibrate(state, force_niter=10000, mcmc_args=dict(niter=10),
                    callback=collect_num_groups)

with open('state_ncnc.pkl', 'wb') as state_pkl:
    pickle.dump(state, state_pkl, -1)
with open('hist.pkl', 'wb') as h_pkl:
    pickle.dump(h, h_pkl, -1)
The error I get looks as follows:
Traceback (most recent call last):
  File "num_groups_marg_prob.py", line 42, in <module>
    gt.mcmc_equilibrate(state, force_niter=10000, mcmc_args=dict(niter=10),
  File "/usr/lib/python2.7/dist-packages/graph_tool/inference/mcmc.py", line 172, in mcmc_equilibrate
    extra = callback(state)
  File "num_groups_marg_prob.py", line 35, in collect_num_groups
    def collect_num_groups(s):
AttributeError: 'NestedBlockState' object has no attribute 'get_nonempty_B'
Quoting from an answer on the graph-tool mailing list:
"The error message is clear. This attribute belongs to BlockState, not NestedBlockState. What you wish to do is:
s.levels[0].get_nonempty_B()"
http://main-discussion-list-for-the-graph-tool-project.982480.n3.nabble.com/self-state-couple-state-state-state-entropy-args-Python-argument-types-did-not-match-C-signature-td4026975.html
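Applying that to the script above, the callback would read (only the lookup changes; h stays the same histogram):

def collect_num_groups(s):
    # get_nonempty_B() belongs to BlockState; for a NestedBlockState,
    # take the lowest level of the hierarchy, as the mailing list suggests.
    B = s.levels[0].get_nonempty_B()
    h[B] += 1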

FFT Runtime Error in Running Galsim

I keep receiving the following error when running a script to save an animation:
RuntimeError: SB Error: fourierDraw() requires an FFT that is too large, 6144
If you can handle the large FFT, you may update gsparams.maximum_fft_size.
So I went into /Galsim/include/galsim/GSparams.h and changed maximum_fft_size(4096) to maximum_fft_size(16384), i.e. from 2^12 to 2^14. I still get the same error as before. Should I restart my machine or something?
That is not where to change the maximum_fft_size parameter. See demo7 for an example of how to use the GSParams object to update parameters. There is also an example in the docstring for GSObject:
>>> gal = galsim.Sersic(n=4, half_light_radius=4.3)
>>> psf = galsim.Moffat(beta=3, fwhm=2.85)
>>> conv = galsim.Convolve([gal,psf])
>>> im = galsim.Image(1000,1000, scale=0.05) # Note the very small pixel scale!
>>> im = conv.drawImage(image=im) # This uses the default GSParams.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "galsim/base.py", line 1236, in drawImage
    image.added_flux = prof.SBProfile.draw(imview.image, gain, wmult)
RuntimeError: SB Error: fourierDraw() requires an FFT that is too large, 6144
If you can handle the large FFT, you may update gsparams.maximum_fft_size.
>>> big_fft_params = galsim.GSParams(maximum_fft_size=10240)
>>> conv = galsim.Convolve([gal,psf],gsparams=big_fft_params)
>>> im = conv.drawImage(image=im) # Now it works (but is slow!)
>>> im.write('high_res_sersic.fits')
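Applied to the error in the question, a minimal sketch along the same lines (gal, psf, and im stand in for whatever profiles the animation script actually builds; 8192 is simply the next power of two above the reported 6144):

# Pass GSParams when constructing the object instead of editing the C++ header.
big_fft_params = galsim.GSParams(maximum_fft_size=8192)
conv = galsim.Convolve([gal, psf], gsparams=big_fft_params)
im = conv.drawImage(image=im)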
