Saving keras configuration with custom metric function to JSON - python

I am trying to save my configuration for a Keras model. I would like to be able to read the configuration from the file to be able to reproduce the training.
Before implementing a custom metric in a function I could just do it the way shown below without the mean_pred. Now I am running into the problem TypeError: Object of type 'function' is not JSON serializable.
Here I read that it is possible to get the function name as string by custom_metric_name = mean_pred.__name__. I would like to not only be able to save the name, but to be able to save a reference to the function if possible.
Perhaps I should as mentioned here also think about not just storing my configuration in the .py file but using ConfigObj. Unless this would solve my current problem I would implement this later.
Minimum working example of problem:
import keras.backend as K
import json
def mean_pred(y_true, y_pred):
return K.mean(y_pred)
config = {'epochs':500,
'loss':{'class':'categorical_crossentropy'},
'optimizer':'Adam',
'metrics':{'class':['accuracy', mean_pred]}
}
# Do the training etc...
config_filename = 'config.txt'
with open(config_filename, 'w') as f:
f.write(json.dumps(config))
Greatly appreciate help with this problem as well as other approaches to saving my configuration in the best way possible.

To solve my problem I saved the name of the function as a string in the config file and then extracted the function from a dictionary to use it as metrics in the model. One could additionally use: 'class':['accuracy', mean_pred.__name__] to save the name of the function as a string in the config.
This does also work for multiple custom functions and for more keys to metrics (eg. define metrics for 'reg' like 'class' when doing regression and classification).
import keras.backend as K
import json
from collections import defaultdict
def mean_pred(y_true, y_pred):
return K.mean(y_pred)
config = {'epochs':500,
'loss':{'class':'categorical_crossentropy'},
'optimizer':'Adam',
'metrics':{'class':['accuracy', 'mean_pred']}
}
custom_metrics= {'mean_pred':mean_pred}
metrics = defaultdict(list)
for metric_type, metric_functions in config['metrics'].items():
for function in metric_functions:
if function in custom_metrics.keys():
metrics[metric_type].append(custom_metrics[function])
else:
metrics[metric_type].append(function)
# Do the training, use metrics
config_filename = 'config.txt'
with open(config_filename, 'w') as f:
f.write(json.dumps(config))

Related

Is there way to embed non-tf functions to a tf.Keras model graph as SavedModel Signature?

I want to add preprocessing functions and methods to the model graph as a SavedModel signature.
example:
# suppose we have a keras model
# ...
# defining the function I want to add to the model graph
#tf.function
def process(model, img_path):
# do some preprocessing using different libs. and modules...
outputs = {"preds": model.predict(preprocessed_img)}
return outputs
# saving the model with a custom signature
tf.saved_model.save(new_model, dst_path,
signatures={"process": process})
or we can use tf.Module here. However, the problem is I can not embed custom functions into the saved model graph.
Is there any way to do that?
I think you slightly misunderstand the purpose of save_model method in Tensorflow.
As per the documentation the intent is to have a method which serialises the model's graph so that it can be loaded with load_model afterwards.
The model returned by load_model is a class of tf.Module with all it's methods and attributes. Instead you want to serialise the prediction pipeline.
To be honest, I'm not aware of a good way to do that, however what you can do is to use a different method for serialisation of your preprocessing parameters, for example pickle or a different one, provided by the framework you use and write a class on top of that, which would do the following:
class MyModel:
def __init__(self, model_path, preprocessing_path):
self.model = load_model(model_path)
self.preprocessing = load_preprocessing(preprocessing_path)
def predict(self, img_path):
return self.model.predict(self.preprocessing(img_path))

cast xgboost.Booster class to XGBRegressor or load XGBRegressor from xgboost.Booster

I get a model from Sagemaker of type:
<class 'xgboost.core.Booster'>
I can score this locally which is great but some google searches have shown that it may not be possible to do "standard" things like this taken from here:
plt.barh(boston.feature_names, xgb.feature_importances_)
Is it possible to tranform xgboost.core.Booster to XGBRegressor? Maybe one could use the save_raw method looking at this? Thanks!
So far I tried:
xgb_reg = xgb.XGBRegressor()
xgb_reg._Boster = model
xgb_reg.feature_importances_
but this reults in:
NotFittedError: need to call fit or load_model beforehand
Something along those lines appears to work fine:
local_model_path = "model.tar.gz"
with tarfile.open(local_model_path) as tar:
tar.extractall()
model = xgb.XGBRegressor()
model.load_model(model_file_name)
model can then be used as usual - model.tar.gz is an artifcat coming from sagemaker.

How to call Python function by name dynamically using a string?

I have a Python function call like so:
import torchvision
model = torchvision.models.resnet18(pretrained=configs.use_trained_models)
Which works fine.
If I attempt to make it dynamic:
import torchvision
model_name = 'resnet18'
model = torchvision.models[model_name](pretrained=configs.use_trained_models)
then it fails with:
TypeError: 'module' object is not subscriptable
Which makes sense since model is a module which exports a bunch of things, including the resnet functions:
# __init__.py for the "models" module
...
from .resnet import *
...
How can I call this function dynamically without knowing ahead of time its name (other than that I get a string with the function name)?
You can use the getattr function:
import torchvision
model_name = 'resnet18'
model = getattr(torchvision.models, model_name)(pretrained=configs.use_trained_models)
This essentially is along the lines the same as the dot notation just in function form accepting a string to retrieve the attribute/method.
The new APIs since Aug 2022 are as follows:
# returns list of available model names
torchvision.models.list_models()
# returns specified model with pretrained common weights
torchvision.models.get_model("alexnet", weights="DEFAULT")
# returns specified model with pretrained=False
torchvision.models.get_model("alexnet", weights=None)
# returns specified model with specified pretrained weights
torchvision.models.get_model("alexnet", weights=ResNet50_Weights.IMAGENET1K_V2)
Reference:
https://pytorch.org/blog/easily-list-and-initialize-models-with-new-apis-in-torchvision/

How to add new elements to the config object of Keras models?

I'm in trouble with adding new elements to the config object of Keras models.
We can get the config object with the following code:
config = model.get_config()
But, I can not find any methods like model.set_condig().
I'd like to set my own elements to it.
Anyone know how to do this ?
Try:
from keras.models import model_from_config
new_model = model_from_config(your_config)

Python statsmodels OLS: how to save learned model to file

I am trying to learn an ordinary least squares model using Python's statsmodels library, as described here.
sm.OLS.fit() returns the learned model. Is there a way to save it to the file and reload it? My training data is huge and it takes around half a minute to learn the model. So I was wondering if any save/load capability exists in OLS model.
I tried the repr() method on the model object but it does not return any useful information.
The models and results instances all have a save and load method, so you don't need to use the pickle module directly.
Edit to add an example:
import statsmodels.api as sm
data = sm.datasets.longley.load_pandas()
data.exog['constant'] = 1
results = sm.OLS(data.endog, data.exog).fit()
results.save("longley_results.pickle")
# we should probably add a generic load to the main namespace
from statsmodels.regression.linear_model import OLSResults
new_results = OLSResults.load("longley_results.pickle")
# or more generally
from statsmodels.iolib.smpickle import load_pickle
new_results = load_pickle("longley_results.pickle")
Edit 2 We've now added a load method to main statsmodels API in master, so you can just do
new_results = sm.load('longley_results.pickle')
I've installed the statsmodels library and found that you can save the values using the pickle module in python.
Models and results are pickleable via save/load, optionally saving the model data.
[source]
As an example:
Given that you have the results saved in the variable results:
To save the file:
import pickle
with open('learned_model.pkl','w') as f:
pickle.dump(results,f)
To read the file:
import pickle
with open('learned_model.pkl','r') as f:
model_results = pickle.load(f)

Categories