I have a PyTorch multiclass text classifier built on an XLM-R based architecture. For IP reasons I can't share the architecture code, but I have tried to include as much detail as I can; please point out if more information is needed. The model outputs 28 classes, 'class0' to 'class27', with probability scores that sum to 1.
I am trying to use the shap package to explain the results. I have wrapped my model in a custom Hugging Face pipeline object, and I get the following output for one input text:
pipe = CustomPipeline(model = model, tokenizer = base_tokenizer)
output = pipe(list_of_inputs) # list_of_inputs = ['this is test input']
Output:
[[{"label": "class0","score": 0.01500235591083765},{"label": "class1","score": 0.001698049483820796},{"label": "class2","score": 0.0019644589629024267},{"label": "class3","score": 0.0004418794414959848},{"label": "class4","score": 5.9095666074426845e-05},{"label": "class5","score": 0.0007908751722425222},{"label": "class6","score": 0.002379569923505187},{"label": "class7","score": 0.0035733324475586414},{"label": "class8","score": 0.0014360857894644141},{"label": "class9","score": 0.0007365105557255447},{"label": "class10","score": 0.0014471099711954594},{"label": "class11","score": 0.0011013210751116276},{"label": "class12","score": 0.0010048456024378538},{"label": "class13","score": 0.000885132874827832},{"label": "class14","score": 0.0022015925496816635},{"label": "class15","score": 0.0013197452062740922},{"label": "class16","score": 0.0037292027845978737},{"label": "class17","score": 0.004212632775306702},{"label": "class18","score": 0.9481304287910461},{"label": "class19","score": 0.001469381619244814},{"label": "class20","score": 0.0009713817853480577},{"label": "class21","score": 0.0018773127812892199},{"label": "class22","score": 0.0009251375449821353},{"label": "class23","score": 0.0007248060428537428},{"label": "class24","score": 0.00031718137324787676},{"label": "class25","score": 0.0011144360760226846},{"label": "class26","score": 0.0002294857840752229},{"label": "class27","score": 0.00025681318948045373}]]
The output is in the same format as in the notebook provided by the shap package.
Now, when I try to use shap Explainer:
pipe = CustomPipeline(model = model, tokenizer = base_tokenizer)
explainer = shap.Explainer(pipe)
shap_values = explainer(list_of_inputs)
shap.plots.text(shap_values)
Error:
File "C:\pipeline.py", line 104, in udtm_xai
shap_values = explainer(query_data)
File "C:\Users\Miniconda3\envs\dtlr_udtm\lib\site-packages\shap\explainers\_partition.py", line 136, in __call__
return super().__call__(
File "C:\Users\Miniconda3\envs\dtlr_udtm\lib\site-packages\shap\explainers\_explainer.py", line 266, in __call__
row_result = self.explain_row(
File "C:\Users\Miniconda3\envs\dtlr_udtm\lib\site-packages\shap\explainers\_partition.py", line 161, in explain_row
self._curr_base_value = fm(m00.reshape(1, -1), zero_index=0)[0] # the zero index param tells the masked model what the baseline is
File "C:\Users\Miniconda3\envs\dtlr_udtm\lib\site-packages\shap\utils\_masked_model.py", line 67, in __call__
return self._full_masking_call(masks, batch_size=batch_size)
File "C:\Users\Miniconda3\envs\dtlr_udtm\lib\site-packages\shap\utils\_masked_model.py", line 144, in _full_masking_call
outputs = self.model(*joined_masked_inputs)
File "C:\Users\Miniconda3\envs\dtlr_udtm\lib\site-packages\shap\models\_transformers_pipeline.py", line 35, in __call__
output[i, self.label2id[obj["label"]]] = sp.special.logit(obj["score"]) if self.rescale_to_logits else obj["score"]
KeyError: 'class0'
The code is unable to find 'class0'. Using label mappings read from a file, the postprocess function of the pipeline class takes the softmax scores from the _forward function and builds a list of dictionaries in the final output format:
class CustomPipeline(Pipeline):
    def _sanitize_parameters(self, **kwargs):
        self.mapping_json = json.loads(open("mapping_file.json", "r", encoding = "utf-8").read().strip())
        return {}, {}, {}
    def preprocess(self, inputs):
        inputs_df = pd.DataFrame([inputs], columns = ["Query"])
        inference_dataloader = getInferenceDataloader(inputs_df, self.tokenizer, batch_size = 16)
        return inference_dataloader
    def softmax_with_temp(self, input, t):
        ex = torch.exp(input/t)
        sum = torch.sum(ex,0)
        return ex / sum
    def _forward(self, model_inputs):
        if torch.cuda.is_available():
            device = torch.device('cuda')
        else:
            device = torch.device('cpu')
        final_pred_labels = []
        final_scores = []
        batch_count = 0
        self.model.eval()
        for batch in model_inputs:
            batch_count += 1
            b_input_ids = batch[0].to(device)
            b_input_mask = batch[1].to(device)
            b_input_task = torch.full((b_input_ids.shape[0],), -1, dtype=torch.int32).to(device)
            with torch.no_grad():
                result = self.model((b_input_ids, b_input_mask, b_input_task))
            logits = result
            logits = logits.detach().cpu().numpy()
            pred_softmax_t = self.softmax_with_temp(torch.from_numpy(logits[0]), 2).numpy()
        return pred_softmax_t
    def postprocess(self, model_outputs):
        output_list = [{"label":"class0", "score":float(model_outputs[0])}]
        index = 1
        for label in self.mapping_json.keys(): # self.mapping_json contains label names that have been read from a file
            output_list.append({"label":label, "score":float(model_outputs[index])}) # model_outputs is a list of 28 floating-point scores
            index += 1
        return output_list
Am I missing a label-related variable that I need to define, and is that why I am getting the 'class0' KeyError?
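The last frame of the traceback indexes `self.label2id` with each label the pipeline returns, so the KeyError suggests that the wrapper's label map has no 'class0' entry. The sketch below reproduces that lookup in plain Python, without shap or transformers; `scores_to_matrix` and both label maps are hypothetical stand-ins. It assumes (not confirmed here) that shap fills `label2id` from `model.config.label2id`, in which case setting that mapping to the 28 class names before building the explainer may be the missing piece.

```python
# Hypothetical stand-in for the lookup shown in the traceback:
#   output[i, self.label2id[obj["label"]]] = obj["score"]

def scores_to_matrix(pipeline_output, label2id):
    """Place each score in the column given by label2id (KeyError if a label is unknown)."""
    matrix = [[0.0] * len(label2id) for _ in pipeline_output]
    for i, row in enumerate(pipeline_output):
        for obj in row:
            matrix[i][label2id[obj["label"]]] = obj["score"]
    return matrix

# Pipeline-style output for one input, shortened to three classes.
pipeline_output = [[
    {"label": "class0", "score": 0.2},
    {"label": "class1", "score": 0.7},
    {"label": "class2", "score": 0.1},
]]

# A label map whose names do not match the pipeline's labels (assumed default-style names).
mismatched = {"LABEL_0": 0, "LABEL_1": 1, "LABEL_2": 2}
try:
    scores_to_matrix(pipeline_output, mismatched)
    lookup_failed = False
except KeyError as err:
    lookup_failed = True
    missing_label = err.args[0]

# A label map built from the same names the pipeline emits.
matching = {f"class{i}": i for i in range(3)}
matrix = scores_to_matrix(pipeline_output, matching)
```

With the mismatched map the lookup fails on the first label; with the matching map every score lands in its own column.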
First, I'm not a developer by trade; my developer is unavailable for health reasons, but I have some experience in Python/spaCy development. I need some guidance with this problem:
I have an h5 Keras model that was written to translate a dead language from its Latin-alphabet written form into Arabic "letters" using neural machine translation.
Here's the code for creating and training the Keras model (dataset.txt contains tab-delimited sentence pairs, one in the Latin alphabet and the other in Arabic):
import string
import re
from unicodedata import normalize
from numpy import array
from pickle import load
from pickle import dump
from numpy.random import rand
from numpy.random import shuffle
from numpy import argmax
import tensorflow as tf
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.utils import to_categorical
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense
from keras.layers import Embedding
from keras.layers import RepeatVector
from keras.layers import TimeDistributed
from keras.callbacks import ModelCheckpoint
from keras.models import load_model
# load doc into memory
def load_doc(filename):
    # open the file as read only
    file = open(filename, mode='rt', encoding='utf-8')
    # read all text
    text = file.read()
    # close the file
    file.close()
    return text
# split a loaded document into sentences
def to_pairs(doc):
    lines = doc.strip().split('\n')
    pairs = [line.split('\t') for line in lines]
    return pairs
# clean a list of lines
def clean_pairs(lines):
    cleaned = list()
    for pair in lines:
        clean_pair = list()
        for line in pair:
            clean_pair.append(' '.join(line))
        cleaned.append(clean_pair)
    return array(cleaned)
# save a list of clean sentences to file
def save_clean_data(sentences, filename):
    dump(sentences, open(filename, 'wb'))
    print('Saved: %s' % filename)
# load dataset
filename = 'dataset.txt'
doc = load_doc(filename)
# split into latin-bok pairs
pairs = to_pairs(doc)
# clean sentences
clean_pairs = clean_pairs(pairs)
# save clean pairs to file
save_clean_data(clean_pairs, 'latin-bok.pkl')
# spot check
for i in range(100):
    print('[%s] => [%s]' % (clean_pairs[i,0], clean_pairs[i,1]))
# load a clean dataset
def load_clean_sentences(filename):
    return load(open(filename, 'rb'))
# save a list of clean sentences to file
def save_clean_data(sentences, filename):
    dump(sentences, open(filename, 'wb'))
    print('Saved: %s' % filename)
# load dataset
raw_dataset = load_clean_sentences('latin-bok.pkl')
# reduce dataset size
n_sentences = 10000
dataset = raw_dataset[:, :]
# random shuffle
shuffle(dataset)
# split into train/test
train, test = dataset[:70000], dataset[70000:]
# save
save_clean_data(dataset, 'latin-bok-both.pkl')
save_clean_data(train, 'latin-bok-train.pkl')
save_clean_data(test, 'latin-bok-test.pkl')
# load a clean dataset
def load_clean_sentences(filename):
    return load(open(filename, 'rb'))
# fit a tokenizer
def create_tokenizer(lines):
    tokenizer = Tokenizer()
    tokenizer.fit_on_texts(lines)
    return tokenizer
# max sentence length
def max_length(lines):
    return max(len(line.split()) for line in lines)
# encode and pad sequences
def encode_sequences(tokenizer, length, lines):
    # integer encode sequences
    X = tokenizer.texts_to_sequences(lines)
    # pad sequences with 0 values
    X = pad_sequences(X, maxlen=length, padding='post')
    return X
# one hot encode target sequence
def encode_output(sequences, vocab_size):
    ylist = list()
    for sequence in sequences:
        encoded = to_categorical(sequence, num_classes=vocab_size)
        ylist.append(encoded)
    y = array(ylist)
    y = y.reshape(sequences.shape[0], sequences.shape[1], vocab_size)
    return y
# define NMT model
def define_model(src_vocab, tar_vocab, src_timesteps, tar_timesteps, n_units):
    model = Sequential()
    model.add(Embedding(src_vocab, n_units, input_length=src_timesteps, mask_zero=True))
    model.add(LSTM(n_units))
    model.add(RepeatVector(tar_timesteps))
    model.add(LSTM(n_units, return_sequences=True))
    model.add(TimeDistributed(Dense(tar_vocab, activation='softmax')))
    return model
# load datasets
dataset = load_clean_sentences('latin-bok-both.pkl')
train = load_clean_sentences('latin-bok-train.pkl')
test = load_clean_sentences('latin-bok-test.pkl')
# prepare latin tokenizer
latin_tokenizer = create_tokenizer(dataset[:, 0])
latin_vocab_size = len(latin_tokenizer.word_index) + 1
latin_length = max_length(dataset[:, 0])
print('latin Vocabulary Size: %d' % latin_vocab_size)
print('latin Max Length: %d' % (latin_length))
# prepare bok tokenizer
bok_tokenizer = create_tokenizer(dataset[:, 1])
bok_vocab_size = len(bok_tokenizer.word_index) + 1
bok_length = max_length(dataset[:, 1])
print('bok Vocabulary Size: %d' % bok_vocab_size)
print('bok Max Length: %d' % (bok_length))
# prepare training data
trainX = encode_sequences(bok_tokenizer, bok_length, train[:, 1])
trainY = encode_sequences(latin_tokenizer, latin_length, train[:, 0])
trainY = encode_output(trainY, latin_vocab_size)
# prepare validation data
testX = encode_sequences(bok_tokenizer, bok_length, test[:, 1])
testY = encode_sequences(latin_tokenizer, latin_length, test[:, 0])
testY = encode_output(testY, latin_vocab_size)
# define model
model = define_model(bok_vocab_size, latin_vocab_size, bok_length, latin_length, 256)
model.compile(optimizer='adam', loss='categorical_crossentropy')
# summarize defined model
print(model.summary())
# fit model
filename = 'translator.h5'
checkpoint = ModelCheckpoint(filename, monitor='val_loss', verbose=1, save_best_only=True, mode='min')
model.fit(trainX, trainY, epochs=30, batch_size=64, validation_data=(testX, testY), callbacks=[checkpoint], verbose=2)
# load a clean dataset
def load_clean_sentences(filename):
    # binary mode does not accept an encoding argument
    return load(open(filename, 'rb'))
# fit a tokenizer
def create_tokenizer(lines):
    tokenizer = Tokenizer()
    tokenizer.fit_on_texts(lines)
    return tokenizer
# max sentence length
def max_length(lines):
    return max(len(line.split()) for line in lines)
# encode and pad sequences
def encode_sequences(tokenizer, length, lines):
    # integer encode sequences
    X = tokenizer.texts_to_sequences(lines)
    # pad sequences with 0 values
    X = pad_sequences(X, maxlen=length, padding='post')
    return X
# map an integer to a word
def word_for_id(integer, tokenizer):
    for word, index in tokenizer.word_index.items():
        if index == integer:
            return word
    return None
# generate target given source sequence
def predict_sequence(model, tokenizer, source):
    prediction = model.predict(source, verbose=0)[0]
    integers = [argmax(vector) for vector in prediction]
    target = list()
    for i in integers:
        word = word_for_id(i, tokenizer)
        if word is None:
            break
        target.append(word)
    return ' '.join(target)
# evaluate the skill of the model
def evaluate_model(model, tokenizer, sources, raw_dataset):
    actual, predicted = list(), list()
    for i, source in enumerate(sources):
        # translate encoded source text
        source = source.reshape((1, source.shape[0]))
        translation = predict_sequence(model, latin_tokenizer, source)
        raw_target, raw_src = raw_dataset[i]
        if i < 10:
            print('src=[%s], target=[%s], predicted=[%s]' % (raw_src, raw_target, translation))
        actual.append([raw_target.split()])
        predicted.append(translation.split())
    # calculate BLEU score
# load datasets
dataset = load_clean_sentences('latin-bok-both.pkl')
train = load_clean_sentences('latin-bok-train.pkl')
test = load_clean_sentences('latin-bok-test.pkl')
# prepare latin tokenizer
latin_tokenizer = create_tokenizer(dataset[:, 0])
latin_vocab_size = len(latin_tokenizer.word_index) + 1
latin_length = max_length(dataset[:, 0])
# prepare bok tokenizer
bok_tokenizer = create_tokenizer(dataset[:, 1])
bok_vocab_size = len(bok_tokenizer.word_index) + 1
bok_length = max_length(dataset[:, 1])
# prepare data
trainX = encode_sequences(bok_tokenizer, bok_length, train[:, 1])
testX = encode_sequences(bok_tokenizer, bok_length, test[:, 1])
# load model
model = load_model('translator2.h5')
# test on some training sequences
print('train')
evaluate_model(model, latin_tokenizer, trainX, train)
# test on some test sequences
print('test')
evaluate_model(model, latin_tokenizer, testX, test)
The model works fine and is accurate. Now I want to use it for the following task: take all the tokens of a doc, "translate" (or rather "predict") them, and store the translation in the doc object. As I can't do that myself, I hired a freelance developer, but I'm getting desperate because nothing works correctly. The developer wrote this code for me:
code.py
from typing import List
from thinc.api import TensorFlowWrapper
from keras.models import load_model
from thinc.api import Model
from spacy.tokens.doc import Doc
import spacy
from spacy.lookups import Lookups
import os
import json
from spacy.pipeline import TrainablePipe
from spacy.language import Language
from itertools import islice
from typing import Optional
from collections.abc import Callable
from collections.abc import Iterable, Iterator
from spacy.training import Example
@spacy.registry.architectures("LatinBok.v1")
def create_tensorflow_model() -> Model[List[Doc], List[str]]:
    model = load_model('translator.h5')
    wrapped_model = TensorFlowWrapper(model)
    return wrapped_model
@spacy.registry.misc("mon_lookups_loader")
def load_lookups(data_path):
    lookups = Lookups()
    # "lemma_lookup", "lemma_rules", "lemma_exc", "lemma_index"
    with open(os.path.join('lemma/mon_lookup_lemma.json'), 'r',encoding="utf-8") as lemma_lookup:
        lookups.add_table('lemma_lookup', json.load(lemma_lookup))
    return lookups
class TrainableComponent(TrainablePipe):
    def __init__(self, vocab, model, name="translator"):
        self.model = model
        self.vocab = vocab
        self.name = name
    def predict(self, docs):
        return self.model.predict(docs)
    def set_annotations(self, docs, scores):
        c = 0
        get_instances = self.model.attrs["get_instances"]
        for doc in docs:
            for (e1, e2) in get_instances(doc):
                offset = (e1.start, e2.start)
                if offset not in doc._.trans:
                    doc._.trans[offset] = {}
                for j,token in enumerate(doc):
                    doc._.trans[offset][token] = scores[c, j]
                c += 1
    def initialize(
        self,
        get_examples: Callable[[], Iterable[Example]],
        *,
        nlp: Language = None,
        labels: Optional[List[str]] = None,
    ):
        if labels is not None:
            for label in labels:
                self.add_label(label)
        else:
            for example in get_examples():
                relations = example.reference._.trans
                for indices, label_dict in relations.items():
                    for label in label_dict.keys():
                        self.add_label(label)
        subbatch = list(islice(get_examples(), 10))
        doc_sample = [eg.reference for eg in subbatch]
        label_sample = self._examples_to_truth(subbatch)
        self.model.initialize(X=doc_sample, Y=label_sample)
@Language.factory("translator")
def make_component(nlp, name, model):
    return TrainableComponent(nlp.vocab, model, name=name)
He also rewrote my config.cfg file as:
[paths]
train = null
dev = null
vectors = null
init_tok2vec = null
[system]
gpu_allocator = null
seed = 0
[nlp]
lang = "bok"
pipeline = ["tok2vec","tagger","morphologizer","lemmatizer","translator"]
batch_size = 1000
disabled = []
before_creation = null
after_creation = null
after_pipeline_creation = null
tokenizer = {"@tokenizers":"spacy.Tokenizer.v1"}
[components]
[components.lemmatizer]
factory = "lemmatizer"
mode = "lookup"
model = null
overwrite = false
scorer = {"@scorers":"spacy.lemmatizer_scorer.v1"}
[components.morphologizer]
factory = "morphologizer"
extend = false
overwrite = true
scorer = {"@scorers":"spacy.morphologizer_scorer.v1"}
[components.morphologizer.model]
@architectures = "spacy.Tagger.v1"
nO = null
[components.morphologizer.model.tok2vec]
@architectures = "spacy.Tok2VecListener.v1"
width = ${components.tok2vec.model.encode.width}
upstream = "*"
[components.tagger]
factory = "tagger"
neg_prefix = "!"
overwrite = false
scorer = {"@scorers":"spacy.tagger_scorer.v1"}
[components.tagger.model]
@architectures = "spacy.Tagger.v1"
nO = null
[components.tagger.model.tok2vec]
@architectures = "spacy.Tok2VecListener.v1"
width = ${components.tok2vec.model.encode.width}
upstream = "*"
[components.tok2vec]
factory = "tok2vec"
[components.tok2vec.model]
@architectures = "spacy.Tok2Vec.v2"
[components.tok2vec.model.embed]
@architectures = "spacy.MultiHashEmbed.v2"
width = ${components.tok2vec.model.encode.width}
attrs = ["NORM","PREFIX","SUFFIX","SHAPE"]
rows = [5000,2500,2500,2500]
include_static_vectors = true
[components.tok2vec.model.encode]
@architectures = "spacy.MaxoutWindowEncoder.v2"
width = 256
depth = 8
window_size = 1
maxout_pieces = 3
[components.translator]
factory = "translator"
[components.translator.model]
@architectures = "LatinBok.v1"
[corpora]
[corpora.dev]
@readers = "spacy.Corpus.v1"
path = ${paths.dev}
max_length = 0
gold_preproc = false
limit = 0
augmenter = null
[corpora.train]
@readers = "spacy.Corpus.v1"
path = ${paths.train}
max_length = 0
gold_preproc = false
limit = 0
augmenter = null
[training]
dev_corpus = "corpora.dev"
train_corpus = "corpora.train"
seed = ${system.seed}
gpu_allocator = ${system.gpu_allocator}
dropout = 0.1
accumulate_gradient = 1
patience = 1600
max_epochs = 0
max_steps = 20000
eval_frequency = 200
frozen_components = []
annotating_components = []
before_to_disk = null
[training.batcher]
@batchers = "spacy.batch_by_words.v1"
discard_oversize = false
tolerance = 0.2
get_length = null
[training.batcher.size]
@schedules = "compounding.v1"
start = 100
stop = 1000
compound = 1.001
t = 0.0
[training.logger]
@loggers = "spacy.ConsoleLogger.v1"
progress_bar = false
[training.optimizer]
@optimizers = "Adam.v1"
beta1 = 0.9
beta2 = 0.999
L2_is_weight_decay = true
L2 = 0.01
grad_clip = 1.0
use_averages = false
eps = 0.00000001
learn_rate = 0.001
[training.score_weights]
tag_acc = 0.33
pos_acc = 0.17
morph_acc = 0.17
morph_per_feat = null
lemma_acc = 0.33
[pretraining]
[initialize]
vectors = ${paths.vectors}
init_tok2vec = ${paths.init_tok2vec}
vocab_data = null
lookups = null
before_init = null
after_init = null
[initialize.components]
[initialize.tokenizer]
[initialize.components.lemmatizer]
[initialize.components.lemmatizer.lookups]
@misc = "bok_lookups_loader"
data_path = 'lemma'
[initialize.components.translator]
He told me to retrain the model using Boktrain.spacy and Bokdev.spacy (which were CoNLL-U files that I used to train the original model) and a JSON file containing the lemmas.
He told me to run this on the command line to compile everything:
python -m spacy train config.cfg --output Bok --paths.train Boktrain.spacy --paths.dev Bokdev.spacy --code code.py
I did that, but I got this error:
[2022-05-13 15:01:37,487] [INFO] Set up nlp object from config
[2022-05-13 15:01:37,753] [INFO] Pipeline: ['tok2vec', 'tagger', 'morphologizer', 'lemmatizer', 'translator']
[2022-05-13 15:01:45,619] [INFO] Created vocabulary
[2022-05-13 15:01:45,630] [INFO] Finished initializing nlp object
Traceback (most recent call last):
File "C:\Users\User\anaconda3\envs\BokModel\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Users\User\anaconda3\envs\BokModel\lib\runpy.py", line 86, in _run_code
exec(code, run_globals)
File "C:\Users\User\anaconda3\envs\BokModel\lib\site-packages\spacy\__main__.py", line 4, in <module>
setup_cli()
File "C:\Users\User\anaconda3\envs\BokModel\lib\site-packages\spacy\cli\_util.py", line 71, in setup_cli
command(prog_name=COMMAND)
File "C:\Users\User\anaconda3\envs\BokModel\lib\site-packages\click\core.py", line 1130, in __call__
return self.main(*args, **kwargs)
File "C:\Users\User\anaconda3\envs\BokModel\lib\site-packages\click\core.py", line 1055, in main
rv = self.invoke(ctx)
File "C:\Users\User\anaconda3\envs\BokModel\lib\site-packages\click\core.py", line 1657, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "C:\Users\User\anaconda3\envs\BokModel\lib\site-packages\click\core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "C:\Users\User\anaconda3\envs\BokModel\lib\site-packages\click\core.py", line 760, in invoke
return __callback(*args, **kwargs)
File "C:\Users\User\anaconda3\envs\BokModel\lib\site-packages\typer\main.py", line 500, in wrapper
return callback(**use_params) # type: ignore
File "C:\Users\User\anaconda3\envs\BokModel\lib\site-packages\spacy\cli\train.py", line 45, in train_cli
train(config_path, output_path, use_gpu=use_gpu, overrides=overrides)
File "C:\Users\User\anaconda3\envs\BokModel\lib\site-packages\spacy\cli\train.py", line 72, in train
nlp = init_nlp(config, use_gpu=use_gpu)
File "C:\Users\User\anaconda3\envs\BokModel\lib\site-packages\spacy\training\initialize.py", line 84, in init_nlp
nlp.initialize(lambda: train_corpus(nlp), sgd=optimizer)
File "C:\Users\User\anaconda3\envs\BokModel\lib\site-packages\spacy\language.py", line 1309, in initialize
proc.initialize(get_examples, nlp=self, **p_settings)
File "C:\Users\User\Downloads\BokModel\LatinToArabic\test.py", line 65, in initialize
relations = example.reference._.trans
File "C:\Users\User\anaconda3\envs\BokModel\lib\site-packages\spacy\tokens\underscore.py", line 47, in __getattr__
raise AttributeError(Errors.E046.format(name=name))
AttributeError: [E046] Can't retrieve unregistered extension attribute 'trans'. Did you forget to call the `set_extension` method?
I'm desperately trying just to get a spaCy doc object with the translations in it, so that I can use it in my spaCy model pipeline or through a custom spaCy extension. I have a basic understanding of spaCy and Python, but I don't understand anything here. I tried running each section of the code separately, and it seems that only the lemma part is correct; everything else seems to be wrong. If anyone has a solution to this, however simple it may seem to you, I'm interested.
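Your error just indicates that that particular extension attribute isn't registered. You can register it with the line below; put it at the top of your code.py, after the imports. set_extension needs exactly one of default, getter or method, and since the component uses doc._.trans as a dictionary, an empty dict is a reasonable default:
Doc.set_extension("trans", default={})
I don't know whether you have other problems, but that will fix your current error.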
If you don't understand something, I would recommend you search the spaCy docs for related topics - there's a lot of documented examples.
I am a beginner in machine learning and am trying to train a model to count how many numbers in a 1D vector of length 10 are below 0.5. The input vectors contain numbers between 0 and 1. I generate the input data and the labels in my script instead of keeping them in a separate file, because the data is so simple.
This is the code:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
class MyNet(nn.Module):
    def __init__(self):
        super(MyNet, self).__init__()
        self.lin1 = nn.Linear(10,10)
        self.lin2 = nn.Linear(10,1)
    def forward(self,x):
        x = self.lin1(x)
        x = F.relu(x)
        x = self.lin2(x)
        return x
net = MyNet()
net.to(device)
def train():
    criterion = nn.MSELoss()
    optimizer = optim.SGD(net.parameters(), lr=0.1)
    for epochs in range(100):
        target = 0
        data = torch.rand(10)
        for entry in data:
            if entry < 0.5:
                target += 1
        # print(target)
        # print(data)
        data = data.to(device)
        out = net(data)
        # print(out)
        target = torch.Tensor(target)
        target = target.to(device)
        loss = criterion(out, target)
        print(loss)
        net.zero_grad()
        loss.backward()
        optimizer.step()
def test():
    acc_error = 0
    for i in range(100):
        test_data = torch.rand(10)
        test_data.to(device)
        test_target = 0
        for entry in test_data:
            if entry < 0.5:
                test_target += 1
        out = net(test_data)
        error = test_target - out
        if error < 0:
            error *= -1
        acc_error += error
    overall_error = acc_error / 100
    print(overall_error)
train()
test()
This is the error:
Traceback (most recent call last):
File "test1.py", line 70, in <module>
test()
File "test1.py", line 59, in test
out = net(test_data)
File "/vol/fob-vol7/mi18/radtklau/SP/sem_project/lib64/python3.6/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "test1.py", line 15, in forward
x = self.lin1(x)
File "/vol/fob-vol7/mi18/radtklau/SP/sem_project/lib64/python3.6/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/vol/fob-vol7/mi18/radtklau/SP/sem_project/lib64/python3.6/site-packages/torch/nn/modules/linear.py", line 94, in forward
return F.linear(input, self.weight, self.bias)
File "/vol/fob-vol7/mi18/radtklau/SP/sem_project/lib64/python3.6/site-packages/torch/nn/functional.py", line 1753, in linear
return torch._C._nn.linear(input, weight, bias)
RuntimeError: Tensor for 'out' is on CPU, Tensor for argument #1 'self' is on CPU, but expected them to be on GPU (while checking arguments for addmm)
The other posts regarding the topic have not solved my problem. Maybe somebody can help. Thanks!
Notice how your error message traces back to test, while train works fine.
You've transferred your data to the device correctly in train:
data = data.to(device)
But not in test:
test_data.to(device)
torch.Tensor.to does not move the tensor in place; it returns a new tensor on the target device, so the result has to be assigned back to test_data:
test_data = test_data.to(device)
I am trying to run the following code. It runs fine on Google Colab, but on my system it throws an error. The TensorFlow version installed on my system is 1.12.0 and the Keras version is 2.2.4. Help is highly appreciated.
import time
import numpy as np
import tensorflow as tf
def profiler(layer, test_input):
    data_input = test_input
    start = time.time()
    data_input = layer.predict(data_input)
    end = time.time() - start
    milliseconds = end * 1000
    return milliseconds
def dense_layer(input_dim, dense_size):
    x = tf.keras.layers.Input((input_dim))
    dense = tf.keras.layers.Dense(dense_size)(x)
    model = tf.keras.models.Model(inputs=x, outputs=dense)
    return model
def process_config(config):
    tokens = config.split(",")
    values = []
    for token in tokens:
        token = token.strip()
        if token.find("-") == -1:
            token = int(token)
            values.append(token)
        else:
            start,end = token.split("-")
            start = int(start.strip())
            end = int(end.strip())
            values = values + list(range(start,end+1))
    return values
def evaluate_dense(input_shapes_range, dense_size_range):
    for input_shape in input_shapes_range:
        for dense_size in dense_size_range:
            to_write = open("dense_data.csv", "a+")
            model = dense_layer(input_shape, dense_size)
            random_input = np.random.randn(1, input_shape)
            running_time = profiler(model, random_input)
            del model
input_size = "2000"
dense_size = "1000, 4096"
input_size_range = process_config(input_size)
dense_size_range = process_config(dense_size)
evaluate_dense(input_size_range, dense_size_range)
Error trace
File "C:/Users/Dense-layer.py", line 59, in <module>
evaluate_dense(input_size_range, dense_size_range)
File "C:/Users/Dense-layer.py", line 44, in evaluate_dense
model = dense_layer(input_shape, dense_size)
File "C:/Users/Dense-layer.py", line 16, in dense_layer
x = tf.keras.layers.Input((input_dim))
File "C:\Users\learn\miniconda3\envs\tensorflow\lib\site-packages\tensorflow\python\keras\engine\input_layer.py", line 229, in Input
input_tensor=tensor)
File "C:\Users\learn\miniconda3\envs\tensorflow\lib\site-packages\tensorflow\python\keras\engine\input_layer.py", line 91, in __init__
batch_input_shape = (batch_size,) + tuple(input_shape)
TypeError: 'int' object is not iterable
The first positional argument of tf.keras.layers.Input is shape, which must be a tuple (or another iterable). (input_dim) is just an integer in parentheses, not a tuple, so tuple(input_shape) fails. Add a trailing comma to make it a one-element tuple:
tf.keras.layers.Input((input_dim,))
Or pass it by name:
tf.keras.layers.Input(shape=(input_dim,))
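The difference between the failing and the working call comes down to Python syntax: parentheses alone do not create a tuple, the trailing comma does. A minimal pure-Python illustration, where `input_dim = 2000` is simply the value used in the question:

```python
input_dim = 2000

not_a_tuple = (input_dim)    # parentheses only group; this is still an int
one_tuple = (input_dim,)     # the trailing comma makes a 1-element tuple

# Mirrors the failing line in the traceback:
#   batch_input_shape = (batch_size,) + tuple(input_shape)
try:
    tuple(not_a_tuple)
    error_message = None
except TypeError as err:
    error_message = str(err)

# The same expression succeeds once the shape really is a tuple.
batch_input_shape = (None,) + tuple(one_tuple)
```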
Environment: Win10, Python 3.6.8, Tensorflow-gpu 1.13.1
I have a model based on tf.estimator.DNNRegressor which I've trained locally. Now I'm having an issue getting the predict_input_fn function to work with real-time data. I have it working by reading a CSV file with a single entry but would like to pass in a numpy array or some other form of data from memory and have the model make a prediction.
Here's a code snippet:
tf.enable_eager_execution()
model = tf.estimator.DNNRegressor(
    hidden_units=[128,128],
    feature_columns=feature_columns,
    model_dir=model_dir)
def predict_input_fn():
    test_array = np.array([-0.057,-0.0569]) # shortened, normally 56 floats
    defaults = [tf.float32] * 56
    filenames = ['./prediction.csv']
    dataset_old = tf.data.experimental.CsvDataset( # this works
        filenames=filenames,
        record_defaults=defaults)
    dataset = tf.data.Dataset.from_tensor_slices(test_array) # this doesn't :(
    dataset = dataset.batch(1)
    features = tf.data.experimental.get_single_element(dataset)
    return dict(zip(feature_names, features)), None
predictor = model.predict(input_fn=predict_input_fn)
print(next(predictor))
I'm getting the following error even though eager execution is enabled:
TypeError: Tensor objects are only iterable when eager execution is enabled. To iterate over this tensor use tf.map_fn.
The data format I'm feeding in for training is 56 x float32 variables as features and 1 x float32 variable as a label. As mentioned it works perfectly when using the tf.data.experimental.CsvDataset() function but for the life of me I can't figure out how to feed in a basic array.
I've tried putting the array in a container array, using the standard make_one_shot_iterator() function on the dataset, using .constant() to try and build a tensor before passing it to tensor_slices. Have gone through the docs and every example I can find but still can't get it working.
Struggling a bit, any help would be greatly appreciated.
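One direction that might avoid iterating over a tensor, sketched here in plain Python: build the `{feature_name: column}` dictionary from the in-memory rows before any TensorFlow op is involved. The assumption (not verified against this model) is that such a dictionary could then be passed to `tf.data.Dataset.from_tensor_slices` and batched, so that `zip` never sees a `Tensor`. `rows_to_feature_dict` is a hypothetical helper, and the rows are shortened to 3 values instead of 56.

```python
def rows_to_feature_dict(rows, feature_names):
    """Turn a list of rows into {feature_name: [value per row]}."""
    for row in rows:
        if len(row) != len(feature_names):
            raise ValueError("each row needs one value per feature name")
    return {
        name: [float(row[i]) for row in rows]
        for i, name in enumerate(feature_names)
    }

feature_names = ["jkey" + str(i) for i in range(3)]   # 56 in the real model
rows = [
    [-0.057, -0.0569, 0.0378],
    [0.0, 0.0755, -0.0189],
]
features = rows_to_feature_dict(rows, feature_names)
```

Inside `predict_input_fn`, the idea would be to return something like `tf.data.Dataset.from_tensor_slices(features).batch(1)` rather than zipping over a tensor; whether the estimator accepts that unchanged is an assumption to check.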
Full code below:
import tensorflow as tf
# import tensorflowjs as tfjs
import json
import numpy as np
tf.enable_eager_execution()
model_dir = './dnn3/'
feature_columns = []
feature_names = []
feature_spec = {}
for i in range(56):
    feature_columns.append(tf.feature_column.numeric_column(key="jkey"+str(i)))
    feature_names.append("jkey"+str(i))
    feature_spec["jkey"+str(i)] = tf.FixedLenFeature([1],tf.float32)
def serving_input_receiver_fn():
    serialized_tf_example = tf.placeholder(dtype=tf.string,shape=[None],name='input_tensors')
    receiver_tensors = {'inputs': serialized_tf_example}
    features = tf.parse_example(serialized_tf_example, feature_spec)
    return tf.estimator.export.ServingInputReceiver(features, receiver_tensors)
def train_input_fn():
    defaults = [tf.float32] * 57
    filenames = ['./training.csv'] # can add to this array
    dataset = tf.data.experimental.CsvDataset(
        filenames=filenames,
        record_defaults=defaults)
    dataset = dataset.shuffle(1000000)
    dataset = dataset.repeat(1000) # epochs
    dataset = dataset.batch(128)
    iter = dataset.make_one_shot_iterator()
    next = iter.get_next()
    features = next[:-1]
    label = next[-1]
    return dict(zip(feature_names, features)), label
def eval_input_fn():
    defaults = [tf.float32] * 57
    filenames = ['./evaluation.csv']
    dataset = tf.data.experimental.CsvDataset(
        filenames=filenames,
        record_defaults=defaults)
    dataset = dataset.shuffle(10000)
    dataset = dataset.repeat(100)
    dataset = dataset.batch(100)
    iter = dataset.make_one_shot_iterator()
    next = iter.get_next()
    features = next[:-1]
    label = next[-1]
    return dict(zip(feature_names, features)), label
def predict_input_fn():
    test_array = [[-0.057,-0.0569,-0.0569,-0.0759,-0.0568,-0.0379,-0.0379,-0.0567,-0.0946,0.0378,0.0566,0.0755,0,0,0,0,-0.0189,0,0,0,0,0.0379,0.0378,0.0568,0,0.0378,0,0.0758,0.0379,0.0379,0.057,0.0758,0.0379,0.0381,-0.152,0.2089,-0.0952,0.0568,-0.2485,.4667,-0.0803,-0.0775,-0.0832,-0.054,-0.0989,0.0063,0.0037,-0.0342,0.0007,0.0281,0.0187,-0.0065,0.0423,0.078,-0.0285,-0.0093],[-0.057,-0.0569,-0.0569,-0.0759,-0.0568,-0.0379,-0.0379,-0.0567,-0.0946,0.0378,0.0566,0.0755,0,0,0,0,-0.0189,0,0,0,0,0.0379,0.0378,0.0568,0,0.0378,0,0.0758,0.0379,0.0379,0.057,0.0758,0.0379,0.0381,-0.152,0.2089,-0.0952,0.0568,-0.2485,.4667,-0.0803,-0.0775,-0.0832,-0.054,-0.0989,0.0063,0.0037,-0.0342,0.0007,0.0281,0.0187,-0.0065,0.0423,0.078,-0.0285,-0.0093]]
    defaults = [tf.float32] * 56
    filenames = ['./prediction.csv']
    dataset2 = tf.data.experimental.CsvDataset(
        filenames=filenames,
        record_defaults=defaults)
    test_tensor = tf.constant(test_array)
    #test_dict = dict(zip(feature_names, test_array))
    dataset = tf.data.Dataset.from_tensor_slices(test_tensor)
    dataset = dataset.batch(1)
    features = tf.data.experimental.get_single_element(dataset)
    return dict(zip(feature_names, features)), None
model = tf.estimator.DNNRegressor(
    hidden_units=[128,128],
    feature_columns=feature_columns,
    model_dir=model_dir)
def train():
    for i in range(5):
        model.train(input_fn=train_input_fn, steps=10000)
        eval_result = model.evaluate(input_fn=eval_input_fn)
        print(eval_result)
def train_eval():
    train_spec = tf.estimator.TrainSpec(train_input_fn,max_steps=20000)
    eval_spec = tf.estimator.EvalSpec(eval_input_fn)
    tf.estimator.train_and_evaluate(model,train_spec,eval_spec)
def save():
    save_dir = model.export_savedmodel(export_dir_base=model_dir+'saved', serving_input_receiver_fn=serving_input_receiver_fn)
    print('Saved to ',save_dir)
def predict():
    predictor = model.predict(input_fn=predict_input_fn)
    print(next(predictor))
predict()
# tfjs.converters.save_keras_model(model, './tfjs/')
And full error response:
WARNING:tensorflow:From C:\Users\jimba\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\framework\op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
Traceback (most recent call last):
File "train.py", line 93, in <module>
predict()
File "train.py", line 91, in predict
print(next(predictor))
File "C:\Users\jimba\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 609, in predict
input_fn, model_fn_lib.ModeKeys.PREDICT)
File "C:\Users\jimba\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 967, in _get_features_from_input_fn
result = self._call_input_fn(input_fn, mode)
File "C:\Users\jimba\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 1079, in _call_input_fn
return input_fn(**kwargs)
File "train.py", line 67, in predict_input_fn
return dict(zip(feature_names, features)), None
File "C:\Users\jimba\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\framework\ops.py", line 442, in __iter__
"Tensor objects are only iterable when eager execution is "
TypeError: Tensor objects are only iterable when eager execution is enabled. To iterate over this tensor use tf.map_fn
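A likely explanation, offered as a sketch rather than a tested fix: the Estimator calls input_fn while it is building its own graph, where eager execution is not active, so features is a single symbolic Tensor and zip(feature_names, features) tries to iterate over it. The CsvDataset version works because get_next() returns an ordinary Python tuple of 57 tensors. One way around this is to split the array into per-feature columns with NumPy and have the input function return the dataset itself. The helper name rows_to_columns and the rows argument are illustrative and not part of the original code.

```python
import numpy as np

# Same naming scheme as the feature columns in the question.
feature_names = ["jkey" + str(i) for i in range(56)]


def rows_to_columns(rows, names):
    """Turn an (n_rows, n_features) array into {name: column of shape (n_rows,)}."""
    data = np.asarray(rows, dtype=np.float32)
    if data.ndim != 2 or data.shape[1] != len(names):
        raise ValueError(
            "expected shape (n_rows, %d), got %s" % (len(names), data.shape))
    # data.T has one row per feature. It is a NumPy array, so iterating over
    # it does not depend on TensorFlow's execution mode.
    return dict(zip(names, data.T))


def predict_input_fn(rows):
    # Imported here so that rows_to_columns can be used without TensorFlow.
    import tensorflow as tf

    columns = rows_to_columns(rows, feature_names)
    # from_tensor_slices slices every dict entry along its first axis, so each
    # dataset element is a {name: scalar} dict describing one row.
    dataset = tf.data.Dataset.from_tensor_slices(columns)
    # Return the dataset itself; the Estimator creates the iterator.
    return dataset.batch(1)
```

With test_array defined at module level, this could be called as model.predict(input_fn=lambda: predict_input_fn(test_array)). In TensorFlow 1.x, tf.estimator.inputs.numpy_input_fn(x=columns, shuffle=False) is another option that accepts the same kind of dict.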
So I am trying to implement a reinforcement learning algorithm using Evolution Strategy.
The principle is to clone the original model N times (say 100), add some noise to each clone, run them, check which ones give the best results, and use that to update the original model.
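As a rough illustration of that principle, here is a minimal NumPy sketch on a toy problem. It assumes one common update rule (a reward-weighted average of the noise); the description above does not say exactly how the best clones are used, so this is one possible reading. The names es_step, toy_reward and target are made up for the example, and the "model" is just a parameter vector.

```python
import numpy as np


def es_step(theta, reward_fn, rng, n_clones=100, sigma=0.1, lr=0.1):
    """One ES update: perturb the parameters, evaluate, follow the better clones."""
    # One Gaussian noise vector per clone.
    noise = rng.standard_normal((n_clones, theta.size))
    # Each clone is the original parameters plus scaled noise.
    rewards = np.array([reward_fn(theta + sigma * eps) for eps in noise])
    # Subtract the mean reward so that better-than-average clones pull theta
    # toward their noise and worse-than-average clones push it away.
    centred = rewards - rewards.mean()
    step = noise.T @ centred / (n_clones * sigma)
    return theta + lr * step


# Toy stand-in for "run the clone in the environment and read its reward".
target = np.array([3.0, -2.0, 1.0])


def toy_reward(params):
    return -np.sum((params - target) ** 2)


rng = np.random.default_rng(0)
theta = np.zeros(3)
for _ in range(100):
    theta = es_step(theta, toy_reward, rng)
```

The reward evaluations inside es_step are independent of each other, which is what makes the per-clone threads described next possible.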
Now I am trying to put each of these clones in a different thread and run them all in parallel.
Here is my Worker class:
class WorkerThread(Thread):
    def __init__(self, action_dim, img_dim, sigma, sess):
        Thread.__init__(self)
        #sess = tf.Session()
        self.actor = ActorNetwork(sess, action_dim, img_dim)
        self.env = Environment()
        self.reward = 0
        self.N = {}
        self.original_model = None
        self.sigma = sigma
    def setActorModel(self, model):
        self.original_model = model
    def run(self):
        k = 0
        for l in self.actor.model.layers:
            if len(np.array(l.get_weights())) > 0:
                # First generate some noise
                shape = (np.array(l.get_weights()[0])).shape
                if len(shape) == 2:
                    self.N[k] = np.random.randn(shape[0], shape[1])
                else:
                    self.N[k] = np.random.randn(shape[0], shape[1], shape[2], shape[3])
                # 2nd set weights using original model's weights and noise
                la = self.original_model.layers[k]
                self.actor.model.layers[k].set_weights((la.get_weights()[0] + self.sigma * self.N[k], la.get_weights()[1]))
            k += 1
        ob = self.env.reset()
        while True:
            action = self.actor.predict(np.reshape(ob['image'], (1, 480, 480, 3)))
            ob = self.env.step(action[0])
            if ob['done']:
                self.reward = ob['reward']
                break
So each worker thread has its own model, and when running I set the weights using the original model's weights.
At that point I get the following error:
File "/usr/local/lib/python3.6/site-packages/keras/engine/topology.py", line 1219, in set_weights
K.batch_set_value(weight_value_tuples)
File "/usr/local/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2365, in batch_set_value
assign_op = x.assign(assign_placeholder)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/ops/variables.py", line 594, in assign
return state_ops.assign(self._variable, value, use_locking=use_locking)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/ops/state_ops.py", line 276, in assign
validate_shape=validate_shape)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/ops/gen_state_ops.py", line 59, in assign
use_locking=use_locking, name=name)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 350, in _apply_op_helper
g = ops._get_graph_from_inputs(_Flatten(keywords.values()))
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 5055, in _get_graph_from_inputs
_assert_same_graph(original_graph_element, graph_element)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 4991, in _assert_same_graph
original_item))
ValueError: Tensor("Placeholder:0", shape=(5, 5, 3, 24), dtype=float32) must be from the same graph as Tensor("conv2d_11/kernel:0", shape=(5, 5, 3, 24), dtype=float32_ref).
In the above code sample I use the same TensorFlow session in all the threads. I tried creating a different session for each, but I get the same error.
I have little knowledge of TensorFlow. Does anyone know how to fix this?
You need to use the same graph in all threads. Create a tf.Graph() in your main thread and wrap the body of your per-thread function in "with my_graph.as_default():", so that ops created inside the thread are added to that graph.
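A minimal sketch of that pattern, assuming a hypothetical GraphWorker class rather than the WorkerThread from the question: the graph is created once, handed to every thread, and re-entered inside run(), because a "with graph.as_default():" block only applies to the thread that entered it. The demo function and its lock are illustrative.

```python
import threading


class GraphWorker(threading.Thread):
    """Runs `work` with `graph` installed as the default graph of this thread."""

    def __init__(self, graph, work):
        super().__init__()
        self.graph = graph
        self.work = work
        self.result = None

    def run(self):
        # Entering the context here, inside the thread, is the important part.
        with self.graph.as_default():
            self.result = self.work()


def demo():
    # Imported here so that GraphWorker itself has no TensorFlow dependency.
    import tensorflow as tf

    graph = tf.Graph()
    build_lock = threading.Lock()

    def work():
        # Serialise op creation across threads with a lock.
        with build_lock:
            y = tf.constant(2.0)
        # The new op belongs to the shared graph, not to a per-thread default.
        return y.graph is graph

    workers = [GraphWorker(graph, work) for _ in range(4)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return all(w.result for w in workers)
```

Calling demo() should return True when every thread added its op to the shared graph. The lock is there because the tf.Graph documentation describes graph construction as not thread-safe. For the question's code, the equivalent change would be to build each ActorNetwork under the same graph and wrap the set_weights calls in run() in the same context.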