How to train a network on my own data in NiftyNet - python

I'm trying to train a network using NiftyNet with my own data (CT images and their corresponding labels). I wrote the Net class by closely following another training example with similar sample data and all the NiftyNet documentation I could find, and adjusted the parameters to my own data. But I keep getting this error:
"TypeError: __init__() got an unexpected keyword argument 'w_initializer'".
I've tried every change I could think of in my config.ini, Net class, etc., but I can't make it work or find the reason. Can anyone help with this error? Or maybe share some guidelines for training my own network from the beginning, so I can at least start over from scratch and see if I find a way out?
Training command:
! net_segment train -c /home/niftynet/extensions/dense_vnet_TC/config.ini --name dense_vnet_TC.net_TC.MyNet
Some values in config.ini:
[NETWORK]
name = dense_vnet
batch_size = 6
volume_padding_size = 0
window_sampling = resize
[TRAINING]
sample_per_volume = 1
lr = 0.001
loss_type = dense_vnet_TC.dice_hinge.dice
starting_iter = 0
save_every_n = 1000
max_iter = 3001
[INFERENCE]
border = (0, 0, 0)
inference_iter = 3000
output_interp_order = 0
spatial_window_size = (512, 512, 40)
save_seg_dir = ./segmentation_output/
############################ Custom configuration
[SEGMENTATION]
image = ct
label = label
label_normalisation = False
output_prob = False
num_classes = 2
Basics of Net class:
from niftynet.layer.convolution import ConvolutionalLayer
from niftynet.network.base_net import BaseNet

class MyNet(BaseNet):
    def __init__(self, num_classes, name='MyNet'):
        super(MyNet, self).__init__(num_classes=num_classes, acti_func=acti_func, name=name)
        # network specific property
        self.hidden_features = 10

    def layer_op(self, images, is_training):
        # create layer instances
        conv_1 = ConvolutionalLayer(self.hidden_features, kernel_size=3, name='conv_input')
        conv_2 = ConvolutionalLayer(self.num_classes, kernel_size=1, acti_func=None, name='conv_output')
        # apply layer instances
        flow = conv_1(images, is_training)
        flow = conv_2(flow, is_training)
        return flow
End of the output, after it did some of the processing as expected:
Traceback (most recent call last):
  File "/home/niftynet/bin/net_segment", line 10, in <module>
    sys.exit(main())
  File "/home/niftynet/lib/python3.6/site-packages/niftynet/__init__.py", line 142, in main
    app_driver.run(app_driver.app)
  File "/home/niftynet/lib/python3.6/site-packages/niftynet/engine/application_driver.py", line 189, in run
    is_training_action=self.is_training_action)
  File "/home/niftynet/lib/python3.6/site-packages/niftynet/engine/application_driver.py", line 258, in create_graph
    application.initialise_network()
  File "/home/niftynet/lib/python3.6/site-packages/niftynet/application/segmentation_application.py", line 280, in initialise_network
    acti_func=self.net_param.activation_function)
TypeError: __init__() got an unexpected keyword argument 'w_initializer'

I think you need to change this line (based on a similar problem I had):
super(MyNet, self).__init__(num_classes=num_classes, acti_func=acti_func, name=name)
to this (just add w_regularizer):
super(MyNet, self).__init__(num_classes=num_classes, w_regularizer=w_regularizer, acti_func=acti_func, name=name)
If that is not enough, also add it to the constructor signature:
def __init__(self, num_classes, w_regularizer=None, name='MyNet'):
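More generally, the traceback shows the segmentation application constructing the network with several keyword arguments (w_initializer, w_regularizer, b_initializer, b_regularizer, acti_func), so a constructor that accepts all of them and forwards them to BaseNet should be safest. A rough sketch (parameter names and defaults assumed from NiftyNet's BaseNet; double-check against your installed version):

from niftynet.network.base_net import BaseNet

class MyNet(BaseNet):
    def __init__(self,
                 num_classes,
                 w_initializer=None,
                 w_regularizer=None,
                 b_initializer=None,
                 b_regularizer=None,
                 acti_func='prelu',
                 name='MyNet'):
        # forward everything the application passes on to BaseNet
        super(MyNet, self).__init__(
            num_classes=num_classes,
            w_initializer=w_initializer,
            w_regularizer=w_regularizer,
            b_initializer=b_initializer,
            b_regularizer=b_regularizer,
            acti_func=acti_func,
            name=name)
        # network specific property
        self.hidden_features = 10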
I hope it helps.

Related

Why does shap Explainer give KeyError: 'class0'?

I have a PyTorch multiclass text classifier built on an XLMR-based architecture. For IP reasons I can't share the architecture code, but I have tried to include as much detail as I can; please point out if more information is needed. The model outputs 28 classes, 'class0' through 'class27', with probability scores that sum to 1.
I am trying to use the shap package to explain the results. I have wrapped my model in a Hugging Face custom pipeline object, and I get the following output for one input text:
pipe = CustomPipeline(model = model, tokenizer = base_tokenizer)
output = pipe(list_of_inputs) # list_of_inputs = ['this is test input']
Output:
[[{"label": "class0","score": 0.01500235591083765},{"label": "class1","score": 0.001698049483820796},{"label": "class2","score": 0.0019644589629024267},{"label": "class3","score": 0.0004418794414959848},{"label": "class4","score": 5.9095666074426845e-05},{"label": "class5","score": 0.0007908751722425222},{"label": "class6","score": 0.002379569923505187},{"label": "class7","score": 0.0035733324475586414},{"label": "class8","score": 0.0014360857894644141},{"label": "class9","score": 0.0007365105557255447},{"label": "class10","score": 0.0014471099711954594},{"label": "class11","score": 0.0011013210751116276},{"label": "class12","score": 0.0010048456024378538},{"label": "class13","score": 0.000885132874827832},{"label": "class14","score": 0.0022015925496816635},{"label": "class15","score": 0.0013197452062740922},{"label": "class16","score": 0.0037292027845978737},{"label": "class17","score": 0.004212632775306702},{"label": "class18","score": 0.9481304287910461},{"label": "class19","score": 0.001469381619244814},{"label": "class20","score": 0.0009713817853480577},{"label": "class21","score": 0.0018773127812892199},{"label": "class22","score": 0.0009251375449821353},{"label": "class23","score": 0.0007248060428537428},{"label": "class24","score": 0.00031718137324787676},{"label": "class25","score": 0.0011144360760226846},{"label": "class26","score": 0.0002294857840752229},{"label": "class27","score": 0.00025681318948045373}]]
The output is in the same format as in the example notebook from the shap package.
Now, when I try to use shap Explainer:
pipe = UDTMPipeline(model = model, tokenizer = base_tokenizer)
explainer = shap.Explainer(pipe)
shap_values = explainer(list_of_inputs)
shap.plots.text(shap_values)
Error:
File "C:\pipeline.py", line 104, in udtm_xai
shap_values = explainer(query_data)
File "C:\Users\Miniconda3\envs\dtlr_udtm\lib\site-packages\shap\explainers\_partition.py", line 136, in __call__
return super().__call__(
File "C:\Users\Miniconda3\envs\dtlr_udtm\lib\site-packages\shap\explainers\_explainer.py", line 266, in __call__
row_result = self.explain_row(
File "C:\Users\Miniconda3\envs\dtlr_udtm\lib\site-packages\shap\explainers\_partition.py", line 161, in explain_row
self._curr_base_value = fm(m00.reshape(1, -1), zero_index=0)[0] # the zero index param tells the masked model what the baseline is
File "C:\Users\Miniconda3\envs\dtlr_udtm\lib\site-packages\shap\utils\_masked_model.py", line 67, in __call__
return self._full_masking_call(masks, batch_size=batch_size)
File "C:\Users\Miniconda3\envs\dtlr_udtm\lib\site-packages\shap\utils\_masked_model.py", line 144, in _full_masking_call
outputs = self.model(*joined_masked_inputs)
File "C:\Users\Miniconda3\envs\dtlr_udtm\lib\site-packages\shap\models\_transformers_pipeline.py", line 35, in __call__
output[i, self.label2id[obj["label"]]] = sp.special.logit(obj["score"]) if self.rescale_to_logits else obj["score"]
KeyError: 'class0'
The code is unable to find 'class0'. In the postprocess function of the pipeline class, I read a file containing the label mappings, obtain the softmax scores from the _forward function, and create a dictionary in the final format to send as output:
class CustomPipeline(Pipeline):
    def _sanitize_parameters(self, **kwargs):
        self.mapping_json = json.loads(open("mapping_file.json", "r", encoding = "utf-8").read().strip())
        return {}, {}, {}

    def preprocess(self, inputs):
        inputs_df = pd.DataFrame([inputs], columns = ["Query"])
        inference_dataloader = getInferenceDataloader(inputs_df, self.tokenizer, batch_size = 16)
        return inference_dataloader

    def softmax_with_temp(self, input, t):
        ex = torch.exp(input/t)
        sum = torch.sum(ex, 0)
        return ex / sum

    def _forward(self, model_inputs):
        if torch.cuda.is_available():
            device = torch.device('cuda')
        else:
            device = torch.device('cpu')
        final_pred_labels = []
        final_scores = []
        batch_count = 0
        self.model.eval()
        for batch in model_inputs:
            batch_count += 1
            b_input_ids = batch[0].to(device)
            b_input_mask = batch[1].to(device)
            b_input_task = torch.full((b_input_ids.shape[0],), -1, dtype=torch.int32).to(device)
            with torch.no_grad():
                result = self.model((b_input_ids, b_input_mask, b_input_task))
            logits = result
            logits = logits.detach().cpu().numpy()
            pred_softmax_t = self.softmax_with_temp(torch.from_numpy(logits[0]), 2).numpy()
        return pred_softmax_t

    def postprocess(self, model_outputs):
        output_list = [{"label": "class0", "score": float(model_outputs[0])}]
        index = 1
        for label in self.mapping_json.keys():  # self.mapping_json contains label names read from a file
            output_list.append({"label": label, "score": float(model_outputs[index])})  # model_outputs is a list of 28 floating-point scores
            index += 1
        return output_list
Am I missing the definition of some label-based variable, and is that why I am getting the 'class0' KeyError?
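The traceback shows shap's _transformers_pipeline.py doing output[i, self.label2id[obj["label"]]], so the labels the pipeline returns ('class0' ... 'class27') have to exist in the label2id/id2label mapping that shap reads from the wrapped model's config, not just in the pipeline output. A hedged sketch of aligning the two, assuming the model carries a standard transformers config object (the attribute names and the 28-class mapping below are assumptions, not taken from the question):

# Assumed fix sketch: make the model config's label maps match the labels
# emitted by CustomPipeline.postprocess ("class0" ... "class27").
id2label = {i: "class{}".format(i) for i in range(28)}
model.config.id2label = id2label
model.config.label2id = {label: i for i, label in id2label.items()}

explainer = shap.Explainer(pipe)
shap_values = explainer(list_of_inputs)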

'Sequential' object has no attribute 'in_features' and 'fc'

I am making a model using the fine-tuning method and the model is VGG-16, but I got the error 'Sequential' object has no attribute 'in_features'. I was using classifier, so I changed classifier to fc, but then got 'VGG' object has no attribute 'fc'. Can somebody guide me on what I am doing wrong? I have attached the error traces as well.
**ERROR:'Sequential' object has no attribute 'in_features'**
Traceback (most recent call last):
File "ct_pretrained.py", line 186, in <module>
model = build_model().cuda()
File "ct_pretrained.py", line 42, in build_model
return models.VGG(is_emr=is_emr)
File "/data/torch/models/vgg.py", line 19, in __init__
num_ftrs = self.axial_model.classifier.in_features
File "/root/miniconda/lib/python3.8/site-packages/torch/nn/modules/module.py",line 778, in __getattr__
raise ModuleAttributeError("'{}' object has no attribute '{}'".format(
torch.nn.modules.module.ModuleAttributeError: 'Sequential' object has no attribute 'in_features'
**ERROR:'VGG' object has no attribute 'fc'**
Traceback (most recent call last):
File "ct_pretrained.py", line 186, in <module>
model = build_model().cuda()
File "ct_pretrained.py", line 42, in build_model
return models.VGG(is_emr=is_emr)
File "/data/torch/models/vgg.py", line 19, in __init__
num_ftrs = self.axial_model.fc.in_features
File "/root/miniconda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 778, in __getattr__
raise ModuleAttributeError("'{}' object has no attribute '{}'".format(
torch.nn.modules.module.ModuleAttributeError: 'VGG' object has no attribute 'fc'
import torch
import torch.nn as nn
from torchvision import models

__all__ = ['VGG']

class VGG(nn.Module):
    def __init__(self, is_emr=False, mode='sum'):
        super().__init__()
        self.is_emr = is_emr
        self.mode = mode

        in_dim = 45

        self.axial_model = models.vgg16(pretrained=True)
        out_channels = self.axial_model.features[0].out_channels
        self.axial_model.features[0] = nn.Conv2d(1, out_channels, kernel_size=7, stride=1, padding=0, bias=False)
        self.axial_model.features[3] = nn.MaxPool2d(1)
        num_ftrs = self.axial_model.classifier.in_features  # error in this line of code
        self.axial_model.classifier = nn.Linear(num_ftrs, 15)

        self.sa_co_model = models.vgg16(pretrained=True)
        self.sa_co_model.features[0] = nn.Conv2d(1, out_channels, kernel_size=7, stride=1, padding=(3,0), bias=False)
        self.sa_co_model.features[3] = nn.MaxPool2d(1)
        self.sa_co_model.classifier = nn.Linear(num_ftrs, 15)

        if self.is_emr:
            self.emr_model = EMRModel()
            if self.mode == 'concat': in_dim = 90

        self.classifier = Classifier(in_dim)

    def forward(self, axial, sagittal, coronal, emr):
        axial = axial[:,:,:-3,:-3]
        sagittal = sagittal[:,:,:,:-3]
        coronal = coronal[:,:,:,:-3]

        axial_feature = self.axial_model(axial)
        sagittal_feature = self.sa_co_model(sagittal)
        coronal_feature = self.sa_co_model(coronal)
        out = torch.cat([axial_feature, sagittal_feature, coronal_feature], dim=1)

        if self.is_emr:
            emr_feature = self.emr_model(emr)
            if self.mode == 'concat':
                out = torch.cat([out, emr_feature], dim=1)
            elif self.mode == 'sum':
                out += emr_feature

        out = self.classifier(out)
        return out
The classifier Sequential object does not have an attribute called in_features. If you want to determine it dynamically, you need to access a layer inside the classifier rather than the entire classifier: num_ftrs = self.axial_model.classifier[0].in_features. This accesses the first layer of the Sequential object, namely the one that determines how many features go into the whole classifier.
However, you can also replace the classifier layer by determining the necessary number of features by hand. Looking at the PyTorch source code for VGG-16, you can see the classifier takes 512 * 7 * 7 features as input.
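For reference, here is a minimal sketch of both options (only the relevant lines; EMRModel/Classifier and the rest of the asker's __init__ are left out):

import torch.nn as nn
from torchvision import models

axial_model = models.vgg16(pretrained=True)

# Option 1: read in_features from the first Linear layer inside the
# classifier Sequential instead of the Sequential itself.
num_ftrs = axial_model.classifier[0].in_features   # 25088 = 512 * 7 * 7
axial_model.classifier = nn.Linear(num_ftrs, 15)

# Option 2: hard-code the value from the VGG-16 definition.
# axial_model.classifier = nn.Linear(512 * 7 * 7, 15)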

Issue receiving 'int' object is not iterable error in Keras

I am trying to run the following code. It runs fine on Google Colab, but on my system it throws an error. The TensorFlow version installed on my system is 1.12.0 and the Keras version is 2.2.4. Help is highly appreciated.
import time

import numpy as np
import tensorflow as tf

def profiler(layer, test_input):
    data_input = test_input
    start = time.time()
    data_input = layer.predict(data_input)
    end = time.time() - start
    milliseconds = end * 1000
    return milliseconds

def dense_layer(input_dim, dense_size):
    x = tf.keras.layers.Input((input_dim))
    dense = tf.keras.layers.Dense(dense_size)(x)
    model = tf.keras.models.Model(inputs=x, outputs=dense)
    return model

def process_config(config):
    tokens = config.split(",")
    values = []
    for token in tokens:
        token = token.strip()
        if token.find("-") == -1:
            token = int(token)
            values.append(token)
        else:
            start, end = token.split("-")
            start = int(start.strip())
            end = int(end.strip())
            values = values + list(range(start, end + 1))
    return values

def evaluate_dense(input_shapes_range, dense_size_range):
    for input_shape in input_shapes_range:
        for dense_size in dense_size_range:
            to_write = open("dense_data.csv", "a+")
            model = dense_layer(input_shape, dense_size)
            random_input = np.random.randn(1, input_shape)
            running_time = profiler(model, random_input)
            del model

input_size = "2000"
dense_size = "1000, 4096"
input_size_range = process_config(input_size)
dense_size_range = process_config(dense_size)
evaluate_dense(input_size_range, dense_size_range)
Error trace
File "C:/Users/Dense-layer.py", line 59, in <module>
evaluate_dense(input_size_range, dense_size_range)
File "C:/Users/Dense-layer.py", line 44, in evaluate_dense
model = dense_layer(input_shape, dense_size)
File "C:/Users/Dense-layer.py", line 16, in dense_layer
x = tf.keras.layers.Input((input_dim))
File "C:\Users\learn\miniconda3\envs\tensorflow\lib\site-packages\tensorflow\python\keras\engine\input_layer.py", line 229, in Input
input_tensor=tensor)
File "C:\Users\learn\miniconda3\envs\tensorflow\lib\site-packages\tensorflow\python\keras\engine\input_layer.py", line 91, in __init__
batch_input_shape = (batch_size,) + tuple(input_shape)
TypeError: 'int' object is not iterable
input_shape should be a tuple, but input_dim is an integer. You have passed input_dim positionally, and since you have not specified it by name, it is treated as input_shape. So, just specify it by name:
tf.keras.layers.Input(input_dim=input_dim)
Or if you want to specify the shape, use it like:
tf.keras.layers.Input((input_dim,))
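Applied to the dense_layer function above, that would look like the sketch below (note that (input_dim) without a trailing comma is just an int, while (input_dim,) is a one-element tuple):

import tensorflow as tf

def dense_layer(input_dim, dense_size):
    # (input_dim,) is a one-element tuple, so Keras receives a proper shape
    x = tf.keras.layers.Input((input_dim,))
    dense = tf.keras.layers.Dense(dense_size)(x)
    model = tf.keras.models.Model(inputs=x, outputs=dense)
    return model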

Torch network load is not processed properly

I am trying to build a network that takes 3x64x64 images in a PyTorch environment, and it seems that I succeeded in training my network and saving it. The network looks like:
class LC_small(nn.Module):
    def __init__(self, c_in, c_out=256):
        super(LC_small, self).__init__()
        self.conv1 = conv(c_in, 64, k=3, stride=1, pad=1)
        self.conv2 = conv(64, 128, k=3, stride=2, pad=1)
        self.conv3 = conv(128, 128, k=3, stride=1, pad=1)
        self.conv4 = conv(128, 128, k=3, stride=2, pad=1)
        self.conv5 = conv(128, 128, k=3, stride=1, pad=1)
        self.conv6 = conv(128, 256, k=3, stride=2, pad=1)
        self.conv7 = conv(256, 256, k=3, stride=1, pad=1)  # int(h/8 x w/8 x 256)
        self.flat = dense(int(w_rsz/8)*int(h_rsz/8)*256, 256)
        self.dense1 = dense(256, 128, False)
        self.dense2 = dense(128, 3, False)

    def forward(self, input):
        out = self.conv1(input)
        out = self.conv2(out)
        out = self.conv3(out)
        out = self.conv4(out)
        out = self.conv5(out)
        out = self.conv6(out)
        out = self.conv7(out)
        out = out.view(out.size(0), -1)
        out = self.flat(out)
        out = self.dense1(out)
        out = self.dense2(out)
        # print(out.shape)
        normal = torch.nn.functional.normalize(out, 2, 1)
        return normal
And I saved my model while training:
for epoch in range(10):
    # continue  # assuming training is already done
    total_loss = 0
    route_param = open(route_diffuse + '/netparam.txt', 'w')
    for param in lcnet.state_dict():
        route_param.write(str(param) + '\t' + str(lcnet.state_dict()[param].size()) + '\n')
    for i, data in enumerate(load_LC, 0):
        input, gtval = data[0].to(dev), data[1].to(dev)
        opt.zero_grad()
        output = lcnet(input)
        loss = crit(output, gtval)
        loss.backward()
        opt.step()
        total_loss += loss.item()
        if i % 10 == 9:
            print(epoch, i, total_loss/10)
            torch.save(lcnet, route_save)
            total_loss = 0
However, when I try to load the network I made, I see the following error message:
Traceback (most recent call last):
File "E:/DLPrj/venv/torch_practice.py", line 324, in <module>
ipl,npl = getseqi_np(sq_t,lcnet) # data : 8 x 6 x w x h
File "E:/DLPrj/venv/torch_practice.py", line 133, in getseqi_np
l1 = net_lc(torch.from_numpy(i1r))
File "E:\DLPrj\venv\lib\site-packages\torch\nn\modules\module.py", line 541, in __call__
result = self.forward(*input, **kwargs)
File "E:/DLPrj/venv/torch_practice.py", line 216, in forward
out = self.conv1(input)
File "E:\DLPrj\venv\lib\site-packages\torch\nn\modules\module.py", line 541, in __call__
result = self.forward(*input, **kwargs)
File "E:\DLPrj\venv\lib\site-packages\torch\nn\modules\container.py", line 92, in forward
input = module(input)
File "E:\DLPrj\venv\lib\site-packages\torch\nn\modules\module.py", line 541, in __call__
result = self.forward(*input, **kwargs)
File "E:\DLPrj\venv\lib\site-packages\torch\nn\modules\conv.py", line 345, in forward
return self.conv2d_forward(input, self.weight)
File "E:\DLPrj\venv\lib\site-packages\torch\nn\modules\conv.py", line 342, in conv2d_forward
self.padding, self.dilation, self.groups)
RuntimeError: Expected 4-dimensional input for 4-dimensional weight 64 3 3 3, but got 3-dimensional input of size [64, 64, 3] instead
After this error PyCharm freezes, and I cannot re-run the code until I restart PyCharm.
When I train my network, I also get some warning messages:
E:\DLPrj\venv\lib\site-packages\torch\serialization.py:292: UserWarning: Couldn't retrieve source code for container of type LC_small. It won't be checked for correctness upon loading.
"type " + obj.__name__ + ". It won't be checked "
E:\DLPrj\venv\lib\site-packages\torch\serialization.py:292: UserWarning: Couldn't retrieve source code for container of type Sequential. It won't be checked for correctness upon loading.
"type " + obj.__name__ + ". It won't be checked "
E:\DLPrj\venv\lib\site-packages\torch\serialization.py:292: UserWarning: Couldn't retrieve source code for container of type Conv2d. It won't be checked for correctness upon loading.
"type " + obj.__name__ + ". It won't be checked "
E:\DLPrj\venv\lib\site-packages\torch\serialization.py:292: UserWarning: Couldn't retrieve source code for container of type BatchNorm2d. It won't be checked for correctness upon loading.
"type " + obj.__name__ + ". It won't be checked "
E:\DLPrj\venv\lib\site-packages\torch\serialization.py:292: UserWarning: Couldn't retrieve source code for container of type LeakyReLU. It won't be checked for correctness upon loading.
"type " + obj.__name__ + ". It won't be checked "
E:\DLPrj\venv\lib\site-packages\torch\serialization.py:292: UserWarning: Couldn't retrieve source code for container of type Linear. It won't be checked for correctness upon loading.
"type " + obj.__name__ + ". It won't be checked "
I cannot understand why the input size the network expects suddenly changes, or why my network was saved incorrectly. Please take a look at my problem; thank you very much.
Your first error message is because torch.from_numpy(i1r) has the wrong shape. You need to do
np.expand_dims(i1r.transpose(2, 0, 1), axis=0)
and then it will be processed correctly. This is because the network expects a batch dimension, which you aren't providing, and the channels need to be in the first dimension, not the last.
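Put together, and assuming i1r is a 64x64x3 (H x W x C) NumPy array as the size [64, 64, 3] in the error suggests, that looks like:

import numpy as np
import torch

i1r_chw = i1r.transpose(2, 0, 1)          # (64, 64, 3) -> (3, 64, 64), channels first
i1r_batch = np.expand_dims(i1r_chw, 0)    # (3, 64, 64) -> (1, 3, 64, 64), add batch dim
# .float() converts a float64 array to float32, which the conv weights expect
l1 = net_lc(torch.from_numpy(i1r_batch).float())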
As for your second issue (the warnings), it's probably because you incorrectly defined conv and dense, so things go wrong when saving the model.
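The question doesn't show how conv and dense are defined, but judging from the module types in the serialization warnings (Sequential, Conv2d, BatchNorm2d, LeakyReLU, Linear) they presumably look something like the sketch below; this is only a guess at the asker's helpers, not their actual code:

import torch.nn as nn

def conv(c_in, c_out, k=3, stride=1, pad=1):
    # guessed reconstruction: Conv2d + BatchNorm2d + LeakyReLU
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, kernel_size=k, stride=stride, padding=pad),
        nn.BatchNorm2d(c_out),
        nn.LeakyReLU(0.1, inplace=True))

def dense(d_in, d_out, use_act=True):
    # guessed reconstruction: Linear, optionally followed by LeakyReLU
    layers = [nn.Linear(d_in, d_out)]
    if use_act:
        layers.append(nn.LeakyReLU(0.1, inplace=True))
    return nn.Sequential(*layers)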

Run multiple clones of a model in parallel

So I am trying to implement a reinforcement learning algorithm using Evolution Strategies.
The principle is to clone your original model N times (let's say 100 times), apply some noise to those 100 clones, run them, check which ones give the best results, and use that to update the original model.
Now I am trying to put each of these clones in a different thread and run them all in parallel.
Here is my Worker class:
from threading import Thread

import numpy as np

class WorkerThread(Thread):
    def __init__(self, action_dim, img_dim, sigma, sess):
        Thread.__init__(self)
        # sess = tf.Session()
        self.actor = ActorNetwork(sess, action_dim, img_dim)
        self.env = Environment()
        self.reward = 0
        self.N = {}
        self.original_model = None
        self.sigma = sigma

    def setActorModel(self, model):
        self.original_model = model

    def run(self):
        k = 0
        for l in self.actor.model.layers:
            if len(np.array(l.get_weights())) > 0:
                # First generate some noise
                shape = (np.array(l.get_weights()[0])).shape
                if len(shape) == 2:
                    self.N[k] = np.random.randn(shape[0], shape[1])
                else:
                    self.N[k] = np.random.randn(shape[0], shape[1], shape[2], shape[3])
                # 2nd, set weights using the original model's weights and the noise
                la = self.original_model.layers[k]
                self.actor.model.layers[k].set_weights((la.get_weights()[0] + self.sigma * self.N[k], la.get_weights()[1]))
            k += 1

        ob = self.env.reset()
        while True:
            action = self.actor.predict(np.reshape(ob['image'], (1, 480, 480, 3)))
            ob = self.env.step(action[0])
            if ob['done']:
                self.reward = ob['reward']
                break
So each worker thread has its own model, and when run() executes I set its weights using the original model's weights.
At that point I get the following error:
File "/usr/local/lib/python3.6/site-packages/keras/engine/topology.py", line 1219, in set_weights
K.batch_set_value(weight_value_tuples)
File "/usr/local/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2365, in batch_set_value
assign_op = x.assign(assign_placeholder)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/ops/variables.py", line 594, in assign
return state_ops.assign(self._variable, value, use_locking=use_locking)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/ops/state_ops.py", line 276, in assign
validate_shape=validate_shape)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/ops/gen_state_ops.py", line 59, in assign
use_locking=use_locking, name=name)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 350, in _apply_op_helper
g = ops._get_graph_from_inputs(_Flatten(keywords.values()))
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 5055, in _get_graph_from_inputs
_assert_same_graph(original_graph_element, graph_element)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 4991, in _assert_same_graph
original_item))
ValueError: Tensor("Placeholder:0", shape=(5, 5, 3, 24), dtype=float32) must be from the same graph as Tensor("conv2d_11/kernel:0", shape=(5, 5, 3, 24), dtype=float32_ref).
In the above code sample I use the same tensorflow session in all the threads. I tried creating a different session for each but I get the same error.
I have little knowledge of TensorFlow; does anyone know how to fix this?
You need to use the same graph in all threads. Create a tf.Graph() in your main thread and wrap your per-thread function in "with my_graph.as_default():".
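A minimal sketch of that pattern (the my_graph name and the commented-out ActorNetwork/worker construction are placeholders for the asker's code; this assumes graph-mode TF 1.x):

import tensorflow as tf

# Main thread: create one graph and one session, and build the original
# model plus every clone inside it.
my_graph = tf.Graph()
with my_graph.as_default():
    sess = tf.Session(graph=my_graph)
    # actor = ActorNetwork(sess, action_dim, img_dim)            # original model
    # workers = [WorkerThread(action_dim, img_dim, sigma, sess)  # clones
    #            for _ in range(100)]

# Worker threads: re-enter the same graph before touching any tensors,
# e.g. at the top of WorkerThread.run():
#     with my_graph.as_default():
#         ... generate noise, set_weights(), predict() as before ...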
