I am using Caffe to do image classification on Mac OS X with Python.
Right now I know how to classify a list of images using Caffe with plain Python, but I want to make it faster by using Spark.
Therefore, I tried to apply the image classification to each element of an RDD created from a list of image paths. However, Spark does not allow me to do so.
Here is my code:
This is the code for image classification:
# display image name, class number, predicted label
def classify_image(image_path, transformer, net):
image = caffe.io.load_image(image_path)
transformed_image = transformer.preprocess('data', image)
net.blobs['data'].data[...] = transformed_image
output = net.forward()
output_prob = output['prob'][0]
pred = output_prob.argmax()
labels_file = caffe_root + 'data/ilsvrc12/synset_words.txt'
labels = np.loadtxt(labels_file, str, delimiter='\t')
lb = labels[pred]
image_name = image_path.split(images_folder_path)[1]
result_str = 'image: '+image_name+' prediction: '+str(pred)+' label: '+lb
return result_str
This is the code that sets up the Caffe parameters and applies the classify_image method to each element of the RDD:
def main():
sys.path.insert(0, caffe_root + 'python')
caffe.set_mode_cpu()
model_def = caffe_root + 'models/bvlc_reference_caffenet/deploy.prototxt'
model_weights = caffe_root + 'models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel'
net = caffe.Net(model_def,
model_weights,
caffe.TEST)
mu = np.load(caffe_root + 'python/caffe/imagenet/ilsvrc_2012_mean.npy')
mu = mu.mean(1).mean(1)
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_transpose('data', (2,0,1))
transformer.set_mean('data', mu)
transformer.set_raw_scale('data', 255)
transformer.set_channel_swap('data', (2,1,0))
net.blobs['data'].reshape(50,
3,
227, 227)
image_list= []
for image_path in glob.glob(images_folder_path+'*.jpg'):
image_list.append(image_path)
images_rdd = sc.parallelize(image_list)
transformer_bc = sc.broadcast(transformer)
net_bc = sc.broadcast(net)
image_predictions = images_rdd.map(lambda image_path: classify_image(image_path, transformer_bc, net_bc))
print image_predictions
if __name__ == '__main__':
main()
As you can see, here I tried to broadcast the caffe parameters, transformer_bc = sc.broadcast(transformer), net_bc = sc.broadcast(net)
The error is:
RuntimeError: Pickling of "caffe._caffe.Net" instances is not enabled
Before I added the broadcast, the error was:
Driver stacktrace.... Caused by: org.apache.spark.api.python.PythonException: Traceback (most recent call last):....
So, is there any way I can classify images with Caffe while also taking advantage of Spark?
When you work with complex, non-native objects, initialization has to be moved directly to the workers, for example with a singleton module:
net_builder.py:
import caffe

net = None  # module-level cache, one instance per worker process

def build_net(*args, **kwargs):
    ...  # Initialize net here
    return net

def get_net(*args, **kwargs):
    # Build the net on first use and reuse it afterwards
    global net
    if net is None:
        net = build_net(*args, **kwargs)
    return net
main.py:
import net_builder

sc.addPyFile("net_builder.py")

def classify_image(image_path, transformer, *args, **kwargs):
    net = net_builder.get_net(*args, **kwargs)
It means you'll have to distribute all required files as well. It can be done either manually or using SparkFiles mechanism.
On a side note you should take a look at the SparkNet package.
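The singleton idea above can be sketched without Caffe or Spark. In this minimal sketch, `ExpensiveModel` is a hypothetical stand-in for `caffe.Net`, and `classify_partition` has the shape of a function one might pass to `rdd.mapPartitions`; both names are assumptions made for illustration. It shows the object refusing to be pickled (which is what `sc.broadcast` attempts) while still being built only once per process on first use.

```python
import pickle


class ExpensiveModel:
    """Hypothetical stand-in for caffe.Net: costly to build, not picklable."""

    instances_built = 0

    def __init__(self, model_def):
        ExpensiveModel.instances_built += 1
        self.model_def = model_def

    def __reduce__(self):
        # Mimic the error raised for caffe._caffe.Net instances.
        raise RuntimeError("Pickling of ExpensiveModel instances is not enabled")

    def predict(self, image_path):
        # Placeholder "prediction": just the length of the path.
        return len(image_path)


_model = None  # module-level cache, one per Python worker process


def get_model(model_def):
    """Build the model on first use, then reuse it."""
    global _model
    if _model is None:
        _model = ExpensiveModel(model_def)
    return _model


def classify_partition(image_paths, model_def="deploy.prototxt"):
    """Only plain strings are shipped; the model is created where it is used."""
    model = get_model(model_def)
    for path in image_paths:
        yield path, model.predict(path)


# Shipping the model itself fails, as it does with sc.broadcast(net).
try:
    pickle.dumps(ExpensiveModel("deploy.prototxt"))
    pickling_failed = False
except RuntimeError:
    pickling_failed = True

# Reset the counter, then simulate two partitions handled by the same worker.
ExpensiveModel.instances_built = 0
results = list(classify_partition(["a.jpg", "bb.jpg"]))
results += list(classify_partition(["ccc.jpg"]))
```

With Spark, the equivalent call would presumably look like `images_rdd.mapPartitions(classify_partition).collect()`. Note that `map` and `mapPartitions` are lazy, so printing the resulting RDD (as in the question) shows only the RDD object; an action such as `collect()` is needed to get the predictions.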
Below is the order in which I am going to present my problem:
First, I will show you the .py script that I am using to run the web app on localhost (a Flask app). This web app is a classifier that shows whether a person has viral pneumonia, bacterial pneumonia, or is normal. Thus there are three classes (Viral, Bacterial, or Normal), predicted from chest X-rays in JPEG format.
Second, I will show you the different .py script for binary classification of pneumonia, which takes in raw DICOM files and converts them into NumPy arrays before they are diagnosed.
To achieve diagnosis, I am trying to integrate my app.py script, which takes in JPEGs, with the pneumonia binary classification script, which takes in DICOM files. The aim is to take advantage of the DICOM processing function of the second script while using all of the information and weights of the viral and bacterial classifier that I have, so that it can be used in a clinical setup. Clinical setups use DICOM files, not JPEGs, which is why I am trying to combine these two scripts.
Below is my app.py script for viral and bacterial pneumonia classification, which takes in JPEGs and which I am trying to integrate with the other script attached further below:
#::: Import modules and packages :::
# Flask utils
from flask import Flask, redirect, url_for, request, render_template
from werkzeug.utils import secure_filename
from gevent.pywsgi import WSGIServer
# Import Keras dependencies
from tensorflow.keras.models import model_from_json
from tensorflow.python.framework import ops
ops.reset_default_graph()
from keras.preprocessing import image
# Import other dependecies
import numpy as np
import h5py
from PIL import Image
import PIL
import os
#::: Flask App Engine :::
# Define a Flask app
app = Flask(__name__)
# ::: Prepare Keras Model :::
# Model files
MODEL_ARCHITECTURE = './model/model_adam.json'
MODEL_WEIGHTS = './model/model_100_eopchs_adam_20190807.h5'
# Load the model from external files
json_file = open(MODEL_ARCHITECTURE)
loaded_model_json = json_file.read()
json_file.close()
model = model_from_json(loaded_model_json)
# Get weights into the model
model.load_weights(MODEL_WEIGHTS)
print('Model loaded. Check http://127.0.0.1:5000/')
# ::: MODEL FUNCTIONS :::
def model_predict(img_path, model):
'''
Args:
-- img_path : an URL path where a given image is stored.
-- model : a given Keras CNN model.
'''
IMG = image.load_img(img_path).convert('L')
print(type(IMG))
# Pre-processing the image
IMG_ = IMG.resize((257, 342))
print(type(IMG_))
IMG_ = np.asarray(IMG_)
print(IMG_.shape)
IMG_ = np.true_divide(IMG_, 255)
IMG_ = IMG_.reshape(1, 342, 257, 1)
print(type(IMG_), IMG_.shape)
print(model)
model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer='rmsprop')
predict_x = model.predict(IMG_)
print(predict_x)
prediction = np.argmax(predict_x,axis=1)
print(prediction)
return prediction
# ::: FLASK ROUTES
@app.route('/', methods=['GET'])
def index():
# Main Page
return render_template('index.html')
@app.route('/predict', methods=['GET', 'POST'])
def upload():
# Constants:
classes = {'TRAIN': ['BACTERIA', 'NORMAL', 'VIRUS'],
'VALIDATION': ['BACTERIA', 'NORMAL'],
'TEST': ['BACTERIA', 'NORMAL', 'VIRUS']}
if request.method == 'POST':
# Get the file from post request
f = request.files['file']
# Save the file to ./uploads
basepath = os.path.dirname(__file__)
file_path = os.path.join(
basepath, 'uploads', secure_filename(f.filename))
f.save(file_path)
# Make a prediction
prediction = model_predict(file_path, model)
predicted_class = classes['TRAIN'][prediction[0]]
print('We think that is {}.'.format(predicted_class.lower()))
return str(predicted_class).lower()
if __name__ == '__main__':
app.run(debug = True)
Below is the already functioning script for binary pneumonia classification, which takes in DICOM files and which I am trying to integrate with the weights and preprocessing of the viral and bacterial classifier:
## Loading standard modules and libraries
import numpy as np
import pandas as pd
import pydicom
%matplotlib inline
import matplotlib.pyplot as plt
import keras
from keras.models import Sequential
from keras.layers import Dense
from keras.models import model_from_json
from skimage.transform import resize
# This function reads in a .dcm file, checks the important fields for our device, and returns a numpy array
# of just the imaging data
def check_dicom(filename):
print('Loading file {} ...'.format(filename))
ds = pydicom.dcmread(filename)
if (ds.BodyPartExamined !='CHEST') | (ds.Modality !='DX') | (ds.PatientPosition not in ['PA', 'AP']):
print('The image is not valid because the image position, the image type or the body part is not as per standards')
return
else:
print('ID:', ds.PatientID,
'Age:', ds.PatientAge,
'Modality:', ds.Modality,
'Postion: ', ds.PatientPosition,
'Body Part: ', ds.BodyPartExamined,
'Study Desc: ', ds.StudyDescription)
img = ds.pixel_array
return img
# This function takes the numpy array output by check_dicom and
# runs the appropriate pre-processing needed for our model input
def preprocess_image(img,img_mean,img_std,img_size):
# todo
img = resize(img, (224,224))
img = img / 255.0
grey_img = (img - img_mean) / img_std
proc_img = np.zeros((224,224,3))
proc_img[:, :, 0] = grey_img
proc_img[:, :, 1] = grey_img
proc_img[:, :, 2] = grey_img
proc_img = np.resize(proc_img, img_size)
return proc_img
# This function loads in our trained model w/ weights and compiles it
def load_model(model_path, weight_path):
# todo
json_file = open(model_path, 'r')
loaded_model_json = json_file.read()
json_file.close()
model = model_from_json(loaded_model_json)
model.load_weights(weight_path)
return model
# This function uses our device's threshold parameters to predict whether or not
# the image shows the presence of pneumonia using our trained model
def predict_image(model, img, thresh):
# todo
result = model.predict(img)
print('Predicted value:', result)
predict=result[0]
prediction = "Negative"
if(predict > thresh):
prediction = "Positive"
return prediction
test_dicoms = ['test1.dcm','test2.dcm','test3.dcm','test4.dcm','test5.dcm','test6.dcm']
model_path = "my_model2.json" #path to saved model
weight_path = "xray_class_my_model2.best.hdf5" #path to saved best weights
IMG_SIZE=(1,224,224,3) # This might be different if you did not use vgg16
img_mean = 0.49262813 # mean image value from Build and train model line 22
img_std = 0.24496286 # loads the std dev from Build and train model line 22
my_model = load_model(model_path, weight_path) #loads model
thresh = 0.62786263 #threshold value for New Model2 from Build and train model line 66 at 80% Precision
# use the .dcm files to test your prediction
for i in test_dicoms:
img = np.array([])
img = check_dicom(i)
if img is None:
continue
img_proc = preprocess_image(img,img_mean,img_std,IMG_SIZE)
pred = predict_image(my_model,img_proc,thresh)
print('Model Classification:', pred , 'for Pneumonia' )
print('--------------------------------------------------------------------------------------------------------')
Output of above script:
Loading file test1.dcm ...
ID: 2 Age: 81 Modality: DX Postion: PA Body Part: CHEST Study Desc: No Finding
Predicted value: [[0.4775539]]
Model Classification: Negative for Pneumonia
--------------------------------------------------------------------------------------------------------
Loading file test2.dcm ...
ID: 1 Age: 58 Modality: DX Postion: AP Body Part: CHEST Study Desc: Cardiomegaly
Predicted value: [[0.47687072]]
Model Classification: Negative for Pneumonia
--------------------------------------------------------------------------------------------------------
Loading file test3.dcm ...
ID: 61 Age: 77 Modality: DX Postion: AP Body Part: CHEST Study Desc: Effusion
Predicted value: [[0.47764364]]
Model Classification: Negative for Pneumonia
--------------------------------------------------------------------------------------------------------
Loading file test4.dcm ...
The image is not valid because the image position, the image type or the body part is not as per standards
Loading file test5.dcm ...
The image is not valid because the image position, the image type or the body part is not as per standards
Loading file test6.dcm ...
The image is not valid because the image position, the image type or the body part is not as per standards
Threshold of 0.62786263 is considered at 80% Precision
Below is what I have tried so far, but the diagnosis I get is always Viral for every DICOM sample:
## Loading standard modules and libraries
import numpy as np
import pandas as pd
import pydicom
from PIL import Image
#%matplotlib inline
import matplotlib.pyplot as plt
import keras
from keras.models import Sequential
from keras.layers import Dense
from keras.models import model_from_json
from keras.preprocessing import image
from skimage.transform import resize
# This function reads in a .dcm file, checks the important fields for our device, and returns a numpy array
# of just the imaging data
def check_dicom(filename):
print('Loading file {} ...'.format(filename))
ds = pydicom.dcmread(filename)
if (ds.BodyPartExamined !='CHEST'): #| (ds.Modality !='DX'): #| (ds.PatientPosition not in ['PA', 'AP']):
print('The image is not valid because the image position, the image type or the body part is not as per standards')
return
else:
print('ID:', ds.PatientID,
'Age:', ds.PatientAge,
'Modality:', ds.Modality,
'Postion: ', ds.PatientPosition,
'Body Part: ', ds.BodyPartExamined,
'Study Desc: ', ds.StudyDescription)
img = ds.pixel_array
return img
# This function takes the numpy array output by check_dicom and
# runs the appropriate pre-processing needed for our model input
def preprocess_image(img):
# todo
#im = np.reshape(img, (342,257 ))
#im = np.arange(257)
#img = Image.fromarray(im)
#img = image.load_img(img).convert('L')
img = resize(img, (342,257))
grey_img = img / 255.0
#grey_img = (img - img_mean) / img_std
proc_img = np.zeros((1,342,257,1))
proc_img[:, :, :, 0] = grey_img
#proc_img[:, :, :, 1] = grey_img
#proc_img[:, :, :, 2] = grey_img
proc_img = proc_img.reshape(1, 342, 257, 1)
return proc_img
# This function loads in our trained model w/ weights and compiles it
def load_model(model_path, weight_path):
# todo
json_file = open(model_path, 'r')
loaded_model_json = json_file.read()
json_file.close()
model = model_from_json(loaded_model_json)
model.load_weights(weight_path)
model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer='rmsprop')
return model
# This function uses our device's threshold parameters to predict whether or not
# the image shows the presence of pneumonia using our trained model
def predict_image(model, img):
# todo
model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer='rmsprop')
#x = np.expand_dims(img, axis=0)
predict_x= model.predict(img)
print(predict_x)
prediction = np.argmax(predict_x,axis=1)
print(prediction)
return prediction
test_dicoms = ['test3.dcm','test2.dcm','test1.dcm','test4.dcm','test5.dcm','test6.dcm']
model_path = "model_adam.json" #path to saved model
weight_path = "model.h5" #path to saved best weights
#IMG_SIZE=(1,342,257,1) # This might be different if you did not use vgg16
#img_mean = 0.49262813 # mean image value from Build and train model line 22
#img_std = 0.24496286 # loads the std dev from Build and train model line 22
#my_model = load_model(model_path, weight_path) #loads model
#thresh = 0.62786263 #threshold value for New Model2 from Build and train model line 66 at 80% Precision
# use the .dcm files to test your prediction
for i in test_dicoms:
img = np.array([])
img = check_dicom(i)
if img is None:
continue
classes = {'TRAIN': ['BACTERIAL', 'NORMAL', 'VIRAL'],
'VALIDATION': ['BACTERIA', 'NORMAL'],
'TEST': ['BACTERIA', 'NORMAL', 'VIRUS']}
img_proc = preprocess_image(img)
prediction = predict_image(load_model(model_path, weight_path),img_proc)
predicted_class = classes['TRAIN'][int(prediction[0])]
print('Model Classification:', predicted_class, 'Pneumonia' )
print('--------------------------------------------------------------------------------------------------------')
Below is the output:
2022-01-02 10:50:00.817561: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2022-01-02 10:50:00.817601: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Loading file test3.dcm ...
ID: 61 Age: 77 Modality: DX Postion: AP Body Part: CHEST Study Desc: Effusion
2022-01-02 10:50:02.652828: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2022-01-02 10:50:02.652859: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
2022-01-02 10:50:02.652899: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (Wisdom-HP-250-G3-Notebook-PC): /proc/driver/nvidia/version does not exist
2022-01-02 10:50:02.653123: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
[[0.01132523 0.00254696 0.98612785]]
[2]
Model Classification: VIRAL Pneumonia
--------------------------------------------------------------------------------------------------------
Loading file test2.dcm ...
ID: 1 Age: 58 Modality: DX Postion: AP Body Part: CHEST Study Desc: Cardiomegaly
[[0.01112939 0.00251635 0.9863543 ]]
[2]
Model Classification: VIRAL Pneumonia
--------------------------------------------------------------------------------------------------------
Loading file test1.dcm ...
ID: 2 Age: 81 Modality: DX Postion: PA Body Part: CHEST Study Desc: No Finding
[[0.01128576 0.00255111 0.9861631 ]]
[2]
Model Classification: VIRAL Pneumonia
--------------------------------------------------------------------------------------------------------
Loading file test4.dcm ...
The image is not valid because the image position, the image type or the body part is not as per standards
Loading file test5.dcm ...
ID: 2 Age: 81 Modality: CT Postion: PA Body Part: CHEST Study Desc: No Finding
[[0.01128576 0.00255111 0.9861631 ]]
[2]
Model Classification: VIRAL Pneumonia
--------------------------------------------------------------------------------------------------------
Loading file test6.dcm ...
ID: 2 Age: 81 Modality: DX Postion: XX Body Part: CHEST Study Desc: No Finding
WARNING:tensorflow:5 out of the last 5 calls to <function Model.make_predict_function.<locals>.predict_function at 0x7fba38ed19d0> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings could be due to (1) creating @tf.function repeatedly in a loop, (2) passing tensors with different shapes, (3) passing Python objects instead of tensors. For (1), please define your @tf.function outside of the loop. For (2), @tf.function has experimental_relax_shapes=True option that relaxes argument shapes that can avoid unnecessary retracing. For (3), please refer to https://www.tensorflow.org/guide/function#controlling_retracing and https://www.tensorflow.org/api_docs/python/tf/function for more details.
[[0.01128576 0.00255111 0.9861631 ]]
[2]
Model Classification: VIRAL Pneumonia
---------------------------------------
My suspicion is that I got the image preprocessing steps wrong when I integrated these two scripts (remember: the goal is to take advantage of the DICOM reading function of the second script). As a result, the model is probably receiving wrongly arranged input arrays and predicting on the wrong input altogether.
If you need any information about the parameters in the Jupyter training notebook of the model, kindly let me know.
When a classifier works fine in train/test but not when doing inference in production, a very common reason is that the training data was processed differently from the production data. The fix is to make sure both are processed the same way, ideally using the same piece of code.
1. How were the JPEGs the classifier was trained on processed? Do they originally come from DICOMs? If yes, what was the exact code for the conversion?
2. How were the JPEGs loaded during training? Pay special attention to parts that modify the data rather than merely copy it, such as grey_img = (img - img_mean) / img_std and the other commented-out lines in your code (maybe they were not commented out during training).
3. If you copy the DICOM-to-JPEG conversion from 1 and the JPEG loading from 2, you will probably have a working prediction.
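One quick way to check for such a mismatch is to print the same summary statistics for an input produced by the training-time path and for one produced by the inference path. The sketch below uses synthetic arrays rather than real X-rays, and the helper name `describe_input` is an assumption. It illustrates how an image that has been scaled to [0, 1] twice shows up immediately. As far as I know, `skimage.transform.resize` already converts integer input to floats in [0, 1] by default, so a further division by 255 would be such a double scaling, but check that against your installed version.

```python
import numpy as np


def describe_input(batch):
    """Return shape, dtype and value range of a model input batch."""
    batch = np.asarray(batch)
    return {
        "shape": batch.shape,
        "dtype": str(batch.dtype),
        "min": float(batch.min()),
        "max": float(batch.max()),
        "mean": float(batch.mean()),
    }


rng = np.random.default_rng(0)
# Synthetic 8-bit grayscale image standing in for a training JPEG.
raw = rng.integers(0, 256, size=(342, 257), dtype=np.uint8)

# Training-style path: divide by 255 once, then add batch and channel axes.
train_like = (raw / 255.0).reshape(1, 342, 257, 1)

# Faulty inference path: the image is already in [0, 1] and is divided again.
already_scaled = raw / 255.0
inference_like = (already_scaled / 255.0).reshape(1, 342, 257, 1)

train_stats = describe_input(train_like)
inference_stats = describe_input(inference_like)
print(train_stats)
print(inference_stats)
```

If the two summaries differ in range or mean by orders of magnitude, the model is effectively seeing a near-constant image at inference time, which would be consistent with getting the same class for every sample.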
The below dicom to jpeg conversion function did the job for me:
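from PIL.Image import fromarray
from pydicom import dcmread  # called read_file in older pydicom releases

def take_dicom(dicomname):
    # Read the DICOM file, save its pixel data as a JPEG, and return the JPEG file name
    ds = dcmread('Dicom_files/' + dicomname)
    im = fromarray(ds.pixel_array)
    im.save('./Jpeg/' + dicomname + '.jpg')  # save() returns None, so nothing to assign
    pure_jpg = dicomname + '.jpg'
    return pure_jpg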
I then just had to use os.path to point my prediction function to where it should pick up these JPEGs before they are preprocessed and diagnosed:
def preprocess_image(pure_jpg):
    '''
    Args:
        -- pure_jpg : file name of a JPEG stored in ./Jpeg/.
    Returns the image as a (1, 342, 257, 1) array scaled to [0, 1].
    '''
    basepath = os.path.dirname('./Jpeg/')
    file_path = os.path.join(basepath, pure_jpg)
    # Load as 8-bit grayscale, exactly as in app.py
    IMG = image.load_img(file_path).convert('L')
    # Pre-processing the image
    IMG_ = IMG.resize((257, 342))
    IMG_ = np.asarray(IMG_)
    IMG_ = np.true_divide(IMG_, 255)
    IMG_ = IMG_.reshape(1, 342, 257, 1)
    return IMG_
However, the problem is that it only works for the following two DICOM imaging modalities:
DX (Digital X-Ray)
CT (Computed Tomography)
CR (Computed Radiography) DICOM images fail to convert.
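One possible cause, offered as a guess rather than a diagnosis, is bit depth: DICOM pixel data is frequently stored with more than 8 bits per pixel, while JPEG output expects 8-bit data. A minimal sketch of rescaling an array to `uint8` before calling `fromarray(...).save(...)` follows. `to_uint8` is a hypothetical helper, and the synthetic 12-bit array stands in for `ds.pixel_array`.

```python
import numpy as np


def to_uint8(pixel_array):
    """Min-max scale any numeric array to the 0..255 range as uint8."""
    arr = np.asarray(pixel_array, dtype=np.float64)
    lo, hi = arr.min(), arr.max()
    if hi == lo:
        # Constant image: avoid division by zero and return all zeros.
        return np.zeros(arr.shape, dtype=np.uint8)
    scaled = (arr - lo) / (hi - lo) * 255.0
    return np.round(scaled).astype(np.uint8)


# Synthetic 12-bit image standing in for ds.pixel_array.
fake_pixels = np.array([[0, 1024], [2048, 4095]], dtype=np.uint16)
eight_bit = to_uint8(fake_pixels)
print(eight_bit)
```

In `take_dicom` this would be used as `im = fromarray(to_uint8(ds.pixel_array))`. Plain min-max scaling ignores DICOM details such as windowing and inverted (MONOCHROME1) images, so whatever conversion is chosen should match how the training JPEGs were produced.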
I'm working with MRI images and I'd like to use from_tensor_slices to preprocess the paths, but I don't know how to use it properly. Below are my code, the error message, and a link to the dataset.
First I arrange my data: 484 images and 484 labels.
image_data_path = './drive/MyDrive/Brain Tumour/Task01_BrainTumour/imagesTr/'
label_data_path = './drive/MyDrive/Brain Tumour/Task01_BrainTumour/labelsTr/'
image_paths = [image_data_path + name
for name in os.listdir(image_data_path)
if not name.startswith(".")]
label_paths = [label_data_path + name
for name in os.listdir(label_data_path)
if not name.startswith(".")]
image_paths = sorted(image_paths)
label_paths = sorted(label_paths)
Then, the function to load 1 example (I use nibabel to load nii files)
def load_one_sample(image_path, label_path):
image = nib.load(image_path).get_fdata()
image = tf.convert_to_tensor(image, dtype = 'float32')
label = nib.load(label_path).get_fdata()
label = tf.convert_to_tensor(label, dtype = 'uint8')
return image, label
Next, I tried using from_tensor_slices
image_filenames = tf.constant(image_paths)
label_filenames = tf.constant(label_paths)
dataset = tf.data.Dataset.from_tensor_slices((image_filenames, label_filenames))
all_data = dataset.map(load_one_sample)
And the error comes: TypeError: stat: path should be string, bytes, os.PathLike or integer, not Tensor
What can be wrong and how can I fix it?
Datalink: https://drive.google.com/drive/folders/1HqEgzS8BV2c7xYNrZdEAnrHk7osJJ--2 (task 1 - Brain Tumour)
Please tell me if you need more information.
nib.load is not a TensorFlow function.
If you want to use anything in tf.data pipeline that is not a TensorFlow function then you have to wrap it using a tf.py_function.
Code:
image_data_path = 'Task01_BrainTumour/imagesTr/'
label_data_path = 'Task01_BrainTumour/labelsTr/'
image_paths = [image_data_path + name
for name in os.listdir(image_data_path)
if not name.startswith(".")]
label_paths = [label_data_path + name
for name in os.listdir(label_data_path)
if not name.startswith(".")]
image_paths = sorted(image_paths)
label_paths = sorted(label_paths)
def load_one_sample(image_path, label_path):
image = nib.load(image_path.numpy().decode()).get_fdata()
image = tf.convert_to_tensor(image, dtype = 'float32')
label = nib.load(label_path.numpy().decode()).get_fdata()
label = tf.convert_to_tensor(label, dtype = 'uint8')
return image, label
def wrapper_load(img_path, label_path):
img, label = tf.py_function(func = load_one_sample, inp = [img_path, label_path], Tout = [tf.float32, tf.uint8])
return img, label
dataset = tf.data.Dataset.from_tensor_slices((image_paths, label_paths)).map(wrapper_load)
The error is not caused by the from_tensor_slices function; it arises because nib.load expects a string but gets a tensor.
However, a better way would be to create tfrecords and use them to train the model.
I have created a custom encoder/decoder like so:
import tensorflow as tf
from tensorflow_model_optimization.python.core.internal import tensor_encoding as te
# noinspection PyUnresolvedReferences
class SparseTernaryCompressionEncodingStage(te.core.EncodingStageInterface):
AVERAGE = 'average'
NEGATIVES = 'negatives'
POSITIVES = 'positives'
TESTING = 'testing'
NEW_SHAPE = 'new_shape'
ORIGINAL_SHAPE = 'original_shape'
def name(self):
pass
def compressible_tensors_keys(self):
pass
def commutes_with_sum(self):
pass
def decode_needs_input_shape(self):
pass
def get_params(self):
pass
def encode(self, original_tensor, encode_params):
original_shape = tf.shape(original_tensor)
tensor = tf.reshape(original_tensor, [-1])
sparsification_rate = int(len(tensor) / 100 * 1)
new_shape = tensor.get_shape().as_list()
if sparsification_rate == 0:
sparsification_rate = 1
mask = tf.cast(tf.abs(tensor) >= tf.math.top_k(tf.abs(tensor), sparsification_rate)[0][-1], tf.float32)
inv_mask = tf.cast(tf.abs(tensor) < tf.math.top_k(tf.abs(tensor), sparsification_rate)[0][-1], tf.float32)
tensor_masked = tf.multiply(tensor, mask)
average = tf.reduce_sum(tf.abs(tensor_masked)) / sparsification_rate
compressed_tensor = tf.add(tf.multiply(average, mask) * tf.sign(tensor), tf.multiply(tensor_masked, inv_mask))
negatives = tf.where(compressed_tensor < 0)
positives = tf.where(compressed_tensor > 0)
encoded_x = {self.AVERAGE: average, self.NEGATIVES: negatives, self.POSITIVES: positives,
self.NEW_SHAPE: new_shape, self.ORIGINAL_SHAPE: original_shape}
return encoded_x
def decode(self, encoded_tensors, decode_params, num_summands=None, shape=None):
decompressed_tensor = tf.zeros(self.NEW_SHAPE, tf.float32)
average_values_negative = tf.fill([len(self.NEGATIVES), ], -self.AVERAGE)
average_values_positive = tf.fill([len(self.POSITIVES), ], self.AVERAGE)
decompressed_tensor = tf.tensor_scatter_nd_update(decompressed_tensor, self.NEGATIVES, average_values_negative)
decompressed_tensor = tf.tensor_scatter_nd_update(decompressed_tensor, self.POSITIVES, average_values_positive)
decompressed_tensor = tf.reshape(decompressed_tensor, self.ORIGINAL_SHAPE)
return decompressed_tensor
Now, I would like to use the encode function to encode all the weights that the client sends to the server and, on the server, use the decode function to recover all the weights. Basically, instead of sending all the weights from the client to the server, I want to send only the five pieces of information needed to recreate the weights.
The problem is that I don't understand how to tell the client to use this encoder to send the information, and how to tell the server to use the decoder before trying to do:
round_model_delta = tff.federated_mean(client_outputs.weights_delta, weight=weight_denom)
I'm using the TensorFlow Federated simple_fedavg example as the base project.
If you only want to modify the aggregation, you may have an easier time using the tff.learning APIs with what you have, parameterizing the aggregation with a tff.aggregators object. For instance:
te.core.EncoderComposer(te.testing.PlusOneOverNEncodingStage()).make()
def encoder_fn(value_spec):
return te.encoders.as_gather_encoder(
te.core.EncoderComposer(SparseTernaryCompressionEncodingStage()).make(),
value_spec)
tff.learning.build_federated_averaging_process(
..., # Other args.
model_update_aggregation_factory=tff.aggregators.EncodedSumFactory(
encoder_fn))
You may also find these tutorials helpful:
https://www.tensorflow.org/federated/tutorials/tuning_recommended_aggregators
https://www.tensorflow.org/federated/tutorials/custom_aggregators
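For orientation, the encode/decode round trip that the stage in the question aims for can be sketched in plain Python, independent of TFF. This is only a simplified illustration of the idea (keep a small fraction of the entries by magnitude, replace them with their mean magnitude and sign, and transmit just the mean, the index lists, and the length). The function names are assumptions, reshaping back to the original tensor shape is omitted, and none of this is the `tensor_encoding` API. Note that a decode written this way reads its inputs from the encoded dictionary, whereas the `decode` in the question reads the class-level string constants such as `self.NEGATIVES`.

```python
def stc_encode(values, fraction=0.01):
    """Sparse ternary compression of a flat list of floats."""
    k = max(1, int(len(values) * fraction))
    # Indices of the k entries with the largest magnitude.
    order = sorted(range(len(values)), key=lambda i: abs(values[i]), reverse=True)
    top = order[:k]
    average = sum(abs(values[i]) for i in top) / k
    return {
        "average": average,
        "positives": sorted(i for i in top if values[i] > 0),
        "negatives": sorted(i for i in top if values[i] < 0),
        "length": len(values),
    }


def stc_decode(encoded):
    """Rebuild a dense list holding +average, -average or 0.0."""
    dense = [0.0] * encoded["length"]
    for i in encoded["positives"]:
        dense[i] = encoded["average"]
    for i in encoded["negatives"]:
        dense[i] = -encoded["average"]
    return dense


weights = [0.1, -3.0, 0.2, 5.0, -0.05, 0.0, 1.0, -0.4]
encoded = stc_encode(weights, fraction=0.25)  # keep the top 2 of 8 entries
decoded = stc_decode(encoded)
print(encoded)
print(decoded)
```

Only the two largest-magnitude entries survive here, both replaced by their mean magnitude with the original sign; everything else decodes to zero.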
I have a Python script that does deepfake processing, and I need to execute that script from a UI program. I've tried to write it as a program, but I have some issues:
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Windows.Forms;
using IronPython.Hosting;
using Microsoft.Scripting.Hosting;
namespace DeeepSliz
{
public static class Program
{
[STAThread]
static void Main()
{
Application.SetHighDpiMode(HighDpiMode.SystemAware);
Application.EnableVisualStyles();
Application.SetCompatibleTextRenderingDefault(false);
Application.Run(new Form1());
}
public static void Swagger()
{
var engine = Python.CreateEngine();
var script = @"C:\sliz\demo.py";
var sourse = engine.CreateScriptSourceFromFile(script);
var argv = new List<string>();
argv.Add("");
argv.Add("--lol");
engine.GetSysModule().SetVariable("argv", argv);
///
var eIO = engine.Runtime.IO;
///
var errors = new MemoryStream();
eIO.SetErrorOutput(errors, Encoding.Default);
var result = new MemoryStream();
eIO.SetOutput(result, Encoding.Default);
///
var scope = engine.CreateScope();
sourse.Execute(scope);
///
string str(byte[] x) => Encoding.Default.GetString(x);
Console.WriteLine("ERRORS:");
Console.WriteLine(str(errors.ToArray()));
Console.WriteLine();
Console.WriteLine("Results;");
Console.WriteLine(str(result.ToArray()));
}
}
}
This is what it looks like, and I wrote a button to execute that code:
private void button3_Click(object sender, EventArgs e)
{
Program.Swagger();
}
When I start the program and click "button3", this happened, and this.
And of course the Python script (which works normally):
import matplotlib
matplotlib.use('Agg')
import os, sys
import yaml
import eel
from argparse import ArgumentParser
from tqdm import tqdm
import imageio
import numpy as np
from skimage.transform import resize
from skimage import img_as_ubyte
import torch
from sync_batchnorm import DataParallelWithCallback
from modules.generator import OcclusionAwareGenerator
from modules.keypoint_detector import KPDetector
from animate import normalize_kp
from scipy.spatial import ConvexHull
'''eel.init('web')
eel.start('main.html', size=(700, 700))'''
if sys.version_info[0] < 3:
raise Exception("You must use Python 3 or higher. Recommended version is Python 3.7")
def load_checkpoints(config_path, checkpoint_path, cpu=False):
with open(config_path) as f:
config = yaml.load(f)
generator = OcclusionAwareGenerator(**config['model_params']['generator_params'],
**config['model_params']['common_params'])
if not cpu:
generator.cuda()
kp_detector = KPDetector(**config['model_params']['kp_detector_params'],
**config['model_params']['common_params'])
if not cpu:
kp_detector.cuda()
if cpu:
checkpoint = torch.load(checkpoint_path, map_location=torch.device('cpu'))
else:
checkpoint = torch.load(checkpoint_path)
generator.load_state_dict(checkpoint['generator'])
kp_detector.load_state_dict(checkpoint['kp_detector'])
if not cpu:
generator = DataParallelWithCallback(generator)
kp_detector = DataParallelWithCallback(kp_detector)
generator.eval()
kp_detector.eval()
return generator, kp_detector
def make_animation(source_image, driving_video, generator, kp_detector, relative=True,
                   adapt_movement_scale=True, cpu=False):
    with torch.no_grad():
        predictions = []
        source = torch.tensor(source_image[np.newaxis].astype(np.float32)).permute(0, 3, 1, 2)
        if not cpu:
            source = source.cuda()
        driving = torch.tensor(np.array(driving_video)[np.newaxis].astype(np.float32)).permute(0, 4, 1, 2, 3)
        kp_source = kp_detector(source)
        kp_driving_initial = kp_detector(driving[:, :, 0])
        for frame_idx in tqdm(range(driving.shape[2])):
            driving_frame = driving[:, :, frame_idx]
            if not cpu:
                driving_frame = driving_frame.cuda()
            kp_driving = kp_detector(driving_frame)
            kp_norm = normalize_kp(kp_source=kp_source, kp_driving=kp_driving,
                                   kp_driving_initial=kp_driving_initial,
                                   use_relative_movement=relative,
                                   use_relative_jacobian=relative,
                                   adapt_movement_scale=adapt_movement_scale)
            out = generator(source, kp_source=kp_source, kp_driving=kp_norm)
            predictions.append(np.transpose(out['prediction'].data.cpu().numpy(), [0, 2, 3, 1])[0])
    return predictions
def find_best_frame(source, driving, cpu=False):
    import face_alignment
    def normalize_kp(kp):
        kp = kp - kp.mean(axis=0, keepdims=True)
        area = ConvexHull(kp[:, :2]).volume
        area = np.sqrt(area)
        kp[:, :2] = kp[:, :2] / area
        return kp
    fa = face_alignment.FaceAlignment(face_alignment.LandmarksType._2D, flip_input=True,
                                      device='cpu' if cpu else 'cuda')
    kp_source = fa.get_landmarks(255 * source)[0]
    kp_source = normalize_kp(kp_source)
    norm = float('inf')
    frame_num = 0
    for i, image in tqdm(enumerate(driving)):
        kp_driving = fa.get_landmarks(255 * image)[0]
        kp_driving = normalize_kp(kp_driving)
        new_norm = (np.abs(kp_source - kp_driving) ** 2).sum()
        if new_norm < norm:
            norm = new_norm
            frame_num = i
    return frame_num
if __name__ == "__main__":
    parser = ArgumentParser()
    parser.add_argument("--config", required=True, help="path to config")
    parser.add_argument("--checkpoint", default='vox-cpk.pth.tar', help="path to checkpoint to restore")
    parser.add_argument("--source_image", default='sup-mat/source.png', help="path to source image")
    parser.add_argument("--driving_video", default='sup-mat/source.png', help="path to driving video")
    parser.add_argument("--result_video", default='result.mp4', help="path to output")
    parser.add_argument("--relative", dest="relative", action="store_true",
                        help="use relative or absolute keypoint coordinates")
    parser.add_argument("--adapt_scale", dest="adapt_scale", action="store_true",
                        help="adapt movement scale based on convex hull of keypoints")
    parser.add_argument("--find_best_frame", dest="find_best_frame", action="store_true",
                        help="Generate from the frame that is the most aligned with source. (Only for faces, requires face_alignment lib)")
    parser.add_argument("--best_frame", dest="best_frame", type=int, default=None,
                        help="Set frame to start from.")
    parser.add_argument("--cpu", dest="cpu", action="store_true", help="cpu mode.")
    parser.set_defaults(relative=False)
    parser.set_defaults(adapt_scale=False)
    opt = parser.parse_args()
    source_image = imageio.imread(opt.source_image)
    reader = imageio.get_reader(opt.driving_video)
    fps = reader.get_meta_data()['fps']
    driving_video = []
    try:
        for im in reader:
            driving_video.append(im)
    except RuntimeError:
        pass
    reader.close()
    source_image = resize(source_image, (256, 256))[..., :3]
    driving_video = [resize(frame, (256, 256))[..., :3] for frame in driving_video]
    generator, kp_detector = load_checkpoints(config_path=opt.config,
                                              checkpoint_path=opt.checkpoint, cpu=opt.cpu)
    if opt.find_best_frame or opt.best_frame is not None:
        i = opt.best_frame if opt.best_frame is not None else find_best_frame(source_image,
                                                                              driving_video, cpu=opt.cpu)
        print("Best frame: " + str(i))
        driving_forward = driving_video[i:]
        driving_backward = driving_video[:(i+1)][::-1]
        predictions_forward = make_animation(source_image, driving_forward, generator,
                                             kp_detector, relative=opt.relative,
                                             adapt_movement_scale=opt.adapt_scale, cpu=opt.cpu)
        predictions_backward = make_animation(source_image, driving_backward, generator,
                                              kp_detector, relative=opt.relative,
                                              adapt_movement_scale=opt.adapt_scale, cpu=opt.cpu)
        predictions = predictions_backward[::-1] + predictions_forward[1:]
    else:
        predictions = make_animation(source_image, driving_video, generator, kp_detector,
                                     relative=opt.relative, adapt_movement_scale=opt.adapt_scale,
                                     cpu=opt.cpu)
    imageio.mimsave(opt.result_video, [img_as_ubyte(frame) for frame in predictions], fps=fps)
I don't know how to fix that. Please help.
Caveat: I haven't done this myself; my answer is based entirely on Googling.
The error is saying that only one keyword argument is allowed.
This leads me to think that
OcclusionAwareGenerator(**config['model_params']['generator_params'], **config['model_params']['common_params'])
and
KPDetector(**config['model_params']['kp_detector_params'], **config['model_params']['common_params'])
are not valid in IronPython. You may need to merge the dictionaries and pass the combined one in with the ** syntax.
Using the second case as an example, you can use the copy module to create a new dictionary and then populate it with the values of both existing ones:
import copy
params = copy.deepcopy(config['model_params']['kp_detector_params'])
params.update(config['model_params']['common_params'])
KPDetector(**params)
The deep copy is usually the safest, but copy.copy is an option as well and (based on a few assumptions) will likely not cause any issues.
Another, possibly simpler, option is to use collections.ChainMap to provide a combined view of the two dictionaries:
from collections import ChainMap
KPDetector(**ChainMap(config['model_params']['kp_detector_params'], config['model_params']['common_params']))
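To illustrate the merge-then-unpack idea without needing the model code, here is a minimal sketch. Passing several `**` unpackings in a single call only became valid syntax in Python 3.5 (PEP 448), so merging into one dictionary first avoids that syntax entirely. The names `merge_params` and `fake_detector` and the sample config values are hypothetical, made up for this example; `fake_detector` merely stands in for `KPDetector`.

```python
def merge_params(specific, common):
    """Return a new dict holding both sets of keyword arguments."""
    merged = dict(specific)  # shallow copy, so the loaded config is left untouched
    merged.update(common)    # on a key clash, the value from `common` wins
    return merged


def fake_detector(**kwargs):
    # Hypothetical stand-in for KPDetector: it just echoes its keyword arguments.
    return kwargs


# Hypothetical config values, shaped like the YAML the script loads.
config = {
    'model_params': {
        'common_params': {'num_kp': 10, 'num_channels': 3},
        'kp_detector_params': {'temperature': 0.1, 'block_expansion': 32},
    }
}

params = merge_params(config['model_params']['kp_detector_params'],
                      config['model_params']['common_params'])
detector_kwargs = fake_detector(**params)  # a single ** unpacking
```

A shallow copy is enough here as long as the constructor does not mutate nested values. Note that `update` and `ChainMap` resolve duplicate keys differently: `update` lets the second dictionary win, while `ChainMap` returns the value from the first mapping that has the key.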
I have been fighting with TensorRT (currently TensorRT 4 for Python) for several weeks. I worked through a lot of problems just to get TensorRT running. The example code from NVIDIA works well for me:
TensorRT MNIST example
Now I have created my own, very simple network in TensorFlow for upscaling images, say from 320x240x3 to 640x480x3 (in HWC). The usual approach of creating a frozen graph and running an inferencer based purely on TensorFlow gave me the expected results, but running the model through TensorRT did not.
I have a feeling that I did something wrong when feeding the images into GPU memory (which would probably be an issue with pycuda and/or TensorRT).
The worst-case scenario would be that TensorRT breaks my network during its optimization process.
I hope someone has at least a small idea that could save me.
This is my TensorFlow model (I just wrapped the functions):
net = conv2d(input,
             64,
             k_size=3,
             activation=tf.nn.relu,
             name='conv1')
net = deconv2d(net,
               3,
               k_size=5,
               activation=tf.tanh,
               stride=self.params.resize_factor,
               scale=self.params.resize_factor,
               name='deconv')
This is the important snippet of my inferencer:
import tensorrt as trt
import uff
from tensorrt.parsers import uffparser
import pycuda.driver as cuda
import numpy as np
...
def _init_infer(self, uff_model):
    g_logger = trt.infer.ConsoleLogger(trt.infer.LogSeverity.ERROR)
    parser = uffparser.create_uff_parser()
    parser.register_input(self.input_node, (self.channels, self.height, self.width), 0)
    parser.register_output(self.output_node)
    self.engine = trt.utils.uff_to_trt_engine(g_logger, uff_model, parser, self.max_batch_size,
                                              self.max_workspace_size)
    parser.destroy()
    self.runtime = trt.infer.create_infer_runtime(g_logger)
    self.context = self.engine.create_execution_context()
    self.output = np.empty(self.output_size, dtype=self.dtype)
    # create CUDA stream
    self.stream = cuda.Stream()
    # allocate device memory
    self.d_input = cuda.mem_alloc(self.channels * self.max_batch_size * self.width *
                                  self.height * self.output.dtype.itemsize)
    self.d_output = cuda.mem_alloc(self.output_size * self.output.dtype.itemsize)
    self.bindings = [int(self.d_input), int(self.d_output)]
def infer(self, input_batch, batch_size=1):
    # transfer input data to device
    cuda.memcpy_htod_async(self.d_input, input_batch, self.stream)
    # execute model
    self.context.enqueue(batch_size, self.bindings, self.stream.handle, None)
    # transfer predictions back
    cuda.memcpy_dtoh_async(self.output, self.d_output, self.stream)
    # synchronize threads
    self.stream.synchronize()
    return self.output
And the executable snippet:
...
# create trt inferencer
trt_inferencer = TensorRTInferencer(params=params)
img = [misc.imread('./test_images/lion.png')]
img[0] = normalize(img[0])
img = img[0]
# inferencing method
result = trt_inferencer.infer(img)
result = inormalize(result, dtype=np.uint8)
result = result.reshape(1, params.height * 2, params.width * 2, 3)
...
And here is the weird result, for comparison :(
upscaled lion TensorRT, Tensorflow, Original
I finally got it. The problem was the wrong dimension order of the input images and of the output. For anyone who runs into the same problem, this is the adapted executable snippet, which depends on my initialization:
...
# create trt inferencer
trt_inferencer = TensorRTInferencer(params=params)
img = [misc.imread('./test_images/lion.png')]
img[0] = normalize(img[0])
img = img[0]
img = np.transpose(img, (2, 0, 1))
img = img.ravel()
# inferencing method
result = trt_inferencer.infer(img)
result = inormalize(result, dtype=np.uint8)
result = np.reshape(result, newshape=[3, params.height * 2, params.width * 2])
result = np.transpose(result, (1, 2, 0))
...
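The layout conversion in the fix can be checked on its own with plain NumPy, without TensorRT or a GPU. The sketch below assumes the engine expects a flat buffer in channel-first (CHW) order, as registered in `register_input` above, and returns its output the same way. The helper names `hwc_to_flat_chw` and `flat_chw_to_hwc` are hypothetical, and the tiny array is a stand-in for a real image.

```python
import numpy as np


def hwc_to_flat_chw(img):
    """Reorder an (H, W, C) image to (C, H, W) and flatten it to 1-D."""
    chw = np.transpose(img, (2, 0, 1))
    # ravel() copies the transposed view into a contiguous C-order buffer
    return chw.ravel()


def flat_chw_to_hwc(flat, height, width, channels=3):
    """Undo the conversion: flat CHW buffer back to an (H, W, C) image."""
    chw = np.reshape(flat, (channels, height, width))
    return np.transpose(chw, (1, 2, 0))


# Tiny stand-in image with H=2, W=3, C=3, so every value is unique.
img = np.arange(2 * 3 * 3, dtype=np.float32).reshape(2, 3, 3)

flat = hwc_to_flat_chw(img)
restored = flat_chw_to_hwc(flat, height=2, width=3)
```

Reshaping the flat output directly to `(H, W, C)` would keep the same number of elements but interleave the channels incorrectly, which is one way to end up with a scrambled image like the one in the question.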