Is there a cost-sensitive loss function implementation in PyTorch?


I would like to implement a cost-sensitive loss function in PyTorch. My two-class training dataset is heavily imbalanced, where 75% of the data are label '0' and only 25% of the data are label '1'.
I am new to PyTorch but my supervisor is adamant that I use it (they have more experience with it).
I found some implementations in Keras, but I am not a strong enough coder to port them to PyTorch.
I have read around to find some resources to create a cost-sensitive loss function.
This paper uses something which I think might work (https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9417097), but I do not understand how the code is implemented despite having access to it here (https://github.com/emadeldeen24/AttnSleep/blob/f993511426900f9fca20594a738bf8bee1116381/utils/util.py).
This website describes the math in great detail, but I do not understand it: https://medium.com/rv-data/how-to-do-cost-sensitive-learning-61848bf4f5e7
Here is an implementation in Keras which I have trouble converting to PyTorch: https://towardsdatascience.com/fraud-detection-with-cost-sensitive-machine-learning-24b8760d35d9
I also found this implementation in PyTorch, but I have trouble understanding it: https://discuss.pytorch.org/t/dealing-with-imbalanced-datasets-in-pytorch/22596/21
Could you please help me to understand the last link's implementation of the cost-sensitive loss function?
Thank you.
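For readers with the same question, here is a minimal sketch of one common form of cost-sensitive loss: class-weighted cross-entropy. It is written in plain Python so the arithmetic is visible, and it is not the code from the linked forum post. The inverse-frequency weights and the example logits are assumptions chosen for illustration. In PyTorch, the `weight` argument of `torch.nn.CrossEntropyLoss` (or `pos_weight` of `torch.nn.BCEWithLogitsLoss`) applies per-class weights in a similar way.

```python
import math

def weighted_cross_entropy(logits, targets, class_weights):
    """Cross-entropy where each sample is scaled by the weight of its true class.

    The sum is divided by the total weight of the samples (a weighted mean).
    """
    total, weight_sum = 0.0, 0.0
    for row, target in zip(logits, targets):
        # Numerically stable log-softmax: subtract the row maximum first.
        m = max(row)
        log_norm = m + math.log(sum(math.exp(z - m) for z in row))
        log_prob = row[target] - log_norm
        w = class_weights[target]
        total += -w * log_prob
        weight_sum += w
    return total / weight_sum

# Assumed heuristic: weight each class by the inverse of its frequency,
# using the 75% / 25% split described in the question.
class_weights = [1.0 / 0.75, 1.0 / 0.25]

# Made-up logits for four samples (two scores per sample, one per class).
logits = [[2.0, 0.5], [0.2, 1.5], [1.0, 1.0], [3.0, -1.0]]
targets = [0, 1, 1, 0]
loss = weighted_cross_entropy(logits, targets, class_weights)
```

With these weights, a mistake on a label '1' sample costs three times as much as the same mistake on a label '0' sample, which pushes the model to pay more attention to the minority class.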

Related

Suggestions for nonparametric machine learning models

I am new to machine learning, but I have decent experience in Python. I am faced with a problem: I need to find a machine learning model that would work well to predict the speed of a boat given current environmental and physical conditions. I have looked into scikit-learn, PyTorch, and TensorFlow, but I am having trouble finding information on what type of model I should use. I am almost certain that linear regression models would be useless for this task. I have been told that non-parametric regression models would be ideal for this, but I am unable to find many in the scikit-learn library. Should I be trying to use regression models at all, or should I be looking more into neural networks? I'm open to any suggestions, thanks in advance.

I think a multiple linear regression model would work well for your case. I am assuming that the input data is just a set of environmental parameters and that you have a boat speed corresponding to each set. For such problems, regression usually works well. I would not recommend using neural networks unless you have a lot of training data and each input sample is also quite large.
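Since the question asks specifically about non-parametric models, here is a minimal sketch of one of them, k-nearest-neighbors regression, in plain Python. The feature names (wind speed, wave height) and all numbers are made up for illustration. scikit-learn provides this model as `KNeighborsRegressor`, and tree ensembles such as `RandomForestRegressor` are another non-parametric option.

```python
def knn_predict(train_x, train_y, query, k=3):
    """Predict a value as the mean target of the k closest training points."""
    # Squared Euclidean distance from the query to every training point.
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, query)), y)
        for x, y in zip(train_x, train_y)
    )
    nearest = dists[:k]
    return sum(y for _, y in nearest) / len(nearest)

# Hypothetical data: [wind speed, wave height] -> boat speed.
train_x = [[5.0, 0.5], [10.0, 1.0], [15.0, 1.5], [20.0, 2.5]]
train_y = [4.0, 6.5, 8.0, 7.0]

speed = knn_predict(train_x, train_y, [10.0, 1.0], k=1)
```

In practice the features should be scaled first, because an unscaled feature with a large range dominates the distance.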

Tensorflow - What does Training and Prediction mode mean when making a model?

So, I've googled prior to asking this, obviously, however, there doesn't seem to be much mention on these modes directly. Tensorflow documentation mentions "test" mode in passing which, upon further reading, didn't make very much sense to me.
From what I've gathered, my best guess is that, to reduce RAM usage, a model in prediction mode just uses pretrained weights to make predictions based on your input?
If someone could help with this and help me understand, I would be extremely grateful.

Training refers to the part where your neural network learns. By learning I mean how your model changes its weights to improve its performance on a task given a dataset. This is achieved using the backpropagation algorithm.
Predicting, on the other hand, does not involve any learning. It is only to see how well your model performs after it has been trained. There are no changes made to the model when it is in prediction mode.
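The difference can be sketched with a one-variable linear model in plain Python. This is an illustration under simplifying assumptions, not TensorFlow code: the gradients are written out by hand instead of being computed by backpropagation, and the data and learning rate are made up.

```python
def predict(w, b, x):
    # Prediction only reads the weights; it never changes them.
    return w * x + b

def train_step(w, b, xs, ys, lr=0.05):
    # One gradient-descent step on the mean squared error.
    n = len(xs)
    grad_w = sum(2 * (predict(w, b, x) - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum(2 * (predict(w, b, x) - y) for x, y in zip(xs, ys)) / n
    return w - lr * grad_w, b - lr * grad_b

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # generated from y = 2x + 1

# Training: the weights are updated repeatedly.
w, b = 0.0, 0.0
for _ in range(2000):
    w, b = train_step(w, b, xs, ys)

# Prediction: the trained weights are used as-is on new input.
y_new = predict(w, b, 4.0)
```

In frameworks, prediction mode also typically switches layers such as dropout and batch normalization to their inference behavior.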

In Keras, can I use an arbitrary algorithm as a loss function for a network?

I have been trying to understand this machine learning problem for many days now and it really confuses me. I need some help.
I am trying to train a neural network whose input is an image, and which generates another image as output (it is not a very large image, it is 8x8 pixels). And I have an arbitrary fancy_algorithm() "black box" function that receives the input and prediction of the network (the two images) and outputs a float number that tells how good the output of the network was (calculates a loss). My problem is that I want to train THIS neural network but using the loss generated by the black box algorithm. This problem is confusing me, I researched a lot and I didn't find much about it, it seems like reinforcement learning, but at the same time I'm not sure because it’s not like an agent, but it has some kind of reinforcement at the same time.
In case you need more details to help me just ask. Thanks in advance!

Alright, question solved. This is a reinforcement learning problem. I can't use gradient-based optimization on my black-box loss function, which doesn't have any gradient. More details here: https://www.reddit.com/r/tensorflow/comments/gekotd/can_i_use_an_arbitrary_algorithm_as_a_loss/
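To illustrate optimizing without gradients, here is a minimal sketch using random-search hill climbing. It is a much simpler technique than a full reinforcement learning algorithm, but it relies on the same idea of only querying the loss value. `black_box_loss` is a hypothetical stand-in for fancy_algorithm(): it takes the parameters directly rather than an input image and a prediction, and its formula is made up.

```python
import random

def black_box_loss(params):
    # Hypothetical stand-in for fancy_algorithm(): returns a float score
    # and exposes no gradient to the optimizer.
    return (params[0] - 3.0) ** 2 + (params[1] + 1.0) ** 2

def random_search(loss_fn, params, steps=2000, sigma=0.1, seed=0):
    """Perturb the parameters at random and keep only improvements."""
    rng = random.Random(seed)
    best, best_loss = list(params), loss_fn(params)
    for _ in range(steps):
        candidate = [p + rng.gauss(0.0, sigma) for p in best]
        candidate_loss = loss_fn(candidate)
        if candidate_loss < best_loss:
            best, best_loss = candidate, candidate_loss
    return best, best_loss

initial = [0.0, 0.0]
initial_loss = black_box_loss(initial)
best, final_loss = random_search(black_box_loss, initial)
```

This scales poorly to networks with many weights, which is one reason evolution strategies or policy-gradient methods are usually preferred for larger problems.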

Where can I find the algorithm behind model.predict?

I would like to implement the code for model.predict (https://keras.io/models/model/) in C++, but I am unable to find the exact logic (equations, formulas) used in prediction.
For C++, I implemented the source code here: https://github.com/Dobiasd/frugally-deep
but unfortunately could not find the equation behind the predict function. (Frugally deep exports the model as a .json file and does the prediction using the predict function).
Would there be any resources that I could refer to find the equations for model.predict?

model.predict implements a forward pass of the model, so there is no single equation; the computation follows from the computation graph of the model.
To implement the same behavior, you have to do a forward pass through the layers of the model, where each layer implements its own computation. So it's not a simple matter of "use equation X": it's a large set of formulas that you have to implement, one for each kind of layer.
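As a rough illustration of what such a forward pass looks like, here is a minimal sketch for a stack of fully connected layers in plain Python. The weights, biases, and layer sizes are made up. The weight layout (one row per input, one column per output unit) follows the kernel shape of a Keras Dense layer. Other layer types (convolution, pooling, and so on) each need their own function.

```python
def relu(values):
    return [max(0.0, v) for v in values]

def dense(inputs, weights, biases, activation=None):
    # weights[i][j] connects input i to output unit j.
    outputs = [
        sum(inputs[i] * weights[i][j] for i in range(len(inputs))) + biases[j]
        for j in range(len(biases))
    ]
    return activation(outputs) if activation else outputs

def forward(x, layers):
    # Feed the output of each layer into the next one.
    for weights, biases, activation in layers:
        x = dense(x, weights, biases, activation)
    return x

# Hypothetical two-layer model: 2 inputs -> 2 hidden units (ReLU) -> 1 output.
layers = [
    ([[1.0, -1.0], [0.5, 2.0]], [0.0, 1.0], relu),
    ([[2.0], [1.0]], [0.5], None),
]
prediction = forward([1.0, 2.0], layers)
```

A C++ port would follow the same structure: load the weights exported from the model, then apply one function per layer in order.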

Looking at the repo, it appears you're looking for this.

optimizers other than GradientDescentOptimizer tensorflow having zero for gradient value

I am really at wit's end and don't know where else I can ask so I am asking here. I know that my question may not be of the best quality but I am hoping for at least some guidance on the direction I should look to figure out my problem.
I am replicating scikit-learn's implementation of Elastic Net multiple linear regression in TensorFlow and TensorBoard as a learning exercise, so that I can eventually move on to implementing and visualizing more difficult machine learning algorithms.
I have some code that does a multiple linear regression with Elastic Net regularization in the loss function. With gradient descent, it converges to a suboptimal solution compared to scikit-learn's algorithm. Through some searching, I learned that scikit-learn initializes weights using the Xavier method, so I did that in TensorFlow as well. Performance improved slightly but was still nowhere close to sklearn. My next improvement was to change the optimizer to try to match performance, although my research told me that scikit-learn uses coordinate descent, a method that isn't implemented in TensorFlow.
However, this is where I am stuck. Simply swapping one optimizer for another does not seem to work (not that I expected it to, but I'm also having trouble finding material that tells me how to set it up properly). Currently I've simply performed the switch the following way. Can anyone give me a hint as to why my gradients are 0?
Thanks!
# Declare optimizer
my_opt = tf.train.GradientDescentOptimizer(0.001)
my_opt = tf.train.AdamOptimizer(epsilon = 0.1)
[Image: histogram of gradients]
[Image: loss curve showing that the Adam optimizer isn't doing anything]
EDIT:
I have updated my learning rate to be higher, but convergence still doesn't seem that great. I think I will proceed to try to implement coordinate descent in TensorFlow to match scikit-learn's method as closely as possible. I've attached an image of the difference for those curious:
[Image: loss curve with Adam, and the same in comparison to SGD]
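One possible contributing factor, offered as a sketch and not as a confirmed diagnosis: Adam normalizes each update by the gradient magnitude, so its step size is roughly capped at the learning rate, while a plain gradient-descent step scales with the gradient. The code below uses the textbook form of the Adam update for the first step; TensorFlow's implementation places epsilon slightly differently. The gradient value of 100 is an assumption chosen for illustration.

```python
import math

def adam_first_step(grad, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Size of the first Adam update for a scalar gradient (textbook form)."""
    m = (1 - beta1) * grad
    v = (1 - beta2) * grad ** 2
    m_hat = m / (1 - beta1)  # bias correction recovers grad
    v_hat = v / (1 - beta2)  # bias correction recovers grad ** 2
    return lr * m_hat / (math.sqrt(v_hat) + eps)

def sgd_step(grad, lr=0.001):
    return lr * grad

large_grad = 100.0  # assumed value, e.g. from unscaled regression inputs
adam_update = adam_first_step(large_grad, eps=0.1)  # just under 0.001
sgd_update = sgd_step(large_grad)                   # about 0.1
```

With the same learning rate of 0.001, the gradient-descent step here is about a hundred times larger than the Adam step, which would be consistent with the loss barely moving until the learning rate is raised. It does not by itself explain gradients that are exactly zero in the histogram.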
