Is it possible to make PyTorch distributions create their samples directly on the GPU?
If I do
from torch.distributions import Uniform, Normal
normal = Normal(3, 1)
sample = normal.sample()
Then sample will be on the CPU. Of course it is possible to do sample = sample.to(torch.device("cuda")) to move it to the GPU afterwards. But is there a way to have the sample created directly on the GPU, without first creating it on the CPU?
PyTorch distributions inherit from object, not nn.Module, so they do not have a .to() method that would move the distribution instance to the GPU.
Any ideas?
A distribution draws its samples on the same device as its parameter tensors. Thus, passing zero-dimensional tensors that already live on the GPU to the distribution constructor works, as follows:
normal = Normal(torch.tensor(0.).to(device=torch.device("cuda")), torch.tensor(1.).to(device=torch.device("cuda")))
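For example, a minimal sketch (assuming a CUDA device is available), using the same parameters as the question:

import torch
from torch.distributions import Normal

device = torch.device("cuda")
normal = Normal(torch.tensor(3., device=device), torch.tensor(1., device=device))
sample = normal.sample()   # created directly on the GPU
print(sample.device)       # cuda:0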
In my case, I'm using a normal distribution as the prior in a neural-net model. I have a class, call it class1, and in its __init__ I have to initialize my prior. However, calling .to('cuda') on an instance of class1 doesn't change the distribution's device and causes errors in later usage. I could therefore have used registered buffers to manage it, as follows.
class class1(nn.Module):
    def __init__(self):
        super().__init__()
        # buffers move together with the module on .to()/.cuda()
        self.register_buffer("mean", torch.tensor(0.))
        self.register_buffer("var", torch.tensor(1.))

    def get_dist(self):
        # note: Normal's second argument is the scale (standard deviation)
        return torch.distributions.Normal(self.mean, self.var)
However, I have several priors, and it's not possible to register a list with register_buffer. So one option is to construct the distributions inside get_dist on every call, if you don't care about the cost of re-creating them each time. I decided instead to define a function that initializes the distributions, plus a try/except in get_dist to handle the different states: if the distributions attribute is not assigned yet, or is on the CPU while we expect it on the GPU, control jumps to the except branch, where I re-initialize the distributions using torch.zeros(..).to(device). A sketch of this pattern follows.
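A minimal sketch of that lazy re-initialization (the names _init_priors and n_priors, and the single-element prior shape, are placeholders I'm adding for illustration, not from the original code):

class class1(nn.Module):
    def __init__(self, n_priors=3):
        super().__init__()
        self.n_priors = n_priors
        # self.priors is deliberately not assigned here

    def _init_priors(self, device):
        # rebuild the list of priors with parameter tensors on the target device
        self.priors = [
            torch.distributions.Normal(
                torch.zeros(1).to(device), torch.ones(1).to(device)
            )
            for _ in range(self.n_priors)
        ]

    def get_dist(self, device):
        try:
            # rebuild if the priors exist but sit on the wrong device
            if self.priors[0].loc.device.type != torch.device(device).type:
                self._init_priors(device)
        except AttributeError:
            # priors were never assigned: build them now
            self._init_priors(device)
        return self.priors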
Overall, to avoid this CPU/GPU device error, you need to construct a distribution from tensor parameters that are already on the appropriate device. The root cause is that torch.distributions.Distribution unfortunately has no device attribute.
I just came across the same problem, and thanks to the other answers here for the pointers. I want to offer another option if you want a distribution inside a module: override the module's to method and manually call to on the distribution's parameter tensors. I've only tested this with Uniform, but it works well here.
class MyModule(nn.Module):
    def __init__(self, ...):
        super().__init__()
        self.rng = Uniform(
            low=torch.zeros(3),
            high=torch.ones(3)
        )

    def to(self, *args, **kwargs):
        super().to(*args, **kwargs)
        # move the distribution's parameter tensors by hand
        self.rng.low = self.rng.low.to(*args, **kwargs)
        self.rng.high = self.rng.high.to(*args, **kwargs)
        return self
Now you can put your model on the GPU as usual, and self.rng.sample() will return a sample on the correct device.
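A brief usage sketch (the constructor arguments are elided as in the snippet above):

model = MyModule(...).to(torch.device("cuda"))  # moves parameters and the rng tensors
sample = model.rng.sample()                     # sample lives on cuda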
You can solve the problem of transferring non-parameter/buffer attributes to the GPU by overriding the _apply(fn) method of your network, like this:
def _apply(self, fn):
    # apply fn() to your submodules
    for module in self.children():  # like 'ResNet_backbone'
        module._apply(fn)

    # apply fn() to your prior's tensors
    self.prior.attr1 = fn(self.prior.attr1)  # like 'MultivariateNormal.loc'; must be a Tensor
    self.prior.attr2 = fn(self.prior.attr2)
    # ...
    self.prior.attrN = fn(self.prior.attrN)

    # if we do not use register_buffer(Tensor), apply fn() to your
    # non-parameter/buffer attributes; these must be Tensors too
    self.attr1 = fn(self.attr1)
    self.attr2 = fn(self.attr2)
    # ...
    self.attrN = fn(self.attrN)

    return self  # nn.Module._apply returns self
Suppose I have the following functions:
from torch import device

def run_cuda(device: device, count: int):
    ...

def gen_noise(device: device, width: int, height: int):
    ...
If I were to call these, I have to:
device = DEVICE
run_cuda(device, count=8)
gen_noise(device, width=128, height=128)
What I'm trying to achieve is to somehow remove the repeated device argument for better readability. So I'd rather:
device = DEVICE
def use_device(device: device):
    ...
# multiline
device_user = use_device(device)
device_user.run_cuda(count=8)
device_user.gen_noise(width=128, height=128)
# inline
use_device(device).run_cuda(count=8)
use_device(device).gen_noise(width=128, height=128)
Obviously I could simply wrap the device class and manually define these methods on it, but it feels like it would be better to define a simple use_device(device) function that takes the device argument and passes it into the following function by 'calling' it.
Maybe something like so:
def use_device(device: device):
    # call arbitrary methods that have a 'device' argument
    return ExtensionMethodLikeCallable[[device, *args, **kwargs], any]()
Is this possible in Python?
Thank you.
Edit:
Justification for this implementation could be the fact that I could then:
use_device(device).use_dimension(512,512).use_iteration(8).gen_noise()
instead of:
gen_noise(device=device, width=512, height=512, iteration=8)
which one has 'better' readability is admittedly arguable, but let's say it's just my personal preference.
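One way to get close to this in plain Python is a small fluent wrapper that pre-binds keyword arguments and forwards attribute access to module-level functions. This is a sketch I'm adding for illustration, not from the original thread; DeviceUser is a hypothetical name, and the globals() lookup assumes run_cuda and gen_noise live in the same module:

import functools

class DeviceUser:
    """Forwards function calls with pre-bound keyword arguments."""
    def __init__(self, **bound):
        self._bound = bound

    def __getattr__(self, name):
        # look up a module-level function by name and pre-fill
        # the bound arguments (device, etc.)
        func = globals()[name]
        return functools.partial(func, **self._bound)

def use_device(device):
    return DeviceUser(device=device)

# both styles from the question now work:
device_user = use_device(device)
device_user.run_cuda(count=8)
device_user.gen_noise(width=128, height=128)

use_device(device).run_cuda(count=8)

Chainable setters like use_dimension(...) could be added the same way, each returning a new DeviceUser with more arguments bound.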
I'm trying to train a model in mixed precision. However, I want a few of the layers to be in full precision for stability reasons. How do I force an individual layer to be float32 when using torch.autocast? In particular, I'd like for this to be onnx compileable.
Is it something like:
with torch.cuda.amp.autocast(enabled=False, dtype=torch.float32):
    out = my_unstable_layer(inputs.float())
Edit:
Looks like this is indeed the official method. See the torch docs.
I think the motivation of torch.autocast is to automate the reduction of precision (not the increase).
If you have functions that need a particular dtype, you should consider using custom_fwd:
import torch

@torch.cuda.amp.custom_fwd(cast_inputs=torch.complex128)
def get_custom(x):
    print(' Decorated function received', x.dtype)

def regular_func(x):
    print(' Regular function received', x.dtype)
    get_custom(x)

x = torch.tensor(0.0, dtype=torch.half, device='cuda')

with torch.cuda.amp.autocast(False):
    print('autocast disabled')
    regular_func(x)

with torch.cuda.amp.autocast(True):
    print('autocast enabled')
    regular_func(x)
autocast disabled
Regular function received torch.float16
Decorated function received torch.float16
autocast enabled
Regular function received torch.float16
Decorated function received torch.complex128
Edit: Using torchscript
I am not sure how much you can rely on this, due to a comment in the documentation; however, the comment is apparently outdated.
Here is an example where I trace the model with autocast enabled, freeze it, and then use it; the value is indeed cast to the specified type:
class Cast(torch.nn.Module):
    @torch.cuda.amp.custom_fwd(cast_inputs=torch.float64)
    def forward(self, x):
        return x

x = torch.tensor(0.0, dtype=torch.half, device='cuda')

with torch.cuda.amp.autocast(True):
    model = torch.jit.trace(Cast().eval(), x)
    model = torch.jit.freeze(model)

print(model(x).dtype)
torch.float64
But I suggest you validate this approach before using it for a serious application.
I am working on a personal project to code a quadcopter simulation (and control) in Python, as a learning project. I am using the scipy integrator odeint and I am quite disappointed by the long computation time, so I wish to use numba to accelerate my integration. I call odeint every timestep, as I have to create commands after each simulated timestep.
At first, I had issues when my function to integrate (state_dot) was a method of the Quadcopter class, so I made it a separate function. But I am now having problems defining the right types when I decorate my function with @jit. The state_dot function takes a dictionary (params) as an input argument (I've read that numba supports dictionaries), as well as a custom class (wind), because my wind model is a method of that class. Even if I exclude the wind for now, using numba.typed.Dict doesn't seem to work for passing the dictionary.
To pass the wind object into the function, I've seen the numba type object_ being used, but Python doesn't find an object_ in numba.
I am using numba version 0.45.0, and Python 3.7.
import numpy as np
from scipy.integrate import odeint
from numba import jit, void, float_, int_
import numba

class Quadcopter:
    def __init__(self):
        # Quad Params
        # ---------------------------
        mB = 1.2  # mass (kg)

        params = {}
        params["mB"] = mB
        self.params = params

        # Initial State
        # ---------------------------
        self.state = np.zeros(3)

    def update(self, t, Ts, cmd, wind):
        self.state = odeint(state_dot, self.state, [t, t + Ts],
                            args=(cmd, self.params, wind))[1]

@jit(void(float_[:], float_, float_[:], numba.typed.Dict))  #(nopython = True)
def state_dot(state, t, cmd, params, wind):
    # Import Params
    # ---------------------------
    mB = params["mB"]

    # Import State Vector
    # ---------------------------
    x = state[0]
    y = state[1]
    z = state[2]

    # Motor Dynamics and Rotor forces (Second Order System:
    # https://apmonitor.com/pdc/index.php/Main/SecondOrderSystems)
    # ---------------------------
    print(cmd)

    # Wind Model
    # ---------------------------
    [velW, qW1, qW2] = wind.randomWind(t)
    print(velW)

    # State Derivative Vector
    # ---------------------------
    sdot = np.zeros(3)
    sdot[0] = x*t + 0.1
    sdot[1] = y*t + 0.1
    sdot[2] = z*t + 0.1

    return sdot

class Wind:
    def __init__(self):
        # Normally, average wind would be randomly set here
        self.velW_med = 5.0
        self.qW1_med = 0.2
        self.qW2_med = 0.1

    def randomWind(self, t):
        # Normally, wind values would be a sine function dependent on current time
        velW = self.velW_med
        qW1 = self.qW1_med
        qW2 = self.qW2_med
        return velW, qW1, qW2

# Set time
Ti = 0
Ts = 0.005
Tf = 10

# Initialize quadcopter and wind
quad = Quadcopter()
wind = Wind()

# Simulation
t = Ti
while round(t, 3) < Tf:
    cmd = np.array([1, 2, 1, 3])
    quad.update(t, Ts, cmd, wind)
    print(quad.state)
    t += Ts
The error received is
Traceback (most recent call last):
  File "c:/Users/JOHN-Laptop/Documents/Code Dev/Test/question_quad.py", line 29, in <module>
    @jit(void(float_[:], float_, float_[:], numba.typed.Dict )) #(nopython = True)
  File "C:\Users\JOHN-Laptop\AppData\Local\Programs\Python\Python37\lib\site-packages\numba\decorators.py", line 186, in wrapper
    disp.compile(sig)
  File "C:\Users\JOHN-Laptop\AppData\Local\Programs\Python\Python37\lib\site-packages\numba\compiler_lock.py", line 32, in _acquire_compile_lock
    return func(*args, **kwargs)
  File "C:\Users\JOHN-Laptop\AppData\Local\Programs\Python\Python37\lib\site-packages\numba\dispatcher.py", line 676, in compile
    args, return_type = sigutils.normalize_signature(sig)
  File "C:\Users\JOHN-Laptop\AppData\Local\Programs\Python\Python37\lib\site-packages\numba\sigutils.py", line 48, in normalize_signature
    check_type(ty)
  File "C:\Users\JOHN-Laptop\AppData\Local\Programs\Python\Python37\lib\site-packages\numba\sigutils.py", line 43, in check_type
    "instance, got %r" % (ty,))
TypeError: invalid type in signature: expected a type instance, got <class 'numba.typed.typeddict.Dict'>
The full code can be viewed here: https://github.com/bobzwik/Quadcopter_SimCon/blob/dev_numba/Simulation/quadFiles/quad.py
If I am missing any information, feel free to ask.
EDIT: Changed the link of the full code, to link to another branch.
The first thing I notice is that — at least in the code you've shown here — your jit signature has four types, but the function you're decorating has five arguments:
@jit(void(float_[:], float_, float_[:], numba.typed.Dict))
def state_dot(state, t, cmd, params, wind):
So obviously you need to fix that. The easiest thing to try is to just remove the signature and let numba figure it out:
@jit
def state_dot(state, t, cmd, params, wind):
Of course, even if you do this, numba still complains that it doesn't know how to type everything, and points to the line saying mB = params["mB"]. It still does "loop lifting", which means that it's able to compile some things, but it won't be as fast as possible.
So the second thing to note is that while numba says it supports dicts, it throws in a lot of caveats. Basically, using a dict is still not a good idea. I also don't see any good reason for you to use a dict here. Why not just make mB a member of your class, as in self.mB = mB? I know you'll have more complicated things in your full Quadcopter class, but you can have lots of members.
Now, the third thing to note is that numba has gotten much better since I wrote that gist you pointed out elsewhere, and can now handle classes, so you might want to look into numba.jitclass. Generally, when you pass a jitclass object to a function you're trying to jit, numba will know how to handle it. A sketch for your Wind class follows.
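For instance, a jitclass version of your Wind class could look like this (a sketch, not tested against your full code; on numba 0.45 jitclass is importable directly from numba, while newer releases moved it to numba.experimental):

from numba import float64
from numba.experimental import jitclass  # on numba 0.45: from numba import jitclass

spec = [
    ("velW_med", float64),
    ("qW1_med", float64),
    ("qW2_med", float64),
]

@jitclass(spec)
class Wind:
    def __init__(self):
        self.velW_med = 5.0
        self.qW1_med = 0.2
        self.qW2_med = 0.1

    def randomWind(self, t):
        # jitted functions that receive this object can call this method
        return self.velW_med, self.qW1_med, self.qW2_med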
But perhaps more important than all of these is that your update method calls odeint for every single step. I would guess that this is the slowest part of your code. That function is meant to be called once, so that it can solve your whole problem from start to finish; it therefore has lots of (relatively slow) overhead for parsing the arguments you've passed, allocating memory, initializing things, etc. A better way is to construct a scipy.integrate.ode object once, so that everything stays set up between steps, and reuse the same object from step to step. The newer interfaces solve_ivp and RK45 (and similar) are roughly equivalent to odeint and ode, respectively, except that ode has my preferred solver dop853. If you only need one of the OdeSolver subclasses, you might prefer those interfaces. Also note that if you actually change anything in your state between steps, you may need to call set_initial_value again, or things might go wrong without you noticing. A sketch of the reusable-solver pattern is below.
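Here is a minimal sketch of that pattern, adapted to the simulation loop from the question (note that scipy.integrate.ode expects f(t, y), the reverse of odeint's f(y, t) argument order):

from scipy.integrate import ode

cmd = np.array([1, 2, 1, 3])

# build the solver once, outside the loop
solver = ode(lambda t, y: state_dot(y, t, cmd, quad.params, wind))
solver.set_integrator("dop853")
solver.set_initial_value(quad.state, Ti)

t = Ti
while solver.successful() and round(t, 3) < Tf:
    quad.state = solver.integrate(t + Ts)  # advance the same solver one step
    t += Ts
    # if quad.state is changed externally here, call
    # solver.set_initial_value(quad.state, t) to resynchronize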
More generally, if you're worried about speed, the best thing you can do is to profile your code. The first step here is just to use %prun in IPython.
I am attempting to write a wrapper for the statsmodels formula API (this is a simplified version, the function does more than this):
import statsmodels.formula.api as smf
def wrapper(formula, data, **kwargs):
    return smf.logit(formula, data).fit(**kwargs)
If I give this function to a user, who then attempts to define his/her own function:
def square(x):
    return x**2
model = wrapper('y ~ x + square(x)', data=df)
they will receive a NameError because the patsy module is looking in the namespace of wrapper for the function square. Is there a safe, Pythonic way to handle this situation without knowing a priori what the function names are or how many functions will be needed?
FYI: This is for Python 3.4.3.
statsmodels uses the patsy package to parse the formulas and create the design matrix. patsy allows user functions as part of formulas, and obtains or evaluates the user functions in the user's namespace or environment.
As a reference, see the eval_env keyword in http://patsy.readthedocs.org/en/latest/API-reference.html
from_formula is the method of models that implements the formula interface to patsy. It uses eval_env to provide the necessary information to patsy, which by default is the calling environment of the user. This can be overridden by the user with the corresponding keyword argument.
The simplest way to define the eval_env is as an integer that indicates the stacklevel that patsy should use. from_formula increments it to account for the additional level in the statsmodels methods.
According to the comments, eval_env = 2 will use the next higher level from the level that creates the model, e.g. with model = smf.logit(..., eval_env=2).
This creates the model, calls patsy, and creates the design matrix; model.fit() will estimate it and return the results instance.
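Applied to the wrapper from the question, that would look something like this (a sketch; eval_env=2 follows the comment above, but the right integer depends on how many frames sit between the user's code and the from_formula call):

import statsmodels.formula.api as smf

def wrapper(formula, data, **kwargs):
    # eval_env=2 tells patsy to look up names (like the user's square
    # function) two levels up the stack, i.e. in wrapper's caller
    return smf.logit(formula, data, eval_env=2).fit(**kwargs)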
If you are willing to use eval to do the heavy lifting of your function, you can construct a namespace from the arguments to wrapper and the local variables of the outer frame:
import sys
import statsmodels.formula.api as smf

wrapper_code = compile("smf.logit(formula, data).fit(**kwargs)",
                       "<WrapperFunction>", "eval")

def wrapper(formula, data, **kwargs):
    # grab the caller's locals so user-defined functions are visible
    outer_frame = sys._getframe(1)
    namespace = dict(outer_frame.f_locals)
    namespace.update(formula=formula, data=data, kwargs=kwargs, smf=smf)
    return eval(wrapper_code, namespace)
I don't really see this as a cheat, since it seems to be what logit is doing anyway for it to raise a NameError. As long as wrapper_code is not modified and there are no name conflicts (like the user defining something called data), this should do what you want.
I manage a fairly large Python-based quantum chemistry suite, PyQuante. I'm currently struggling with how to set various defaults so that users can choose among different options at runtime.
For example, I have three different methods for computing electron repulsion integrals. Let's call them a, b, and c. I used to simply pick the one I liked best (say, c) and have that hard-wired into the module that computes these integrals.
I have now modified this to use a module, Defaults.py, that contains all such hard-wires. But this is set at compile/install time. I would now like users to be able to override these options at runtime, say, using a .pyquanterc.py file.
In my integral routines, I currently have something like
from Defaults import integral_method
I know about dictionaries, and the .update() method. But I don't know how I would use this in real life. My defaults module looks like
integral_method = c
Should I modify the end of Defaults.py to look for a .pyquanterc.py file and override these values? E.g.
if os.path.exists(os.path.expanduser('~/.pyquanterc.py')): do_something
If so, what should do_something look like?
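One common pattern for do_something (a sketch I'm adding for illustration; exec-ing an rc file into the module namespace is a general convention, not PyQuante's documented behavior) is to execute the rc file, if present, at the end of Defaults.py so its assignments override the hard-wired values:

# at the end of Defaults.py
import os

integral_method = c  # hard-wired default, as in the question

rcfile = os.path.expanduser('~/.pyquanterc.py')
if os.path.exists(rcfile):
    with open(rcfile) as f:
        # assignments in the rc file (e.g. integral_method = a)
        # rebind the module-level names defined above
        exec(f.read(), globals())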
With your current setup, the user can change the default functions in his scripts quite easily:
import Defaults
Defaults.integral_method = somefunc
If the user adds this to his script, all your modules that use integral_method from Defaults will use somefunc to calculate integrals. Note that this only works if your modules look the function up as Defaults.integral_method at call time; a module that does from Defaults import integral_method copies the binding once at import time and won't see the override. A small sketch of the difference follows.
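For example (compute_integrals is a hypothetical caller, added here for illustration):

import Defaults

def compute_integrals():
    # looks up the current value on every call, so runtime overrides are seen
    method = Defaults.integral_method
    return method()

# fragile alternative: copies the binding once, at import time,
# and will not see later overrides
from Defaults import integral_method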
I might do this via a factory class.
class IntegralSolver:
    """
    Factory class containing methods for solving integrals.

    >>> solver = IntegralSolver("method1")
    >>> solver(x)
    # solution via method1

    Can also be used directly:

    >>> IntegralSolver.method2(x)
    # solution via method2
    """
    def __init__(self, method):
        # special methods are looked up on the class, not the instance,
        # so store the chosen method and dispatch through __call__
        self._method = getattr(self, method)

    def __call__(self, x):
        return self._method(x)

    @staticmethod
    def method1(x):
        return method1_solution

    @staticmethod
    def method2(x):
        return method2_solution
It really depends on how your users run the toolset. If they twiddle the Python code each time, just setting a block at the top labeled OPTIONS should be good. If they run it from the command line, use the argparse library to let them switch options there. Perhaps have it read the options out of a file with configparser: read a default file with your options and, if the user has one, an additional file with their overrides.
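A minimal sketch of that layered approach (the filenames, section, and option names are made up for illustration):

import argparse
import configparser
import os

config = configparser.ConfigParser()
# later files override earlier ones
config.read(["defaults.cfg", os.path.expanduser("~/.pyquanterc")])

parser = argparse.ArgumentParser()
parser.add_argument("--integral-method",
                    default=config.get("integrals", "method", fallback="c"))
args = parser.parse_args()

print(args.integral_method)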