I tried to change the device used in a Theano-based program:
from theano import config
config.device = "gpu1"
However, I got this error:
Exception: Can't change the value of this config parameter after initialization!
What is the best way to change the device from gpu to gpu1 in code?
Thanks
Another possibility that worked for me was setting the environment variable in the process before importing Theano:
import os
os.environ['THEANO_FLAGS'] = "device=gpu1"
import theano
There is no way to change this value in code running in the same process. The best you could do is to have a "parent" process that alters, for example, the THEANO_FLAGS environment variable and spawns children. However, the method of spawning will determine which environment the children operate in.
Note also that there is no way to do this in a way that maintains a process's memory through the change. You can't start running on CPU, do some work with values stored in memory then change to running on GPU and continue running using the values still in memory from the earlier (CPU) stage of work. The process must be shutdown and restarted for a change of device to be applied.
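That parent-process approach can be sketched as follows. This is only a sketch: the key point is that the child process sees THEANO_FLAGS before it ever imports Theano, while the parent's own (already initialized) configuration is untouched.

```python
import os
import subprocess
import sys

def spawn_with_device(device, args):
    """Run a child Python interpreter whose environment has THEANO_FLAGS
    set; the device is fixed for that child when it imports Theano."""
    env = dict(os.environ, THEANO_FLAGS="device=" + device)
    return subprocess.run([sys.executable] + args, env=env,
                          capture_output=True, text=True)

# For a real job you would pass e.g. ["train.py"] (a hypothetical script);
# here the child simply echoes the flag it inherited:
out = spawn_with_device(
    "gpu1",
    ["-c", "import os; print(os.environ['THEANO_FLAGS'])"],
).stdout.strip()
```

Because each device choice lives in its own process, you can spawn one child per GPU without the "can't change after initialization" restriction.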
As soon as you import theano the device is fixed and cannot be changed within the process that did the import.
Remove the "device" config in .theanorc, then in your code:
import theano.sandbox.cuda
theano.sandbox.cuda.use("gpu0")
It works for me.
https://groups.google.com/forum/#!msg/theano-users/woPgxXCEMB4/l654PPpd5joJ
I'm running the following code in Google Colab (and in a Kaggle notebook).
When running it without pdb.set_trace(), everything works fine.
However, when using pdb.set_trace() and calling "continue/exit", it seems that the array is still stored in memory (memory consumption remains high, by the same size as the array).
from pdb import set_trace  # also tried ipdb and IPython.core.debugger

def ccc():
    aaa = list(range(50000000))
    set_trace()

ccc()
Any ideas?
Thanks in advance.
EDIT
This also occurs when stopping the code execution manually (i.e., with a KeyboardInterrupt).
I have access to a server with multiple GPUs that can be used simultaneously by many users.
I choose only one gpu_id from the terminal and have code like this:
device = "cuda:"+str(FLAGS.gpu_id) if torch.cuda.is_available() else "cpu"
where FLAGS is a parser, parsing arguments from terminal.
Even though I select only one id, I saw that I was using 2 different GPUs. That causes issues when the other GPU's memory is almost full and my process terminates with a "CUDA out of memory" error.
I want to understand, what could be the possible cases for such thing to happen?
It is hard to tell what is wrong without knowing how you use the device parameter. In any case, you can try to achieve what you want with a different approach. Run your Python script in the following way:
CUDA_VISIBLE_DEVICES=0 python3 my_code.py
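If you would rather not type the variable on every launch, the same masking can be done from inside the script, provided it happens before the first CUDA-touching import. This is a sketch: the physical GPU id is a placeholder, and the guarded import only serves to keep the snippet runnable on machines without PyTorch.

```python
import os

# Mask all but one physical GPU; this must happen before torch (or any
# other CUDA library) initializes. After masking, the one remaining
# visible GPU is always addressed as "cuda:0".
os.environ["CUDA_VISIBLE_DEVICES"] = "2"   # hypothetical physical GPU id

def pick_device():
    try:
        import torch
        if torch.cuda.is_available():
            return "cuda:0"  # index 0 *within the visible set*
    except ImportError:
        pass
    return "cpu"

device = pick_device()
```

With the mask in place, the process cannot touch the other GPUs at all, so stray allocations on a second device become impossible rather than merely unlikely.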
I am using the joblib library to run multiple neural networks on my multiple CPUs at once; the idea is to make a final prediction as the average of all the different networks' predictions. I use Keras with Theano as the backend.
My code works if I set n_jobs=1 but fails for anything greater than 1.
Here is the error message:
[Parallel(n_jobs=3)]: Using backend ThreadingBackend with 3 concurrent workers.
Using Theano backend.
WARNING (theano.gof.compilelock): Overriding existing lock by dead process '6088' (I am process '6032')
WARNING (theano.gof.compilelock): Overriding existing lock by dead process '6088' (I am process '6032')
The code I use is rather simple (it works for n_jobs=1):
from joblib import Parallel, delayed
result = Parallel(n_jobs=1,verbose=1, backend="threading")(delayed(myNNfunction)(arguments,i,X_train,Y_train,X_test,Y_test) for i in range(network))
For information (I don't know if this is relevant), this is my parameters for keras:
os.environ['KERAS_BACKEND'] = 'theano'
os.environ["MKL_THREADING_LAYER"] = "GNU"
os.environ['MKL_NUM_THREADS'] = '3'
os.environ['GOTO_NUM_THREADS'] = '3'
os.environ['OMP_NUM_THREADS'] = '3'
I have tried to use the technique proposed here, but it didn't change a thing. To be precise, I created the file C:\Users\myname\.theanorc with this in it:
[global]
base_compiledir=/tmp/%(user)s/theano.NOBACKUP
I've read somewhere (I can't find the link, sorry) that on Windows machines the file should be named .theanorc rather than .theanorc.txt; in any case, it doesn't work.
Would you know what I am missing?
I've got a script that uses the resource module from Python (see http://docs.python.org/library/resource.html for information). Now I want to port this script to Windows. Is there any alternative version of this module (the Python docs label it as "Unix only")?
If there isn't, is there any other workaround?
I'm using the following method and constant:
resource.getrusage(resource.RUSAGE_CHILDREN)
resource.RLIMIT_CPU
Thank you
PS: I'm using Python 2.7 / 3.2
There's no good way of doing this generically for all resources, which is why it's a Unix-only module. To limit CPU specifically, you can either use registry keys to set a per-process limit:
http://technet.microsoft.com/en-us/library/ff384148%28WS.10%29.aspx
As done here:
http://code.activestate.com/recipes/286159/
IMPORTANT: Back up your registry before trying anything that touches it.
Or you could set the thread priority:
http://msdn.microsoft.com/en-us/library/ms685100%28VS.85%29.aspx
As done here:
http://nullege.com/codes/search/win32process.SetThreadPriority
For other resources you'll have to cobble together similar DLL-access APIs to achieve the desired effect. You should first ask yourself whether you really need this behavior: oftentimes you can limit CPU time by sleeping the working thread at convenient times to allow the OS to swap processes, and memory use can be controlled programmatically by checking data structure sizes.
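The sleep-based idea from the last paragraph can be sketched portably in pure Python. The duty cycle, slice length, and the squaring "work" are illustrative stand-ins, not a real throttling API:

```python
import time

def run_throttled(items, duty_cycle=0.5, slice_s=0.05):
    """Process items in short bursts, sleeping between bursts so the
    process consumes roughly duty_cycle of one core on average."""
    results = []
    burst_start = time.monotonic()
    for item in items:
        results.append(item * item)  # stand-in for the real work
        if time.monotonic() - burst_start >= slice_s:
            # Sleep long enough that work/(work+sleep) ~= duty_cycle.
            time.sleep(slice_s * (1.0 - duty_cycle) / duty_cycle)
            burst_start = time.monotonic()
    return results
```

This cooperative approach works on any OS, since it relies only on the scheduler running other processes while this one sleeps, rather than on a hard kernel-enforced limit like RLIMIT_CPU.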
I'm trying to use setfsuid() with Python 2.5.4 on RHEL 5.4.
Since it's not included in the os module, I wrapped it in a C module of my own and installed it as a python extension module using distutils.
However when I try to use it I don't get the expected result.
setfsuid() returns a value indicating success (changing from a superuser), but I can't access files to which only the newly set user should have access (using open()), indicating that the fsuid was not truly changed.
I tried to verify that setfsuid() worked by running it twice consecutively with the same user input.
The result was as if nothing had changed: on every call the returned value was the old user id, different from the new one. I also called getpid() from the module and from the Python script; both returned the same id, so that is not the problem.
Just in case it's significant, I should note that I'm doing all of this from within an Apache daemon process (WSGI).
Can anyone provide an explanation for this?
Thank you
The ability to change the fsuid is limited to root, or to non-root processes with the CAP_SETUID capability. These days it's usually considered bad practice to run a web server with root permissions, so most likely you'll need to set the capability on the executable (see man capabilities for details). Please note that doing this could severely affect your overall system's security. Rather than mucking with the security of a high-profile target like Apache, I'd recommend spawning a small backend process that runs as root and converses with your WSGI app via a local UNIX socket.
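A bare-bones sketch of that split, with both ends shown in one process purely for illustration. The socket path and the one-line protocol are made up for the example; in reality the helper would run as root, call setfsuid()/open() on behalf of the request, and send back the result.

```python
import os
import socket
import tempfile

sock_path = os.path.join(tempfile.mkdtemp(), "fsuid-helper.sock")

# Privileged helper side: listen on a local UNIX socket.
server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
server.bind(sock_path)
server.listen(1)

# WSGI-app side: connect and send a one-line command.
client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
client.connect(sock_path)
client.sendall(b"read /srv/data/report.txt\n")

conn, _ = server.accept()
request = conn.recv(1024)   # the root helper would act on this request
conn.sendall(b"ok\n")       # ...and reply with the result
reply = client.recv(1024)

conn.close()
client.close()
server.close()
```

Keeping the privileged code in a tiny, single-purpose helper means the large attack surface of the web server itself never runs with elevated rights.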