How to configure Theano on Windows? - python

I have installed Theano on a Windows machine and followed the configuration instructions.
I placed the following .theanorc.txt file in C:\Users\my_username folder:
#!sh
[global]
device = gpu
floatX = float32
[nvcc]
fastmath = True
# flags=-m32 # we have this hard coded for now
[blas]
ldflags =
# ldflags = -lopenblas # placeholder for openblas support
I tried to run the test, but haven't managed to run it on GPU. I guess the values from .theanorc.txt are not read, because I added the line print config.device and it outputs "cpu".
Below is the basic test script and the output:
from theano import function, config, shared, sandbox
import theano.tensor as T
import numpy
import time
print config.device
vlen = 10 * 30 * 768 # 10 x #cores x # threads per core
iters = 1000
rng = numpy.random.RandomState(22)
x = shared(numpy.asarray(rng.rand(vlen), config.floatX))
f = function([], T.exp(x))
print f.maker.fgraph.toposort()
t0 = time.time()
for i in xrange(iters):
    r = f()
t1 = time.time()
print 'Looping %d times took' % iters, t1 - t0, 'seconds'
print 'Result is', r
if numpy.any([isinstance(x.op, T.Elemwise) for x in f.maker.fgraph.toposort()]):
    print 'Used the cpu'
else:
    print 'Used the gpu'
output:
pydev debugger: starting (pid: 9564)
cpu
[Elemwise{exp,no_inplace}(<TensorType(float64, vector)>)]
Looping 1000 times took 10.0310001373 seconds
Result is [ 1.23178032 1.61879341 1.52278065 ..., 2.20771815 2.29967753
1.62323285]
Used the cpu
I have installed CUDA Toolkit successfully but haven't managed to install pyCUDA. I guess Theano should work without pyCUDA installed anyway.
I would be very thankful if anyone could help out solving this problem. I have followed these instructions but don't know why the configuration values in the program don't match the values in .theanorc.txt file.

Contrary to what has been said on a couple of pages, my installation (Windows 10, Python 2.7, Theano 0.10.0.dev1) would not interpret config instructions within a .theanorc.txt file in my user profile folder, but would read a .theanorc file.
If you are having trouble creating a file with that style of name, use the following commands at a terminal:
cd %USERPROFILE%
type NUL > .theanorc
Source: http://ankivil.com/making-theano-faster-with-cudnn-and-cnmem-on-windows-10/
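If the type NUL trick is inconvenient, the same file could also be written from Python. This is only a minimal sketch, assuming Python 3 and the standard configparser module; the helper name write_theanorc and its directory argument are made up for illustration, and the settings simply mirror the ones in the question.

```python
import configparser
import os


def write_theanorc(directory, settings):
    """Write a .theanorc file into `directory` and return its path.

    `settings` maps section names to {option: value} dictionaries.
    """
    parser = configparser.ConfigParser()
    # Keep option names such as floatX in their original case.
    parser.optionxform = str
    for section, options in settings.items():
        parser[section] = options
    path = os.path.join(directory, ".theanorc")
    with open(path, "w") as handle:
        parser.write(handle)
    return path
```

Calling write_theanorc(os.path.expanduser("~"), {"global": {"device": "gpu", "floatX": "float32"}}) would place the file in the user profile folder. Note that it overwrites any existing .theanorc there.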

You are right that Theano does not need PyCUDA.
It is strange that Theano does not read your configuration file. The exact path that gets read is this. Just run this in Python and you'll see where to put it:
import os
print(os.path.expanduser('~/.theanorc.txt'))
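Since the answers here disagree on whether the file should be called .theanorc or .theanorc.txt, it may help to check which of the two actually exists in the home directory. The sketch below only tests for the files on disk; it does not ask Theano which one it reads, and the function name candidate_theanorc_paths is made up for illustration.

```python
import os


def candidate_theanorc_paths():
    """Return (path, exists) pairs for the two file names discussed here."""
    home = os.path.expanduser("~")
    names = (".theanorc", ".theanorc.txt")
    paths = [os.path.join(home, name) for name in names]
    return [(path, os.path.isfile(path)) for path in paths]


for path, exists in candidate_theanorc_paths():
    print(path, "found" if exists else "missing")
```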

Try changing the content of .theanorc.txt as indicated on the Theano website (http://deeplearning.net/software/theano/install_windows.html). The paths need to be adjusted to match your installation.
[global]
floatX = float32
device = gpu
[nvcc]
flags=-LC:\Users\cchan\Anaconda3\libs
compiler_bindir=C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin

Related

Conda Numba Cuda: libNVVM cannot be found

My development environment is: Ubuntu 18.04.5 LTS, Python3.6 and I have installed via conda (numba and cudatoolkit). Nvidia GPU GeForce GTX 1050 Ti, which is supported by cuda.
The installation of conda and numba seem to work as intended as I can import numba within python3.6 scripts.
The problem seems identical to the one in the question asked here: Cuda: library nvvm not found
but none of the proposed solutions seem to work in my case, and I'm not sure how to highlight my situation properly (I can't do it through an answer in the other thread...). If raising a duplicate of the question is inappropriate, then guide me to proper conduct.
When I try to run the code below I get the following error: numba.cuda.cudadrv.error.NvvmSupportError: libNVVM cannot be found. Do conda install cudatoolkit: library nvvm not found
from numba import cuda, float32

# Controls threads per block and shared memory usage.
# The computation will be done on blocks of TPBxTPB elements.
TPB = 16

@cuda.jit
def fast_matmul(A, B, C):
    # Define an array in the shared memory
    # The size and type of the arrays must be known at compile time
    sA = cuda.shared.array(shape=(TPB, TPB), dtype=float32)
    sB = cuda.shared.array(shape=(TPB, TPB), dtype=float32)
    x, y = cuda.grid(2)
    tx = cuda.threadIdx.x
    ty = cuda.threadIdx.y
    bpg = cuda.gridDim.x  # blocks per grid
    if x >= C.shape[0] and y >= C.shape[1]:
        # Quit if (x, y) is outside of valid C boundary
        return
    # Each thread computes one element in the result matrix.
    # The dot product is chunked into dot products of TPB-long vectors.
    tmp = 0.
    for i in range(bpg):
        # Preload data into shared memory
        sA[tx, ty] = A[x, ty + i * TPB]
        sB[tx, ty] = B[tx + i * TPB, y]
        # Wait until all threads finish preloading
        cuda.syncthreads()
        # Computes partial product on the shared memory
        for j in range(TPB):
            tmp += sA[tx, j] * sB[j, ty]
        # Wait until all threads finish computing
        cuda.syncthreads()
    C[x, y] = tmp

import numpy as np
matrix_A = np.array([[0.1,0.2],[0.1,0.2]])
Doing as suggested and running conda install cudatoolkit does not work. I have tried many variations on this install that I've found online to no avail.
In the other post a solution that seems to have worked for many is to add lines about environment variables in the .bashrc file in the home directory. The suggestions however refer to files that exist in the /usr directory, where I have no cuda data since I've installed through conda. I have tried many variations on these exports without success. This is perhaps where the solution lies, but if so then the solution would benefit from being generalized.
Does anyone have any up-to-date or generalized solutions to this problem?
EDIT: adding information from terminal outputs (thanks for the hint of editing the question to do so)
> conda list numba
# packages in environment at /home/tobka/anaconda3:
#
# Name Version Build Channel
numba 0.51.2 py38h0573a6f_1
> conda list cudatoolkit
# packages in environment at /home/tobka/anaconda3:
#
# Name Version Build Channel
cudatoolkit 11.0.221 h6bb024c_0
Also adding output from numba -s: https://pastebin.com/raw/6u1MUkxg
Idea of possible cause (not yet confirmed): I noticed in the numba -s output that it specifies Python Version: 3.8.3, where I've been explicitly using python3.6 in the terminal since simply using python has usually meant using python2.7. I checked however, and my system now uses Python 3.8.3 with the python command, and Python 3.6.9 with the python3.6. And when running the code using python I get a syntax error instead, which is a good sign: raise ValueError(missing_launch_config_msg).
I will try to fix the syntax errors and confirm that the code works, after which I will report here of the situation.
Confirmation of solution: using python instead of python3.6 in the terminal solved the problem. The root cause was the user.
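A mismatch like this can be spotted quickly by asking the interpreter itself where it lives and whether it can see the package. The following is a rough sketch using only the standard library; the function name interpreter_report is made up for illustration. Run it once with python and once with python3.6 and compare the output.

```python
import importlib.util
import sys


def interpreter_report(modules=("numba",)):
    """Return the running interpreter's path, version, and module availability."""
    report = {
        "executable": sys.executable,
        "version": "%d.%d.%d" % sys.version_info[:3],
    }
    for name in modules:
        # find_spec returns None when the module cannot be imported here.
        report[name] = importlib.util.find_spec(name) is not None
    return report


print(interpreter_report())
```

If numba shows up as False for one interpreter and True for the other, the conda packages are installed for the second one only.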

Get CPU and GPU Temp using Python Windows

I was wondering if there was a way to get the CPU and the GPU temperature in python. I have already found a way for Linux (using psutil.sensors_temperature()), and I wanted to find a way for Windows.
A way to find the temperatures for Mac OS would also be appreciated, but I mainly want a way for windows.
I prefer to only use python modules, but DLL and C/C++ extensions are also completely acceptable!
When I try doing the below, I get None:
import wmi
w = wmi.WMI()
print(w.Win32_TemperatureProbe()[0].CurrentReading)
When I try doing the below, I get an error:
import wmi
w = wmi.WMI(namespace=r"root\wmi")
temperature_info = w.MSAcpi_ThermalZoneTemperature()[0]
print(temperature_info.CurrentTemperature)
Error:
wmi.x_wmi: <x_wmi: Unexpected COM Error (-2147217396, 'OLE error 0x8004100c', None, None)>
I have heard of OpenHardwareMonitor, but this requires me to install something that is not a Python module. I would also prefer not to have to run the script as admin to get the results.
I am also fine with running windows cmd commands with python, but I have not found one that returns the CPU temp.
Update: I found this: https://stackoverflow.com/a/58924992/13710015.
I can't figure out how to use it though.
When I tried doing: print(OUTPUT_temp._fields_), I got
[('Board Temp', <class 'ctypes.c_ulong'>), ('CPU Temp', <class 'ctypes.c_ulong'>), ('Board Temp2', <class 'ctypes.c_ulong'>), ('temp4', <class 'ctypes.c_ulong'>), ('temp5', <class 'ctypes.c_ulong'>)]
Note: I really do not want to run this as admin. If I absolutely have to, I can, but I prefer not to.
I don't think there is a direct way to achieve that. Some CPU manufacturers don't expose the temperature through WMI.
You could use OpenHardwareMonitorLib.dll, the dynamic library that ships with Open Hardware Monitor.
First, download Open Hardware Monitor. It contains a file called OpenHardwareMonitorLib.dll (version 0.9.6, December 2020).
Install the module pythonnet:
pip install pythonnet
Below code works fine on my PC (Get the CPU temperature):
import clr # the pythonnet module.
clr.AddReference(r'YourdllPath')
# e.g. clr.AddReference(r'OpenHardwareMonitor/OpenHardwareMonitorLib'), without .dll
from OpenHardwareMonitor.Hardware import Computer
c = Computer()
c.CPUEnabled = True # get the Info about CPU
c.GPUEnabled = True # get the Info about GPU
c.Open()
while True:
    for a in range(0, len(c.Hardware[0].Sensors)):
        # print(c.Hardware[0].Sensors[a].Identifier)
        if "/temperature" in str(c.Hardware[0].Sensors[a].Identifier):
            print(c.Hardware[0].Sensors[a].get_Value())
            c.Hardware[0].Update()
To Get the GPU temperature, change the c.Hardware[0] to c.Hardware[1].
Attention: If you want to get the CPU temperature, you need to run it as Administrator. If not, you will only get the value of Load. For GPU temperature, it can work without Admin permissions (as of Windows 10 21H1).
I adapted this from a Chinese blog post.
I found a pretty good module for getting the temperature of NVIDIA GPUs.
pip install gputil
Code to print out temperature
import GPUtil
gpu = GPUtil.getGPUs()[0]
print(gpu.temperature)
I found it here
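For NVIDIA cards, another option that avoids an extra package is to call nvidia-smi directly. This is a sketch under the assumption that nvidia-smi is installed with the driver and is on the PATH; the function names are made up for illustration. The parsing is kept separate so it can be checked without a GPU.

```python
import subprocess


def parse_gpu_temperatures(text):
    """Parse one integer temperature (degrees Celsius) per non-empty line."""
    return [int(line.strip()) for line in text.splitlines() if line.strip()]


def query_gpu_temperatures():
    """Ask nvidia-smi for the temperature of every GPU it can see."""
    output = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=temperature.gpu",
         "--format=csv,noheader,nounits"])
    return parse_gpu_temperatures(output.decode("ascii"))
```

Calling query_gpu_temperatures() should return a list with one entry per GPU. It raises an exception if nvidia-smi is missing or fails.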
You can also use gpiozero to get the CPU temperature.
First run pip install gpiozero, then import CPUTemperature from gpiozero, create an instance, and print its temperature attribute.
code:
from gpiozero import CPUTemperature
cpu = CPUTemperature()
print(cpu.temperature)
hope you enjoy. :)

Unable to select a box in a python script when launching Blender from console

For my project, I want to be able to run a Blender python script from the system console.
This minimal script (see below) is able to select a region, using the select_box operator. The script works correctly when launched from within the Blender application. However, when I run it from the console using "C:\Program Files\Blender Foundation\Blender 2.81\blender.exe" "C:\Users\Desktop\test.blend" -d --python "D:\Documents\minTest.py", the program crashes with the following output:
Switching to fully guarded memory allocator.
Blender 2.81 (sub 16)
Build: 2019-12-04 14:30:40 Windows Release
argv[0] = C:\Program Files\Blender Foundation\Blender 2.81\blender.exe
argv[1] = C:\Users\Desktop\test.blend
argv[2] = -d
argv[3] = --python
argv[4] = D:\Documents\minTest.py
Read prefs: C:\Users\AppData\Roaming\Blender Foundation\Blender\2.81\config\userpref.blend
read file
Version 280 sub 39 date unknown hash unknown
found bundled python: C:\Program Files\Blender Foundation\Blender 2.81\2.81\python
Warning: Add-on 'io_mesh_xyz' was not upgraded for 2.80, ignoring
Warning: Add-on 't26_PointCloudSkinner1_Umbrella' was not upgraded for 2.80, ignoring
Read blend: C:\Users\Desktop\test.blend
read file C:\Users\Desktop\test.blend
Version 281 sub 16 date 2019-12-04 11:32 hash f1aa4d18d49d
***** DEBUG: working
Error : EXCEPTION_ACCESS_VIOLATION
Address : 0x00007FF60F40DCFD
Module : C:\Program Files\Blender Foundation\Blender 2.81\blender.exe
The test.blend file is the simple startup scene. The minTest.py script is the following:
import bpy
def getView3dAreaAndRegion():
    for area in bpy.context.screen.areas:
        if area.type == "VIEW_3D":
            for region in area.regions:
                if region.type == "WINDOW":
                    return area, region
view3dArea, view3dRegion = getView3dAreaAndRegion()
override = bpy.context.copy()
override['area'] = view3dArea
override['region'] = view3dRegion
print("***** DEBUG: working") #Debug to see that the script has launched
bpy.ops.view3d.select_box(override,xmin=100, xmax=500, ymin=100, ymax=300, wait_for_input=False)
More information:
I'm using Blender 2.81 and Python 3.7 (bundled in Blender).
The script works just fine if I remove the call to select_box in both Blender and from console.
So my questions are:
Why do I get a different result depending on how I launch the script?
What should I do to be able to run the script from the system console?
You can't expect consistent results if you imitate user input with a script; it is too hard to know where anything is positioned on screen.
To get reliable results from a script, you need to work with consistent data, so compare each object's location against your criteria to decide whether to select it.
import bpy
for obj in bpy.context.scene.objects:
    if obj.location.z > 0.2 and obj.location.z < 1.5:
        obj.select_set(True)
    else:
        obj.select_set(False)
If you want to select a portion of a mesh, decide on what is selected either based on its position relative to the object origin or its world position by using obj.matrix_world.
While you can directly access the mesh components in obj.data, you should consider using bmesh for any mesh editing.
import bpy
import bmesh
bpy.ops.object.mode_set(mode='EDIT')
me = bpy.context.object.data
bm = bmesh.from_edit_mesh(me)
for v in bm.verts:
    if v.co.z > -1.8 and v.co.z < 0.5:
        v.co.z += 0.2
        v.select = True
    else:
        v.select = False
bmesh.update_edit_mesh(me)
bm.free()
bpy.ops.object.mode_set(mode='OBJECT')
For help with blender specific scripting ask at blender.stackexchange.

automatically choose a device keras tensorflow [duplicate]

I have access through ssh to a cluster of n GPUs. Tensorflow automatically gave them names gpu:0,...,gpu:(n-1).
Others have access too and sometimes they take random gpus.
I did not place any tf.device() explicitly because that is cumbersome, and even if I selected GPU number j, it would be a problem if someone else were already using it.
I would like to go through the GPUs' usage, find the first one that is unused, and use only that one.
I guess someone could parse the output of nvidia-smi with bash and get a variable i and feed that variable i to the tensorflow script as the number of the gpu to use.
I have never seen any example of this. I imagine it is a pretty common problem. What would be the simplest way to do that? Is a pure TensorFlow one available?
I'm not aware of a pure-TensorFlow solution. The problem is that the existing place for TensorFlow configuration is the Session config. However, the GPU memory pool is shared by all TensorFlow sessions within a process, so the Session config would be the wrong place to add it, and there's no mechanism for process-global config (but there should be, to also be able to configure the process-global Eigen threadpool). So you need to do it at the process level by using the CUDA_VISIBLE_DEVICES environment variable.
Something like this:
import subprocess, re

# Nvidia-smi GPU memory parsing.
# Tested on nvidia-smi 370.23

def run_command(cmd):
    """Run command, return output as string."""
    output = subprocess.Popen(cmd, stdout=subprocess.PIPE, shell=True).communicate()[0]
    return output.decode("ascii")

def list_available_gpus():
    """Returns list of available GPU ids."""
    output = run_command("nvidia-smi -L")
    # lines of the form GPU 0: TITAN X
    gpu_regex = re.compile(r"GPU (?P<gpu_id>\d+):")
    result = []
    for line in output.strip().split("\n"):
        m = gpu_regex.match(line)
        assert m, "Couldnt parse "+line
        result.append(int(m.group("gpu_id")))
    return result

def gpu_memory_map():
    """Returns map of GPU id to memory allocated on that GPU."""
    output = run_command("nvidia-smi")
    gpu_output = output[output.find("GPU Memory"):]
    # lines of the form
    # | 0 8734 C python 11705MiB |
    memory_regex = re.compile(r"[|]\s+?(?P<gpu_id>\d+)\D+?(?P<pid>\d+).+[ ](?P<gpu_memory>\d+)MiB")
    rows = gpu_output.split("\n")
    result = {gpu_id: 0 for gpu_id in list_available_gpus()}
    for row in rows:
        m = memory_regex.search(row)
        if not m:
            continue
        gpu_id = int(m.group("gpu_id"))
        gpu_memory = int(m.group("gpu_memory"))
        result[gpu_id] += gpu_memory
    return result

def pick_gpu_lowest_memory():
    """Returns GPU with the least allocated memory"""
    memory_gpu_map = [(memory, gpu_id) for (gpu_id, memory) in gpu_memory_map().items()]
    best_memory, best_gpu = sorted(memory_gpu_map)[0]
    return best_gpu
You can then put it in utils.py and set the GPU in your TensorFlow script before the first tensorflow import. For example:
import utils
import os
os.environ["CUDA_VISIBLE_DEVICES"] = str(utils.pick_gpu_lowest_memory())
import tensorflow
An implementation along the lines of Yaroslav Bulatov's solution is available on https://github.com/bamos/setGPU.
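Parsing the human-readable nvidia-smi table with a regex can break when the layout changes between driver versions. A possibly more robust variant is to ask nvidia-smi for CSV output instead. This is only a sketch: the function names are made up for illustration, and it assumes every GPU reports a numeric memory.used value (some devices may report a non-numeric placeholder, which this code does not handle).

```python
import subprocess


def parse_memory_used(text):
    """Map GPU index to used memory (MiB) from 'index, memory.used' CSV rows."""
    usage = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        index, used = line.split(",")
        usage[int(index)] = int(used)
    return usage


def least_used_gpu(usage):
    """Return the GPU index with the least memory in use (lowest index on ties)."""
    return min(usage, key=lambda gpu: (usage[gpu], gpu))


def pick_gpu_lowest_memory():
    """Query nvidia-smi and return the index of the least-used GPU."""
    output = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=index,memory.used",
         "--format=csv,noheader,nounits"])
    return least_used_gpu(parse_memory_used(output.decode("ascii")))
```

As with the answer above, the result would be written to CUDA_VISIBLE_DEVICES before tensorflow is imported. Two users running this at the same moment can still pick the same GPU, since nothing reserves the device between the query and the import.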

Command to run Theano on GPU (windows)

I am following the tutorial here: http://deeplearning.net/software/theano/tutorial/using_gpu.html
The code I use
from theano import function, config, shared, sandbox
import theano.tensor as T
import numpy
import time
vlen = 10 * 30 * 768 # 10 x #cores x # threads per core
iters = 1000
rng = numpy.random.RandomState(22)
x = shared(numpy.asarray(rng.rand(vlen), config.floatX))
f = function([], T.exp(x))
print(f.maker.fgraph.toposort())
t0 = time.time()
for i in range(iters):
    r = f()
t1 = time.time()
print("Looping %d times took %f seconds" % (iters, t1 - t0))
print("Result is %s" % (r,))
if numpy.any([isinstance(x.op, T.Elemwise) for x in f.maker.fgraph.toposort()]):
    print('Used the cpu')
else:
    print('Used the gpu')
At section Testing Theano with GPU:
There are some command lines that set Theano flags to run on the CPU or the GPU. The problem is that I have no idea where to enter these commands.
I have tried this in the Windows cmd:
THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python theanogpu_example.py
then I get
'THEANO_FLAGS' is not recognized as an internal or external command,
operable program or batch file.
However, I am able to run the code, on cpu using the command
python theanogpu_example.py
I want to run the code on the GPU. What should I do (using the commands from the tutorial)?
SOLUTION
Thanks to @Blauelf for the idea of using a Windows environment variable.
However, the parameters have to be set separately:
set THEANO_FLAGS="mode=FAST_RUN" & set THEANO_FLAGS="device=gpu" & set THEANO_FLAGS="floatX=float32" & python theanogpu_example.py
From the docs, THEANO_FLAGS is an environment variable. So as you're on Windows, you might want to change
THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python theanogpu_example.py
into
set THEANO_FLAGS="mode=FAST_RUN,device=gpu,floatX=float32" & python theanogpu_example.py
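Because THEANO_FLAGS is an ordinary environment variable, it can also be set from inside the script, which sidesteps the differences between shells. The sketch below assumes that Theano reads the variable when it is first imported, so the assignment has to happen before import theano; the helper name set_theano_flags is made up for illustration.

```python
import os


def set_theano_flags(**flags):
    """Join keyword arguments into THEANO_FLAGS and export the result."""
    value = ",".join("%s=%s" % (key, flags[key]) for key in sorted(flags))
    os.environ["THEANO_FLAGS"] = value
    return value


set_theano_flags(mode="FAST_RUN", device="gpu", floatX="float32")
# import theano  # must come after the flags have been set
```

With this at the top of theanogpu_example.py, the script can be started with a plain python theanogpu_example.py.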
