We only have 4 GPU devices, and more than 4 users run CUDA programs on them, so before I run my program I want to check which device is not busy; otherwise memory allocation will fail. But I haven't found a function that returns this information. I know that when we want to use a device we call "cudaSetDevice()", so there must be an identifier for each device, and "nvidia-smi" can report more detail, including which process is using which device and how much memory it uses. Can anyone help me?
The values for cudaSetDevice start at 0 and then increase monotonically for each additional device. Alternatively you can set the environment variable CUDA_VISIBLE_DEVICES to select which device to use. (see https://devblogs.nvidia.com/parallelforall/cuda-pro-tip-control-gpu-visibility-cuda_visible_devices/).
To get information about what is using the device you need to use the driver API: http://docs.nvidia.com/cuda/cuda-driver-api/index.html
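As a rough sketch of the "check which device is not busy" step, the following Python snippet shells out to nvidia-smi (the tool mentioned in the question, not the driver API) and picks the GPU with the least memory in use. The helper names are made up for illustration, and the parser assumes every field is a plain number; nvidia-smi can print "[N/A]" on some hardware, which this sketch does not handle.

```python
import subprocess

# nvidia-smi query that prints one CSV row per GPU: index, used MiB, total MiB.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(csv_text):
    """Parse 'index, used, total' rows into a list of (index, used, total) ints."""
    rows = []
    for line in csv_text.strip().splitlines():
        index, used, total = (int(field.strip()) for field in line.split(","))
        rows.append((index, used, total))
    return rows


def least_used_gpu(rows):
    """Return the index of the GPU with the least memory currently in use."""
    return min(rows, key=lambda row: row[1])[0]


def query_gpus():
    """Run nvidia-smi and return the parsed rows (requires the NVIDIA driver)."""
    output = subprocess.check_output(QUERY, universal_newlines=True)
    return parse_gpu_memory(output)


# Hypothetical usage on a machine with nvidia-smi installed:
# gpu_index = least_used_gpu(query_gpus())
```

Note that this is only a snapshot: another user can start a job on the chosen GPU between the check and your own allocation.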
I have access to a server with multiple GPUs that many users can use simultaneously.
I choose only one gpu_id from the terminal and have code like this:
device = "cuda:"+str(FLAGS.gpu_id) if torch.cuda.is_available() else "cpu"
where FLAGS is a parser, parsing arguments from terminal.
Even though I select only one ID, I saw that my process was using 2 different GPUs. That causes problems when the other GPU's memory is almost full: my process terminates with a "CUDA out of memory" error.
I want to understand what could cause this to happen.
It is hard to tell what is wrong without knowing how you use the device parameter. In any case, you can try to achieve what you want with a different approach. Run your python script in the following way:
CUDA_VISIBLE_DEVICES=0 python3 my_code.py
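The same restriction can also be applied from inside the script, as long as the variable is set before CUDA is initialized (in practice, before the first `import torch`). This is a minimal sketch; `restrict_visible_gpus` is a made-up helper name, and `FLAGS.gpu_id` is taken from the question.

```python
import os


def restrict_visible_gpus(gpu_id):
    """Expose a single physical GPU to this process.

    After this call the chosen GPU is the only visible one, so inside the
    process it is addressed as device 0 regardless of its physical index.
    """
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_id)


# Hypothetical usage, placed before the first `import torch`:
# restrict_visible_gpus(FLAGS.gpu_id)
# import torch
# device = "cuda:0" if torch.cuda.is_available() else "cpu"
```

With only one device visible, code that falls back to a default device can no longer land on a second GPU.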
I need to create multiple EBS volumes and put some data there using python+boto3.
Overall my flow is:
Create volumes.
Wait for them to reach the available state.
Attach one volume.
Wait for it to reach the attached state.
List NVMe devices and detect the one that belongs to the volume. <-- The issue happens here.
Mount NVMe device.
Copy data.
Unmount and detach the volume.
... next volume.
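For reference, the attach-and-wait steps of the flow above could look roughly like this. It is a sketch, not the actual code from this project: `attach_and_wait` is a made-up name, and the function takes an already-created EC2 client so it stays independent of how the client is built.

```python
import time


def attach_and_wait(ec2, volume_id, instance_id, device, timeout=120, poll=2):
    """Attach a volume, then poll until its attachment state is 'attached'.

    `ec2` is expected to behave like boto3.client("ec2").
    Returns True on success and False if the timeout expires first.
    """
    ec2.attach_volume(VolumeId=volume_id, InstanceId=instance_id, Device=device)
    deadline = time.time() + timeout
    while time.time() < deadline:
        volume = ec2.describe_volumes(VolumeIds=[volume_id])["Volumes"][0]
        states = [a["State"] for a in volume.get("Attachments", [])]
        if "attached" in states:
            return True
        time.sleep(poll)
    return False


# Hypothetical usage (IDs are placeholders):
# import boto3
# ok = attach_and_wait(boto3.client("ec2"), "vol-...", "i-...", "/dev/sdf")
```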
Most of the time it works fine. Volumes attach correctly and are linked to NVMe devices (like /dev/nvme2p1). But at some point Linux doesn't create a block device for the volume: the volume state is attached, but nvme list doesn't show it.
If I re-attach such a volume with boto3 or manually in the AWS console, it then gets a block device.
It happens in the us-east-2 region but not in ap-south-1, for example.
I tried attaching and detaching in both single-threaded and multi-threaded mode.
In multi-threaded mode I used a separate boto3 client per thread and a lock so that volumes attach sequentially. I also tried different boto3 versions and waiting some time after attaching. Nothing helped.
My environment:
EC2 instance: t3a.small, AMI: ubuntu 20.04.2.
python2 (yes, we are still using it).
botocore==1.12.253
boto3==1.9.199
Has anyone faced the same problem?
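One way to make the "detect the volume's device" step explicit is to wait for the udev symlink instead of parsing nvme list. The sketch below assumes a Nitro-based instance where udev creates /dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_<volume id without the hyphen>; that naming is an assumption about the environment, and the function names are made up. A None result is the signal to fall back to the detach and re-attach workaround described above.

```python
import os
import time

# Assumed udev naming for EBS volumes exposed as NVMe devices.
BY_ID_PREFIX = "/dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_"


def expected_by_id_path(volume_id):
    """Map 'vol-0abc' to the by-id symlink path expected for that volume."""
    return BY_ID_PREFIX + volume_id.replace("-", "")


def wait_for_device(volume_id, timeout=30, poll=1, exists=os.path.exists):
    """Poll until the symlink appears; return its path, or None on timeout."""
    path = expected_by_id_path(volume_id)
    deadline = time.time() + timeout
    while True:
        if exists(path):
            return path
        if time.time() >= deadline:
            return None
        time.sleep(poll)
```

The `exists` parameter is only there so the polling logic can be exercised without real devices.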
This is my first post, and I have seen that the more specific the better, so I'll try to be super clear. Thanks in advance!
What I want:
I need to scan images from 2 or more scanners at the same time. These scanners are the same brand and model, in this case Epson Perfection V600. I need different time intervals, for at least 40 captures over the course of 20 hours.
My approach
I decided to use Windows. I already have a Python program that does what I want with just one scanner, or with two scanners of different models. But here is where you guys come in:
The problem
Windows always scans with the same scanner. Since they are the same brand and model, it always uses the same one, and I cannot use two different scanner models because the images would not be comparable. When I use two different scanner models, I don't have this problem. I need to find a way to scan with each scanner. I thought of buying a USB hub and controlling it with Python as well, but apparently, given the libusb implementation on Windows, I will not be able to control it. So I'm currently looking for a way to disable a specific USB port so the program will only recognize one device, scan with it, disable that one, re-enable the other one, and so on.
What I have access to:
Right now I'm using Windows 10, 64-bit, with a Python 3 kernel (Python 3.5) inside a Conda environment (conda version 4.5.11).
Ubuntu 16.04, 64-bit, with pyinsane working, in a Python 3.5 environment inside conda (I don't have the conda version at hand).
One Epson Perfection V600.
Two Canon Lide200, working only in Windows, because drivers are not available for Ubuntu.
What I have also tried
Using Ubuntu
I thought it was a good idea, but the Epson drivers webpage fails to connect to the repository containing the rest of the Epson files, so I can only partially download the files. I already tried to contact the owner of the Docky repository, but he has not responded.
The error:
W: The repository 'http://ppa.launchpad.net/docky-core/ppa/ubuntu Xenial Release' does not have a Release file.
N: Data from such a repository can't be authenticated and is therefore potentially dangerous to use.
N: See apt-secure(8) manpage for repository creation and user configuration details.
E: Failed to fetch http://ppa.launchpad.net/docky-core/ppa/ubuntu/dists/xenial/main/binary-amd64/Packages 404 Not Found
E: Some index files failed to download. They have been ignored, or old ones used instead.
When I manually try to enter the site's repository, I find that:
All links to Xenial drivers are down; actually, the whole Xenial folder is missing.
I also thought about ignoring this message, but I need the Epkowa driver to run Epson scanners in Ubuntu, and that is a whole problem by itself. Aside from that, it is not known whether the Epson Perfection V600 can be controlled by the PyInsane lib, since it is marked as untested.
Using Windows
I thought of buying a USB hub and controlling it as shown in this thread, but apparently that is not possible in Windows.
I already installed libusb, usb.util, libusb1, USB (for the core functions) and usb1, but I don't know how (I think it is not possible) to disable and re-enable a specific USB port with them.
I can't disable the drivers, since that would take down all the USB connections to the scanners.
Device Manager is not helping, because it cannot tell me which device is which.
I cannot change the name of the scanner (printers can have specific names, but scanners can't).
I can't buy another scanner; I'm stuck with Epson.
My code for Scanning
import pyinsane2

def Scan(Device, dpi):
    pyinsane2.init()
    try:
        pyinsane2.set_scanner_opt(Device, 'resolution', [dpi])
        pyinsane2.set_scanner_opt(Device, 'mode', ['Color'])
        pyinsane2.maximize_scan_area(Device)
        scan_session = Device.scan(multiple=False)
        try:
            while True:
                scan_session.scan.read()
        except EOFError:
            pass
        Image = scan_session.images[-1]
    finally:
        pyinsane2.exit()
    return(Image)

devices = pyinsane2.get_devices()
image_a = Scan(devices[0], 75)
image_b = Scan(devices[1], 75)
a = devices[1]
b = devices[0]
a == b #Different
a.dev_type == b.dev_type
a.model == b.model
a.name == b.name #Different
a.nice_name == b.nice_name
a.options == b.options
a.reload_options == b.reload_options #Different
a.scan == b.scan #Different
a.srcs == b.srcs #Different
a.vendor == b.vendor
I put a sticky note inside each scanner, one with an "a" and the other with a "b", and it always scans with the scanner that I plugged in first.
This is what I would like to do (and doing it manually works):
This is what I get when trying in python:
Any solution will help me, get creative! I was thinking of using a .bat file to disable a specific port and calling it from Python, but I couldn't find a way to make it work. Keep in mind that doing it manually is not an option because of the 20 to 40 hours of continuous image acquisition.
Thanks!
~Diego
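One direction that might be worth trying for the "disable one device, scan, re-enable" idea is to toggle the scanner's PnP device from Python through PowerShell's Disable-PnpDevice and Enable-PnpDevice cmdlets. This is an untested sketch under several assumptions: the helper names are made up, the script runs from an elevated (administrator) prompt, the instance ID of each scanner has been looked up beforehand (for example with Get-PnpDevice), and it is not known whether the scanner driver actually switches to the remaining device once the other one is disabled.

```python
import subprocess


def pnp_command(instance_id, enable):
    """Build the PowerShell command line that enables or disables one device."""
    cmdlet = "Enable-PnpDevice" if enable else "Disable-PnpDevice"
    # Single-quote the ID for PowerShell; embedded single quotes are doubled.
    quoted = "'" + instance_id.replace("'", "''") + "'"
    script = "{} -InstanceId {} -Confirm:$false".format(cmdlet, quoted)
    return ["powershell", "-NoProfile", "-Command", script]


def set_device_enabled(instance_id, enable, run=subprocess.check_call):
    """Run the command; raises CalledProcessError if PowerShell reports failure."""
    return run(pnp_command(instance_id, enable))


# Hypothetical usage (the instance IDs are placeholders, not real values):
# set_device_enabled(SCANNER_B_ID, False)   # hide scanner b
# image_a = Scan(devices[0], 75)            # only scanner a is visible
# set_device_enabled(SCANNER_B_ID, True)
```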
I'm polling with glob('/dev/tty[A-Za-z]*') in Python at a regular interval to detect USB devices connected to my Linux PC for my application. Is there any way to detect connected USB devices automatically?
Here is a start. You can find your USB vendor here. You have to code yourself a current_list_usb and set a time interval to check, so you can compare and see whether a new device is attached or not. Some code to use when importing the usb module:
import usb, usb.core, usb.util, usb.backend.libusb1
...snippet...
# usb.core.find()
# find our device
dev = usb.core.find(idVendor= ...., idProduct= ....)
#dev_1 = usb.util.find_descriptor(cfg, find_all =True)
# was it found?
if dev is None:
    raise ValueError('Device not found')
#x = dev.set_configuration()
#print (dev)
#print (help(usb.core))
# with find_all=True, find() returns an iterable (never None), so collect it first
printers = list(usb.core.find(find_all=True, bDeviceClass=7))
if not printers:
    raise ValueError('No printer found')
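The "keep a current list and compare it at each interval" idea can be sketched without any USB library by diffing two snapshots of the /dev/tty* listing from the question. The function names here are made up for illustration.

```python
import glob


def list_ttys(pattern="/dev/tty[A-Za-z]*"):
    """Return the current set of device paths matching the glob pattern."""
    return set(glob.glob(pattern))


def diff_devices(previous, current):
    """Return (added, removed) paths between two snapshots, each sorted."""
    return sorted(current - previous), sorted(previous - current)


# Hypothetical polling loop:
# import time
# known = list_ttys()
# while True:
#     time.sleep(1.0)
#     now = list_ttys()
#     added, removed = diff_devices(known, now)
#     for path in added:
#         print("attached:", path)
#     known = now
```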
The normal way to do this is to make a udev rule that tells your program a new tty exists.
A custom udev rule may look something like this (let's call it /etc/udev/rules.d/50-custom-tty.rules):
ACTION=="add", KERNEL=="ttyUSB[0-9]*", RUN+="/usr/bin/my-program"
Here's a good guide on writing udev rules.
In this case, the program /usr/bin/my-program will run whenever a new ttyUSB device is created in /dev; udev will set a bunch of environment variables to tell you exactly what was just plugged in. You can then notify your main program that a new ttyUSB exists, and it should use it. Note that whatever program you run should be short-lived, because the udev daemon will kill it if it takes too long.
I'd suggest using libudev and creating a udev monitor object to detect hotplugged devices. Here is a starting point for you to learn about libudev and its monitor feature:
https://www.freedesktop.org/software/systemd/man/libudev.html
There might be a good Python library already that wraps udev so you can use its features without writing C code.
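One such wrapper is the third-party pyudev package. The following is a minimal sketch of a monitor loop, assuming pyudev is installed; `watch_ttys`, `is_new_tty` and the callback are made-up names, and the import is done lazily so the helper can be used without the package.

```python
def is_new_tty(action, device_node):
    """True for an 'add' event that carries a device node such as /dev/ttyUSB0."""
    return action == "add" and device_node is not None


def watch_ttys(on_add):
    """Block forever, calling on_add(path) for every newly added tty device."""
    import pyudev  # third-party dependency, imported only when watching

    context = pyudev.Context()
    monitor = pyudev.Monitor.from_netlink(context)
    monitor.filter_by(subsystem="tty")
    # monitor.poll() blocks until the next event and returns a Device object.
    for device in iter(monitor.poll, None):
        if is_new_tty(device.action, device.device_node):
            on_add(device.device_node)


# Hypothetical usage:
# watch_ttys(lambda path: print("new serial device:", path))
```

Compared with polling glob() on a timer, this reacts to the kernel's hotplug events directly.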
If I have a USB modem that I am accessing using the Python pyserial module, it requires the device to be identified, '/dev/ttyACM0' for example. If the modem is attached to a USB hub, it no longer appears in /dev/tty...
How do I identify it programmatically from my Python code so that, regardless of whether the port has changed or the machine has rebooted, I can locate the modem?
Note:
I can always see the device using lsusb, but if it is attached to a USB hub it does not appear as a /dev/tty... device.
This sounds like a bug in the Linux kernel. If you can, try a more recent version.
If that fails, check the last few lines of the output of dmesg or in the file /var/log/messages (the latter depends on your distribution; if that file doesn't exist or doesn't contain what you're looking for, then check the other files in /var/log; sorting by time with ls -rt helps).
After identifying the device, you might see a pattern.
Another approach is the major and minor number. If you run ls -l /dev, you'll see output like this:
crw--w---- 1 root tty 4, 0 2011-12-19 09:15 tty0
The c means "character device" and the 4, 0 means it's the console device unit 0.
The 4 is the major number which identifies the type of device. See /proc/devices for a list of major numbers and the respective device drivers.
If you plug in the modem directly, note the major number. After plugging it into a hub, try to find devices with the same number.
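The major/minor lookup can also be done from Python with os.stat instead of reading ls -l output. This is a small sketch with made-up function names; it only considers character devices, since ttys are character devices and block devices use a separate set of major numbers.

```python
import os
import stat


def device_numbers(path):
    """Return (major, minor) for a character device node such as /dev/ttyACM0."""
    st = os.stat(path)
    if not stat.S_ISCHR(st.st_mode):
        raise ValueError("%s is not a character device" % path)
    return os.major(st.st_rdev), os.minor(st.st_rdev)


def nodes_with_major(major, directory="/dev"):
    """List character devices directly under `directory` with the given major."""
    matches = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        try:
            if device_numbers(path)[0] == major:
                matches.append(path)
        except (OSError, ValueError):
            continue  # not a character device, or it vanished meanwhile
    return matches


# Hypothetical usage: note the major with the modem plugged in directly,
# then look for nodes with the same major after moving it to the hub.
# major, _ = device_numbers("/dev/ttyACM0")
# print(nodes_with_major(major))
```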
Instead of doing some voodoo in Python, try writing a udev rule which gives your device a much more useful name like /dev/my-serial-thingy. Using that from Python is way easier.
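If a udev rule is not an option, pyserial itself can enumerate serial ports together with their USB vendor and product IDs, which stay the same across reboots and port changes. The sketch below assumes pyserial 3.x, where each entry returned by serial.tools.list_ports.comports() has device, vid and pid attributes; the function names are made up, and this only helps once the kernel actually creates a tty node for the modem.

```python
def find_port(ports, vid, pid):
    """Return the device path of the first port matching the USB IDs, or None."""
    for port in ports:
        if port.vid == vid and port.pid == pid:
            return port.device
    return None


def find_modem(vid, pid):
    """Look the modem up among the serial ports pyserial can see right now."""
    from serial.tools import list_ports  # pyserial, imported lazily

    return find_port(list_ports.comports(), vid, pid)


# Hypothetical usage (take the IDs from the lsusb output for your modem):
# path = find_modem(MODEM_VID, MODEM_PID)
# if path is not None:
#     import serial
#     modem = serial.Serial(path)
```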