Multiprocessing, file not found - python

I'm using AlphaPose from GitHub and I'd like to run the script scripts/demo_inference.py from another script, run.py, which I created in the AlphaPose root. In run.py I imported demo_inference.py as ap using this function:
import os
import importlib.util

def import_module_by_path(path):
    name = os.path.splitext(os.path.basename(path))[0]
    spec = importlib.util.spec_from_file_location(name, path)
    mod = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(mod)
    return mod
and
ap = import_module_by_path('./scripts/demo_inference.py')
Then, in demo_inference.py I substituted
if __name__ == "__main__":
with
def startAlphapose():
and in run.py I wrote
ap.startAlphapose()
Now I get this error:
Load SE Resnet...
Loading YOLO model..
Process Process-3:
Traceback (most recent call last):
File "/usr/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/usr/lib/python3.6/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/home/vislab/guerri/alphagastnet/insieme/alphapose/utils/detector.py", line 251, in image_postprocess
(orig_img, im_name, boxes, scores, ids, inps, cropped_boxes) = self.wait_and_get(self.det_queue)
File "/home/vislab/guerri/alphagastnet/insieme/alphapose/utils/detector.py", line 121, in wait_and_get
return queue.get()
File "/usr/lib/python3.6/multiprocessing/queues.py", line 113, in get
return _ForkingPickler.loads(res)
File "/home/vislab/guerri/alphagastnet/lib/python3.6/site-packages/torch/multiprocessing/reductions.py", line 284, in rebuild_storage_fd
fd = df.detach()
File "/usr/lib/python3.6/multiprocessing/resource_sharer.py", line 57, in detach
with _resource_sharer.get_connection(self._id) as conn:
File "/usr/lib/python3.6/multiprocessing/resource_sharer.py", line 87, in get_connection
c = Client(address, authkey=process.current_process().authkey)
File "/usr/lib/python3.6/multiprocessing/connection.py", line 487, in Client
c = SocketClient(address)
File "/usr/lib/python3.6/multiprocessing/connection.py", line 614, in SocketClient
s.connect(address)
FileNotFoundError: [Errno 2] No such file or directory
What does it mean?

We were running into this same problem in our cluster.
When using multiprocessing in PyTorch (typically to run multiple DataLoader workers), the subprocesses create sockets in the /tmp directory to communicate with each other. These sockets are all saved in folders named pymp-###### and look like 0-byte files. Deleting these files or folders while your PyTorch scripts are still running will cause the above error.
In our case, the problem was a buggy maintenance script that was erasing files out of the /tmp folder while they were still needed. It's possible there are other ways to trigger this error. But you should start by looking for those sockets and making sure they aren't getting erased by accident.
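If you want to check, here is a minimal sketch that lists those sockets while a job is running (it assumes the /tmp/pymp-* layout described above):

import glob
import os

# multiprocessing keeps its per-process temp dirs under /tmp as pymp-*;
# the 0-byte entries inside are the Unix sockets the workers talk over
for d in glob.glob('/tmp/pymp-*'):
    for entry in os.listdir(d):
        path = os.path.join(d, entry)
        print(path, os.path.getsize(path), 'bytes')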
If that doesn't solve it, take a look at your /var/log/syslog file at the exact time when the error occurred. You'll very likely find the cause of it there.

Related

Dask client using a remote interpreter

I'm trying to use dask from PyCharm using a remote (SSH) interpreter.
Here's some code:
from dask.distributed import Client
client = Client(processes=True)
On Python 2 this seems to work fine, but on Python 3 this fails with:
Traceback (most recent call last):
File "/usr/lib/python3.8/multiprocessing/forkserver.py", line 278, in main
code = _serve_one(child_r, fds,
File "/usr/lib/python3.8/multiprocessing/forkserver.py", line 317, in _serve_one
code = spawn._main(child_r, parent_sentinel)
File "/usr/lib/python3.8/multiprocessing/spawn.py", line 125, in _main
prepare(preparation_data)
File "/usr/lib/python3.8/multiprocessing/spawn.py", line 236, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "/usr/lib/python3.8/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
main_content = runpy.run_path(main_path,
File "/usr/lib/python3.8/runpy.py", line 261, in run_path
code, fname = _get_code_from_file(run_name, path_name)
File "/usr/lib/python3.8/runpy.py", line 231, in _get_code_from_file
with open(fname, "rb") as f:
FileNotFoundError: [Errno 2] No such file or directory: '/home/ubuntu/<input>'
distributed.nanny - WARNING - Restarting worker
/home/ubuntu exists and is writable. I'm not sure what /home/ubuntu/<input> is, though!
Does anyone have an idea as to what is going wrong here?
EDIT
so after a bit of tracing back I can see that in multiprocessing/spawn.py, around line 226, there is a prepare function which takes a map of properties describing the process to be spawned. For some reason the 'init_main_from_path' entry in that map is being set to /home/ubuntu/<input>, which is a path that doesn't exist anywhere (I'm not even sure it's a valid file name!). It looks like the PyCharm console sets this to help itself with remote execution, but it breaks multiprocessing's ability to correctly spawn a process. If I hack multiprocessing/spawn.py to remove the rogue 'init_main_from_path' key, then I can start the client.
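For context (this is standard multiprocessing behaviour, not something stated in the question): with the spawn and forkserver start methods, each worker re-imports the main module from the recorded 'init_main_from_path', which is why a bogus path blows up inside runpy rather than in your own code. That also means such scripts should carry the usual main guard, even though the guard alone cannot fix a path the IDE injects:

from dask.distributed import Client

if __name__ == '__main__':
    # only the parent process creates the client; spawned workers
    # re-import this module and must not re-run this block
    client = Client(processes=True)
    print(client)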

Exception disappears if I stop at a breakpoint

Python 3.6.2
The problem with the code below is that, when run normally, it raises an exception, but when stepped through in the debugger it works perfectly. The line where I stop in the debugger is marked with a comment.
I tried the command both in the IDE and in the shell; the exception is raised either way, so this problem is not related to the IDE.
This situation shook me a bit.
I made a video of it: https://www.youtube.com/watch?v=OUcMpEzooDk
Could you give me a hand here? How can this be?
A comment on the code below (not related to the problem, just for the most curious):
This is a utility to use with the Django web framework.
Users upload files, they are put to the media directory.
Of course, Django knows where the media directory is situated.
Django then keeps paths relative to media in the database. Something like this:
it_1/705fad82-2f68-4f3c-90c2-116da3ad9a40.txt
e5474da0-0fd3-4fa4-a85f-15c767ac32d4.djvu
I want to verify that the files kept in media correspond exactly to the paths in the database: no extra files, none missing.
Code:
import os
from pathlib import Path

class <Something>():
    def _reveal_lack_extra_files(self):
        path = os.path.join(settings.BASE_DIR, '../media/')
        image_files = Image.objects.values_list("file", flat=True)
        image_files = [Path(os.path.join(path, file)) for file in image_files]
        item_files = ItemFile.objects.values_list("file", flat=True)
        item_files = [Path(os.path.join(path, file)) for file in item_files]
        sheet_files = SheetFile.objects.values_list("file", flat=True)
        sheet_files = [Path(os.path.join(path, file)) for file in sheet_files]
        expected_files = set().union(image_files, item_files, sheet_files)
        real_files = set()
        glob_generator = list(Path(path).glob("**/*"))
        for posix_path in glob_generator:
            if os.path.isfile(posix_path._str):  # Breakpoint
                real_files.add(posix_path)
        lack = expected_files.difference(real_files)
        extra = real_files.difference(expected_files)
        assert bool(lack) == False, "Lack of files: {}".format(lack)
        assert bool(extra) == False, "Extra files: {}".format(extra)
Traceback:
/home/michael/PycharmProjects/venv/photoarchive_4/bin/python /home/michael/Documents/pycharm-community-2017.1.5/helpers/pydev/pydevd.py --multiproc --qt-support --client 127.0.0.1 --port 43849 --file /home/michael/PycharmProjects/photoarchive_4/manage.py checkfiles
warning: Debugger speedups using cython not found. Run '"/home/michael/PycharmProjects/venv/photoarchive_4/bin/python" "/home/michael/Documents/pycharm-community-2017.1.5/helpers/pydev/setup_cython.py" build_ext --inplace' to build.
pydev debugger: process 3840 is connecting
Connected to pydev debugger (build 171.4694.67)
Traceback (most recent call last):
File "/home/michael/Documents/pycharm-community-2017.1.5/helpers/pydev/pydevd.py", line 1591, in <module>
globals = debugger.run(setup['file'], None, None, is_module)
File "/home/michael/Documents/pycharm-community-2017.1.5/helpers/pydev/pydevd.py", line 1018, in run
pydev_imports.execfile(file, globals, locals) # execute the script
File "/home/michael/Documents/pycharm-community-2017.1.5/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "/home/michael/PycharmProjects/photoarchive_4/manage.py", line 22, in <module>
execute_from_command_line(sys.argv)
File "/home/michael/PycharmProjects/venv/photoarchive_4/lib/python3.6/site-packages/django/core/management/__init__.py", line 364, in execute_from_command_line
utility.execute()
File "/home/michael/PycharmProjects/venv/photoarchive_4/lib/python3.6/site-packages/django/core/management/__init__.py", line 356, in execute
self.fetch_command(subcommand).run_from_argv(self.argv)
File "/home/michael/PycharmProjects/venv/photoarchive_4/lib/python3.6/site-packages/django/core/management/base.py", line 283, in run_from_argv
self.execute(*args, **cmd_options)
File "/home/michael/PycharmProjects/venv/photoarchive_4/lib/python3.6/site-packages/django/core/management/base.py", line 330, in execute
output = self.handle(*args, **options)
File "/home/michael/PycharmProjects/photoarchive_4/general/management/commands/checkfiles.py", line 59, in handle
self._reveal_lack_extra_files()
File "/home/michael/PycharmProjects/photoarchive_4/general/management/commands/checkfiles.py", line 39, in _reveal_lack_extra_files
if os.path.isfile(posix_path._str):
AttributeError: _str
Process finished with exit code 1
You're using the _str attribute on paths, which is undocumented and not guaranteed to be set. In general, an underscore prefix indicates a private attribute that should not be used by user code. Here, _str is a lazily computed cache that only gets filled in once something calls str() on the path, which also explains the disappearing exception: inspecting the variable at a breakpoint makes the debugger render it as a string, populating _str as a side effect. If you want to convert a path to a string, just use str(the_path) instead.
But in this case, you don't need to do so: Path objects have an is_file method which you can call instead. Another possibility is to pass the Path object itself to the os.path.isfile function, which is supported on Python 3.6.
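A minimal sketch of the fixed loop from the question, using the suggestions above:

for posix_path in Path(path).glob("**/*"):
    if posix_path.is_file():  # pathlib's own check; no private attributes
        real_files.add(posix_path)
    # equivalent alternatives:
    #   if os.path.isfile(str(posix_path)): ...
    #   if os.path.isfile(posix_path): ...  # os.path accepts Path objects on 3.6+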

Python multiprocessing claims too many open files when no files are even opened

I'm attempting to speed up an algorithm that makes use of a gigantic matrix. I've parallelised it to operate on rows, and put the data matrix in shared memory so the system doesn't get clogged. However, instead of working smoothly as I'd hoped, it now throws a weird error about files, which I don't understand, since I don't even open any files in it.
Here is a mock-up of roughly what's going on in the program proper, with the 1000-iteration for loop being representative of what's happening in the algorithm too:
import multiprocessing
import ctypes
import numpy as np

shared_array_base = multiprocessing.Array(ctypes.c_double, 10*10)
shared_array = np.ctypeslib.as_array(shared_array_base.get_obj())
shared_array = shared_array.reshape(10, 10)

def my_func(i, shared_array):
    shared_array[i, :] = i

def pool_init(_shared_array, _constans):
    global shared_array, constans
    shared_array = _shared_array
    constans = _constans

def pool_my_func(i):
    my_func(i, shared_array)

if __name__ == '__main__':
    for i in np.arange(1000):
        pool = multiprocessing.Pool(8, pool_init, (shared_array, 4))
        pool.map(pool_my_func, range(10))
    print(shared_array)
And this throws this error (I'm on OSX):
Traceback (most recent call last):
File "weird.py", line 24, in <module>
pool = multiprocessing.Pool(8, pool_init, (shared_array, 4))
File "//anaconda/lib/python3.4/multiprocessing/context.py", line 118, in Pool
context=self.get_context())
File "//anaconda/lib/python3.4/multiprocessing/pool.py", line 168, in __init__
self._repopulate_pool()
File "//anaconda/lib/python3.4/multiprocessing/pool.py", line 233, in _repopulate_pool
w.start()
File "//anaconda/lib/python3.4/multiprocessing/process.py", line 105, in start
self._popen = self._Popen(self)
File "//anaconda/lib/python3.4/multiprocessing/context.py", line 267, in _Popen
return Popen(process_obj)
File "//anaconda/lib/python3.4/multiprocessing/popen_fork.py", line 21, in __init__
self._launch(process_obj)
File "//anaconda/lib/python3.4/multiprocessing/popen_fork.py", line 69, in _launch
parent_r, child_w = os.pipe()
OSError: [Errno 24] Too many open files
I'm quite stumped. I don't even open files anywhere in this code. All I want is to pass shared_array to the individual processes in a way that won't clog the system memory; I don't even need to modify it within the parallelised processes, if that helps anything.
Also, in case it matters, the exact error thrown by the proper code itself is a little different:
Traceback (most recent call last):
File "tcap.py", line 206, in <module>
File "tcap.py", line 202, in main
File "tcap.py", line 181, in tcap_cluster
File "tcap.py", line 133, in ap_step
File "//anaconda/lib/python3.4/multiprocessing/context.py", line 118, in Pool
File "//anaconda/lib/python3.4/multiprocessing/pool.py", line 168, in __init__
File "//anaconda/lib/python3.4/multiprocessing/pool.py", line 233, in _repopulate_pool
File "//anaconda/lib/python3.4/multiprocessing/process.py", line 105, in start
File "//anaconda/lib/python3.4/multiprocessing/context.py", line 267, in _Popen
File "//anaconda/lib/python3.4/multiprocessing/popen_fork.py", line 21, in __init__
File "//anaconda/lib/python3.4/multiprocessing/popen_fork.py", line 69, in _launch
OSError: [Errno 24] Too many open files
So yeah, I have no idea how to proceed. Any help would be appreciated. Thanks in advance!
You're creating 1000 process pools and never closing them, so they are never reclaimed; each pool keeps open pipes for communicating between the main process and its children, and those pipes eventually consume all available file descriptors in your main process.
Perhaps you'd want to create the pool once, outside the loop, and reuse it:
pool = multiprocessing.Pool(8, pool_init, (shared_array, 4))
for _ in range(1000):
    pool.map(pool_my_func, range(10))
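If you really do need a fresh pool per iteration, releasing each pool explicitly frees its pipes. A minimal sketch using the standard multiprocessing API (not from the original answer):

for i in np.arange(1000):
    pool = multiprocessing.Pool(8, pool_init, (shared_array, 4))
    pool.map(pool_my_func, range(10))
    pool.close()  # stop accepting new work
    pool.join()   # wait for workers to exit, releasing their file descriptors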
The OS also imposes a limit on the number of open file descriptors per process. Check your limit with:
ulimit -n
For me it was 1024; raising it to 4096 made the script work:
ulimit -n 4096

Python SFTP File Not Found Errors

I'm attempting to download all the files in an SFTP directory to a local folder using the pysftp library. My code looks like this:
import pysftp
sftp = pysftp.Connection('server', username = 'name', password = 'password')
sftp.get_d('Daily_Reports', '/home/jchrysostom/Documents/SupplyChain/Daily_Reports/')
Daily_Reports is a folder that exists on the SFTP server - I have verified this. I have also verified that /home/jchrysostom/Documents/SupplyChain/Daily_Reports/ exists. I can cd to it in terminal with no problems.
However, when I run this Python script, I get the following error: IOError: [Errno 2] File not found.
Any ideas what may be causing this?
UPDATE: A little investigation shows that the files have actually downloaded. In fact, all of them downloaded just fine. However, I'm unable to run the rest of the script, because this call errors out. Is this just a bug in the library?
UPDATE 2 - Full Traceback, as requested:
Traceback (most recent call last):
File "supplychain.py", line 20, in <module>
sftp.get_d('Daily_Reports','/home/jchrysostom/Documents/SupplyChain/Daily_Reports/')
File "/usr/local/lib/python2.7/dist-packages/pysftp.py", line 255, in get_d
preserve_mtime=preserve_mtime)
File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
self.gen.next()
File "/usr/local/lib/python2.7/dist-packages/pysftp.py", line 497, in cd
self.cwd(original_path)
File "/usr/local/lib/python2.7/dist-packages/pysftp.py", line 510, in chdir
self._sftp.chdir(remotepath)
File "/usr/local/lib/python2.7/dist-packages/paramiko/sftp_client.py", line 580, in chdir
if not stat.S_ISDIR(self.stat(path).st_mode):
File "/usr/local/lib/python2.7/dist-packages/paramiko/sftp_client.py", line 413, in stat
t, msg = self._request(CMD_STAT, path)
File "/usr/local/lib/python2.7/dist-packages/paramiko/sftp_client.py", line 729, in _request
return self._read_response(num)
File "/usr/local/lib/python2.7/dist-packages/paramiko/sftp_client.py", line 776, in _read_response
self._convert_status(msg)
File "/usr/local/lib/python2.7/dist-packages/paramiko/sftp_client.py", line 802, in _convert_status
raise IOError(errno.ENOENT, text)
IOError: [Errno 2] File not found
As best I can tell, this is a bug in pysftp. The files are being copied successfully, but (at least according to the traceback here) the library is blowing up when it tries to change back to the original remote working directory on the SFTP server.
Workaround is to iterate over the files in the directory and get() each individually...
for filename in sftp.listdir('Daily_Reports'):
    sftp.get('Daily_Reports/' + filename,
             localpath='/home/jchrysostom/Documents/SupplyChain/Daily_Reports/' + filename)
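If Daily_Reports can ever contain subdirectories, listdir will return those entries too, so a guarded variant may be safer. A sketch using pysftp's Connection.isfile (the paths are the ones from the question):

for filename in sftp.listdir('Daily_Reports'):
    remote = 'Daily_Reports/' + filename
    if sftp.isfile(remote):  # skip subdirectory entries
        sftp.get(remote,
                 localpath='/home/jchrysostom/Documents/SupplyChain/Daily_Reports/' + filename)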

Running install mod_dav_svn and can't decipher thread error

I am relatively new to Python and, at the same time, am attempting to install mod_dav_svn into my Apache web server. I am looking to get some idea of the scope of the error I'm receiving.
At the command line, I type 'sudo yum install mod_dav_svn' and receive this output:
Loaded plugins: fastestmirror
Determining fastest mirrors
Traceback (most recent call last):
File "/usr/bin/yum", line 29, in ?
yummain.user_main(sys.argv[1:], exit_code=True)
File "/usr/share/yum-cli/yummain.py", line 229, in user_main
errcode = main(args)
File "/usr/share/yum-cli/yummain.py", line 104, in main
result, resultmsgs = base.doCommands()
File "/usr/share/yum-cli/cli.py", line 339, in doCommands
self._getTs(needTsRemove)
File "/usr/lib/python2.4/site-packages/yum/depsolve.py", line 101, in _getTs
self._getTsInfo(remove_only)
File "/usr/lib/python2.4/site-packages/yum/depsolve.py", line 112, in _getTsInfo
pkgSack = self.pkgSack
File "/usr/lib/python2.4/site-packages/yum/init.py", line 591, in
pkgSack = property(fget=lambda self: self._getSacks(),
File "/usr/lib/python2.4/site-packages/yum/init.py", line 434, in _getSacks
self.repos.populateSack(which=repos)
File "/usr/lib/python2.4/site-packages/yum/repos.py", line 223, in populateSack
self.doSetup()
File "/usr/lib/python2.4/site-packages/yum/repos.py", line 71, in doSetup
self.ayum.plugins.run('postreposetup')
File "/usr/lib/python2.4/site-packages/yum/plugins.py", line 176, in run
func(conduitcls(self, self.base, conf, **kwargs))
File "/usr/lib/yum-plugins/fastestmirror.py", line 181, in postreposetup_hook
all_urls = FastestMirror(all_urls).get_mirrorlist()
File "/usr/lib/yum-plugins/fastestmirror.py", line 333, in get_mirrorlist
self._poll_mirrors()
File "/usr/lib/yum-plugins/fastestmirror.py", line 376, in _poll_mirrors
pollThread.start()
File "/usr/lib/python2.4/threading.py", line 416, in start
_start_new_thread(self.__bootstrap, ())
thread.error: can't start new thread
The only other question I could find with a similar error was this one: https://stackoverflow.com/search?q=can%27t+start+new+thread+python, but I'm not sure it has to do with running too many threads, since I am the only one using the server and this is one of the first Python commands I have used. Could someone point me in the right direction, or towards some material that may help me troubleshoot the issue? Thanks!
Disable yum-fastestmirror and you should be able to complete the install:
yum --disableplugin=fastestmirror update
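For background (inferred from the traceback above, where _poll_mirrors calls pollThread.start()): the plugin spawns one thread per mirror to time them, and thread creation fails like this when memory or the per-user process limit runs out, which is common on small machines. To turn the plugin off permanently rather than per command, the usual knob is its config file; the path below is the standard location on CentOS, so verify it on your system:

# /etc/yum/pluginconf.d/fastestmirror.conf
[main]
enabled=0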
Are you on a virtual machine?
