How to get/set logical directory path in python

How to get/set logical directory path in python - python

In python is it possible to get or set a logical directory (as opposed to an absolute one).
For example if I have:
/real/path/to/dir
and I have
/linked/path/to/dir
linked to the same directory.
using os.getcwd and os.chdir will always use the absolute path
>>> import os
>>> os.chdir('/linked/path/to/dir')
>>> print os.getcwd()
/real/path/to/dir
The only way I have found to get around this at all is to launch 'pwd' in another process and read the output. However, this only works until you call os.chdir for the first time.

The underlying operational system / shell reports real paths to python.
So, there really is no way around it, since os.getcwd() is a wrapped call to C Library getcwd() function.
There are some workarounds in the spirit of the one that you already know which is launching pwd.
Another one would involve using os.environ['PWD']. If that environmnent variable is set you can make some getcwd function that respects it.
The solution below combines both:
import os
from subprocess import Popen, PIPE
class CwdKeeper(object):
def __init__(self):
self._cwd = os.environ.get("PWD")
if self._cwd is None: # no environment. fall back to calling pwd on shell
self._cwd = Popen('pwd', stdout=PIPE).communicate()[0].strip()
self._os_getcwd = os.getcwd
self._os_chdir = os.chdir
def chdir(self, path):
if not self._cwd:
return self._os_chdir(path)
p = os.path.normpath(os.path.join(self._cwd, path))
result = self._os_chdir(p)
self._cwd = p
os.environ["PWD"] = p
return result
def getcwd(self):
if not self._cwd:
return self._os_getcwd()
return self._cwd
cwd = CwdKeeper()
print cwd.getcwd()
# use only cwd.chdir and cwd.getcwd from now on.
# monkeypatch os if you want:
os.chdir = cwd.chdir
os.getcwd = cwd.getcwd
# now you can use os.chdir and os.getcwd as normal.

This also does the trick for me:
import os
os.popen('pwd').read().strip('\n')
Here is a demonstration in python shell:
>>> import os
>>> os.popen('pwd').read()
'/home/projteam/staging/site/proj\n'
>>> os.popen('pwd').read().strip('\n')
'/home/projteam/staging/site/proj'
>>> # Also works if PWD env var is set
>>> os.getenv('PWD')
'/home/projteam/staging/site/proj'
>>> # This gets actual path, not symlinked path
>>> import subprocess
>>> p = subprocess.Popen('pwd', stdout=subprocess.PIPE)
>>> p.communicate()[0] # returns non-symlink path
'/home/projteam/staging/deploys/20150114-141114/site/proj\n'
Getting the environment variable PWD didn't always work for me so I use the popen method. Cheers!

Related

how to preserve module path of a module executed as a script

I have a function called get_full_class_name(instance), which returns the full module-qualified class name of instance.
Example my_utils.py:
def get_full_class_name(instance):
return '.'.join([instance.__class__.__module__,
instance.__class__.__name__])
Unfortunately, this function fails when given a class that's defined in a currently running script.
Example my_module.py:
#! /usr/bin/env python
from my_utils import get_full_class_name
class MyClass(object):
pass
def main():
print get_full_class_name(MyClass())
if __name__ == '__main__':
main()
When I run the above script, instead of printing my_module.MyClass, it prints __main__.MyClass:
$ ./my_module.py
__main__.MyClass
I do get the desired behavior if I run the above main() from another script.
Example run_my_module.py:
#! /usr/bin/env python
from my_module import main
if __name__ == '__main__':
main()
Running the above script gets:
$ ./run_my_module.py
my_module.MyClass
Is there a way I could write the get_full_class_name() function such that it always returns my_module.MyClass regardless of whether my_module is being run as a script?

I propose handling the case __name__ == '__main__' using the techniques discussed in Find Path to File Being Run. This results in this new my_utils:
import sys
import os.path
def get_full_class_name(instance):
if instance.__class__.__module__ == '__main__':
return '.'.join([os.path.basename(sys.argv[0]),
instance.__class__.__name__])
else:
return '.'.join([instance.__class__.__module__,
instance.__class__.__name__])
This does not handle interactive sessions and other special cases (like reading from stdin). For this you may have to include techniques like discussed in detect python running interactively.

Following mkiever's answer, I ended up changing get_full_class_name() to what you see below.
If instance.__class__.__module__ is __main__, it doesn't use that as the module path. Instead, it uses the relative path from sys.argv[0] to the closest directory in sys.path.
One problem is that sys.path always includes the directory of sys.argv[0] itself, so this relative path ends up being just the filename part of sys.argv[0]. As a quick hack-around, the code below assumes that the sys.argv[0] directory is always the first element of sys.path, and disregards it. This seems unsafe, but safer options are too tedious for my personal code for now.
Any better solutions/suggestions would be greatly appreciated.
import os
import sys
from nose.tools import assert_equal, assert_not_equal
def get_full_class_name(instance):
'''
Returns the fully-qualified class name.
Handles the case where a class is declared in the currently-running script
(where instance.__class__.__module__ would be set to '__main__').
'''
def get_module_name(instance):
def get_path_relative_to_python_path(path):
path = os.path.abspath(path)
python_paths = [os.path.abspath(p) for p in sys.path]
assert_equal(python_paths[0],
os.path.split(os.path.abspath(sys.argv[0]))[0])
python_paths = python_paths[1:]
min_relpath_length = len(path)
result = None
for python_path in python_paths:
relpath = os.path.relpath(path, python_path)
if len(relpath) < min_relpath_length:
min_relpath_length = len(relpath)
result = os.path.join(os.path.split(python_path)[-1],
relpath)
if result is None:
raise ValueError("Path {} doesn't seem to be in the "
"PYTHONPATH.".format(path))
else:
return result
if instance.__class__.__module__ == '__main__':
script_path = os.path.abspath(sys.argv[0])
relative_path = get_path_relative_to_python_path(script_path)
relative_path = relative_path.split(os.sep)
assert_not_equal(relative_path[0], '')
assert_equal(os.path.splitext(relative_path[-1])[1], '.py')
return '.'.join(relative_path[1:-1])
else:
return instance.__class__.__module__
module_name = get_module_name(instance)
return '.'.join([module_name, instance.__class__.__name__])

sh.cd using context manager

here is what I am basically trying to do:
import sh, os
with sh.cd('/tmp'):
print os.getcwd()
print os.getcwd()
I get the following error though
line 3, in <module>
with sh.cd('/tmp'):
AttributeError: __exit__
What am I missing here? Are there alternative solutions to change directory within a context?

You can't use just any class/function as a context manager, it has to actually explicitly be implemented that way, using either the contextlib.contextmanager decorator on a function, or in the case of a class, by defining the __enter__ and __exit__ instance methods.
The sh.cd function you're using is simply a wrapper around os.chdir:
>>> import sh
>>> sh.cd
<bound method Environment.b_cd of {}>
b_cd is defined as:
def b_cd(self, path):
os.chdir(path)
As you can see, it's just a normal function; it can't be used as a context manager.
The link whereswalden provided shows a good way of implementing the behavior you want as a class. It could similarly be implemented as a function like this:
import contextlib
import os
#contextlib.contextmanager
def cd(path):
old_path = os.getcwd()
os.chdir(path)
try:
yield
finally:
os.chdir(old_path)
Sample usage:
print(os.getcwd())
with cd("/"):
print os.getcwd()
print(os.getcwd())
Output:
'/home/dan'
'/'
'/home/dan'

sh now has the pushd() function that can be used as a context manager to change the current directory temporarily:
import sh
with sh.pushd("/tmp"):
sh.touch("a_file")
See https://amoffat.github.io/sh/sections/command_class.html?highlight=pushd#pushd

Python - bulk promote variables to parent scope

In python 2.7, I want to run:
$ ./script.py initparms.py
This is a trick to supply a parameter file to script.py, since initparms.py contains several python variables e.g.
Ldir = '/home/marzipan/jelly'
LMaps = True
# etc.
script.py contains:
X = __import__(sys.argv[1])
Ldir = X.Ldir
LMaps = X.Lmaps
# etc.
I want to do a bulk promotion of the variables in X so they are available to script.py, without spelling out each one in the code by hand.
Things like
import __import__(sys.argv[1])
or
from sys.argv[1] import *
don't work. Almost there perhaps... Any ideas? Thanks!

here's a one-liner:
globals().update(__import__(sys.argv[1]).__dict__)

You can use execfile:
execfile(sys.argv[1])
Of course, the usual warnings with exec or eval apply (Your script has no way of knowing whether it is running trusted or untrusted code).
My suggestion would be to not do what you're doing and instead use configparser and handling the configuration though there.

You could do something like this:
import os
import imp
import sys
try:
module_name = sys.argv[1]
module_info = imp.find_module(module_name, [os.path.abspath(os.path.dirname(__file__))] + sys.path)
module_properties = imp.load_module(module_name, *module_info)
except ImportError:
pass
else:
try:
attrlist = module_properties.__all__
except AttributeError:
attrlist = dir(module_properties)
for attr in attrlist:
if attr.startswith('__'):
continue
globals()[attr] = getattr(module_properties, attr)
Little complicated, but gets the job done.

pushd through os.system

I'm using a crontab to run a maintenance script for my minecraft server. Most of the time it works fine, unless the crontab tries to use the restart script. If I run the restart script manually, there aren't any issues. Because I believe it's got to do with path names, I'm trying to make sure it's always doing any minecraft command FROM the minecraft directory. So I'm encasing the command in pushd/popd:
os.system("pushd /directory/path/here")
os.system("command to sent to minecraft")
os.system("popd")
Below is an interactive session taking minecraft out of the equation. A simple "ls" test. As you can see, it does not at all run the os.system command from the pushd directory, but instead from /etc/ which is the directory in which I was running python to illustrate my point.Clearly pushd isn't working via python, so I'm wondering how else I can achieve this. Thanks!
>>> def test():
... import os
... os.system("pushd /home/[path_goes_here]/minecraft")
... os.system("ls")
... os.system("popd")
...
>>> test()
~/minecraft /etc
DIR_COLORS cron.weekly gcrypt inputrc localtime mime.types ntp ppp rc3.d sasldb2 smrsh vsftpd.ftpusers
DIR_COLORS.xterm crontab gpm-root.conf iproute2 login.defs mke2fs.conf ntp.conf printcap rc4.d screenrc snmp vsftpd.tpsave
X11 csh.cshrc group issue logrotate.conf modprobe.d odbc.ini profile rc5.d scsi_id.config squirrelmail vz
adjtime csh.login group- issue.net logrotate.d motd odbcinst.ini profile.d rc6.d securetty ssh warnquota.conf
aliases cyrus.conf host.conf java lvm mtab openldap protocols redhat-release security stunnel webalizer.conf
alsa dbus-1 hosts jvm lynx-site.cfg multipath.conf opt quotagrpadmins resolv.conf selinux sudoers wgetrc
alternatives default hosts.allow jvm-commmon lynx.cfg my.cnf pam.d quotatab rndc.key sensors.conf sysconfig xinetd.conf
bashrc depmod.d hosts.deny jwhois.conf mail named.caching-nameserver.conf passwd rc rpc services sysctl.conf xinetd.d
blkid dev.d httpd krb5.conf mail.rc named.conf passwd- rc.d rpm sestatus.conf termcap yum
cron.d environment imapd.conf ld.so.cache mailcap named.rfc1912.zones pear.conf rc.local rsyslog.conf setuptool.d udev yum.conf
cron.daily exports imapd.conf.tpsave ld.so.conf mailman netplug php.d rc.sysinit rwtab shadow updatedb.conf yum.repos.d
cron.deny filesystems init.d ld.so.conf.d makedev.d netplug.d php.ini rc0.d rwtab.d shadow- vimrc
cron.hourly fonts initlog.conf libaudit.conf man.config nscd.conf pki rc1.d samba shells virc
cron.monthly fstab inittab libuser.conf maven nsswitch.conf postfix rc2.d sasl2 skel vsftpd
sh: line 0: popd: directory stack empty
===
(CentOS server with python 2.4)

In Python 2.5 and later, I think a better method would be using a context manager, like so:
import contextlib
import os
#contextlib.contextmanager
def pushd(new_dir):
previous_dir = os.getcwd()
os.chdir(new_dir)
try:
yield
finally:
os.chdir(previous_dir)
You can then use it like the following:
with pushd('somewhere'):
print os.getcwd() # "somewhere"
print os.getcwd() # "wherever you started"
By using a context manager you will be exception and return value safe: your code will always cd back to where it started from, even if you throw an exception or return from inside the context block.
You can also nest pushd calls in nested blocks, without having to rely on a global directory stack:
with pushd('somewhere'):
# do something
with pushd('another/place'):
# do something else
# do something back in "somewhere"

Each shell command runs in a separate process. It spawns a shell, executes the pushd command, and then the shell exits.
Just write the commands in the same shell script:
os.system("cd /directory/path/here; run the commands")
A nicer (perhaps) way is with the subprocess module:
from subprocess import Popen
Popen("run the commands", shell=True, cwd="/directory/path/here")

pushd and popd have some added functionality: they store previous working directories in a stack - in other words, you can pushd five times, do some stuff, and popd five times to end up where you started. You're not using that here, but it might be useful for others searching for the questions like this. This is how you can emulate it:
# initialise a directory stack
pushstack = list()
def pushdir(dirname):
global pushstack
pushstack.append(os.getcwd())
os.chdir(dirname)
def popdir():
global pushstack
os.chdir(pushstack.pop())

I don't think you can call pushd from within an os.system() call:
>>> import os
>>> ret = os.system("pushd /tmp")
sh: pushd: not found
Maybe just maybe your system actually provides a pushd binary that triggers a shell internal function (I think I've seen this on FreeBSD beforeFreeBSD has some tricks like this, but not for pushd), but the current working directory of a process cannot be influenced by other processes -- so your first system() starts a shell, runs a hypothetical pushd, starts a shell, runs ls, starts a shell, runs a hypothetical popd... none of which influence each other.
You can use os.chdir("/home/path/") to change path: http://docs.python.org/library/os.html#os-file-dir

No need to use pushd -- just use os.chdir:
>>> import os
>>> os.getcwd()
'/Users/me'
>>> os.chdir('..')
>>> os.getcwd()
'/Users'
>>> os.chdir('me')
>>> os.getcwd()
'/Users/me'

Or make a class to use with 'with'
import os
class pushd: # pylint: disable=invalid-name
__slots__ = ('_pushstack',)
def __init__(self, dirname):
self._pushstack = list()
self.pushd(dirname)
def __enter__(self):
return self
def __exit__(self, exec_type, exec_val, exc_tb) -> bool:
# skip all the intermediate directories, just go back to the original one.
if self._pushstack:
os.chdir(self._pushstack.pop(0)))
if exec_type:
return False
return True
def popd(self) -> None:
if len(self._pushstack):
os.chdir(self._pushstack.pop())
def pushd(self, dirname) -> None:
self._pushstack.append(os.getcwd())
os.chdir(dirname)
with pushd(dirname) as d:
... do stuff in that dirname
d.pushd("../..")
d.popd()

If you really need a stack, i.e. if you want to do several pushd and popd,
see naught101 above.
If not, simply do:
olddir = os.getcwd()
os.chdir('/directory/path/here')
os.system("command to sent to minecraft")
os.chdir(olddir)

Is there a standard way to list names of Python modules in a package?

Is there a straightforward way to list the names of all modules in a package, without using __all__?
For example, given this package:
/testpkg
/testpkg/__init__.py
/testpkg/modulea.py
/testpkg/moduleb.py
I'm wondering if there is a standard or built-in way to do something like this:
>>> package_contents("testpkg")
['modulea', 'moduleb']
The manual approach would be to iterate through the module search paths in order to find the package's directory. One could then list all the files in that directory, filter out the uniquely-named py/pyc/pyo files, strip the extensions, and return that list. But this seems like a fair amount of work for something the module import mechanism is already doing internally. Is that functionality exposed anywhere?

Using python2.3 and above, you could also use the pkgutil module:
>>> import pkgutil
>>> [name for _, name, _ in pkgutil.iter_modules(['testpkg'])]
['modulea', 'moduleb']
EDIT: Note that the parameter for pkgutil.iter_modules is not a list of modules, but a list of paths, so you might want to do something like this:
>>> import os.path, pkgutil
>>> import testpkg
>>> pkgpath = os.path.dirname(testpkg.__file__)
>>> print([name for _, name, _ in pkgutil.iter_modules([pkgpath])])

import module
help(module)

Maybe this will do what you're looking for?
import imp
import os
MODULE_EXTENSIONS = ('.py', '.pyc', '.pyo')
def package_contents(package_name):
file, pathname, description = imp.find_module(package_name)
if file:
raise ImportError('Not a package: %r', package_name)
# Use a set because some may be both source and compiled.
return set([os.path.splitext(module)[0]
for module in os.listdir(pathname)
if module.endswith(MODULE_EXTENSIONS)])

Don't know if I'm overlooking something, or if the answers are just out-dated but;
As stated by user815423426 this only works for live objects and the listed modules are only modules that were imported before.
Listing modules in a package seems really easy using inspect:
>>> import inspect, testpkg
>>> inspect.getmembers(testpkg, inspect.ismodule)
['modulea', 'moduleb']

This is a recursive version that works with python 3.6 and above:
import importlib.util
from pathlib import Path
import os
MODULE_EXTENSIONS = '.py'
def package_contents(package_name):
spec = importlib.util.find_spec(package_name)
if spec is None:
return set()
pathname = Path(spec.origin).parent
ret = set()
with os.scandir(pathname) as entries:
for entry in entries:
if entry.name.startswith('__'):
continue
current = '.'.join((package_name, entry.name.partition('.')[0]))
if entry.is_file():
if entry.name.endswith(MODULE_EXTENSIONS):
ret.add(current)
elif entry.is_dir():
ret.add(current)
ret |= package_contents(current)
return ret

There is a __loader__ variable inside each package instance. So, if you import the package, you can find the "module resources" inside the package:
import testpkg # change this by your package name
for mod in testpkg.__loader__.get_resource_reader().contents():
print(mod)
You can of course improve the loop to find the "module" name:
import testpkg
from pathlib import Path
for mod in testpkg.__loader__.get_resource_reader().contents():
# You can filter the name like
# Path(l).suffix not in (".py", ".pyc")
print(Path(mod).stem)
Inside the package, you can find your modules by directly using __loader__ of course.

This should list the modules:
help("modules")

If you would like to view an inforamtion about your package outside of the python code (from a command prompt) you can use pydoc for it.
# get a full list of packages that you have installed on you machine
$ python -m pydoc modules
# get information about a specific package
$ python -m pydoc <your package>
You will have the same result as pydoc but inside of interpreter using help
>>> import <my package>
>>> help(<my package>)

Based on cdleary's example, here's a recursive version listing path for all submodules:
import imp, os
def iter_submodules(package):
file, pathname, description = imp.find_module(package)
for dirpath, _, filenames in os.walk(pathname):
for filename in filenames:
if os.path.splitext(filename)[1] == ".py":
yield os.path.join(dirpath, filename)

The other answers here will run the code in the package as they inspect it. If you don't want that, you can grep the files like this answer
def _get_class_names(file_name: str) -> List[str]:
"""Get the python class name defined in a file without running code
file_name: the name of the file to search for class definitions in
return: all the classes defined in that python file, empty list if no matches"""
defined_class_names = []
# search the file for class definitions
with open(file_name, "r") as file:
for line in file:
# regular expression for class defined in the file
# searches for text that starts with "class" and ends with ( or :,
# whichever comes first
match = re.search("^class(.+?)(\(|:)", line) # noqa
if match:
# add the cleaned match to the list if there is one
defined_class_name = match.group(1).strip()
defined_class_names.append(defined_class_name)
return defined_class_names

To complete #Metal3d answer, yes you can do testpkg.__loader__.get_resource_reader().contents() to list the "module resources" but it will work only if you imported your package in the "normal" way and your loader is _frozen_importlib_external.SourceFileLoader object.
But if you imported your library with zipimport (ex: to load your package in memory), your loader will be a zipimporter object, and its get_resource_reader function is different from importlib; it will require a "fullname" argument.
To make it work in these two loaders, just specify your package name in argument to get_resource_reader :
# An example with CrackMapExec tool
import importlib
import cme.protocols as cme_protocols
class ProtocolLoader:
def get_protocols(self):
protocols = {}
protocols_names = [x for x in cme_protocols.__loader__.get_resource_reader("cme.protocols").contents()]
for prot_name in protocols_names:
prot = importlib.import_module(f"cme.protocols.{prot_name}")
protocols[prot_name] = prot
return protocols

def package_contents(package_name):
package = __import__(package_name)
return [module_name for module_name in dir(package) if not module_name.startswith("__")]

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

How to get/set logical directory path in python - python

Related

how to preserve module path of a module executed as a script

sh.cd using context manager

Python - bulk promote variables to parent scope

pushd through os.system

Is there a standard way to list names of Python modules in a package?

Categories

Resources