pathlib relative path issue for upper/parent level directories [duplicate] - python

The python library pathlib provides Path.relative_to. This function works fine if one path is a subpath of the other one, like this:
from pathlib import Path
foo = Path("C:\\foo")
bar = Path("C:\\foo\\bar")
bar.relative_to(foo)
> WindowsPath('bar')
However, if two paths are on the same level, relative_to does not work.
baz = Path("C:\\baz")
foo.relative_to(baz)
> ValueError: 'C:\\foo' does not start with 'C:\\baz'
I would expect the result to be
WindowsPath("..\\foo")
The function os.path.relpath does this correctly:
import os
foo = "C:\\foo"
bar = "C:\\bar"
os.path.relpath(foo, bar)
> '..\\foo'
Is there a way to achieve the functionality of os.path.relpath using pathlib.Path?

The first section solves the OP's problem. If, like me, the OP really wanted a path relative to a common root, the second section covers that. The third section describes how I originally approached it and is kept for interest's sake.
Relative Paths
Since Python 3.6, the os.path functions accept pathlib.Path objects. os.path.relpath still returns a string rather than a Path, however, so you have to wrap the result yourself.
foo = Path("C:\\foo")
baz = Path("C:\\baz")
Path(os.path.relpath(foo, baz))
> Path("..\\foo")
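As a minimal sketch of the wrapping described above, the helper below passes both arguments through os.path.relpath and converts the string result back into a Path. The name relpath_as_path is made up for illustration. With absolute inputs the computation is purely lexical.

```python
import os
from pathlib import Path

def relpath_as_path(target, start):
    """Return `target` relative to `start` as a Path (hypothetical helper)."""
    # os.path.relpath accepts Path objects (Python 3.6+) but returns a str,
    # so the result is wrapped in Path again.
    return Path(os.path.relpath(target, start))

print(relpath_as_path(Path("/foo"), Path("/baz")))
```

Because the result is an ordinary Path, it can be joined or compared like any other pathlib object.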
Common Path
My suspicion is that you're really looking for a path relative to a common root. If that is the case, the following, from EOL, is more useful:
Path(os.path.commonpath([foo, baz]))
> Path('c:/root')
Common Prefix
Before I'd struck upon os.path.commonpath I'd used os.path.commonprefix.
foo = Path("C:\\foo")
baz = Path("C:\\baz")
baz.relative_to(os.path.commonprefix([baz,foo]))
> Path('baz')
But be forewarned: you are not supposed to use it in this context, because it compares strings character by character rather than path component by component (see "commonprefix: Yes, that old chestnut").
foo = Path("C:\\route66\\foo")
baz = Path("C:\\route44\\baz")
baz.relative_to(os.path.commonprefix([baz,foo]))
> ...
> ValueError : `c:\\route44\baz` does not start with `C:\\route`
Use the following one from J. F. Sebastian instead, which compares whole path components:
Path(*os.path.commonprefix([foo.parts, baz.parts]))
> Path('c:/root')
... or if you're feeling verbose ...
from itertools import takewhile
Path(*[set(i).pop() for i in (takewhile(lambda x : x[0]==x[1], zip(foo.parts, baz.parts)))])
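To make the pitfall concrete, here is a small sketch contrasting the two functions on the route66/route44 example. It uses the posixpath module and forward-slash paths (an assumption made so that the result is the same on every platform).

```python
import posixpath

paths = ["/route66/foo", "/route44/baz"]

# commonprefix compares character by character, so it can stop in the
# middle of a path component and return something that is not a real path.
char_prefix = posixpath.commonprefix(paths)

# commonpath compares whole components and always returns a valid path.
component_prefix = posixpath.commonpath(paths)

print(char_prefix)       # /route
print(component_prefix)  # /
```

The "/route" result is why relative_to raises ValueError in the example above: no such directory is a parent of either path.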

This was bugging me, so here's a pathlib-only version that I think does what os.path.relpath does.
def relpath(path_to, path_from):
    path_to = Path(path_to).resolve()
    path_from = Path(path_from).resolve()
    try:
        for p in (*reversed(path_from.parents), path_from):
            head, tail = p, path_to.relative_to(p)
    except ValueError:  # Stop when the paths diverge.
        pass
    return Path('../' * (len(path_from.parents) - len(head.parents))).joinpath(tail)

A recursive version of Brett Ryland's relpath for pathlib. I find this a tad more readable, and since it succeeds on the first try in most cases, it should perform about as well as the original relative_to function:
def relative(target: Path, origin: Path):
    """ return path of target relative to origin """
    try:
        return Path(target).resolve().relative_to(Path(origin).resolve())
    except ValueError:  # target does not start with origin
        # recursion with origin's parent (eventually origin is root so try will succeed);
        # resolve first, otherwise a relative origin such as '.' never reaches the root
        return Path('..').joinpath(relative(target, Path(origin).resolve().parent))
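For comparison, here is a purely lexical sketch of the same idea that works on path components and never touches the file system. The function name lexical_relpath is hypothetical, and PurePosixPath is used only so that the example behaves the same on every platform. Unlike the two versions above, it does not call resolve(), so symlinks and relative inputs are not normalised.

```python
from pathlib import PurePosixPath

def lexical_relpath(target, origin):
    """Return `target` relative to `origin`, using '..' where needed."""
    target_parts = PurePosixPath(target).parts
    origin_parts = PurePosixPath(origin).parts

    # Count the leading components the two paths share.
    common = 0
    for a, b in zip(target_parts, origin_parts):
        if a != b:
            break
        common += 1

    # One '..' for each remaining component of origin, then the rest of target.
    ups = [".."] * (len(origin_parts) - common)
    return PurePosixPath(*ups, *target_parts[common:])

print(lexical_relpath("/foo", "/baz"))
```

On Python 3.12 and later, relative_to itself accepts a walk_up=True argument that produces the same kind of result.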

Get Python's LIB path

I can see that INCLUDE path is sysconfig.get_path('include').
But I don't see any similar value for LIB.
NumPy outright hardcodes it as os.path.join(sys.prefix, "libs") in Windows and get_config_var('LIBDIR') (not documented and missing in Windows) otherwise.
Is there a more supported way?
Since it's not a part of any official spec/doc, and, as shown by another answer, there are cases when none of appropriate variables from sysconfig/distutils.sysconfig .get_config_var() are set,
the only way to reliably get it in all cases, exactly as a build would (e.g. even for a Python in the sourcetree) is to delegate to the reference implementation.
In distutils, the logic that sets the library path for a compiler is located in distutils.command.build_ext.build_ext.finalize_options(). So, this code would get it with no side effects on the build:
import distutils.command.build_ext  # imports distutils.core, too
d = distutils.core.Distribution()
b = distutils.command.build_ext.build_ext(d)  # or `d.get_command_class('build_ext')(d)`,
                                              # then it's enough to import distutils.core
b.finalize_options()
print(b.library_dirs)
Note that:
Not all locations in the resulting list necessarily exist.
If your setup.py is setuptools-based, use setuptools.Distribution and setuptools.command.build_ext instead, correspondingly.
If you pass any values to setup() that affect the result, you must pass them to Distribution here, too.
Since there are no guarantees that the set of the additional values you need to pass will stay the same, and the value is only needed when building an extension,
it seems like you aren't really supposed to get this value independently at all:
If you're using another build facility, you should rather subclass build_ext and get the value from the base method during the build.
Below is the (rather long) subroutine in skbuild.cmaker that locates libpythonxx.so/pythonxx.lib for the running Python. In CMake, 350-line Modules/FindPythonLibs.cmake is dedicated to this task.
The part of the former that gets just the directory is much simpler though:
import os, sysconfig, distutils.sysconfig
libdir = distutils.sysconfig.get_config_var('LIBDIR')
if sysconfig.get_config_var('MULTIARCH'):
    masd = sysconfig.get_config_var('multiarchsubdir')
    if masd:
        if masd.startswith(os.sep):
            masd = masd[len(os.sep):]
        libdir = os.path.join(libdir, masd)
if libdir is None:
    libdir = os.path.abspath(os.path.join(
        sysconfig.get_config_var('LIBDEST'), "..", "libs"))
def get_python_library(python_version):
    """Get path to the python library associated with the current python
    interpreter."""
    # determine direct path to libpython
    python_library = sysconfig.get_config_var('LIBRARY')
    # if static (or nonexistent), try to find a suitable dynamic libpython
    if (python_library is None or
            os.path.splitext(python_library)[1][-2:] == '.a'):
        candidate_lib_prefixes = ['', 'lib']
        candidate_extensions = ['.lib', '.so', '.a']
        if sysconfig.get_config_var('WITH_DYLD'):
            candidate_extensions.insert(0, '.dylib')
        candidate_versions = [python_version]
        if python_version:
            candidate_versions.append('')
            candidate_versions.insert(
                0, "".join(python_version.split(".")[:2]))
        abiflags = getattr(sys, 'abiflags', '')
        candidate_abiflags = [abiflags]
        if abiflags:
            candidate_abiflags.append('')
        # Ensure the value injected by virtualenv is
        # returned on windows.
        # Because calling `sysconfig.get_config_var('multiarchsubdir')`
        # returns an empty string on Linux, `du_sysconfig` is only used to
        # get the value of `LIBDIR`.
        libdir = du_sysconfig.get_config_var('LIBDIR')
        if sysconfig.get_config_var('MULTIARCH'):
            masd = sysconfig.get_config_var('multiarchsubdir')
            if masd:
                if masd.startswith(os.sep):
                    masd = masd[len(os.sep):]
                libdir = os.path.join(libdir, masd)
        if libdir is None:
            libdir = os.path.abspath(os.path.join(
                sysconfig.get_config_var('LIBDEST'), "..", "libs"))
        candidates = (
            os.path.join(
                libdir,
                ''.join((pre, 'python', ver, abi, ext))
            )
            for (pre, ext, ver, abi) in itertools.product(
                candidate_lib_prefixes,
                candidate_extensions,
                candidate_versions,
                candidate_abiflags
            )
        )
        for candidate in candidates:
            if os.path.exists(candidate):
                # we found a (likely alternate) libpython
                python_library = candidate
                break
    # TODO(opadron): what happens if we don't find a libpython?
    return python_library
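For interpreters where distutils is no longer available (it was removed from the standard library in Python 3.12), a rough sketch of the directory lookup using only sysconfig might look like the following. The function name guess_libdir is made up, and the fallback to a "libs" folder under sys.prefix simply mirrors the NumPy approach mentioned in the question. It is a best-effort guess, not a supported API, and the returned directory is not guaranteed to exist.

```python
import os
import sys
import sysconfig

def guess_libdir():
    """Best-effort guess of the directory holding the Python library."""
    libdir = sysconfig.get_config_var("LIBDIR")
    if libdir is None:
        # LIBDIR is typically unset on Windows; fall back to <prefix>/libs,
        # as the NumPy code quoted in the question does.
        libdir = os.path.join(sys.prefix, "libs")
    return libdir

print(guess_libdir())
```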

Python: error handling in recursive functions

Me: I am running Python 2.3.3 without the possibility to upgrade and I don't have much experience with Python. My method for learning is googling and reading tons of Stack Overflow.
Background: I am creating a python script whose purpose is to take two directories as arguments and then perform comparisons/diff of all the files found within the two directories. The directories have sub-directories that also have to be included in the diff.
Each directory is a List and sub-directories are nested Lists and so on...
the two directories:
oldfiles/
    a_tar_ball.tar
    a_text_file.txt
    nest1/
        file_in_nest
        nest1a/
            file_in_nest
newfiles/
    a_tar_ball.tar
    a_text_file.txt
    nest1/
        file_in_nest
        nest1a/
Problem: Normally all should go fine, as all files in oldfiles should exist in newfiles, but in the above example one of the 'file_in_nest' files is missing in 'newfiles/'.
I wish to print an error message telling me which file is missing, but with the code structure below the current instance of my 'compare' function doesn't know about any directory other than the closest one. I wonder if there is built-in error handling that can send information about files and directories up the recursion ladder, adding info to it as we go. If I just printed the filename of the missing file, I would not know which one it is, as there are two 'file_in_nest' files in 'oldfiles'.
def compare(file_tree):
    for counter, entry in enumerate(file_tree[0][1:]):
        if not entry in file_tree[1]:
            # raise "some" error and send information about file back to the
            # function calling this compare, might be another compare.
            pass
        elif not isinstance(entry, basestring):
            os.chdir(entry[0])
            compare(entry)
            os.chdir('..')
        else:
            # perform comparison (not relevant to the problem)
            pass
        # detect if "some" error has been raised
        # prepend current directory found in entry[0] to file information
        break
def main():
    file_tree = [['/oldfiles', 'a_tar_ball.tar', 'a_text_file.txt', \
                  ['/nest1', 'file_in_nest', ['/nest1a', 'file_in_nest']], \
                  'yet_another_file'], \
                 ['/newfiles', 'a_tar_ball.tar', 'a_text_file.txt', \
                  ['/nest1', 'file_in_nest', ['/nest1a']], \
                  'yet_another_file']]
    compare(file_tree)
    # detect if "some" error has been raised and print error message
This is my first activity on Stack Overflow other than reading, so please tell me if I should improve the question!
// Stefan
Well, it depends whether you want to report an error as an exception or as some form of status.
Let's say you want to go the 'exception' way and have the whole program crash if one file is missing, you can define your own exception saving the state from the callee to the caller:
class PathException(Exception):
    def __init__(self, path):
        self.path = path
        Exception.__init__(self)
def compare(filetree):
    old, new = filetree
    for counter, entry in enumerate(old[1:]):
        if entry not in new:
            raise PathException(entry)
        elif not isinstance(entry, basestring):
            os.chdir(entry[0])
            try:
                compare(entry)
                os.chdir("..")
            except PathException as e:
                os.chdir("..")
                # entry is a nested list; its first element is the directory name
                raise PathException(os.path.join(entry[0], e.path))
        else:
            ...
Here you wrap the recursive call in a try block and re-raise any incoming exception with the caller's information prepended.
To see it on a smaller example, let's try to deep-compare two lists, and raise an exception if they are not equal:
class MyException(Exception):
    def __init__(self, path):
        self.path = path
        Exception.__init__(self)
def assertEq(X, Y):
    if hasattr(X, '__iter__') and hasattr(Y, '__iter__'):
        for i, (x, y) in enumerate(zip(X, Y)):
            try:
                assertEq(x, y)
            except MyException as e:
                raise MyException([i] + e.path)
    elif X != Y:
        raise MyException([])  # Empty path -> Base case
This gives us:
>>> L1 = [[[1,2,3],[4,5],[[6,7,8],[7,9]]],[3,5,[7,8]]]
>>> assertEq(L1, L1)
Nothing happens (the lists are equal), and:
>>> L1 = [[[1,2,3],[4,5],[[6,7,8],[7,9]]],[3,5,[7,8]]]
>>> L2 = [[[1,2,3],[4,5],[[6,7,8],[7,5]]],[3,5,[7,8]]] # Note the [7,9] -> [7,5]
>>> try:
...     assertEq(L1, L2)
... except MyException as e:
...     print "Diff at",e.path
Diff at [0, 2, 1, 1]
>>> print L1[0][2][1][1], L2[0][2][1][1]
9 5
Which gives the full path.
As recursive lists or paths are basically the same thing, it is easy to adapt it to your use case.
Another simple way of solving this would be to report this difference in files as a simple diff, similar to the others: you can return it as a difference between the old file and the (non-existent) new file, or return both the list of differences in files and the list of differences of files, in which case it is easy to update recursively the values as they are returned by the recursive calls.
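The status-based alternative can be sketched as follows, written for a current Python 3 interpreter rather than the 2.3 of the question. Instead of raising, each call returns a list of missing paths and the caller extends its own list with the result. The function name find_missing and the convention of passing the old and new trees as two separate arguments are assumptions made for this illustration. The tree layout (directory name first, then file names or nested lists) follows the question.

```python
def find_missing(old, new, prefix=""):
    """Return paths present in the `old` tree but absent from `new`.

    Each tree is a list [dirname, entry, entry, ...] where an entry is
    either a file name (str) or a nested tree of the same shape.
    """
    here = prefix + old[0]
    new_files = {e for e in new[1:] if isinstance(e, str)}
    new_dirs = {e[0]: e for e in new[1:] if not isinstance(e, str)}

    missing = []
    for entry in old[1:]:
        if isinstance(entry, str):
            if entry not in new_files:
                missing.append(here + "/" + entry)
        elif entry[0] not in new_dirs:
            # The whole sub-directory is absent from the new tree.
            missing.append(here + entry[0] + "/")
        else:
            # Recurse, passing the path walked so far down the call chain.
            missing.extend(find_missing(entry, new_dirs[entry[0]], here))
    return missing


old_tree = ['/oldfiles', 'a_tar_ball.tar', 'a_text_file.txt',
            ['/nest1', 'file_in_nest', ['/nest1a', 'file_in_nest']],
            'yet_another_file']
new_tree = ['/newfiles', 'a_tar_ball.tar', 'a_text_file.txt',
            ['/nest1', 'file_in_nest', ['/nest1a']],
            'yet_another_file']

print(find_missing(old_tree, new_tree))
# ['/oldfiles/nest1/nest1a/file_in_nest']
```

Passing the accumulated prefix downwards avoids the need to rebuild the path while unwinding, and all missing files are reported in one run rather than stopping at the first.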

QFileDialog returns selected file with wrong separators

I noticed that a QFileDialog instance returns absolute paths from the member function selectedFiles() that have the wrong separator for the given operating system. This is not what I expect from a cross-platform language (Python).
What should I do to correct this so that the rest of my properly OS-independent Python code using 'os.sep' stays correct? I don't want to have to remember where I can and can't use it.
You use the os.path.abspath function:
>>> import os
>>> os.path.abspath('C:/foo/bar')
'C:\\foo\\bar'
The answer came from another thread, which mentioned that I need to use QDir.toNativeSeparators(),
so I did the following in my loop (which should probably be done in PyQt itself for us):
def get_files_to_add(some_directory):
    addq = QFileDialog()
    addq.setFileMode(QFileDialog.ExistingFiles)
    addq.setDirectory(some_directory)
    addq.setFilter(QDir.Files)
    addq.setAcceptMode(QFileDialog.AcceptOpen)
    new_files = list()
    if addq.exec_() == QDialog.Accepted:
        for horrible_name in addq.selectedFiles():
            ### CONVERSION HERE ###
            temp = str(QDir.toNativeSeparators(horrible_name))
            ###
            # temp is now as the os module expects it to be
            # let's strip off the path and the extension
            no_path = temp.rsplit(os.sep,1)[1]
            no_ext = no_path.split(".")[0]
            #... do some magic with the file name that has had path stripped and extension stripped
            new_files.append(no_ext)
    else:
        #not loading anything
        pass
    return new_files
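If the goal is only the bare file name, a sketch that sidesteps os.sep altogether is to let os.path do the splitting. The helper name stem_of is hypothetical. Note one difference from the split(".")[0] approach above: splitext removes only the last extension, so "archive.tar.gz" becomes "archive.tar" rather than "archive".

```python
import os

def stem_of(path):
    """Return the file name without its directory and last extension."""
    # normpath converts "/" to the native separator on Windows; elsewhere
    # forward slashes are already native and are left as they are.
    native = os.path.normpath(path)
    return os.path.splitext(os.path.basename(native))[0]

print(stem_of("C:/foo/bar.txt"))  # bar
```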

Alternatives to imp.find_module?

Background
I've grown tired of pylint not being able to import files when you use namespace packages and divide your code base into separate folders. As such I started digging into the astNG source code, which has been identified as the source of the trouble (see bug report 8796 on astng). At the heart of the issue seems to be the use of Python's own imp.find_module in the process of finding imports.
What happens is that the import's first (sub)package - a in import a.b.c - is fed to find_module with a None path. Whatever path comes back is then fed into find_module the next pass in the look up loop where you try to find b in the previous example.
Pseudo-code from logilab.common.modutils:
path = None
while import_as_list:
    try:
        _, found_path, etc = find_module(import_as_list[0], path)
    #exception handling and checking for a better version in the .egg files
    path = [found_path]
    import_as_list.pop(0)
The Problem
This is what's broken: you only get the first best hit from find_module, which may or may not have your subpackages in it. If you DON'T find the subpackages, you have no way to back out and try the next one.
I tried explicitly using sys.path instead of None, so that the result could be removed from the path list and a second attempt be made, but python's module finder is clever enough that there doesn't have to be an exact match in the paths, making this approach unusable - to the best of my knowledge anyway.
Teary-eyed Plea
Is there an alternative to find_module which will return ALL possible matches or take an exclude list? I'm also open to completely different solutions. Preferably not patching Python by hand, but it wouldn't be impossible - at least for a local solution.
(Caveat emptor: I'm running python 2.6 and for reasons of current company policy can't upgrade, suggestions for p3k etc won't get marked as accepted unless it's the only answer.)
Since Python 2.5, the right way to do this is with pkgutil.iter_modules() (for a flat list) or pkgutil.walk_packages() (for a subpackage tree). Both are fully compatible with namespace packages.
For example, if I wanted to find just the subpackages/submodules of 'jmb', I would do:
import jmb, pkgutil
for (module_loader, name, ispkg) in pkgutil.iter_modules(jmb.__path__, 'jmb.'):
    # 'name' will be 'jmb.foo', 'jmb.bar', etc.
    # 'ispkg' will be true if 'jmb.foo' is a package, false if it's a module
    pass
You can also use iter_modules or walk_packages to walk all the modules on sys.path; see the pkgutil documentation for details.
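Since 'jmb' is the answerer's own package, here is a runnable sketch of the same two calls against standard-library packages (json and email are chosen only because they are always available). It is written for Python 3. Note that walk_packages imports each subpackage it descends into.

```python
import email
import json
import pkgutil

# Flat list: direct submodules of the json package.
flat = [name for _, name, _ in pkgutil.iter_modules(json.__path__, "json.")]

# Recursive walk: submodules of email, including those inside subpackages
# such as email.mime.
tree = [name for _, name, _ in pkgutil.walk_packages(email.__path__, "email.")]

print(sorted(flat))
print("email.mime.text" in tree)
```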
I've grown tired of this limitation in PyLint too.
I don't know a replacement for imp.find_module(), but I think I found another way to deal with namespace packages in PyLint. See my comment on the bug report you linked to (http://www.logilab.org/ticket/8796).
The idea is to use pkg_resources to find namespace packages. Here's my addition to logilab.common.modutils._module_file(), just after while modpath:
while modpath:
    if modpath[0] in pkg_resources._namespace_packages and len(modpath) > 1:
        module = sys.modules[modpath.pop(0)]
        path = module.__path__
This is not very refined and only handles top-level namespace packages though.
warning + disclaimer: not tested yet!
before:
for part in parts:
    modpath.append(part)
    curname = '.'.join(modpath)
    # ...
    if module is None:
        mp_file, mp_filename, mp_desc = imp.find_module(part, path)
        module = imp.load_module(curname, mp_file, mp_filename, mp_desc)
after: - thanks pjeby for mentioning pkgutil!
for part in parts:
    modpath.append(part)
    curname = '.'.join(modpath)
    # ...
    if module is None:
        # + https://stackoverflow.com/a/14820895/611007
        # # mp_file, mp_filename, mp_desc = imp.find_module(part, path)
        # # module = imp.load_module(curname, mp_file, mp_filename, mp_desc)
        import pkgutil
        mp_file = None
        for loadr,name,ispkg in pkgutil.iter_modules(path=path,prefix='.'.join(modpath[:-1])+'.'):
            if name.split('.')[-1] == part:
                if not hasattr(loadr,'path') and hasattr(loadr,'archive'):
                    # with zips `name` was like '.somemodule'
                    # it gives `RuntimeWarning: Parent module '' not found while handling absolute import`
                    # I expect the name I need to be 'somemodule'
                    # TODO: I don't know why python does this or what the correct usage is.
                    # https://stackoverflow.com/questions/2267984/
                    if name and name[0] == '.':
                        name = name[1:]
                    ldr= loadr.find_module(name,loadr.archive)
                    module = ldr.load_module(name)
                    break
                imploader= loadr.find_module(name,loadr.path)
                mp_file,mp_filename,mp_desc= imploader.file,imploader.filename,imploader.etc
                module = imploader.load_module(imploader.fullname)
                break
        if module is None:
            raise ImportError
