get absolute path from one absolute and one relative - python

I want to get absolute path from absolute path and relative:
absolute1 = '/a/b/c/d.js'
relative = '../../e.js'
absolute2 = getAbsoluteFromAbsoluteAndRelative(absolute1, relative)
In this example absolute2 should be equal 'a/e.js'
How to write getAbsoluteFromAbsoluteAndRelative method?
Update:
I found os.path.abspath but it takes only one argument

Your absolute path still contains a filename, so remove that with os.path.dirname() to obtain just the directory.
Then join the two and apply os.path.normpath() to the result:
os.path.normpath(os.path.join(os.path.dirname(absolute1), relative))
normpath normalizes a path with relative references in it; A/foo/../B becomes A/B, for example.
Demo:
>>> import os.path
>>> absolute1 = '/a/b/c/d.js'
>>> relative = '../../e.js'
>>> os.path.normpath(os.path.join(os.path.dirname(absolute1), relative))
'/a/e.js'

Try absolute2 = os.path.join(os.path.dirname(absolute1), relative)
Edit: Martijn beat me to it. Wrapping this in os.path.normpath is the way to go.

Related

How do I get the parent directory's name only, not full path?

I am trying to get the parent directory's name only. Meaning, only its last component, not the full path.
So for example for the path a/b/c/d/e I want to get d, and not a/b/c/d.
My current code:
import os
path = "C:/example/folder/file1.jpg"
directoryName = os.path.dirname(os.path.normpath(path))
print(directoryName)
This prints out C:/example/folder and I want to get just folder.
The simplest way to do this would be using pathlib. Using parent will get you the parent's full path, and name will give you just the last component:
>>> from pathlib import Path
>>> path = Path("/a/b/c/d/e")
>>> path.parent.name
'd'
For comparison, to do the same with os.path, you will need to get the basename of the dirname of your path. So that translates directly to:
import os
path = "C:/example/folder/file1.jpg"
print(os.path.basename(os.path.dirname(path)))
Which is the nicer version of:
os.path.split(os.path.split(path)[0])[1]
Where both give:
'folder'
As you can see, the pathlib approach is much clearer and readable. Because pathlib incorporates the OOP approach for representing paths, instead of strings, we get a clear chain of attributes/method calls.
path.parent.name
Is read in order as:
start from path -> take its parent -> take its name
Whereas in the os functions-accepting-strings approach you actually need to read from inside-out!
os.path.basename(os.path.dirname(path))
Is read in order as:
The name of the parent of the path
Which I'm sure you'll agree is much harder to read and understand (and this is just a simple-case example).
You could also use the str.split method together with os.sep:
>>> path = "C:\\example\\folder\\file1.jpg"
>>> path.split(os.sep)[-2]
'folder'
But as the docs state:
Note that knowing this [(the separator)] is not sufficient to be able to parse or
concatenate pathnames — use os.path.split() and os.path.join() — but
it is occasionally useful.
Use pathlib.Path to get the .name of the .parent:
from pathlib import Path
p = Path("C:/example/folder/file1.jpg")
print(p.parent.name) # folder
Compared to os.path, pathlib represents paths as a separate type instead of strings. It generally is shorter and more convenient to use.
this works
path = "C:/example/folder/file1.jpg"
directoryName = os.path.dirname(path)
parent = directoryName.split("/")
parent.reverse()
print(parent[0])
Simple to solve using pathlib
0. Import Path from pathlib
from pathlib import Path
path = "C:/example/folder/file1.jpg"
1. Get parent level 1
parent_lv1 = Path(path).parent
2. Get parent level 2
parent_lv2 = parent_lv1.parent
3. Get immediate parent
imm_parent = parent_lv1.relative_to(parent_lv2)
print(imm_parent)
I prefer regex
import re
def get_parent(path: str) -> str:
match = re.search(r".*[\\|/](\w+)[\\|/].*", path)
if match:
return match.group(1)
else:
return ""
if __name__ == '__main__':
my_path = "/home/tony/some/cool/path"
print(get_parent(my_path))
win_path = r"C:\windows\path\has\dumb\backslashes"
print(get_parent(win_path))
Output
cool
dumb

How to resolve relative paths in python?

I have Directory structure like this
projectfolder/fold1/fold2/fold3/script.py
now I'm giving script.py a path as commandline argument of a file which is there in
fold1/fold_temp/myfile.txt
So basically I want to be able to give path in this way
../../fold_temp/myfile.txt
>>python somepath/pythonfile.py -input ../../fold_temp/myfile.txt
Here problem is that I might be given full path or relative path so I should be able to decide and based on that I should be able to create absolute path.
I already have knowledge of functions related to path.
Question 1
Question 2
Reference questions are giving partial answer but I don't know how to build full path using the functions provided in them.
try os.path.abspath, it should do what you want ;)
Basically it converts any given path to an absolute path you can work with, so you do not need to distinguish between relative and absolute paths, just normalize any of them with this function.
Example:
from os.path import abspath
filename = abspath('../../fold_temp/myfile.txt')
print(filename)
It will output the absolute path to your file.
EDIT:
If you are using Python 3.4 or newer you may also use the resolve() method of pathlib.Path. Be aware that this will return a Path object and not a string. If you need a string you can still use str() to convert it to a string.
Example:
from pathlib import Path
filename = Path('../../fold_temp/myfile.txt').resolve()
print(filename)
A practical example:
sys.argv[0] gives you the name of the current script
os.path.dirname() gives you the relative directory name
thus, the next line, gives you the absolute working directory of the current executing file.
cwd = os.path.abspath(os.path.dirname(sys.argv[0]))
Personally, I always use this instead of os.getcwd() since it gives me the script absolute path, independently of the directory from where the script was called.
For Python3, you can use pathlib's resolve functionality to resolve symlinks and .. components.
You need to have a Path object however it is very simple to do convert between str and Path.
I recommend for anyone using Python3 to drop os.path and its messy long function names and stick to pathlib Path objects.
import os
dir = os.path.dirname(__file__)
path = raw_input()
if os.path.isabs(path):
print "input path is absolute"
else:
path = os.path.join(dir, path)
print "absolute path is %s" % path
Use os.path.isabs to judge if input path is absolute or relative, if it is relative, then use os.path.join to convert it to absolute

How/where to use os.path.sep?

os.path.sep is the character used by the operating system to separate pathname components.
But when os.path.sep is used in os.path.join(), why does it truncate the path?
Example:
Instead of 'home/python', os.path.join returns '/python':
>>> import os
>>> os.path.join('home', os.path.sep, 'python')
'/python'
I know that os.path.join() inserts the directory separator implicitly.
Where is os.path.sep useful? Why does it truncate the path?
Where os.path.sep is usefull?
I suspect that it exists mainly because a variable like this is required in the module anyway (to avoid hardcoding), and if it's there, it might as well be documented. Its documentation says that it is "occasionally useful".
Why it truncates the path?
From the docs for os.path.join():
If a component is an absolute path, all previous components are thrown away and joining continues from the absolute path component.
and / is an absolute path on *nix systems.
Drop os.path.sep from the os.path.join() call. os.path.join() uses os.path.sep internally.
On your system, os.path.sep == '/' that is interpreted as a root directory (absolute path) and therefore os.path.join('home', '/', 'python') is equivalent to os.path.join('/', 'python') == '/python'. From the docs:
If a component is an absolute path, all previous components are thrown
away and joining continues from the absolute path component.
As correctly given in the docstring of os.path.join -
Join two or more pathname components, inserting '/' as needed. If any component is an absolute path, all previous path components will be discarded.
Same is given in the docs as well -
os.path.join(path, *paths)
Join one or more path components intelligently. The return value is the concatenation of path and any members of *paths with exactly one directory separator (os.sep) following each non-empty part except the last, meaning that the result will only end in a separator if the last part is empty. If a component is an absolute path, all previous components are thrown away and joining continues from the absolute path component.
When you give os.path.sep alone, it is considered as an absolute path to the root directory - / .
Please note , this is for unix/linux based os.path , which internally is posixpath . Though the same behavior is seen in windows os.path.join() .
Example -
>>> import os.path
>>> os.path.join.__doc__
"Join two or more pathname components, inserting '/' as needed.\n If any component is an absolute path, all previous path components\n will be discarded."
Here's the snippet of code that is run if you are on a POSIX machine:
posixpath.py
# Join pathnames.
# Ignore the previous parts if a part is absolute.
# Insert a '/' unless the first part is empty or already ends in '/'.
def join(a, *p):
"""Join two or more pathname components, inserting '/' as needed.
If any component is an absolute path, all previous path components
will be discarded. An empty last part will result in a path that
ends with a separator."""
sep = _get_sep(a)
path = a
try:
if not p:
path[:0] + sep #23780: Ensure compatible data type even if p is null.
for b in p:
if b.startswith(sep):
path = b
elif not path or path.endswith(sep):
path += b
else:
path += sep + b
except (TypeError, AttributeError, BytesWarning):
genericpath._check_arg_types('join', a, *p)
raise
return path
Specifically, the lines:
if b.startswith(sep):
path = b
And, since os.path.sep definitely starts with this character, whenever we encounter it we throw out the portion of the variable path that has already been constructed and start over with the next element in p.
But when os.path.sep is used in os.path.join() , why it truncates the path?
Quoting directly from the documentation of os.path.join
If a component is an absolute path, all previous components are thrown away and joining continues from the absolute path component.
So when you do:
os.path.join('home', os.path.sep, 'python')
os.path.sep returns '/' which is an absolute path, and so 'home' is thrown away and you get only '/python' as the output.
This can is also clear from the example:
>>> import os
>>> os.path.join('home','/python','kivy')
'/python/kivy'
Where os.path.sep is usefull?
os.path.sep or os.sep returns the character used by the operating system to separate pathname components.
But again quoting from the docs:
Note that knowing this is not sufficient to be able to parse or concatenate pathnames — use os.path.split() and os.path.join() — but it is occasionally useful.

replace part of path - python

Is there a quick way to replace part of the path in python?
for example:
old_path='/abc/dfg/ghi/f.txt'
I don't know the beginning of the path (/abc/dfg/), so what I'd really like to tell python to keep everything that comes after /ghi/ (inclusive) and replace everything before /ghi/ with /jkl/mno/:
>>> new_path
'/jkl/mno/ghi/f.txt/'
If you're using Python 3.4+, or willing to install the backport, consider using pathlib instead of os.path:
path = pathlib.Path(old_path)
index = path.parts.index('ghi')
new_path = pathlib.Path('/jkl/mno').joinpath(*path.parts[index:])
If you just want to stick with the 2.7 or 3.3 stdlib, there's no direct way to do this, but you can get the equivalent of parts by looping over os.path.split. For example, keeping each path component until you find the first ghi, and then tacking on the new prefix, will replace everything before the last ghi (if you want to replace everything before the first ghi, it's not hard to change things):
path = old_path
new_path = ''
while True:
path, base = os.path.split(path)
new_path = os.path.join(base, new_path)
if base == 'ghi':
break
new_path = os.path.join('/jkl/mno', new_path)
This is a bit clumsy, so you might want to consider writing a simple function that gives you a list or tuple of the path components, so you can just use find, then join it all back together, as with the pathlib version.
>>> import os.path
>>> old_path='/abc/dfg/ghi/f.txt'
First grab the relative path from the starting directory of your choice using os.path.relpath
>>> rel = os.path.relpath(old_path, '/abc/dfg/')
>>> rel
'ghi\\f.txt'
Then add the new first part of the path to this relative path using os.path.join
>>> new_path = os.path.join('jkl\mno', rel)
>>> new_path
'jkl\\mno\\ghi\\f.txt'
You can use the index of ghi:
old_path.replace(old_path[:old_path.index("ghi")],"/jkl/mno/")
In [4]: old_path.replace(old_path[:old_path.index("ghi")],"/jkl/mno/" )
Out[4]: '/jkl/mno/ghi/f.txt'
A rather naive approach, but does the job:
Function:
def replace_path(path, frm, to):
pre, match, post = path.rpartition(frm)
return ''.join((to if match else pre, match, post))
Example:
>>> s = '/abc/dfg/ghi/f.txt'
>>> replace_path(s, '/ghi/', '/jkl/mno')
'/jkl/mno/ghi/f.txt'
>>> replace_path(s, '/whatever/', '/jkl/mno')
'/abc/dfg/ghi/f.txt'
The following is useful when you want to replace some known base directory in your path.
from pathlib import Path
old_path = Path('/abc/dfg/ghi/f.txt')
old_root = Path('/abc/dfg')
new_root = Path('/jkl/mno')
new_path = new_root / old_path.relative_to(old_root)
# Result: /jkl/mno/ghi/f.txt
I understand that the OP specifically mentioned that the path to the base directory is not known. However, since it is a common task to remove the path to the base directory, and the title of the question ("replace part of the path") is certainly bringing some folks with this subtype of problem here, I am posting it anyway.
I needed to replace an arbitrary number of an arbitrary strings in a path
e.g. replace 'package' with foo in
VERSION_FILE = Path(f'{Path.home()}', 'projects', 'package', 'package', '_version.py')
So I use this call
_replace_path_text(VERSION_FILE, 'package', 'foo)
def _replace_path_text(path, text, replacement):
parts = list(path.parts)
new_parts = [part.replace(text, replacement) for part in parts]
return Path(*new_parts)

Joining: string and absolute path with os.path

Why is this not working, what am I doing wrong?
>>> p1 = r'\foo\bar.txt'
>>> os.path.join('foo1', 'foo2', os.path.normpath(p1))
'\\foo\\bar.txt'
I expected this:
'foo1\\foo2\\foo\\bar.txt'
Edit:
A Solution
>>> p1 = r'\foo\bar.txt'
>>> p1 = p1.strip('\\') # Strip '\\' so the path would not be absolute
>>> os.path.join('foo1', 'foo2', os.path.normpath(p1))
'foo1\\foo2\\foo\\bar.txt'
When os.path.join encounters an absolute path, it throws away what it has accumulated to far. An absolute string is one that starts with a slash (ans on windows, with an optional drive letter). normpath won't touch that slash as it has the same notion of absolute paths. You have to strip that slash.
And if I may ask: where does it come from in the first place?
p1 is an absolute path (starts with \) - thus it is returned by itself, per the documentation:
join(a, *p)
Join two or more pathname components, inserting "\" as needed.
If any component is an absolute path, all previous path components
will be discarded.
If you want the target behaviour of os.path.join to join two absolute paths together, strip out the separator:
import os
p1 = os.path.join(os.sep, 'foo1', 'foo2')
p2 = os.path.join(os.sep, 'foo', 'bar.txt')
os.path.join(p1, p2.lstrip(os.sep))
If you want to modify the paths, you can also do cool things like this using list comprehensions:
# Make sure all folder names are lowercase:
os.path.join(p1, *[x.lower() for x in p2.split(os.sep)])

Categories