Python dedicated class for file names - python

I am writing a program that utilises a large number of files. Does Python feature an inbuilt class for file paths, or does one have to be implemented by the user (like below):
class FilePath:
def __init__(path):
shazam(path)
def shazam(self, path):
""" something happens, path is formatted, etc """
self.formatted_path = foobar
Why would it be useful?
Suppose the program and its data is copied to a different operating system. The class could modify itself on launch to support the a different path separator.
Why not just write it yourself?
Someone might have already written a class in the standard Python library.

Python has several cross platform modules for dealing with the file system, paths and operating system.
The os module specifically has an os.sep character.
os.path.join() is OS-aware and will use the correct separator when joining paths together.
Additionally, os.path.normpath() will take any path and convert the separators to whatever the native OS supports.

Since Python 3.4 there is pathlib which seems to be what you are looking for. Of course there are the functions from os.path, too - but for an object-oriented approach pathlib is fitting better.

Related

Proper way of getting path of main script from library in Python

I'm writing a Python library which which you can load an object from a file and do something with it. For convenience, I'd like to make it so that people can provide three kinds of paths:
A path starting with a "/", which will be interpreted as an absolute path
A path starting with a "~/", which will be interpreted as relative to the user's home directory (for which I plan to use os.path.expanduser)
A path starting with neither, which would be interpreted as relative to the directory of the top-level script that is importing my library.
My question is: what is the best way of approaching the third case? The two ways I have come across on Stack Overflow are both kind of klugey:
1)
import __main__
if hasattr(__main__, "__file__"):
script_dir = os.path.dirname(os.path.abspath(__main__.__file__))
import inspect
frame_info = inspect.stack()[-1]
mod = inspect.getmodule(frame_info[0])
script_dir = os.path.dirname(mod.__file__)
Obviously both of these will fail in the case that we're running from an interactive terminal or something, and in that case I would want to fall back to just treating it as an absolute path.
Anyway, I get the sense that using inspect like this in a library is frowned upon, but the other way seems klugey as well. Is there a better way to do this that I'm unaware of?
Use the built-in package pathlib's getcwd method. It defaults to the root folder of the top package.
Will not work if you change the working directory. Although, an unprefixed path is common practice to be relative to the working directory.
import pathlib
print(pathlib.Path.cwd())
>>> A:\Programming\Python\generalfile

How to check folder / file permissions with Pathlib

Is there a Pathlib equivalent of os.access()?
Without Pathlib the code would look like this:
import os
os.access('my_folder', os.R_OK) # check if script has read access to folder
However, in my code I'm dealing with Pathlib paths, so I would need to do this (this is just an example):
# Python 3.5+
from pathlib import Path
import os
# get path ~/home/github if on Linux
my_folder_pathlib = Path.home() / "github"
os.access(str(my_folder_pathlib), os.R_OK)
The casting to str() is kinda ugly.
I was wondering if there is a pure Pathlib solution for what I'm trying to achieve?
p.s. I'm aware of the principle "easier to ask for forgiveness", however this is part of a bigger framework, and I need to know as soon as possible if the script has the right permissions to a NAS stored folder.
From Python 3.6, os.access() accepts path-like objects, therefore no str() needed anymore:
https://docs.python.org/3/library/os.html#os.access
Although this is still not a pure Pathlib solution.
Use the stat() method on a Path object, then lookup the st_mode attribute.
Path().stat().st_mode

python: get directory two levels up

Ok...I dont know where module x is, but I know that I need to get the path to the directory two levels up.
So, is there a more elegant way to do:
import os
two_up = os.path.dirname(os.path.dirname(__file__))
Solutions for both Python 2 and 3 are welcome!
You can use pathlib. Unfortunately this is only available in the stdlib for Python 3.4. If you have an older version you'll have to install a copy from PyPI here. This should be easy to do using pip.
from pathlib import Path
p = Path(__file__).parents[1]
print(p)
# /absolute/path/to/two/levels/up
This uses the parents sequence which provides access to the parent directories and chooses the 2nd one up.
Note that p in this case will be some form of Path object, with their own methods. If you need the paths as string then you can call str on them.
Very easy:
Here is what you want:
import os.path as path
two_up = path.abspath(path.join(__file__ ,"../.."))
I was going to add this just to be silly, but also because it shows newcomers the potential usefulness of aliasing functions and/or imports.
Having written it, I think this code is more readable (i.e. lower time to grasp intention) than the other answers to date, and readability is (usually) king.
from os.path import dirname as up
two_up = up(up(__file__))
Note: you only want to do this kind of thing if your module is very small, or contextually cohesive.
The best solution (for python >= 3.4) when executing from any directory is:
from pathlib import Path
two_up = Path(__file__).resolve().parents[1]
For getting the directory 2 levels up:
import os.path as path
curr_dir=Path(os.path.dirname(os.path.abspath(__file__)))
two_dir_up_=os.fspath(Path(curr_dir.parent.parent).resolve())
I have done the following to go up two and drill down on other dir
default_config_dir=os.fspath(Path(curr_dir.parent.parent,
'data/config').resolve())
Personally, I find that using the os module is the easiest method as outlined below. If you are only going up one level, replace ('../..') with ('..').
import os
os.chdir('../..')
--Check:
os.getcwd()
More cross-platform implementation will be:
import pathlib
two_up = (pathlib.Path(__file__) / ".." / "..").resolve()
Using parent is not supported on Windows. Also need to add .resolve(), to:
Make the path absolute, resolving all symlinks on the way and also normalizing it (for example turning slashes into backslashes under Windows)
(pathlib.Path('../../') ).resolve()
I have found that the following works well in 2.7.x
import os
two_up = os.path.normpath(os.path.join(__file__,'../'))
You can use this as a generic solution:
import os
def getParentDir(path, level=1):
return os.path.normpath( os.path.join(path, *([".."] * level)) )
Assuming you want to access folder named xzy two folders up your python file. This works for me and platform independent.
".././xyz"
100% working answer:
os.path.abspath(os.path.join(os.getcwd() ,"../.."))
There is already an accepted answer, but for two levels up I think a chaining approach is arguably more readable:
pathlib.Path(__file__).parent.parent.resolve()
Surprisingly it seems no one has yet explored this nice one-liner option:
import os
two_up = os.path.normpath(__file__).rsplit(os.sep, maxsplit=2)[0]
rsplit is interesting since the maxsplit parameter directly represents how many parent folders to move up and it always returns a result in just one pass through the path.
With Pathlib (recommended after Python 3.5, the/a general solution that works not only in file.py files, but also in Jupyter (or other kind of) notebook and Python shell is:
p = Path.cwd().resolve().parents[1]
You only need to substitute (__file__) for cwd() (current working directory).
Indeed it would even work just with:
p = Path().resolve().parents[1]
(and of course with .parent.parent instead of parents[1])
I don't yet see a viable answer for 2.7 which doesn't require installing additional dependencies and also starts from the file's directory. It's not nice as a single-line solution, but there's nothing wrong with using the standard utilities.
import os
grandparent_dir = os.path.abspath( # Convert into absolute path string
os.path.join( # Current file's grandparent directory
os.path.join( # Current file's parent directory
os.path.dirname( # Current file's directory
os.path.abspath(__file__) # Current file path
),
os.pardir
),
os.pardir
)
)
print grandparent_dir
And to prove it works, here I start out in ~/Documents/notes just so that I show the current directory doesn't influence outcome. I put the file grandpa.py with that script in a folder called "scripts". It crawls up to the Documents dir and then to the user dir on a Mac.
(testing)AlanSE-OSX:notes AlanSE$ echo ~/Documents/scripts/grandpa.py
/Users/alancoding/Documents/scripts/grandpa.py
(testing)AlanSE-OSX:notes AlanSE$ python2.7 ~/Documents/scripts/grandpa.py
/Users/alancoding
This is the obvious extrapolation of the answer for the parent dir. Better to use a general solution than a less-good solution in fewer lines.

os function to both join directories and create them?

Currently if I want to specify and create a new directory, I'll do:
newPath = os.path.join(oldPath,"newfolder")
if(not os.path.exists(newPath)): os.makedirs(newPath)
I'm wondering if a pre-packaged os function (or in other package) exists to do this in one function? I know I can make my own but I'd rather a pre-packaged solution.
Try pylib's path functionality.
It's basically a really nice OOP (Object Oriented) abstraction around local paths (and svn paths).
Example:
from py.path import local
p = local("/some/path").join("/some/other/path").mkdir("/some/oth/path")
NB: The above example is contrived. Please refer to the documentation.

Is this correct way to import python scripts residing in arbitrary folders?

This snippet is from an earlier answer here on SO. It is about a year old (and the answer was not accepted). I am new to Python and I am finding the system path a real pain. I have a few functions written in scripts in different directories, and I would like to be able to import them into new projects without having to jump through hoops.
This is the snippet:
def import_path(fullpath):
""" Import a file with full path specification. Allows one to
import from anywhere, something __import__ does not do.
"""
path, filename = os.path.split(fullpath)
filename, ext = os.path.splitext(filename)
sys.path.append(path)
module = __import__(filename)
reload(module) # Might be out of date
del sys.path[-1]
return module
Its from here:
How to do relative imports in Python?
I would like some feedback as to whether I can use it or not - and if there are any undesirable side effects that may not be obvious to a newbie.
I intend to use it something like this:
import_path(/home/pydev/path1/script1.py)
script1.func1()
etc
Is it 'safe' to use the function in the way I intend to?
The "official" and fully safe approach is the imp module of the standard Python library.
Use imp.find_module to find the module on your precisely-specified list of acceptable directories -- it returns a 3-tuple (file, pathname, description) -- if unsuccessful, file is actually None (but it can also raise ImportError so you should use a try/except for that as well as checking if file is None:).
If the search is successful, call imp.load_module (in a try/finally to make sure you close the file!) with the above three arguments after the first one which must be the same name you passed to find_module -- it returns the module object (phew;-).
As mentioned, please consider thread safety, if appropriate. I prefer something closer to a solution posted in a similar post. The main differences below: the use of insert to specify priority of the import, correct restoration of sys.path using try...finally, and setting the global namespace.
# inspired by Alex Martelli's solution to
# http://stackoverflow.com/questions/1096216/override-namespace-in-python/1096247#1096247
def import_from_absolute_path(fullpath, global_name=None):
"""Dynamic script import using full path."""
import os
import sys
script_dir, filename = os.path.split(fullpath)
script, ext = os.path.splitext(filename)
sys.path.insert(0, script_dir)
try:
module = __import__(script)
if global_name is None:
global_name = script
globals()[global_name] = module
sys.modules[global_name] = module
finally:
del sys.path[0]
It does feel like a bit of a hack, but at the moment, I can't think of any unintended side effects that are likely to occur, at least not as long as you're just using this for your own scripts. Basically what it does is temporarily add the parent directory of the specified file (in your example, /home/pydev/path1/) to the list of paths that Python checks when it's looking for a module to import.
The only risk I can think of right now would arise in a multithreaded environment, where two or more threads (or processes) are running this function simultaneously. If thread A wants to import module A from path dirA/A.py, and thread B wants to import module B from path dirB/B.py, you'd wind up with both dirA and dirB in sys.path for a short time. And if there is a file named B.py in dirA, it's possible that thread B will find that (dirA/B.py) instead of the file it's looking for (dirB/B.py), thus importing the wrong module. For this reason, I wouldn't use it in production code, or code that you're going to distribute to other people (at least not without warning them that this hack is in here!). In a situation like that, you could write a more complex function that allows you to specify the file to import without messing with the standard set of paths. (That's what mod_python does, for example)
I would be worried that your script name might correspond with a module that shows up earlier in the path. To dispel this fear, I would fully replace the path with a new list containing just the directory containing the module, then put it back once the import has completed. Also, you should wrap this in some sort of lock so that multiple threads trying to do the same thing don't interfere with each other.

Categories