python import with absolute path - python

I have the following folder structure and want to find a good way to import python modules.
project1/test/benchmark/benchmark_project1.py
#in benchmark_project1.py
from project1.test.benchmark import *
My question is how to get rid of project1, since it might be renamed to "project2" or something else. I want to use import with absolute path, but don't know a good way to achieve that.

You can use os.chdir() to change your directory before the import statement and change it back afterward. Note that this only affects imports when the current directory is on sys.path, which is typically the case in an interactive session but not when running a script. You can use os.listdir() to get the list of all files in the directory and then index into it. Looping over the list imports every module in the folder; picking an index according to some pattern gives you a specific one. The glob module lets you select files using shell-style wildcards (not regular expressions).
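import importlib
import os
import sys
cwd = os.getcwd()
new_dir = 'project1/test/benchmark/'
os.chdir(new_dir)
sys.path.insert(0, os.getcwd())  # make sure the new working directory is searched
for name in os.listdir('.'):  # import all of them (or filter them in some way)
    if not name.endswith('.py'):
        continue  # skip anything that is not a Python source file
    module = importlib.import_module(name[:-3])  # strip the '.py' extension
    # Rough equivalent of 'from module import *': copy the public names
    globals().update({k: v for k, v in vars(module).items() if not k.startswith('_')})
os.chdir(cwd)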
Alternatively, you can add the location to your path instead of changing directories. Take a look at this question for some additional resources.
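As a rough sketch of a third option, the standard library's importlib.util can load a module straight from an absolute file path, without changing directories or touching sys.path. The helper name load_module_from_path and its arguments are made up for illustration; they do not come from the question.

```python
import importlib.util


def load_module_from_path(module_name, file_path):
    """Load a module from an explicit file path (hypothetical helper)."""
    spec = importlib.util.spec_from_file_location(module_name, file_path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load {file_path!r}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the file's top-level code
    return module
```

Because the path is passed in explicitly, renaming the top-level project1 folder only requires changing how that path is built, not the import statements.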

Related

Convert a relative path (mp3) from a master file path (playlist) using python pathlib

I have three files
My python file running in an unimportant different folder: C:\DD\CC\BB\AA\code.py
A playlist file "C:\ZZ\XX\Playlist.pls" which points to ..\..\mp3\song.mp3
The C:\mp3\song.mp3 file.
What I want is to get the location of the mp3 as an absolute path. But in every attempt I get a path relative to wherever the code.py file is.
import pathlib
plMaster = pathlib.Path(r"C:\ZZ\XX\Playlist.pls")
plSlave = pathlib.Path(r"..\..\mp3\song.mp3")
I have tried plSlave.absolute() and it gives me "C:\DD\CC\BB\AA\..\..\mp3\song.mp3"
Using relative_to doesn't work. I feel like I am doing such an easy task but I must be missing something because I can't find any function that lets me set the reference to compute the relative path.
Note: I already have parsed the pls file, and have the string r"..\..\mp3\song.mp3" extracted. I just need to get the path "C:\mp3\song.mp3" knowing that they are relative to the pls. (Not relative to the code.py)
If you're using a Windows version of Python, this is fairly easy. You can join the directory of plMaster (plMaster.parent) with the relative path of plSlave, then resolve the path using resolve(). You can use strict=False to force the resolve even if the path components aren't found.
This worked for me:
>>> plMaster = pathlib.Path(r"C:\ZZ\XX\Playlist.pls")
>>> plSlave = pathlib.Path(r"..\..\mp3\song.mp3")
>>> plMaster.parent.joinpath(plSlave).resolve(strict=False)
WindowsPath('C:/mp3/song.mp3')
If you're on a Unix version of Python, using Windows paths, I couldn't get this to work no matter what I tried, even using pathlib.PureWindowsPath().
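One possible workaround on a non-Windows Python, offered as a sketch rather than a tested recommendation for every case: the ntpath module implements Windows path rules and can be imported on any platform. The function name resolve_windows_relative is invented for this example.

```python
import ntpath  # Windows path rules, importable on any platform


def resolve_windows_relative(master_file, relative):
    """Resolve `relative` against the folder containing `master_file`."""
    base_dir = ntpath.dirname(master_file)
    # normpath collapses the '..' segments purely textually (no disk access)
    return ntpath.normpath(ntpath.join(base_dir, relative))


print(resolve_windows_relative(r"C:\ZZ\XX\Playlist.pls", r"..\..\mp3\song.mp3"))
# C:\mp3\song.mp3
```

Since nothing is looked up on disk, symlinks are not followed and the result is only a string manipulation.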
Might well be a better method here, but you can use pathlib.Path.parents and pathlib.Path.parts to extract some useful info here and get where you are going
import pathlib
new_relative_path = r"..\..\mp3\song.mp3"  # however you got this from reading your .pls file
pls_path = pathlib.Path(r'C:\ZZ\XX\Playlist.pls')
relative_save = pathlib.Path(new_relative_path)
n = relative_save.parts.count('..')  # number of levels to climb
new_path = pls_path.parents[n].joinpath(*relative_save.parts[n:])
The key thing here is that you navigate up from the playlist's directory n times. Since pls_path.parents[0] is already that directory (C:\ZZ\XX), climbing n levels from it is pls_path.parents[n]. You then append whatever remains of the relative path after stripping the leading '..' segments. This assumes every '..' appears at the start of the relative path.
Whilst I was waiting for other answers I managed to figure it out by ditching pathlib and using os instead.
import os
plMaster = r"C:\ZZ\XX\Playlist.pls"
plSlave = r"..\..\mp3\song.mp3"
os.chdir(os.path.dirname(plMaster))
os.path.abspath(plSlave)

list of jpeg files in nested subdirectories

I use the following python code to get list of jpg files in nested subdirectories which are in parent directory.
import glob2,os
all_header_files = glob2.glob(os.path.join('Path/to/parent/directory','/**/*.jpg'))
However, I get nothing but when I cd into the parent directory and I use the following python code then I get the list of jpeg files.
import glob2
all_header_files = glob2.glob('./**/*.jpg')
How can I get the result with the absolute path (the first version)?
You have an extra slash.
The os.path.join will insert the filepath separators for you, so you should think of it as this to get the correct directory
join('Path/to/parent directory' , '**/*.jpg')
Even more accurately,
parent = os.path.join('Path', 'to', 'parent directory')
os.path.join(parent, '**/*.jpg')
If you are trying to use your Home directory, see os.path.expanduser
In [10]: import os, glob
In [11]: glob.glob(os.path.join('~', 'Downloads', "**/*.sh"))
Out[11]: []
In [12]: glob.glob(os.path.expanduser(os.path.join('~', 'Downloads', "**/*.sh")))
Out[12]:
['/Users/name/Downloads/dir/script.sh']
You should not join with a leading slash on the second component, as you'll end up at the root. You can debug by printing out the resulting path before passing it to glob.
Try to change your code like this (note the dot):
import glob2,os
all_header_files = glob2.glob(os.path.join('Path/to/parent directory','./**/*.jpg'))
os.path.join() joins paths in an intelligent way.
os.path.join('Path/to/anything', '/**/*.jpg')
resolves to '/**/*.jpg', since '/**/*.jpg' starts with a slash and is therefore treated as an absolute path, which discards everything before it.
Change the '/**/*.jpg' to '**/*.jpg' and it should work.
In cases like this, I recommend to always try out the result of a certain function within the python command line. At least, this is how I found out the issue here.
The problem with the code you have posted lies in the use of os.path.join.
In the documentation it says for os.path.join(path, *paths):
If a component is an absolute path, all previous components are thrown away and joining continues from the absolute path component.
In your case, the component /**/*.jpg is an absolute path, as it starts with a /. Consequently, your initial input Path/to/parent/directory is discarded by the call to join. (https://docs.python.org/3.5/library/os.path.html#os.path.join)
I tested the join locally with python3, and os.path.join(any_path, "/**/*.pdf") does return the string '/**/*.pdf'.
The fix for this error is:
import glob2,os
all_header_files = glob2.glob(os.path.join('Path/to/parent directory','**/*.jpg'))
This returns the path 'Path/to/parent directory/**/*.jpg'
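For comparison, here is a minimal sketch using the standard library's glob instead of glob2, assuming Python 3.5 or later where glob.glob accepts recursive=True. The function name find_jpgs is made up for illustration.

```python
import glob
import os


def find_jpgs(parent):
    """Return all .jpg files under `parent`, at any depth."""
    # No leading slash on the pattern parts, so join keeps `parent`.
    pattern = os.path.join(parent, "**", "*.jpg")
    # With recursive=True, '**' matches zero or more directory levels.
    return sorted(glob.glob(pattern, recursive=True))
```

Passing an absolute `parent` yields absolute paths in the result; a relative `parent` yields relative ones.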

How can I recursively find the *directories* containing a file of a certain type?

I have a set of .bam files scattered inside a tree of folders. Not every directory contains such a file. I know how to recursively get the path of the files themselves using glob, but not the directory containing them.
import glob2
bam_files = glob2.glob('/data2/**/*.bam')
print(bam_files)
The above code gives the .bam files, but I want just the folders. Wondering if there is a direct way to do this using glob without regular expressions.
Use a set and os.path.dirname() [https://docs.python.org/2/library/os.path.html#os.path.dirname]:
import glob2
import os
bam_dirs = {os.path.dirname(p) for p in glob2.glob('/data2/**/*.bam')}
print(bam_dirs)
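The same idea can be sketched with pathlib from the standard library, assuming Python 3 and no glob2 dependency. The function name dirs_containing and its default pattern are illustrative only.

```python
from pathlib import Path


def dirs_containing(root, pattern="*.bam"):
    """Return the sorted, de-duplicated folders under `root` holding a match."""
    # rglob walks the tree recursively; the set removes repeated parents.
    return sorted({p.parent for p in Path(root).rglob(pattern)})
```

As with the set comprehension above, a folder holding several matching files appears only once in the result.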

How to set current working directory in python in an automatic way

How can I set the current path of my python file "myproject.py" to the file itself?
I do not want something like this:
path = "the path of myproject.py"
In mathematica I can set:
SetDirectory[NotebookDirectory[]]
The advantage of the code in Mathematica is that if I change the path of my Mathematica file, for example if I give it to someone else or put it in another folder, I do not need to do anything extra. Each time, Mathematica automatically sets the directory to the current folder.
I want something similar to this in Python.
The right solution is not to change the current working directory, but to get the full path to the directory containing your script or module then use os.path.join to build your files path:
import os
ROOT_PATH = os.path.dirname(os.path.abspath(__file__))
# then:
myfile_path = os.path.join(ROOT_PATH, "myfile.txt")
This is safer than messing with the current working directory (hint: what would happen if another module changed the current working directory after you did, but before you accessed your files?)
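A pathlib flavoured sketch of the same approach, assuming Python 3. The helper name path_next_to is invented here; in a real script you would pass __file__ as the first argument.

```python
from pathlib import Path


def path_next_to(script_file, name):
    """Return the absolute path of `name` in the same folder as `script_file`."""
    # resolve() makes the path absolute and follows symlinks
    return Path(script_file).resolve().parent / name
```

For example, path_next_to(__file__, "myfile.txt") plays the role of os.path.join(ROOT_PATH, "myfile.txt") above.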
I want to set the directory in which the python file is, as working directory
There are two steps:
Find out path to the python file
Set its parent directory as the working directory
The 2nd is simple:
import os
os.chdir(module_dir) # set working directory
The 1st might be complex if you want to support a general case (python file that is run as a script directly, python file that is imported in another module, python file that is symlinked, etc). Here's one possible solution:
import inspect
import os
module_path = inspect.getfile(inspect.currentframe())
module_dir = os.path.realpath(os.path.dirname(module_path))
Use the os.getcwd() function from the built-in os module. In Python 2 there is also os.getcwdu(), which returns a unicode object of the current working directory.
Example usage:
import os
path = os.getcwd()
print(path)
#C:\Users\KDawG\Desktop\Python

Python - How to specify a relative path by jumping a subdirectory?

I'm in one location i.e. 'c:/program files/java' and I want to jump two levels down without having to specify the subfolders i.e. I want to move to 'c:/program files/java/7.0/jre/bin' without specifying '/7.0/'.
A snippet I'm using is:
import os
os.chdir('c://program files//java')
os.getcwd()
'c:/program files/java'
Now I want to use os.chdir() to move to '/7.0/jre' so that os.getcwd() is 'c:/program files/java/7.0/jre',
without having to specify '7.0', i.e. without writing os.chdir('.\7.0\jre').
Does anyone have any suggestions?
You can use glob.glob:
import glob
import os
os.chdir('c:/program files/java')
os.chdir(glob.glob('*/jre')[0])
Above code will change working directory to c:/program files/java/*/jre.
If there are multiple java directories and you want to go to a specific one (for example, the newest version directory), you need to process the list returned by glob.glob() rather than just taking the first element.
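One way to pick the newest version could look like the sketch below. It assumes the version folders have numeric names such as 7.0 or 10.1 and compares them as tuples of integers; the function name newest_match and that naming scheme are assumptions, not something the question states.

```python
import glob
import os
import re


def newest_match(base_dir, pattern="*/jre"):
    """Return the match whose version-like folder name sorts highest, or None."""
    matches = glob.glob(os.path.join(base_dir, pattern))

    def version_key(path):
        # First folder below base_dir, e.g. "7.0" becomes (7, 0)
        first = os.path.relpath(path, base_dir).split(os.sep)[0]
        return tuple(int(n) for n in re.findall(r"\d+", first))

    return max(matches, key=version_key) if matches else None
```

The result can then be passed to os.chdir() in place of glob.glob('*/jre')[0]. Comparing integer tuples avoids the string-sorting trap where "10.1" would sort before "7.0".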
