I need to join and split paths in my app, and I want the app to work on both Windows and Linux. Here is my code to join paths:
path = os.path.join(dir0,dir1,dir2,fn)
But when I split on slashes I run into problems, because
a path on Windows looks like:
dir0\dir1\dir2\fn
and a path on Linux looks like:
dir0/dir1/dir2/fn
How can I split the path with a single piece of platform-independent code (without changing the code for each platform)?
You can use os.sep:
import os
path_string.split(os.sep)
For more info, see the docs:
os.path.join(path1[, path2[, ...]])
Join one or more path components intelligently. If any component is an absolute path, all previous components (on Windows, including the previous drive letter, if there was one) are thrown away, and joining continues. The return value is the concatenation of path1, and optionally path2, etc., with exactly one directory separator (os.sep) following each non-empty part except the last. (This means that an empty last part will result in a path that ends with a separator.) Note that on Windows, since there is a current directory for each drive, os.path.join("c:", "foo") represents a path relative to the current directory on drive C: (c:foo), not c:\foo.
Use os.path.split. It is a system independent way to split paths. Note that this only splits into (head, tail). To get all the individual parts, you need to recursively split head or use str.split using os.path.sep as the separator.
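To get every component rather than just (head, tail), one option is pathlib, whose parts attribute already understands the platform's separator. This is only a minimal sketch; split_all is a hypothetical helper name, not something from the standard library or from the answers above.

```python
import os
from pathlib import PurePath, PurePosixPath, PureWindowsPath

def split_all(path):
    """Return all components of a path, using the current platform's rules."""
    return list(PurePath(path).parts)

# Round trip: whichever separator os.path.join used, the parts come back.
joined = os.path.join("dir0", "dir1", "dir2", "fn")
parts = split_all(joined)

# The Pure*Path classes parse the other platform's style explicitly,
# which helps when you cannot test on that platform.
windows_parts = PureWindowsPath(r"dir0\dir1\dir2\fn").parts
posix_parts = PurePosixPath("dir0/dir1/dir2/fn").parts
```

On any platform, parts should contain the four names that were joined, because PurePath picks the flavour matching the OS it runs on.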
I use the following Python code to get a list of jpg files in nested subdirectories of a parent directory.
import glob2,os
all_header_files = glob2.glob(os.path.join('Path/to/parent/directory','/**/*.jpg'))
However, I get nothing. But when I cd into the parent directory and use the following Python code, I get the list of jpeg files.
import glob2
all_header_files = glob2.glob('./**/*.jpg')
How can I get the result with the absolute path (the first version)?
You have an extra slash.
os.path.join inserts the path separators for you, so you should write it like this to get the correct directory:
join('Path/to/parent directory' , '**/*.jpg')
Even more accurately,
parent = os.path.join('Path', 'to', 'parent directory')
os.path.join(parent, '**/*.jpg')
If you are trying to use your Home directory, see os.path.expanduser
In [10]: import os, glob
In [11]: glob.glob(os.path.join('~', 'Downloads', "**/*.sh"))
Out[11]: []
In [12]: glob.glob(os.path.expanduser(os.path.join('~', 'Downloads', "**/*.sh")))
Out[12]:
['/Users/name/Downloads/dir/script.sh']
You should not join with a leading slash on the second component, as you'll end up at the root. You can debug by printing out the resulting path before passing it to glob.
Try to change your code like this (note the dot):
import glob2,os
all_header_files = glob2.glob(os.path.join('Path/to/parent directory','./**/*.jpg'))
os.path.join() joins paths in an intelligent way.
os.path.join('Path/to/anything','/**/*.jpg')
resolves to '/**/*.jpg', since '/**/*.jpg' starts with a slash and is therefore treated as an absolute path that replaces everything before it.
Change '/**/*.jpg' to '**/*.jpg' and it should work.
In cases like this, I recommend to always try out the result of a certain function within the python command line. At least, this is how I found out the issue here.
The problem with the code you have posted lies in the use of os.path.join.
In the documentation it says for os.path.join(path, *paths):
If a component is an absolute path, all previous components are thrown away and joining continues from the absolute path component.
In your case, the component /**/*.jpg is an absolute path, as it starts with a /. Consequently your initial input Path/to/parent directory is discarded by the call to the join function. (https://docs.python.org/3.5/library/os.path.html#os.path.join)
I have locally tested the joining part with python3 and for me it is the case, that using os.path.join(any_path, "/**/*.pdf") returns the string '/**/*.pdf'.
The fix for this error is:
import glob2,os
all_header_files = glob2.glob(os.path.join('Path/to/parent directory','**/*.jpg'))
This returns the path 'Path/to/parent directory/**/*.jpg'
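The behaviour described in these answers can be checked without touching the file system. The sketch below uses posixpath so the result is the same on every OS, and it swaps in the standard library's glob with recursive=True instead of glob2 (that substitution is my assumption, not what the question used). find_jpgs is a hypothetical helper name.

```python
import glob
import os
import posixpath

# A component starting with "/" is absolute, so join() drops what came before it.
dropped = posixpath.join("Path/to/parent directory", "/**/*.jpg")
# Without the leading slash the parent directory is kept.
kept = posixpath.join("Path/to/parent directory", "**/*.jpg")

def find_jpgs(parent):
    """Recursively list .jpg files below parent (hypothetical helper)."""
    pattern = os.path.join(parent, "**", "*.jpg")
    # recursive=True makes "**" match any number of nested directories.
    return sorted(glob.glob(pattern, recursive=True))
```

Printing dropped and kept before handing a pattern to glob is a quick way to spot this kind of mistake.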
I'm writing some Python code to generate a relative path. The situations I need to consider:
Same folder: I want "." or ".\"; both of them are OK for me.
Other folder: I want something like ".\xxx\" or "..\xxx\xxx\".
os.path.relpath() will generate the relative path, but without .\ at the beginning and \ at the end. We can add \ at the end by using os.path.join(dirname, ""). But I can't figure out how to add ".\" at the beginning without affecting the first case (same folder) and the "..\xxx\xxx\" case.
This builds a path relative to the directory of the current script:
import os
dir = os.path.dirname(__file__)
filename = os.path.join(dir,'Path')
The relpath() function produces the ".." syntax given the appropriate base to start from (the second parameter). For instance, suppose you are writing something like a script generator that produces code using relative paths. If the working directory is passed as the second parameter to relpath(), as below, and you want to reference another file in your project under a directory one level up and two deep, you'll get "../blah/blah". To prefix paths in the same folder, you can simply join with ".". That produces a path with the correct OS-specific separator.
>>> print(os.path.relpath("/foo/bar/blah/blah", "/foo/bar/baz"))
../blah/blah
>>> print(os.path.join('.', 'blah'))
./blah
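Putting the pieces from this thread together, one possible way to get the leading "." and the trailing separator in all three cases is sketched below. dotted_relpath and its pathmod parameter are hypothetical names introduced for illustration; pathmod defaults to os.path, and passing posixpath or ntpath only makes the result predictable on any OS.

```python
import os
import posixpath

def dotted_relpath(target, start, pathmod=os.path):
    """Relative path that always starts with '.' or '..' and ends with a separator."""
    rel = pathmod.relpath(target, start)
    already_dotted = (
        rel == pathmod.curdir
        or rel == pathmod.pardir
        or rel.startswith(pathmod.pardir + pathmod.sep)
    )
    if not already_dotted:
        # Plain "xxx" becomes "./xxx" (or ".\xxx" on Windows).
        rel = pathmod.join(pathmod.curdir, rel)
    # Joining with an empty last part appends exactly one trailing separator.
    return pathmod.join(rel, "")

same_folder = dotted_relpath("/foo/bar", "/foo/bar", posixpath)
child = dotted_relpath("/foo/bar/xxx", "/foo/bar", posixpath)
sibling = dotted_relpath("/foo/xxx/xxx", "/foo/bar", posixpath)
```

With the default pathmod the same function uses backslashes on Windows, so nothing in the calling code needs to change per platform.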
The following line, unless I'm mistaken, will grab the absolute path to your directory so you can access files
PATH = os.path.abspath(os.path.join(os.path.dirname(sys.argv[0])))
This is what I've typically been using to access files in my current directory when I need images etc. in the programs I've been writing.
Now, say I do the following since I'm using windows to access a specific image in the directory
image = PATH + "\\" + "some_image.gif"
This is where my question lies: this works on Windows, but if I remember correctly "\\" is Windows-only and will not work on other operating systems. I cannot test this myself, as I don't have access to other operating systems. As far as I can tell from where I've looked, this isn't mentioned in the documentation.
If this is indeed the case is there a way around this?
Yes, '\\' is just for Windows. You can use os.sep, which will be '\\' on Windows, ':' on classic Mac, '/' on almost everything else, or whatever is appropriate for the current platform.
You can usually get away with using '/'. Nobody's likely to be running your program on anything but Windows or Unix. And Windows will accept '/' pathnames in most cases. But there are many Windows command-line tools that will confuse your path for a flag if it starts with /, and some even if you have a / in the middle, and if you're using \\.\ paths, a / is treated like a regular character rather than a separator, and so on. So you're better off not doing that.
The simple thing to do is just use os.path.join:
image = os.path.join(PATH, "some_image.gif")
As a side note, in your first line, you're already using join—but you don't need it there:
PATH = os.path.abspath(os.path.join(os.path.dirname(sys.argv[0])))
It's perfectly legal to call join with only one argument like this, but also perfectly useless; you just join the one thing with nothing; you will get back exactly what you passed in. Just do this:
PATH = os.path.abspath(os.path.dirname(sys.argv[0]))
One last thing: If you're on Python 3.4+, you may want to consider using pathlib instead of os.path:
import sys
from pathlib import Path
PATH = Path(sys.argv[0]).parent.resolve()
image = PATH / "some_image.gif"
Use os.path.join instead of "\\":
os.path.join(PATH, "some_image.gif")
The function will join intelligently the different parts of the path.
PATH = os.path.abspath(os.path.join(os.path.dirname(sys.argv[0])))
image = os.path.join(PATH, "some_image.gif")
os.path.join joins the arguments using os.sep, which is the file separator of the OS the code is running on.
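If you only have one operating system available, the standard library still lets you see what os.path.join would produce elsewhere: os.path is ntpath on Windows and posixpath on Unix-like systems, and both modules can be imported on any platform. A small sketch (the directory names are made up for illustration):

```python
import ntpath
import posixpath

# What os.path.join does on Windows.
windows_image = ntpath.join(r"C:\app", "some_image.gif")

# What os.path.join does on Linux or macOS.
posix_image = posixpath.join("/app", "some_image.gif")

print(windows_image)
print(posix_image)
```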
I have a basic file/folder structure on the Desktop where the "Test" folder contains "Folder 1", which in turn contains 2 subfolders:
An "Original files" subfolder which contains shapefiles (.shp).
A "Processed files" subfolder which is empty.
I am attempting to write a script which looks into each parent folder (Folder 1, Folder 2 etc) and if it finds an Original Files subfolder, it will run a function and output the results into the Processed files subfolder.
I made a simple diagram to showcase this where if Folder 1 contains the relevant subfolders then the function will run; if Folder 2 does not contain the subfolders then it's simply ignored:
I looked into the following posts but am still having some trouble:
python glob issues with directory with [] in name
Getting a list of all subdirectories in the current directory
How to list all files of a directory?
The following is the script, which seems to run happily. The annoying thing is that it doesn't produce an error, so this real noob can't see where the problem is:
import os, sys
from os.path import expanduser
home = expanduser("~")
for subFolders, files in os.walk(home + "\Test\\" + "\*Original\\"):
    if filename.endswith('.shp'):
        output = home + "\Test\\" + "\*Processed\\" + filename
        # do_some_function, output
I guess you mixed something up in your os.walk()-loop.
I just created a simple structure as shown in your question and used this code to get what you're looking for:
import os

root_dir = '/path/to/your/test_dir'
original_dir = 'Original files'
processed_dir = 'Processed files'
for path, subdirs, files in os.walk(root_dir):
    if original_dir in path:
        # The processed folder is a sibling of the original folder.
        parent = os.path.sep.join(path.split(os.path.sep)[:-1])
        for file in files:
            if file.endswith('.shp'):
                print('original dir: \t' + path)
                print('original file: \t' + path + os.path.sep + file)
                print('processed dir: \t' + parent + os.path.sep + processed_dir)
                print('processed file: ' + parent + os.path.sep + processed_dir + os.path.sep + file)
                print('')
I'd suggest only using wildcards in a directory-crawling script if you are REALLY sure what your directory tree looks like. I'd rather use the full names of the folders to search for, as in my script.
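The same walk can also be written with os.path.join, os.path.dirname and os.path.basename instead of manual string splitting. This is only a sketch under the folder layout described in the question; plan_outputs is a hypothetical helper name, and what you do with each pair is up to your own processing function.

```python
import os

def plan_outputs(root_dir, original_dir="Original files", processed_dir="Processed files"):
    """Yield (source, destination) pairs for every .shp file in an original folder."""
    for path, subdirs, files in os.walk(root_dir):
        # Only act on folders that are named exactly like the original folder.
        if os.path.basename(path) != original_dir:
            continue
        # The processed folder sits next to the original folder.
        parent = os.path.dirname(path)
        for name in files:
            if name.lower().endswith(".shp"):
                source = os.path.join(path, name)
                destination = os.path.join(parent, processed_dir, name)
                yield source, destination
```

A caller would loop over plan_outputs(root_dir) and pass each source and destination to its own function. Folders without an "Original files" subfolder simply produce no pairs.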
Update: Paths
Whenever you use paths, take care of your path separators - the slashes.
On windows systems, the backslash is used for that:
C:\any\path\you\name
Most other systems use a normal, forward slash:
/the/path/you/want
In python, a forward slash could be used directly, without any problem:
path_var = '/the/path/you/want'
...as opposed to backslashes. A backslash is a special character in python strings. For example, it's used for the newline-command: \n
To clarify that you don't want to use it as a special character, but as a backslash itself, you either have to "escape" it, using another backslash: '\\'. That makes a windows path look like this:
path_var = 'C:\\any\\path\\you\\name'
...or you could mark the string as a "raw" string with a preceding r. Note that by doing that, escape sequences are no longer interpreted in that string.
path_var = r'C:\any\path\you\name'
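A quick way to convince yourself that the two spellings are equivalent, and to see what goes wrong without them, is sketched here (the C:\temp path is just an illustrative value):

```python
# Escaped and raw spellings produce exactly the same string.
escaped = 'C:\\any\\path\\you\\name'
raw = r'C:\any\path\you\name'
same = escaped == raw

# Without escaping, "\t" is silently turned into a tab character,
# so this is NOT the path C:\temp.
broken = 'C:\temp'
has_tab = '\t' in broken

print(same, has_tab, repr(broken))
```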
In your comment, you used the example root_dir = home + "\Test\\". The first backslash in this string is not escaped, so Python tries to make sense of the backslash and the following character: \T. That one happens not to be a recognized escape sequence, so Python leaves it as-is, but \t would be converted to a tab. Relying on that is fragile, so escape every backslash or use a raw string.
I'm wondering why your other example works. In "C:\Users\me\Test\\", the \U and \m should lead to similar errors. And you also mixed single and double backslashes.
That said...
Once you take care of your OS path separators and start trying out new paths, also note that Python handles a lot of path-related things for you. For example, when a script reads a directory, as os.walk() does, the paths it returns already use the correct separators (on my Windows system they are displayed as escaped double backslashes). There's no need to check those; it's usually just hardcoded strings where you have to take care.
Also: the Python os.path module provides a lot of functions to handle paths, separators and so on. For example, os.path.sep (and os.sep, too) is the correct separator for the system Python is running on. You can also build paths using os.path.join().
And finally: The home-directory
You use expanduser("~") to get the home-path of the current user. That should work fine, but if you're using an old python version, there could be a bug - see: expanduser("~") on Windows looks for HOME first
So check that the home path is resolved correctly, and then build your paths using the power of the os module :-)
Hope that helps!
After reading the online documentation for the os.path.join() method, the following case seems like it should qualify but apparently it doesn't. Am I reading that documentation correctly?
>>> import os
>>> os.path.join("/home/user", "/projects/pyproject", "mycode.py")
'/projects/pyproject/mycode.py'
After trying different combinations of trailing and leading os.sep on the first and second paths, it seems that the second path to join cannot have its first character start with an os.sep.
>>> os.path.join("/home/user", "projects/pyproject", "mycode.py")
'/home/user/projects/pyproject/mycode.py'
If path1 and path2 come from, say, user input, that means writing code to check their input for a leading os.sep.
From the python.org online reference:
os.path.join(path1[, path2[, ...]]) Join one or more path components
intelligently. If any component is an absolute path, all previous
components (on Windows, including the previous drive letter, if there
was one) are thrown away, and joining continues. The return value is
the concatenation of path1, and optionally path2, etc., with exactly
one directory separator (os.sep) following each non-empty part except
the last. (This means that an empty last part will result in a path
that ends with a separator.) Note that on Windows, since there is a
current directory for each drive, os.path.join("c:", "foo") represents
a path relative to the current directory on drive C: (c:foo), not
c:\foo.
Am I reading that documentation correctly?
Try reading it again, emphasis mine:
Join one or more path components intelligently. If any component is an
absolute path, all previous components (on Windows, including the
previous drive letter, if there was one) are thrown away, and
joining continues. The return value is the concatenation of path1,
and optionally path2, etc., with exactly one directory separator
(os.sep) following each non-empty part except the last. (This means
that an empty last part will result in a path that ends with a
separator.) Note that on Windows, since there is a current directory
for each drive, os.path.join("c:", "foo") represents a path relative
to the current directory on drive C: (c:foo), not c:\foo.
When it says previous components are "thrown away", it means that they are ignored and not included in the final result.
It is just as the documentation says: if any component is absolute, the previous components are thrown away. If your path begins with /, then it is absolute. If it's not supposed to be absolute, it shouldn't start with /.
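If the later components come from user input and are never meant to be absolute, one possible approach is to strip the leading separator before joining. The sketch below uses posixpath so it behaves the same everywhere; join_under is a hypothetical helper name, and it deliberately does nothing about ".." components, which would need separate handling.

```python
import posixpath

def join_under(base, *parts):
    """Join parts below base, treating a leading "/" on any part as accidental."""
    cleaned = [part.lstrip("/") for part in parts]
    return posixpath.join(base, *cleaned)

# Documented behaviour: the absolute second component discards the first.
as_documented = posixpath.join("/home/user", "/projects/pyproject", "mycode.py")

# With the leading slash stripped, the base is kept.
kept_base = join_under("/home/user", "/projects/pyproject", "mycode.py")
```

Whether stripping is the right choice depends on the application; if an absolute path is a legitimate input, the documented behaviour is exactly what you want.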