Accessing file path in the os - python

The following line, unless I'm mistaken, will grab the absolute path to your directory so you can access files
PATH = os.path.abspath(os.path.join(os.path.dirname(sys.argv[0])))
This is what I've typically been using to access files in my current directory when I need images etc. in the programs I've been writing.
Now, say I do the following since I'm using windows to access a specific image in the directory
image = PATH + "\\" + "some_image.gif"
This is where my question lies: this works on Windows, but if I remember correctly "\\" is Windows-specific and will not work on other operating systems? I cannot test this myself as I don't have access to other operating systems, or I wouldn't have bothered posting. As far as I can tell from where I've looked, this isn't mentioned in the documentation.
If this is indeed the case is there a way around this?

Yes, '\\' is just for Windows. You can use os.sep, which will be '\\' on Windows, ':' on classic Mac, '/' on almost everything else, or whatever is appropriate for the current platform.
You can usually get away with using '/'. Nobody's likely to be running your program on anything but Windows or Unix, and Windows will accept '/' pathnames in most cases. But many Windows command-line tools will mistake your path for a flag if it starts with /, and some will even if there is a / in the middle; and if you're using \\.\ paths, a / is treated as a regular character rather than a separator, and so on. So you're better off not doing that.
The simple thing to do is just use os.path.join:
image = os.path.join(PATH, "some_image.gif")
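If you can't test on another OS, one way to preview the result is to call the two implementations directly: the standard library ships them as ntpath (Windows rules) and posixpath (Unix rules), and os.path is whichever one matches the running system. A minimal sketch, with made-up directory names purely for illustration:

```python
import ntpath      # the Windows flavour of os.path
import posixpath   # the Unix flavour of os.path

# Hypothetical directories, used only for illustration.
windows_image = ntpath.join(r"C:\games\myapp", "some_image.gif")
unix_image = posixpath.join("/home/me/myapp", "some_image.gif")

print(windows_image)  # C:\games\myapp\some_image.gif
print(unix_image)     # /home/me/myapp/some_image.gif
```

In real code you would still call os.path.join and let Python pick the flavour.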
As a side note, in your first line, you're already using join—but you don't need it there:
PATH = os.path.abspath(os.path.join(os.path.dirname(sys.argv[0])))
It's perfectly legal to call join with only one argument like this, but also perfectly useless; you just join the one thing with nothing; you will get back exactly what you passed in. Just do this:
PATH = os.path.abspath(os.path.dirname(sys.argv[0]))
One last thing: If you're on Python 3.4+, you may want to consider using pathlib instead of os.path:
from pathlib import Path
PATH = Path(sys.argv[0]).parent.resolve()
image = PATH / "some_image.gif"
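The / operator on pathlib objects chooses the separator for you. As a rough sketch, the "pure" path classes let you preview both flavours on any machine, since they never touch the file system (the directory names here are hypothetical):

```python
from pathlib import PurePosixPath, PureWindowsPath

# Pure paths do no file system access, so both flavours work on any OS.
win_image = PureWindowsPath("C:/games/myapp") / "some_image.gif"
posix_image = PurePosixPath("/home/me/myapp") / "some_image.gif"

print(win_image)    # C:\games\myapp\some_image.gif
print(posix_image)  # /home/me/myapp/some_image.gif
```

Note that the Windows flavour also converts the forward slashes in its input to backslashes when printed.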

Use os.path.join instead of "\\":
os.path.join(PATH, "some_image.gif")
The function intelligently joins the different parts of the path.

PATH = os.path.abspath(os.path.join(os.path.dirname(sys.argv[0])))
image = os.path.join(PATH, "some_image.gif")
os.path.join will intelligently join the arguments using os.sep which uses the OS file separator for you.


Primer needed in python pathnames

I am a very novice coder, and Python is my first (and, practically speaking, only) language. I am charged as part of a research job with manipulating a collection of data analysis scripts, first by getting them to run on my computer. I was able to do this, essentially by removing all lines of coding identifying paths, and running the scripts through a Jupyter terminal opened in the directory where the relevant modules and CSV files live so the script knows where to look (I know that Python defaults to the location of the terminal).
Here are the particular blocks of code whose function I don't understand
import sys
sys.path.append('C:\Users\Ben\Documents\TRACMIP_Project\mymodules/')
import altdata as altdata
I have replaced the pathname in the original code with the path name leading to the directory where the module is; the file containing all the CSV files that end up being referenced here is also in mymodules.
This works depending on where I open the terminal, but the only way I can get it to work consistently is by opening the terminal in mymodules, which is fine for now but won't work when I need to work by accessing the server remotely. I need to understand better precisely what is being done here, and how it relates to the location of the terminal (all the documentation I've found is overly technical for my knowledge level).
Here is another segment I don't understand
import os.path
csvfile = 'csv/' + model +'_' + exp + '.csv'
if os.path.isfile(csvfile): # csv file exists
hcsvfile = open(csvfile )
I get here that it's looking for the CSV file, but I'm not sure how. I'm also not sure why then on some occasions depending on where I open the terminal it's able to find the module but not the CSV files.
I would love an explanation of what I've presented, but more generally I would like information (or a link to information) explaining paths and how they work in scripts in modules, as well as what are ways of manipulating them. Thanks.
sys.path
This is a simple list of directories where Python will look for modules and packages (.py files and directories with an __init__.py file; see the modules tutorial). Extending this list allows you to load modules (custom libraries, etc.) from non-default locations. Usually you change it at runtime; for fixed directories you can instead modify a startup script to set the needed environment variables.
os.path
This module implements some useful functions on pathnames.
... and allows you to find out whether a file exists, whether it is a link, a directory, etc.
Why did loading *.csv fail?
Because sys.path is responsible for module loading, and only for that. When you use a relative path:
csvfile = 'csv/' + model +'_' + exp + '.csv'
open() will look in the current working directory:
file is either a string or bytes object giving the pathname (absolute or relative to the current working directory)...
You need to use absolute paths, constructing them with the os.path module.
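One way to do that, sketched under the assumption that the csv folder sits next to the script: start from a known base directory and join onto it, so the result no longer depends on where the terminal was opened. The helper name csv_path and the sample values "foo" and "bar" are made up for illustration.

```python
import os

def csv_path(base_dir, model, exp):
    """Build an absolute path to <base_dir>/csv/<model>_<exp>.csv."""
    return os.path.abspath(os.path.join(base_dir, "csv", model + "_" + exp + ".csv"))

# In a script you would normally anchor on the script's own location:
#   base_dir = os.path.dirname(os.path.abspath(__file__))
# The current directory is used here only so the sketch runs anywhere.
example = csv_path(os.getcwd(), "foo", "bar")
print(os.path.isabs(example))  # True
```

The path returned is absolute, so open() and os.path.isfile() will behave the same regardless of the working directory.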
I agree with cdarke's comment that you are probably running into an issue with backslashes. Replacing the line with:
sys.path.append(r'C:\Users\Ben\Documents\TRACMIP_Project\mymodules')
will likely solve your problem. Details below.
In general, Python treats paths as if they're relative to the current directory (where your terminal is running). When you feed it an absolute path-- which is a path that includes the root directory, like the C:\ in C:\Users\Ben\Documents\TRACMIP_Project\mymodules-- then Python doesn't care about the working directory anymore, it just looks where you tell it to look.
Backslashes are used to make special characters within strings, such as line breaks (\n) and tabs (\t). The snag you've hit is that Python paths are strings first, paths second, so backslashes in a path can get interpreted as escape sequences. In Python 3, the \U in \Users begins a Unicode escape, which makes this particular literal a SyntaxError; \B, \D, \T and \m are not recognized escapes, so they are left as-is, but the path would have been silently mangled had they been, say, \b or \t. If you prefix the string with 'r', Python will ignore the special meaning of the backslash and just interpret it as a literal backslash (what you want).
The reason it still works if you run the script from the mymodules directory is that Python automatically looks in the working directory for files when asked. sys.path.append(path) is telling the computer to include that directory when it looks for modules, so that you can import from that directory no matter where you're running the script. The faulty path will still get added, but it's meaningless. There is no directory where you point it, so there's nothing to find there.
As for path manipulation in general, the "safest" way is to use the functions in os.path, which are platform-independent and will give the correct output whether you're working in a Windows or a Unix environment (usually).
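A small sketch of the difference, using a made-up path whose folder names happen to start with n and t so that the escapes actually fire:

```python
# Hypothetical path chosen so that \n and \t collide with real escapes.
plain = "C:\new\table"   # \n becomes a newline, \t becomes a tab
raw = r"C:\new\table"    # raw string: every backslash is kept literally

print(len(plain))                # 10
print(len(raw))                  # 12
print(raw == "C:\\new\\table")   # True: doubling the backslashes also works
```

The two extra characters in the raw string are the backslashes that the plain literal swallowed.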
EDIT: Forgot to cover the second part. Since Python paths are strings, you can build them using string operations. That's what is happening with the line
csvfile = 'csv/' + model +'_' + exp + '.csv'
Presumably model and exp are strings that appear in the filenames in the csv/ folder. With model = "foo" and exp = "bar", you'd get csv/foo_bar.csv which is a relative path to a file (that is, relative to your working directory). The code makes sure a file actually exists at that path and then opens it. Assuming the csv/ folder is in the same path as you added in sys.path.append, this path should work regardless of where you run the file, but I'm not 100% certain on that. EDIT: outoftime pointed out that sys.path.append only works for modules, not opening files, so you'll need to either expand csv/ into an absolute path or always run in its parent directory.
Also, Windows accepts both forward slashes and backslashes in paths, but you should probably not mix them. All backslashes or all forward slashes only. os.path.join will insert the right separator between the pieces you give it (it does not rewrite separators already inside a piece; os.path.normpath does that). I'd probably change the line to
csvfile = os.path.join('csv', model + '_' + exp + '.csv')
for consistency's sake.

What are some specific examples of using the wrong path separator failing? [duplicate]

I'm not able to see the bigger picture here I think; but basically I have no idea why you would use os.path.join instead of just normal string concatenation?
I have mainly used VBScript so I don't understand the point of this function.
Portable
Write filepath manipulations once and it works across many different platforms, for free. The delimiting character is abstracted away, making your job easier.
Smart
You no longer need to worry if that directory path had a trailing slash or not. os.path.join will add it if it needs to.
Clear
Using os.path.join makes it obvious to other people reading your code that you are working with filepaths. People can quickly scan through the code and discover it's a filepath intrinsically. If you decide to construct it yourself, you will likely distract the reader from finding actual problems with your code: "Hmm, some string concats, a substitution. Is this a filepath or what? Gah! Why didn't he use os.path.join?" :)
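To illustrate the "Smart" point, here is a small sketch comparing join with naive concatenation. It uses posixpath (the Unix flavour of os.path) so the output is the same on any OS; the directory and file names are hypothetical.

```python
import posixpath  # Unix flavour of os.path; gives the same result on any OS

with_slash = posixpath.join("/data/logs/", "today.txt")
without_slash = posixpath.join("/data/logs", "today.txt")
naive = "/data/logs/" + "/" + "today.txt"   # blindly adds a second slash

print(with_slash)     # /data/logs/today.txt
print(without_slash)  # /data/logs/today.txt
print(naive)          # /data/logs//today.txt
```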
Will work on Windows with '\' and Unix (including Mac OS X) with '/'.
for posixpath here's the straightforward code
In [22]: os.path.join??
Type: function
String Form:<function join at 0x107c28ed8>
File: /usr/local/Cellar/python/2.7.3/Frameworks/Python.framework/Versions/2.7/lib/python2.7/posixpath.py
Definition: os.path.join(a, *p)
Source:
def join(a, *p):
    """Join two or more pathname components, inserting '/' as needed.
    If any component is an absolute path, all previous path components
    will be discarded."""
    path = a
    for b in p:
        if b.startswith('/'):
            path = b
        elif path == '' or path.endswith('/'):
            path += b
        else:
            path += '/' + b
    return path
I don't have Windows, but the ntpath version should behave similarly with '\' (it additionally has to deal with drive letters).
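The "absolute component discards everything before it" rule from that docstring is easy to miss, so here is a short sketch of it. The paths are arbitrary examples.

```python
import posixpath

normal = posixpath.join("/usr/local", "share", "doc")
restarted = posixpath.join("/usr/local", "/etc", "hosts")  # "/etc" is absolute

print(normal)     # /usr/local/share/doc
print(restarted)  # /etc/hosts
```

This is worth remembering when a component comes from user input: a leading slash silently throws away the base directory.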
It is OS-independent. If you hardcode your paths as C:\Whatever they will only work on Windows. If you hardcode them with the Unix-style "/" they are only guaranteed to work on Unix (Windows accepts "/" in most, but not all, contexts). os.path.join joins the paths using the correct separator for the operating system it is running under.

How to use "/" (directory separator) in both Linux and Windows in Python?

I have written code in Python which uses / to create a particular file in a folder. If I run the code on Windows it will not work. Is there a way to use the same code on both Windows and Linux?
In python I am using this code:
pathfile=os.path.dirname(templateFile)
rootTree.write(''+pathfile+'/output/log.txt')
When I run my code on, say, a Windows machine, it will not work.
How do I use "/" (directory separator) in both Linux and Windows?
Use os.path.join().
Example: os.path.join(pathfile,"output","log.txt").
In your code that would be: rootTree.write(os.path.join(pathfile,"output","log.txt"))
Use:
import os
print(os.sep)
to see what the separator looks like on the current OS.
In your code you can use:
import os
path = os.path.join('folder_name', 'file_name')
You can use os.sep:
>>> import os
>>> os.sep
'/'
os.path.normpath(pathname) should also be mentioned, as it converts / path separators into \ separators on Windows. It also collapses redundant separators and up-level references: A//B, A/foo/../B and A/./B all become A/B. And if you are on Windows, these all become A\B.
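A quick sketch of that behaviour, run through both flavours of os.path (posixpath and ntpath) so you can see the Unix and Windows results side by side on any machine:

```python
import ntpath
import posixpath

samples = ["A//B", "A/B/", "A/./B", "A/foo/../B"]

posix_results = [posixpath.normpath(p) for p in samples]
windows_results = [ntpath.normpath(p) for p in samples]

print(posix_results)    # ['A/B', 'A/B', 'A/B', 'A/B']
print(windows_results)  # ['A\\B', 'A\\B', 'A\\B', 'A\\B']
```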
If you are fortunate enough to be running Python 3.4+, you can use pathlib:
from pathlib import Path
path = Path(dir, subdir, filename) # returns a path of the system's path flavour
or, equivalently,
path = Path(dir) / subdir / filename
Some useful links that will help you:
os.sep
os.path
os.pathsep
Do an import os and then use os.sep
You can use os.sep:
import os
pathfile=os.path.dirname(templateFile)
directory = str(pathfile)+os.sep+'output'+os.sep+'log.txt'
rootTree.write(directory)
Don't build directory and file names yourself; use Python's included libraries.
In this case the relevant one is os.path, especially join, which creates a new pathname from a directory and a file name (or another directory), and split, which separates the file name from the rest of a full path.
Your example would be
pathfile=os.path.dirname(templateFile)
p = os.path.join(pathfile, 'output')
p = os.path.join( p, 'log.txt')
rootTree.write(p)
If someone is looking for something like this:
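They want to refer to the parent directory, then go into the sub-folders and maybe then to a specific file. If so, I use the following approach.
I am using Python 3.9 as of now, where the os module handles such tasks. os.path.pardir is the constant string the OS uses to mean "parent directory" ('..' on Windows and POSIX), so it refers to the parent of the current working directory rather than returning an actual directory name: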
parent_dir = os.path.pardir
It's good coding practice not to hardcode the file path separators (/ or \). Instead, use the operating-system-dependent mechanism provided by the above-mentioned os module. It makes your code much more reusable for other purposes and people. It goes like this (just an example):
path = os.path.pardir + os.sep + 'utils' + os.sep + 'properties.ini'
print(f'The path to my global properties file is :: {path}')
Output (on Windows):
The path to my global properties file is :: ..\utils\properties.ini
You can surely look at the whole documentation here : https://docs.python.org/3/library/os.html
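If you want the result to be independent of the working directory, a possible variation is to join pardir onto a known directory and normalise the result. The layout below (/proj/src and /proj/utils/properties.ini) is hypothetical, and posixpath is used only so the printed output is the same on every OS:

```python
import os
import posixpath

print(os.path.pardir)  # ..  (a constant string, not a directory lookup)

# Hypothetical layout: scripts in /proj/src, settings in /proj/utils.
script_dir = "/proj/src"
relative = posixpath.join(script_dir, posixpath.pardir, "utils", "properties.ini")
resolved = posixpath.normpath(relative)

print(relative)  # /proj/src/../utils/properties.ini
print(resolved)  # /proj/utils/properties.ini
```

In a real script, script_dir would typically come from os.path.dirname(os.path.abspath(__file__)) and you would call os.path.join and os.path.normpath instead of the posixpath versions.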
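I use pathlib for most things, so I like pathlib.os.sep. Note that this only works because pathlib happens to import os internally, which is an implementation detail; os.sep is the documented spelling.
Usually pathlib is the better choice if you don't need os!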

Python - Can (or should) I change os.path.sep?

I am writing a script to parse multiple log files and maintain a list of the files that have been processed. When I read the list of files to process I use os.walk and get names similar to the following:
C:/Users/Python/Documents/Logs\ServerUI04\SystemOut_13.01.01_20.22.25.log
This is created by the following code:
filesToProcess.extend(os.path.join(root, filename) for filename in filenames if logFilePatternMatch.match(filename))
It appears that "root" used forward slashes as the separator (I am on Windows and find that more convenient) but "filename" uses backslashes so I end up with an inconsistent path for the file as it contains a mixture of forward and back slashes as separators.
I have tried setting the separator with:
os.path.sep = "/"
and
os.sep = "/"
Before the .join but it seems to have no effect. I realize that in theory I could manipulate the string but longer term I'd like my script to run on Unix as well as Windows so would prefer that it be dynamic if possible.
Am I missing something?
Update:
Based on the helpful responses below it looks like my problem was self inflicted, for convenience I had set the initial path used as root like this:
logFileFolder = ['C:/Users/Python/Documents/Logs']
When I changed it to this:
logFileFolder = ['C:\\Users\\Python\\Documents\\Logs']
Everything works and my resulting file paths all use the "\" throughout. It looks like my approach was wrong in that I was trying to get Python to change behavior rather than correcting what I was setting as a value.
Thank you!
I would keep my fingers off os.sep and use os.path.normpath() on the result of combining the root and a filename:
filesToProcess.extend(os.path.normpath(os.path.join(root, filename))
for filename in filenames if logFilePatternMatch.match(filename))
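To see why normpath fixes the mixed separators, here is a sketch that reproduces the situation with ntpath (the Windows flavour of os.path, importable on any OS). The file name is shortened and made up for illustration.

```python
import ntpath  # Windows flavour of os.path; importable on any OS

# A root given with forward slashes, joined using Windows rules.
mixed = ntpath.join("C:/Users/Python/Documents/Logs", "ServerUI04", "SystemOut.log")
clean = ntpath.normpath(mixed)

print(mixed)  # C:/Users/Python/Documents/Logs\ServerUI04\SystemOut.log
print(clean)  # C:\Users\Python\Documents\Logs\ServerUI04\SystemOut.log
```

join only inserts separators between the pieces and leaves the slashes inside the root untouched; normpath is what rewrites them.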
I have preferred the following utility function.
from os.path import sep, join
def pjoin(*args, **kwargs):
    return join(*args, **kwargs).replace(sep, '/')
It converts both variations (Linux style and Windows style) to Linux style. Both Windows and Linux support the '/' separator in Python.
I rejected the simplistic os.sep.join(['str','str','str']) because it does not take into account existing separators. Take the following case with sep.join vs vanilla join:
In[79]: os.sep.join(['/existing/my/', 'short', 'path'])
Out[79]: '/existing/my/\\short\\path'
In[80]: os.path.join('/existing/my/', 'short', 'path')
Out[80]: '/existing/my/short\\path'
The vanilla join could be repaired with the suggested:
In[75]: os.path.normpath(os.path.join('/existing/my/', 'short', 'path'))
Out[75]: '\\existing\\my\\short\\path'
So far so good. But then we introduce the following scenario where we will be interacting with linux from windows.
local_path = os.path.normpath(os.path.join('C:\\local\\base', 'subdir', 'filename.txt'))
remote_path = os.path.normpath(os.path.join('/remote/base', 'subdir', 'filename.txt'))
sftp_server.upload(local_path, remote_path)
The above will then fail because the sftp server expects a '/' separator while os.path.normpath will on windows normalize to '\'.
By using the pjoin utility function or similar, it will work cross OS, web, ftp, etc.
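As an alternative to a custom helper (this is a different technique from pjoin, not the same one), the standard library's posixpath module can be called directly for paths that must always use '/', such as remote SFTP paths, while local paths keep using the local rules. A minimal sketch reusing the example paths from above; ntpath stands in for os.path on Windows so the output is reproducible anywhere:

```python
import ntpath
import posixpath

# Remote (SFTP/URL-style) path: always '/', whatever the local OS is.
remote_path = posixpath.join("/remote/base", "subdir", "filename.txt")

# Local Windows path, built with the Windows rules for comparison.
local_path = ntpath.join("C:\\local\\base", "subdir", "filename.txt")

print(remote_path)  # /remote/base/subdir/filename.txt
print(local_path)   # C:\local\base\subdir\filename.txt
```

The point is simply to pick the path flavour that matches the system that will interpret the path, not the system the script runs on.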
I use '/'.join([path1, path2]) to solve this problem, because '/' works well on both Windows and Linux.
You are better off not touching os.sep and os.path.sep as they are not what os.path.join is using. You could use os.path.normpath as suggested by Anthon. Another alternative is to have your own simple path join:
os.sep.join([i1,i2,i3])

os.path.join not properly formatting path

I'm writing a command-line directory navigator for Windows in Python and struggling a bit with os.path.join. Here's, in essence, what I'm trying to do:
abspath = r"C:\Python32\Projects\ls.py"
abspath = abspath.split('\\')
print(abspath) #this prints ['C:', 'Python32', 'Projects', 'ls.py']
if(options.mFlag):
    print(os.path.join(*abspath)) #this prints C:Python32\Projects\ls.py
    m = time.ctime(os.path.getmtime(os.path.join(*abspath))) #this throws an exception
The problem is that os.path.join is not inserting a '/' after 'C:' and I can't figure out why. Any help?
Edit: In case anyone in the future comes here looking for a solution, I just added os.sep after "C:" instead of hardcoding a backslash and that worked.
From the documentation:
Note that on Windows, since there is a current directory for each drive, os.path.join("c:", "foo") represents a path relative to the current directory on drive C: (c:foo), not c:\foo.
It's a little hard to tell what you're trying to accomplish, since all your code seems to be aiming at is to split the path and then put it back together exactly the way it was, in which case why split it in the first place? But maybe os.path.splitdrive will help you? It splits the drive letter from the path.
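A short sketch of both points, using ntpath (the Windows implementation of os.path) so it behaves the same on any OS. The shortened path in the join calls is just for illustration.

```python
import ntpath

drive, rest = ntpath.splitdrive(r"C:\Python32\Projects\ls.py")
print(drive)  # C:
print(rest)   # \Python32\Projects\ls.py

# A bare drive letter gives a drive-relative path...
relative = ntpath.join("C:", "Python32", "ls.py")
# ...while a drive plus separator gives the absolute path.
absolute = ntpath.join("C:\\", "Python32", "ls.py")

print(relative)  # C:Python32\ls.py
print(absolute)  # C:\Python32\ls.py
```

This is why appending os.sep to "C:" before joining produces the expected path.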
The docs ( http://docs.python.org/2/library/os.path.html) specify this behaviour:
Note that on Windows, since there is a current directory for each drive, os.path.join("c:", "foo") represents a path relative to the current directory on drive C: (c:foo), not c:\foo.
