Weird python file path behavior

I have this folder structure. Within edi_standards.py, I want to open csv/transaction_groups.csv
But the code only works when I access it like this: os.path.join('standards', 'csv', 'transaction_groups.csv')
What I think it should be is os.path.join('csv', 'transaction_groups.csv'), since both edi_standards.py and csv/ are on the same level in the same folder, standards/
This is the output of printing __file__ in case you doubt what I say:
>>> print(__file__)
~/edi_parser/standards/edi_standards.py

When you run a Python file, the interpreter does not change the current directory to the directory of the file you're running.
In your case, you're probably running (from ~/edi_parser):
standards/edi_standards.py
so relative paths are resolved against ~/edi_parser, not against standards/. To avoid this, build the path of your resource file from __file__ by taking its dirname:
os.path.join(os.path.dirname(__file__),"csv","transaction_groups.csv")
Anyway, it's good practice not to rely on the current directory to open resource files. This method works whatever the current directory is.
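A minimal sketch of the same idea wrapped in a helper. The name resource_path is hypothetical (it does not appear in the question); it takes a module's __file__ plus the path components of the resource:

```python
import os

def resource_path(module_file, *parts):
    """Return an absolute path to a resource stored next to module_file."""
    # abspath guards against __file__ being relative to the current directory.
    module_dir = os.path.dirname(os.path.abspath(module_file))
    return os.path.join(module_dir, *parts)

# Inside edi_standards.py this would be called as:
#     csv_path = resource_path(__file__, "csv", "transaction_groups.csv")
# Here a stand-in path is used so the snippet runs anywhere.
example = resource_path(os.path.join("standards", "edi_standards.py"),
                        "csv", "transaction_groups.csv")
print(example)
```

The result always points inside the folder that contains the module, regardless of where the interpreter was started.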

I agree with Jean-Francois's answer above.
I would like to add that os.path.join does not prepend the absolute path of your current working directory to its arguments.
For example, consider the code below:
>>> os.path.join('Functions','hello')
'Functions/hello'
See another example
>>> os.path.join('Functions','hello','/home/naseer/Python','hai')
'/home/naseer/Python/hai'
The official documentation states that whenever an absolute path is given as an argument to os.path.join, all previous path arguments are discarded and joining continues from the absolute path argument.
The point I would like to highlight is that os.path.join only concatenates path components. If the result is relative, it is still resolved against the current working directory when the file is opened. So you have to supply an absolute starting path to locate your file reliably.
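As a small illustration (a sketch, not taken from the answer), the joined path stays relative until it is explicitly anchored to a directory:

```python
import os

relative = os.path.join("Functions", "hello")
print(os.path.isabs(relative))   # False: join did not add the working directory

# Two equivalent ways to anchor the relative path to the current directory.
anchored = os.path.join(os.getcwd(), relative)
resolved = os.path.abspath(relative)
print(anchored)
print(resolved)
```

Anchoring to os.getcwd() only makes the dependency on the working directory explicit; to be independent of it, anchor to os.path.dirname(os.path.abspath(__file__)) instead.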

Related

Navigating directories

After getting the path to the current working directory using:
cwd = os.getcwd()
How would one go up one folder: C:/project/analysis/ to C:/project/ and enter a folder called data (C:/project/data/)?

In general it is a bad idea to 'enter' a directory (i.e. change the current directory), unless that is explicitly part of the behaviour of the program.
To open a file in a directory 'over' from where you are, you can use .. to navigate up one level.
In your case you can open a file using the path ../data/<filename>; in other words, use relative file names.
If you really need to change the current working directory you can use os.chdir(), but remember this could well have side effects. For example, if you import modules from your local directory, then using os.chdir() will probably impact that import.
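A short sketch of building the sibling path without changing directory. The helper name sibling_dir and the file name results.csv are made up for illustration:

```python
import os

def sibling_dir(start_dir, name):
    """Return the path of a directory that sits next to start_dir."""
    # os.pardir is ".." on common platforms; normpath collapses "analysis/..".
    return os.path.normpath(os.path.join(start_dir, os.pardir, name))

data_dir = sibling_dir(os.path.join("project", "analysis"), "data")
print(data_dir)

# With the real working directory this would be sibling_dir(os.getcwd(), "data"),
# and a file inside it can be opened without os.chdir(), e.g.
#     open(os.path.join(data_dir, "results.csv"))   # hypothetical file name
```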

As per Python documentation, you could try this:
os.chdir("../data")

How Python manages relative path

I can't find out how to find a relative path in Python. The code only allows me to use the absolute path.
What I want
config = configparser.ConfigParser()
config.read('..\\main.ini',)
print(config.sections())
FilePaths = config['data']
DictionaryFilePath = FilePaths['DictionaryFilePath']
print(DictionaryFilePath)
what it forces me to do
config = configparser.ConfigParser()
config.read('C:\\Users\\***confendential***\\OneDrive\\文档\\Python\\Chinese for practice\\ChineseWords\\app\\main.ini',)
print(config.sections())
FilePaths = config['data']
DictionaryFilePath = FilePaths['DictionaryFilePath']
print(DictionaryFilePath)
Any Ideas???
Is it Onedrive?

There are already some answers that cover this here, for example.
To summarise, if you just do
config.read('..\\main.ini',)
then your program expects a file, main.ini to be located in the parent directory of wherever you executed the program from.
What you usually want is to specify a path relative to the location of the file that is executing.
You can get the path to that file with __file__ and then manipulate it with the os.path module (see this answer)
In your case, assuming that your main.ini is in the parent directory of the script you are running, you could do
inifile = os.path.join(os.path.dirname(os.path.dirname(__file__)), "main.ini")
config.read(inifile)
Another thing to note in your case is that it is useful to check that the ini file was actually loaded. If the file is not found, config.read() does not raise an error; it returns an empty list of successfully read files and the parser stays empty, so you can check this and print a message or raise an error.
Hope this is helpful. The other linked answers give some more useful info on these ideas.
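A minimal sketch of that check. The helper name load_ini is hypothetical; it relies only on the documented behaviour that ConfigParser.read() returns the list of files it parsed successfully:

```python
import configparser
import os

def load_ini(path):
    """Read an ini file and fail loudly if it could not be read."""
    config = configparser.ConfigParser()
    # read() silently skips missing files and returns the ones it parsed.
    loaded = config.read(path, encoding="utf-8")
    if not loaded:
        raise FileNotFoundError("could not read ini file: " + os.path.abspath(path))
    return config

# Typical call from a script whose parent directory holds main.ini:
#     here = os.path.dirname(os.path.abspath(__file__))
#     config = load_ini(os.path.join(os.path.dirname(here), "main.ini"))
```

Printing the absolute path in the error message shows exactly where Python looked, which usually makes a wrong working directory obvious.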

I think it is just OneDrive. It is forcing you to use a straightforward path.

Primer needed in python pathnames

I am a very novice coder, and Python is my first (and, practically speaking, only) language. I am charged as part of a research job with manipulating a collection of data analysis scripts, first by getting them to run on my computer. I was able to do this, essentially by removing all lines of coding identifying paths, and running the scripts through a Jupyter terminal opened in the directory where the relevant modules and CSV files live so the script knows where to look (I know that Python defaults to the location of the terminal).
Here are the particular blocks of code whose function I don't understand
import sys
sys.path.append('C:\Users\Ben\Documents\TRACMIP_Project\mymodules/')
import altdata as altdata
I have replaced the pathname in the original code with the path name leading to the directory where the module is; the file containing all the CSV files that end up being referenced here is also in mymodules.
This works depending on where I open the terminal, but the only way I can get it to work consistently is by opening the terminal in mymodules, which is fine for now but won't work when I need to work by accessing the server remotely. I need to understand better precisely what is being done here, and how it relates to the location of the terminal (all the documentation I've found is overly technical for my knowledge level).
Here is another segment I don't understand
import os.path
csvfile = 'csv/' + model +'_' + exp + '.csv'
if os.path.isfile(csvfile): # csv file exists
    hcsvfile = open(csvfile )
I get here that it's looking for the CSV file, but I'm not sure how. I'm also not sure why then on some occasions depending on where I open the terminal it's able to find the module but not the CSV files.
I would love an explanation of what I've presented, but more generally I would like information (or a link to information) explaining paths and how they work in scripts in modules, as well as what are ways of manipulating them. Thanks.

sys.path
This is a simple list of directories where Python will look for modules and packages (.py files and directories with an __init__.py file; see the modules tutorial). Extending this list allows you to load modules (custom libs, etc.) from non-default locations (usually you change it at runtime; for static directories you can modify a startup script to set the needed environment variables).
os.path
This module implements some useful functions on pathnames.
... and allows you to find out whether a file exists, whether it is a link, a directory, etc.
Why did loading the *.csv files fail?
Because sys.path is responsible for module loading and only for that. When you use a relative path:
csvfile = 'csv/' + model +'_' + exp + '.csv'
open() will look in the current working directory:
file is either a string or bytes object giving the pathname (absolute or relative to the current working directory)...
You need to use absolute paths, constructing them with the os.path module.
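One way this could look, as a sketch. The helper name csv_path is hypothetical, and "foo" and "bar" are placeholder values for model and exp:

```python
import os

def csv_path(base_dir, model, exp):
    """Build the absolute path of <base_dir>/csv/<model>_<exp>.csv."""
    return os.path.join(os.path.abspath(base_dir), "csv", model + "_" + exp + ".csv")

# In a script, base_dir would typically be os.path.dirname(os.path.abspath(__file__)),
# so the csv/ folder is found next to the script wherever the terminal was opened.
path = csv_path(os.getcwd(), "foo", "bar")
if os.path.isfile(path):  # only open the file if it exists
    with open(path) as handle:
        first_line = handle.readline()
print(path)
```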

I agree with cdarke's comment that you are probably running into an issue with backslashes. Replacing the line with:
sys.path.append(r'C:\Users\Ben\Documents\TRACMIP_Project\mymodules')
will likely solve your problem. Details below.
In general, Python treats paths as if they're relative to the current directory (where your terminal is running). When you feed it an absolute path-- which is a path that includes the root directory, like the C:\ in C:\Users\Ben\Documents\TRACMIP_Project\mymodules-- then Python doesn't care about the working directory anymore, it just looks where you tell it to look.
Backslashes are used to write special characters within strings, such as line breaks (\n) and tabs (\t). The snag you've hit is that Python paths are strings first, paths second. So a backslash followed by a letter in your path can be read as an escape sequence rather than as a separator: in Python 3, \U starts a Unicode escape, so '\Users' in an ordinary string literal is a syntax error, while sequences such as \B, \D, \T and \m are not recognized escapes and only happen to survive unchanged. If you prefix the string with 'r', Python ignores the special meaning of the backslash and interprets it as a literal backslash (what you want).
The reason it still works if you run the script from the mymodules directory is that Python automatically looks in the working directory for files when asked. sys.path.append(path) tells Python to include that directory when it looks for modules to import, so that you can use modules in that directory no matter where you're running the script. A faulty path will still get added, but it's meaningless. There is no directory where you point it, so there's nothing to find there.
As for path manipulation in general, the "safest" way is to use the functions in os.path, which are platform-independent and will give the correct output whether you're working in a Windows or a Unix environment (usually).
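A quick way to see the difference between an ordinary and a raw string literal (a small sketch; the path is a shortened version of the one in the question):

```python
plain = "a\tb"    # \t is read as a single tab character
raw = r"a\tb"     # raw string: the backslash and the 't' stay as two characters
print(len(plain), len(raw))   # 3 4

# Three spellings of the same Windows path that avoid accidental escapes:
p1 = r"C:\Users\Ben\Documents"     # raw string
p2 = "C:\\Users\\Ben\\Documents"   # doubled backslashes
p3 = "C:/Users/Ben/Documents"      # forward slashes, also accepted by Windows
print(p1 == p2)   # True
```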
EDIT: Forgot to cover the second part. Since Python paths are strings, you can build them using string operations. That's what is happening with the line
csvfile = 'csv/' + model +'_' + exp + '.csv'
Presumably model and exp are strings that appear in the filenames in the csv/ folder. With model = "foo" and exp = "bar", you'd get csv/foo_bar.csv which is a relative path to a file (that is, relative to your working directory). The code makes sure a file actually exists at that path and then opens it. Assuming the csv/ folder is in the same path as you added in sys.path.append, this path should work regardless of where you run the file, but I'm not 100% certain on that. EDIT: outoftime pointed out that sys.path.append only works for modules, not opening files, so you'll need to either expand csv/ into an absolute path or always run in its parent directory.
Also, I think Windows accepts either direction of slash in paths, but you should probably not mix them. All backslashes or all forward slashes only. os.path.join inserts the correct separator for your platform. I'd probably change the line to
csvfile = os.path.join('csv', model + '_' + exp + '.csv')
for consistency's sake.

How to generate path of a directory in python

I have a file abc.py under the workspace dir.
I am using os.listdir('/home/workspace/tests') in abc.py to list all the files (test1.py, test2.py...)
I want to generate the path '/home/workspace/tests' or even '/home/workspace' instead of hardcoding it.
I tried os.getcwd() and os.path.dirname(os.path.abspath(__file__)) but this instead generates the path where the test script is being run.
How to go about it?

The only way to refer to a specific folder that is unrelated to your script's location, without hardcoding it, is to pass it as a parameter to the script (search for: command line argument).
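A minimal sketch of that suggestion using the standard argparse module. The function names and the argument name tests_dir are made up for illustration:

```python
import argparse
import os

def parse_args(argv=None):
    """Parse the directory to list from the command line."""
    parser = argparse.ArgumentParser(description="List the files in a directory.")
    parser.add_argument("tests_dir", help="directory whose files should be listed")
    return parser.parse_args(argv)

def list_files(argv=None):
    """Return the sorted entries of the directory given on the command line."""
    args = parse_args(argv)
    return sorted(os.listdir(args.tests_dir))

# From a shell this would be used as:  python abc.py /home/workspace/tests
# with the script body calling list_files() so that sys.argv is read.
```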

I think you are asking how to get a relative path instead of an absolute one.
An absolute path looks like: "/home/workspace"
A relative path looks like: "./../workspace"
You should construct the relative path from the dir where your script is (/home/workspace/tests) to the dir that you want to access (/home/workspace), which in this case means going one step up in the directory tree.
You can get this by executing:
os.path.abspath(os.path.join(os.path.dirname(os.path.abspath(__file__)), ".."))
The same result may be achieved if you go two steps up and one step down to the workspace dir:
os.path.abspath(os.path.join(os.path.dirname(os.path.abspath(__file__)), "..", "..", "workspace"))
Note that ".." must come after the script's directory in os.path.join: an absolute path placed later in the argument list discards everything before it.
In this manner you can access any directory without knowing its absolute path, knowing only where it resides relative to your executed file.
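A sketch of walking up from a script's directory, assuming the script sits in a tests folder inside workspace as in this answer. The helper name ancestor is hypothetical:

```python
import os

def ancestor(path, levels=1):
    """Return the directory `levels` steps above the directory containing path."""
    result = os.path.dirname(os.path.abspath(path))
    for _ in range(levels):
        result = os.path.dirname(result)
    return result

# Inside a script in .../workspace/tests, ancestor(__file__) gives .../workspace.
# A stand-in path is used here so the snippet runs anywhere.
script = os.path.join(os.getcwd(), "workspace", "tests", "test1.py")
print(ancestor(script, 0))   # the tests directory itself
print(ancestor(script, 1))   # its parent, the workspace directory
```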

Where does python look for files in a script? [duplicate]

So I've just coded this class for a title screen and it works well. However, one of the people I'm working with on the project mentioned that I shouldn't use:
os.chdir(os.getcwd() + "/..")
resource = (os.getcwd() + "/media/file name")
to get to the super directory. He did mention something about the pythonpath though. We're using Eclipse if this is of some help.
For more context, we're making a multi-platform game, so we can't just synchronize our directories and hard-code it (although we are using git, so the working directory is synchronized). Basically, I need some way to get from a script file in a "src" folder to a "media" folder that's next to it (i.e. there's a super (project) folder with both "src" and "media" folders in it).
Any help would be greatly appreciated, but please don't say "google it" because I tried that before coming here (I don't know if that's a frequent thing here, but I've seen it too many times elsewhere...when I've googled for answers, sorry if I sound jerkish for saying that)

Python programs do have the concept of a current working directory, which is generally the directory from which the script was run. This is "where they look for files" with a relative path.
However, since your program can be run from a different folder than the one it is in, your directory of reference needs to instead refer to the directory your script is in (the current directory is not related to the location of your script, in general). The directory where your script is found is obtained with
script_dir = os.path.dirname(__file__)
Note that this path can be relative (possibly empty), so it is still important that the current working directory of your script be the same as the directory when your script was read by the python interpreter (which is when __file__ is set). It is important to convert the possibly relative script_dir into an absolute path if the current working directory is changed later in the code:
# If script_dir is relative, the current working directory is used, here. This is correct if the current
# working directory is the same as when the script was read by the Python interpreter (which is
# when __file__ was set):
script_dir = os.path.abspath(script_dir)
You can then get to the directory media in the parent directory with the platform-independent
os.path.join(script_dir, os.path.pardir, 'media')
In fact os.path.pardir (or equivalently os.pardir) is the platform-independent parent directory convention, and os.path.join() simply joins paths in a platform independent way.

I'd recommend something like:
import os.path
base_folder = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
media_folder = os.path.join(base_folder, "media")
src_folder = os.path.join(base_folder, "src")
resource = os.path.join(media_folder, "filename")
for path in [base_folder, media_folder, src_folder, resource]:
    print(path)
The main ingredients are:
__file__: gives the path to the current source file (unlike sys.argv[0], which gives the path of the script that was called)
os.path.dirname(): returns the folder containing a path. Applying it twice, as in base_folder = ..., gives the parent directory of the script's folder.
os.path.join: OS-independent and correct combination of path names. It inserts a separator only where one is missing.

Consider using os.path.dirname() and os.path.join(). These should work in a platform independent way.
