Python - File Path not found if script run from another directory

Python - File Path not found if script run from another directory - python

I'm trying to run a script that works without issue when I run using in console, but causes issue if I try to run it from another directory (via IPython %run <script.py>)
The issue comes from this line, where it references a folder called "Pickles".
with open('Pickles/'+name+'_'+date.strftime('%y-%b-%d'),'rb') as f:
obj = pickle.load(f)
In Console:
python script.py <---works!
In running IPython (Jupyter) in another folder, it causes a FileNotFound exception.
How can I make any path references within my scripts more robust, without putting the whole extended path?
Thanks in advance!

Since running in the console the way you show works, the Pickles directory must be in the same directory as the script. You can make use of this fact so that you don't have to hard code the location of the Pickles directory, but also don't have to worry about setting the "current working directory" to be the directory containing Pickles, which is what your current code requires you to do.
Here's how to make your code work no matter where you run it from:
with open(os.path.join(os.path.dirname(__file__), 'Pickles', name + '_' + date.strftime('%y-%b-%d')), 'rb') as f:
obj = pickle.load(f)
os.path.dirname(__file__) provides the path to the directory containing the script that is currently running.
Generally speaking, it's a good practice to always fully specify the locations of things you interact with in the filesystem. A common way to do this as shown here.
UPDATE: I updated my answer to be more correct by not assuming a specific path separator character. I had chosen to use '/' only because the original code in the question already did this. It is also the case that the code given in the original question, and the code I gave originally, will work fine on Windows. The open() function will accept either type of path separator and will do the right thing on Windows.

You have to use absolute paths. Also to be cross platform use join:
First get the path of your script using the variable __file__
Get the directory of this file with os.path.dirname(__file__)
Get your relative path with os.path.join(os.path.dirname(__file__), "Pickles", f"{name}_{date.strftime('%y-%b-%d')}")
it gives you:
with open(os.path.join(os.path.dirname(__file__), "Pickles", f"{name}_{date.strftime('%y-%b-%d')}"), 'rb') as f:
obj = pickle.load(f)

Related

How to load dependencies and subfolders in Pycharm?

I am so new with python and pycharm and i got confuse!!
When I run my project in pycharm it gives me an error about not finding the path of my file. The physical file path is:
'../Project/BC/RequiredFiles/resources/a_reqs.csv'
My project working directory is "Project/BC" and the project running file (startApp.sh) is there too. but the .py file that wants to work with a_req.csv is inside the "RequiredFiles" folder. There is the following code in the .py file:
reqsfile = os.getcwd() + "/resources/a_reqs.csv"
it returns: '../Project/BC/resources/a_reqs.csv'
instead of: '../Project/BC/resources/RequiredFiles/a_reqs.csv'
while the .py file is in "RequiredFiles" the os.getcwd() must include it too. but it does not.
The problem is that i can not change the addressing code. because this code works in another IDE and other people who work with the code in other platform or OS do not have any problem. I am working in mac OS and if i am not mistaken the code works with windows!!
So, how can i tell Pycharm (in mac) to see and load "RequiredFiles" folder as the subfolder of my working directory!!!

os.getcwd returns the current working directory of the process (which may be the directory where startApp.sh is located or another one, depending on the PyCharm's run configuration setting, or, if you start the program from the command line, the directory in which you execute the command).
To make a path independent on the current working directory, you can take the directory where your Python file is located and build the path from it:
os.path.dirname(__file__) + "/resources/a_reqs.csv"

From your question what I see is:
reqsfile = os.getcwd() + "/resources/a_reqs.csv"
Which produces: "../Project/BC/resources/a_reqs.csv", whereas your desired output is
"../Project/BC/resources/RequiredFiles/a_reqs.csv". Since we know os.getcwd is returning "/Project/BC/", then to get your desired result you should be doing:
reqsfile = os.getcwd() + "/resources/RequiredFiles/a_reqs.csv"
But since you want the solution to work with or without the RequiredFiles subdirectory you could apply a conditional solution, ie something like:
import os.path
if os.path.exists(os.getcwd() + "/resources/RequiredFiles/a_reqs.csv"):
reqsfile = os.getcwd() + "/resources/RequiredFiles/a_reqs.csv"
else:
reqsfile = os.getcwd() + "/resources/a_reqs.csv"
This solution will set the reqsfile to the csv in the RequiredFiles directory if the directory exists, and thus will work for you. On the other-hand, if the RequiredFiles directory doesn't exist, it will default to the csv in /resources/.
Typically when groups collaborate on projects, the maintain the same file hierarchy so that these types of issues are avoided, so you might want to consider moving the csv from /RequiredFiles/ to /resources/.

File not found from Python although file exists

I'm trying to load a simple text file with an array of numbers into Python. A MWE is
import numpy as np
BASE_FOLDER = 'C:\\path\\'
BASE_NAME = 'DATA.txt'
fname = BASE_FOLDER + BASE_NAME
data = np.loadtxt(fname)
However, this gives an error while running:
OSError: C:\path\DATA.txt not found.
I'm using VSCode, so in the debug window the link to the path is clickable. And, of course, if I click it the file opens normally, so this tells me that the path is correct.
Also, if I do print(fname), VSCode also gives me a valid path.
Is there anything I'm missing?
EDIT
As per your (very helpful for future reference) comments, I've changed my code using the os module and raw strings:
BASE_FOLDER = r'C:\path_to_folder'
BASE_NAME = r'filename_DATA.txt'
fname = os.path.join(BASE_FOLDER, BASE_NAME)
Still results in error.
Second EDIT
I've tried again with another file. Very basic path and filename
BASE_FOLDER = r'Z:\Data\Enzo\Waste_Code'
BASE_NAME = r'run3b.txt'
And again, I get the same error.
If I try an alternative approach,
os.chdir(BASE_FOLDER)
a = os.listdir()
then select the right file,
fname = a[1]
I still get the error when trying to import it. Even though I'm retrieving it directly from listdir.
>> os.path.isfile(a[1])
False

Using the module os you can check the existence of the file within python by running
import os
os.path.isfile(fname)
If it returns False, that means that your file doesn't exist in the specified fname. If it returns True, it should be read by np.loadtxt().
Extra: good practice working with files and paths
When working with files it is advisable to use the amazing functionality built in the Base Library, specifically the module os. Where os.path.join() will take care of the joins no matter the operating system you are using.
fname = os.path.join(BASE_FOLDER, BASE_NAME)
In addition it is advisable to use raw strings by adding an r to the beginning of the string. This will be less tedious when writing paths, as it allows you to copy-paste from the navigation bar. It will be something like BASE_FOLDER = r'C:\path'. Note that you don't need to add the latest '\' as os.path.join takes care of it.

You may not have the full permission to read the downloaded file. Use
sudo chmod -R a+rwx file_name.txt
in the command prompt to give yourself permission to read if you are using Ubuntu.

For me the problem was that I was using the Linux home symbol in the link (~/path/file). Replacing it with the absolute path /home/user/etc_path/file worked like charm.

Primer needed in python pathnames

I am a very novice coder, and Python is my first (and, practically speaking, only) language. I am charged as part of a research job with manipulating a collection of data analysis scripts, first by getting them to run on my computer. I was able to do this, essentially by removing all lines of coding identifying paths, and running the scripts through a Jupyter terminal opened in the directory where the relevant modules and CSV files live so the script knows where to look (I know that Python defaults to the location of the terminal).
Here are the particular blocks of code whose function I don't understand
import sys
sys.path.append('C:\Users\Ben\Documents\TRACMIP_Project\mymodules/')
import altdata as altdata
I have replaced the pathname in the original code with the path name leading to the directory where the module is; the file containing all the CSV files that end up being referenced here is also in mymodules.
This works depending on where I open the terminal, but the only way I can get it to work consistently is by opening the terminal in mymodules, which is fine for now but won't work when I need to work by accessing the server remotely. I need to understand better precisely what is being done here, and how it relates to the location of the terminal (all the documentation I've found is overly technical for my knowledge level).
Here is another segment I don't understand
import os.path
csvfile = 'csv/' + model +'_' + exp + '.csv'
if os.path.isfile(csvfile): # csv file exists
hcsvfile = open(csvfile )
I get here that it's looking for the CSV file, but I'm not sure how. I'm also not sure why then on some occasions depending on where I open the terminal it's able to find the module but not the CSV files.
I would love an explanation of what I've presented, but more generally I would like information (or a link to information) explaining paths and how they work in scripts in modules, as well as what are ways of manipulating them. Thanks.

sys.path
This is simple list of directories where python will look for modules and packages (.py and dirs with __init__.py file, look at modules tutorial). Extending this list will allow you to load modules (custom libs, etc.) from non default locations (usually you need to change it in runtime, for static dirs you can modify startup script to add needed enviroment variables).
os.path
This module implements some useful functions on pathnames.
... and allows you to find out if file exists, is it link, dir, etc.
Why you failed loading *.csv?
Because sys.path responsible for module loading and only for this. When you use relative path:
csvfile = 'csv/' + model +'_' + exp + '.csv'
open() will look in current working directory
file is either a string or bytes object giving the pathname (absolute or relative to the current working directory)...
You need to use absolute paths by constucting them with os.path module.

I agree with cdarke's comment that you are probably running into an issue with backslashes. Replacing the line with:
sys.path.append(r'C:\Users\Ben\Documents\TRACMIP_Project\mymodules')
will likely solve your problem. Details below.
In general, Python treats paths as if they're relative to the current directory (where your terminal is running). When you feed it an absolute path-- which is a path that includes the root directory, like the C:\ in C:\Users\Ben\Documents\TRACMIP_Project\mymodules-- then Python doesn't care about the working directory anymore, it just looks where you tell it to look.
Backslashes are used to make special characters within strings, such as line breaks (\n) and tabs (\t). The snag you've hit is that Python paths are strings first, paths second. So the \U, \B, \D, \T and \m in your path are getting misinterpreted as special characters and messing up Python's path interpretation. If you prefix the string with 'r', Python will ignore the special characters meaning of the backslash and just interpret it as a literal backslash (what you want).
The reason it still works if you run the script from the mymodules directory is because Python automatically looks in the working directory for files when asked. sys.path.append(path) is telling the computer to include that directory when it looks for commands, so that you can use files in that directory no matter where you're running the script. The faulty path will still get added, but its meaningless. There is no directory where you point it, so there's nothing to find there.
As for path manipulation in general, the "safest" way is to use the function in os.path, which are platform-independent and will give the correct output whether you're working in a Windows or a Unix environment (usually).
EDIT: Forgot to cover the second part. Since Python paths are strings, you can build them using string operations. That's what is happening with the line
csvfile = 'csv/' + model +'_' + exp + '.csv'
Presumably model and exp are strings that appear in the filenames in the csv/ folder. With model = "foo" and exp = "bar", you'd get csv/foo_bar.csv which is a relative path to a file (that is, relative to your working directory). The code makes sure a file actually exists at that path and then opens it. Assuming the csv/ folder is in the same path as you added in sys.path.append, this path should work regardless of where you run the file, but I'm not 100% certain on that. EDIT: outoftime pointed out that sys.path.append only works for modules, not opening files, so you'll need to either expand csv/ into an absolute path or always run in its parent directory.
Also, I think Python is smart enough to not care about the direction of slashes in paths, but you should probably not mix them. All backslashes or all forward slashes only. os.path.join will normalize them for you. I'd probably change the line to
csvfile = os.path.join('csv\', model + '_' + exp + '.csv')
for consistency's sake.

Running python script with new directories

I have recently begun working on a new computer. All my python files and my data are in the dropbox folder, so having access to the data is not a problem. However, the "user" name on the file has changed. Thus, none of my os.chdir() operations work. Obviously, I can modify all of my scripts using a find and replace, but that won't help if I try using my old computer.
Currently, all the directories called look something like this:
"C:\Users\Old_Username\Dropbox\Path"
and the files I want to access on the new computer look like:
"C:\Users\New_Username\Dropbox\Path"
Is there some sort of try/except I can build into my script so it goes through the various path-name options if the first attempt doesn't work?
Thanks!

Any solution will involve editing your code; so if you are going to edit it anyway - its best to make it generic enough so it works on all platforms.
In the answer to How can I get the Dropbox folder location programmatically in Python? there is a code snippet that you can use if this problem is limited to dropbox.
For a more generic solution, you can use environment variables to figure out the home directory of a user.
On Windows the home directory is location is stored in %UserProfile%, on Linux and OSX it is in $HOME. Luckily Python will take care of all this for you with os.path.expanduser:
import os
home_dir = os.path.expanduser('~')
Using home_dir will ensure that the same path is resolved on all systems.

Thought the file sq.py with these codes(your olds):
C:/Users/Old_Username/Dropbox/Path
for x in range:
#something
def Something():
#something...
C:/Users/Old_Username/Dropbox/Path
Then a new .py file run these codes:
with open("sq.py","r") as f:
for x in f.readlines():
y=x
if re.findall("C:/Users/Old_Username/Dropbox/Path",x) == ['C:/Users/Old_Username/Dropbox/Path']:
x="C:/Users/New_Username/Dropbox/Path"
y=y.replace(y,x)
print (y)
Output is:
C:/Users/New_Username/Dropbox/Path
for x in range:
#something
def Something():
#something...
C:/Users/New_Username/Dropbox/Path
Hope its your solution at least can give you some idea dealing with your problem.

Knowing that eventually I will move or rename my projects or scripts, I always use this code right at the beginning:
import os, inspect
this_dir = os.path.dirname(os.path.abspath(inspect.getfile(inspect.currentframe())))
this_script = inspect.stack()[0][1]
this_script_name = this_script.split('/')[-1]
If you call your script not with the full but a relative path, then this_script will also not contain a full path. this_dir however will always be the full path to the directory.

Absolute Path and Relative Path issue in python

How to remove path related problems in python?
For e.g. I have a module test.py inside a directory TEST
**test.py**
import os
file_path = os.getcwd() + '/../abc.txt'
f = open(file_path)
lines = f.readlines()
f.close
print lines
Now, when I execute the above program outside TEST directory, it gives me error:-
Traceback (most recent call last):
File "TEST/test.py", line 4, in ?
f = open(file_path)
IOError: [Errno 2] No such file or directory: 'abc.txt'
how to resolve this kind of problem. Basically this is just a small example that I have given up.
I am dealing with a huge problem of this kind.
I am using existing packages, which needs to be run only from that directory where it exists, how to resolve such kind of problems, so that I can run the program from anywhere I want.
Or able to deal with the above example either running inside TEST directory or outside TEST directory.
Any help.?

I think the easiest thing is to change the current working directory to the one of the script file:
import os
os.chdir(os.path.dirname(__file__))
This may cause problems, however, if the script is also working with files in the original working directory.

Your code is looking at the current working directory, and uses that as a basis for finding the files it needs. This is almost never a good idea, as you are now finding out.
The solution mentioned in the answer by Emil Vikström is a quickfix solution, but a more correct solution would be to not use current working directory as a startingpoint.
As mentioned in the other answer, __file__ isn't available in the interpreter, but it's an excellent solution for your code.
Rewrite your second line to something like this:
file_path = os.path.join(os.path.dirname(__file__), "..", "abc.txt")
This will take the directory the current file is in, join it first with .. and then with abc.txt, to create the path you want.
You should fix similar usage of os.getcwd() elsewhere in your code in the same way.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.