How to load dependencies and subfolders in Pycharm? - python

I am so new with python and pycharm and i got confuse!!
When I run my project in pycharm it gives me an error about not finding the path of my file. The physical file path is:
'../Project/BC/RequiredFiles/resources/a_reqs.csv'
My project working directory is "Project/BC" and the project running file (startApp.sh) is there too. but the .py file that wants to work with a_req.csv is inside the "RequiredFiles" folder. There is the following code in the .py file:
reqsfile = os.getcwd() + "/resources/a_reqs.csv"
it returns: '../Project/BC/resources/a_reqs.csv'
instead of: '../Project/BC/resources/RequiredFiles/a_reqs.csv'
while the .py file is in "RequiredFiles" the os.getcwd() must include it too. but it does not.
The problem is that i can not change the addressing code. because this code works in another IDE and other people who work with the code in other platform or OS do not have any problem. I am working in mac OS and if i am not mistaken the code works with windows!!
So, how can i tell Pycharm (in mac) to see and load "RequiredFiles" folder as the subfolder of my working directory!!!

os.getcwd returns the current working directory of the process (which may be the directory where startApp.sh is located or another one, depending on the PyCharm's run configuration setting, or, if you start the program from the command line, the directory in which you execute the command).
To make a path independent on the current working directory, you can take the directory where your Python file is located and build the path from it:
os.path.dirname(__file__) + "/resources/a_reqs.csv"

From your question what I see is:
reqsfile = os.getcwd() + "/resources/a_reqs.csv"
Which produces: "../Project/BC/resources/a_reqs.csv", whereas your desired output is
"../Project/BC/resources/RequiredFiles/a_reqs.csv". Since we know os.getcwd is returning "/Project/BC/", then to get your desired result you should be doing:
reqsfile = os.getcwd() + "/resources/RequiredFiles/a_reqs.csv"
But since you want the solution to work with or without the RequiredFiles subdirectory you could apply a conditional solution, ie something like:
import os.path
if os.path.exists(os.getcwd() + "/resources/RequiredFiles/a_reqs.csv"):
reqsfile = os.getcwd() + "/resources/RequiredFiles/a_reqs.csv"
else:
reqsfile = os.getcwd() + "/resources/a_reqs.csv"
This solution will set the reqsfile to the csv in the RequiredFiles directory if the directory exists, and thus will work for you. On the other-hand, if the RequiredFiles directory doesn't exist, it will default to the csv in /resources/.
Typically when groups collaborate on projects, the maintain the same file hierarchy so that these types of issues are avoided, so you might want to consider moving the csv from /RequiredFiles/ to /resources/.

Related

Python - File Path not found if script run from another directory

I'm trying to run a script that works without issue when I run using in console, but causes issue if I try to run it from another directory (via IPython %run <script.py>)
The issue comes from this line, where it references a folder called "Pickles".
with open('Pickles/'+name+'_'+date.strftime('%y-%b-%d'),'rb') as f:
obj = pickle.load(f)
In Console:
python script.py <---works!
In running IPython (Jupyter) in another folder, it causes a FileNotFound exception.
How can I make any path references within my scripts more robust, without putting the whole extended path?
Thanks in advance!
Since running in the console the way you show works, the Pickles directory must be in the same directory as the script. You can make use of this fact so that you don't have to hard code the location of the Pickles directory, but also don't have to worry about setting the "current working directory" to be the directory containing Pickles, which is what your current code requires you to do.
Here's how to make your code work no matter where you run it from:
with open(os.path.join(os.path.dirname(__file__), 'Pickles', name + '_' + date.strftime('%y-%b-%d')), 'rb') as f:
obj = pickle.load(f)
os.path.dirname(__file__) provides the path to the directory containing the script that is currently running.
Generally speaking, it's a good practice to always fully specify the locations of things you interact with in the filesystem. A common way to do this as shown here.
UPDATE: I updated my answer to be more correct by not assuming a specific path separator character. I had chosen to use '/' only because the original code in the question already did this. It is also the case that the code given in the original question, and the code I gave originally, will work fine on Windows. The open() function will accept either type of path separator and will do the right thing on Windows.
You have to use absolute paths. Also to be cross platform use join:
First get the path of your script using the variable __file__
Get the directory of this file with os.path.dirname(__file__)
Get your relative path with os.path.join(os.path.dirname(__file__), "Pickles", f"{name}_{date.strftime('%y-%b-%d')}")
it gives you:
with open(os.path.join(os.path.dirname(__file__), "Pickles", f"{name}_{date.strftime('%y-%b-%d')}"), 'rb') as f:
obj = pickle.load(f)

Regarding file io in Python

I mistakenly, typed the following code:
f = open('\TestFiles\'sample.txt', 'w')
f.write('I just wrote this line')
f.close()
I ran this code and even though I have mistakenly typed the above code, it is however a valid code because the second backslash ignores the single-quote and what I should get, according to my knowledge is a .txt file named "\TestFiles'sample" in my project folder. However when I navigated to the project folder, I could not find a file there.
However, if I do the same thing with a different filename for example. Like,
f = open('sample1.txt', 'w')
f.write('test')
f.close()
I find the 'sample.txt' file created in my folder. Is there a reason for the file to not being created even though the first code was valid according to my knowledge?
Also is there a way to mention a file relative to my project folder rather than mentioning the absolute path to a file? (For example I want to create a file called 'sample.txt' in a folder called 'TestFiles' inside my project folder. So without mentioning the absolute path to TestFiles folder, is there a way to mention the path to TestFiles folder relative to the project folder in Python when opening files?)
I am a beginner in Python and I hope someone could help me.
Thank you.
What you're looking for are relative paths, long story short, if you want to create a file called 'sample.txt' in a folder 'TestFiles' inside your project folder, you can do:
import os
f = open(os.path.join('TestFiles', 'sample1.txt'), 'w')
f.write('test')
f.close()
Or using the more recent pathlib module:
from pathlib import Path
f = open(Path('TestFiles', 'sample1.txt'), 'w')
f.write('test')
f.close()
But you need to keep in mind that it depends on where you started your Python interpreter (which is probably why you're not able to find "\TestFiles'sample" in your project folder, it's created elsewhere), to make sure everything works fine, you can do something like this instead:
from pathlib import Path
sample_path = Path(Path(__file__).parent, 'TestFiles', 'sample1.txt')
with open(sample_path, "w") as f:
f.write('test')
By using a [context manager]{https://book.pythontips.com/en/latest/context_managers.html} you can avoid using f.close()
When you create a file you can specify either an absolute filename or a relative filename.
If you start the file path with '\' (on Win) or '/' it will be an absolute path. So in your first case you specified an absolute path, which is in fact:
from pathlib import Path
Path('\Testfile\'sample.txt').absolute()
WindowsPath("C:/Testfile'sample.txt")
Whenever you run some code in python, the relative paths that will be generate will be composed by your current folder, which is the folder from which you started the python interpreter, which you can check with:
import os
os.getcwd()
and the relative path that you added afterwards, so if you specify:
Path('Testfiles\sample.txt').absolute()
WindowsPath('C:/Users/user/Testfiles/sample.txt')
In general I suggest you use pathlib to handle paths. That makes it safer and cross platform. For example let's say that your scrip is under:
project
src
script.py
testfiles
and you want to store/read a file in project/testfiles. What you can do is get the path for script.py with __file__ and build the path to project/testfiles
from pathlib import Path
src_path = Path(__file__)
testfiles_path = src_path.parent / 'testfiles'
sample_fname = testfiles_path / 'sample.txt'
with sample_fname.open('w') as f:
f.write('yo')
As I am running the first code example in vscode, I'm getting a warning
Anomalous backslash in string: '\T'. String constant might be missing an r prefix.
And when I am running the file, it is also creating a file with the name \TestFiles'sample.txt. And it is being created in the same directory where the .py file is.
now, if your working tree is like this:
project_folder
-testfiles
-sample.txt
-something.py
then you can just say: open("testfiles//hello.txt")
I hope you find it helpful.

Python os.getcwd() is not working on subfolders in VSCODE

I have a python file, converted from a Jupiter Notebook, and there is a subfolder called 'datasets' inside this file folder. When I'm trying to open a file that is inside that 'datasets' folder, with this code:
import pandas as pd
# Load the CSV data into DataFrames
super_bowls = pd.read_csv('/datasets/super_bowls.csv')
It says that there is no such file or folder. Then I add this line
os.getcwd()
And the output is the top-level folder of the project, and not the subfolder when is this python file. And I think maybe that's the reason why it's not working.
So, how can I open that csv file with relative paths? I don't want to use absolute path because this code is going to be used in another computers.
Why os.getcwd() is not getting the actual folder path?
My observation, the dot (.) notation to move to the parent directory sometimes does not work depending on the operating system. What I generally do to make it os agnostic is this:
import pandas as pd
import os
__location__ = os.path.realpath(os.path.join(os.getcwd(), os.path.dirname(__file__)))
super_bowls = pd.read_csv(__location__ + '/datasets/super_bowls.csv')
This works on my windows and ubantu machine equally well.
I am not sure if there are other and better ways to achieve this. Would like to hear back if there are.
(edited)
Per your comment below, the current working directory is
/Users/ivanparra/AprendizajePython/
while the file is in
/Users/ivanparra/AprendizajePython/Jupyter/datasets/super_bowls.csv
For that reason, going to the datasets subfolder of the current working directory (CWD) takes you to /Users/ivanparra/AprendizajePython/datasets which either doesn't exist or doesn't contain the file you're looking for.
You can do one of two things:
(1) Use an absolute path to the file, as in
super_bowls = pd.read_csv("/Users/ivanparra/AprendizajePython/Jupyter/datasets/super_bowls.csv")
(2) use the right relative path, as in
super_bowls = pd.read_csv("./Jupyter/datasets/super_bowls.csv")
There's also (3) - use os.path.join to contact the CWD to the relative path - it's basically the same as (2).
(you can also use
The answer really lies in the response by user2357112:
os.getcwd() is working fine. The problem is in your expectations. The current working directory is the directory where Python is running, not the directory of any particular source file. – user2357112 supports Monica May 22 at 6:03
The solution is:
data_dir = os.path.dirname(__file__)
Try this code
super_bowls = pd.read_csv( os.getcwd() + '/datasets/super_bowls.csv')
I noticed this problem a few years ago. I think it's a matter of design style. The problem is that: your workspace folder is just a folder, not a project folder. Most of the time, your relative reference is based on the current file.
VSCode actually supports the dynamic setting of cwd, but that's not the default. If your work folder is not a rigorous and professional project, I recommend you adding the following settings to launch.json. This is the simplest answer you need.
"cwd": "${fileDirname}"
Thanks to everyone that tried to help me. Thanks to the Roy2012 response, I got a code that works for me.
import pandas as pd
import os
currentPath = os.path.dirname(__file__)
# Load the CSV data into DataFrames
super_bowls = pd.read_csv(currentPath + '/datasets/super_bowls.csv')
The os.path.dirname gives me the path of the current file, and let me work with relative paths.
'/Users/ivanparra/AprendizajePython/Jupyter'
and with that it works like a charm!!
P.S.: As a side note, the behavior of os.getcwd() is quite different in a Jupyter Notebook than a python file. Inside the notebook, that function gives the current file path, but in a python file, gives the top folder path.

How to set current working directory in python in a automatic way

How can I set the current path of my python file "myproject.py" to the file itself?
I do not want something like this:
path = "the path of myproject.py"
In mathematica I can set:
SetDirectory[NotebookDirectory[]]
The advantage with the code in Mathematica is that if I change the path of my Mathematica file, for example if I give it to someone else or I put it in another folder, I do not need to do anything extra. Each time Mathematica automatically set the directory to the current folder.
I want something similar to this in Python.
The right solution is not to change the current working directory, but to get the full path to the directory containing your script or module then use os.path.join to build your files path:
import os
ROOT_PATH = os.path.dirname(os.path.abspath(__file__))
# then:
myfile_path = os.path.join(ROOT_PATH, "myfile.txt")
This is safer than messing with current working directory (hint : what would happen if another module changes the current working directory after you did but before you access your files ?)
I want to set the directory in which the python file is, as working directory
There are two step:
Find out path to the python file
Set its parent directory as the working directory
The 2nd is simple:
import os
os.chdir(module_dir) # set working directory
The 1st might be complex if you want to support a general case (python file that is run as a script directly, python file that is imported in another module, python file that is symlinked, etc). Here's one possible solution:
import inspect
import os
module_path = inspect.getfile(inspect.currentframe())
module_dir = os.path.realpath(os.path.dirname(module_path))
Use the os.getcwd() function from the built in os module also there's os.getcwdu() which returns a unicode object of the current working directory
Example usage:
import os
path = os.getcwd()
print path
#C:\Users\KDawG\Desktop\Python

Where does python look for files in a script? [duplicate]

This question already has answers here:
How do I get the parent directory in Python?
(21 answers)
Closed 9 years ago.
So I've just coded this class for a title screen and it works well. However, one of the people I'm working with on the project mentioned that I shouldn't use:
os.chdir(os.getcwd() + "/..")
resource = (os.getcwd() + "/media/file name")
to get to the super directory. He did mention something about the pythonpath though. We're using Eclipse if this is of some help.
For more context we're making a multi-platform game so we can't just synchronize our directories and hard-code it (although we are using git so the working directory is synchronized). Basically, I need some way to get from a script file in a "src' folder to a "media" folder that's next to it (AKA There's a super (project) folder with both "src" and "media" folders in it).
Any help would be greatly appreciated, but please don't say "google it" because I tried that before coming here (I don't know if that's a frequent thing here, but I've seen it too many times elsewhere...when I've googled for answers, sorry if I sound jerkish for saying that)
Python programs do have the concept of a current working directory, which is generally the directory from which the script was run. This is "where they look for files" with a relative path.
However, since your program can be run from a different folder than the one it is in, your directory of reference needs to instead refer to the directory your script is in (the current directory is not related to the location of your script, in general). The directory where your script is found is obtained with
script_dir = os.path.dirname(__file__)
Note that this path can be relative (possibly empty), so it is still important that the current working directory of your script be the same as the directory when your script was read by the python interpreter (which is when __file__ is set). It is important to convert the possibly relative script_dir into an absolute path if the current working directory is changed later in the code:
# If script_dir is relative, the current working directory is used, here. This is correct if the current
# working directory is the same as when the script was read by the Python interpreter (which is
# when __file__ was set):
script_dir = os.path.abspath(script_dir)
You can then get to the directory media in the parent directory with the platform-independent
os.path.join(script_dir, os.path.pardir, 'media')
In fact os.path.pardir (or equivalently os.pardir) is the platform-independent parent directory convention, and os.path.join() simply joins paths in a platform independent way.
I'd recommend something like:
import os.path
base_folder = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
media_folder = os.path.join(base_folder, "media")
src_folder = os.path.join(base_folder, "src")
resource = os.path.join(media_folder, "filename")
for path in [base_folder, media_folder, src_folder, resource]:
print path
The main ingredients are:
__file__: gets the path to the current source file (unlike sys.argv[0], which gives the path the script that was called)
os.path.split(): splits a path into the relative file/folder name and the base folder containing it. Using it twice as in base_folder = ... will give the parent directory.
os.path.join: OS-independent and correct combination of path names. Is aware of missing or multiple /s or \s
Consider using os.path.dirname() and os.path.join(). These should work in a platform independent way.

Categories