Dealing with subfolders when using crontab

I'm having trouble scheduling the execution of a project.
The structure is:
main folder
|- lib
| |- file1.py
| |- file2.py
|
|- data
| |- file.csv
|
|- temp
| |- file.json
|
|-main.py
The crontab line is:
*/5 * * * * python3 /home/myName/main_folder/main.py
I've tried this crontab line with simple Python scripts that have no dependencies and it works fine. The problem is that in this case main.py imports classes and functions from lib, and I think cron can't deal with that.
In main.py I import them like this: from lib import file1, file2. Is there another way, maybe using os, to make the program aware of the absolute path?

Please try adding an empty file named __init__.py to the lib directory.
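
One more thing that may be worth checking, as an assumption about the setup in the question: when the script is started as python3 /home/myName/main_folder/main.py, Python puts the folder of main.py on sys.path, so from lib import ... is normally found. What cron usually does not do is start the job inside the project folder, so relative paths such as data/file.csv resolve against a different working directory. A minimal sketch of anchoring paths to the script location (the helper names project_dir and project_path are hypothetical, not part of the question):

```python
from pathlib import Path


def project_dir(script_file):
    """Return the absolute folder that contains the given script."""
    return Path(script_file).resolve().parent


def project_path(script_file, *parts):
    """Build an absolute path inside the project, independent of the cwd."""
    return project_dir(script_file).joinpath(*parts)


# Hypothetical usage inside main.py:
#   csv_path = project_path(__file__, "data", "file.csv")
#   json_path = project_path(__file__, "temp", "file.json")
```

An alternative that leaves main.py untouched is to change directory in the crontab entry itself, for example: */5 * * * * cd /home/myName/main_folder && python3 main.py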

Related

Problems trying to import a package

I have the following project structure:
Project
|- sim
| |- out
| | |- main.py
|- libs
| |- __init__.py
| |- plus.py
Inside plus.py there is a function called sum(a,b) that returns the sum of 2 numbers. I'm trying to import the module plus.py into main.py to be able to call this function; however, I'm getting the following error: ImportError: attempted relative import with no known parent package.
Here is the code inside main.py:
from ...libs import plus
a = 1
b = 5
c = plus.sum(a,b)
print(c)
One of the solutions I found is to add the project directory to the path, but I'm trying to avoid that.
I'm using VSCode to run Python, which may also be useful information.
What am I doing wrong here?
Thanks in advance.
EDIT:
I added __init__.py files to the sim, out and Project directories as @ThePjot suggested, and the error remains. The project structure now looks like this:
Project
|- __init__.py
|- sim
| |- __init__.py
| |- out
| | |- __init__.py
| | |- main.py
|- libs
| |- __init__.py
| |- plus.py
The __init__.py files are empty.

I've had similar issues, so I created an experimental import library, ultraimport, that allows you to do file-system-based imports and could solve your issue.
In your main.py you would then write:
import ultraimport
plus = ultraimport('__dir__/../../libs/plus.py', 'plus')
a = 1
b = 5
c = plus.sum(a,b)
print(c)
PS: With ultraimport, it's also not necessary to create those __init__.py files. You could remove them again and it will still work.
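
For comparison, a similar file-based import can be sketched with the standard library alone, using importlib.util. This is a different technique from ultraimport, not a description of how that library works, and the helper name load_module_from_file is hypothetical:

```python
import importlib.util
import os


def load_module_from_file(module_name, file_path):
    """Load a Python source file as a module without editing sys.path."""
    spec = importlib.util.spec_from_file_location(module_name, os.fspath(file_path))
    if spec is None or spec.loader is None:
        raise ImportError("cannot load %r from %s" % (module_name, file_path))
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the file's top-level code
    return module


# Hypothetical usage inside Project/sim/out/main.py:
#   from pathlib import Path
#   plus_file = Path(__file__).resolve().parents[2] / "libs" / "plus.py"
#   plus = load_module_from_file("plus", plus_file)
#   print(plus.sum(1, 5))
```

As with ultraimport, this approach does not depend on the __init__.py files, because the module is located by its file path rather than by its package name.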

Python os.listdir() returning the parent directory list

Hello everybody! I have a really straightforward question.
I have a directory that looks like this:
|- folder1
| |- folder1_1
| | |- ...
| |- other_file.py
| |- test.py
|- folder2
| |- ...
|- file.py
In test.py I call
os.listdir()
and what I'm getting back is:
['folder1', 'folder2', 'file.py']
while I was expecting to get
['folder1_1', 'other_file.py', 'test.py']
Since I was calling it inside test.py, I thought it would list the current directory of the file I'm calling it from (the documentation says the default parameter is path='.'). Or am I missing something?

The current directory depends on where you launched your python script from.
So if you run from the root folder of your project:
python3 folder1/test.py
You will get ['folder1', 'folder2', 'file.py'].
Instead if you run from folder1 you will get the result you expected:
cd folder1
python3 test.py
To get the directory of the file in which the code is running, you can rely on the __file__ variable:
import os
os.path.dirname(os.path.abspath(__file__))
Then you can pass that as the path parameter to os.listdir.
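
Put together, the two steps above might look like the following minimal sketch. The function name list_script_dir is hypothetical, and the result is sorted only to make the output order predictable:

```python
import os


def list_script_dir(script_file):
    """List the folder that contains script_file, regardless of the cwd."""
    script_dir = os.path.dirname(os.path.abspath(script_file))
    return sorted(os.listdir(script_dir))


# Hypothetical usage inside folder1/test.py:
#   print(list_script_dir(__file__))
```

Called this way from test.py, it lists the contents of folder1 whether the script is launched from the project root or from folder1 itself.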

Portable way to import modules from parent directory in Python

I know this is a worn-out topic, but Python's import mechanisms still confuse a lot of people. What I want is the ability to import a custom module that is in a parent directory in a way that allows me to take a project to another environment and have all of the imports work.
For example a structure like this:
repo
|--- folder1
| |--- script1.py
|--- folder2
| |--- script2.py
|--- utils
|--- some-util.py
How can I import from some-util.py in both script1 and script2? The idea is that I could clone the repo onto a remote host and run scripts from folders 1 and 2 that may share the dependency on some-util.py, without having to run anything beforehand. I want to be able to:
connect to box
git clone repo
python repo/folder1/script1.py
contents of script1 and script2:
import some-util
<code>
EDIT:
I forgot to mention that occasionally the scripts need to be run from another directory like:
/nas/some_folder/repo/folder1/script1.py args..
Also, the box is limited to python 2.7.5

The trick is to run your scripts as modules (see the Python documentation for an overview of what the python -m switch means).
Here is a structure; note that every directory contains an empty file named __init__.py:
repo/
|____utils/
| |____someutil.py
| |____ __init__.py
|____ __init__.py
|____folder1/
| |____script1.py
| |____ __init__.py
utils.someutil may contain something like this:
def say_hello():
    return "Hello World."
And your script1.py may contain something like:
from ..utils.someutil import say_hello

if __name__ == "__main__":
    print(say_hello())
Then running the following from the directory that contains repo:
python -m repo.folder1.script1
... produces:
Hello World.
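
The question also asks to run python repo/folder1/script1.py directly, from any directory, which the -m form does not cover. One commonly used alternative, which is a different technique from the one in this answer, is to put the repo root on sys.path at the top of each script. The sketch below uses only os and sys, so it should also work on Python 2.7; the helper name add_parent_of_script_dir is hypothetical:

```python
import os
import sys


def add_parent_of_script_dir(script_file):
    """Put the folder above the script's folder (the repo root) on sys.path."""
    script_dir = os.path.dirname(os.path.abspath(script_file))
    repo_root = os.path.dirname(script_dir)
    if repo_root not in sys.path:
        sys.path.insert(0, repo_root)
    return repo_root


# Hypothetical usage at the top of repo/folder1/script1.py:
#   add_parent_of_script_dir(__file__)
#   from utils.someutil import say_hello
```

Because the root is derived from the script's own location, the import does not depend on the directory the script is launched from. Note that the module has to be named someutil.py rather than some-util.py, since a hyphen is not valid in a module name used with import.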

Handle file imports after package installation [duplicate]

I use setuptools to distribute my python package. Now I need to distribute additional datafiles.
From what I've gathered from the setuptools documentation, I need to have my data files inside the package directory. However, I would rather have my data files inside a subdirectory in the root directory.
What I would like to avoid:
/ #root
|- src/
| |- mypackage/
| | |- data/
| | | |- resource1
| | | |- [...]
| | |- __init__.py
| | |- [...]
|- setup.py
What I would like to have instead:
/ #root
|- data/
| |- resource1
| |- [...]
|- src/
| |- mypackage/
| | |- __init__.py
| | |- [...]
|- setup.py
I just don't feel comfortable with having so many subdirectories, if it's not essential. I fail to find a reason, why I /have/ to put the files inside the package directory. It is also cumbersome to work with so many nested subdirectories IMHO. Or is there any good reason that would justify this restriction?

Option 1: Install as package data
The main advantage of placing data files inside the root of your Python package
is that it lets you avoid worrying about where the files will live on a user's
system, which may be Windows, Mac, Linux, some mobile platform, or inside an Egg. You can
always find the directory data relative to your Python package root, no matter where or how it is installed.
For example, if I have a project layout like so:
project/
    foo/
        __init__.py
        data/
            resource1/
                foo.txt
You can add a function to __init__.py to locate an absolute path to a data
file:
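import os

# Directory that contains this __init__.py, i.e. the package root.
_ROOT = os.path.abspath(os.path.dirname(__file__))

def get_data(path):
    # Absolute path to a file below the package's data directory.
    return os.path.join(_ROOT, 'data', path)

print(get_data('resource1/foo.txt'))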
Outputs:
/Users/pat/project/foo/data/resource1/foo.txt
After the project is installed as an Egg the path to data will change, but the code doesn't need to change:
/Users/pat/virtenv/foo/lib/python2.6/site-packages/foo-0.0.0-py2.6.egg/foo/data/resource1/foo.txt
Option 2: Install to fixed location
The alternative would be to place your data outside the Python package and then
either have the location of the data passed in via a configuration file or
command-line arguments, or embed the location into your Python code.
This is far less desirable if you plan to distribute your project. If you really want to do this, you can install your data wherever you like on the target system by specifying the destination for each group of files by passing in a list of tuples:
from setuptools import setup

setup(
    ...
    data_files=[
        ('/var/data1', ['data/foo.txt']),
        ('/var/data2', ['data/bar.txt'])
    ]
)
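
For the first alternative in Option 2, where the data location is passed in at run time, a minimal sketch could look like this. The --data-dir flag, the MYPACKAGE_DATA_DIR environment variable and the function name resolve_data_dir are all hypothetical names chosen for illustration; the default reuses /var/data1 from the setup() call above:

```python
import argparse
import os


def resolve_data_dir(argv=None, environ=None, default="/var/data1"):
    """Pick the data folder: --data-dir flag first, then env var, then default."""
    environ = os.environ if environ is None else environ
    parser = argparse.ArgumentParser()
    parser.add_argument("--data-dir", default=None)
    # Ignore unrelated arguments so this can sit next to other option parsing.
    args, _unknown = parser.parse_known_args(argv)
    return args.data_dir or environ.get("MYPACKAGE_DATA_DIR") or default
```

The lookup order is a design choice: an explicit command-line value wins over the environment, and the hard-coded default is only a last resort.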
Updated: Example of a shell function to recursively grep Python files:
atlas% function grep_py { find . -name '*.py' -exec grep -Hn $* {} \; }
atlas% grep_py ": \["
./setup.py:9: package_data={'foo': ['data/resource1/foo.txt']}

I think I found a good compromise which will allow you to maintain the following structure:
/ #root
|- data/
| |- resource1
| |- [...]
|- src/
| |- mypackage/
| | |- __init__.py
| | |- [...]
|- setup.py
You should install the data as package_data, to avoid the problems described in samplebias's answer, but in order to maintain the file structure you should add this to your setup.py:
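import os

try:
    # Temporarily expose the top-level data folder inside the package.
    os.symlink('../../data', 'src/mypackage/data')
    setup(
        ...
        package_data = {'mypackage': ['data/*']}
        ...
    )
finally:
    # Remove the link again, whether or not setup() succeeded.
    os.unlink('src/mypackage/data')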
This way we create the appropriate structure "just in time", and keep our source tree organized.
To access such data files within your code, you 'simply' use:
from pkg_resources import Requirement, resource_filename
data = resource_filename(Requirement.parse("main_package"), 'mypackage/data')
I still don't like having to specify 'mypackage' in the code, as the data might not necessarily have anything to do with this module, but I guess it's a good compromise.

You could use importlib_resources or importlib.resources (depending on your Python version).
https://importlib-resources.readthedocs.io/en/latest/using.html
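
A minimal sketch of that approach, assuming Python 3.9 or newer, where importlib.resources.files is available (on older versions the importlib_resources backport offers the same function). The helper name read_package_text is hypothetical:

```python
from importlib.resources import files  # Python 3.9+


def read_package_text(package, *parts):
    """Read a text resource stored inside an importable package."""
    resource = files(package)
    for part in parts:
        # Descend one path component at a time.
        resource = resource.joinpath(part)
    return resource.read_text(encoding="utf-8")


# Hypothetical usage, assuming mypackage ships the file data/resource1:
#   text = read_package_text("mypackage", "data", "resource1")
```

Like the get_data helper in the first answer, this locates the data relative to the package, so the data still has to be installed as package data.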

I think you can pass basically anything as the data_files argument to setup().
