Package a Serverless Python service with individually packaged functions but shared code

Files from the utils folder are not included in the package when individually packaging with serverless-python-requirements plugin.
In my Serverless.com AWS Python project I have the following folder structure:
.
├── serverless.yml
├── generate_features
│   ├── requirements.txt
│   └── generate_features.py
├── requirements.txt
├── utils
│   ├── utility.py
│   └── additional_code.py
...
The relevant section of my serverless.yml looks as follows:
package:
  individually: true

functions:
  generate-features:
    handler: generate_features.handler
    module: generate_features
    timeout: 400
    ...
I would now like to include everything that is in the utils folder with each individually packaged function (there is more than one, and they share some code).
Unfortunately, when I use the serverless-python-requirements plugin, it appears that it won't let me do that: it only includes whatever is in the module directory. I would basically like to add additional modules, though.
Any ideas? Am I not seeing some obvious way to include utils/? Adding
package:
  include:
    - utils/**
at the function level unfortunately doesn't seem to work.
Thanks!
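One workaround that may help (a sketch, not something the plugin documents as the official way): make utils pip-installable and pull it in through each function's requirements.txt, which serverless-python-requirements should process per function when packaging individually. A minimal setup.py for utils/ might look like this (the package name and version are illustrative):

# utils/setup.py - minimal sketch; name and version are illustrative
from setuptools import setup

setup(
    name="my-shared-utils",
    version="0.1.0",
    # expose utility.py and additional_code.py as top-level importable modules
    py_modules=["utility", "additional_code"],
)

Each function's requirements.txt would then gain a line such as ../utils. Note this assumes pip resolves the relative path correctly from wherever the plugin invokes it; if it doesn't, an absolute path or a symlink to utils/ inside each function directory are common fallbacks.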


Including local source code directories and files in aws-sam?

I work in a mono repo where I have an AWS SAM (Lambda) application. The structure is more or less like so:
.
├── my_folder1
│   ├── file1.py
│   ├── __init__.py
│   ├── file2.py
│   └── my_folder2
│       └── my_folder3
│           ├── __init__.py
│           └── file3.py
├── sam-app
│   ├── events
│   │   └── event.json
│   ├── README.md
│   ├── samconfig.toml
│   ├── template.yaml
│   ├── tests
│   └── hello_world
│       ├── __init__.py
│       ├── requirements.txt
│       └── app.py
└── more_folders...
where hello_world/requirements.txt contains:
numpy
pandas
...
and hello_world/app.py starts with:
import pandas
from my_folder1.my_folder2.my_folder3.file3 import MyClass
...
You can separate my application into three portions:
External pip packages like pandas, numpy, etc. These packages change rarely; occasionally I add a new one. They take a long time to install, and they need to be built in a Docker container that corresponds to the Lambda runtime (as anyone who has used numpy/pandas has learned the hard way).
Source code within the mono repo, implicitly called from my lambda. If you are following along with the preview above, my_folder1.my_folder2.my_folder3.file3 might look like:
from my_folder1.my_folder837278 import my_utility_function

class MyClass:
    my_utility_function()
    ...
This code changes often and doesn't need to be built in the Docker image.
My actual lambda code, explicitly called by a handler, within ./my_folder1/sam-app/hello_world/app.py.
Let's say I run the sam build --use-container command. This is great! It creates .aws-sam/build/MyFunction/pandas, .aws-sam/build/MyFunction/numpy, etc., and these pip packages are built against Amazon Linux so they can be used by the Lambda. The problem is that my local imports (e.g. from my_folder1.my_folder2.my_folder3.file3 import MyClass) will clearly fail.
My question is: is it possible to sync the local imports of my code into my SAM application using the existing template specification? (Best docs regarding the template spec.)
Additional information:
How I currently get around this:
I write bash scripts that look like this and run them after sam build --use-container:
...
rsync "${EXCLUSIONS[@]}" -r $LOCAL_MODULES0 $LOCAL_MODULES1 ./.aws-sam/build/${FUNCTION}/my_folder2
rsync "${EXCLUSIONS[@]}" -r $LOCAL_MODULES2 $LOCAL_MODULES3 ./.aws-sam/build/${FUNCTION}/
...
which copy/sync the relevant local directories into the build directory so they appear as if they were pip-installed, since Lambda already has that folder on the Python path.
I also have one just for ./my_folder1/sam-app/hello_world/.
Things that suck:
Because the pip packages are so large, you cannot view the lambda function code in the console UI. If they were layers and were put in /opt/python instead of /var/task, this would actually be possible.
Things I want:
sam build --use-container --dependencies - builds external packages (things installed with pip) in the Docker image. Takes a long time, but only needs to be done once (unless dependencies are added/removed).
sam build - since --dependencies isn't specified, just mounts the code within ./my_folder1/sam-app/hello_world/ along with any local directories specified in template.yaml.
Pre-build commands in template.yaml - improves customizability.
sam local invoke - should behave like sam build && sam local invoke.
External dependencies (e.g. pip packages) should be made into layers by default and uploaded to S3.
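For what it's worth, the post-build sync step above can also be expressed in Python rather than bash. A minimal sketch of the same idea; the function name and module list are assumptions mirroring the rsync commands:

# sync_local_modules.py - hypothetical replacement for the rsync post-build step
import shutil
from pathlib import Path

FUNCTION = "MyFunction"  # assumed function name from the build output above
BUILD_DIR = Path(".aws-sam/build") / FUNCTION

# local packages to copy into the build directory so the imports resolve at runtime
LOCAL_MODULES = [Path("my_folder1")]

for module in LOCAL_MODULES:
    shutil.copytree(
        module,
        BUILD_DIR / module.name,
        ignore=shutil.ignore_patterns("__pycache__", "*.pyc"),  # like rsync exclusions
        dirs_exist_ok=True,  # requires Python 3.8+
    )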

Fixing 'Import [module] could not be resolved' in pyright

I'm using pyright for type checking, and I'm also using pytest for testing inside Visual Studio Code. The folder structure for my tests is to have a test subfolder in the package root. For example:
MyPackage
├── __init__.py
├── MyModule.py
└── test
    ├── __init__.py
    └── MyModule_test.py
I'm organizing things like this as there will be many packages and I want to keep things organized.
Inside the test file I have:
import pytest
import MyPackage.MyModule
...
Pytest is able to discover the tests and run them OK because it has some special ability to adjust its sys.path (or something).
However, pyright will just complain that it cannot import the module:
Import 'MyPackage.MyModule' could not be resolved - pyright (reportMissingImports). This makes sense, but is there some way to deal with this, either in pyright or in the Visual Studio Code settings, to stop it from complaining?
You can add the library path to sys.path:
import sys
sys.path.insert(1, "..")  # path to the directory that contains the module
import MyModule
To enable Pylance to use your library properly (for auto-complete etc.), use the following steps:
Pylance, by default, includes the root path of your workspace. If you want to include other subdirectories as import resolution paths, you can add them using the python.analysis.extraPaths setting for the workspace.
In VS Code press Ctrl+, to open Settings.
Type in python.analysis.extraPaths.
Select "Add Item".
Type in the path to your library, e.g. `..`.
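Equivalently, the setting can be written straight into the workspace's .vscode/settings.json; a minimal sketch, where "src" stands in for whatever directory actually contains your package:

{
    "python.analysis.extraPaths": ["src"]
}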
OK, a relative import, as illustrated here, was able to solve this. So in my case I should have:
# MyModule_test.py
import pytest
from .. import MyModule
You should create a pyrightconfig.json file or pyproject.toml file at the root of your project. For example, if it's a Django project, you should have one of those files where manage.py is placed. Then set the include parameter and add the subdirectories (or app folders, in Django terms).
You can consult this sample config file. See this issue ticket.
For example, if this were my project structure:
├── manage.py
├── movie
│   ├── admin.py
│   ├── apps.py
│   ├── __init__.py
│   ├── models.py
│   ├── tests.py
│   └── views.py
├── moviereviews
│   ├── asgi.py
│   ├── __init__.py
│   ├── settings.py
│   ├── urls.py
│   └── wsgi.py
└── pyproject.toml
my pyproject.toml would be:
[tool.pyright]
include = ["movie", "moviereviews"]
If you are working within a Python virtual environment, set venvPath and venv. Consult the documentation for an exhaustive list of options.
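For example, continuing the pyproject.toml above and assuming the virtual environment lives in ./.venv (the folder name is an assumption):
[tool.pyright]
include = ["movie", "moviereviews"]
venvPath = "."
venv = ".venv"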

How can I pip install for development?

I'm trying to pip install a GitHub project locally, outside of site-packages so that I can modify it, etc.
I've added -e git+git@github.com:Starcross/django-starcross-gallery.git#egg=gallery to my requirements.txt, which brings the relevant part of my project layout to look like this:
/home/mat/venv/proj/
└── src
    └── gallery
        ├── admin.py
        ├── apps.py
        ├── build.sh
        ├── django_starcross_gallery.egg-info
        │   ├── dependency_links.txt
        │   ├── PKG-INFO
        │   ├── requires.txt
        │   ├── SOURCES.txt
        │   └── top_level.txt
        ├── forms.py
        ├── __init__.py
        ├── LICENSE
        ├── MANIFEST.in
        ├── models.py
        ├── README.rst
        ├── settings.py
        ├── setup.py
        ├── signals.py
        ├── static
        │   └── ...
        ├── templates
        │   └── ...
        ├── tests
        │   └── ...
        ├── tests.py
        ├── urls.py
        └── views.py
As far as I can see, the problem is that these .egg-link and .pth files point one level too deep:
lib/python3.6/site-packages/django-starcross-gallery.egg-link:
/home/mat/venv/proj/src/gallery
.
lib/python3.6/site-packages/easy-install.pth:
/home/mat/venv/proj/src/gallery
I can fix everything by either moving gallery a level deeper, or changing django-starcross-gallery.egg-link and easy-install.pth to point to src.
Is there a config parameter I can pass in requirements.txt to make this work properly? Or do I have to adjust the project layout to fit?
Since you want to modify it, why not just clone the repo? To make your interpreter able to find and use it, you have some options:
modify your sys.path, appending the path to the repo
create a symlink under your project directory that points to the repo
This way, you don't have to pip install it every time you modify it.
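A third option along the same lines is an editable install of the clone, which also survives modifications. A minimal sketch (the HTTPS URL is inferred from the requirements line above):
$ git clone https://github.com/Starcross/django-starcross-gallery.git
$ pip install -e django-starcross-gallery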
As has been mentioned, the best way to do this is to clone the repo. This goes for most packages, as pip may build extensions and carry out other actions during install aimed at using the module in production rather than editing the source.
To explain why I chose this structure, I wanted to be able to develop the package inside a Django project. As the Django docs say, the app should be placed in a separate directory, which enables setuptools to install the package correctly. There is no way I could find that would enable this to continue to work inside a project, hence the build script to move the files into a suitable directory and generate the package.

How to use the Python package inside a project

I have the following directory structure:
├── DynamicProgramming
│   ├── 0-1_kp_problem.py
│   ├── b.py
│   ├── largest_contigous_subarray.py
│   ├── longest_common_substring.py
│   ├── min_change_for_given_money.py
│   ├── optimal_matrix_chain.py
│   ├── Readme.md
│   └── wis.py
├── helper
│   ├── a.py
│   └── __init__.py
└── Readme.md
The helper directory contains the library functions which will be used all over the code. How can I import the helper package from the scripts inside DynamicProgramming without adding it to the path?
Edit:
I cannot move the helper directory inside DynamicProgramming because more than one directory may use it.
You could use something like:
from ..helper import a
See the Python docs on packages.
If you run your code from the project root folder, you are likely to succeed with import helper or import helper.a. If not, you would have to add the current directory to PYTHONPATH:
$ export PYTHONPATH="."
Better: use a project setup.py
Instead of playing with PYTHONPATH (which can be a tricky business sometimes), you should create your project as a Python package: add a setup.py to your project root, specify the attributes of the package there, and build it from that.
setup.py can define multiple packages at once, but generally it is more common to use only one; for this purpose it would be better to move the helper package into the DynamicProgramming structure and import it from there.
Search for setup.py Python packaging tutorials; it requires some study, but it will pay off. A minimal sketch follows.
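A minimal sketch of such a setup.py, assuming it sits at the project root shown above (name and version are illustrative):

# setup.py - after `pip install -e .`, `import helper` works from anywhere,
# including the scripts inside DynamicProgramming
from setuptools import setup

setup(
    name="myproject",     # illustrative name
    version="0.1.0",
    packages=["helper"],  # the package directory that has __init__.py
)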

directory structure for a project that mixes C++ and Python

Say you want to create a programming project that mixes C++ and Python. The Foo C++ project structure uses CMake, and a Python module is created using SWIG. The tree structure would look something like this:
├── CMakeLists.txt
├── FooConfig.cmake.in
├── FooConfigVersion.cmake.in
├── Makefile
├── README
├── foo
│   ├── CMakeLists.txt
│   ├── config.hpp.in
│   ├── foo.cpp
│   └── foo.hpp
└── swig
    └── foo.i
Now you would like to make use of the Foo project within a Python project, say Bar:
├── AUTHORS.rst
├── CONTRIBUTING.rst
├── HISTORY.rst
├── LICENSE
├── MANIFEST.in
├── Makefile
├── README.rst
├── docs
│   ├── Makefile
│   ├── authors.rst
│   ├── conf.py
│   ├── contributing.rst
│   ├── history.rst
│   ├── index.rst
│   ├── installation.rst
│   ├── make.bat
│   ├── readme.rst
│   └── usage.rst
├── bar
│   ├── __init__.py
│   └── bar.py
├── requirements.txt
├── setup.cfg
├── setup.py
├── tests
│   ├── __init__.py
│   └── test_bar.py
└── tox.ini
This structure was created using cookiecutter's pypackage template. A BoilerplatePP template is also available to generate a CMake C++ project with cookiecutter (no SWIG part).
So now that I have the structure of both projects, and considering that development will take place mainly in Python and the project will be run on different systems, I need to address the following questions:
What's the best way to mix them? Should I collapse both root directories? Should I have the Foo C++ project as a directory of the Bar project or the other way around? I may be inclined to put the entire C++ structure shown above in a folder at the root level of the Python project, but I would like to know a priori any pitfalls as the CMake system is quite powerful and it may be convenient to do it the other way around.
In case I decide to put the Foo project as a directory within Bar, is the Python setuptools package as powerful as the CMake build system? I ask this because when I take a look at the Bar project, at the top level it seems there's only a bunch of scripts, but I don't know if this is the equivalent to CMake as I'm new to Python.
The Bar project outlined above has a single bar directory, but I assume that whenever this project expands, instead of having many other directories at the root level, other directories containing Python code will be placed within bar. Is this correct (in the Pythonic sense)?
I assume that a single egg will be produced from the entire project, so that it can be installed and run in many different python systems. Is the integration of the module created by the Foo project easy? I assume that this module will be created in a different directory than bar.
In order for the Python code within the bar directory to work, the module created by SWIG has to be available, so I guess the most straightforward way to do this is to modify the PYTHONPATH environment variable using the CMake system. Is this fine, or is there a better way?
If the C++ application has no use outside the Python package that will contain it:
You can pretty safely place the C++ code within the Python package that owns it: have the "foo" directory within the "bar" directory in your example. This will make packaging the final Python module a bit easier.
If the C++ application is reusable:
I would definitely try to think of things in terms of "packages", where independent parts are self-contained. All independent parts live on the same level. If one part depends on another, you import from its corresponding "package" from the same level. This is how dependencies typically work.
I would NOT include one within the other, because one does not strictly belong to the other. What if you started a third project that needed "foo", but did not need "bar"?
I would place both "foo" and "bar" packages into the same "project" directory (and I would probably give each package its own code repository so each can be easily maintained and installed).
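On the setuptools question: setuptools can drive SWIG directly, so the Bar package can build the wrapper module without leaving Python tooling, which also addresses the egg/PYTHONPATH concerns (the extension is installed with the package, so nothing needs to be added to PYTHONPATH). A minimal sketch, assuming the Foo sources are vendored as laid out above and using the usual SWIG _foo naming convention:

# setup.py - sketch of building the SWIG wrapper with setuptools
from setuptools import setup, Extension

foo_ext = Extension(
    "_foo",                                 # SWIG generates foo.py plus _foo.so
    sources=["swig/foo.i", "foo/foo.cpp"],  # .i files are run through SWIG first
    swig_opts=["-c++"],                     # wrap C++ rather than C
    include_dirs=["foo"],
)

setup(
    name="bar",
    version="0.1.0",
    packages=["bar"],
    ext_modules=[foo_ext],
)

For complex builds, CMake remains the more powerful system, so a common compromise is to keep building with CMake and use setuptools only for packaging.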
