Google Cloud Buildpack custom source directory for Python app

Google Cloud Buildpack custom source directory for Python app - python

I am experimenting with Google Cloud Platform buildpacks, specifically for Python. I started with the Sample Functions Framework Python example app, and got that running locally, with commands:
pack build --builder=gcr.io/buildpacks/builder sample-functions-framework-python
docker run -it -ePORT=8080 -p8080:8080 sample-functions-framework-python
Great, let's see if I can apply this concept on a legacy project (Python 3.7 if that matters).
The legacy project has a structure similar to:
.gitignore
source/
main.py
lib
helper.py
requirements.txt
tests/
<test files here>
The Dockerfile that came with this project packaged the source directory contents without the "source" directory, like this:
COPY lib/ /app/lib
COPY main.py /app
WORKDIR /app
... rest of Dockerfile here ...
Is there a way to package just the contents of the source directory using the buildpack?
I tried to add this config to the project.toml file:
[[build.env]]
name = "GOOGLE_FUNCTION_SOURCE"
value = "./source/main.py"
But the Python modules/imports aren't set up correctly for that, as I get this error:
File "/workspace/source/main.py", line 2, in <module>
from source.lib.helper import mymethod
ModuleNotFoundError: No module named 'source'
Putting both main.py and /lib into the project root dir would make this work, but I'm wondering if there is a better way.
Related question, is there a way to see what project files are being copied into the image by the buildpack? I tried using verbose logging but didn't see anything useful.
Update:
The python module error:
File "/workspace/source/main.py", line 2, in <module>
from source.lib.helper import mymethod
ModuleNotFoundError: No module named 'source'
was happening because I moved the lib dir into source in my test project, and when I did this, Intellij updated the import statement in main.py without me catching it. I fixed the import, then applied the solution listed below and it worked.

I had been searching the buildpack and Google cloud function documentation, but I discovered the option I need on the pack build documentation page: option --path.
This command only captures the source directory contents:
pack build --builder=gcr.io/buildpacks/builder --path source sample-functions-framework-python
If changing the path, the project.toml descriptor needs to be in that directory too (or specify with --descriptor on command line).

Related

Python module not found when using Docker

I have an Angular-Flask application that I'm trying to Dockerize.
One of the Python files used is a file called trial.py. This file uses a module called _trial.pyd. When I run the project on local everything works fine. However, when I dockerize using the following code and run, it gives me error "Module _trial not found".
FROM node:latest as node
COPY . /APP
COPY package.json /APP/package.json
WORKDIR /APP
RUN npm install
RUN npm install -g #angular/cli#7.3.9
CMD ng build --base-href /static/
FROM python:3.6
WORKDIR /root/
COPY --from=0 /APP/ .
RUN pip install -r requirements.txt
EXPOSE 5000
ENTRYPOINT ["python"]
CMD ["app.py"]
Where am I going wrong? Do I need to change some path in my dockerfile?
EDIT:
Directory structure:
APP [folder with src Angular files]
static [folder with js files]
templates [folder with index.html]
_trial.pyd
app.py [Starting point of execution]
Dockerfile
package.json
requirements.txt
trial.py
app.py calls trial.py (successfully).
trial.py needs _trial.pyd. I try to call it using import _trial.
It works fine locally, but when I dockerize it, i get the following error message:
Traceback (most recent call last):
File "app.py", line 7, in
from trial import *
File "/root/trial.py", line 15, in
import _trial
ModuleNotFoundError: No module named '_trial'
The commands run are:
docker image build -t prj .
docker container run --publish 5000:5000 --name prj prj
UPDATE
It is able to recognize other .py files but not .pyd module.

I think this could be due to the following reasons
Path
Make sure the required file is in the pythonpath. It sounds like you have done that so probably not this one.
Debug Mode
If you are working with a debug mode you will need to rename this module "_trail.pyd" to "_trail_d.pyd"
Missing DLL
The .pyd file may require a dll or other dependency that is not available and cant be imported due to that. there are tools such as depends.exe or this that allow you to find what is required
Name Space Issue
If there was a file already called "_trail.py" in the python path that could create unwanted behavior.

Solved it!
The issue was that I was using .pyd files in a Linux container. However, it seems that Linux doesn't support .pyd files. This is why it was not able to detect the _trial.pyd module. I had to generate a shared object (.so) file (i.e. _trial.so) and it worked fine.
EDIT: The .so file was generated on a Linux system by me. Generating .so on Windows and then using it in a Linux container gives "invalid ELF header" error.

Google Dataflow - Failed to import custom python modules

My Apache beam pipeline implements custom Transforms and ParDo's python modules which further imports other modules written by me. On Local runner this works fine as all the available files are available in the same path. In case of Dataflow runner, pipeline fails with module import error.
How do I make custom modules available to all the dataflow workers? Please advise.
Below is an example:
ImportError: No module named DataAggregation
at find_class (/usr/lib/python2.7/pickle.py:1130)
at find_class (/usr/local/lib/python2.7/dist-packages/dill/dill.py:423)
at load_global (/usr/lib/python2.7/pickle.py:1096)
at load (/usr/lib/python2.7/pickle.py:864)
at load (/usr/local/lib/python2.7/dist-packages/dill/dill.py:266)
at loads (/usr/local/lib/python2.7/dist-packages/dill/dill.py:277)
at loads (/usr/local/lib/python2.7/dist-packages/apache_beam/internal/pickler.py:232)
at apache_beam.runners.worker.operations.PGBKCVOperation.__init__ (operations.py:508)
at apache_beam.runners.worker.operations.create_pgbk_op (operations.py:452)
at apache_beam.runners.worker.operations.create_operation (operations.py:613)
at create_operation (/usr/local/lib/python2.7/dist-packages/dataflow_worker/executor.py:104)
at execute (/usr/local/lib/python2.7/dist-packages/dataflow_worker/executor.py:130)
at do_work (/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py:642)

The issue is probably that you haven't grouped your files as a package. The Beam documentation has a section on it.
Multiple File Dependencies
Often, your pipeline code spans multiple files. To run your project remotely, you must group these files as a Python package and specify the package when you run your pipeline. When the remote workers start, they will install your package. To group your files as a Python package and make it available remotely, perform the following steps:
Create a setup.py file for your project. The following is a very basic setup.py file.
setuptools.setup(
name='PACKAGE-NAME'
version='PACKAGE-VERSION',
install_requires=[],
packages=setuptools.find_packages(),
)
Structure your project so that the root directory contains the setup.py file, the main workflow file, and a directory with the rest of the files.
root_dir/
setup.py
main.py
other_files_dir/
See Juliaset for an example that follows this required project structure.
Run your pipeline with the following command-line option:
--setup_file /path/to/setup.py
Note: If you created a requirements.txt file and your project spans multiple files, you can get rid of the requirements.txt file and instead, add all packages contained in requirements.txt to the install_requires field of the setup call (in step 1).

I ran into the same issue and unfortunately, the docs are not as verbose as they need to be.
So, the problem as it turns out is that both the root_dir and the other_files_dir must contain an __init__.py file. When a directory contains an __init__.py file (even if it's empty) python will treat that directory as a package, which in this instance is what we want. So, your final folder structure should look something like this:
root_dir/
__init__.py
setup.py
main.py
other_files_dir/
__init__.py
module_1.py
module_2.py
And what you'll find is that python will build an .egg-info folder that describes your package including all pip dependencies. It will also contain the top_level.txt file which contains the name of the directory that holds the modules (i.e other_files_dir)
Then you would simply call the modules in main.py as below
from other_files_dir import module_1

How to run a python program like pycharm does

I've been developing a project in python for some time using pycharm. I have no problem running it in pycharm, but I want to try and running it from the command line (I'm using windows).
When I try to run my main file using python <filename> from within the root directory, I get a module not found error. What is pycharm doing/how can i replicate it?
Also, pycharm creates pycache folders. If my understanding is correct its compiling the files, and that makes the runtime faster. Since my runtime is already long i'd like to do the same.
I'm using python 3.6
edit
File Structure
-Root\
----scheduler\
------main.py
------matrices
------models
------rhc
------Singleton.py
------utils.py
------__init__.py
------apis\
acedb.py
metrics.py
__init__.py
------matrices\
distancematrix.py
__init__.py
------models\
branch.py
constants.py
customer.py
dispatch.py
technician.py
workorder.py
__init__.py
------rhc\
pathfinder.py
rhc.py
schedule.py
sched_time.py
tech_schedule.py
__init__.py
Edit 2
Found the solution, i had to move my main files outside of the modules
Thanks!

If you have the below folder structure for instance. add an empty file named init.py and import from your app.py, in case you have a Module1Class , you can always import it like this
from modules.module1 import Module1Class
from modules.module2 import Module2Class
Folder structure
/app.py
/modules
/module1.py
/module2.py
/__init__.py
The first time you rum app.py from terminal like python app.py, the .pyc files will be created, leaving you with the following folder structure
/app.py
/modules
/module1.py
/module1.pyc
/module2.py
/module2.pyc
/__init__.py
Please refer to the python documentation as it's very well documented on how to create modules and importing them. I'm more used to python2.7 , so my answer might not be an exact fit to newer python versions.
https://docs.python.org/3/tutorial/modules.html
From there you can learn more about __ init __.py and module creation , exporting and importing
PS: I use only text editors to develop in python, as I find pycharm a bit on the heavy side, so I cannot explain how exactly pycharm works behind the curtains.

see ModuleNotFoundError: What does it mean __main__ is not a package? for a good example of ModuleNotFoundError description (was answered fast)
I bet that pycharm is configured to use a different python interpreter of virtual environment.

How to run tests without installing package?

I have some Python package and some tests. The files are layed out following http://pytest.org/latest/goodpractices.html#choosing-a-test-layout-import-rules
Putting tests into an extra directory outside your actual application
code, useful if you have many functional tests or for other reasons
want to keep tests separate from actual application code (often a good
idea):
setup.py # your distutils/setuptools Python package metadata
mypkg/
__init__.py
appmodule.py
tests/
test_app.py
My problem is, when I run the tests py.test, I get an error
ImportError: No module named 'mypkg'
I can solve this by installing the package python setup.py install but this means the tests run against the installed package, not the local one, which makes development very tedious. Whenever I make a change and want to run the tests, I need to reinstall, else I am testing the old code.
What can I do?

I know this question has been already closed, but a simple way I often use is to call pytest via python -m, from the root (the parent of the package).
$ python -m pytest tests
This works because -m option adds the current directory to the python path, and hence mypkg is detected as a local package (not as the installed).
See:
https://docs.pytest.org/en/latest/usage.html#calling-pytest-through-python-m-pytest

The normal approach for development is to use a virtualenv and use pip install -e . in the virtualenv (this is almost equivalent to python setup.py develop). Now your source directory is used as installed package on sys.path.
There are of course a bunch of other ways to get your package on sys.path for testing, see Ensuring py.test includes the application directory in sys.path for a question with a more complete answer for this exact same problem.

On my side, while developing, I prefer to run tests from the IDE (using a runner extension) rather than using the command line. However, before pushing my code or prior to a release, I like to use the command line.
Here is a way to deal with this issue, allowing you to run tests from both the test runner used by your IDE and the command line.
My setup:
IDE: Visual Studio Code
Testing: pytest
Extension (test runner): https://marketplace.visualstudio.com/items?itemName=LittleFoxTeam.vscode-python-test-adapter
Work directory structure (my solution should be easily adaptable to your context):
project_folder/
src/
mypkg/
__init__.py
appmodule.py
tests/
mypkg/
appmodule_test.py
pytest.ini <- Use so pytest can locate pkgs from ./src
.env <- Use so VsCode and its extention can locate pkgs from ./src
.env:
PYTHONPATH="${PYTHONPATH};./src;"
pytest.ini (tried with pytest 7.1.2):
[pytest]
pythonpath = . src
./src/mypkg/appmodule.py:
def i_hate_configuring_python():
return "Finally..."
./tests/mypkg/appmodule_test.py:
from mypkg import app_module
def test_demo():
print(app_module.i_hate_configuring_python())
This should do the trick

Import the package using from .. import mypkg. For this to work you will need to add (empty) __init__.py files to the tests directory and the containing directory. py.test should take care of the rest.

Py.test No module named *

I have a folder structure like this
App
--App
--app.py
--Docs
--Tests
--test_app.py
In my test_app.py file, I have a line to import my app module. When I run py.test on the root folder, I get this error about no module named app. How should I configure this?

Working with Python 3 and getting the same error on a similar project layout, I solved it by adding an __init__ file to my tests module.
$ touch tests/__init__.py
I'm not great at packaging and importing, but I think that this helps pytest work out where the target App module is located.

I already had an __init__.py file in the /App/App directory and wanted to run tests from the project root without any path-mangling magic:
python -m pytest tests
The output immediately looks like this:
➟ python -m pytest tests
====================================== test session starts ======================================
platform linux -- Python 3.5.1, pytest-2.9.0, py-1.4.31, pluggy-0.3.1
rootdir: /home/andrew/code/app, inifile:
plugins: teamcity-messages-1.17
collected 46 items
... lines omitted ...
============================= 44 passed, 2 skipped in 1.61 seconds ==============================

I had a similar problem and had to delete __init__.py from the root and add an __init__.py to the tests folder.

So you are running py.test from /App. Are you sure /App/App is in your $PYTHONPATH?
If it's not, code that tries to import app will fail with such a message.
EDIT0: including the info from my comment below, for completeness.
An attempt to import app will only succeed if it was executed inside /App/App, which is not the case here. You probably want to make /App/App a package by putting __init__.py inside it, and change your import to qualify app as from App import app.
EDIT1: by request, adding further explanation from my second comment below.
By putting __init__.py inside /App/App, that directory becomes a package. Which means you can import from it, as long as it - the directory - is visible in the $PYTHONPATH. I.e. you can do from App import app if /App is in the $PYTHONPATH. Your current working directory gets automatically added to $PYTHONPATH, so when you run a script from /App, the import will work.

Running pytest with the python -m pytest command helps with this exact thing.
Since your current package is not yet in your $PYTHONPATH or sys.path - pytest gets this error.
By using python -m pytest you automatically add the working directory into sys.path for running pytest. Their documentation also mentions:
This is almost equivalent to invoking the command line script pytest

I also got same error while running test cases for my app located as below
myproject
--app1
--__init.py__
--test.py
--app2
--__init.py__
--test.py
--__init.py__
I deleted my myproject's init.py file to run my test cases.

TL;DR
You might as well add an empty conftest.py file to your root app folder.
(if you take a look at the question folder structure, that would be the same level as the "Tests" folder, not inside of it).
More info:
Pytest looks for conftest.py files inside all your project folders.
conftest.py provides configuration for the file tree pytest finds it in. Because pytest somehow scans all subdirectories starting from conftest.py folder, it should find packages/modules outside the tests folder (as long as a conftest.py file is in your app root folder).
Eventually, you might want to write some code in your empty conftest.py, specially to share fixtures among different tests files, in order to avoid duplicate code. Afterall, the DRY principle (Don't Repeat Yourself) should also be follwed when writing tests.
Adding __init.py__ to the tests folder also should help pytest to find modules throughout your application. However, note that Python 3.3+ has implicit namespace packages that allow it to create a packages without an __init__.py file. That been said, creating __init__.py files for this specific purpose seems more like a a workaround for pytest than a python requirement. More about that in: Is __init__.py not required for packages in Python 3.3+

I got the similar issue. And after trying multiple things including installing pytest on virtual environment and adding/removing __init__.py file from the package, none worked for me.
Solution that worked for me is(windows solution):
Added Project folder(not package folder) to python path(set PYTHONPATH=%PYTHONPATH%;%CD%)
Ran my script from Project Folder and boom, it worked.

I hit the same issue.
my-app
--conf
--my-app
--tests
I set the __init__.py files. I added a conftest.py ( for sharing pytest.fixtures ). I added this to my Poetry file ( pyproject.toml:
[tool.pytest.ini_options]
pythonpath = [
"."
Turned out it was my use of hyphens and not underscores ! Noo...
# pytest can't find Module
--my-app
# works
--my_app

What worked for me: I had to make absolute imports in my test file and call python -m test in the root folder.

This worked for me:
Went to parent app, and pip install -e . (install a local and editable app).

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

Google Cloud Buildpack custom source directory for Python app - python

Related

Python module not found when using Docker

Google Dataflow - Failed to import custom python modules

How to run a python program like pycharm does

How to run tests without installing package?

Py.test No module named *

Categories

Resources