File not found even after adding the file inside docker - python

I have written a Dockerfile which adds my Python script inside the container:
ADD test_pclean.py /test_pclean.py
My directory structure is:
.
├── Dockerfile
├── README.md
├── pipeline.json
└── test_pclean.py
My JSON file, which acts as a configuration file for creating a pipeline in Pachyderm, is as follows:
{
  "pipeline": {
    "name": "mopng-beneficiary-v2"
  },
  "transform": {
    "cmd": ["python3", "/test_pclean.py"],
    "image": "avisrivastava254084/mopng-beneficiary-v2-image-7"
  },
  "input": {
    "atom": {
      "repo": "mopng_beneficiary_v2",
      "glob": "/*"
    }
  }
}
Even though I have copied the official documentation's example, I am facing an error:
python3: can't open file '/test_pclean.py': [Errno 2] No such file or directory
My Dockerfile is:
FROM debian:stretch
# Install opencv and matplotlib.
RUN apt-get update \
&& apt-get upgrade -y \
&& apt-get install -y unzip wget build-essential \
cmake git pkg-config libswscale-dev \
&& apt-get clean \
&& rm -rf /var/lib/apt
RUN apt update
RUN apt-get -y install python3-pip
RUN pip3 install matplotlib
RUN pip3 install pandas
ADD test_pclean.py /test_pclean.py
ENTRYPOINT [ "/bin/bash/" ]

Like some of the comments above suggest, it looks like your test_pclean.py file isn't in the Docker image. Here's what should fix it.
Make sure test_pclean.py is included in your Docker image as part of the build process. Put this as the last step in your Dockerfile:
COPY test_pclean.py .
Ensure that your Pachyderm pipeline spec has the following for the cmd portion:
"cmd": ["python3", "./test_pclean.py"]
And this is more of a suggestion than a requirement: you'll make life easier for yourself if you use image tags as part of your docker build. If you default to the latest tag, any future iteration/build of this step in your pipeline could have negative effects (new bugs in your code, etc.). The best practice is therefore to use a particular version in your pipeline: mopng-beneficiary-v2-image-7:v1, then mopng-beneficiary-v2-image-7:v2, and so on. That way you can iterate on, say, version 3 and it won't affect the already running pipeline.
docker build -t avisrivastava254084/mopng-beneficiary-v2-image-7:v1 .
Then just update your pipeline spec to use avisrivastava254084/mopng-beneficiary-v2-image-7:v1
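Put together, a hedged sketch of that last step (push the tagged image, then point the transform.image field in the pipeline spec at the same tag):
docker push avisrivastava254084/mopng-beneficiary-v2-image-7:v1
And in pipeline.json:
"transform": {
  "cmd": ["python3", "./test_pclean.py"],
  "image": "avisrivastava254084/mopng-beneficiary-v2-image-7:v1"
}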

I was not changing the tag/commit on my Docker image with each build, and hence Kubernetes was using the local image it already had (without new tags or commits, it doesn't acknowledge any change). Once I started tagging each build with the commit, Kubernetes started pulling the intended Docker image.
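For example, one way to do that (a sketch; using the short git commit SHA as the tag is just one convention, and the image name is taken from the question):
# tag each build with the current commit so Kubernetes sees a new image
TAG=$(git rev-parse --short HEAD)
docker build -t avisrivastava254084/mopng-beneficiary-v2-image-7:$TAG .
docker push avisrivastava254084/mopng-beneficiary-v2-image-7:$TAG
# then reference the same :$TAG in the pipeline spec's transform.image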

Related

AWS CDK: Installing external dependencies using requirements.txt via PythonFunction

I am trying to synthesize a CDK app (TypeScript) which has some Python lambda functions.
I am using PythonFunction with a requirements.txt file to install the external dependencies. I am running VS Code on WSL. I am encountering the following error.
Bundling asset Test/test-lambda-stack/test-subscriber-data-validator-poc/Code/Stage...
node:internal/fs/utils:347
throw err;
^
Error: ENOENT: no such file or directory, open '~/.nvm/versions/node/v16.17.0/lib/node_modules/docker/node_modules/highlight.js/styles/cp -rTL /asset-input/ /asset-output && cd /asset-output && python -m pip install -r requirements.txt -t /asset-output.css'
at Object.openSync (node:fs:594:3)
at Object.readFileSync (node:fs:462:35)
at module.exports (~/.nvm/versions/node/v16.17.0/lib/node_modules/docker/src/getColourScheme.js:47:26)
at ~/.nvm/versions/node/v16.17.0/lib/node_modules/docker/src/docker.js:809:47
at FSReqCallback.readFileAfterClose [as oncomplete] (node:internal/fs/read_file_context:68:3)
at FSReqCallback.callbackTrampoline (node:internal/async_hooks:130:17) {
errno: -2,
syscall: 'open',
code: 'ENOENT',
path: '~/.nvm/versions/node/v16.17.0/lib/node_modules/docker/node_modules/highlight.js/styles/cp -rTL /asset-input/ /asset-output && cd /asset-output && python -m pip install -r requirements.txt -t /asset-output.css'
}
Error: Failed to bundle asset Test/test-lambda-stack/test-subscriber-data-validator-poc/Code/Stage, bundle output is located at ~/Code/AWS/CDK/test-dev-poc/cdk.out/asset.6b577fe604573a3b53e635f09f768df3f87ad6651b18e9f628c2a086a525bb49-error: Error: docker exited with status 1
at AssetStaging.bundle (~/Code/AWS/CDK/test-dev-poc/node_modules/aws-cdk-lib/core/lib/asset-staging.js:2:614)
at AssetStaging.stageByBundling (~/Code/AWS/CDK/test-dev-poc/node_modules/aws-cdk-lib/core/lib/asset-staging.js:1:4506)
at stageThisAsset (~/Code/AWS/CDK/test-dev-poc/node_modules/aws-cdk-lib/core/lib/asset-staging.js:1:1867)
at Cache.obtain (~/Code/AWS/CDK/test-dev-poc/node_modules/aws-cdk-lib/core/lib/private/cache.js:1:242)
at new AssetStaging (~/Code/AWS/CDK/test-dev-poc/node_modules/aws-cdk-lib/core/lib/asset-staging.js:1:2262)
at new Asset (~/Code/AWS/CDK/test-dev-poc/node_modules/aws-cdk-lib/aws-s3-assets/lib/asset.js:1:736)
at AssetCode.bind (~/Code/AWS/CDK/test-dev-poc/node_modules/aws-cdk-lib/aws-lambda/lib/code.js:1:4628)
at new Function (~/Code/AWS/CDK/test-dev-poc/node_modules/aws-cdk-lib/aws-lambda/lib/function.js:1:2803)
at new PythonFunction (~/Code/AWS/CDK/test-dev-poc/node_modules/#aws-cdk/aws-lambda-python-alpha/lib/function.ts:73:5)
at new lambdaInfraStack (~/Code/AWS/CDK/test-dev-poc/lib/serviceInfraStacks/lambda-infra-stack.ts:24:40)
My requirements.txt file looks like this
attrs==22.1.0
jsonschema==4.16.0
pyrsistent==0.18.1
My cdk code is this
new PythonFunction(this, `${appName}-subscriber-data-validator-${stage}`, {
  runtime: Runtime.PYTHON_3_9,
  entry: join('lambdas/subscriber_data_validator'),
  handler: 'lambda_hander',
  index: 'subscriber_data_validator.py'
})
Do I need to install anything additional? I have esbuild installed as a devDependency. I'm having a real hard time getting this to work. Any help is appreciated.

Docker : exec /usr/bin/sh: exec format error

Hi guys need some help.
I created a custom Docker image and pushed it to Docker Hub, but when I run it in CI/CD it gives me this error.
exec /usr/bin/sh: exec format error
Where:
Dockerfile
FROM ubuntu:20.04
RUN apt-get update
RUN apt-get install -y software-properties-common
RUN apt-get install -y python3-pip
RUN pip3 install robotframework
.gitlab-ci.yml
robot-framework:
  image: rethkevin/rf:v1
  allow_failure: true
  script:
    - ls
    - pip3 --version
Output
Running with gitlab-runner 15.1.0 (76984217)
on runner zgjy8gPC
Preparing the "docker" executor
Using Docker executor with image rethkevin/rf:v1 ...
Pulling docker image rethkevin/rf:v1 ...
Using docker image sha256:d2db066f04bd0c04f69db1622cd73b2fc2e78a5d95a68445618fe54b87f1d31f for rethkevin/rf:v1 with digest rethkevin/rf#sha256:58a500afcbd75ba477aa3076955967cebf66e2f69d4a5c1cca23d69f6775bf6a ...
Preparing environment
00:01
Running on runner-zgjy8gpc-project-1049-concurrent-0 via 1c8189df1d47...
Getting source from Git repository
00:01
Fetching changes with git depth set to 20...
Reinitialized existing Git repository in /builds/reth.bagares/test-rf/.git/
Checking out 339458a3 as main...
Skipping Git submodules setup
Executing "step_script" stage of the job script
00:00
Using docker image sha256:d2db066f04bd0c04f69db1622cd73b2fc2e78a5d95a68445618fe54b87f1d31f for rethkevin/rf:v1 with digest rethkevin/rf#sha256:58a500afcbd75ba477aa3076955967cebf66e2f69d4a5c1cca23d69f6775bf6a ...
exec /usr/bin/sh: exec format error
Cleaning up project directory and file based variables
00:01
ERROR: Job failed: exit code 1
Any thoughts on how to resolve this error?
The problem is that you built this image for arm64/v8 -- but your runner is using a different architecture.
If you run:
docker image inspect rethkevin/rf:v1
You will see this in the output:
...
"Architecture": "arm64",
"Variant": "v8",
"Os": "linux",
...
Try building and pushing your image from your GitLab CI runner so the architecture of the image will match your runner's architecture.
Alternatively, you can build for multiple architectures using docker buildx. Alternatively still, you could run a GitLab runner on ARM architecture so that it can run the image for the architecture you built it on.
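For the multi-architecture route mentioned above, a sketch (assuming you are logged in to Docker Hub and have a buildx builder set up; the image name is the one from the question):
docker buildx build --platform linux/amd64,linux/arm64 -t rethkevin/rf:v1 --push .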
In my case, I was building it using buildx:
docker buildx build --platform linux/amd64 -f ./Dockerfile -t image .
However, the problem turned out to be in AWS Lambda.

cant docker build with pip install from jenkins pipeline

I am building a Docker image that will run a Flask application.
When I do it locally, I can build the image with no problem.
My Dockerfile:
FROM python:3.7
#RUN apt-get update -y
WORKDIR /app
RUN curl www.google.com
COPY requirements.txt requirements.txt
RUN pip install --trusted-host pypi.org --trusted-host files.pythonhosted.org -r requirements.txt
My Jenkins pipeline:
pipeline {
    agent {
        label "linux_machine"
    }
    stages {
        stage('Stage1') {
            steps {
                //sh 'docker --version'
                //sh 'python3 --version'
                //sh 'pip3 --version'
                checkout([$class: 'GitSCM', branches: [[name: '*/my_branch']], extensions: [], userRemoteConfigs: [[credentialsId: 'credentials_', url: 'https://myrepo.git']]])
            }
        }
        stage('Stage2') {
            steps {
                sh "docker build --tag tag1 --file path/to/docker_file_in_repo docker_folder_path"
            }
        }
    }
}
I was able to install Docker and Jenkins locally on my machine and all works fine, but when I put it on the Jenkins server with real agents I get:
File "/usr/local/lib/python3.7/site-packages/pip/_internal/network/auth.py", line 256, in handle_401
username, password, save = self._prompt_for_password(parsed.netloc)
File "/usr/local/lib/python3.7/site-packages/pip/_internal/network/auth.py", line 226, in _prompt_for_password
username = ask_input(f"User for {netloc}: ")
File "/usr/local/lib/python3.7/site-packages/pip/_internal/utils/misc.py", line 237, in ask_input
return input(message)
EOFError: EOF when reading a line
Removed build tracker: '/tmp/pip-req-tracker-i4mhh7vg'
The command '/bin/sh -c pip install --trusted-host pypi.org --trusted-host files.pythonhosted.org -r requirements.txt' returned a non-zero code: 2
I tried using --no-input but got the same error.
It seems that it is asking for a user and password; why is that?
Is Docker using the credentials of the agent/host and passing them to the commands in the Dockerfile?
Any suggestion on how I could make this work?
Thanks, guys.
Unfortunately, the problem is not clear at all from the message. What happens is that pip gets a 401 Unauthorized from the package index, so you have to provide credentials it can log in with.
You can add --no-input so it doesn't try to ask for a password (where it then fails due to STDIN being unavailable), but that doesn't solve the underlying problem of it being unable to authenticate against the index.
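One common way to provide those credentials at build time is a build argument that pip picks up from the environment (a sketch, not part of the original answer; the index URL, user and token are placeholders for whatever private index your requirements resolve against):
# In the Dockerfile, before the pip install step:
ARG PIP_INDEX_URL=https://pypi.org/simple
RUN pip install --no-input --trusted-host pypi.org --trusted-host files.pythonhosted.org -r requirements.txt
# In the Jenkins stage:
sh "docker build --build-arg PIP_INDEX_URL=https://<user>:<token>@pypi.example.com/simple --tag tag1 --file path/to/docker_file_in_repo docker_folder_path"
Note that plain build args end up in the image history, so if the token is sensitive, a BuildKit secret mount is the safer variant.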

Docker pip install creates recursive tmp/pip-build/tmp/pip-build... folder

I'm a bit new to Docker and Python apps. I was running into a really perplexing problem, and after a bit of button-pushing, I miraculously solved it. However, I understand neither the problem nor the solution, and want to understand both.
So I have a Dockerfile in the root directory of my app that does something like this:
COPY . .
RUN apt-get update && apt-get install -y enchant \
&& pip install --extra-index-url=${ARTIFACTORY} --no-cache-dir requirements.txt \
&& pip install . \ # installs my python app using setup.py
&& python -m app.run_model
ENTRYPOINT ...
This failed consistently because it ran out of disk space. Ok, fine, I deleted old, unused images. But I don't think that was the issue, because it kept failing and was also generating these super long, super weird recursive filenames like:
/tmp/pip-fih7z5-build/tmp/pip-fih7z5-build/tmp/pip-fih7z5-build/tmp/pip-fih7z5-build/tmp/pip-fih7z5-build/tmp/pip-fih7z5-build/tmp/pip-fih7z5-build/tmp/pip-fih7z5-build/tmp/pip-fih7z5-build/tmp/pip-fih7z5-build/tmp/pip-fih7z5-build/...
Adding the following wrapper around the Docker command somehow worked:
COPY . /workdir
RUN cd /workdir \
...
&& rm -rf /workdir
to get past the installation phase (although now it appears the app process is still failing). I'm not sure what was/is going on. Does anyone have any insight? My best guess is that somehow the two pip installs were creating some sort of recursive nightmare?
My setup.py is pretty standard, I think:
#!/usr/bin/env python
from glob import glob
from os.path import abspath, basename, dirname, join, splitext

from setuptools import find_packages, setup

here = abspath(dirname(__file__))
with open(join(here, 'README.md')) as f:
    long_description = f.read()

setup(
    ...
    packages=find_packages('src'),
    package_dir={'': 'src'},
    py_modules=[splitext(basename(path))[0] for path in glob('src/*.py')],
    zip_safe=False,
    include_package_data=True,
    install_requires=[
        'scikit-learn==0.18.1',
        'scipy==0.19.1',
        'nltk==3.2.3',
        'requests==2.17.3',
        'jsonschema==2.6.0',
        'pandas==0.20.1',
        'numpy==1.13.0',
        'textblob==0.12.0',
        'textstat==0.3.1',
        'langdetect==1.0.7',
        'unidecode==0.4.20',
        'Flask==0.12.1',
        'Flask-Env==1.0.1',
        'pyenchant==1.6.11'
    ],
    setup_requires=[
        'flake8==3.3.0',
    ],
    tests_require=[],
)

How to use gitlab-ci to manage test/construction of interdependent wheels

I have 3 Python packages: proj1, proj12 and proj13. Both proj12 and proj13 depend on proj1 (with from proj1.xxx import yyy).
The 3 projects are on a private GitLab instance, and each one has its own .gitlab-ci.yml.
In proj1 (http://gitlab.me.com/group/proj1/.gitlab-ci.yml) we run the unit tests and create a wheel exposed as an artifact:
# http://gitlab.me.com/group/proj1/.gitlab-ci.yml
image: python:2

mytest:
  artifacts:
    paths:
      - dist
  script:
    - apt-get update -qy; apt-get install -y python-dev python-pip
    - pip install -r requirements.txt
    - python setup.py test
    - python setup.py bdist_wheel

look:
  stage: deploy
  script:
    - ls -lah dist
For proj12 and proj13, e.g. in http://gitlab.me.com/group/proj12/.gitlab-ci.yml, we would like to run tests too, but I need to install the proj1 wheel to make them run.
All 3 projects are in the same private GitLab group.
What is the GitLab way to do this?
Pass the proj1 wheel to proj12 with an artifact?
In this case I don't know how to call/get the artifact in http://gitlab.me.com/group/proj12/.gitlab-ci.yml. It's the same GitLab, the same group, but a different project.
Use a GitLab secret variable to store SSH keys to clone proj1 in proj12/.gitlab-ci.yml?
(related to https://gitlab.com/gitlab-org/gitlab-ce/issues/4194)
This does not take advantage of the fact that proj1, proj12 and proj13 are on the same GitLab instance and in the same group; the person who does the build for one project has credentials to do the others. All 3 are connected by the user's private token.
I am trying to avoid having to deploy devpi- or pypiserver-like solutions.
So I'm looking for what to write in the proj12 .gitlab-ci.yml to get the dist/proj1-0.42-py2-none-any.whl wheel from the preceding proj1 build:
# http://gitlab.me.com/group/proj12/.gitlab-ci.yml
image: python:2

mytest12:
  script:
    - apt-get update -qy; apt-get install -y python-dev python-pip
    - pip install -r requirements.txt
    - pip install .
    - => some way here to get the proj1 wheel
    - pip install proj1-0.42-py2-none-any.whl
    - python setup.py test
Links related to our issue:
Allow access to build artifacts by using restricted access tokens https://gitlab.com/gitlab-org/gitlab-ce/issues/19628
"People need to be able to share links to artifacts based on a git ref (branch, tag, etc.), without knowing a specific build ID https://gitlab.com/gitlab-org/gitlab-ce/issues/4255
https://docs.gitlab.com/ce/api/ci/builds.html#upload-artifacts-to-build
download-the-artifacts-file https://docs.gitlab.com/ce/api/builds.html#download-the-artifacts-file https://gitlab.com/gitlab-org/gitlab-ce/issues/22957
You have a few ways you can do it:
1. Pass the object from the previous build using artifacts (works inside the same project only).
2. Build a Docker image with your packages pre-installed in a CI job, store it in the built-in registry, and use that image to run builds in your other projects.
3. Clone the repository.
I would advise passing it as an artifact, since then you will have exactly the build produced by the pipeline you are running. As for the cloning: AFAIK you don't need any workaround when cloning submodules, but for cloning other repositories I would go with an SSH deploy key, as it's connected to a repo and not to a user like the private token.
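For option 1 across projects specifically, a minimal sketch of what the proj12 job could look like (assumptions not in the question: API_TOKEN is a secret variable with read access to group/proj1, proj1's job is named mytest and ran on master, and your GitLab version exposes the "download artifacts by ref" API referenced in the links above):
# http://gitlab.me.com/group/proj12/.gitlab-ci.yml (sketch)
mytest12:
  script:
    - apt-get update -qy; apt-get install -y python-dev python-pip unzip curl
    # fetch the artifacts of proj1's "mytest" job from its latest master pipeline
    - 'curl -sL --header "PRIVATE-TOKEN: $API_TOKEN" -o proj1_artifacts.zip "http://gitlab.me.com/api/v4/projects/group%2Fproj1/jobs/artifacts/master/download?job=mytest"'
    - unzip proj1_artifacts.zip
    - pip install dist/proj1-0.42-py2-none-any.whl
    - pip install -r requirements.txt
    - pip install .
    - python setup.py test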
