How to create a Python executable using Docker

I have a working Python script, EIA.py, which extracts information from the EIA.gov website and downloads the necessary information into an Excel file in my laptop's C:/Python folder. However, when I build this script into a Docker image and run it with docker run, it gives me the following error.
FileNotFoundError: [Errno 2] No such file or directory: 'C:/Python/Sid.xls'
I am not adding any file; rather, Python should create an Excel file with the contents extracted from the website.
Following is my Dockerfile:
FROM python
VOLUME ["C:/Sid"]
WORKDIR /app
COPY . /app
RUN pip install EIA-python
RUN pip install requests
RUN pip install pandas
RUN pip install xlwt
RUN python /app/EIA.py
Following is my Python code:
import eia
import pandas as pd
api_key = "mykey"
api = eia.API(api_key)
series_storage = api.data_by_series(series='NG.NW2_EPG0_SWO_R48_BCF.W')
df1 = pd.DataFrame(series_storage)
df1.reset_index(inplace=True)
df1.columns = ['Date', 'Value']
df1['Date'] = pd.to_datetime(df1['Date'].str[:-3], format='%Y %m%d')
df1.to_excel("C:/Python/Sid.xls")

A container's filesystem is separate from your host's and is discarded when the container is removed, so C:/Python does not exist inside the (Linux) container; that is why the script raises FileNotFoundError. To save a file locally from a container, you can either bind mount a host folder or create a Docker volume. Docker volumes are the preferred mechanism for persisting data, as they are managed entirely by Docker itself. See the Docker volumes documentation for more info.
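Here is a minimal sketch of that approach, assuming the script is changed to write inside the container and the image is tagged eia (a hypothetical name). Note also that RUN executes at build time, before any volume or mount exists, so the script should be started with CMD so it runs when the container starts:
FROM python
WORKDIR /app
COPY . /app
RUN pip install EIA-python requests pandas xlwt
# run the script when the container starts, not at build time
CMD ["python", "/app/EIA.py"]
In EIA.py, write to a path that exists inside the container:
df1.to_excel("/data/Sid.xls")
Then bind mount the host folder onto that path, so the file lands in C:/Python on your laptop:
docker run -v C:/Python:/data eia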

Related

How to run Python inside an expressjs Docker container

I am trying to build a container for my Express.js application. The Express.js app makes use of Python via the npm package PythonShell.
I have plenty of Python code, which lives in a subfolder of my Express app, and with npm start everything works perfectly.
However, I am new to Docker and need to containerize the app. My Dockerfile looks like this:
FROM node:18
WORKDIR /usr/src/app
COPY package*.json ./
RUN npm install
COPY . .
EXPOSE 3001
CMD ["node", "./bin/www"]
I built the image with docker build . -t blahblah-server and ran it with docker run -p 8080:3001 -d blahblah-server.
I make use of imports at the top of the python-script like this:
import datetime
from pathlib import Path # Used for easier handling of auxiliary file's local path
import pyecma376_2 # The base library for Open Packaging Specifications. We will use the OPCCoreProperties class.
from assi import model
When the Python script is executed (only in the container!!!) I get the following error message:
/usr/src/app/public/javascripts/service/pythonService.js:12
if (err) throw err;
^
PythonShellError: ModuleNotFoundError: No module named 'pyecma376_2'
at PythonShell.parseError (/usr/src/app/node_modules/python-shell/index.js:295:21)
at terminateIfNeeded (/usr/src/app/node_modules/python-shell/index.js:190:32)
at ChildProcess.<anonymous> (/usr/src/app/node_modules/python-shell/index.js:182:13)
at ChildProcess.emit (node:events:537:28)
at ChildProcess._handle.onexit (node:internal/child_process:291:12)
----- Python Traceback -----
File "/usr/src/app/public/pythonscripts/myPython/wtf.py", line 6, in <module>
import pyecma376_2 # The base library for Open Packaging Specifications. We will use the OPCCoreProperties class. {
traceback: 'Traceback (most recent call last):\n' +
' File "/usr/src/app/public/pythonscripts/myPython/wtf.py", line 6, in <module>\n' +
' import pyecma376_2 # The base library for Open Packaging Specifications. We will use the OPCCoreProperties class.\n' +
"ModuleNotFoundError: No module named 'pyecma376_2'\n",
executable: 'python3',
options: null,
script: 'public/pythonscripts/myPython/wtf.py',
args: null,
exitCode: 1
}
If I comment the first three imports out, I get the same kind of error:
PythonShellError: ModuleNotFoundError: No module named 'assi'
Please note that assi actually comes from my own Python code, which is included in the Express app directory.
Python seems to be installed in the container correctly. I stepped inside the container via docker exec -it <container id> /bin/bash and the Python packages are there in the /usr/lib directory.
I really have absolutely no idea how all this works together and why Python can't find these modules...
You are trying to use libraries that are not in the Python standard library. It seems you are missing a pip install step when you build the Docker image.
Try adding RUN instructions to your Dockerfile that do this for you. Example:
RUN pip3 install pyecma376_2
RUN pip3 install /path/to/assi
Maybe that will solve your problem. Don't forget to check whether Python is actually installed in your container; it seems that it is. And if you have both python2 and python3 installed, make sure you use pip3 instead of plain pip.
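As a sketch, the Dockerfile from the question could install the Python dependencies at build time, assuming they are listed in a requirements.txt at the project root (a hypothetical file) and that pip is not already present in the node:18 image (on newer Debian-based images, pip3 install may additionally need the --break-system-packages flag):
FROM node:18
WORKDIR /usr/src/app
COPY package*.json ./
RUN npm install
# install pip for the Debian-based node image
RUN apt-get update && apt-get install -y python3-pip
COPY requirements.txt ./
RUN pip3 install -r requirements.txt
COPY . .
EXPOSE 3001
CMD ["node", "./bin/www"]
Since the whole app (including assi) is copied into /usr/src/app, the assi import should then resolve the same way it does outside the container.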

Is there a way to use Kaggle notebooks on the go?

I've been using the site Kaggle to take some courses on AI, but whenever I try to download one of the exercises and run the code within VS Code, it doesn't work. I always get an error like this:
<ipython-input-1-76a2777bc721> in <module>
1 # Set up feedback system
----> 2 from learntools.core import binder
3 binder.bind(globals())
4 from learntools.ethics.ex4 import *
5 import pandas as pd
ModuleNotFoundError: No module named 'learntools'
Is there any way to circumvent this error so I can use Kaggle notebooks on the go?
Step 1: Install Docker.
Follow this link to install Docker on your machine:
https://docs.docker.com/engine/install
Step 2: Download the relevant Docker image. In this case, use docker pull kaggle/python
Step 3: Launch a Docker container from the folder containing the notebook, using the command below.
docker run -v $PWD:/src -p 8888:8888 --rm -it kaggle/python jupyter notebook --no-browser --ip="0.0.0.0" --notebook-dir=/src
Step 4: Copy the URL from the terminal and paste it into your browser.
The URL looks something like this: http://127.0.0.1:8888/?token=xxxxxxxxx
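Note that $PWD only expands in a Unix-style shell. If you are on Windows (an assumption about your setup), the equivalent would be ${PWD} in PowerShell or %cd% in cmd, for example:
docker run -v ${PWD}:/src -p 8888:8888 --rm -it kaggle/python jupyter notebook --no-browser --ip="0.0.0.0" --notebook-dir=/src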

aws lambda - failed to find libmagic

I'm using the magic library in my Lambda function to determine a file's type.
I first deployed it to a local container to check that everything works.
My Dockerfile:
FROM lambci/lambda:build-python3.8
WORKDIR /app
RUN mkdir -p .aws
COPY requirements.txt ./
COPY credentials /app/.aws/
RUN mv /app/.aws/ ~/.aws/
RUN pip install --no-cache-dir -r requirements.txt
RUN pip install --no-cache-dir -r requirements.txt -t "/app/dependencies/"
WORKDIR /app/dependencies
RUN zip -r lambda.zip *
requirements.txt:
python-magic
libmagic
In my local container, when I ran tests on the Lambda logic, everything passed (including the part that uses the magic code).
I created a zip that contains the lambda.py code together with the Python dependencies (the last 3 lines in the Dockerfile).
When I upload the zip to AWS and test the Lambda, I get the following error:
{
"errorMessage": "Unable to import module 'lambda': failed to find libmagic. Check your installation",
"errorType": "Runtime.ImportModuleError"
}
As you can see, in my local container I'm using the base image lambci/lambda:build-python3.8, which should be the same one AWS uses when the Lambda launches.
I also tried adding python-magic-bin==0.4.14 to the requirements.txt instead of magic and libmagic, but that didn't help either; it seems that module is for Windows.
Into the lambda.zip I also put lambda.py, which is the file that contains my Lambda function:
import boto3
import urllib.parse
from io import BytesIO
import magic

def lambda_handler(event, context):
    s3 = boto3.client("s3")
    if event:
        print("Event : ", event)
        event_data = event["Records"][0]
        file_name = urllib.parse.unquote_plus(event_data['s3']['object']['key'])
        print("getting file: {}".format(file_name))
        bucket_name = event_data['s3']['bucket']['name']
        file_from_s3 = s3.get_object(Bucket=bucket_name, Key=file_name)
        file_obj = BytesIO(file_from_s3['Body'].read())
        print(magic.from_buffer(file_obj.read(2048)))
What am I doing wrong?
While using filetype as suggested by other answers is much simpler, that library does not detect as many file types as magic does.
You can make python-magic work on AWS Lambda with python3.8 by doing the following:
Add libmagic.so.1 to a lib folder at the root of the Lambda package. This lib folder is automatically added to LD_LIBRARY_PATH on AWS Lambda. The library can be found at /usr/lib64/libmagic.so.1 on an Amazon Linux EC2 instance, for example.
Create a magic file, or take the one available at /usr/share/misc/magic on an Amazon Linux EC2 instance, and add it to your Lambda package.
The Magic constructor from python-magic takes a magic_file argument. Make this point to your magic file: create the magic object with magic.Magic(magic_file='path_to_your_magic_file'), then call whatever python-magic function you need on that object.
These steps are not necessary on the python3.7 runtime, as those libraries are already present in AWS Lambda.
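A small sketch of that last step, assuming the magic database file was bundled at the root of the Lambda package under the hypothetical name magic.mgc:
import magic

# point python-magic at the bundled magic database instead of the system one
m = magic.Magic(magic_file='magic.mgc')

def lambda_handler(event, context):
    # illustrative payload: detect the type of an in-memory buffer
    print(m.from_buffer(b'%PDF-1.4 example bytes'))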
I didn't find a way to solve this issue, so I decided to use a different library called filetype.
An example of how to use it:
import gzip
from zipfile import ZipFile
import filetype

file_type = filetype.guess(file_object)
if file_type is not None:
    file_type = file_type.MIME
file_object.seek(0)
print("File type : {}".format(file_type))
if file_type == "application/gzip":
    file_binary_content = gzip.GzipFile(filename=None, mode='rb', fileobj=file_object).read()
elif file_type == "application/zip":
    zipfile = ZipFile(file_object)
    file_binary_content = zipfile.read(zipfile.namelist()[0])
Maybe someone will find this helpful. This is how I made my Lambda become friends with python-magic. I used a local Ubuntu machine and Python 3.7 to create the zip package.
First, create a directory. The name doesn't matter.
mkdir lambda-package
cd lambda-package
Then, install python-magic locally. I used pip3 for it.
pip3 install python-magic -t .
Now, in your lambda-package directory, create (or copy, if you already have some code in your Lambda) a lambda_function.py file. Be aware that Lambda expects this exact name. After this step you should have the following directory structure inside lambda-package:
lambda-package
│ lambda_function.py
│ magic/
│ python_magic-0.4.24.dist-info/
Now, zip the contents of this directory. Remember to zip only the contents, not the folder itself. So, from inside lambda-package run:
zip -r lambda-package.zip .
Final step. Create a new Lambda function from scratch. Make sure to choose a proper runtime. When your function is created, click Upload from -> choose Zip file and upload lambda-package.zip file.
Now you will be able to import python-magic normally, like this:
import magic
That's it!
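As a quick smoke test once the function is deployed, a call such as this (the path is hypothetical) should print a description of the file instead of raising the libmagic error:
print(magic.from_file('/tmp/example.bin'))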
P.S. The failed to find libmagic error appears under the Python 3.8 runtime; Python 3.7 works fine.
The simplest way to resolve this at the time of this writing:
Create a lib/ directory in the root of your Lambda package. If you're using a Lambda layer, create opt/lib/ instead.
Download the binary package from here; it will have a filename something like file-libs-5.11-36.amzn2.0.1.x86_64.rpm
Unarchive that package. (If you're on macOS, 7zip will work; it extracts a .cpio file, which you can then extract with the standard macOS unarchiver.)
Move the libmagic.so.1.0.0 file into the lib/ folder of your Lambda package (or the /opt/lib directory of the Lambda layer). DO NOT move the other file in the same folder named libmagic.so.1; it's a symlink, not a real file, and will not work.
Rename libmagic.so.1.0.0 to libmagic.so.1
Deploy
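On Linux, the extract-and-rename steps might look like this sketch (the rpm filename matches the example above; the usr/lib64/ location inside the rpm is an assumption based on where Amazon Linux installs the library):
mkdir -p lib
rpm2cpio file-libs-5.11-36.amzn2.0.1.x86_64.rpm | cpio -idmv
cp usr/lib64/libmagic.so.1.0.0 lib/libmagic.so.1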

Is it possible to write files to your machine (local drive) using GitLab CI?

I'm trying to run a Python script, using GitLab CI, which will create a Pandas dataframe and write this as a .csv file on my machine.
As a test script I've created the following do_stuff_2.py file:
import datetime
import pandas as pd
import numpy as np
current_time = datetime.datetime.now()
print(f'Hello.\nCurrent date/time is:{current_time}')
df = pd.DataFrame(np.random.randint(0,100,size=(10, 4)), columns=list('ABCD'))
print(df)
df.to_csv('C:\\<USER_PATH>\\Desktop\\df_out.csv', index = False)
This should:
print the current time
generate a dataframe with 10 rows and 4 columns which is randomly populated with values between 0 and 100
print said dataframe
save the dataframe to the local drive
When the CI pipeline is executed I get no errors and the first 3 steps run successfully.
I have a .gitlab-ci.yml file with the following:
stages:
  - build

build:
  stage: build
  image: python:3.6
  script:
    - echo "Running python..."
    - pip install -r requirements.txt
    - python do_stuff_2.py
and a requirements.txt file:
numpy
pandas
It looks like I've got everything set up correctly, as the time is being displayed and the print function returns the dataframe. However, no file is written to the specified location. When I run the script locally everything works as expected and the dataframe is saved on my desktop as df_out.csv.
I'm using Python 3.6, on a Windows 10 machine.
Is there an alternate way to do this from within a CI pipeline in GitLab?
You need to install a gitlab-runner on your local machine.
If you can't, you can use the artifacts: keyword to upload the result of your script to the GitLab server and download it afterwards from the UI. Your .gitlab-ci.yml will look like:
stages:
  - build

build:
  stage: build
  image: python:3.6
  script:
    - echo "Running python..."
    - pip install -r requirements.txt
    - python do_stuff_2.py
  artifacts:
    paths:
      - df_out.csv
and your code must change to:
df.to_csv('df_out.csv', index = False)
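As an optional extra (not part of the original answer), GitLab can also limit how long the artifact is kept via the expire_in keyword:
  artifacts:
    paths:
      - df_out.csv
    expire_in: 1 week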

AWS Lambda/Python3 can't import numpy

I have a python3.6 script that uses sqlalchemy, pandas and numpy. To get this working on AWS Lambda, I took the following steps.
Created a new, clean directory
Created a new virtualenv
Created a holding directory (mkdir dist)
Installed the packages: pip install sqlalchemy numpy pandas
Navigated to the packages: cd env/lib/python3.6/site-packages
Zipped the packages into the holding directory: zip -r path/dist/Transfer.zip .
Navigated back to the root
Zipped in the Python file: zip -g dist/Transfer.zip my_python.py
Uploaded the zip to S3
In Lambda: Configuration > Code entry type > Upload a file from S3 > path to my file
Set the Handler to my_python.lambda_handler
Saved and tested
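Consolidated as a shell sketch (paths are illustrative; the virtualenv is assumed to live in env/ at the project root):
python3.6 -m venv env
source env/bin/activate
mkdir dist
pip install sqlalchemy numpy pandas
cd env/lib/python3.6/site-packages
zip -r ../../../../dist/Transfer.zip .
cd ../../../../
zip -g dist/Transfer.zip my_python.py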
I always get the error
{
"errorMessage": "Unable to import module 'my_python'"
}
With the logs as
Unable to import module 'heap_consolidation_lambda': Missing required dependencies ['numpy']
Why can it not see numpy? Fwiw, numpy is the third import, so apparently it has no issues with sqlalchemy and pandas.
