I installed the gettyimages/spark Docker image and jupyter/pyspark-notebook on my machine.
However, the Python version in gettyimages/spark is 3.5.3 while the Python version in jupyter/pyspark-notebook is 3.7, so the following error appears:
Exception: Python in worker has different version 3.5 than that in
driver 3.7, PySpark cannot run with different minor versions.Please
check environment variables PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON
are correctly set.
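As the message indicates, what matters is whether the driver and the worker agree on the major and minor version. A minimal sketch of that comparison, assuming plain version strings as input (the helper names major_minor and versions_compatible are made up for illustration and are not part of PySpark):

```python
import sys


def major_minor(version):
    """Return the (major, minor) pair of a version string such as '3.5.3'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def versions_compatible(driver_version, worker_version):
    """True when driver and worker share the same major.minor version."""
    return major_minor(driver_version) == major_minor(worker_version)


# Versions reported in the question: notebook (driver) 3.7, Spark image (worker) 3.5.3.
print(versions_compatible("3.7", "3.5.3"))  # False: minor versions differ

# The interpreter running this snippet trivially matches itself.
current = "{}.{}.{}".format(*sys.version_info[:3])
print(versions_compatible(current, current))  # True
```

Patch-level differences (for example 3.7.1 against 3.7.3) pass this check; only the first two components are compared.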
So I have tried either to upgrade the Python version of the gettyimages/spark image or to downgrade the Python version of the jupyter/pyspark-notebook image.
Method 1, downgrading Python in jupyter/pyspark-notebook:
I used conda install python=3.5 to downgrade Python in the jupyter/pyspark-notebook image. However, after doing so, my Jupyter notebook cannot connect to any .ipynb file and the kernel seems dead. Also, when I type conda again, it reports that the conda command is not found, although the python terminal still works.
I compared sys.path after the downgrade (first list, Python 3.5) and before it (second list, Python 3.7):
['', '/usr/local/spark/python',
'/usr/local/spark/python/lib/py4j-0.10.7-src.zip',
'/opt/conda/lib/python35.zip', '/opt/conda/lib/python3.5',
'/opt/conda/lib/python3.5/plat-linux',
'/opt/conda/lib/python3.5/lib-dynload',
'/opt/conda/lib/python3.5/site-packages']
['', '/usr/local/spark/python',
'/usr/local/spark/python/lib/py4j-0.10.7-src.zip',
'/opt/conda/lib/python37.zip', '/opt/conda/lib/python3.7',
'/opt/conda/lib/python3.7/lib-dynload',
'/opt/conda/lib/python3.7/site-packages']
I think it is more or less correct. So why can't my Jupyter notebook connect to the kernel?
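One way to compare the two lists more systematically is to mask the version number and diff the entries. A minimal sketch, using the two sys.path listings copied from above; the normalize helper and the "pythonX" placeholder are illustrative choices only:

```python
import re

# sys.path after the downgrade (Python 3.5), as posted above.
path_35 = [
    '', '/usr/local/spark/python',
    '/usr/local/spark/python/lib/py4j-0.10.7-src.zip',
    '/opt/conda/lib/python35.zip', '/opt/conda/lib/python3.5',
    '/opt/conda/lib/python3.5/plat-linux',
    '/opt/conda/lib/python3.5/lib-dynload',
    '/opt/conda/lib/python3.5/site-packages',
]

# sys.path before the downgrade (Python 3.7), as posted above.
path_37 = [
    '', '/usr/local/spark/python',
    '/usr/local/spark/python/lib/py4j-0.10.7-src.zip',
    '/opt/conda/lib/python37.zip', '/opt/conda/lib/python3.7',
    '/opt/conda/lib/python3.7/lib-dynload',
    '/opt/conda/lib/python3.7/site-packages',
]


def normalize(entry):
    """Replace the interpreter version in a path entry with a placeholder."""
    return re.sub(r"python3\.?\d", "pythonX", entry)


norm_35 = [normalize(p) for p in path_35]
norm_37 = [normalize(p) for p in path_37]

# Entries that appear in one list but not in the other.
only_in_35 = [p for p in norm_35 if p not in norm_37]
only_in_37 = [p for p in norm_37 if p not in norm_35]

print(only_in_35)  # ['/opt/conda/lib/pythonX/plat-linux']
print(only_in_37)  # []
```

Apart from the plat-linux entry, the two search paths have the same layout, which suggests (but does not prove) that sys.path is not what breaks the kernel.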
So I tried the other method, upgrading Python in the gettyimages/spark image:
sudo docker run -it gettyimages/spark:2.4.1-hadoop-3.0 apt-get install python3.7.3 ; python3 -v
However, even after doing so, Spark does not run properly.
I am not sure what to do. Could you share how to modify the version of a package inside a Docker image?
Looking at the Dockerfile here, it installs python3, which on debian:stretch defaults to Python 3.5. You can install Python 3.7 instead by editing the Dockerfile and building the image yourself. In the Dockerfile, remove lines 19-25 and replace line 1 with the following, then build the image locally.
FROM python:3.7-stretch
If you are not familiar with building your own image, download the Dockerfile and keep it in its own standalone directory. Then cd into that directory and run the command below. You may want to remove the already downloaded image first. After this, you should be able to run other docker commands the same way as if you had pulled the image from Docker Hub.
docker build -t gettyimages/spark .
I'm trying to install the Anaconda distribution in a Windows Docker image based on Nano Server.
This is my Dockerfile (based on a 4-year-old example I found on GitHub):
# escape= `
# Use the latest Windows Server Core 2022 image.
FROM mcr.microsoft.com/windows/servercore:ltsc2022 AS base
RUN powershell (New-Object System.Net.WebClient).DownloadFile('https://repo.anaconda.com/archive/Anaconda3-2022.05-Windows-x86_64.exe', 'Anaconda3.exe')
RUN powershell Unblock-File -Path Anaconda3.exe
RUN Anaconda3.exe /InstallationType=JustMe /RegisterPython=1 /S /D=C:\Python
FROM mcr.microsoft.com/windows/nanoserver:ltsc2022-amd64
COPY --from=base C:\Python C:\Python
ENV PATH="C:\Python;C:\Python\Library\mingw-w64\bin;C:\Python\Library\usr\bin;C:\Python\Library\bin;C:\Python\Scripts;C:\Python\bin;C:\Python\condabin;C:\Windows;C:\Windows\System32;"
CMD ["cmd"]
The image looks OK and I can launch Python, but once I try to import numpy or sklearn I get errors (some DLLs fail to load). I could make numpy work by reinstalling it (pip install --force-reinstall numpy), but this workaround didn't work for other libraries.
I also tried installing only Miniconda and then using conda install to add libraries, but running conda in the nanoserver container fails (pythoncom error: coinitializeex failed).
A similar image built on top of Server Core (servercore:ltsc2022) works flawlessly.
The reason I am trying to get Anaconda working on nanoserver is to minimize the image size: the servercore-based image is about 7 GB larger than the nanoserver-based one.
Is it possible to run Anaconda on Nano Server?
thanks,
Adi
I am trying to create a standalone Python app with PyInstaller that runs on Windows, Linux, and Mac and can be rebuilt reproducibly on each.
The idea is to use Docker to provide a fixed environment that builds the app and exports it. For Linux and Windows, I got this working using https://github.com/cdrx/docker-pyinstaller.
Docker should build a fully functioning PyInstaller app (with GUI) inside the container and export it. Since the PyInstaller result depends on package versions etc., these should be pinned in the image, and only the new source code should be supplied to compile and export a new version of the software.
In the ideal scenario (and how it already works for Linux and Windows), users can build the image and compile the software themselves:
docker build -t docker_pyinstaller_linux https://raw.githubusercontent.com/loipf/calimera_docker_export/main/linux/Dockerfile
docker run --rm -v "/path/to/app_source_code:/code/" -v "${PWD}:/code/dist/" docker_pyinstaller_linux
For Mac, however, there is no simple, straightforward solution. There is one macOS Docker image available, https://github.com/sickcodes/Docker-OSX, but its Dockerfile is not that simple.
My idea:
Take https://github.com/sickcodes/Docker-OSX/blob/master/Dockerfile.auto
and append the Miniconda download to the end:
RUN chmod -R 777 /Users/user
### install miniconda to /miniconda
RUN curl -LO "http://repo.continuum.io/miniconda/Miniconda3-4.4.10-MacOSX-x86_64.sh"
RUN bash Miniconda3-4.4.10-MacOSX-x86_64.sh -p /miniconda -b
ENV PATH=/miniconda/bin:${PATH}
RUN conda update -y conda
### install packages from conda
RUN conda install -c anaconda -y python=3.8
RUN apt-get install -y python3-pyqt5
...
But already the first command fails with "chmod: cannot access '/Users/user': No such file or directory", since the build happens in a different "environment" than the interactive session (/home/arch/OSX-KVM). Can somebody help me out?
I know that questions asking for code are not the best, and I could show all the things I tried, but I doubt they would help anybody. I would like a minimal Mac example without user login, GUI, etc. (which should be possible using https://github.com/sickcodes/osx-optimizer). It should only have Miniconda installed (with PyInstaller and a few packages).
Other info:
I can run the commands above interactively in the Mac environment inside the container, but I would like to bake them permanently into the image:
docker pull sickcodes/docker-osx:auto ### replace with docker with pyinstaller and packages
docker run -it \
--device /dev/kvm \
-p 50922:10022 \
sickcodes/docker-osx:auto
Ideally, in the end, we can run the above command with OSX_commands pyinstaller file.spec
Related questions, but without a solution:
Using Docker and Pyinstaller to distribute my application
How to create OS X app with Python on Windows
First of all, I'm new to the Python environment.
I installed pip on my Mac using the terminal; the Python version is 2.7.16. I want to install and use Simple Image Annotator (https://github.com/sgp715/simple_image_annotator).
I installed Flask (pip install flask) and did cd into the cloned simple_image_annotator folder. For reference, please see the installation steps in the GitHub repo.
After that I don't know what to do to use this library. If I run the command python app.py /images/directory, it gives No files.
How can I install and use Simple Image Annotator on a Mac? Can somebody please tell me the setup steps?
Working with Python libraries is entirely new to me. Thanks in advance.
BELOW IS EDITED QUESTION FOR MORE EXPLANATION:
I installed Flask and moved into the cloned folder; please see the terminal commands in the screenshot.
Next remaining installation steps are:
start the app
$ python app.py /images/directory
you can also specify the file you would like the annotations output to (out.csv is the default)
$ python app.py /images/directory --out test.csv
open http://127.0.0.1:5000/tagger in your browser
only tested on Chrome
I don't know how to execute the remaining steps. The images are in the Downloads/images folder on my Mac.
How do I execute these commands?
Screenshots to explain more:
The problem may come from your Python version.
Try with python3 (if installed):
python3 app.py /images/directory
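It may also be worth confirming that the directory passed to app.py actually contains images, since /images/directory in the README is only a placeholder. A minimal sketch, assuming the images live in ~/Downloads/images as described in the question; the extension list and the helper name list_images are assumptions and are not taken from the annotator's code:

```python
import os

# Assumed set of image extensions; the annotator may accept others.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif", ".bmp")


def list_images(directory):
    """Return the sorted image file names in a directory, or [] if it is missing."""
    directory = os.path.expanduser(directory)
    if not os.path.isdir(directory):
        return []
    return sorted(
        name for name in os.listdir(directory)
        if name.lower().endswith(IMAGE_EXTENSIONS)
    )


# Path taken from the question; adjust it to wherever the images really are.
images = list_images("~/Downloads/images")
print("found {} image(s)".format(len(images)))
```

If this reports at least one image, the same directory can be passed to the app in place of /images/directory.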
I have a RHEL host with Docker installed; it has the default Python 2.7. My Python scripts need a few more modules, which I can't install due to lack of sudo access. Moreover, I don't want to mess with the default Python, which the host needs in order to function.
So I am trying to get Python in a Docker container where I can add a few modules and do the needful.
Issue: the RHEL host with Docker installed is not connected to the internet and cannot be connected.
My laptop doesn't have Docker either, and I can't install it there (no admin access) to create the Docker image and copy it to the RHEL host.
I was hoping that if a Docker image with Python can be downloaded from the internet, I might be able to use it as is.
Any pointers in an appropriate direction would be appreciated.
What I have done: searched for Python images and been through the Docker documentation on creating an image.
Apologies if the question sounds silly; I am getting better with Docker over time :)
If your environment is restricted enough that you can't use sudo to install packages, you won't be able to use Docker: if you can run any docker run command at all you can trivially get unrestricted root access on the host.
My python scripts needs a bit more modules which I can't install due to lack of sudo access & moreover, I dont want to screw up with the default Py which is needed for host to function.
That sounds like a perfect use for a virtual environment: it gives you an isolated local package tree that you can install into as an unprivileged user and doesn't interfere with the system Python. For Python 2 you need a separate tool for it, with a couple of steps to install:
export PYTHONUSERBASE=$HOME
pip install --user virtualenv
~/bin/virtualenv vpy
. vpy/bin/activate
pip install ... # installs into vpy/lib/python2.7/site-packages
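To confirm that the activated environment is the one in use, a small check can be run with the python inside it. This is a best-effort sketch; the function name in_virtual_environment is made up for illustration:

```python
import sys


def in_virtual_environment():
    """Best-effort check for an active virtualenv or venv."""
    # virtualenv on Python 2 sets sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # On Python 3, sys.base_prefix differs from sys.prefix inside an environment.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


# Inside the environment, sys.prefix should point at the vpy directory.
print(sys.prefix)
print(in_virtual_environment())
```

Run after . vpy/bin/activate, this should print the path of the vpy directory followed by True; outside the environment it should print the system prefix followed by False.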
You can create a Docker image on any standalone machine and push the final image to a Docker registry (Docker Hub). Then you can pull that image on your laptop and start working :)
Below are some key commands required for this.
To create an image, you will need to write a Dockerfile that installs all the packages.
Alternatively, you can run sudo docker run -it ubuntu:16.04 and then install Python and other packages as required,
then sudo docker commit container_id name
sudo docker tag SOURCE_IMAGE[:TAG] TARGET_IMAGE[:TAG]
sudo docker push IMAGE_NAME
Then pull this image on your laptop and start working.
You can refer to this link for more docker commands https://github.com/akasranjan005/docker-k8s/blob/master/docker/basic-commands.md
Hope this helps. Thanks
Currently using AWS to run some tests on a machine learning project. I would like to run Python scripts without internet (via root) because the internet bandwidth is extremely limited. I try to run the convnets.py script by doing
sudo python convnets.py >> output
But that does not work, as Anaconda does not use PYTHONPATH, making it impossible for root to find the Anaconda Python environment. So errors like "cannot import" and "module not found" are thrown.
How do I set this up so I can get Anaconda and sudo to play fair together?
Because using sudo uses a different PATH than your typical environment, you need to be sure to specify that you want to use Anaconda's python interpreter rather than the system python. You can check which one is being run with the following command
sudo which python
To fix this, and point to Anaconda's python interpreter, specify the full path to the correct interpreter.
sudo /path/to/anaconda/bin/python convnets.py >> output
If you do this, you should be able to access all of the modules managed by anaconda.
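To see which interpreter a given command actually starts, and whether it can see a particular module, a short script can be run both with and without sudo. A minimal sketch; the helper names and the choice of numpy as the module to probe are assumptions for illustration:

```python
import sys


def interpreter_info():
    """Return the path and prefix of the interpreter running this script."""
    return {"executable": sys.executable, "prefix": sys.prefix}


def can_import(module_name):
    """True if the named module can be imported by this interpreter."""
    try:
        __import__(module_name)
        return True
    except ImportError:
        return False


info = interpreter_info()
print("interpreter: {}".format(info["executable"]))
print("prefix: {}".format(info["prefix"]))
# numpy is only an example of a package that Anaconda typically provides.
print("numpy importable: {}".format(can_import("numpy")))
```

Saving this as a hypothetical check_env.py and running it once as python check_env.py and once as sudo python check_env.py should show whether the two commands resolve to different interpreters.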
On the other hand, if you have an Anaconda environment created
conda create --name $ENVIRONMENT_NAME python
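You can activate it inside a root shell before running your command. Because source is a shell builtin, sudo cannot run it directly, so wrap both steps in bash -c (/path/to/anaconda is the same placeholder as above):
sudo bash -c "source /path/to/anaconda/bin/activate $ENVIRONMENT_NAME && python convnets.py" >> output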