I want to use "Select a script to run after creation" when I create a notebook instance in GCP.
Specifically, I want to use it to install python packages.
What kind of script (extension and contents) do I need to write?
Here is an example of a post-startup script that installs Voila.
Save the script in a GCS bucket and, when creating the notebook, point the instance at that path, for example:
gcloud notebooks instances create nb-1 \
  --vm-image-project=deeplearning-platform-release \
  --vm-image-family=tf2-latest-cpu \
  --metadata=post-startup-script=gs://ai-platform-notebooks-tools/install-voila.sh \
  --location=us-central1-a
Script contents:
#!/bin/bash -eu
# Installs Voila in an AI Platform Notebook

function install_voila() {
  echo 'Installing voila...'
  /opt/conda/condabin/conda install -y -c conda-forge ipywidgets ipyvolume bqplot scipy
  /opt/conda/condabin/conda install -y -c conda-forge voila
  /opt/conda/bin/jupyter lab build
  systemctl restart jupyter.service || echo 'Error restarting jupyter.service.'
}

function download_samples() {
  echo 'Downloading samples...'
  cd /home/jupyter
  git clone https://github.com/voila-dashboards/voila
}

function main() {
  install_voila || echo 'Error installing voila.'
  download_samples || echo 'Error downloading voila samples.'
}

main
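Since the stated goal is installing Python packages, the post-startup script can be even shorter and just call pip. A minimal sketch (the package names, script name, and bucket are placeholders, and the /opt/conda/bin/pip path is an assumption mirroring the conda and jupyter paths used above):
#!/bin/bash -eu
# Post-startup script: install extra Python packages into the instance's conda environment.
# Replace the example packages with whatever you need.
/opt/conda/bin/pip install pandas scikit-learn
systemctl restart jupyter.service || echo 'Error restarting jupyter.service.'
Upload it with, for example, gsutil cp install-packages.sh gs://YOUR_BUCKET/install-packages.sh and pass that gs:// path in the post-startup-script metadata as shown above. The example uses a .sh file with a bash shebang, which covers the "extension and contents" part of the question.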
I want to run a shell script in a conda environment, but it shows errors like
./run_augment_data.sh: 9: python: not found
but when I type
type python python3
the shell gives me existing paths:
python is /home/rd142857/anaconda3/envs/test_env/bin/python
python3 is /home/rd142857/anaconda3/envs/test_env/bin/python3
I tried changing python to python3; the above error disappears, but then the new error is
/usr/bin/python3: Error while finding module specification for 'torch.distributed.launch' (ModuleNotFoundError: No module named 'torch')
I notice that the python the script wants to use is not the python in my conda environment, so I added the following line to the top of the script:
#!/home/rd142857/anaconda3/envs/test_env/bin/python
and then re-ran the script; the new error is
File "/home/rd142857/grappa/grappa/./run_augment_data.sh", line 6
rm -r $LOGDIR
^
SyntaxError: invalid syntax
I really don't know what to do now.
The full content of the shell script is
#export NGPU=2;
#CUDA_VISIBLE_DEVICES=0,1 python -u -m torch.distributed.launch --nproc_per_node=$NGPU finetuning_roberta.py --train_corpus data/augment_data.txt \
LOGDIR="grappa_logs_checkpoints/ssp/"
rm -r $LOGDIR
mkdir $LOGDIR
export NGPU=4;
python -u -m torch.distributed.launch --nproc_per_node=$NGPU finetuning_roberta.py --train_corpus data/augment_data.txt \
    --eval_corpus data/spider_dev_data_v2.txt \
    --train_eval_corpus data/spider_train_data_small_v2.txt \
    --bert_model roberta-large \
    --output_dir $LOGDIR/ \
    --do_train \
    --do_eval \
    --train_batch_size 12 \
    --max_seq_length 218 \
    --num_train_epochs 10 \
    > $LOGDIR/log.out
The rm -r $LOGDIR line implies that your script is a shell script, not a Python one, so the shebang (the first line of the script) should be something like
run_augment_data.sh
#!/bin/bash -l
When trying to use a Python interpreter from a specific Conda environment, I recommend using conda run. Something like
conda run -n test_env python script.py
See conda run --help.
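Applied to this case, the whole script can be run through the environment so that every python call inside it resolves to that environment's interpreter. A minimal sketch, assuming test_env is the environment that has torch installed:
# run the entire shell script inside the test_env environment
conda run -n test_env bash ./run_augment_data.sh
This avoids hard-coding the interpreter path in the script itself.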
I am trying to synthesize a CDK app (TypeScript) which has some Python Lambda functions.
I am using PythonFunction so that a requirements.txt file installs the external dependencies. I am running VS Code on WSL, and I am encountering the following error.
Bundling asset Test/test-lambda-stack/test-subscriber-data-validator-poc/Code/Stage...
node:internal/fs/utils:347
throw err;
^
Error: ENOENT: no such file or directory, open '~/.nvm/versions/node/v16.17.0/lib/node_modules/docker/node_modules/highlight.js/styles/cp -rTL /asset-input/ /asset-output && cd /asset-output && python -m pip install -r requirements.txt -t /asset-output.css'
at Object.openSync (node:fs:594:3)
at Object.readFileSync (node:fs:462:35)
at module.exports (~/.nvm/versions/node/v16.17.0/lib/node_modules/docker/src/getColourScheme.js:47:26)
at ~/.nvm/versions/node/v16.17.0/lib/node_modules/docker/src/docker.js:809:47
at FSReqCallback.readFileAfterClose [as oncomplete] (node:internal/fs/read_file_context:68:3)
at FSReqCallback.callbackTrampoline (node:internal/async_hooks:130:17) {
errno: -2,
syscall: 'open',
code: 'ENOENT',
path: '~/.nvm/versions/node/v16.17.0/lib/node_modules/docker/node_modules/highlight.js/styles/cp -rTL /asset-input/ /asset-output && cd /asset-output && python -m pip install -r requirements.txt -t /asset-output.css'
}
Error: Failed to bundle asset Test/test-lambda-stack/test-subscriber-data-validator-poc/Code/Stage, bundle output is located at ~/Code/AWS/CDK/test-dev-poc/cdk.out/asset.6b577fe604573a3b53e635f09f768df3f87ad6651b18e9f628c2a086a525bb49-error: Error: docker exited with status 1
at AssetStaging.bundle (~/Code/AWS/CDK/test-dev-poc/node_modules/aws-cdk-lib/core/lib/asset-staging.js:2:614)
at AssetStaging.stageByBundling (~/Code/AWS/CDK/test-dev-poc/node_modules/aws-cdk-lib/core/lib/asset-staging.js:1:4506)
at stageThisAsset (~/Code/AWS/CDK/test-dev-poc/node_modules/aws-cdk-lib/core/lib/asset-staging.js:1:1867)
at Cache.obtain (~/Code/AWS/CDK/test-dev-poc/node_modules/aws-cdk-lib/core/lib/private/cache.js:1:242)
at new AssetStaging (~/Code/AWS/CDK/test-dev-poc/node_modules/aws-cdk-lib/core/lib/asset-staging.js:1:2262)
at new Asset (~/Code/AWS/CDK/test-dev-poc/node_modules/aws-cdk-lib/aws-s3-assets/lib/asset.js:1:736)
at AssetCode.bind (~/Code/AWS/CDK/test-dev-poc/node_modules/aws-cdk-lib/aws-lambda/lib/code.js:1:4628)
at new Function (~/Code/AWS/CDK/test-dev-poc/node_modules/aws-cdk-lib/aws-lambda/lib/function.js:1:2803)
at new PythonFunction (~/Code/AWS/CDK/test-dev-poc/node_modules/@aws-cdk/aws-lambda-python-alpha/lib/function.ts:73:5)
at new lambdaInfraStack (~/Code/AWS/CDK/test-dev-poc/lib/serviceInfraStacks/lambda-infra-stack.ts:24:40)
My requirements.txt file looks like this
attrs==22.1.0
jsonschema==4.16.0
pyrsistent==0.18.1
My CDK code is this:
new PythonFunction(this, `${appName}-subscriber-data-validator-${stage}`, {
  runtime: Runtime.PYTHON_3_9,
  entry: join('lambdas/subscriber_data_validator'),
  handler: 'lambda_hander',
  index: 'subscriber_data_validator.py'
});
Do I need to install anything additional? I have esbuild installed as a devDependency. I am having a really hard time getting this to work. Any help is appreciated.
The problem is related to using the LibreOffice headless converter to automatically convert uploaded files. I am getting this error:
LibreOffice 7 fatal error - Application cannot be started
Ubuntu ver: 21.04
What I have tried:
Get the file from Azure Blob storage,
put it into BASE_DIR/input_files,
convert it to PDF with a Linux command that I run via subprocess,
and put the result into the BASE_DIR/output_files folder.
Below is my code. I am installing LibreOffice into the Docker image this way:
RUN apt-get update \
&& ACCEPT_EULA=Y apt-get -y install LibreOffice
The main logic:
blob_client = container_client.get_blob_client(f"Folder_with_reports/")
with open(os.path.join(BASE_DIR, f"input_files/{filename}"), "wb") as source_file:
    source_file.write(data)

source_file = os.path.join(BASE_DIR, f"input_files/{filename}")  # original docs here
output_folder = os.path.join(BASE_DIR, "output_files")  # pdf files will be here

# assign the command of converting files through LibreOffice
command = rf"lowriter --headless --convert-to pdf {source_file} --outdir {output_folder}"
# running the command
subprocess.run(command, shell=True)

# reading the file and uploading it back to Azure Storage
with open(os.path.join(BASE_DIR, f"output_files/MyFile.pdf"), "rb") as outp_file:
    outp_data = outp_file.read()

blob_name_ = f"test"
container_client.upload_blob(name=blob_name_, data=outp_data, blob_type="BlockBlob")
Should I install lowriter instead of LibreOffice? Is it okay to use BASE_DIR for this kind of operation? I would appreciate any suggestions.
Partial solution:
Here I have simplified the case and created an additional Docker image with the Dockerfile below.
I tried both methods: unoconv and direct conversion through libreoffice.
Dockerfile:
FROM ubuntu:21.04
ENV DEBIAN_FRONTEND noninteractive
RUN apt-get update && apt-get -y upgrade && \
    apt-get -y install python3.10 && \
    apt update && apt install python3-pip -y
# Method 1 - installing LibreOffice and java
RUN apt-get --no-install-recommends install libreoffice -y
RUN apt-get install -y libreoffice-java-common
# Method 2 - additionally installing unoconv
RUN apt-get install -y unoconv
ARG CACHEBUST=1
ADD BASE.py /code/BASE.py
# copying input doc/docx files to the docker's linux
COPY /input_files /code/input_files
CMD ["/code/BASE.py"]
ENTRYPOINT ["python3"]
BASE.py
import os
import subprocess

BASE_DIR = "/code"
# subprocess.run("ls /code/input_files", shell=True)

for filename in os.listdir("/code/input_files"):
    source_file = f"/code/input_files/{filename}"  # original document
    output_filename = os.path.splitext(filename)[0] + ".pdf"
    output_file = f"/code/output_files/{output_filename}"
    output_folder = "/code/output_files"  # pdf files will be here

    # METHOD 1 - LibreOffice directly
    # assign the command for converting files through LibreOffice, then run it
    convert_to_pdf = rf"libreoffice --headless --convert-to pdf {source_file} --outdir {output_folder}"
    subprocess.run(convert_to_pdf, shell=True)
    subprocess.run(r"ls /code/output_files/", shell=True)

    ## METHOD 2 - Using unoconv - also working
    # convert_to_pdf = f"unoconv -f pdf {source_file}"
    # subprocess.run(convert_to_pdf, shell=True)
    # print(f'file {filename} converted')
The methods above work when the files are already in the Linux filesystem at build time, but I still haven't found a way to write files into the filesystem after the Docker image is built.
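One standard Docker technique for that (not part of the original setup, so the image name and host paths below are assumptions) is to bind-mount host directories into the container at run time, so new uploads and the converted PDFs live on the host rather than inside the image:
# mount host folders over the paths BASE.py already uses; whatever the
# container writes to /code/output_files shows up in ./output_files on the host
docker run --rm \
  -v "$(pwd)/input_files:/code/input_files" \
  -v "$(pwd)/output_files:/code/output_files" \
  my-libreoffice-image
With a bind mount, the COPY /input_files step in the Dockerfile is no longer needed, because the files do not have to be baked into the image.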
I am trying to launch JupyterLab on a VS Code remote server, containerized with Docker, but I get an error saying
Unable to start session for kernel Python 3.8.5 64-bit. Select another kernel to launch with.
I set up a Dockerfile and .devcontainer.json in the workspace directory.
Do I also need a docker-compose.yaml file for JupyterLab settings such as port forwarding?
Or can the .devcontainer.json file handle that and replace the docker-compose file?
Dockerfile:
FROM python:3.8
RUN apt-get update --fix-missing && apt-get upgrade -y
# Set Japanese UTF-8 as locale so Japanese can be used
RUN apt-get install -y locales \
&& locale-gen ja_JP.UTF-8
ENV LANG ja_JP.UTF-8
ENV LANGUAGE ja_JP:ja
ENV LC_ALL ja_JP.UTF-8
# RUN apt-get install zsh -y && \
# chsh -s /usr/bin/zsh
# Install zsh with theme and some plugins
RUN sh -c "$(wget -O- https://raw.githubusercontent.com/deluan/zsh-in-docker/master/zsh-in-docker.sh)" \
-t mrtazz \
-p git -p ssh-agent
RUN pip install jupyterlab
RUN jupyter serverextension enable --py jupyterlab
WORKDIR /app
CMD ["bash"]
.devcontainer.json
{
  "name": "Python 3.8",
  "build": {
    "dockerfile": "Dockerfile",
    "context": ".."
  },
  // Uncomment to use docker-compose
  // "dockerComposeFile": "docker-compose.yml",
  // "service": "dev",
  // Set *default* container specific settings.json values on container create.
  "settings": {
    "terminal.integrated.shell.linux": "/bin/bash",
    "python.pythonPath": "/usr/local/bin/python",
    "python.linting.enabled": true,
    "python.linting.pylintEnabled": true,
    "python.formatting.autopep8Path": "/usr/local/py-utils/bin/autopep8",
    "python.formatting.blackPath": "/usr/local/py-utils/bin/black",
    "python.formatting.yapfPath": "/usr/local/py-utils/bin/yapf",
    "python.linting.banditPath": "/usr/local/py-utils/bin/bandit",
    "python.linting.flake8Path": "/usr/local/py-utils/bin/flake8",
    "python.linting.mypyPath": "/usr/local/py-utils/bin/mypy",
    "python.linting.pycodestylePath": "/usr/local/py-utils/bin/pycodestyle",
    "python.linting.pydocstylePath": "/usr/local/py-utils/bin/pydocstyle",
    "python.linting.pylintPath": "/usr/local/py-utils/bin/pylint"
  },
  // Add the IDs of extensions you want installed when the container is created.
  "extensions": [
    "ms-python.python",
    "teabyii.ayu",
    "jeff-hykin.better-dockerfile-syntax",
    "coenraads.bracket-pair-colorizer-2",
    "file-icons.file-icons",
    "emilast.logfilehighlighter",
    "zhuangtongfa.material-theme",
    "ibm.output-colorizer",
    "wayou.vscode-todo-highlight",
    "atishay-jain.all-autocomplete",
    "amazonwebservices.aws-toolkit-vscode",
    "hookyqr.beautify",
    "phplasma.csv-to-table",
    "alefragnani.bookmarks",
    "mrmlnc.vscode-duplicate",
    "tombonnike.vscode-status-bar-format-toggle",
    "donjayamanne.githistory",
    "codezombiech.gitignore",
    "eamodio.gitlens",
    "zainchen.json",
    "ritwickdey.liveserver",
    "yzhang.markdown-all-in-one",
    "pkief.markdown-checkbox",
    "shd101wyy.markdown-preview-enhanced",
    "ionutvmi.path-autocomplete",
    "esbenp.prettier-vscode",
    "diogonolasco.pyinit",
    "ms-python.vscode-pylance",
    "njpwerner.autodocstring",
    "kevinrose.vsc-python-indent",
    "mechatroner.rainbow-csv",
    "msrvida.vscode-sanddance",
    "rafamel.subtle-brackets",
    "formulahendry.terminal",
    "tyriar.terminal-tabs",
    "redhat.vscode-yaml"
  ],
  // Use 'forwardPorts' to make a list of ports inside the container available locally.
  "forwardPorts": [8888],
  // Use 'postCreateCommand' to run commands after the container is created.
  // "postCreateCommand": "pip3 install -r requirements.txt",
  // Comment out to connect as root instead.
  // "remoteUser": "myname",
  "shutdownAction": "none"
}
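As a side note (this is not part of the original configuration), if JupyterLab is started manually inside the container rather than through the VS Code kernel picker, it has to listen on all interfaces for the forwarded port 8888 above to be reachable from the host. A minimal sketch:
# run inside the container; 8888 matches the forwardPorts entry above
jupyter lab --ip=0.0.0.0 --port=8888 --no-browser --allow-root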
I'm trying to set up my Mac OS X system to use the pdblp Python library, which requires me to first install the Bloomberg Open API library for Python. After cloning the git repo and running python setup.py install, I get
File "setup.py", line 20, in <module>
raise Exception("BLPAPI_ROOT environment variable isn't defined")
Exception: BLPAPI_ROOT environment variable isn't defined
How should I proceed?
Just to complete the answer (thanks, mob):
Package source - https://www.bloomberglabs.com/api/libraries/
Preparation: download the SDK for C/C++ and the SDK for Python.
Instructions:
# navigate to the path where you want to keep the SDK for some time
cd /Users/msam/
# unzip the C/C++ package
tar zxvf Downloads/blpapi_cpp_3.8.1.1-darwin.tar.gz
# set the variables
export BLPAPI_ROOT=/some/directory/blpapi_cpp_3.8.1.1/
export DYLD_LIBRARY_PATH=/Users/sampathkumarm/blpapi_cpp_3.8.1.1/Darwin/
# save the variables to reuse in the next session
echo >> ~/.bash_profile
echo "# Bloomberg API (Python) library settings" >> ~/.bash_profile
echo "export BLPAPI_ROOT=/some/directory/blpapi_cpp_3.8.1.1/" >> ~/.bash_profile
echo "export DYLD_LIBRARY_PATH=/Users/sampathkumarm/blpapi_cpp_3.8.1.1/Darwin/" >> ~/.bash_profile
echo >> ~/.bash_profile
Ref: python blpapi installation error
You also need to install the C/C++ libraries and then set BLPAPI_ROOT to the location of the libblpapi3_32.so or libblpapi3_64.so files. For example:
cd /some/directory
wget https://bloomberg.bintray.com/BLPAPI-Experimental-Generic/blpapi_cpp_3.8.1.1-darwin.tar.gz
tar zxvf blpapi_cpp_3.8.1.1-darwin.tar.gz
# point BLPAPI_ROOT at whichever of these directories contains the libblpapi3_*.so files:
export BLPAPI_ROOT=/some/directory/blpapi_cpp_3.8.1.1/Darwin
# or
export BLPAPI_ROOT=/some/directory/blpapi_cpp_3.8.1.1
Then you can proceed with installing the Python library.
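For completeness, the final step would look roughly like this (the clone directory is a placeholder; the question already uses python setup.py install for the Bloomberg bindings):
# with BLPAPI_ROOT exported as above, install the Python bindings from the cloned repo
cd /path/to/blpapi-python
python setup.py install
# then install the pdblp wrapper
pip install pdblp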