I want to override the args of a model in the "comp1" folder by passing parameters to the main file in the "component" folder, and hence need some mechanism to pass the override args.
I've run it before in WSL2 and it worked. I want it to work in Windows cmd, and hence need a workaround or an alternative to echo to be able to pass the override parameters to the main file.
Adding project folder structure for reference:
Folder Component1
Folder comp1
Adding the MLproject file (used for WSL2) for reference:
name: KNN
conda_env: conda.yml
entry_points:
  main:
    parameters:
      hydra_options:
        description: Hydra parameters to override
        type: str
        default: ''
    command: >-
      python main.py $(echo {hydra_options})
I've tried the set command in Windows to assign a variable to the override params (passed through cmd) and then concatenate it with python main.py to incorporate the Hydra override parameters, but that doesn't seem to work either.
Adding for reference:
name: KNN_main
conda_env: conda.yml
entry_points:
  main:
    parameters:
      hydra_options:
        description: Hydra values to override
        type: str
        default: " "
    command: >-
      @echo off
      set command = "python main.py" and %{hydra_options}%
      echo %command%
Tech stack: MLflow==1.29.0 Hydra==1.2.0
OS: Windows 10
According to this answer, you shouldn't place spaces before or after = in the set command.
It would work if you rewrote the MLproject into this:
name: KNN_main
conda_env: conda.yml
entry_points:
  main:
    parameters:
      hydra_options:
        description: Hydra values to override
        type: str
        default: " "
    command: >-
      @echo off
      set command="python main.py %{hydra_options}%"
      echo %command%
Also, I'm not sure, but I think you don't need echo at all and this command will work:
    command: >-
      python main.py %{hydra_options}%
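For completeness, the overrides would then be passed on the command line when launching the project. A hypothetical invocation (the override key shown here is made up for illustration) would look like this; MLflow substitutes the value of hydra_options into the command template before the shell runs it, so no echo indirection is required on Windows:

```shell
mlflow run . -P hydra_options="main.n_neighbors=5"
```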
Summary
What specific syntax must be changed in the code below in order for the multi-line contents of the $MY_SECRETS environment variable to be 1.) successfully written into the C:\\Users\\runneradmin\\somedir\\mykeys.yaml file on a Windows runner in the GitHub workflow whose code is given below, and 2.) read by the simple Python 3 main.py program given below?
PROBLEM DEFINITION:
The echo "$MY_SECRETS" > C:\\Users\\runneradmin\\somedir\\mykeys.yaml command is only printing the string literal MY_SECRETS into the C:\\Users\\runneradmin\\somedir\\mykeys.yaml file instead of printing the multi-line contents of the MY_SECRETS variable.
We confirmed that this same echo command does successfully print the same multi-line secret in an ubuntu-latest runner, and we manually validated the correct contents of the secrets.LIST_OF_SECRETS environment variable. ... This problem seems entirely isolated to either the windows command syntax, or perhaps to the windows configuration of the GitHub windows-latest runner, either of which should be fixable by changing the workflow code below.
EXPECTED RESULT:
The multi-line secret should be printed into the C:\\Users\\runneradmin\\somedir\\mykeys.yaml file and read by main.py.
The resulting printout of the contents of the C:\\Users\\runneradmin\\somedir\\mykeys.yaml file should look like:
***
***
***
***
LOGS THAT DEMONSTRATE THE FAILURE:
The result of running main.py in the GitHub Actions log is:
ccc item is: $MY_SECRETS
As you can see, the string literal $MY_SECRETS is being wrongly printed out instead of the 4 *** secret lines.
REPO FILE STRUCTURE:
Reproducing this error requires only 2 files in a repo file structure as follows:
.github/
workflows/
test.yml
main.py
WORKFLOW CODE:
The minimal code for the workflow to reproduce this problem is as follows:
name: write-secrets-to-file
on:
  push:
    branches:
      - dev
jobs:
  write-the-secrets-windows:
    runs-on: windows-latest
    steps:
      - uses: actions/checkout@v3
      - shell: python
        name: Configure agent
        env:
          MY_SECRETS: ${{ secrets.LIST_OF_SECRETS }}
        run: |
          import subprocess
          import pathlib
          import os  # os is used below but was missing from the original imports
          pathlib.Path("C:\\Users\\runneradmin\\somedir\\").mkdir(parents=True, exist_ok=True)
          print('About to: echo "$MY_SECRETS" > C:\\Users\\runneradmin\\somedir\\mykeys.yaml')
          output = subprocess.getoutput('echo "$MY_SECRETS" > C:\\Users\\runneradmin\\somedir\\mykeys.yaml')
          print(output)
          os.chdir('D:\\a\\myRepoName\\')
          mycmd = "python myRepoName\\main.py"
          p = subprocess.Popen(mycmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
          while True:
              # returns None while subprocess is running
              retcode = p.poll()
              line = p.stdout.readline()
              print(line)
              if retcode is not None:
                  break
MINIMAL APP CODE:
Then the minimal main.py program that demonstrates what was actually written into the C:\\Users\\runneradmin\\somedir\\mykeys.yaml file is:
with open('C:\\Users\\runneradmin\\somedir\\mykeys.yaml') as file:
    for item in file:
        print('ccc item is: ', str(item))
        if "var1" in item:
            print("Found var1")
STRUCTURE OF MULTI-LINE SECRET:
The structure of the multi-line secret contained in the secrets.LIST_OF_SECRETS environment variable is:
var1:value1
var2:value2
var3:value3
var4:value4
These 4 lines should be what gets printed out when main.py is run by the workflow, though the print for each line should look like *** because each line is a secret.
The problem is - as it is so often - the quirks of Python with byte arrays and strings and en- and de-coding them in the right places...
Here is what I used:
test.yml:
name: write-secrets-to-file
on:
  push:
    branches:
      - dev
jobs:
  write-the-secrets-windows:
    runs-on: windows-latest
    steps:
      - uses: actions/checkout@v3
      - shell: python
        name: Configure agent
        env:
          MY_SECRETS: ${{ secrets.LIST_OF_SECRETS }}
        run: |
          import subprocess
          import pathlib
          import os
          # using os.path.expanduser() instead of hard-coding the user's home directory
          pathlib.Path(os.path.expanduser("~/somedir")).mkdir(parents=True, exist_ok=True)
          secrets = os.getenv("MY_SECRETS")
          with open(os.path.expanduser("~/somedir/mykeys.yaml"), "w", encoding="UTF-8") as file:
              file.write(secrets)
          mycmd = ["python", "./main.py"]
          p = subprocess.Popen(mycmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
          while True:
              # returns None while subprocess is running
              retcode = p.poll()
              line = p.stdout.readline()
              # If len(line)==0 we are at EOF and do not need to print this line.
              # An empty line from main.py would be '\n' with len('\n')==1!
              if len(line) > 0:
                  # Decode the byte array to a string and strip the
                  # new-line characters \r and \n from the end of the line,
                  # which were read from stdout of main.py
                  print(line.decode('UTF-8').rstrip('\r\n'))
              if retcode is not None:
                  break
main.py:
import os

# using os.path.expanduser instead of hard-coding the user's home directory
with open(os.path.expanduser('~/somedir/mykeys.yaml'), encoding='UTF-8') as file:
    for item in file:
        # strip the new-line characters \r and \n from the end of the line
        item = item.rstrip('\r\n')
        print('ccc item is: ', str(item))
        if "var1" in item:
            print("Found var1")
secrets.LIST_OF_SECRETS:
var1: secret1
var2: secret2
var3: secret3
var4: secret4
And my output in the log was
ccc item is: ***
Found var1
ccc item is: ***
ccc item is: ***
ccc item is: ***
Edit: updated with fixed main.py and how to run it.
You can write the key file directly with Python:
- shell: python
  name: Configure agent
  env:
    MY_SECRETS: ${{ secrets.LIST_OF_SECRETS }}
  run: |
    import os
    import pathlib
    pathlib.Path('C:\\Users\\runneradmin\\somedir\\').mkdir(parents=True, exist_ok=True)
    with open('C:\\Users\\runneradmin\\somedir\\mykeys.yaml', 'w') as key_file:
        key_file.write(os.environ['MY_SECRETS'])
- uses: actions/checkout@v3
- name: Run main
  run: python main.py
To avoid newline characters in your output, you need a main.py that removes the newlines (here with .strip().splitlines()):
main.py
with open('C:\\Users\\runneradmin\\somedir\\mykeys.yaml') as file:
    for item in file.read().strip().splitlines():
        print('ccc item is: ', str(item))
        if "var1" in item:
            print("Found var1")
Here's the input:
LIST_OF_SECRETS = '
key:value
key2:value
key3:value
'
And the output:
ccc item is: ***
Found var1
ccc item is: ***
ccc item is: ***
ccc item is: ***
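The .strip().splitlines() combination is what removes the stray newline characters; as a minimal self-contained sketch of that behavior:

```python
# Raw file content keeps its trailing newline; .strip() removes the
# surrounding whitespace and .splitlines() splits without keeping '\n'.
content = 'var1:value1\nvar2:value2\n'
items = content.strip().splitlines()
print(items)  # ['var1:value1', 'var2:value2']
```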
Here is my complete workflow:
name: write-secrets-to-file
on:
  push:
    branches:
      - master
jobs:
  write-the-secrets-windows:
    runs-on: windows-latest
    steps:
      - shell: python
        name: Configure agent
        env:
          MY_SECRETS: ${{ secrets.LIST_OF_SECRETS }}
        run: |
          import os
          import pathlib
          pathlib.Path('C:\\Users\\runneradmin\\somedir\\').mkdir(parents=True, exist_ok=True)
          with open('C:\\Users\\runneradmin\\somedir\\mykeys.yaml', 'w') as key_file:
              key_file.write(os.environ['MY_SECRETS'])
      - uses: actions/checkout@v3
      - name: Run main
        run: python main.py
Also, a simpler version using only Windows shell (Powershell):
- name: Create key file
  env:
    MY_SECRETS: ${{ secrets.LIST_OF_SECRETS }}
  run: |
    mkdir C:\\Users\\runneradmin\\somedir
    echo "$env:MY_SECRETS" > C:\\Users\\runneradmin\\somedir\\mykeys.yaml
- uses: actions/checkout@v3
- name: Run main
  run: python main.py
I tried the following code and it worked fine:
LIST_OF_SECRETS
key1:val1
key2:val2
Github action (test.yml)
name: write-secrets-to-file
on:
push:
branches:
- main
jobs:
write-the-secrets-windows:
runs-on: windows-latest
steps:
- uses: actions/checkout#v3
- shell: python
name: Configure agentt
env:
MY_SECRETS: ${{ secrets.LIST_OF_SECRETS }}
run: |
import base64, subprocess, sys
import os
secrets = os.environ["MY_SECRETS"]
def powershell(cmd, input=None):
cmd64 = base64.encodebytes(cmd.encode('utf-16-le')).decode('ascii').strip()
stdin = None if input is None else subprocess.PIPE
process = subprocess.Popen(["powershell.exe", "-NonInteractive", "-EncodedCommand", cmd64], stdin=stdin, stdout=subprocess.PIPE)
if input is not None:
input = input.encode(sys.stdout.encoding)
output, stderr = process.communicate(input)
output = output.decode(sys.stdout.encoding).replace('\r\n', '\n')
return output
command = r"""$secrets = #'
{}
'#
$secrets | Out-File -FilePath .\mykeys.yaml""".format(secrets)
command1 = r"""Get-Content -Path .\mykeys.yaml"""
powershell(command)
print(powershell(command1))
Output
***
***
As you also mention in the question, GitHub will obfuscate any printed value containing the secrets with ***.
EDIT : Updated the code to work with multiple line secrets. This answer was highly influenced by this one
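The core trick in the powershell() helper above is PowerShell's -EncodedCommand, which expects the script as base64 over its UTF-16-LE bytes. The encoding step can be illustrated in isolation (a minimal sketch, independent of the workflow):

```python
import base64

cmd = "Get-Content -Path .\\mykeys.yaml"
# -EncodedCommand wants base64 of the UTF-16-LE bytes of the script text
cmd64 = base64.encodebytes(cmd.encode('utf-16-le')).decode('ascii').strip()
# Decoding recovers the original script text exactly
assert base64.b64decode(cmd64).decode('utf-16-le') == cmd
```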
You need to use the yaml library:
import yaml

data = {'MY_SECRETS': '''
var1:value1
var2:value2
var3:value3
var4:value4
'''}  # add your secret

with open('file.yaml', 'w') as outfile:  # your file
    yaml.dump(data, outfile, default_flow_style=False)
This is the result:
I used this.
I have a custom config class in an application, and I'd like to override the defaults by reading environment variables. I'm seeing some strange behaviour with Python's str.format() that I'd like to understand. The output of this code depends on the values of the env vars that are passed in. Here it is:
import os

class Config(object):
    SQS_QUEUE = '{client}-{env}'

class ClientConfig(Config):
    ENV = os.environ.get('ENV', default='dev')
    CLIENT = os.environ.get('CLIENT', default='v')
    SQS_QUEUE = Config.SQS_QUEUE.format(client=CLIENT, env=ENV)

config = ClientConfig()
print(config.ENV)
print(config.CLIENT)
print(config.SQS_QUEUE)
This is my env var file:
export ENV="prod"
export CLIENT="r"
It is being loaded like so: source .env and I can see that the vars are set by running the env command:
$ env
ENV=prod
CLIENT=r
[...]
When I run the Python code above, I would expect the SQS queue variable to be the string "r-prod". Instead, I'm getting "-prod", which is strange considering both ENV and CLIENT are set (as I can see from the print statements).
EDIT: here's the full output
$ python3 test.py
prod
r
-prod
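One possible cause worth checking (an assumption on my part; the question doesn't confirm it): if the .env file has Windows CRLF line endings, CLIENT can silently carry a trailing carriage return, and that reproduces this exact symptom:

```python
# Hypothetical: CLIENT read from a CRLF-terminated .env file keeps a stray '\r'
queue = '{client}-{env}'.format(client='r\r', env='prod')
print(repr(queue))  # 'r\r-prod'
# A plain print(queue) appears as '-prod' on a terminal, because the
# carriage return moves the cursor back to column 0 before '-prod' is printed.
```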
I am trying to set up CI on GitHub Actions. I already have a working action, shown below. It runs pylint, saves the output to a GitHub env var, and writes a message with it to the pull request. This works great.
However, I need to check the pylint score and fail the action if it is below a certain threshold. I already know there is a --fail-under flag, but I don't want to use it, because it would skip writing the output to the comment. Is it possible for me to save the pylint score and check it later, after the comment is already written? I currently use the --exit-zero flag so pylint passes and I can write the comment.
Here is what I currently have:
- name: Lint with pylint
  working-directory: ./
  run: |
    echo '${{ steps.files.outputs.files_updated }} ${{ steps.files.outputs.files_created }}'
    pip install pylint
    OUTPUT=$(pylint ${{ steps.files.outputs.files_updated }} ${{ steps.files.outputs.files_created }} --exit-zero --jobs=0)
    SCORE=0
    echo "Pylint finished with score: $SCORE"
    echo 'MESSAGE<<EOF' >> $GITHUB_ENV
    echo "$OUTPUT" >> $GITHUB_ENV
    echo 'EOF' >> $GITHUB_ENV
How do I get the score from pylint on the SCORE=0 line?
And how do I fail the test afterward?
The output from pylint looks like this:
************* Module src.server
src/server.py:1:0: C0114: Missing module docstring (missing-module-docstring)
src/server.py:5:0: W0401: Wildcard import src.endpoints (wildcard-import)
...
-----------------------------------
Your code has been rated at 0.18/10
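For what it's worth, one way to extract the score from that final report line and compare it to a threshold is a sketch like the following (the sed pattern assumes the exact "rated at X/10" wording shown above; everything else is plain bash/awk, nothing pylint-specific):

```shell
# Stand-in for the captured pylint report (last line of $OUTPUT)
OUTPUT="Your code has been rated at 0.18/10"
# Pull out the numeric score; the optional '-' allows negative scores
SCORE=$(echo "$OUTPUT" | sed -n 's/.*rated at \(-*[0-9.]*\)\/10.*/\1/p')
echo "Pylint finished with score: $SCORE"
# Compare against a threshold with awk (bash arithmetic is integer-only);
# exiting non-zero at this point would fail the step after the comment is written
THRESHOLD=8.0
awk -v s="$SCORE" -v t="$THRESHOLD" 'BEGIN { exit (s >= t) ? 0 : 1 }' \
  || echo "Score $SCORE is below threshold $THRESHOLD"
```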
I am working to implement a snakemake pipeline on our university's HPC. I am doing so in an activated conda environment and with the following script submitted using sbatch:
snakemake --dryrun --summary --jobs 100 --use-conda -p \
--configfile config.yaml --cluster-config cluster.yaml \
--profile /path/to/conda/env --cluster "sbatch --parsable \
--qos=unlim --partition={cluster.queue} \
--job-name=username.{rule}.{wildcards} --mem={cluster.mem}gb \
--time={cluster.time} --ntasks={cluster.threads} \
--nodes={cluster.nodes}"
config.yaml
metaG_accession: PRJNA766694
metaG_ena_table: /home/etucker5/miniconda3/envs/s-niv-MAGs/data/input/ENA_tables/PRJNA766694_metaG_wenv.txt
inputDIR: /home/etucker5/miniconda3/envs/s-niv-MAGs/data/input
outputDIR: /home/etucker5/miniconda3/envs/s-niv-MAGs/data/output
scratch: /home/etucker5/miniconda3/envs/s-niv-MAGs/data/scratch
adapters: /home/etucker5/miniconda3/envs/s-niv-MAGs/data/input/adapters/illumina-adapters.fa
metaG_sample_list: /home/etucker5/miniconda3/envs/s-niv-MAGs/data/input/SampleList_ForAssembly_metaG.txt
megahit_other: --continue --k-list 29,39,59,79,99,119
megahit_cpu: 80
megahit_min_contig: 1000
megahit_mem: 0.95
restart-times: 0
max-jobs-per-second: 1
max-status-checks-per-second: 10
local-cores: 1
rerun-incomplete: true
keep-going: true
Snakefile
configfile: "config.yaml"
import io
import os
import pandas as pd
import numpy as np
import pathlib
from snakemake.exceptions import print_exception, WorkflowError
#----SET VARIABLES----#
METAG_ACCESSION = config["metaG_accession"]
METAG_SAMPLES = pd.read_table(config["metaG_ena_table"])
INPUTDIR = config["inputDIR"]
ADAPTERS = config["adapters"]
SCRATCHDIR = config["scratch"]
OUTPUTDIR = config["outputDIR"]
METAG_SAMPLELIST = pd.read_table(config["metaG_sample_list"], index_col="Assembly_group")
METAG_ASSEMBLYGROUP = list(METAG_SAMPLELIST.index)
ASSEMBLYGROUP = METAG_ASSEMBLYGROUP
#----COMPUTE VAR----#
MEGAHIT_CPU = config["megahit_cpu"]
MEGAHIT_MIN_CONTIG = config["megahit_min_contig"]
MEGAHIT_MEM = config["megahit_mem"]
MEGAHIT_OTHER = config["megahit_other"]
And the Slurm error output:
snakemake: error: unrecognized arguments: --metaG_accession=PRJNA766694
--metaG_ena_table=/home/etucker5/miniconda3/envs/s-niv-MAGs/data/input/ENA_tables/PRJNA766694_metaG_wenv.txt
--inputDIR=/home/etucker5/miniconda3/envs/s-niv-MAGs/data/input
--outputDIR=/home/etucker5/miniconda3/envs/s-niv-MAGs/data/output
--scratch=/home/etucker5/miniconda3/envs/s-niv-MAGs/data/scratch
--adapters=/home/etucker5/miniconda3/envs/s-niv-MAGs/data/input/adapters/illumina-adapters.fa
--metaG_sample_list=/home/etucker5/miniconda3/envs/s-niv-MAGs/data/input/SampleList_ForAssembly_metaG.txt
--megahit_cpu=80 --megahit_min_contig=1000 --megahit_mem=0.95
On execution it fails to recognize arguments in my config.yaml file (for ex.):
snakemake: error: unrecognized arguments: --inputDIR=[path\to\dir]
In my understanding the Snakefile should be able to take any arguments stated in the config.yaml using:
INPUTDIR = config["inputDIR"]
when:
configfile: "config.yaml"
is input in my Snakefile.
Also, my config.yaml properly recognizes non-custom arguments such as:
max-jobs-per-second: 1
Is there some custom library setup that I need to initiate for this particular config.yaml? This is my first time using Snakemake and I am still learning how to properly work with config files.
Also, on swapping the paths directly into the Snakefile I was able to get the summary output for my dryrun without the unrecognized arguments error.
The issue was the way in which the workflow was executed with Slurm: I had been executing snakemake with sbatch, as a bash script.
Instead, snakemake should be executed directly through the terminal or with bash. While I'm not exactly sure why, submitting it via sbatch caused my jobs to run on the cluster's local resources, which have tight memory limits, rather than on the HPC partitions that have appropriate capacity. Executing it this way also caused snakemake to not recognize the paths set in my config.yaml.
The lesson learned is that snakemake has a built-in way of communicating with the Slurm manager, and executing snakemake through sbatch seems to cause conflicts.
("Conda Env") [user#log001 "Main Directory"]$ snakemake \
> --jobs 100 --use-conda -p -s Snakefile \
> --cluster-config cluster.yaml --cluster "sbatch \
> --parsable --qos=unlim --partition={cluster.queue} \
> --job-name=TARA.{rule}.{wildcards} --mem={cluster.mem}gb \
> --time={cluster.time} --ntasks={cluster.threads} --nodes={cluster.nodes}"
I already set the SLACK_TOKEN environment variable, but SLACK_TOKEN = os.environ.get('SLACK_TOKEN') is returning None.
The type of SLACK_TOKEN is NoneType. I think os.environ.get is not fetching the value of the environment variable, so the rest of the code does not execute.
import os
from slackclient import SlackClient

SLACK_TOKEN = os.environ.get('SLACK_TOKEN')  # returning None
print(SLACK_TOKEN)        # None
print(type(SLACK_TOKEN))  # NoneType class

slack_client = SlackClient(SLACK_TOKEN)
print(slack_client.api_call("api.test"))   # {'ok': True}
print(slack_client.api_call("auth.test"))  # {'ok': False, 'error': 'not_authed'}

def list_channels():
    channels_call = slack_client.api_call("channels.list")
    if channels_call['ok']:
        return channels_call['channels']
    return None

def channel_info(channel_id):
    channel_info = slack_client.api_call("channels.info", channel=channel_id)
    if channel_info:
        return channel_info['channel']
    return None

if __name__ == '__main__':
    channels = list_channels()
    if channels:
        print("Channels: ")
        for c in channels:
            print(c['name'] + " (" + c['id'] + ")")
            detailed_info = channel_info(c['id'])
            if detailed_info:
                print(detailed_info['latest']['text'])
    else:
        print("Unable to authenticate.")  # Unable to authenticate
I faced a similar issue. I fixed it by removing the quotes from the values.
Example:
I created a local.env file in which I stored my secret key values:
local.env:
export SLACK_TOKEN=xxxxxyyyyyyyzzzzzzz
settings.py:
SLACK_TOKEN = os.environ.get('SLACK_TOKEN')
In your terminal or console, run the command: source local.env
Include local.env in .gitignore. Make sure you don't push it to git, as you have to safeguard your information.
This is applicable only to the local server. Hope this helps. Happy coding :)
In my case, I wrote the wrong content in the env file:
SLACK_TOKEN=xxxxxyyyyyyyzzzzzzz
I forgot export before it; the correct version is:
export SLACK_TOKEN=xxxxxyyyyyyyzzzzzzz
You can use a config file to get the env vars without using export.
In the env file, store the variable normally:
.env:
DATABASE_URL=postgresql+asyncpg://postgres:dina@localhost/mysenseai
Then create a config file that will be used to store the env variable like so
config.py:
from pydantic import BaseSettings

class Settings(BaseSettings):
    database_url: str

    class Config:
        env_file = '.env'

settings = Settings()
Then you can use it this way:
from config import settings
url = settings.database_url
If you declared the variable SLACK_TOKEN in the Windows command prompt, you will only be able to access it in that same instance of the command prompt, not anywhere else, including PowerShell and Git Bash. Be careful with that.
Whenever you want to run that Python script, run it in the same command prompt where you declared those variables.
You can always check whether the variable exists in cmd by running echo %SLACK_TOKEN%; if it does not exist, cmd will print %SLACK_TOKEN% literally.
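To make the failure mode obvious instead of silently carrying None around, a small guard helper can be used (a minimal sketch; the function name is mine, not from any Slack library):

```python
import os

def require_env(name):
    """Return the value of an environment variable, or raise a clear error."""
    value = os.environ.get(name)
    if value is None:
        raise RuntimeError(
            f"{name} is not set in this process's environment; "
            "set it in the same shell/session before running the script."
        )
    return value

# token = require_env('SLACK_TOKEN')  # raises immediately if the var is missing
```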