Job hangs forever with no logs - python

With the Python SDK, the job seems to hang forever (I have to kill it manually at some point) if I use the extra_package option to use a custom ParDo.
Here is a job ID as an example: 2016-12-22_09_26_08-4077318648651073003
No explicit logs or errors are thrown...
I noticed that it seems related to the extra_package option, because if I use this option without actually triggering the ParDo (code commented out), it doesn't work either.
The initial BigQuery query with a simple output schema and no transform steps works.
Has this happened to anyone?
P.S.: I'm using Dataflow SDK version 0.4.3. I tested inside a venv and it seems to work with a DirectPipelineRunner.

As determined by thylong and jkff:
The extra_package was binary-incompatible with Dataflow's packages: the requirements.txt in the root directory and the one in the extra_package were different, which caused exec.go in the Dataflow container to fail again and again. The fix was to recreate the venv with the same frozen dependencies.
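As a rough sketch of how these staging options fit together, here is roughly what the pipeline setup looks like, written against the current apache_beam package naming (the pre-Beam 0.4.x SDK exposes the same requirements_file/extra_package options); the project, bucket and package paths are placeholders, and the DoFn is defined inline where the real one would live in the extra package:

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

# Stand-in for the custom DoFn that would really live in the extra package.
class MyDoFn(beam.DoFn):
    def process(self, element):
        yield element.upper()

options = PipelineOptions([
    '--runner=DataflowRunner',
    '--project=my-project',                   # placeholder
    '--temp_location=gs://my-bucket/tmp',     # placeholder
    # Both files must pin the same versions the local venv was built from;
    # a mismatch is exactly what made the workers fail repeatedly here.
    '--requirements_file=requirements.txt',
    '--extra_package=dist/my_pardo_lib-0.1.0.tar.gz',  # placeholder
])

with beam.Pipeline(options=options) as p:
    (p
     | beam.Create(['hello'])
     | beam.ParDo(MyDoFn())
     | beam.Map(print))

The point is that requirements.txt and the extra package's own pinned dependencies come from the same pip freeze, so the workers install exactly what the job was developed against.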

Related

Singularity behaviour: shell vs exec

So I'm trying to debug an error I got on an HPC setup I have access to. I won't go into details about the error since it's package-specific and I'm pretty sure this is an environment-variable kind of problem. That said, the package is NEURON, and if anyone has experience with it and Singularity I would appreciate your input.
When I tested everything locally using:
singularity exec --bind ./:/mnt container.sif my_script.py
there were no problems. However the same command ran into an error on the HPC cluster. I set about trying to recreate the error locally to see what the problem was.
For reasons still unknown to me, the error I got on the cluster can be reproduced locally by adding the --containall flag to the exec command. In fact, even the --contain flag can reproduce the error. I can see from the docs that --contain will:
use minimal /dev and empty other directories (e.g. /tmp and $HOME) instead of sharing filesystems from your host
which makes me guess it's a path/environment problem, but I'm not 100% sure since I'm still new-ish to everything that isn't Python.
In order to try and solve the problem, I tried using singularity shell to recreate the error, and this is where I hope someone can elucidate matters for me. If I do this:
singularity shell --containall --bind ./:/mnt container.sif
cd /mnt
python3 my_script.py
The script runs fine, I get no errors. However when I run:
singularity exec --containall --bind ./:/mnt container.sif python3 /mnt/my_script.py
I get the same error as I got on the cluster.
What is different about these two approaches? Why might shelling into the container work, while executing it like this does not? I'm just looking for help figuring out how to debug this.
Additionally, why might the scripts run locally but not on the HPC? My understanding of containers is that they are supposed to allow scripts to be run on different systems because everything is, well, contained in the container. What am I allowing through in these different scenarios that's stopping me from running my code?
My instincts (which aren't exactly experienced) tell me that there is some environment variable that I carry through when I shell in (or when I run the scripts locally) and lose when I run it in the other ways, but I am not sure where to begin looking for such a thing, or how to keep it in the container.
EDIT:
I also just tried shelling into the container while on the HPC, and I get the same error. So there's something on my local machine that is being used when I shell in or when I execute the script without the --contain flag.
Versions:
Singularity 3.5
Python 3.6.9
NEURON 8.0
Sounds like an environment issue: you have something set in your dev env that doesn't exist in your cluster env. By default, all your environment variables are automatically forwarded to the Singularity environment. I recommend using -e/--cleanenv to catch that. When using it, only variables prefixed with SINGULARITYENV_ are set in the Singularity environment. For example, to have NEURON_HOME=/mnt/neuron inside the container, you would run export SINGULARITYENV_NEURON_HOME=/mnt/neuron before the singularity command.
Once you figure out which variable needs to be updated, you can add it normally in %environment or %post, however you prefer. If it's a value that changes depending on the environment, you can export the value in SINGULARITYENV_VARNAME.
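If you drive Singularity from Python (or just want to test the clean-environment behaviour in a reproducible way), a small sketch of the same idea is below; NEURON_HOME and its value are assumptions, the rest mirrors the command from the question:

import os
import subprocess

# Copy the current environment and forward a single variable into the
# container via the SINGULARITYENV_ prefix; --cleanenv drops everything else.
env = os.environ.copy()
env['SINGULARITYENV_NEURON_HOME'] = '/mnt/neuron'  # hypothetical variable/value

subprocess.run(
    ['singularity', 'exec', '--cleanenv', '--bind', './:/mnt',
     'container.sif', 'python3', '/mnt/my_script.py'],
    env=env,
    check=True,
)

From a plain shell, the equivalent is simply exporting SINGULARITYENV_NEURON_HOME before calling singularity exec --cleanenv.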

Git push-to-deploy post-receive Python script cannot set env var

I have been stuck for two days trying to set up a small automatic deployment script.
The thing is: I have been using Git for some months now, but I always used it locally, just by myself, just for the purpose of easily saving versions of my code. All good until here.
Now I have to find a way to "publish" the code as soon as new functionalities are implemented and I think the code is stable enough.
Searching around, I've discovered these 'hooks', which are scripts that are executed by Git in certain situations. Basically the idea is to have my master branch in sync with my published code, so that every time I merge a branch into master and 'push', the files are automatically copied into '/my/published/folder'.
That said, I've found this tutorial that explains how to do exactly what I want using a 'hooks' post-receive script, which is written in Ruby. Since at my studio I don't have and don't want to use Ruby at this time, I've found a Python version of the same script.
I tested and tested, but I couldn't make it work. I keep getting the same error:
remote: GIT_WORK_TREE is not recognized as an internal or external command,
Consider this is based on the tutorial I've shared above. Same prj name, same structure, etc.
I even installed Ruby on my personal laptop and tried the original script, but it still doesn't work...
I'm using Windows, and the Git env variable is set and accessible. But nevertheless it seems like it's not recognizing the GIT_WORK_TREE command. If I run it from Git Bash it works just fine, but if I use the Windows shell I get the same error message.
I suppose that when my .py script uses the call() function, it runs the command using the Windows shell. That's my guess, but I don't really know how to solve it. Google didn't help, as if no one ever had this problem before.
Maybe I'm just not seeing something obvious here, but I spent the whole day on this and I cannot get out of this bog!
Does anyone know how to solve it, or at least have an idea for a workaround?
Hope someone can help...
Thanks a lot!
The Ruby script you are talking about generates a "bash" command:
GIT_WORK_TREE=/deploy/path git checkout -f ...
It means: define the environment variable "GIT_WORK_TREE" with the value "/deploy/path" and execute "git checkout -f ...".
As I understand it, this doesn't work on the Windows command line.
Try to use something like:
set GIT_WORK_TREE=c:\temp\deploy && git checkout -f ...
I've had this problem as well - the best solution I've found is to pass the working tree across as one of the parameters:
git --work-tree="/deploy/path" checkout -f ...
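Since the asker's hook is a Python script using call(), here is a sketch of both approaches in that context; the deploy path and branch name are placeholders:

import os
import subprocess

DEPLOY_PATH = r'c:\temp\deploy'  # placeholder

# Option 1: pass GIT_WORK_TREE through the child process environment instead
# of prefixing the command line (the prefix form only works in a POSIX shell).
env = os.environ.copy()
env['GIT_WORK_TREE'] = DEPLOY_PATH
subprocess.check_call(['git', 'checkout', '-f', 'master'], env=env)

# Option 2: skip the environment variable entirely and pass --work-tree.
subprocess.check_call(['git', '--work-tree=' + DEPLOY_PATH, 'checkout', '-f', 'master'])

Either way, avoid shell=True so that Windows cmd.exe never has to interpret a VAR=value prefix.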

Jenkins: Python + Selenium

I've been looking around for a simple step-by-step tutorial on how to configure selenium tests to be run with Jenkins, but honestly I didn't find anything special.
I have a bash script by which the tests are run. The tests are written in Python. I found out it's possible to run them, but I couldn't figure out how to configure it properly.
I found this video https://www.youtube.com/watch?v=2lP_VUP0YF0, but the configuration part is skipped.
Would anyone give me some tips or even write a simple tutorial ?
Thanks a lot for all answers!
I have a bash script by which the tests are run
In its simplest form, that's what you need to put in your Jenkins job.
Create a job which can access all your code, probably via a git, svn or mercurial repository (or a shared drive),
and either have it execute your bash script via "Add step --> bash command",
or put each step of your bash script as a bash step in your job.
You're not going to get a full tutorial here. Better to try something and then ask a specific question.

How do you run a project in Ninja-IDE as root?

I have a project in Ninja-IDE that I need to run as root. How can I do that from the IDE? I tried to run the project after running Ninja-IDE as root but that did not work. I still get 'permission denied' when running my project.
Here's what I did: I found the source code for this project, searched for "F6", then for the resulting term "execute-project", then for "execute_project", followed the code a bit, and found the eventual call to a sort of generic "call executable" helper. It in turn leads to a 'run widget', which handles the pre-execute, execute, and post-execute for project execution.
Here's the link to that portion of the code.
All this is to say that it might be as simple as changing settings.PYTHON_EXEC to "sudo python". Depending on your OS, this might break, since sudo will likely be looking for a password. It's a good start though, I think ;)
For sudo and password prompt issues, try this thread on askubuntu.
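As a separate, IDE-independent workaround (not from the answer above), the script can also re-execute itself under sudo; this is only a sketch and only helps where sudo can prompt for, or doesn't need, a password:

import os
import sys

def main():
    # ... the code that actually needs root goes here ...
    print('running as uid', os.geteuid())

if __name__ == '__main__':
    if os.geteuid() != 0:
        # Relaunch this same script through sudo, keeping interpreter and args.
        os.execvp('sudo', ['sudo', sys.executable] + sys.argv)
    main()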

How to get pycassaShell working in windows?

EDIT: I got it working: I went into the pycassa directory and typed python pycassaShell, but the second part of my question (at the bottom) is still valid: how do I run a script in pycassaShell?
I recently installed Cassandra and pycassa and followed the instruction from here.
They work fine, except I can't get pycassaShell to load. When I type pycassaShell at the command prompt, I get
'pycassaShell' is not recognized as an internal or external command,
operable program or batch file.
Do I need to set up a path for it?
Also, does anyone know if you can run DDL scripts using pycassaShell? It is for this reason that I want to try it out. At the moment, I'm doing all my DDL in the Cassandra CLI, and I'd like to be able to put it in a script to automate it.
You probably don't want to be running scripts with pycassaShell. It's designed more as an interactive environment to quickly try things out. For serious scripts, I recommend just writing a normal python script that imports pycassa and sets up the connection pool and column families itself; it should only be an extra 5 or so lines.
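For reference, a sketch of what that standalone script might look like; the keyspace, column family and server address are placeholders for whatever your cluster uses:

from pycassa.pool import ConnectionPool
from pycassa.columnfamily import ColumnFamily

# Connect to an existing keyspace and column family.
pool = ConnectionPool('MyKeyspace', ['localhost:9160'])
cf = ColumnFamily(pool, 'MyColumnFamily')

cf.insert('row_key', {'column_name': 'value'})
print(cf.get('row_key'))

pool.dispose()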
However, there is an (undocumented, I just noticed) optional -f or --file flag that you can use. It will essentially run execfile() on that script after startup completes, so you can use the SYSTEM_MANAGER and CF variables that are already set up in your script. This is intended primarily to be used as a prep script for your environment, similar to how you might use a .bashrc file (I don't know of a Windows equivalent).
Regarding DDL statements, I suggest you look at the SystemManager class.
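A sketch of what scripted DDL with SystemManager might look like; the names and replication factor are placeholders, and it assumes the keyspace and column family don't exist yet:

from pycassa.system_manager import SystemManager, SIMPLE_STRATEGY

sys_mgr = SystemManager('localhost:9160')
sys_mgr.create_keyspace('MyKeyspace', SIMPLE_STRATEGY, {'replication_factor': '1'})
sys_mgr.create_column_family('MyKeyspace', 'MyColumnFamily')
sys_mgr.close()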
