check if my commit has 'import pdb' in emacs/git? - python

I commit import pdb;pdb.set_trace() quite often.
Is there a convenient way of preventing me doing it?
I use emacs/git (magit).

For completeness, here's how to examine the contents of the version in the index, building off eugene's answer and with a few more changes (not tested as a complete hook, but should work):
#!/bin/sh
has_import=false
git diff --cached --no-renames --name-status --diff-filter=AM |
while read st file; do
    case "$file" in
    *.py)
        if git show ":$file" |
                grep -E "^[^#]*\bimport[[:space:]]+pdb\b"; then
            echo "$file: has import pdb"
            exit 1
        fi;;
    esac
done || has_import=true
if $has_import; then
    exit 1
fi
The most important bit of change is the git show ":$file" trick, which uses git show to extract the staged version from the index.
I also:
added --no-renames to make renamed files show up as Added (dealing with R is harder, might as well just treat them as new);
removed C as it would fail if it triggered (because the "other" file name is also printed, just as for Renames, but I think it will not trigger here anyway);
removed some bash-specific syntax by using case; and
beefed up the grep expression a bit (it's still not perfect, you could do from pdb import ..., or more likely, something like import collections, pdb, which it would not catch; but now it handles multiple spaces after import, and avoids false hits on, e.g., import pdbase).
per Matthieu Moy's comment, beefed up the shell fragment to set a has_import variable you can use later. (If you don't intend to use anything later you can eliminate the variable and use exit 1 there directly, as he suggested.)
(This still has at least one minor flaw: the extracted file contents do not have any smudge filters applied. But if your smudge and clean filters add and remove import lines, I suspect there's nothing a pre-commit hook can do to help you. :-) )
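If you want a quick way to see what the beefed-up expression does and does not catch, here is a small sketch using Python's re module (the POSIX class [[:space:]] is written as \s in Python's regex syntax; the sample lines are made up for illustration):
import re

# Rough Python translation of the grep -E pattern used in the hook above.
pattern = re.compile(r"^[^#]*\bimport\s+pdb\b")

samples = [
    "import pdb",                    # caught
    "import  pdb; pdb.set_trace()",  # caught (multiple spaces handled)
    "import pdbase",                 # not flagged (no false hit)
    "# import pdb",                  # not flagged (commented out)
    "from pdb import set_trace",     # NOT caught -- a known gap
    "import collections, pdb",       # NOT caught -- a known gap
]

for line in samples:
    print("HIT " if pattern.search(line) else "miss", line)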

You can create .git/hooks/pre-commit
#!/bin/bash
git diff --cached --name-status --diff-filter=ACM | while read st file; do
if [[ "$file" =~ .py$ ]] && grep "^[^#]*import pdb" "$file"; then
echo "$file: has import pdb"
exit 1
fi
done
I just made it up; I'm not sure if it's good enough for general use, but it works for me.
Thanks David
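If you'd rather write the hook in Python instead of shell, a rough equivalent of the staged-file check might look like this (a sketch combining the two answers above; not tested as an actual hook, and the regex has the same known gaps discussed earlier):
#!/usr/bin/env python
import re
import subprocess
import sys

PATTERN = re.compile(r"^[^#]*\bimport\s+pdb\b", re.MULTILINE)

# Added or modified paths that are staged for commit.
out = subprocess.check_output(
    ["git", "diff", "--cached", "--name-only", "--diff-filter=AM"],
    universal_newlines=True,
)

failed = False
for path in out.splitlines():
    if not path.endswith(".py"):
        continue
    # Read the staged version of the file from the index, not the working tree.
    staged = subprocess.check_output(["git", "show", ":" + path],
                                     universal_newlines=True)
    if PATTERN.search(staged):
        print("%s: has import pdb" % path)
        failed = True

sys.exit(1 if failed else 0)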

python3 -m pip install pre-commit (or use pipx)
cd my_repo
Create a file called .pre-commit-config.yaml with the following contents
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.0.1
    hooks:
      - id: debug-statements
Run pre-commit install
The next time you run git commit it will fail with
(random) mark@DESKTOP:~/pytest-bdd$ git commit -am "Adding breakpoint"
[INFO] Initializing environment for https://github.com/pre-commit/pre-commit-hooks.
[INFO] Installing environment for https://github.com/pre-commit/pre-commit-hooks.
[INFO] Once installed this environment will be reused.
[INFO] This may take a few minutes...
Debug Statements (Python)................................................Failed
- hook id: debug-statements
- exit code: 1
setup.py:2:0 - pdb imported
And it will not allow you to commit until
breakpoint() or import pdb is removed from the commit.
Note: if you just remove import pdb but leave pdb.set_trace() in place (e.g. if you're in a rush and forget), it won't complain, but you have now introduced a NameError that will hit at runtime instead.
See https://pre-commit.com for more information
See https://pre-commit.com/hooks.html for more hooks
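For what it's worth, a hook like debug-statements does not have to rely on regular expressions at all; parsing the file with Python's ast module makes it easy to skip commented-out lines and to catch breakpoint() calls too. A minimal sketch of that general idea (this is not the actual pre-commit-hooks implementation):
import ast
import sys

DEBUG_MODULES = {"pdb", "ipdb", "pudb"}  # an assumed, representative subset

def find_debug_statements(source, filename="<string>"):
    """Return (lineno, message) pairs for debugger imports and breakpoint() calls."""
    problems = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] in DEBUG_MODULES:
                    problems.append((node.lineno, alias.name + " imported"))
        elif isinstance(node, ast.ImportFrom):
            if node.module and node.module.split(".")[0] in DEBUG_MODULES:
                problems.append((node.lineno, node.module + " imported"))
        elif (isinstance(node, ast.Call)
              and isinstance(node.func, ast.Name)
              and node.func.id == "breakpoint"):
            problems.append((node.lineno, "breakpoint() called"))
    return problems

if __name__ == "__main__":
    status = 0
    for path in sys.argv[1:]:
        with open(path) as f:
            for lineno, message in find_debug_statements(f.read(), path):
                print("%s:%d - %s" % (path, lineno, message))
                status = 1
    sys.exit(status)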

Related

Execute bash commands that are within a list from python

I've got this list:
commands = ['cd var','cd www','cd html','sudo rm -r folder']
I'm trying to execute the elements one by one, as if they were a bash script, with no success. Do I need a for loop here?
How do I achieve that? Thanks all!
import os

for command in commands:
    os.system(command)
is one way you could do it ... although just cd'ing into a bunch of directories isn't going to have much impact.
NOTE: this will run each command in its own subshell ... so they will not remember their state (i.e. any directory changes or environment variables).
If you need to run them all in one subshell then you need to chain them together with "&&":
os.system(" && ".join(commands)) # would run all of the commands in a single subshell
As noted in the comments, it is generally preferred to use the subprocess module with check_call or one of its other variants. In this specific instance, though, I think it's six of one, half a dozen of the other, and os.system was less typing (and it exists whether you are using Python 3.7 or Python 2.5). In general, use subprocess; exactly which call to use probably depends on the version of Python you are using, and the post linked in the comments by @triplee has a great description of why you should use subprocess instead.
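For completeness, here is roughly the same thing written with subprocess instead of os.system (just a sketch; check_call raises CalledProcessError if the chained command exits non-zero):
import subprocess

commands = ['cd var', 'cd www', 'cd html', 'sudo rm -r folder']

# Chain the commands with && so they share a single shell, mirroring the
# os.system example above; shell=True hands the whole string to /bin/sh.
subprocess.check_call(" && ".join(commands), shell=True)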
Really, you should reformat your commands to simply:
commands = ["sudo rm -rf var/www/html/folder"]
Note that you will probably need to add your Python script to your sudoers file.
Also, I'm not sure exactly what you are trying to accomplish here, but I suspect this might not be the ideal way to go about it (although it should work).
This is just a suggestion, but if you're just wanting to change directories and delete folders, you could use os.chdir() and shutil.rmtree():
from os import chdir
from os import getcwd
from shutil import rmtree

directories = ['var', 'www', 'html', 'folder']

print(getcwd())
# current working directory: $PWD

for directory in directories[:-1]:
    chdir(directory)

print(getcwd())
# current working directory: $PWD/var/www/html

rmtree(directories[-1])
This will cd three directories deep into html and delete folder. The current working directory changes when you call chdir(), as seen when you call os.getcwd().
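If you do not actually need to end up in the html directory, the same delete can be done in a single call without changing the working directory at all (a sketch using the same hypothetical layout):
import os
import shutil

# Remove var/www/html/folder relative to the current directory.
shutil.rmtree(os.path.join('var', 'www', 'html', 'folder'))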
declare -a command=("cd var" "cd www" "cd html" "sudo rm -r folder")
## now loop through the above array
for i in "${command[@]}"
do
    echo "$i"
    # or do whatever with individual element of the array
done
# You can access them using echo "${command[0]}", "${command[1]}" also

Skyfield.api loader behaves differently in docker container

I wish to specify a download directory for Skyfield, as documented here:
http://rhodesmill.org/skyfield/files.html
Here is my script:
from skyfield.api import Loader
load = Loader('~/data/skyfield')
# Next line downloads deltat.data, deltat.preds, Leap_Second.dat in ~/data/skyfield
ts = load.timescale()
t = ts.utc(2017,9,13,0,0,0)
stations_url = 'http://celestrak.com/NORAD/elements/stations.txt'
# Next line downloads stations.txt in ~/data/skyfield AND deltat.data, deltat.preds, Leap_Second.dat in $PWD !!!
satellites = load.tle(stations_url)
satellite = satellites['ISS (ZARYA)']
Expected behaviour (works fine outside docker)
The 3 deltat files (deltat.data, deltat.preds and Leap_Second.dat) are downloaded in ~/data/skyfield with load.timescale() and stations.txt is downloaded at the same place with load.tle(stations_url)
Behaviour when run in a container
The 3 deltat files get downloaded twice:
once in the specified folder, at the load.timescale() call
a second time in the current directory, at the load.tle(stations_url) call
This is frustrating because they already exist at this point and they pollute the current directory. Note that stations.txt ends up in the right place (~/data/skyfield).
If the container is run interactively, then calling exec(open("script.py").read()) in a Python shell gives the normal behaviour again. Can anyone reproduce this issue? It is hard to tell whether it comes from Python, Docker or Skyfield.
The dockerfile is just these 2 lines:
FROM continuumio/anaconda3:latest
RUN conda install -c astropy astroquery && conda install -c anaconda ephem=3.7.6.0 && pip install skyfield
Then (assuming the built image is tagged astro) I run it with:
docker run --rm -w /tmp/working -v $PWD:/tmp/working astro:latest python script.py
And here is the output (provided the folders are empty before the run):
[#################################] 100% deltat.data
[#################################] 100% deltat.preds
[#################################] 100% Leap_Second.dat
[#################################] 100% stations.txt
[#################################] 100% deltat.data
[#################################] 100% deltat.preds
[#################################] 100% Leap_Second.dat
EDIT
Adding -t to docker run did not solve the issue, but it did help illustrate it even better. I think it may come from Skyfield, because some recent issues on GitHub seem quite similar, although not exactly the same.
The simple solution here is to add -t to your docker run command to allocate a pseudo TTY:
docker run --rm -t -w /tmp/working -v $PWD:/tmp/working astro:latest python script.py
What you are seeing is caused by the way the lines are printed combined with the buffering of non-TTY stdout. The percentage count up to 100% is likely printed on a line without newlines. Then, after 100%, it is printed again with a newline. With buffering, this causes it to appear twice.
When you run the same command with a TTY, there is no buffering and the lines are printed in real time, so the newlines actually work as desired.
The code path isn't actually running twice :)
See Docker run with pseudoTTY (-t) gives instant stdout, buffering happens without it for another explanation (possibly better than mine).
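To make the buffering point concrete, here is a tiny, purely hypothetical sketch of the printing pattern being described (it is not Skyfield's actual code): the bar is redrawn in place with carriage returns and then printed once more with a newline when it finishes:
import sys
import time

# Redraw a progress line in place with '\r', then finish it with a newline.
for pct in range(0, 101, 25):
    sys.stdout.write('\r[%-20s] %3d%% deltat.data' % ('#' * (pct // 5), pct))
    time.sleep(0.1)
sys.stdout.write('\r[%-20s] %3d%% deltat.data\n' % ('#' * 20, 100))

# On a TTY, stdout is line buffered and you see one line updating in place.
# Piped (docker run without -t), stdout is block buffered, so the writes can
# arrive in one chunk when the buffer is flushed.
Running the script with python -u, or setting PYTHONUNBUFFERED=1 in the container environment, is another way to get unbuffered output without allocating a pseudo-TTY.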

Prevent pdb or pytest set_trace from being committed using a pre commit hook

I'd like to create a git pre-commit hook that prevents committing an uncommented pytest.set_trace(), pdb.set_trace(), or any other .set_trace() call. This is because I often debug from the command line and sometimes forget that I left the debug statement in the code. By using a pre-commit hook, I should be able to avoid this in the future.
Keul's Blog has a solution for this, but the shell session has to be in the root directory of the git repo for it to work, or it will complain.
I basically want the negation of this to work in grep:
#(\s+)?.*\.set_trace\(\)
See the regexr test
Thanks
The correct regular expression is ^\s?[^#]+\.set_trace\(\)
Explanation:
^ anchors the match at the start of the line
\s? matches an optional whitespace character (zero or one)
[^#]+ matches one or more characters, none of which may be #
\.set_trace\(\) matches a literal .set_trace() call
We could be more explicit by matching only pdb or pytest set_trace calls, but there may be other packages that also provide a set_trace.
Sample python code
import pdb

if __name__ == "__main__":
    pdb.set_trace()
    pytest.set_trace()
    # pdb.set_trace()
    # pytest.set_trace()
    #pytest.set_trace()
    # pdb.set_trace()
    # somethingelse.set_trace()
Verify using ripgrep
$ rg '^\s?[^#]+\.set_trace\(\)'
main.py
4: pdb.set_trace()
5: pytest.set_trace()
Now we can use git-secrets, which is triggered by a pre-commit hook, and this will prevent us from committing these lines:
$ git secrets --add '^\s?[^#]+\.set_trace\(\)'
$ git add main.py
$ git commit -m 'test'
main.py:4: pdb.set_trace()
main.py:5: pytest.set_trace()
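For what it's worth, the same expression behaves the same way from Python's re module, so it could also be reused in a hand-rolled pre-commit hook (a quick sketch with made-up sample lines):
import re

pattern = re.compile(r'^\s?[^#]+\.set_trace\(\)')

lines = [
    "    pdb.set_trace()",      # flagged
    "    pytest.set_trace()",   # flagged
    "    # pdb.set_trace()",    # not flagged (commented out)
]
for line in lines:
    print(bool(pattern.search(line)), line)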

Error in check_call() subprocess, executing 'mv' unix command: "Syntax error: '(' unexpected"

I'm making a python script for Travis CI.
.travis.yml
...
script:
- support/travis-build.py
...
The python file travis-build.py is something like this:
#!/usr/bin/env python
from subprocess import check_call
...
check_call(r"mv !(my_project|cmake-3.0.2-Darwin64-universal) ./my_project/final_folder", shell=True)
...
When the Travis build reaches that line, I get an error:
/bin/sh: 1: Syntax error: "(" unexpected
I've tried writing it in a lot of different forms, but I get the same result. Any idea?
Thanks in advance!
Edit
My current directory layout:
- my_project/final_folder/
- cmake-3.0.2-Darwin64-universal/
- fileA
- fileB
- fileC
With this command I'm trying to move all the files in the current directory (fileA, fileB and fileC), excluding the my_project and cmake-3.0.2-Darwin64-universal folders, into ./my_project/final_folder. If I execute this command in a Linux shell it does what I want, but not through check_call().
Note: I can't move the files one by one, because there are many others
I don't know which shell Travis is using by default because I don't specify it; I only know that if I write the command directly in my .travis.yml:
.travis.yml
...
script:
# Here is the previous Travis code
- mv !(my_project|cmake-3.0.2-Darwin64-universal) ./my_project/final_folder
...
it works. But if I use the script, it fails.
I found this command from the following issue:
How to use 'mv' command to move files except those in a specific directory?
You're using the bash feature extglob to try to exclude the entries you specified. You'll need to enable it in order to have the pattern exclude those two entries.
The python subprocess module explicitly uses /bin/sh when you use shell=True, which doesn't enable the use of bash features like this by default (it's a compliance thing to make it more like original sh).
If you want to get bash to interpret the command; you have to pass it to bash explicitly, for example using:
subprocess.check_call(["bash", "-O", "extglob", "-c", "mv !(my_project|cmake-3.0.2-Darwin64-universal) ./my_project/final_folder"])
I would not choose to do the job in this manner, though.
Let me try again: in which shell do you expect your !(...) syntax to work? Is it bash? Is it ksh? I have never used it, and a quick search for a corresponding bash feature led nowhere. I suspect your syntax is just wrong, which is what the error message is telling you. In that case, your problem is entirely independent from Python and the subprocess module.
If a special shell you have on your system supports this syntax, you need to make sure that Python is using the same shell when invoking your command. It tells you which shell it has been using: /bin/sh. This is usually just a link to the real shell executable. Does it point to the same shell you have tested your command in?
Edit: the SO solution you referenced contains the solution in the comments:
Tip: Note however that using this pattern relies on extglob. You can
enable it using shopt -s extglob (If you want extended globs to be
turned on by default you can add shopt -s extglob to .bashrc)
Just to demonstrate that different shells might deal with your syntax in different ways, first using bash:
$ !(uname)
-bash: !: event not found
And then, using /bin/dash:
$ !(uname)
Linux
The argument to a subprocess.something method is normally a list of command-line arguments. Use e.g. shlex.split() to split the string into the correct command-line arguments:
import shlex, subprocess
subprocess.check_call( shlex.split("mv !(...)") )
EDIT:
So, the goal is to move files/directories, with the exception of some file(s)/directory(ies). By playing around with bash, I could get it to work like this:
mv `ls | grep -v -e '\(exclusion1\|exclusion2\)'` my_project
So in your situation that would be:
mv `ls | grep -v -e '\(myproject\|cmake-3.0.2-Darwin64-universal\)'` my_project
This could go into the subprocess.check_call(..., shell=True) and it should do what you expect it to do.
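Since travis-build.py is already a Python script, another option is to skip the shell globbing entirely and do the move with os and shutil (a sketch assuming the directory layout from the question):
import os
import shutil

EXCLUDE = {"my_project", "cmake-3.0.2-Darwin64-universal"}
DEST = os.path.join("my_project", "final_folder")

# Move everything in the current directory except the excluded entries.
# Note: unlike the shell glob, os.listdir also returns hidden files such
# as .travis.yml, so add them to EXCLUDE if they should stay put.
for name in os.listdir("."):
    if name not in EXCLUDE:
        shutil.move(name, DEST)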

Can I restrict nose coverage output to directory (rather than package)?

My SUT looks like:
foo.py
bar.py
tests/__init__.py [empty]
tests/foo_tests.py
tests/bar_tests.py
tests/integration/__init__.py [empty]
tests/integration/foo_tests.py
tests/integration/bar_tests.py
When I run nosetests --with-coverage, I get details for all sorts of modules that I'd rather ignore. But I can't use the --cover-package=PACKAGE option because foo.py & bar.py are not in a package. (See the thread after http://lists.idyll.org/pipermail/testing-in-python/2008-November/001091.html for my reasons for not putting them in a package.)
Can I restrict coverage output to just foo.py & bar.py?
Update: assuming that there isn't a better answer than Nadia's below, I've asked a follow-up question: "How do I write some (bash) shell script to convert all matching filenames in directory to command-line options?"
You can use it like this:
--cover-package=foo --cover-package=bar
I had a quick look at the nose source code to confirm; this is the relevant code:
if options.cover_packages:
    for pkgs in [tolist(x) for x in options.cover_packages]:
You can use:
--cover-package=.
or even set environment variable
NOSE_COVER_PACKAGE=.
Tested with nose 1.1.2
I have a lot of top-level Python files/packages and find it annoying to list them all manually using --cover-package, so I made two aliases for myself. Alias nosetests_cover will run coverage with all your top-level Python files/packages listed in --cover-package. Alias nosetests_cover_sort will do the same and additionally sort your results by coverage percentage.
nosetests_cover_cmd="nosetests --with-coverage --cover-erase --cover-inclusive --cover-package=\$( ls | sed -r 's/[.]py$//' | fgrep -v '.' | paste -s -d ',' )"
alias nosetests_cover=$nosetests_cover_cmd
alias nosetests_cover_sort="$nosetests_cover_cmd 2>&1 | fgrep '%' | sort -nr -k 4"
Notes:
This is from my .bashrc file. Modify appropriately if you don't use bash.
These must be run from your top-level directory. Otherwise, the package names will be incorrect and coverage will silently fail to process them (i.e. instead of telling you your --cover-package is incorrect, it will act like you didn't supply the option at all).
I'm currently using Python 2.7.6 on Ubuntu 13.10, with nose version 1.3.0 and coverage version 3.7.1. This is the only setup in which I've tested these commands.
In your usage, remove --cover-erase and --cover-inclusive if they don't match your needs.
If you want to sort in normal order instead of reverse order, replace -nr with -n in the sort command.
These commands assume that all of your top-level Python files/packages are named without a dot (other than the dot in ".py"). If this is not true for you, read Details section below to understand the command parts, then modify the commands as appropriate.
Details:
I don't claim that these are the most efficient commands to achieve the results I want. They're just the commands I happened to come up with. =P
The main thing to explain would be the argument to --cover-package. It builds the comma-separated list of top-level Python file/package names (with ".py" stripped from file names) as follows:
\$ -- Escapes the $ character in a double-quoted string.
$( ) -- Inserts the result of the command contained within.
ls -- Lists all names in current directory (must be top-level Python directory).
| sed -r 's/[.]py$//' -- In the list, replaces "foo_bar.py" with "foo_bar".
| fgrep -v '.' -- In the list, removes all names that still contain a dot (e.g. removes foo_bar.pyc and notes.txt).
| paste -s -d ',' -- Changes the list from newline-separated to comma-separated.
I should also explain the sorting.
2>&1 -- Joins stderr and stdout.
| fgrep '%' -- Removes all output lines without a % character.
| sort -nr -k 4 -- Sorts the remaining lines in reverse numerical order by the 4th column (which is the column for coverage percentage). If you want normal order instead of reverse order, replace -nr with -n.
Hope this helps someone! =)
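As a footnote to the aliases above: if the shell pipeline feels opaque, the same --cover-package list can be built with a couple of lines of Python (a sketch that makes the same assumption that top-level modules and packages contain no dot in their names):
import os

# Comma-separated list of top-level .py modules and package directories,
# dropping anything that still contains a dot (foo_bar.pyc, notes.txt, ...),
# to mirror the ls | sed | fgrep | paste pipeline.
names = sorted(
    name[:-3] if name.endswith(".py") else name
    for name in os.listdir(".")
    if name.endswith(".py") or "." not in name
)
print(",".join(names))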
If you use coverage.py 3.0, then code in the Python installation directory is ignored by default, including the standard library and all installed packages.
I would do this:
nosetests --with-coverage --cover-package=foo,bar tests/*
I prefer this solution to the others suggested; it's simple yet you are explicit about which packages you wish to have coverage for. Nadia's answer involves a lot more redundant typing, Stuart's answer uses sed and still creates a package by invoking touch __init__.py, and --cover-package=. doesn't work for me.
For anyone trying to do this with setup.cfg, the following works. I had some trouble figuring out how to specify multiple packages.
[nosetests]
with-coverage=1
cover-html=1
cover-package=module1,module2
touch __init__.py; nosetests --with-coverage --cover-package=`pwd | sed 's#.*/##g'`
You can improve the accepted answer like so: --cover-package=foo,bar
