We've got a Python application and want to count all the lines of code under a specific directory and its subdirectories.
We don't need to ignore comments but we want to ignore all files containing test cases.
The path to a test file always contains /tests/ (e.g. /python/trieus/persistence/service/tests/batch_service_tests.py).
I used the command below to get the count, but it did not exclude the test files.
find . -name "*.py" -not -path "./tests*" | xargs wc -l | sort
What's the correct syntax here?
You can do:
find . -type d -name tests -prune -o -type f -name '*.py' \
-exec grep -hxc '.*' {} + | paste -sd+ | bc
find . -type d -name tests -prune excludes the tests directory
-type f -name '*.py' matches only .py files
-exec grep -hxc '.*' {} + prints the number of lines in each file, one count per file; you can modify the regex pattern to meet your needs here
paste -sd+ formats the output to put in a single line with + in between
bc does the addition on its STDIN data
I would suggest going one step at a time to understand the whole thing better.
As a side note, this will not give you the actual LOC; it only gives you the raw line count.
Example from running in an example directory on my system:
% find . -type d -name tests -prune -o -type f -name '*.py' -exec grep -hxc '.*' {} + | paste -sd+ | bc
5594
You can exclude a directory at any level by changing the leading . in the -path pattern to * and ending the pattern with /*. So find . -name "*.py" -not -path "*/tests/*" omits files under a directory named "tests" at any depth.
So the command will look like this:
find . -name "*.py" -not -path "*/tests/*" | xargs wc -l | sort
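Since this is a Python project anyway, the same count can be sketched in Python. This is only an illustration, not what either answer proposed: the function name count_py_lines and its excluded_dir parameter are made up here, and it counts raw lines (like the grep version), not LOC.

```python
from pathlib import Path


def count_py_lines(root, excluded_dir="tests"):
    """Sum the line counts of all .py files under root, skipping any file
    that has a directory named `excluded_dir` anywhere in its path."""
    total = 0
    for path in Path(root).rglob("*.py"):
        # parts[:-1] holds only the directory components of the relative
        # path, so "tests" is excluded at any depth, like -path "*/tests/*".
        directories = path.relative_to(root).parts[:-1]
        if not path.is_file() or excluded_dir in directories:
            continue
        with open(path, "rb") as handle:
            total += sum(1 for _ in handle)
    return total
```

Calling count_py_lines(".") from the project root should give the grand total in one number, without the per-file breakdown that wc -l prints.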
I am trying to run xargs on multiple files at once:
sh -c 'find . -name "*.py" | xargs pylint'
This will give me a single pylint score for all py files in a repo. However when I try to modify it to do both black and pylint, it loops through each file individually and gives me the pylint score and black diff on each file:
find . -name "*.py" | xargs -I % sh -c 'pylint %; black --check --diff %;'
Any way to pass in the py files in batch rather than each individually?
If you use the -I option, xargs runs the given command once for each entry in its input.
You can do this instead:
find . -name "*.py" |
xargs sh -c 'pylint "$@"; black --check --diff "$@"' sh
The trailing sh fills $0, so every file name ends up in "$@".
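If the shell quoting gets in the way, the same batching idea can be sketched in Python: collect the file list once and hand the whole list to each tool. The helper names batch_commands and run_tools are invented for this example, and it assumes pylint and black are on PATH.

```python
import subprocess
from pathlib import Path


def batch_commands(files):
    """Build one argv list per tool, each receiving every file at once."""
    files = [str(f) for f in files]
    return [
        ["pylint", *files],
        ["black", "--check", "--diff", *files],
    ]


def run_tools(root="."):
    """Run each tool once over all .py files under root; return exit codes."""
    files = sorted(Path(root).rglob("*.py"))
    if not files:
        return []
    # check=False: keep going so black still runs when pylint reports
    # problems, like the `;` between the two commands in the shell version.
    return [subprocess.run(cmd, check=False).returncode
            for cmd in batch_commands(files)]
```

Because the arguments are passed as a list, file names with spaces need no extra quoting.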
Running this on osx...
cd ${BUILD_DIR}/mydir && for DIR in $(find ./ '.*[^_].py' | sed 's/\/\//\//g' | awk -F "/" '{print $2}' | sort |uniq | grep -v .py); do
if [ -f $i/requirements.txt ]; then
pip install -r $i/requirements.txt -t $i/
fi
cd ${DIR} && zip -r ${DIR}.zip * > /dev/null && mv ${DIR}.zip ../../ && cd ../
done
cd ../
error:
(env) ➜ sh package_lambdas.sh
find: .*[^_].py: No such file or directory
why?
find takes as its arguments a list of directories to search. You provided what appears to be a regular expression. Because there is no directory named (literally) .*[^_].py, find returns an error.
Below I have revised your script to correct that mistake (if I understand your intention). Because I see so many ill-written shell scripts these days, I've taken the liberty of "traditionalizing" it. Please see if you don't also find it more readable.
Changes:
use #!/bin/sh, guaranteed to be on a Unix-like system. Faster than bash, unless (like OS X) it is bash.
use lower case for variable names to distinguish from system variables (and not hide them).
eschew braces for variables (${var}); they're not needed in the simple case
do not pipe output to /usr/bin/true; route it to /dev/null if that's what you mean
rm -f does not report an error for missing files; if you added || true for that case, it's superfluous
put then and do on separate lines, easier to read, and that's how the Bourne shell language was meant to be used
Let && and || serve as line-continuation, so you can see what's happening step by step
Other changes I would suggest:
Use a subshell when changing the working directory temporarily. When it terminates, the working directory is restored automatically (retained by the parent), saving you the cd .. step, and errors.
Use set -e to cause the script to terminate on error. For expected errors, use || true explicitly.
Change grep .py to grep '\.py$', just for good measure.
To avoid Tilting Matchstick Syndrome, use something other than / as a sed substitute delimiter, e.g., sed 's://:/:g'. But sed could be avoided altogether with awk -F '/+' '{print $2}'.
Revised version:
#! /bin/sh
src_dir=lambdas
build_dir=bin
mkdir -p $build_dir/lambdas
rm -rf $build_dir/*.zip
cp -r $src_dir/* $build_dir/lambdas
#
# The sed is a bit complicated to be osx / linux cross compatible :
# ( .//run.sh vs ./run.sh
#
cd $build_dir/lambdas &&
for L in $(find . -exec grep -l '.*[^_].py' {} + |
sed 's/\/\//\//g' |
awk -F "/" '{print $2}' |
sort |
uniq |
grep -v .py)
do
if [ -f $L/requirements.txt ]
then
echo "Installing requirements"
pip install -r $L/requirements.txt -t $L/
fi
cd $L &&
zip -r $L.zip * > /dev/null &&
mv $L.zip ../../ &&
cd ../
done
cd ../
The find(1) manpage says its args are [path ...] [expression], where "expression" consists of "primaries" and "operands" (-flags). '.*[^_].py' doesn't look like any expression, so it's being interpreted as a path, and find is reporting that there is no file named '.*[^_].py' in the working directory.
Perhaps you meant:
find ./ -regex '.*[^_]\.py'
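To see what that pattern selects, here is a rough Python equivalent. It is a sketch under assumptions: the name find_regex is made up, and unlike find it reports regular files only, not directories. Like find -regex, it requires the whole path to match.

```python
import os
import re

# Same pattern as the find -regex example: a path ending in ".py" whose
# last character before the extension is not an underscore.
PATTERN = re.compile(r".*[^_]\.py")


def find_regex(root, pattern=PATTERN):
    """Yield file paths under root whose entire path matches pattern."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # find -regex matches against the whole path, hence fullmatch.
            if pattern.fullmatch(path):
                yield path
```

With this pattern a file such as handler.py is reported, while __init__.py is skipped because the character before .py is an underscore.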
I'm having trouble getting Python Fabric to remove some local files. I found a related post on StackOverflow with a solution, but I am wondering why adding 2>&1 at the end fixes it.
I can run the following perfectly fine in my terminal:
$ find app/views/ -type f -name '*%%.php' -exec rm {} \;
However when I do a fabric call I get:
$ fab rmcache
[localhost] local: find app/views/ -type f -name '*%%.php' -exec rm {} \;
find: missing argument to `-exec'
Fatal error: local() encountered an error (return code 1) while executing 'find
app/views/ -type f -name '*%%.php' -exec rm {} \;'
0: Why does it require 2>&1 through Fabric, but not locally?
1: Why does this work through Fabric?
def rmcache():
local("find {0} -type f -name '*%%.php' -exec rm {{}} \ 2>&1;".format('app/views/'));
2: But this does not work through fabric?
def rmcache():
local("find {0} -type f -name '*%%.php' -exec rm {{}} \;".format('app/views/'));
0: 2>&1 redirects stderr to stdout which means that if your command is throwing an error fabric won't pick it up because it isn't being returned to fabric (See this answer for more details on 2>&1).
1 & 2: My guess is that your code is throwing an error because 'app/views' is a relative path and find requires the directory to exist, therefore you would have to run your fabric command from the directory that contains the app directory. Try using '/full/path/to/app/views' to ensure you are using the correct directory.
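Not part of the answer above, but since the task already runs from Python, one way to sidestep the shell quoting question entirely is to delete the files with pathlib instead of calling find. This is a minimal sketch; remove_cache_files is a hypothetical helper, and the default pattern is the one from the question (% has no special meaning in glob patterns).

```python
from pathlib import Path


def remove_cache_files(root, pattern="*%%.php"):
    """Delete every regular file under root whose name matches pattern
    and return the list of removed paths."""
    removed = []
    # Materialise the matches first so files are not deleted while the
    # directory tree is still being walked.
    for path in list(Path(root).rglob(pattern)):
        if path.is_file():
            path.unlink()
            removed.append(path)
    return removed
```

Inside the Fabric task this would be called as remove_cache_files('app/views/'), with the same caveat about relative paths as above.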
I've got a python project with internationalized strings.
I've modified the source code, so the line numbers of the strings have changed, i.e. the references in the pot and po files no longer point to the correct lines.
How can I update the po and pot files with the new string locations?
You could have a look at this script to update your po files with the new code. It uses xgettext and msgmerge.
echo '' > messages.po # xgettext needs that file, and we need it empty
find . -type f -iname "*.py" | xgettext -j -f - # this modifies messages.po
msgmerge -N existing.po messages.po > new.po
mv new.po existing.po
rm messages.po
Using autoconf and automake you can simply change into the po subdirectory and run:
make update-po
or:
make update-gmo
For those who use Meson, the i18n module provides the targets
<project_id>-pot and <project_id>-update-po.
E.g. for iputils project:
$ dir="/tmp/build"
$ meson . $dir && ninja iputils-pot -C $dir && ninja iputils-update-po -C $dir
SOURCE: https://mesonbuild.com/i18n-module.html
I have a large number of files/folders coming in each day that are sorted automatically into a wide variety of folders. I'm looking for a way to automatically find these files/folders and create symlinks to all of them within an "incoming" folder. Searching by file age should be sufficient for finding the files, although searching by age and owner would be ideal. Then, once the files/folders being linked to reach a certain age, say 5 days, the symlinks to them should be removed automatically from the "incoming" folder. Is this possible with a simple shell or Python script that can be run from cron? Thanks!
Use incron to create the symlink, then find -L in cron to break it.
Not quite sure what you want the symlinks to but here's a first shot:
find /incoming -mtime -5 -user nr -exec ln -s '{}' /usr/local/symlinks ';'
Finds anything in /incoming owned by nr less than 5 days old and links it into /usr/local/symlinks. Unfortunately ln doesn't have a nice option to ignore something that already exists. You are better off writing a script that links things in, and at the same time you can make things much more efficient:
find /incoming -mtime -5 -user nr -print0 | xargs -0 mylink
Where mylink has
#!/bin/bash
for i
do
link=/usr/local/symlinks/"$(basename "$i")"
[[ -L "$link" ]] || ln -s "$i" /usr/local/symlinks
done
If you want to be even more efficient, you can accumulate the list of files to be linked in an array and then link them all with one ln command, but that's a lot of notation and I probably wouldn't bother.
To remove the symlinks that point to files older than 5 days:
find -L /usr/local/symlinks -mtime +5 -user nr -exec rm '{}' ';'
or again you can use xargs:
find -L /usr/local/symlinks -mtime +5 -user nr -print0 | xargs -0 rm -f
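Since the question also allows Python, here is a rough sketch of both steps as functions that a cron job could call. The names link_recent and prune_old are invented for this example. It handles regular files only, skips the owner check, and also removes dangling links, which the find commands above do not necessarily do.

```python
import os
import time

MAX_AGE_SECONDS = 5 * 24 * 60 * 60  # five days, as in the question


def link_recent(source_root, link_dir, now=None):
    """Symlink every file under source_root modified in the last 5 days."""
    now = time.time() if now is None else now
    for dirpath, _dirnames, filenames in os.walk(source_root):
        for name in filenames:
            target = os.path.join(dirpath, name)
            link = os.path.join(link_dir, name)
            fresh = now - os.path.getmtime(target) <= MAX_AGE_SECONDS
            # lexists is true for dangling links too, so nothing is overwritten.
            if fresh and not os.path.lexists(link):
                os.symlink(os.path.abspath(target), link)


def prune_old(link_dir, now=None):
    """Remove symlinks whose target is missing or older than 5 days."""
    now = time.time() if now is None else now
    for name in os.listdir(link_dir):
        link = os.path.join(link_dir, name)
        if not os.path.islink(link):
            continue
        # exists() and getmtime() follow the link, so they describe the target.
        if not os.path.exists(link) or now - os.path.getmtime(link) > MAX_AGE_SECONDS:
            os.remove(link)
```

A small wrapper script that calls link_recent and then prune_old with the real directories can be scheduled from cron; link_dir is assumed to live outside source_root.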