I have a couple of folders like this:
Main/
/a
/b
/c
...
I have to pass an input file (abc1.txt, abc2.txt, ...) from each of these folders respectively as input to my Python program.
The script right now is:
for i in `cat file.list`
do
echo $i
cd $i
#works on the assumption that there is only one .txt file
inputfile=`ls | grep .txt`
echo $inputfile
python2.7 ../getDOC.py $inputfile
sleep 10
cd ..
done
echo "Script executed successfully"
So I want the script to work correctly regardless of the number of .txt files.
Is there a built-in shell command to fetch the correct .txt files when a folder contains more than one?
The find command is well suited for this with -exec:
find /path/to/Main -type f -name "*.txt" -exec python2.7 ../getDOC.py {} \; -exec sleep 10 \;
Explanation:
find - invoke find
/path/to/Main - The directory to start your search at. By default find searches recursively.
-type f - Only consider files (as opposed to directories, etc)
-name "*.txt" - Only find the files with .txt extension. This is quoted so bash doesn't auto-expand the wildcard * via globbing.
-exec ... \; - For each such result found, run the following command on it:
python2.7 ../getDOC.py {} - the {} placeholder is replaced by each search result in turn.
sleep 10 - sleep for 10 seconds after the python script runs on each file. Remove this if you don't want the pause.
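If you would rather keep the loop structure of the original script, the same find can feed a null-delimited while loop. This sketch (the Main path and the getDOC.py location are assumptions) preserves the echo and the per-file sleep, and is safe for file names containing spaces:

```shell
# Process every .txt file under Main/, one at a time.
# -print0 / read -d '' keep file names with spaces or newlines intact.
find /path/to/Main -type f -name "*.txt" -print0 |
while IFS= read -r -d '' file; do
    echo "$file"
    python2.7 /path/to/getDOC.py "$file"
    sleep 10
done
```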
Better, using globs:
shopt -s globstar nullglob
for i in Main/**/*.txt; do
python2.7 ../getDOC.py "$i"
sleep 10
done
This example is recursive and requires bash 4 or later (for globstar).
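For anyone unfamiliar with those two options: globstar makes ** match directories recursively, and nullglob makes a pattern with no matches expand to nothing instead of staying as literal text. A small demonstration (the demo/ layout is made up):

```shell
shopt -s globstar nullglob
mkdir -p demo/a demo/b
touch demo/a/one.txt demo/b/two.txt

# globstar: ** descends into subdirectories
for f in demo/**/*.txt; do
    echo "$f"          # prints demo/a/one.txt and demo/b/two.txt
done

# nullglob: a pattern with no matches expands to zero words,
# so this loop body never runs
for f in demo/**/*.csv; do
    echo "never reached"
done
```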
find . -name "*.txt" -print0 | xargs -0 -n 1 python2.7 ../getDOC.py
I have an issue with a bash script. It runs a python script over all the csv files inside a directory, but I have noticed that the resultant files are incomplete: none of them contains more than 100 rows, and curiously each one stops writing at row number 100.
I include my bash code:
#!/usr/bin/env bash
find ./nodos_aleatorios100 -type f -name '*.csv' -print0 |
while IFS= read -r -d $'\0' line; do
nname=$(echo "$line" | sed "s/ecn/encn/")
python algoritmo_aleatorio.py "$line" "$nname"
done
And the question is: how can I solve this problem?
Thanks in advance to everyone who can help me.
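One possible culprit to rule out (an assumption, since algoritmo_aleatorio.py isn't shown): if the python script reads anything from standard input, it will consume the remaining lines of the pipe that the while read loop is iterating over, so only part of the file list gets processed. A minimal demonstration:

```shell
# A child command that reads stdin steals the rest of the loop's input.
printf 'one\ntwo\nthree\n' > /tmp/items.txt

count=0
while IFS= read -r line; do
    cat > /dev/null      # stands in for a stdin-reading python script
    count=$((count + 1))
done < /tmp/items.txt
echo "$count"            # prints 1: "two" and "three" were eaten by cat
```

If that is what is happening, redirecting the inner command's input fixes it: python algoritmo_aleatorio.py "$line" "$nname" < /dev/null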
I have a bash script that reads *.py scripts recursively and stops when the output of each *.py file exists in the directory (a *.pkl file). The main idea is that if an output does not exist, the corresponding python script has to run again, until the output of every *.py script has been created.
bash.sh
model1.py
model2.py
model3.py
model1.pkl # expected output
model2.pkl # expected output
model3.pkl # expected output
However, I have a problem here: when the second/third output does NOT exist (from the second/third *.py script), the script does not run again (while if the first output does not exist, the script runs again, as it should).
My script is the following:
#!/bin/bash
for x in $(find . -type f -name "*.py"); do
if [[ ! -f `basename -s .py $x`.pkl ]]; then #output with the same name of *.py file
python3 ${x}
else
exit 0
fi
done
So, how can I force the script to run again if the output of any *.py script is missing? Or is it a problem with the names of the outputs?
I tried using while read and until, but I failed to make the script read all the *.py files.
Thanks in advance!
Try this. It is not the best way, but at least it will point you in the right direction:
keep_running(){
for f in $(find . -type f -name "*.py");
do
file_name=$(dirname "$f")/$(basename "$f" .py).pkl
if [ ! -f "$file_name" ];
then
echo "$file_name doesn't exist" # here you can run your python script
fi
done
}
cnt_py=0
cnt_pkl=1
while [ $cnt_pkl -ne $cnt_py ] ; do
keep_running
cnt_py=$(ls -1 *.py | wc -l)
cnt_pkl=$(ls -1 *.pkl | wc -l)
done
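A tighter sketch of the same idea, avoiding the early exit 0 of the original: re-run every script whose output is missing, and repeat until nothing is missing. (Like the question, it assumes each script writes a .pkl with its own base name next to itself.)

```shell
#!/bin/bash
# Repeat until every *.py has its matching *.pkl
while :; do
    missing=0
    while IFS= read -r -d '' f; do
        pkl="${f%.py}.pkl"        # model1.py -> model1.pkl, same directory
        if [ ! -f "$pkl" ]; then
            missing=1
            python3 "$f"
        fi
    done < <(find . -type f -name "*.py" -print0)
    [ "$missing" -eq 0 ] && break
done
```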
I'm working on Google Colaboratory and I have to do some processing on files based on their extensions, like:
!find ./ -type f -name "*.djvu" -exec file '{}' ';'
and I expect an output like:
./file.djvu: DjVu multiple page document
but when I try to mix bash and python to use a list of extensions:
for f in file_types:
!echo "*.{f}"
!find ./ -type f -name "*.{f}" -exec file '{}' ';'
!echo "*.$f"
!find ./ -type f -name "*.$f" -exec file '{}' ';'
I get only the output of both echo commands, but nothing from find:
*.djvu
*.djvu
*.jpg
*.jpg
*.pdf
*.pdf
If I remove the -exec part it actually finds the files, so I can't figure out why the find command combined with -exec fails in some manner.
If needed I can provide more info/examples.
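A likely explanation (an assumption, since I can't reproduce your notebook): in IPython/Colab, ! shell lines interpolate {expression} with Python values; that is exactly why "*.{f}" works in the first variant. But it also means the literal {} that -exec needs is treated as an interpolation and never reaches find. Doubling the braces escapes them, so find receives a literal {}:

```python
for f in file_types:
    !echo "*.{f}"
    !find ./ -type f -name "*.{f}" -exec file '{{}}' ';'
```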
I found an ugly workaround going through a file: first I write the array to a file in python:
with open('/content/file_types.txt', 'w') as f:
for ft in file_types:
f.write(ft + '\n')
and then I read and use it from bash in another cell:
%%bash
filename=/content/file_types.txt
while IFS= read -r f;
do
find ./ -name "*.$f" -exec file {} ';' ;
done < $filename
This way I don't mix bash and python in the same cell, as suggested in the comments of another answer.
I hope to find a better solution, maybe using some trick that I'm not aware of.
This works for me:
declare -a my_array=("pdf" "py")
for i in "${my_array[@]}"; do find ./ -type f -name "*.$i" -exec file '{}' ';'; done
In my terminal, I can see a python program in execution:
python3 app.py
where can I find app.py?
I've tried looking at /proc/$pid/exe, but it links to the python interpreter.
I have many app.py programs on my system; I want to find out exactly which one is being executed with that pid.
I ran a short test on my machine and came up with this... maybe it helps:
Find the process id (PID) of the job in question:
$ ps -u $USER -o pid,cmd | grep app.py
The PID will be in the first column. Assign this number to the variable PID.
Find the current working directory of that job:
$ ls -l /proc/$PID/cwd
(for more info: cat /proc/$PID/environ)
If the script was started with a relative path, your app.py file will be in this directory.
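On Linux the two lookups can be combined (a sketch; PID is assumed to hold the number found above, and it assumes the process has not changed directory since it started): the working directory from /proc plus the script argument from the command line yield the actual file:

```shell
# Resolve which app.py a "python3 app.py" process is running
cwd=$(readlink "/proc/$PID/cwd")
# cmdline is NUL-separated: field 1 is python3, field 2 is the script
script=$(tr '\0' ' ' < "/proc/$PID/cmdline" | awk '{print $2}')
case "$script" in
    /*) echo "$script" ;;           # already an absolute path
    *)  echo "$cwd/$script" ;;      # relative: anchor it at the cwd
esac
```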
Check the file
/proc/[PID]/environ
Its PWD variable contains the full path of the directory the process was started from.
If you are on Mac OS X, please try:
sudo find / -type f -fstype local -iname "app.py"
If you are not on Mac OS X, you can use:
sudo find / -mount -type f -iname "app.py"
The find command will start from your root directory and search recursively for all files called "app.py" (case-insensitive).
We have a Sphinx configuration that'll generate a slew of HTML documents for our whole codebase. Sometimes, I'm working on one file and I just would like to see the HTML output from that file to make sure I got the syntax right without running the whole suite.
I looked for the simplest command I could run in a terminal to run Sphinx on this one file; I'm sure the info is out there, but I didn't see it.
Sphinx processes reST files (not Python files directly). Those files may contain references to Python modules (when you use autodoc). My experience is that if only a single Python module has been modified since the last complete output build, Sphinx does not regenerate everything; only the reST file that "pulls in" that particular Python module is processed. There is a message saying updating environment: 0 added, 1 changed, 0 removed.
To explicitly process a single reST file, specify it as an argument to sphinx-build:
sphinx-build -b html -d _build/doctrees . _build/html your_filename.rst
This is done in two steps:
Generate an rst file from the python module with sphinx-apidoc.
Generate html from the rst file with sphinx-build.
This script does the work. Call it while standing in the same directory as the module and provide it with the file name of the module:
#!/bin/bash
# Generate html documentation for a single python module
PACKAGE=${PWD##*/}
MODULE="$1"
MODULE_NAME=${MODULE%.py}
mkdir -p .tmpdocs
rm -rf .tmpdocs/*
# Exclude all directories, and all modules other than $MODULE
# (apidoc crashes if __init__.py is excluded)
sphinx-apidoc \
    -f -e --module-first --no-toc -o .tmpdocs "$PWD" \
    $(find "$PWD" -maxdepth 1 -mindepth 1 -type d) \
    $(find "$PWD" -maxdepth 1 -regextype posix-egrep \
        ! -regex ".*/$MODULE|.*/__init__.py" -type f)
rm .tmpdocs/$PACKAGE.rst
# build crashes if index.rst does not exist
touch .tmpdocs/index.rst
sphinx-build -b html -c /path/to/your/conf.py/ \
-d .tmpdocs .tmpdocs .tmpdocs .tmpdocs/*.rst
echo "**** HTML-documentation for $MODULE is available in .tmpdocs/$PACKAGE.$MODULE_NAME.html"