How to zip all outputs (*.csv and *.tiff) in jupyter notebook - python

I use this command
! zip -r results.zip . -i *.csv *.pdf
in Jupyter Notebook (Python 3.7.10) in order to zip all the output files. But it shows:
'zip' is not recognized as an internal or external command, operable program or batch file.
Can anyone suggest what I'm missing?

I think you want to use os.system() or subprocess for this.
If you are using Linux:
import os
os.system("zip -r results.zip . -i *.csv *.pdf")
# OR
import subprocess
subprocess.Popen("zip -r results.zip . -i *.csv *.pdf", shell=True)
If you aren't on Linux but on Windows, there is a standard-library module called zipfile that you can use:
from zipfile import ZipFile
import os
files = os.listdir(".")  # "." means the current working folder; alternately, use glob inside the `with` block
with ZipFile('output.zip', 'w') as myzip:
    for file in files:
        if file.endswith(".csv") or file.endswith(".pdf"):
            myzip.write(file)
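As mentioned in the comment above, you could also build the file list with glob instead of filtering os.listdir(); a minimal sketch, assuming the notebook's working directory holds the outputs:
from zipfile import ZipFile
import glob

# Collect matching files with glob patterns instead of filtering os.listdir()
with ZipFile('output.zip', 'w') as myzip:
    for file in glob.glob('*.csv') + glob.glob('*.pdf'):
        myzip.write(file)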

On modern Windows the built-in command-line archiver is tar (bsdtar), which can also create .zip files.
It is not well documented, nor as easy to use as most third-party zippers, so you would need to wrap the OS call yourself.
Generally the following should be good enough:
tar -a -cf results.zip *.csv *.pdf
However, if files of one of the two types are missing, the archive is still created for the group that does exist, but tar exits with a rather cryptic message:
Tar: : Couldn't visit directory: No such file or directory
Tar: Error exit delayed from previous errors.
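If you want to drive that from Python (e.g. from the notebook) rather than typing the command by hand, a minimal sketch of such a wrapper could look like this, assuming tar.exe is on the PATH; collecting the file list in Python first also avoids the cryptic error when one of the patterns matches nothing:
import glob
import subprocess

files = glob.glob("*.csv") + glob.glob("*.pdf")
if files:
    # -a picks the zip format from the .zip suffix, -c creates, -f names the archive
    subprocess.run(["tar", "-a", "-cf", "results.zip", *files], check=True)
else:
    print("No .csv or .pdf files found, nothing to archive.")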

Related

Execute python program with multiple files - Python - Bash

I have hundreds of XML files and I would like to parse them into CSV files. I have already written this program.
To execute the Python program I use this command (in VS Code on Windows):
python ConvertXMLtoCSV.py -i Alarm120.xml -o Alarm120.csv
My question is: how can I change this script to add a for loop that executes the program for each XML file?
UPDATE
If my files and folders are organized like in the picture (an XML_Files folder and a CSV_Files folder next to the script):
I tried this and executed it as a .bat file on Windows 10, but it does nothing:
#!/bin/bash
for xml_file in XML_Files/*.xml
do
csv_file=${xml_file/.xml/.csv}
python ConvertXMLtoCSV.py -i XML_Files/$xml_file -o CSV_Files/$csv_file
done
Ideally the for loop would be included inside your ConvertXMLtoCSV.py itself. You can use this to find all xml files in a given directory:
for file in os.listdir(directory_path):
    if file.endswith(".xml"):
        # And here you can do your conversion
You could change the arguments given to the script to be the path of the directory the xml files are located in and the path for an output folder for the .csv files. For renaming, you can leave the files with the same name but give the .csv extension. i.e.
csv_name = file.replace(".xml", ".csv")
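Putting those pieces together, a minimal sketch of the modified script could look like the following; convert_file() is a hypothetical stand-in for whatever your existing conversion logic does:
import os
import sys

xml_dir = sys.argv[1]   # e.g. XML_Files
csv_dir = sys.argv[2]   # e.g. CSV_Files

for file in os.listdir(xml_dir):
    if file.endswith(".xml"):
        csv_name = file.replace(".xml", ".csv")
        # convert_file() is hypothetical: call your existing conversion code here
        convert_file(os.path.join(xml_dir, file), os.path.join(csv_dir, csv_name))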
If you want to keep your Python script as-is (process one file), and add the looping externally in bash, you could do:
#!/bin/bash
for xml_file in *.xml
do
csv_file=${xml_file/.xml/.csv}
python ConvertXMLtoCSV.py -i $xml_file -o $csv_file
done
After discussion, it appears that you wish to use an external script so as to leave the original ConvertXMLtoCSV.py script unmodified (as required by other projects), but that although you tagged bash in the question, it turned out that you were not in fact able to use bash to invoke python when you tried it in your setup.
This being the case, it is possible to adapt Rolv Apneseth's answer so that you do the looping in Python, but inside a separate script (let's suppose that this is called convert_all.py), which then runs the unmodified ConvertXMLtoCSV.py as an external process. This way, the ConvertXMLtoCSV.py will still be set up to process only one file each time it is run.
To call an external process, you could either use os.system or subprocess.Popen, so here are two options.
Using os.system:
import os
import sys
directory_path = sys.argv[1]

for file in os.listdir(directory_path):
    if file.endswith(".xml"):
        file = os.path.join(directory_path, file)  # full path, so the script can be run from anywhere
        csv_name = file.replace(".xml", ".csv")
        os.system(f'python ConvertXMLtoCSV.py -i {file} -o {csv_name}')
Note: for versions of Python too old to support f-strings, that last line could be changed to
os.system('python ConvertXMLtoCSV.py -i {} -o {}'.format(file,csv_name))
Using subprocess.Popen:
import os
import subprocess
import sys

directory_path = sys.argv[1]

for file in os.listdir(directory_path):
    if file.endswith(".xml"):
        file = os.path.join(directory_path, file)  # full path, so the script can be run from anywhere
        csv_name = file.replace(".xml", ".csv")
        p = subprocess.Popen(['python', 'ConvertXMLtoCSV.py',
                              '-i', file,
                              '-o', csv_name])
        p.wait()
You could then run it using some command such as:
python convert_all.py C:/Users/myuser/Desktop/myfolder
or whatever the folder is where you have the XML files.

Executing all files inside a folder in Python

I have 20 Python files stored inside a directory on Ubuntu 14.04, named 1.py, 2.py, 3.py, 4.py and so on.
I have to execute these files with "python 1.py", "python 2.py" and so on, 20 times.
Is there a way to execute all the Python files inside a folder with a single command?
find . -maxdepth 1 -name "*.py" -exec python3 {} \;
for F in $(/bin/ls *.py); do ./$F; done
You can use any bash construct directly from the command line, like this for loop. I also call /bin/ls explicitly to bypass any alias you might have set.
Use a loop inside the folder:
#!/bin/bash
for script in *.py; do
    python "$script"
done
You can try the glob library.
glob is part of the Python standard library, so there is nothing to install.
Then import it:
import glob
Then use a for loop to iterate through all files:
for fileName in glob.glob('*.py'):
    # do something, for example var1 = fileName
The * wildcard matches all the .py files.
More information here: https://docs.python.org/2/library/glob.html
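To actually run each matched file rather than just iterate over the names, you could combine glob with subprocess; a small sketch, assuming all the scripts can be run with the same interpreter:
import glob
import subprocess
import sys

for fileName in glob.glob('*.py'):
    # Run each script with the same Python interpreter that runs this loop
    subprocess.call([sys.executable, fileName])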

How to use the previous command output to use as a part of another command: python

I have been trying to use the output of a system command as part of the next command. However, I cannot seem to join them properly, and hence I am not able to run the second command. The OS is Kali Linux with Python 2.7.
# IMPORTS
import commands, os, subprocess

os.system('mkdir -p ~/Desktop/TOOLS')
checkdir = commands.getoutput('ls ~/Desktop')
if 'TOOLS' in checkdir:
    currentwd = subprocess.check_output('pwd', shell=True)
    cmd = 'cp -R {}/RAW ~/Desktop/TOOLS/'.format(currentwd)
    os.system(cmd)
    os.system('cd ~/Desktop/TOOLS')
    os.system('pwd')
The errors are:
cp: missing destination file operand after ‘/media/root/ARSENAL’
Try 'cp --help' for more information.
sh: 2: /RAW: not found
/media/root/ARSENAL
It seems that the reading of the first command is alright but it can't join with the RAW portion. I have read many other solutions, but they seem to be for shell scripting instead.
Assuming you haven't called os.chdir() anywhere prior to the cp -R, you can simply use a relative path. (The underlying problem is that subprocess.check_output('pwd', shell=True) returns the directory with a trailing newline, so the string you build actually contains two shell commands: cp with no destination operand, and then /RAW on its own line, which is exactly what the two errors show.) Changing the code to...
if 'TOOLS' in checkdir:
    cmd = 'cp -R RAW ~/Desktop/TOOLS'
    os.system(cmd)
...should do the trick.
Note that the line...
os.system('cd ~/Desktop/TOOLS')
...will not do what you expect. os.system() spawns a subshell, so it will just change the working directory for that process and then exit. The calling process's working directory will remain unchanged.
If you want to change the working directory for the calling process, use...
os.chdir(os.path.expanduser('~/Desktop/TOOLS'))
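A quick way to see the difference (nothing here is specific to your setup):
import os

os.system('cd ~/Desktop/TOOLS')  # changes directory only inside the short-lived subshell
print(os.getcwd())               # the calling process is still where it was

os.chdir(os.path.expanduser('~/Desktop/TOOLS'))
print(os.getcwd())               # now ~/Desktop/TOOLS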
However, Python has all this functionality built-in, so you can do it without spawning any subshells...
import os, shutil
# Specify your path constants once only, so it's easier to change
# them later
SOURCE_PATH = 'RAW'
DEST_PATH = os.path.expanduser('~/Desktop/TOOLS/RAW')
# Recursively copy the files, creating the destination path if necessary.
shutil.copytree(SOURCE_PATH, DEST_PATH)
# Change to the new directory
os.chdir(DEST_PATH)
# Print the current working directory
print os.getcwd()

run python script in every file in a directory through command line

I have a Python script which takes the filename as a command argument and processes that file. However, I have thousands of files I need to process, and I would like to run the script on every file without having to add the filename as the argument each time.
for example:
process.py file1 will do exactly what I want
however, I want to run process.py on a folder containing thousands of files (file1, file2, file3, etc.)
I have found out that it can be done simply in Bash:
for f in *; do python myscript.py $f; done
However, I am on windows and don't want to install something like Cygwin.
What would a piece of code for the Windows command line look like that would emulate what the above Bash code accomplishes?
for %%f in (*.py) do (
    start %%f
)
I think that'll work -- I don't have a Windows box handy at the moment to try it
How to loop through files matching wildcard in batch file
That link might help
import os, subprocess

for f in os.listdir('.'):
    if os.path.isfile(f):
        subprocess.call(["python", "myscript.py", f])
This solution will work on every platform, provided the python executable is in the PATH.
Also, if you want to recursively process files in nested subdirectories, you can use os.walk() instead of os.listdir()+os.path.isfile().
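For the recursive case, a sketch using os.walk() might look like this (same assumption that python is on the PATH):
import os
import subprocess

for dirpath, dirnames, filenames in os.walk('.'):
    for f in filenames:
        # os.walk already separates files from directories, so no isfile() check is needed
        subprocess.call(["python", "myscript.py", os.path.join(dirpath, f)])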
Since you have python, why not use that?
import subprocess
import glob
import sys
import os.path
for fname in glob.iglob(os.path.join('some-directory-name', '*')):
    proc = subprocess.Popen([sys.executable, 'myscript.py', fname])
    proc.wait()
What's more, it's portable.
For each file in the current dir:
for %f in (*) do C:\Python34\python.exe "%f"
Update:
Note the quotes on the %f. You need them if your files contain spaces in the name. You can also put any path+executable after the do.
If we imagine your files look like:
./process.py
./myScripts/file1.py
./myScripts/file2.py
./myScripts/file3.py
...
In your example, it would simply be:
for %f in (.\myScripts\*) do process.py "%f"
This would invoke:
process.py ".\myScripts\file1.py"
process.py ".\myScripts\file2.py"
process.py ".\myScripts\file3.py"

Use curl to download multiple files

I have to use cURL on Windows from a Python script. My goal is: using a Python script, get all files from a remote directory ... preferably into a local directory. After that I will compare each file with the files stored locally. I am able to get one file at a time, but I need to get all of the files from the remote directory.
Could someone please advise how to get multiple files?
I use this command:
curl.exe -o file1.txt sftp:///dir1/file1.txt -k -u user:password
thanks
I haven't tested this, but I think you could just try launching each shell command as a separate process to run them simultaneously. Obviously, this might be a bad idea if you have a large set of files, so you might need to manage that more carefully. Here's some untested code, and you'd need to edit the 'cmd' variable in the get_file function, of course.
from multiprocessing import Process
import subprocess

def get_file(filename):
    cmd = 'curl.exe -o {} sftp:///dir1/{} -k -u user:password'.format(filename, filename)
    subprocess.check_output(cmd, shell=True, stderr=subprocess.STDOUT)  # run the shell command

if __name__ == '__main__':  # required on Windows so the child processes can safely re-import this module
    files = ['file1.txt', 'file2.txt', 'file3.txt']
    for filename in files:
        p = Process(target=get_file, args=(filename,))  # a process that passes filename to get_file()
        p.start()
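If the set of files is large, one way to keep the number of simultaneous downloads bounded would be a multiprocessing.Pool instead of one Process per file; a sketch reusing the get_file() function from above:
from multiprocessing import Pool

if __name__ == '__main__':
    files = ['file1.txt', 'file2.txt', 'file3.txt']
    with Pool(processes=4) as pool:  # at most 4 downloads run at the same time
        pool.map(get_file, files)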
