I am trying to print the size of a directory using Python Fabric. I used the code below:
def getfilesize():
    with settings(user='hduser',password='cisco'):
        path='/app/hadoop/tmp/myoutput/'
        os.path.getsize(path)
but it throws the error "no such file or directory".
But I can see this directory:
hduser@dn1:~$ cd /app/hadoop/tmp/myoutput/
hduser@dn1:/app/hadoop/tmp/myoutput$ ls
taskTracker tt_log_tmp ttprivate userlogs
Am I making a syntax error here?
This isn't a syntax error; rather, the Python code is still being executed on your local machine and not on the remote machine. So calling os.path.getsize checks the path on your local machine (where it does not exist).
Instead, you need to use Fabric to execute shell commands on the remote server and capture their output. There are Fabric modules that wrap common use cases, like the files module, so you don't have to work in terms of bash commands. Unfortunately, I don't know of any that will give you a recursive directory size. Fortunately, getting the size of a directory is a one-liner, so we can do something like:
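from fabric.api import run, settings

def getfilesize():
    with settings(user='hduser', password='cisco'):
        # -s prints a single summary line; -b (GNU du) reports the size in bytes.
        # Plain `du -s` reports disk blocks (1 KiB units with GNU du), not bytes.
        output = run('du -sb "/app/hadoop/tmp/myoutput/"')
        # output is the command's stdout as a (potentially multiline) string
        # for `du` it will look like: "183488582 /app/hadoop/tmp/myoutput"
        size_in_bytes = int(output.split()[0])
        return size_in_bytes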
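As a small addition, the parsing step can be pulled into a helper so that it can be checked without a remote host. This is only a sketch: parse_du_size and its block_size parameter are hypothetical names, not part of Fabric, and the helper assumes the last non-empty line of the output is the `du -s` summary line.

```python
def parse_du_size(output, block_size=1):
    """Return the size from `du -s` style output, multiplied by block_size.

    Assumption: the last non-empty line looks like "<number><whitespace><path>".
    """
    lines = [line for line in output.splitlines() if line.strip()]
    if not lines:
        raise ValueError("empty du output")
    first_field = lines[-1].split()[0]
    return int(first_field) * block_size
```

With `du -sb` (GNU du) the number is already in bytes, so the default block_size=1 applies. With plain `du -s` on a GNU system, passing block_size=1024 converts the 1 KiB blocks to bytes.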
Related
I want to implement a userland command that takes one of its arguments (a path) and changes the directory to that path. After the program completes, I would like the shell to be in that directory. In other words, I want to implement the cd command, but as an external program.
Can it be done in a Python script, or do I have to write a bash wrapper?
Example:
tdi@bayes:/home/$>python cd.py tdi
tdi@bayes:/home/tdi$>
Others have pointed out that you can't change the working directory of a parent from a child.
But there is a way you can achieve your goal -- if you cd from a shell function, it can change the working dir. Add this to your ~/.bashrc:
go() {
    cd "$(python /path/to/cd.py "$1")"
}
Your script should print the path to the directory that you want to change to. For example, this could be your cd.py:
#!/usr/bin/python
import sys, os.path
if sys.argv[1] == 'tdi': print(os.path.expanduser('~/long/tedious/path/to/tdi'))
elif sys.argv[1] == 'xyz': print(os.path.expanduser('~/long/tedious/path/to/xyz'))
Then you can do:
tdi@bayes:/home/$> go tdi
tdi@bayes:/home/tdi$> go tdi
That is not going to be possible.
Your script runs in a sub-shell spawned by the parent shell where the command was issued.
Any cding done in the sub-shell does not affect the parent shell.
cd is exclusively(?) implemented as a shell internal command, because any external program cannot change parent shell's CWD.
As codaddict writes, what happens in your sub-shell does not affect the parent shell. However, if your goal is to present the user with a shell in a different directory, you could always have Python use os.chdir to change the sub-shell's working directory and then launch a new shell from Python. This will not change the working directory of the original shell, but will leave the user with one in a different directory.
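A minimal sketch of that idea, in Python 3. The function name shell_in_directory and its argv parameter are made up for illustration; by default it starts whatever $SHELL points to, falling back to /bin/sh.

```python
import os
import subprocess


def shell_in_directory(path, argv=None):
    """Change this process's directory, then run a child that inherits it.

    argv is a hypothetical parameter: by default the user's shell
    ($SHELL, falling back to /bin/sh) is started.
    """
    if argv is None:
        argv = [os.environ.get("SHELL", "/bin/sh")]
    os.chdir(os.path.expanduser(path))
    # The child inherits the new working directory; the shell that started
    # Python keeps its own directory.
    return subprocess.call(argv)
```

Calling shell_in_directory("~/long/tedious/path/to/tdi") would leave the user in a nested shell; typing exit returns to the original shell, still in its original directory.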
As explained by mrdiskodave in "Equivalent of shell 'cd' command to change the working directory?", there is a hack to achieve the desired behavior in pure Python.
I made some modifications to the answer from mrdiskodave to make it work in Python 3:
The pipes.quote() function has moved to shlex.quote().
To mitigate the issue of user input during execution, you can delete any previous user input with the backspace character "\x08".
So my adaptation looks like the following:
import fcntl
import shlex
import termios
from pathlib import Path

def change_directory(path: Path):
    quoted_path = shlex.quote(str(path))
    # Remove up to 32 characters entered by the user.
    backspace = "\x08" * 32
    cmd = f"{backspace}cd {quoted_path}\n"
    # Push each character into the terminal's input queue, as if it were typed.
    for c in cmd:
        fcntl.ioctl(1, termios.TIOCSTI, c)
I shall try to show how to set a Bash terminal's working directory to whatever path a Python program wants in a fairly easy way.
Only Bash can set its working directory, so routines are needed for Python and Bash. The Python program has a routine defined as:
with open(somefile, "w") as fob:
    fob.write(dd)
"Somefile" could for convenience be a RAM disk file. Bash "mount" would show tmpfs mounted somewhere like "/run/user/1000", so somefile might be "/run/user/1000/pythonwkdir". "dd" is the full directory path name desired.
The Bash file would look like:
#!/bin/bash
#pysync ---Command ". pysync" will set bash dir to what Python recorded
cd "$(cat /run/user/1000/pythonwkdir)"
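For completeness, here is a slightly fuller sketch of the Python side in Python 3. The name record_directory is hypothetical, and the state file location is whatever you choose (the tmpfs path above is only an example). Unlike the short routine, it refuses to record a path that is not a directory, so the later cd does not fail.

```python
import os


def record_directory(target, state_file):
    """Write the absolute form of target to state_file for the shell to read."""
    absolute = os.path.abspath(os.path.expanduser(target))
    if not os.path.isdir(absolute):
        raise ValueError("not a directory: %s" % absolute)
    with open(state_file, "w") as handle:
        # The command substitution on the Bash side strips the trailing newline.
        handle.write(absolute + "\n")
    return absolute
```

After the Python program has called record_directory(some_path, "/run/user/1000/pythonwkdir"), sourcing the Bash file with ". pysync" moves the terminal to that directory.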
I am creating a script to run shell commands for simulation purposes using a web app. I want to run a shell command in a django app and then save the output to a file.
The problem I am facing is that when running the shell command, the output tries to get saved in the url that is invoked (for example: localhost:8000/projects) which is understandable.
I want to save the output to for example:
/home/myoutput/output.txt rather than /projects or /tasks
I have to run a whole script and save its output to the txt file later, but that is easy once this is done.
I have already tried the os.chdir() function to change the directory to /desiredpath.
from subprocess import run
#the function invoked from views.py
def invoke_mpiexec():
    run('echo "this is a test file" > fahadTest.txt')
FileNotFoundError at /projects
Exception Type: FileNotFoundError
First I want to say that directly calling external programs from a web request in Django is a bit of an anti-pattern. The preferred approach is to use a work queue like Celery or rq, but that comes with a bit of added complexity.
That being said, you can solve your problem with the argument shell=True:
from subprocess import run
#the function invoked from views.py
def invoke_mpiexec():
    run('echo "this is a test file" > fahadTest.txt', shell=True)
Here is the documentation:
If shell is True, the specified command will be executed through the
shell. This can be useful if you are using Python primarily for the
enhanced control flow it offers over most system shells and still want
convenient access to other shell features such as shell pipes,
filename wildcards, environment variable expansion, and expansion of ~
to a user’s home directory. However, note that Python itself offers
implementations of many shell-like features (in particular, glob,
fnmatch, os.walk(), os.path.expandvars(), os.path.expanduser(), and
shutil).
Note: Using shell=True can lead to security issues:
If the shell is invoked explicitly, via shell=True, it is the
application’s responsibility to ensure that all whitespace and
metacharacters are quoted appropriately to avoid shell injection
vulnerabilities.
You should use subprocess.call with the stdout argument:
import subprocess

def invoke_mpiexec():
    # No shell is involved, so the text needs no extra quoting.
    with open("fahadTest.txt", "w") as f:
        subprocess.call(['echo', 'this is a test file'], stdout=f)
or use the write function:
def invoke_mpiexec():
    with open('fahadTest.txt', 'w') as f:
        f.write("Now the file has more content!")
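Since the original problem was where the file ends up, a variant that takes an absolute output path may be useful. This is only a sketch in Python 3 (3.5+ for subprocess.run): run_and_save is a made-up helper name, and it insists on an absolute path so the result does not depend on the working directory of the Django process.

```python
import subprocess
from pathlib import Path


def run_and_save(argv, output_path):
    """Run argv without a shell and write its stdout and stderr to output_path."""
    target = Path(output_path).expanduser()
    if not target.is_absolute():
        raise ValueError("output_path must be absolute: %s" % output_path)
    # Create missing parent directories, like `mkdir -p` would.
    target.parent.mkdir(parents=True, exist_ok=True)
    with target.open("w") as out:
        completed = subprocess.run(argv, stdout=out, stderr=subprocess.STDOUT)
    return completed.returncode
```

For example, run_and_save(['echo', 'this is a test file'], '/home/myoutput/output.txt') would write the text to that file, provided the process has permission to create it.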
So I figured it out.
Below is the fix:
run('mkdir -p $HOME/phdata/test/ && echo "this is a test file" > $HOME/phdata/test/fahadTest.txt', shell=True)
mkdir -p creates the directory if it doesn't exist.
$HOME expands to the home directory, so the path no longer depends on the current working directory.
The shell=True argument is required to run it as a shell command.
You can also create an SSH connection and run the commands or scripts on the remote server. For this, my approach would be to create a script on the remote server, call it from my app and pass arguments to it. Another workaround, which is not as good but works, is to create a script on the server using the above line and then call it.
#!/usr/bin/python
import requests, zipfile, StringIO, sys
extractDir = "myfolder"
zip_file_url = "download url"
response = requests.get(zip_file_url)
zipDocument = zipfile.ZipFile(StringIO.StringIO(response.content))
zipinfos = zipDocument.infolist()
for zipinfo in zipinfos:
    extrat = zipDocument.extract(zipinfo,path=extractDir)
System configuration
Ubuntu OS 16.04
Python 2.7.12
$ python extract.py
When I run the code in a terminal with the above command, it works properly: it creates the folder and extracts the files into it.
However, when I create a cron job using sudo rights, the code executes but doesn't create any folder or extract the files.
crontab command:
40 10 * * * /usr/bin/sudo /usr/bin/python /home/ubuntu/demo/directory.py > /home/ubuntu/demo/logmyshit.log 2>&1
also tried
40 10 * * * /usr/bin/python /home/ubuntu/demo/directory.py > /home/ubuntu/demo/logmyshit.log 2>&1
Notes:
I checked the syslog; it says the cron job is running successfully.
The above code gives no errors.
I also made the Python program executable with chmod +x filename.py.
Please help; where am I going wrong?
Oops, there is nothing really wrong with running a Python script from crontab, but many bad things can happen because the environment is not the one you are used to.
When you type python directory.py in an interactive shell, the PATH and all required Python environment variables have been set as part of login and interactive shell initialization, and the current directory is your home directory by default, or wherever you currently are.
When the same command is run from crontab, the current directory may not be what you expect (cron typically starts jobs in the user's home directory, not in the script's directory), PATH is often only /bin:/usr/bin, and the Python environment variables are not set. That means that you will have to tweak the environment variables in the crontab file until you get a correct Python environment, and set the current directory explicitly. In particular, a relative path such as "myfolder" is created relative to cron's current directory, so the extracted files may simply be somewhere else.
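One way to see the difference is to have the script report its own environment and compare the output from a terminal run with what lands in the cron log. A minimal Python 3 sketch follows; describe_environment and format_environment are made-up names, and which values are worth logging is my assumption.

```python
import os
import sys


def describe_environment():
    """Collect the values that most often differ between a terminal and cron."""
    return {
        "cwd": os.getcwd(),
        "path": os.environ.get("PATH", ""),
        "executable": sys.executable,
    }


def format_environment(env):
    """Render the collected values as sorted key=value lines for a log file."""
    return "\n".join("%s=%s" % (key, env[key]) for key in sorted(env))
```

Printing format_environment(describe_environment()) at the top of the script, with the crontab line redirecting output to a log as above, shows which directory and PATH the job really gets.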
I had a very similar problem and it turned out cron didn’t like importing matplotlib, I ended up having to specify Agg backend. I figured it out by putting log statements after each line to see how far the program got before it crapped out. Of course, my log was empty which tipped me off that it crashed on imports.
TLDR: log each line inside the script
I'm trying to run code from this repo: https://github.com/tylin/coco-caption, specifically from https://github.com/tylin/coco-caption/blob/master/pycocoevalcap/tokenizer/ptbtokenizer.py, line 51-52:
p_tokenizer = subprocess.Popen(cmd, cwd=path_to_jar_dirname, \
stdout=subprocess.PIPE)
The error I get running this is
OSError: [Errno 2] No such file or directory
I can't figure out why the file can't be found.
The jar I'm trying to run is:
stanford-corenlp-3.4.1.jar
You can see the structure of directory by going to https://github.com/tylin/coco-caption/tree/master/pycocoevalcap/tokenizer. For more specificity into what my actual arguments are when I run the line of code:
cmd= ['java', '-cp', 'stanford-corenlp-3.4.1.jar', 'edu.stanford.nlp.process.PTBTokenizer', '-preserveLines', '-lowerCase', 'tmpWS5p0Z'],
and
path_to_dirname =abs_path_to_folder/tokenizer
I can see the jar that needs to be run, and it looks to be in the right place, so why can't Python find it? (Note: I'm using Python 2.7.) And the temporary file 'tmpWS5p0Z' is where it should be.
Edit: I'm using Ubuntu
Try an absolute path (meaning a path beginning from the root, /):
https://en.wikipedia.org/wiki/Path_(computing)#Absolute_and_relative_paths
For relative paths in Python see, e.g., "Relative paths in Python" and "How to refer to relative paths of resources when working with a code repository in Python".
UPDATE:
As a test, try subprocess.Popen() with the shell=True option and give an absolute path for every file involved, including tmpWS5p0Z.
Two kinds of path are involved in this subprocess.Popen() call:
1) the paths Python resolves: Python has to find the java executable (and the cwd directory, if one is given);
2) the paths Java resolves: the classpath entry stanford-corenlp-3.4.1.jar and the input file. The class name edu.stanford.nlp.process.PTBTokenizer is looked up on the classpath; it is not a file path and must not be prefixed with a directory.
As this is all rather complicated, try:
p_tokenizer = subprocess.Popen('/absolute_path_to/java -cp /absolute_path_to/stanford-corenlp-3.4.1.jar edu.stanford.nlp.process.PTBTokenizer -preserveLines -lowerCase /absolute_path_to/tmpWS5p0Z', shell=True)
Python specify popen working directory via argument
Python subprocess.Popen() error (No such file or directory)
Just in case it might help someone:
I was struggling with the same problem (same https://github.com/tylin/coco-caption code). Might be relevant to say that I was running the code with python 3.7 on CentOS using qsub. So I changed
cmd = ['java', '-cp', 'stanford-corenlp-3.4.1.jar', 'edu.stanford.nlp.process.PTBTokenizer', '-preserveLines', '-lowerCase', 'tmpWS5p0Z']
to
cmd = ['/abs/path/to/java -cp /abs/path/to/stanford-corenlp-3.4.1.jar edu.stanford.nlp.process.PTBTokenizer -preserveLines -lowerCase ', ' /abs/path/to/temporary_file']
Using absolute paths fixed the OSError: [Errno 2] No such file or directory. Notice that I still put '/abs/path/to/temporary_file' as second element in the cmd list, because it got added later on. But then something went wrong in the tokenizer java subprocess, I don't know why or what, just observing because:
p_tokenizer = subprocess.Popen(cmd, cwd=path_to_jar_dirname, stdout=subprocess.PIPE, shell=True)
token_lines = p_tokenizer.communicate(input=sentences.rstrip())[0]
Here token_lines was empty (which is not the wanted behavior). Executing just the subprocess.Popen(...) call in IPython (without the communicate call) resulted in the following:
Exception in thread "main" edu.stanford.nlp.io.RuntimeIOException: java.io.IOException: Input/output error
at edu.stanford.nlp.process.PTBTokenizer.getNext(PTBTokenizer.java:278)
at edu.stanford.nlp.process.PTBTokenizer.getNext(PTBTokenizer.java:163)
at edu.stanford.nlp.process.AbstractTokenizer.hasNext(AbstractTokenizer.java:55)
at edu.stanford.nlp.process.PTBTokenizer.tokReader(PTBTokenizer.java:444)
at edu.stanford.nlp.process.PTBTokenizer.tok(PTBTokenizer.java:416)
at edu.stanford.nlp.process.PTBTokenizer.main(PTBTokenizer.java:760)
Caused by: java.io.IOException: Input/output error
at java.base/java.io.FileInputStream.readBytes(Native Method)
at java.base/java.io.FileInputStream.read(FileInputStream.java:279)
at java.base/java.io.BufferedInputStream.read1(BufferedInputStream.java:290)
at java.base/java.io.BufferedInputStream.read(BufferedInputStream.java:351)
at java.base/sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284)
at java.base/sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
at java.base/sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
at java.base/java.io.InputStreamReader.read(InputStreamReader.java:185)
at java.base/java.io.BufferedReader.read1(BufferedReader.java:210)
at java.base/java.io.BufferedReader.read(BufferedReader.java:287)
at edu.stanford.nlp.process.PTBLexer.zzRefill(PTBLexer.java:24511)
at edu.stanford.nlp.process.PTBLexer.next(PTBLexer.java:24718)
at edu.stanford.nlp.process.PTBTokenizer.getNext(PTBTokenizer.java:276)
... 5 more
Again, I don't know why or what, but I just wanted to share that doing this fixed it:
cmd = ['/abs/path/to/java -cp /abs/path/to/stanford-corenlp-3.4.1.jar edu.stanford.nlp.process.PTBTokenizer -preserveLines -lowerCase /abs/path/to/temporary_file']
And changing cmd.append(os.path.join(path_to_jar_dirname, os.path.basename(tmp_file.name))) into cmd[0] += os.path.join(path_to_jar_dirname, os.path.basename(tmp_file.name)).
So making cmd into a list with only 1 element, containing the entire command with absolute paths at once. Thanks for your help!
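A possible explanation for why the one-element list helped: on POSIX, when shell=True is combined with a list, only the first element is the command string, and the remaining elements become arguments of the shell itself rather than of the program. In the two-element version, the temporary file path would therefore never have reached java, which would be consistent with the tokenizer trying to read standard input instead. The sketch below (Python 3.5+, POSIX; run_capture is a made-up helper) shows the difference using echo and the Python interpreter instead of java.

```python
import subprocess
import sys


def run_capture(args, shell=False):
    """Run a command and return its stdout as stripped text."""
    result = subprocess.run(
        args, shell=shell, stdout=subprocess.PIPE, universal_newlines=True
    )
    return result.stdout.strip()


# With shell=True and a list, only "echo first" is the command;
# "ignored" becomes $0 of the shell and never reaches echo.
lost = run_capture(["echo first", "ignored"], shell=True)

# Without a shell, every list element reaches the program as one argument,
# even when it contains spaces.
kept = run_capture(
    [sys.executable, "-c", "import sys; print(sys.argv[1:])", "a b", "c"]
)
```

So either form is consistent: a single string with shell=True, or a list of separate arguments without shell=True. Mixing the two is what silently drops arguments.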
As @Lars mentioned above, the issue I had was that Java wasn't installed. I solved it with:
sudo apt update
sudo apt install default-jdk
sudo apt install default-jre
Making this post since I had this issue twice (due to reinstallation problems) and forgot about it.
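To tell these cases apart before calling Popen, it can help to check the two things that Popen itself resolves: the executable and the cwd directory. When Popen raises "[Errno 2] No such file or directory", it refers to one of those; a missing jar would instead show up as an error message from Java after it has started. A minimal Python 3 sketch, with find_popen_problems being a made-up helper name:

```python
import os
import shutil


def find_popen_problems(cmd, cwd=None):
    """List reasons why subprocess.Popen(cmd, cwd=cwd) could fail with ENOENT."""
    problems = []
    if cwd is not None and not os.path.isdir(cwd):
        problems.append("working directory does not exist: %s" % cwd)
    # shutil.which searches PATH the same way a plain command name is resolved.
    if shutil.which(cmd[0]) is None:
        problems.append("executable not found on PATH: %s" % cmd[0])
    return problems
```

For the command in this question, find_popen_problems(cmd, cwd=path_to_jar_dirname) would report a missing java executable on a machine without Java installed.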
My Crontab -l
# m h dom mon dow command
SHELL=/bin/bash
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games
00 8,20 * * * python /home/tomi/amaer/controller.py >>/tmp/out.txt 2>&1
My controller.py has config file settings.cfg also it uses other script in the folder it's located (I chmoded only controller.py)
The error
IOError: [Errno 2] No such file or directory: 'settings.cfg'
I have no idea how to fix this. Please help.
Edit: the part that reads the config file:
def main():
    config=ConfigParser.ConfigParser()
    config.readfp(open("settings.cfg"),"r")
As I initially wrote in my comment, this happens because you are using a path relative to the current working directory. That directory is not the same when the script is started by cron as when you run it yourself from the script's folder.
Your current code looks for "settings.cfg" in the current working directory, which under cron is usually the user's home directory and not the directory containing your script. Hence, you need to change your code to use absolute paths with the help of the "os" standard library module.
Try the following:
import os
...
def main():
    config = ConfigParser.ConfigParser()
    scriptDirectory = os.path.dirname(os.path.realpath(__file__))
    settingsFilePath = os.path.join(scriptDirectory, "settings.cfg")
    config.readfp(open(settingsFilePath,"r"))
This gets the directory of your script and then appends "settings.cfg" using the appropriate directory separator for your operating system, which is Linux in this particular case.
If the location of the config file changes any time in the future, you could use the argparse module for processing a command line argument to handle the config location properly, or even without it simply just using the first argument after the script name like sys.argv[1].
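A minimal sketch of the argparse variant, written for Python 3. The --config option name and the resolve_config_path helper are my own choices for illustration; the default is still the settings.cfg next to the script.

```python
import argparse
import os


def resolve_config_path(argv, script_file):
    """Return the config path from --config, or settings.cfg beside the script."""
    script_dir = os.path.dirname(os.path.realpath(script_file))
    default = os.path.join(script_dir, "settings.cfg")
    parser = argparse.ArgumentParser(description="Run the controller.")
    # --config is a hypothetical option name chosen for this example.
    parser.add_argument("--config", default=default,
                        help="path to the config file (default: %(default)s)")
    args = parser.parse_args(argv)
    return os.path.abspath(args.config)
```

Inside main() this would be called as resolve_config_path(sys.argv[1:], __file__), and the crontab line could then pass --config with an absolute path if the file ever moves.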
Your code is looking for settings.cfg in its current working directory.
This working directory will not be the same when cron executes the job, hence the error.
You have two "easy" solutions:
Use an absolute path to the config file in your script (/home/tomi/amaer/settings.cfg)
cd to the appropriate directory first in your crontab (cd /home/tomi/amaer/ && python /home/tomi/amaer/controller.py)
The "right" solution, though, would be to pass your script a parameter (or environment variable) that tells it where to look for the config file.
It's not exactly good practice to assume your config file will always be lying right next to your script.
You might want to have a look at this question: https://unix.stackexchange.com/questions/38951/what-is-the-working-directory-when-cron-executes-a-job