I am in the following scenario:

my script opens a socket connection to a remote database,
gets an iterator (returned from executing a SQL statement against the remote database),
iterates over the iterator and prints the values.

Since the SQL often returns a lot of records, I often use the script in a pipe, i.e. python script.py | head -10. Of course I get a socket error when I pipe it to head. The fix for the exception is:
from signal import signal, SIGPIPE, SIG_DFL
signal(SIGPIPE,SIG_DFL)
My question is: in the case of a pipe (e.g. head -5), does the database socket connection get closed automatically and properly? If not, how do I close it in my script when it is used in a pipe?
The code structure would look like this:
def getIter(n, conn):
    for i in xrange(n):
        yield i

def p(l):
    for x in l:
        print x

if __name__ == '__main__':
    # dbms_socket_conn.open()
    # get iterator
    ii = getIter(100, conn=None)
    p(ii)
    print "is the dbms connection closed in the case of a pipe (e.g. head -5)?"
    # dbms_socket_conn.close()
You could do something like:

try:
    # dbms_socket_conn.open()
    # get iterator
    ii = getIter(100, conn=None)
    p(ii)
    print "is the dbms connection closed in the case of a pipe (e.g. head -5)?"
finally:
    # dbms_socket_conn.close()
    pass  # replace with the real close() call

to ensure that your code executes the close method.
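Equivalently, if the connection object has a close() method, contextlib.closing expresses the same guarantee with less nesting. A minimal sketch, where open_dbms_connection() and run_query() are hypothetical stand-ins for your real database API:

from contextlib import closing

# closing() calls conn.close() when the with block exits,
# whether it finishes normally or via an exception
with closing(open_dbms_connection()) as conn:  # open_dbms_connection is hypothetical
    for row in run_query(conn):                # run_query is hypothetical
        print row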
EDIT:
The "head -5" only show the first 5 lines of the output but that doesn't mean that the script stops. If you try this:
from signal import signal, SIGPIPE, SIG_DFL
signal(SIGPIPE, SIG_DFL)

def getIter(n, conn):
    for i in xrange(n):
        yield i

def p(l):
    for x in l:
        print x

if __name__ == '__main__':
    try:
        ii = getIter(100, conn=None)
        p(ii)
    finally:
        with open("test.txt", "w") as f:
            f.write("asdasdasdasd")
you'll see that the file is written, which means the finally clause is executed. That is where you should put dbms_socket_conn.close().
Related
If I have a Python script running (with a full Tkinter GUI and everything) and I want to pass the live data it is gathering (stored internally in arrays and such) to another Python script, what would be the best way of doing that?

I cannot simply import script A into script B, as that would create a new instance of script A rather than accessing any variables in the already running script A.

The only way I can think of doing it is by having script A write to a file and then having script B get the data from the file. This is less than ideal, however, as something bad might happen if script B tries to read the file while script A is writing to it. I am also looking for much faster communication between the two programs.
EDIT:
Here are the examples as requested. I am aware of why this doesn't work, but it is the basic premise of what needs to be achieved. My source code is very long and unfortunately confidential, so it is not going to help here. In summary, script A is running Tkinter and gathering data, while script B is views.py as part of Django, but I'm hoping this can be achieved in plain Python.
Script A
import time

i = 0

def return_data():
    return i

if __name__ == "__main__":
    while True:
        i = i + 1
        print i
        time.sleep(.01)
Script B
import time
from scriptA import return_data

if __name__ == '__main__':
    while True:
        print return_data()  # from script A
        time.sleep(1)
You can use the multiprocessing module to implement a Pipe between the two modules. You can then start one of the modules as a Process and use the Pipe to communicate with it. The best part about using pipes is that you can also pass Python objects like dicts and lists through them.
Ex:
mp2.py:
from multiprocessing import Process, Pipe
from mp1 import f

if __name__ == '__main__':
    parent_conn, child_conn = Pipe()
    p = Process(target=f, args=(child_conn,))
    p.start()
    print(parent_conn.recv())  # prints "Hello"
    p.join()
mp1.py:
def f(child_conn):
    msg = "Hello"
    child_conn.send(msg)
    child_conn.close()
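Applied to the question, script A's gathering loop could send each new value through the pipe as it is produced. A sketch, where gather_data stands in for the real Tkinter data-gathering code:

from multiprocessing import Process, Pipe
import time

def gather_data(child_conn):
    # stand-in for script A's data-gathering loop
    for i in range(5):
        child_conn.send(i)     # push each new value to the other side
        time.sleep(0.01)
    child_conn.close()

if __name__ == '__main__':
    parent_conn, child_conn = Pipe()
    p = Process(target=gather_data, args=(child_conn,))
    p.start()
    child_conn.close()  # close the parent's copy so recv() can see EOF
    while True:
        try:
            print(parent_conn.recv())  # receive live values as they arrive
        except EOFError:               # raised once the sender has closed
            break
    p.join()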
If you want to read and modify shared data between two scripts that run separately, a good solution is to take advantage of the Python multiprocessing module and use a Pipe() or a Queue() (see the differences here). This way you get to keep the scripts in sync and avoid problems with concurrency and global variables (like what happens if both scripts want to modify a variable at the same time).
As Akshay Apte said in his answer, the best part about using pipes/queues is that you can pass Python objects through them.

Also, there are methods to avoid waiting for data if none has been passed yet (queue.empty() and pipeConn.poll()).
See an example using Queue() below:
# main.py
from multiprocessing import Process, Queue
from stage1 import Stage1
from stage2 import Stage2

s1 = Stage1()
s2 = Stage2()

# S1 to S2 communication
queueS1 = Queue()  # s1.stage1() writes to queueS1
# S2 to S1 communication
queueS2 = Queue()  # s2.stage2() writes to queueS2

# start s2 as another process
s2 = Process(target=s2.stage2, args=(queueS1, queueS2))
s2.daemon = True
s2.start()  # launch the stage2 process

s1.stage1(queueS1, queueS2)  # start sending stuff from s1 to s2
s2.join()  # wait till the s2 daemon finishes
# stage1.py
import time
import random

class Stage1:
    def stage1(self, queueS1, queueS2):
        print("stage1")
        lala = []
        lis = [1, 2, 3, 4, 5]
        for i in range(len(lis)):
            # to avoid unnecessary waiting
            if not queueS2.empty():
                msg = queueS2.get()  # get msg from s2
                print("! ! ! stage1 RECEIVED from s2:", msg)
                lala = [6, 7, 8]  # now that a msg was received, further msgs will be different
            time.sleep(1)  # work
            random.shuffle(lis)
            queueS1.put(lis + lala)
        queueS1.put('s1 is DONE')
# stage2.py
import time

class Stage2:
    def stage2(self, queueS1, queueS2):
        print("stage2")
        while True:
            msg = queueS1.get()  # wait till there is a msg from s1
            print("- - - stage2 RECEIVED from s1:", msg)
            if msg == 's1 is DONE':
                break  # ends loop
            time.sleep(1)  # work
            queueS2.put("update lists")
EDIT: I just found that you can use queue.get(False) to avoid blocking when receiving data. This way there's no need to check first whether the queue is empty. This is not possible if you use pipes.
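A quick sketch of that non-blocking pattern (Python 3; note that multiprocessing queues hand data to a feeder thread, so a freshly put item may take a moment to appear):

from multiprocessing import Queue
import queue  # only for the Empty exception
import time

q = Queue()
q.put("hello")
time.sleep(0.1)  # give the queue's feeder thread a moment to deliver

try:
    msg = q.get(False)  # non-blocking; same as q.get(block=False)
    print(msg)
except queue.Empty:
    pass  # nothing has arrived yet; go do other work instead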
You could use the pickle module to pass data between two Python programs.
import pickle

def storeData():
    # initializing data to be stored in db
    employee1 = {'key': 'Engineer', 'name': 'Harrison',
                 'age': 21, 'pay': 40000}
    employee2 = {'key': 'LeadDeveloper', 'name': 'Jack',
                 'age': 50, 'pay': 50000}

    # database
    db = {}
    db['employee1'] = employee1
    db['employee2'] = employee2

    # It's important to use binary mode
    dbfile = open('examplePickle', 'ab')
    # source, destination
    pickle.dump(db, dbfile)
    dbfile.close()

def loadData():
    # for reading, binary mode is also important
    dbfile = open('examplePickle', 'rb')
    db = pickle.load(dbfile)
    for key in db:
        print(key, '=>', db[key])
    dbfile.close()
This will pass data to and from two running scripts using a TCP socket, via ZeroMQ: https://zeromq.org/languages/python/. The required module is pyzmq (pip install pyzmq).

This is called client-server communication. The server waits for the client to send a request; the client will not run if the server is not running. This also lets you send a request from one device (the client) to another (the server), as long as the client and server are on the same network: change localhost (on the server side, marked with *) to the actual IP of your server device, and change the client's localhost to the server's IP. (To find the IP, go into your device's network settings, click on your network icon, find advanced settings or properties, and look for the IP address; note that this may be different from what Google reports if you are, say, behind IPv6 or DDoS protection.)

A question for the OP: does script B have to be running at all times, or can script B be imported as a module into script A? If so, look up how to make Python modules.
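For reference, a minimal request/reply sketch with pyzmq; the port number and message contents are arbitrary choices for this example:

server.py:

import zmq

context = zmq.Context()
socket = context.socket(zmq.REP)
socket.bind("tcp://*:5555")  # * listens on all interfaces; port is arbitrary

while True:
    request = socket.recv_string()          # wait for a client request
    print("Received:", request)
    socket.send_string("ack: " + request)   # reply to the client

client.py:

import zmq

context = zmq.Context()
socket = context.socket(zmq.REQ)
socket.connect("tcp://localhost:5555")  # use the server's IP across machines

socket.send_string("hello")
print(socket.recv_string())  # prints the server's reply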
I solved the same problem using the Shared Memory Dict library, a very simple dict implementation on top of multiprocessing.shared_memory.
Source1.py
from shared_memory_dict import SharedMemoryDict
from time import sleep

smd_config = SharedMemoryDict(name='config', size=1024)

if __name__ == "__main__":
    smd_config["status"] = True
    while True:
        smd_config["status"] = not smd_config["status"]
        sleep(1)
Source2.py
from shared_memory_dict import SharedMemoryDict
from time import sleep

smd_config = SharedMemoryDict(name='config', size=1024)

if __name__ == "__main__":
    while True:
        print(smd_config["status"])
        sleep(1)
Example: A simple program that prints the value of a list every 10 seconds
import argparse
import time
import sys

myList = []

def parseArguments():
    parser = argparse.ArgumentParser(description="example")
    parser.add_argument('-a', '--addElement', help='adds an element to the list')
    args = parser.parse_args()
    if args.addElement:
        myList.append(args.addElement)

def main():
    parseArguments()
    while True:
        print(myList)
        time.sleep(10)

if __name__ == '__main__':
    main()
The problem is that the program only reads the arguments passed at start; I want it to read arguments passed at any time while it is running. I want to run the program in the background like a service and pass arguments to it every once in a while.
I understand that what you are asking for looks like a service (or daemon process) able to accept asynchronous commands.
External interface:

prog foo
=> ok, repeatedly prints ['foo']

later:

prog bar
=> second instance exits, and the first instance repeatedly prints ['foo', 'bar']
Internal design
That's far from simple! You need to set up an IPC mechanism to allow the second instance to communicate with the first one, with non-blocking IO (or multithreading) in the first instance. Under Unix you could use os.mkfifo, but if you want a portable solution you will have to use IP sockets on localhost.
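For illustration, a Unix-only sketch of the os.mkfifo route; the fifo path is an arbitrary choice, and the hard part (non-blocking reads alongside the periodic print loop) is deliberately left out:

import os

FIFO = '/tmp/prog.fifo'  # arbitrary path for this sketch
if not os.path.exists(FIFO):
    os.mkfifo(FIFO)

# first instance: block reading commands from the fifo
with open(FIFO) as f:   # open() blocks until some writer opens the fifo
    for line in f:
        print('received:', line.strip())

# a second instance would simply write to it:
#   with open(FIFO, 'w') as f:
#       f.write('bar\n')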
Structure in high-level pseudocode:

get argument via argparse
bind to a fixed port on localhost, with the UDP protocol
if success:
    # ok, it is the first prog
    initialize list from argument
    loop:
        get command from UDP socket, with timeout = 10s
        if cmd is add param:
            add parameter to list
        elif cmd is exit:  # not asked in question, but should exist
            exit
        print list
else:
    # another prog has taken the socket, pass it the arg
    send the arg to the UDP port with the proper protocol
Caveats of this simple design: there is a race condition if a prog already waiting on the socket exits between the first attempt to bind and the send. To deal with that, you should use the TCP protocol, with a select with a timeout on the listening socket, and a graceful shutdown to ensure that the message was received on the other side. In case of an error, you iterate (a maximum number of times), because the first server could have exited in the meanwhile.
Here is an implementation example:
import socket
import select
import argparse
import time
import sys

TIMEOUT = 10
IFACE = '127.0.0.1'
PORT = 4000
DEBUG = False

myList = []
old = ""

def parseArguments():
    global DEBUG
    parser = argparse.ArgumentParser(description="example")
    parser.add_argument('-a', '--addElement',
                        help='adds an element to the list')
    parser.add_argument('-q', '--quit', action='store_true',
                        help='closes main service')
    parser.add_argument('-d', '--debug', action='store_true',
                        help='display debug information')
    args = parser.parse_args()
    if args.quit:
        senddata("QUIT\n")
        sys.exit(0)
    if args.debug:
        DEBUG = True
    if args.addElement:
        myList.append(args.addElement)

def read(s):
    # read from the socket until a newline; keep any extra bytes for next call
    global old
    data = old
    while True:
        block = s.recv(1024)
        if len(block) == 0:
            return data
        if b'\n' in block:
            block, o = block.split(b'\n', 1)
            old = o.decode()
            data += block.decode()
            return data
        data += block.decode()

def gracefulclose(s, msg):
    # send the last message, then wait for the peer to close its side
    s.send(msg.encode())
    s.shutdown(socket.SHUT_WR)
    try:
        read(s)
    finally:
        s.close()

def server(s):
    if DEBUG:
        print("SERVER")
    s.listen(5)
    while True:
        sl = select.select([s], [], [], TIMEOUT)
        if len(sl[0]) > 0:
            s2, peer = s.accept()
            try:
                data = read(s2)
                print(data)
                gracefulclose(s2, "OK")
            finally:
                s2.close()
            if data.startswith("QUIT"):
                return
            elif data.startswith("DATA:"):
                myList.append(data[5:])
        print(myList)

def senddata(data):
    s = socket.socket(socket.AF_INET)
    try:
        s.connect((IFACE, PORT))
        s.send(data.encode())
        data = read(s)
        if data.startswith("OK"):
            return True
    except Exception:
        pass
    finally:
        s.close()
    return False

def client():
    return senddata("DATA:" + myList[0] + "\n")

def main():
    end = False
    MAX = 5
    while not end and MAX > 0:
        s = socket.socket(socket.AF_INET)
        try:
            s.bind((IFACE, PORT))
        except Exception:
            s.close()
            s = None
        if s:
            try:
                server(s)
            finally:
                s.close()
            return
        else:
            if DEBUG:
                print("CLIENT", " ", 6 - MAX)
            end = client()
            MAX -= 1
            time.sleep(1)

if __name__ == "__main__":
    parseArguments()
    main()
import argparse
import time
import sys
import select

myList = []

def parseArguments():
    parser = argparse.ArgumentParser(description="example")
    parser.add_argument('-a', '--addElement', help='adds an element to the list')
    args = parser.parse_args()
    if args.addElement:
        myList.append(args.addElement)

def main():
    parseArguments()
    while True:
        # drain any lines waiting on stdin without blocking
        while select.select([sys.stdin], [], [], 0)[0]:
            myList.append(sys.stdin.readline().strip())
        print(myList)
        time.sleep(10)

if __name__ == '__main__':
    main()
If you are passing more arguments during execution, you must read them from stdin. Using the select module, you can check whether there is a new line on stdin and, if so, append it to myList.
Basically, what you're asking is how to do inter-process communication (IPC).
Why did I say that? Well, ask yourself: how would you like to pass these arguments to your background service? By hand? I don't think so (that way you'd have a simple interactive program which should just wait for user input). You probably want some other script/program which sends these arguments via some kind of on-demand commands.
Generally there are several ways to communicate between two or more programs, the most popular being:
Shared file - you could simply check the contents of a file on your disk. The advantage of this solution is that you could probably edit the file with your favourite text editor, without the need to write a client application.
Pipes - one program reads its input, which is the other program's output. You should simply read sys.stdin:

# receiver
import sys

def read_input():
    for line in sys.stdin:
        yield line
Sockets - a data stream sent over a network interface (but it can be sent locally on the same machine). The Python docs have a very nice introduction to socket programming.
Shared memory - your programs read/write the same memory block. In Python you can use the mmap module to achieve this (see the sketch just below).
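As a rough illustration of that last option, here is a sketch of sharing a small block through an mmap'd file; the path and size are arbitrary choices:

import mmap

# create and pre-size the backing file (done once, by either process)
with open('/tmp/shared.dat', 'wb') as f:
    f.write(b'\x00' * 64)

# writer process: map the file and write into the shared block
with open('/tmp/shared.dat', 'r+b') as f:
    mm = mmap.mmap(f.fileno(), 64)
    mm[0:5] = b'hello'  # a reader mapping the same file sees this
    mm.flush()
    mm.close()

# a reader process would map the same file and read mm[0:5]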
Whichever way you choose to communicate between your processes, you should establish some kind of interface between them. It can be a very simple text-based interface like this one:

# command syntax
<command> SPACE <parameter> NEWLINE
SPACE := 0x20    # space character
NEWLINE := 0x0A  # '\n' character

# a command adding an element to the receiver's list
ADD SPACE <element> NEWLINE

# a command removing an element from the receiver's list
REMOVE SPACE <element> NEWLINE

# examples:
ADD first element\n
REMOVE first element\n
So, for example, if you send a message over a socket (which I recommend), your receiver (server) should read the buffer until a newline character, then check whether the first word is "ADD", and then add the remaining characters (minus the newline) to your list. Of course, you should be prepared for some kinds of "attacks": for example, you should specify that your messages cannot be longer than, say, 4096 bytes. This way you can discard your current buffer once it reaches its limit, meaning that you won't allocate memory indefinitely while waiting for a newline character. That's one very important rule: don't trust user input.
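As a rough, non-production sketch, a receiver for the ADD/REMOVE grammar above could look like this; the port and the 4096-byte cap are arbitrary:

import socket

MAX_LINE = 4096  # discard anything longer: don't trust user input
items = []

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(('127.0.0.1', 5050))
srv.listen(1)

conn, _ = srv.accept()
buf = b''
while True:
    chunk = conn.recv(1024)
    if not chunk:
        break
    buf += chunk
    while b'\n' in buf:
        line, buf = buf.split(b'\n', 1)
        cmd, _, param = line.decode().partition(' ')
        if cmd == 'ADD':
            items.append(param)
        elif cmd == 'REMOVE' and param in items:
            items.remove(param)
        print(items)
    if len(buf) > MAX_LINE:  # enforce the message size limit
        buf = b''
conn.close()
srv.close()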
Good luck! :)
This is my first try with threads in Python.

I wrote the following program as a very simple example. It just gets a list and prints it using some threads. However, whenever there is an error, the program just hangs in Ubuntu, and I can't seem to do anything to get the command prompt back, so I have to start another SSH session to get back in.

I also have no idea what the issue with my program is.

Is there some kind of error handling I can put in to ensure it doesn't hang?

Also, any idea why Ctrl-C doesn't work (I don't have a Break key)?
from Queue import Queue
from threading import Thread
import HAInstances
import logging

log = logging.getLogger()
logging.basicConfig()

class GetHAInstances:
    def oraHAInstanceData(self):
        log.info('Getting HA instance routing data')
        # HAData = SolrGetHAInstances.TalkToOracle.main()
        HAData = HAInstances.main()
        log.info('Query fetched ' + str(len(HAData)) + ' HA Instances to query')
        # for row in HAData:
        #     print row
        return HAData

def do_stuff(q):
    while True:
        print q.get()
        print threading.current_thread().name
        q.task_done()

oraHAInstances = GetHAInstances()
mainHAData = oraHAInstances.oraHAInstanceData()

q = Queue(maxsize=0)
num_threads = 10

for i in range(num_threads):
    worker = Thread(target=do_stuff, args=(q,))
    worker.setDaemon(True)
    worker.start()

for row in mainHAData:
    # print str(row[0]) + ':' + str(row[1]) + ':' + str(row[2]) + ':' + str(row[3])
    q.put((row[0], row[1], row[2], row[3]))

q.join()
In your thread method, it is recommended to use "try ... except ... finally". This structure guarantees that control returns to the main thread even when errors occur.
def do_stuff(q):
    while True:
        try:
            pass  # do your work here
        except Exception:
            pass  # log the error here
        finally:
            q.task_done()
Also, in case you want to kill your program, find out the pid of your main thread and use kill <pid> to kill it. In Ubuntu or Mint, use ps -Ao pid,cmd; in the output, you can find the pid (first column) by searching for the command (second column) you typed to run your Python script.
Your q is hanging because your worker has errored, so q.task_done() never got called.

Also, you need

import threading

to use

print threading.current_thread().name
The script below listens to an IMAP connection using IMAP IDLE and depends heavily on threads. What's the easiest way for me to eliminate the thread calls and just use the main thread?

As a new Python developer I tried editing the def __init__(self, conn): method, but just got more and more errors.

A code sample would help me a lot.
#!/usr/local/bin/python2.7

print "Content-type: text/html\r\n\r\n"

import socket, ssl, json, struct, re
import imaplib2, time
from threading import *

# enter gmail login details here
USER = "username@gmail.com"
PASSWORD = "password"

# enter device token here
deviceToken = 'my device token x x x x x'
deviceToken = deviceToken.replace(' ', '').decode('hex')

currentBadgeNum = -1

def getUnseen():
    (resp, data) = M.status("INBOX", '(UNSEEN)')
    print data
    return int(re.findall("UNSEEN (\d)*\)", data[0])[0])

def sendPushNotification(badgeNum):
    global currentBadgeNum, deviceToken
    if badgeNum != currentBadgeNum:
        currentBadgeNum = badgeNum
        thePayLoad = {
            'aps': {
                'alert': 'Hello world!',
                'sound': '',
                'badge': badgeNum,
            },
            'test_data': {'foo': 'bar'},
        }
        theCertfile = 'certif.pem'
        theHost = ('gateway.push.apple.com', 2195)
        data = json.dumps(thePayLoad)
        theFormat = '!BH32sH%ds' % len(data)
        theNotification = struct.pack(theFormat, 0, 32,
                                      deviceToken, len(data), data)
        ssl_sock = ssl.wrap_socket(socket.socket(socket.AF_INET,
                                                 socket.SOCK_STREAM),
                                   certfile=theCertfile)
        ssl_sock.connect(theHost)
        ssl_sock.write(theNotification)
        ssl_sock.close()
        print "Sent Push alert."

# This is the threading object that does all the waiting on
# the event
class Idler(object):
    def __init__(self, conn):
        self.thread = Thread(target=self.idle)
        self.M = conn
        self.event = Event()

    def start(self):
        self.thread.start()

    def stop(self):
        # This is a neat trick to make the thread end. Took me a
        # while to figure that one out!
        self.event.set()

    def join(self):
        self.thread.join()

    def idle(self):
        # Starting an unending loop here
        while True:
            # This is part of the trick to make the loop stop
            # when the stop() command is given
            if self.event.isSet():
                return
            self.needsync = False

            # A callback method that gets called when a new
            # email arrives. Very basic, but that's good.
            def callback(args):
                if not self.event.isSet():
                    self.needsync = True
                    self.event.set()

            # Do the actual idle call. This returns immediately,
            # since it's asynchronous.
            self.M.idle(callback=callback)
            # This waits until the event is set. The event is
            # set by the callback, when the server 'answers'
            # the idle call and the callback function gets
            # called.
            self.event.wait()
            # Because the function sets the needsync variable,
            # this helps escape the loop without doing
            # anything if the stop() is called. Kinda neat
            # solution.
            if self.needsync:
                self.event.clear()
                self.dosync()

    # The method that gets called when a new email arrives.
    # Replace it with something better.
    def dosync(self):
        print "Got an event!"
        numUnseen = getUnseen()
        sendPushNotification(numUnseen)

# Had to do this stuff in a try-finally, since some testing
# went a little wrong.....
while True:
    try:
        # Set the following two lines to your creds and server
        M = imaplib2.IMAP4_SSL("imap.gmail.com")
        M.login(USER, PASSWORD)
        M.debug = 4
        # We need to get out of the AUTH state, so we just select
        # the INBOX.
        M.select("INBOX")
        numUnseen = getUnseen()
        sendPushNotification(numUnseen)

        typ, data = M.fetch(1, '(RFC822)')
        raw_email = data[0][1]

        import email
        email_message = email.message_from_string(raw_email)
        print email_message['Subject']

        # print M.status("INBOX", '(UNSEEN)')
        # Start the Idler thread
        idler = Idler(M)
        idler.start()
        # Sleep forever, one minute at a time
        while True:
            time.sleep(60)
    except imaplib2.IMAP4.abort:
        print("Disconnected. Trying again.")
    finally:
        # Clean up.
        # idler.stop()  # Commented out to see the real error
        # idler.join()  # Commented out to see the real error
        # M.close()     # Commented out to see the real error

        # This is important!
        M.logout()
As far as I can tell, this code is hopelessly confused because the author used the imaplib2 project library, which forces a threading model that this code then never uses.

Only one thread is ever created, and it wouldn't need to be a thread but for the choice of imaplib2. However, as the imaplib2 documentation notes:

This module presents an almost identical API as that provided by the standard python library module imaplib, the main difference being that this version allows parallel execution of commands on the IMAP4 server, and implements the IMAP4rev1 IDLE extension. (imaplib2 can be substituted for imaplib in existing clients with no changes in the code, but see the caveat below.)

This makes it appear that you should be able to throw out much of class Idler and just use the connection M. I recommend that you look at Doug Hellmann's excellent Python Module of the Week for the imaplib module prior to looking at the official documentation. You'll need to reverse-engineer the code to find out its intent, but it looks to me like it does the following:
1. Open a connection to GMail
2. Check for unseen messages in Inbox
3. Count unseen messages from (2)
4. Send a dummy message to some service at gateway.push.apple.com
5. Wait for a notice, go to (2)
Perhaps the most interesting thing about the code is that it doesn't appear to do anything useful. What sendPushNotification (step 4) does is a mystery, and the one line that uses an imaplib2-specific service:

self.M.idle(callback=callback)

uses a named argument that I don't see in the module documentation. Do you know if this code ever actually ran?
Aside from the unneeded complexity, there's another reason to drop imaplib2: it exists independently on SourceForge and PyPI, and one maintainer claimed two years ago that "an attempt will be made to keep it up-to-date with the original". Which one do you have? Which one would you install?
Don't do it
Since you are trying to remove the Thread usage solely because you didn't find out how to handle the exceptions from the server, I don't recommend removing it: because of the async nature of the library itself, the Idler handles this more smoothly than a single thread could.
Solution
You need to wrap the self.M.idle(callback=callback) call with try/except and then re-raise the exception in the main thread. You then handle the exception by re-running the code in the main thread to restart the connection.
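For illustration, here is a minimal, self-contained sketch of that pattern; the Worker class and its error attribute are placeholders invented for this example, not part of imaplib2:

import threading
import time

class Worker(object):
    def __init__(self):
        self.error = None
        self.thread = threading.Thread(target=self.run)

    def run(self):
        try:
            # stand-in for the real self.M.idle(callback=callback) call
            raise RuntimeError("server dropped the connection")
        except Exception as e:
            self.error = e  # stash the exception for the main thread

for attempt in range(3):  # bounded retries for the demo
    w = Worker()
    w.thread.start()
    w.thread.join()
    if w.error is None:
        break
    print("restarting after: %s" % w.error)
    time.sleep(1)  # back off before reconnecting in the real code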
You can find more details of the solution and possible reasons in this answer: https://stackoverflow.com/a/50163971/1544154
Complete solution is here: https://www.github.com/Elijas/email-notifier
I'm using a script to test whether a website runs smoothly; basically I open the site every 20 minutes or so and check the response time and so on, like this:

import mechanize
import time

while True:
    MechBrowser = mechanize.Browser()
    Response = MechBrowser.open("http://example.com")
    time.sleep(1000)

I know Python will do garbage collection itself and we should really not bother with it, but when I check the network monitor I always find several unclosed connections, each alive for an hour or more. And not all of the opened connections hang there, just some of them. I'm confused. Or is there maybe a method to destroy these instances manually?
Try also closing your response object.

del the object manually; note that this will not delete the object, but will just decrement its reference count. When the reference count of an object reaches zero, the garbage collector removes it from memory.
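Putting both suggestions together, a sketch of the loop (assuming, as I believe, that both the response object and mechanize.Browser expose a close() method):

import mechanize
import time

while True:
    MechBrowser = mechanize.Browser()
    Response = MechBrowser.open("http://example.com")
    # ... check the response time here ...
    Response.close()      # close the response's underlying connection
    MechBrowser.close()   # Browser objects also expose close()
    del Response          # drop the references so refcounting can reclaim them
    del MechBrowser
    time.sleep(1200)      # roughly every 20 minutes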
You can also use multiprocessing to ensure all the used resources are closed after checking:
from multiprocessing import Process
import time
import urllib2

def check_url(url):
    try:
        f = urllib2.urlopen(url)
        f.close()
        print "%s working fine" % url
    except Exception as exc:
        print "Error ", exc

if __name__ == '__main__':
    while True:
        p = Process(target=check_url, args=("http://www.google.com",))
        p.start()
        p.join()
        time.sleep(5)