I am using autobahn[twisted] for some WAMP communication. I subscribe to a topic and print the feed I get from it. When I do, I get something like this:
2016-09-25T21:13:29+0200 (u'USDT_ETH', u'12.94669009', u'12.99998074', u'12.90000334', u'0.00035594', u'18396.86929477', u'1422.19525455', 0, u'13.14200000', u'12.80000000')
I have spent too many hours trying to get rid of that timestamp. And yes, I tested printing other things, and they print without this timestamp. This is my code:
from twisted.internet.defer import inlineCallbacks
from autobahn.twisted.wamp import ApplicationSession, ApplicationRunner

class PushReactor(ApplicationSession):
    @inlineCallbacks
    def onJoin(self, details):
        print "subscribed"
        yield self.subscribe(self.onTick, u'ticker')

    def onTick(self, *args):
        print args

if __name__ == '__main__':
    runner = ApplicationRunner(u'wss://api.poloniex.com', u'realm1')
    runner.run(PushReactor)
How can I remove this timestamp?
Well, sys.stderr and sys.stdout are redirected to a Twisted logger.
You need to change the logging format before running your app.
See: https://twistedmatrix.com/documents/15.2.1/core/howto/logger.html
How to reproduce
You can reproduce your problem with this simple application:
from autobahn.twisted.wamp import ApplicationRunner

if __name__ == '__main__':
    print("hello1")
    runner = ApplicationRunner(u'wss://api.poloniex.com', u'realm1')
    print("hello2")
    runner.run(None)
    print("hello3")
When the process is killed, you'll see:
hello1
hello2
2016-09-26T14:08:13+0200 Received SIGINT, shutting down.
2016-09-26T14:08:13+0200 Main loop terminated.
2016-09-26T14:08:13+0200 hello3
During application launch, stdout (and stderr) are redirected to a file-like object (of class twisted.logger._io.LoggingFile).
Every call to print or write is turned into Twisted log messages (one for each line).
The redirection is done in the class twisted.logger._global.LogBeginner; look at the beginLoggingTo method.
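As a rough illustration of the redirection described above, the sketch below uses a hypothetical stand-in class (TimestampingFile, not part of Twisted) that prefixes each printed line with a timestamp. It then shows one possible workaround: writing to sys.__stdout__, which still refers to the interpreter's original stdout. This assumes the logger only replaces sys.stdout and leaves sys.__stdout__ untouched. It is written for Python 3; print_raw is an illustrative helper name.

```python
import sys
import time


class TimestampingFile(object):
    """Hypothetical stand-in for the file-like object that replaces sys.stdout."""

    def __init__(self, target):
        self._target = target

    def write(self, text):
        # Emit one timestamped message per non-empty line, like a log observer.
        for line in text.splitlines():
            if line:
                stamp = time.strftime("%Y-%m-%dT%H:%M:%S")
                self._target.write(stamp + " " + line + "\n")
        return len(text)

    def flush(self):
        self._target.flush()


def print_raw(*args, stream=None):
    """Print to the original stdout (or an explicit stream), bypassing sys.stdout."""
    print(*args, file=stream if stream is not None else sys.__stdout__)


if __name__ == "__main__":
    sys.stdout = TimestampingFile(sys.__stdout__)  # simulate the redirection
    print("goes through the redirect")             # printed with a timestamp prefix
    print_raw("bypasses the redirect")             # printed as-is
    sys.stdout = sys.__stdout__                    # restore
```

In the question's onTick callback, the equivalent would be to write args to sys.__stdout__ instead of calling a bare print.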
Related
tl;dr: How can I programmatically execute a Python module (not a function) as a separate process from a different Python module?
On my development laptop, I have a 'server' module containing a bottle server. In this module, the __name__ == '__main__' clause starts the bottle server.
@bt_app.post("/")
def server_post():
    << Generate response to 'http://server.com/' >>

if __name__ == '__main__':
    serve(bt_app, host='localhost', port=8080)
I also have a 'test_server' module containing pytests. In this module, the __name__ == '__main__' clause runs pytest and displays the results.
def test_something():
    _rtn = some_server_function()
    assert _rtn == desired

if __name__ == '__main__':
    _rtn = pytest.main([__file__])
    print("Pytest returned: ", _rtn)
Currently, I manually run the server module (starting the web server on localhost), then I manually start the pytest module, which issues HTTP requests to the running server module and checks the responses.
Sometimes I forget to start the server module. No big deal but annoying. So I'd like to know if I can programmatically start the server module as a separate process from the pytest module (just as I'm doing manually now) so I don't forget to start it manually.
Thanks
Here is my test directory tree:
test
├── server.py
└── test_server.py
server.py starts a web server with Flask.
from flask import Flask

app = Flask(__name__)

@app.route('/')
def hello_world():
    return 'Hello, World!'

if __name__ == '__main__':
    app.run()
test_server.py makes a request to the server to test it.
import sys
import requests
import subprocess
import time

p = None  # server process

def start_server():
    global p
    sys.path.append('/tmp/test')
    # Here you may want to check whether the server is
    # already running and, if so, skip this function.
    kwargs = {}  # pass any other arguments Popen needs here
    p = subprocess.Popen(['python', 'server.py'], **kwargs)

def test_function():
    response = requests.get('http://localhost:5000/')
    print('This is response body: ', response.text)

if __name__ == '__main__':
    start_server()
    time.sleep(3)  # wait for the server to start
    test_function()
    p.kill()
Then you can run python test_server.py to start the server and run the test cases.
PS: Popen() works on both Python 2 and Python 3. subprocess.run() requires Python 3.5+ and waits for the command to finish, so it is not a substitute for starting a background server.
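The fixed time.sleep(3) above is a guess at how long the server takes to start. A possible alternative, sketched here under the assumption that the server listens on a plain TCP port, is to poll the port until a connection succeeds. The helper name wait_for_port and its default timeout and interval are illustrative, not part of any library. Written for Python 3.

```python
import socket
import time


def wait_for_port(host, port, timeout=5.0, interval=0.1):
    """Return True once a TCP connection to (host, port) succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)  # not up yet; retry shortly
    return False
```

In test_server.py this could replace the sleep, for example: call start_server(), then proceed to test_function() only if wait_for_port('localhost', 5000) returns True.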
import logging
import threading
import time

def thread_function(name):
    logging.info("Thread %s: starting", name)
    time.sleep(2)
    logging.info("Thread %s: finishing", name)

if __name__ == "__main__":
    format = "%(asctime)s: %(message)s"
    logging.basicConfig(format=format, level=logging.INFO,
                        datefmt="%H:%M:%S")
    threads = list()
    for index in range(3):
        logging.info("Main : create and start thread %d.", index)
        x = threading.Thread(target=thread_function, args=(index,))
        threads.append(x)
        x.start()
    for index, thread in enumerate(threads):
        logging.info("Main : before joining thread %d.", index)
        thread.join()
        logging.info("Main : thread %d done", index)
With threading you can run multiple tasks concurrently within a single process!
Wim basically answered this question. I looked into the subprocess module. While reading up on it, I stumbled on the os.system function.
In short, subprocess is a highly flexible and full-featured module for running a program. os.system, on the other hand, is much simpler, with far fewer features.
Just running a Python module is simple, so I settled on os.system.
import os
# Note: os.system() blocks until the command exits.
server_path = "python ../src/server.py"
os.system(server_path)
Wim, thanks for the pointer. Had it been a full fledged answer I would have upvoted it. Redo it as a full fledged answer and I'll do so.
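To make the difference between the two approaches concrete: os.system() waits for the command to finish, whereas subprocess.Popen returns immediately and hands back a handle that can be stopped later. The following is a minimal sketch for Python 3. The helper names start_background and stop are hypothetical, and a sleeping child stands in for the server.

```python
import subprocess
import sys


def start_background(code):
    """Start `python -c code` without waiting for it to finish."""
    return subprocess.Popen([sys.executable, "-c", code])


def stop(process, timeout=5.0):
    """Terminate the child process and return its exit code."""
    process.terminate()
    try:
        return process.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        process.kill()  # fall back to a hard kill if it does not exit in time
        return process.wait()


if __name__ == "__main__":
    child = start_background("import time; time.sleep(30)")
    print("still running:", child.poll() is None)
    print("exit code:", stop(child))
```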
Async to the rescue.
import gevent
from gevent import monkey, spawn
monkey.patch_all()
from gevent.pywsgi import WSGIServer

@bt_app.post("/")
def server_post():
    << Generate response to 'http://server.com/' >>

def test_something():
    _rtn = some_server_function()
    assert _rtn == desired
    print("Pytest returned: ", _rtn)
    gevent.sleep(0)  # yield control to other greenlets

if __name__ == '__main__':
    spawn(test_something)  # runs async
    server = WSGIServer(("0.0.0.0", 8080), bt_app)
    server.serve_forever()
I am trying to write a spider with the multiprocessing module.
Here is my Python code:
# -*- coding:utf-8 -*-
import multiprocessing
import requests

class SpiderWorker(object):
    def __init__(self, q):
        self._q = q

    def run(self):
        def _crawl_item(url):
            respon = requests.get("http://www.baidu.com")
            if respon.ok:
                print respon.url
        while True:
            rst = self._q.get()
            _crawl_item(rst)

def general_worker():
    q = multiprocessing.Queue()
    CPU_COUNT = multiprocessing.cpu_count()
    worker_processes = [
        multiprocessing.Process(target=SpiderWorker(q).run)
        for i in range(CPU_COUNT)
    ]
    map( lambda process: process.start(), worker_processes )
    return q, worker_processes
Maybe the way I use processes is wrong.
Every time I run this code, my process tells me:
<Process(Process-1, stopped[SIGSEGV])>
I would appreciate any help.
The major problem here is that you don't have any information on why your processes fail. It could be gevent, but it could just as easily be something else. So learning the actual reason why your processes get terminated is the first step before doing anything else.
What you need is multiprocessing.log_to_stderr():
class SpiderWorker(object):
    # ...
    def run(self):
        logger = multiprocessing.log_to_stderr()
        logger.setLevel(multiprocessing.SUBDEBUG)
        try:
            pass  # Here goes your original run() code
        except Exception:
            logger.exception('whoopsie')
What this code does:
Creates a special logger which will transmit its information to the main process and dump it to stderr (console by default).
Configures this logger to report everything, including some internal multiprocessing module events (just in case, as you probably don't need them).
Wraps your entire code in a catch-all statement so whatever happens there cannot escape your notice.
Runs the .exception() method on the logger, which not only logs the message (it's meaningless anyway as we don't know what actually happens) but most importantly logs the entire error traceback, which is what we actually need.
I'm new to Twisted and after finally figuring out how the deferreds work I'm struggling with the tasks. What I want to achieve is to have a script that sends a REST request in a loop, however if at some point it fails I want to stop the loop. Since I'm using callbacks I can't easily catch exceptions and because I don't know how to stop the looping from an errback I'm stuck.
This is the simplified version of my code:
def send_request():
    agent = Agent(reactor)
    req_result = agent.request('GET', some_rest_link)
    req_result.addCallbacks(cp_process_request, cb_process_error)

if __name__ == "__main__":
    list_call = task.LoopingCall(send_request)
    list_call.start(2)
    reactor.run()
To end a task.LoopingCall, all you need to do is call stop() on the LoopingCall object (list_call in your case).
Somehow you need to make that variable available to your errback (cb_process_error), either by pushing it into a class that cb_process_error is in, via some other class used as a pseudo-global, or by literally using a global. Then you simply call list_call.stop() inside the errback.
BTW you said:
Since I'm using callbacks I can't easily catch exceptions
That's not really true. The point of an errback is to deal with exceptions; that's one of the things that literally causes it to be called! Check out my previous deferred answer and see if it makes errbacks any clearer.
The following is a runnable example (... I'm not saying this is the best way to do it, just that it is a way...)
#!/usr/bin/python
from twisted.internet import task
from twisted.internet import reactor
from twisted.internet.defer import Deferred
from twisted.web.client import Agent
from pprint import pprint

class LoopingStuff (object):
    def cp_process_request(self, return_obj):
        print "In callback"
        pprint (return_obj)

    def cb_process_error(self, return_obj):
        print "In Errorback"
        pprint(return_obj)
        self.loopstopper()

    def send_request(self):
        agent = Agent(reactor)
        req_result = agent.request('GET', 'http://google.com')
        req_result.addCallbacks(self.cp_process_request, self.cb_process_error)

def main():
    looping_stuff_holder = LoopingStuff()
    list_call = task.LoopingCall(looping_stuff_holder.send_request)
    looping_stuff_holder.loopstopper = list_call.stop
    list_call.start(2)
    reactor.callLater(10, reactor.stop)
    reactor.run()

if __name__ == '__main__':
    main()
Assuming you can get to google.com, this will fetch pages for 10 seconds. If you change the second argument of agent.request to something like http://127.0.0.1:12999 (assuming that port 12999 will give a connection refused), you'll see one errback printout (which will also have shut down the LoopingCall) and then a 10-second wait until the reactor shuts down.
The problem is that Twisted doesn't seem to ever send anything until you close the connection. The problem is visible both in my client and in Firefox (the server isn't sending).
Here's the full code.
#!/usr/bin/env python
#-*- coding: utf-8 -*-
from twisted.internet.protocol import Protocol,Factory
from twisted.internet.endpoints import TCP4ServerEndpoint,TCP4ClientEndpoint
from twisted.internet import reactor
import thread

class echoProtocol(Protocol):
    def dataReceived(self,data):
        self.transport.write(data+"\n - Server")

class echoFactory(Factory):
    def buildProtocol(self,addr):
        print addr.host
        return echoProtocol()

class clientProtocol(Protocol):
    def sendMessage(self,message):
        self.transport.write(message)
    def dataReceived(self,data):
        print data

class clientFactory(Factory):
    def buildProtocol(self,addr):
        return clientProtocol()

def messageLoop(p):
    while 1 :
        text=raw_input("")
        p.sendMessage(text)

def connectedProtocol(p):
    thread.start_new_thread(messageLoop, p)

if __name__ == '__main__':
    choice=raw_input("Server?[y/n]")
    if choice.lower()=="y":
        TCP4ServerEndpoint(reactor,44554).listen(echoFactory())
        reactor.run()
    else:
        TCP4ClientEndpoint(reactor,"127.0.0.1",44554).connect(clientFactory()).addCallback(connectedProtocol)
        reactor.run()
How do I make Twisted actually send something before closing the connection?
Punching in ctrl-c in your looping callback shows the problem. Your protocol is stuck in "write" mode and can never get to the dataReceived section until after it leaves the callback.
Is there any reason you can't follow the default echo client example? You also don't have reactor.stop called anywhere.
The primary problem is a misunderstanding of the deferred concept. You block inside the while loop, which means you never get to the dataReceived. But if you don't loop, how do you continue sending data? You need to add another deferred within your current deferred.
Notice in the code for the single use client how the callback gotProtocol adds another message to the reactor for calling later, then adds a closing callback. You need to make a recursive callback setup.
Here's your code, set up to recursively chain an additional callback as needed. It also has a shutdown function for the errback chain. You should add some code to check the contents of raw_input and attach a closing callback if something like quit is input. Otherwise it loops forever, unless the user hits it with ctrl-c.
#!/usr/bin/env python
#-*- coding: utf-8 -*-
from twisted.internet.protocol import Protocol,Factory
from twisted.internet.endpoints import TCP4ServerEndpoint,TCP4ClientEndpoint
from twisted.internet import reactor
import thread

class echoProtocol(Protocol):
    def dataReceived(self,data):
        self.transport.write(data+"\n - Server")

class echoFactory(Factory):
    def buildProtocol(self,addr):
        print addr.host
        return echoProtocol()

class clientProtocol(Protocol):
    def sendMessage(self,message):
        self.transport.write(message)
    def dataReceived(self,data):
        print data

class clientFactory(Factory):
    def buildProtocol(self,addr):
        return clientProtocol()

def messageLoop(p):
    text=raw_input("")
    p.sendMessage(text)
    reactor.callLater(1, messageLoop, p)

def connectedProtocol(p):
    thread.start_new_thread(messageLoop, p)

def shutdown(ignored):
    reactor.stop()

if __name__ == '__main__':
    choice=raw_input("Server?[y/n]")
    if choice.lower()=="y":
        TCP4ServerEndpoint(reactor,44554).listen(echoFactory())
        reactor.run()
    else:
        TCP4ClientEndpoint(reactor,"127.0.0.1",44554).connect(clientFactory()).addCallback(messageLoop).addErrback(shutdown)
        reactor.run()
I suspect that you are reading lines but not sending lines. In this circumstance the read blocks until it gets a newline or EOS. If you never send an EOL you will get one big line when you close the socket.
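The line-reading behaviour described above can be reproduced with a minimal sketch that uses only the standard library (plain sockets, not Twisted) and Python 3. The helper name read_one_line is hypothetical. readline() returns as soon as a newline arrives; without a newline it returns only when the sending side is closed.

```python
import socket


def read_one_line(payload, close_sender):
    """Send payload over a local socket pair and read one line on the other end.

    Warning: with no newline in payload and close_sender=False,
    readline() would block forever, which is the symptom described above.
    """
    sender, receiver = socket.socketpair()
    try:
        sender.sendall(payload)
        if close_sender:
            sender.close()  # EOF lets readline() return even without a newline
        with receiver.makefile("rb") as reader:
            return reader.readline()
    finally:
        sender.close()
        receiver.close()


if __name__ == "__main__":
    print(read_one_line(b"hello\n", close_sender=False))
    print(read_one_line(b"no newline", close_sender=True))
```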
What is connectedProtocol? Nothing in your sample uses it, but it sits there evoking images of nasty thread-related bugs in your actual application.
Apart from that, writeSomeData is the wrong method to call. Try write instead.
I am working on a daemon where I need to embed an HTTP server. I am attempting to do it with BaseHTTPServer. When I run it in the foreground, it works fine, but when I try to fork the daemon into the background, it stops working. My main application continues to work, but BaseHTTPServer does not.
I believe this has something to do with the fact that BaseHTTPServer sends log data to STDOUT and STDERR. I am redirecting those to files. Here is the code snippet:
# Start the HTTP Server
server = HTTPServer((config['HTTPServer']['listen'],config['HTTPServer']['port']),HTTPHandler)

# Fork our process to detach if not told to stay in foreground
if options.foreground is False:
    try:
        pid = os.fork()
        if pid > 0:
            logging.info('Parent process ending.')
            sys.exit(0)
    except OSError, e:
        sys.stderr.write("Could not fork: %d (%s)\n" % (e.errno, e.strerror))
        sys.exit(1)

    # Second fork to put into daemon mode
    try:
        pid = os.fork()
        if pid > 0:
            # exit from second parent, print eventual PID before
            print 'Daemon has started - PID # %d.' % pid
            logging.info('Child forked as PID # %d' % pid)
            sys.exit(0)
    except OSError, e:
        sys.stderr.write("Could not fork: %d (%s)\n" % (e.errno, e.strerror))
        sys.exit(1)

    logging.debug('After child fork')

    # Detach from parent environment
    os.chdir('/')
    os.setsid()
    os.umask(0)

    # Close stdin
    sys.stdin.close()

    # Redirect stdout, stderr
    sys.stdout = open('http_access.log', 'w')
    sys.stderr = open('http_errors.log', 'w')

# Main Thread Object for Stats
threads = []
logging.debug('Kicking off threads')
while ...
lots of code here
...
server.serve_forever()
Am I doing something wrong here or is BaseHTTPServer somehow prevented from becoming daemonized?
Edit: Updated code to demonstrate the additional, previously missing code flow and that log.debug shows in my forked, background daemon I am hitting code after fork.
After a bit of googling I finally stumbled over this BaseHTTPServer documentation, and after that I ended up with:
from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer
from SocketServer import ThreadingMixIn

class ThreadedHTTPServer(ThreadingMixIn, HTTPServer):
    """Handle requests in a separate thread."""

server = ThreadedHTTPServer((config['HTTPServer']['listen'],config['HTTPServer']['port']), HTTPHandler)
server.serve_forever()
This code, for the most part, runs after I fork, and it ended up resolving my problem.
Here's how to do this with the python-daemon library:
from BaseHTTPServer import (HTTPServer, BaseHTTPRequestHandler)
import contextlib
import daemon
from my_app_config import config

# Make the HTTP Server instance.
server = HTTPServer(
    (config['HTTPServer']['listen'], config['HTTPServer']['port']),
    BaseHTTPRequestHandler)

# Make the context manager for becoming a daemon process.
daemon_context = daemon.DaemonContext()
daemon_context.files_preserve = [server.fileno()]

# Become a daemon process.
with daemon_context:
    server.serve_forever()
As usual for a daemon, you need to decide how you will interact with the program after it becomes a daemon. For example, you might register a systemd service, or write a PID file, etc. That's all outside the scope of the question though.
In particular, it's outside the scope of the question to ask: once it's become a daemon process (necessarily detached from any controlling terminal), how do I stop the daemon process? That's up to you to decide, as part of defining the program's behaviour.
You start by instantiating an HTTPServer, but none of the supplied code actually tells it to start serving. In your child process, try calling server.serve_forever().
See this for reference
A simple solution that worked for me was to override the BaseHTTPRequestHandler method log_message(), which prevents any writing to the standard streams and avoids problems when daemonizing.
class CustomRequestHandler(BaseHTTPServer.BaseHTTPRequestHandler):
    def log_message(self, format, *args):
        pass
    ...
    rest of custom class code
    ...
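Instead of discarding the access lines entirely, the same override can forward them to the logging module. The following is a sketch for Python 3, where BaseHTTPServer has become http.server. The logger name "http.access", the class name LoggingRequestHandler, the helper start_in_background, and the trivial do_GET are all illustrative assumptions.

```python
import logging
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

access_log = logging.getLogger("http.access")  # hypothetical logger name


class LoggingRequestHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Route the access line to logging instead of the default stderr write.
        access_log.info("%s - %s", self.address_string(), format % args)


def start_in_background(host="127.0.0.1", port=0):
    """Serve in a daemon thread; port 0 lets the OS pick a free port."""
    server = HTTPServer((host, port), LoggingRequestHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

With this in place, where the access log ends up is decided by whatever handlers are attached to the logger, not by the state of the process's standard streams.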
Just use daemontools or some other similar script instead of rolling your own daemonizing process. It is much better to keep this off your script.
Also, your best option: don't use BaseHTTPServer. It is really bad. There are many good HTTP servers for Python, e.g. cherrypy or paste. Both include ready-to-use daemonizing scripts.
Since this has solicited answers since I originally posted, I thought that I'd share a little info.
The issue with the output has to do with the fact that the default handler for the logging module uses the StreamHandler. The best way to handle this is to create your own handlers. In the case where you want to use the default logging module, you can do something like this:
# Get the default logger
default_logger = logging.getLogger('')
# Add the handler
default_logger.addHandler(myotherhandler)
# Remove the default stream handler
# (iterate over a copy, since the list is modified inside the loop)
for handler in default_logger.handlers[:]:
    if isinstance(handler, logging.StreamHandler) and handler is not myotherhandler:
        default_logger.removeHandler(handler)
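One caveat with the loop above: logging.FileHandler is a subclass of logging.StreamHandler, so an isinstance() check also matches file handlers. A possible variant, sketched here for Python 3 with a hypothetical helper name, compares the exact type and iterates over a copy of the handler list.

```python
import logging


def replace_stream_handlers(logger, replacement):
    """Attach replacement and drop plain StreamHandlers from logger."""
    logger.addHandler(replacement)
    for handler in list(logger.handlers):  # copy: the list is mutated below
        # Exact type check so FileHandler and other subclasses are kept.
        if type(handler) is logging.StreamHandler and handler is not replacement:
            logger.removeHandler(handler)
    return logger.handlers
```

Usage would look like replace_stream_handlers(logging.getLogger(''), myotherhandler), with myotherhandler being the handler from the snippet above.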
Also at this point I have moved to using the very nice Tornado project for my embedded http servers.