The Python requests library appears to have some rather strange quirks when it comes to its logging behaviour.
Using the latest Python 2.7.8, I have the following code:
import requests
import logging
logging.basicConfig(
    filename='mylog.txt',
    format='%(asctime)-19.19s|%(task)-36s|%(levelname)s:%(name)s: %(message)s',
    level=getattr(logging, 'DEBUG'))  # look the level up by name
logger = logging.getLogger(__name__)
logger.info('myprogram starting up...', extra={'task': ''}) # so far so good
...
(omitted code)
...
payload = {'id': 'abc', 'status': 'ok'}
# At this point the program continues but throws an exception.
requests.get('http://localhost:9100/notify', params=payload)
print 'Task is complete! NotifyURL was hit! - Exiting'
My program seems to exit normally; however, inside the log file it creates (mylog.txt) I always find the following exception:
KeyError: 'task'
Logged from file connectionpool.py, line 362
If I remove this:
requests.get('http://localhost:9100/notify', params=payload)
then the exception is gone.
What exactly am I doing wrong here and how may I fix this?
I am using requests v2.4.3.
The problem is your custom logging format, where you expect a %(task)s field.
Requests (or rather the bundled urllib3) does not include the task parameter when logging, as it has no way of knowing that you expect it.
As indicated in t-8ch's answer, the logger is used internally by the requests library, and this library doesn't know anything about your parameters. A possible solution is to install a custom filter on the library's logger (in this case, one of its modules):
class TaskAddingFilter(logging.Filter):
    def __init__(self):
        logging.Filter.__init__(self)

    def filter(self, record):
        # Supply a default 'task' attribute so the custom format string
        # no longer raises KeyError for records logged by library code.
        if not hasattr(record, 'task'):
            record.task = ''
        return True
# ...
requestsLogger = logging.getLogger('requests.packages.urllib3.connectionpool')
requestsLogger.addFilter(TaskAddingFilter())
Potentially, you need to add such filtering to all loggers from requests, which are:
requests.packages.urllib3.util
requests.packages.urllib3.connectionpool
requests.packages
requests.packages.urllib3
requests.packages.urllib3.util.retry
requests
requests.packages.urllib3.poolmanager
In my version, you can find them using the logging.Logger.manager.loggerDict attribute. So, you could do something like this:
for name in logging.Logger.manager.loggerDict:
    logger = logging.getLogger(name)  # because of lazy initialization
    # also catch the bare 'requests' logger from the list above
    if name == 'requests' or name.startswith('requests.'):
        logger.addFilter(TaskAddingFilter())
The TaskAddingFilter can be a bit smarter of course, e.g. adding a particular task entry depending on where you are in your code. I've added only the simplest solution for the code you've provided - the exception doesn't occur anymore - but the wide range of possibilities seems obvious ;)
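For instance, here is a hedged sketch of one such smarter filter; the constructor argument and the 'network' label are hypothetical examples, not part of the original code:
class ConfigurableTaskFilter(logging.Filter):
    def __init__(self, task=''):
        logging.Filter.__init__(self)
        self.task = task  # hypothetical per-logger task label

    def filter(self, record):
        # Only fill in 'task' when the caller didn't supply one via extra={...}
        if not hasattr(record, 'task'):
            record.task = self.task
        return True

# e.g. tag all urllib3 records with a hypothetical 'network' task label
requestsLogger.addFilter(ConfigurableTaskFilter(task='network'))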
Related
How do I access the execution context in my program to capture a screenshot?
The following program fails because the text it checks for does not exist on the page.
from ExtendedSelenium2Library import ExtendedSelenium2Library
import logging

logger = logging.getLogger(__name__)

class firsttest():
    def googleit(self):
        self.use_url = 'https://google.ca'
        self.use_browser = 'chrome'
        s2l = ExtendedSelenium2Library()
        s2l.open_browser(self.use_url, self.use_browser)
        s2l.maximize_browser_window()
        try:
            # Should fail
            s2l.page_should_contain('this text does not exist on page')
        except:
            logger.debug('failed')

runit = firsttest()
runit.googleit()
When I run this program, I get this warning:
WARNING - Keyword 'Capture Page Screenshot' could not be run on failure: Cannot access execution context
You have to use robot to execute the test; you can't just instantiate the classes and expect them to work. They are designed to work only when run by robot.
If you need to write tests in python, there's no need to use ExtendedSelenium2Library; you can just call the selenium API directly from python.
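As a rough sketch (assuming the plain selenium bindings and a chromedriver install, neither of which appears in the original post), the posted test could be written without Robot Framework like this:
from selenium import webdriver

driver = webdriver.Chrome()
try:
    driver.get('https://google.ca')
    driver.maximize_window()
    # Plain-assertion equivalent of page_should_contain; this fails,
    # mirroring the expected failure in the original test
    assert 'this text does not exist on page' in driver.page_source
finally:
    driver.quit()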
The problem likely stems from the fact that you did not write your python Library in the correct format for Robot Framework.
Here is the correct format for writing Python code in Robot Framework:
from robot.libraries.BuiltIn import BuiltIn

class ClickAnElement(object):
    def __init__(self):
        self.selenium_lib = BuiltIn().get_library_instance('ExtendedSelenium2Library')

    def click_an_element(self, locator):
        # Call the keyword on the stored library instance
        self.selenium_lib.click_element(locator)
How this works (I believe) is that in Robot Framework you import this library in your *** Settings *** section with Library ClickAnElement.py. That activates the __init__ function. Then you can call the keywords like you would a keyword from Selenium2Library. So, if I were to re-write your posted code in the correct format, it would look as follows:
from robot.libraries.BuiltIn import BuiltIn
import logging

logger = logging.getLogger(__name__)

class FirstTest():
    def __init__(self):
        self.selenium_lib = BuiltIn().get_library_instance('ExtendedSelenium2Library')

    def google_it(self):
        self.use_url = 'https://google.ca'
        self.use_browser = 'chrome'
        # Use the library instance fetched in __init__ rather than
        # instantiating a fresh ExtendedSelenium2Library
        s2l = self.selenium_lib
        s2l.open_browser(self.use_url, self.use_browser)
        s2l.maximize_browser_window()
        try:
            # Should fail
            s2l.page_should_contain('this text does not exist on page')
        except:
            logger.debug('failed')
Then, my .robot file would look like this:
*** Settings ***
Library    FirstTest

*** Test Cases ***
Test Google It
    Google It
You were writing a Python file to work outside of Robot Framework. If you want it to work inside of Robot Framework, you need to use the correct Library format.
Mind you, I'm only formatting your code, not testing it. I can't, since I don't have your application to test it on.
Take this simple piece of code:
import urllib2
import requests
from PyQt4 import QtCore
import multiprocessing
import time

data = (
    ['a', '2'],
)

def mp_worker((inputs, the_time)):
    r = requests.get('http://www.gpsbasecamp.com/national-parks')
    request = urllib2.Request("http://www.gpsbasecamp.com/national-parks")
    response = urllib2.urlopen(request)

def mp_handler():
    p = multiprocessing.Pool(2)
    p.map(mp_worker, data)

if __name__ == '__main__':
    mp_handler()
Basically, if I import PyQt4 and I make a urllib request (I believe urllib is used by almost all web-extraction libraries such as BeautifulSoup, Requests, or PyQuery), it crashes with a cryptic log on my Mac.
This is exactly right. It always fails on Mac; I have wasted days trying to fix this, and honestly there is no fix as of now. The best way is to use a Thread instead of a Process, and it will work like a charm.
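A minimal sketch of that workaround, reusing the URL and data from the question: multiprocessing.dummy exposes the same Pool API but backs it with threads, so no fork ever happens:
from multiprocessing.dummy import Pool  # same API as multiprocessing.Pool, thread-backed
import requests

data = (
    ['a', '2'],
)

def mp_worker(args):
    # Runs in a thread, so the fork-related PyQt4/_scproxy crash is avoided
    return requests.get('http://www.gpsbasecamp.com/national-parks')

if __name__ == '__main__':
    p = Pool(2)
    p.map(mp_worker, data)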
By the way -
r = requests.get('http://www.gpsbasecamp.com/national-parks')
and
request = urllib2.Request("http://www.gpsbasecamp.com/national-parks")
response = urllib2.urlopen(request)
do one and the same thing. Why are you doing it twice?
This may be due to _scproxy.get_proxies() not being fork-safe on Mac.
This is raised here: https://bugs.python.org/issue33725#msg329926
_scproxy has been known to be problematic for some time, see for instance Issue31818. That issue also gives a simple workaround: setting urllib's "no_proxy" environment variable to "*" will prevent the calls to the System Configuration framework.
This is something that urllib may be attempting to do, causing the failure under multiprocessing.
There is a workaround: set the environment variable no_proxy to *.
E.g. export no_proxy=*
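If you would rather not depend on the shell, a minimal sketch is to set the variable from Python itself, before any worker processes are spawned:
import os
os.environ['no_proxy'] = '*'  # must be set before multiprocessing forks

import multiprocessing
# ... build the Pool and map the workers as before ...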
The native Python logger used by our Flask app seems to stop writing to the log after an exception occurs. The last entry logged before each stoppage is a message describing the exception. Typically the next message is one written by code in after_request, but for cases where the logger stops, the after_request message is never written out.
Any idea what could be causing this?
Note: I originally posted this question on Serverfault (https://serverfault.com/questions/655683/python-logger-stops-logging) thinking it was an infrastructure issue. But now that we have narrowed the issue down to it occurring after an exception, this issue may be better suited for Stack Overflow.
Update [12/22/2015]:
Logger instantiation:
import logging
from logging.handlers import SysLogHandler
# Config and log_formatter are defined elsewhere in the app

logging.addLevelName(Config.LOG_AUDIT_LEVEL_NUM, Config.LOG_AUDIT_LEVEL_NAME)
logger = logging.getLogger(Config.LOGGER_NAME)
logger.setLevel(Config.LOG_LEVEL)

handler = SysLogHandler(address='/dev/log', facility=SysLogHandler.LOG_LOCAL3)
handler.setLevel(Config.LOG_LEVEL)

formatter = log_formatter()
handler.setFormatter(formatter)
logger.addHandler(handler)
log_formatter:
import datetime
import json
import logging
import socket
import traceback as tb

def _default_json_default(obj):
    # Reconstructed from the docstring below: coerce unknown types to a string.
    return str(obj)

class log_formatter(logging.Formatter):
    def __init__(self,
                 fmt=None,
                 datefmt=None,
                 json_cls=None,
                 json_default=_default_json_default):
        """
        :param fmt: Config as a JSON string, allowed fields;
            extra: provide extra fields always present in logs
            source_host: override source host name
        :param datefmt: Date format to use (required by the logging.Formatter
            interface but not used)
        :param json_cls: JSON encoder to forward to json.dumps
        :param json_default: Default JSON representation for unknown types,
            by default coerce everything to a string
        """
        if fmt is not None:
            self._fmt = json.loads(fmt)
        else:
            self._fmt = {}
        self.json_default = json_default
        self.json_cls = json_cls
        if 'extra' not in self._fmt:
            self.defaults = {}
        else:
            self.defaults = self._fmt['extra']
        try:
            self.source_host = socket.gethostname()
        except:
            self.source_host = ""

    def format(self, record):
        """
        Format a log record to JSON; if the message is a dict,
        assume an empty message and use the dict as additional
        fields.
        """
        fields = record.__dict__.copy()
        aux_fields = [
            'relativeCreated', 'process', 'args', 'module', 'funcName', 'name',
            'thread', 'created', 'threadName', 'msecs', 'filename', 'levelno',
            'processName', 'pathname', 'lineno', 'levelname'
        ]
        for k in aux_fields:
            del fields[k]
        if isinstance(record.msg, dict):
            fields.update(record.msg)
            fields.pop('msg')
            msg = ""
        else:
            msg = record.getMessage()
        if 'msg' in fields:
            fields.pop('msg')
        if 'exc_info' in fields:
            if fields['exc_info']:
                fields['exception'] = tb.format_exception(*fields['exc_info'])
            fields.pop('exc_info')
        if 'exc_text' in fields and not fields['exc_text']:
            fields.pop('exc_text')
        logr = self.defaults.copy()
        # Use update() here; reassigning logr would silently discard the
        # defaults copied above.
        logr.update({
            'timestamp': datetime.datetime.utcnow().strftime('%Y-%m-%dT%H:%M:%S.%fZ'),
            'host': self.source_host,
        })
        logr.update(self._build_fields(logr, fields))
        if msg:
            logr['message'] = msg
        return json.dumps(logr, default=self.json_default, cls=self.json_cls)

    def _build_fields(self, defaults, fields):
        # Python 2 idiom: concatenating .items() lists merges the two dicts
        return dict(defaults.get('fields', {}).items() + fields.items())
Update [01/03/2015]:
Answering questions posted:
Is the application still working after the exception?
Yes, the application continues to run.
Which type of exception is raised and what's the cause of it?
Internal/custom exceptions. The logger has stopped after several different types of exceptions.
Are you using threads in your application?
Yes, the app is threaded by gunicorn.
How is the logging library used?
We are using default FileHandler, SysLogHandler and a custom formatter (outputs JSON)
Does it log to a file? Does it use log rotation?
Yes, it logs to a file, but no rotation.
In regards to after_request, from the docs:
As of Flask 0.7 this function might not be executed at the end of the
request in case an unhandled exception occurred.
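A minimal sketch (not from the original post) of moving the end-of-request logging into a teardown_request handler, which Flask runs even when an unhandled exception occurs:
from flask import Flask

app = Flask(__name__)

@app.teardown_request
def log_request_end(exc):
    # exc is the unhandled exception, or None if the request succeeded
    app.logger.info('request finished, exception=%r', exc)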
And as for your logging issue, it may be that your debug flag is set to true, which would cause the debugger to kick in and possibly stop the logging.
References:
(http://flask.pocoo.org/docs/0.10/api/#flask.Flask.after_request)
(http://flask.pocoo.org/docs/0.10/errorhandling/#working-with-debuggers)
You didn't provide enough information.
Is the application still working after the exception?
Which type of exception is raised and what's the cause of it?
Are you using threads in your application?
How is the logging library used? Does it log to a file? Does it use log rotation?
Supposing you're using threads in your application, the explanation is that the exception causes the thread to shut down, so you won't see any further activity from that specific thread. You should notice issues with the application as well.
If the application is still working but becomes silent, my guess is that the logging library is not configured properly. As you reported on Serverfault, the issue seemed to appear after adding fluentd which might not play well with the way your application uses the logging library.
So, I'm a complete noob when it comes to this kind of thing, and I need some help. I work in software QA for an ecommerce company, and we started using Sauce Labs for our automated testing. I'm in the process of learning Python but really know next to nothing at this point. I can build a decent test in Selenium IDE, export it to Python/Selenium WebDriver, and run the test. Not an issue. However, how do I set the pass/fail flag on the interface? One of our devs wrote a parallel script so I can run a large number of tests at one time, but in order to do so I need to be able to see at a glance which tests have passed and which ones have failed. Can you help me? Thanks!
Also, any tutorials you are aware of on Selenium Webdriver would be helpful too! Really want to learn this stuff!
I did it this way. First you need to import some things:
# These next imports for reporting Test status to Sauce Labs
import sys
import httplib
import base64
try:
    import json
except ImportError:
    import simplejson as json
Then you need this config
#Config to connect to SauceLabs REST API
config = {"username": "yourusernamehere",
          "access-key": "youraccesskeyhere"}
Then you put your tests. At the end, before your tearDown, you need to include:
# Curl call to SauceLabs API to report Job Result
def set_test_status(self, jobid, passed):
    base64string = base64.encodestring('%s:%s' % (config['username'], config['access-key']))[:-1]
    body_content = json.dumps({"passed": passed})
    connection = httplib.HTTPConnection("saucelabs.com")
    connection.request('PUT', '/rest/v1/%s/jobs/%s' % (config['username'], jobid),
                       body_content,
                       headers={"Authorization": "Basic %s" % base64string})
    result = connection.getresponse()
    return result.status == 200
Then in your tearDown you need to include some kind of if logic. I did it this way (and it works)
def tearDown(self):
    # sys.exc_info() is (None, None, None) if everything is OK; it holds
    # exception details if something went wrong. This reports the outcome
    # to Sauce Labs, where True = passed and False = failed.
    if sys.exc_info() == (None, None, None):
        self.set_test_status(self.driver.session_id, True)
    else:
        self.set_test_status(self.driver.session_id, False)
    self.driver.quit()
    self.assertEqual([], self.verificationErrors)
That did the trick for me
You can use the Sauce Labs REST API to mark your test passed/failed. You can find some example code given here
If I want the access log for Cherrypy to only get to a fixed size, how would I go about using rotating log files?
I've already tried http://www.cherrypy.org/wiki/Logging, which seems out of date, or has information missing.
Look at http://docs.python.org/library/logging.html.
You probably want to configure a RotatingFileHandler
http://docs.python.org/library/logging.html#rotatingfilehandler
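A minimal, generic sketch (the file name and limits are example values) that caps each log file at 10 MB and keeps five rotated backups:
import logging
from logging.handlers import RotatingFileHandler

handler = RotatingFileHandler('access.log', maxBytes=10 * 1024 * 1024,
                              backupCount=5)
logging.getLogger('myapp').addHandler(handler)  # 'myapp' is a placeholder name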
I've already tried http://www.cherrypy.org/wiki/Logging, which seems
out of date, or has information missing.
Try adding:
import logging
import logging.handlers
import cherrypy # you might have imported this already
and instead of
log = app.log
maybe try
log = cherrypy.log
The CherryPy documentation of the custom log handlers shows this very example.
Here is the slightly modified version that I use on my app:
import cherrypy
import logging
from logging import handlers

def setup_logging():
    log = cherrypy.log
    # Remove the default FileHandlers if present.
    log.error_file = ""
    log.access_file = ""
    maxBytes = getattr(log, "rot_maxBytes", 10000000)
    backupCount = getattr(log, "rot_backupCount", 1000)
    # Make a new RotatingFileHandler for the error log.
    fname = getattr(log, "rot_error_file", "log\\error.log")
    h = handlers.RotatingFileHandler(fname, 'a', maxBytes, backupCount)
    h.setLevel(logging.DEBUG)
    h.setFormatter(cherrypy._cplogging.logfmt)
    log.error_log.addHandler(h)
    # Make a new RotatingFileHandler for the access log.
    fname = getattr(log, "rot_access_file", "log\\access.log")
    h = handlers.RotatingFileHandler(fname, 'a', maxBytes, backupCount)
    h.setLevel(logging.DEBUG)
    h.setFormatter(cherrypy._cplogging.logfmt)
    log.access_log.addHandler(h)

setup_logging()
CherryPy does its logging using the standard Python logging module. You will need to change it to use a RotatingFileHandler. This handler will take care of everything for you, including rotating the log when it reaches a set maximum size.