Scrapy: logging to a file without ScrapyFileLogObserver() - python

Apparently, I shouldn't be using ScrapyFileLogObserver anymore (http://doc.scrapy.org/en/1.0/topics/logging.html). But I still want to be able to save my log messages to a file, and I still want all the standard Scrapy console information to be saved to the file too.
From reading up on how to use the logging module, this is the code that I have tried to use:
class BlahSpider(CrawlSpider):
    name = 'blah'
    allowed_domains = ['blah.com']
    start_urls = ['https://www.blah.com/blahblahblah']
    rules = (
        Rule(SgmlLinkExtractor(allow=r'whatever'), callback='parse_item', follow=True),
    )

    def __init__(self):
        CrawlSpider.__init__(self)
        self.logger = logging.getLogger()
        self.logger.setLevel(logging.DEBUG)
        logging.basicConfig(filename='debug_log.txt', filemode='w', format='%(asctime)s %(levelname)s: %(message)s',
                            level=logging.DEBUG)
        console = logging.StreamHandler()
        console.setLevel(logging.DEBUG)
        simple_format = logging.Formatter('%(levelname)s: %(message)s')
        console.setFormatter(simple_format)
        self.logger.addHandler(console)
        self.logger.info("Something")

    def parse_item(self):
        i = BlahItem()
        return i
It runs fine, and it saves the "Something" to the file. However, all of the stuff that I see in the command prompt window, all of the stuff that used to be saved to the file when I used ScrapyFileLogObserver, is not saved now.
I thought that my "console" handler with "logging.StreamHandler()" was supposed to deal with that, but this is just what I had read and I don't really understand how it works.
Can anyone point out what I am missing or where I have gone wrong?
Thank you.

I think the problem is that you've used both basicConfig and addHandler.
Configure two handlers separately:
self.logger = logging.getLogger()
self.logger.setLevel(logging.DEBUG)
logFormatter = logging.Formatter('%(asctime)s %(levelname)s: %(message)s')
# file handler
fileHandler = logging.FileHandler("debug_log.txt")
fileHandler.setLevel(logging.DEBUG)
fileHandler.setFormatter(logFormatter)
self.logger.addHandler(fileHandler)
# console handler
consoleHandler = logging.StreamHandler()
consoleHandler.setLevel(logging.DEBUG)
consoleHandler.setFormatter(logFormatter)
self.logger.addHandler(consoleHandler)
See also:
logger configuration to log to file and print to stdout
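As a minimal sketch of why handlers on the root logger also capture library output: records emitted by named loggers propagate up to the root logger's handlers by default. The logger name scrapy.core.engine below only stands in for a library logger (Scrapy itself is not imported), and the temporary file path is an assumption made for the demo.

```python
import logging
import os
import tempfile

# Hypothetical log location for this demo; use your own path in a real project.
log_path = os.path.join(tempfile.mkdtemp(), "debug_log.txt")

root = logging.getLogger()
root.setLevel(logging.DEBUG)

formatter = logging.Formatter("%(asctime)s %(name)s %(levelname)s: %(message)s")

# File handler: everything that reaches the root logger is written to the file.
file_handler = logging.FileHandler(log_path, mode="w")
file_handler.setFormatter(formatter)
root.addHandler(file_handler)

# Console handler: the same records are also shown on stderr.
console_handler = logging.StreamHandler()
console_handler.setFormatter(formatter)
root.addHandler(console_handler)

# A named logger with no handlers of its own, standing in for a library logger.
library_logger = logging.getLogger("scrapy.core.engine")
library_logger.info("Spider opened")

# Read the file back to confirm the propagated record arrived.
file_handler.flush()
with open(log_path) as f:
    file_contents = f.read()
```

Because the named logger has no handlers, the record only reaches the file through propagation to the root logger.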

You can send all Scrapy logs to a file by first disabling the root handler via scrapy.utils.log.configure_logging and then adding your own log handler.
In the settings.py file of your Scrapy project, add the following code:
import logging
from logging.handlers import RotatingFileHandler
from scrapy.utils.log import configure_logging
LOG_ENABLED = False
# Disable default Scrapy log settings.
configure_logging(install_root_handler=False)
# Define your logging settings.
log_file = '/tmp/logs/CRAWLER_logs.log'
root_logger = logging.getLogger()
root_logger.setLevel(logging.DEBUG)
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
rotating_file_log = RotatingFileHandler(log_file, maxBytes=10485760, backupCount=1)
rotating_file_log.setLevel(logging.DEBUG)
rotating_file_log.setFormatter(formatter)
root_logger.addHandler(rotating_file_log)
You can also customize the log level (for example, DEBUG to INFO) and the formatter as required.
Hope this helps!
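To illustrate what the maxBytes and backupCount arguments of RotatingFileHandler do, here is a small standard-library sketch. It does not involve Scrapy; the logger name rotation_demo, the temporary directory, and the deliberately tiny maxBytes value are assumptions chosen so that rollover happens quickly.

```python
import logging
import os
import tempfile
from logging.handlers import RotatingFileHandler

log_dir = tempfile.mkdtemp()  # hypothetical directory for the demo
log_file = os.path.join(log_dir, "crawler.log")

demo_logger = logging.getLogger("rotation_demo")
demo_logger.setLevel(logging.DEBUG)
demo_logger.propagate = False  # keep the demo output away from root handlers

# Tiny maxBytes so the file rolls over after a handful of messages.
handler = RotatingFileHandler(log_file, maxBytes=200, backupCount=1)
handler.setFormatter(logging.Formatter("%(levelname)s - %(message)s"))
demo_logger.addHandler(handler)

for i in range(50):
    demo_logger.info("message number %d", i)

handler.close()

# With backupCount=1 only the current file and one backup are kept.
files = sorted(os.listdir(log_dir))
sizes = [os.path.getsize(os.path.join(log_dir, name)) for name in files]
```

With backupCount=1, older backups are discarded on each rollover, so the directory holds at most two files regardless of how many messages are logged.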

Related

How to disable python logging stdout and have it just inside the log file

I want to stop my script from printing the INFO, DEBUG, WARNING or ERROR messages from logging calls and have them only in the .log or .err file. Here is the configuration I have:
def log(routine_name):
    """ Logging configuration """
    logger = logging.getLogger(__name__)
    logger.setLevel(logging.INFO)
    logging.StreamHandler(stream=None)
    formatter = logging.Formatter("%(asctime)s:%(levelname)s:%(name)s:%(message)s")
    file_handler = logging.FileHandler(routine_name)
    file_handler.setFormatter(formatter)
    error_handler = logging.FileHandler(routine_name.replace("log", "err"))
    error_handler.setFormatter(formatter)
    error_handler.setLevel(logging.ERROR)
    stream_handler = logging.StreamHandler()
    stream_handler.setFormatter(formatter)
    logger.addHandler(file_handler)
    logger.addHandler(stream_handler)
    logger.addHandler(error_handler)
    return logger
I tried setting logger.propagate = False as suggested in this post, but it didn't work; it still prints everything.
Thanks
This worked for me, with module.submodule being the logger that issues the logging calls:
logger = logging.getLogger("module.submodule")
fh = logging.FileHandler("logfile")
fh.setLevel(logging.DEBUG)
logger.addHandler(fh)
logger.disabled = True
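A different approach, sketched here under assumptions rather than taken from the answers: attach only file handlers to the logger, leave out the StreamHandler, and set propagate to False so records are not handed to any console handler on the root logger. The function name make_file_only_logger, the logger name, and the temporary file paths are hypothetical.

```python
import contextlib
import io
import logging
import os
import tempfile


def make_file_only_logger(name, log_path, err_path):
    """Return a logger that writes to files only (no console output)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # do not pass records to the root logger's handlers

    formatter = logging.Formatter("%(asctime)s:%(levelname)s:%(name)s:%(message)s")

    file_handler = logging.FileHandler(log_path)
    file_handler.setFormatter(formatter)

    error_handler = logging.FileHandler(err_path)
    error_handler.setFormatter(formatter)
    error_handler.setLevel(logging.ERROR)  # only ERROR and above go to the .err file

    # No StreamHandler is added, so nothing is printed.
    logger.addHandler(file_handler)
    logger.addHandler(error_handler)
    return logger


demo_dir = tempfile.mkdtemp()  # hypothetical location for the demo files
log_path = os.path.join(demo_dir, "routine.log")
err_path = os.path.join(demo_dir, "routine.err")

# Capture stderr to confirm that logging prints nothing.
captured = io.StringIO()
with contextlib.redirect_stderr(captured):
    logger = make_file_only_logger("file_only_demo", log_path, err_path)
    logger.info("started")
    logger.error("something failed")

with open(log_path) as f:
    log_text = f.read()
with open(err_path) as f:
    err_text = f.read()
```

Unlike setting logger.disabled = True, this keeps the logger active, so the files still receive the messages.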

Location of Python log file should be changed

I am using a logger in my Python 2.7 project, on legacy code. I want to create the logs in a specific location, but the Python logging module creates the log files in the default place, i.e. the directory the script is executed from.
Is there any way to change this default location?
Below is the initialization of the logger.
logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)
formatter = logging.Formatter('%(message)s')
file_handler = logging.FileHandler('file.log')
file_handler.setFormatter(formatter)
logger.addHandler(file_handler)
Below is an example of how you can set the location and other properties of Python logger:
You can define a get_logger function as follows:
import logging
import os

LOG_DIR = 'log_dir'
LOG_FORMATTER = logging.Formatter('[%(asctime)s] %(levelname)s %(name)s: %(message)s')

def get_logger(log_name, log_dir = LOG_DIR):
    if not os.path.isdir(log_dir):
        os.mkdir(log_dir)
    logger = logging.getLogger(log_name)
    logging.basicConfig(level = logging.INFO)
    log_handler = logging.FileHandler(os.path.join(log_dir, log_name))
    logger.addHandler(log_handler)
    log_handler.setFormatter(LOG_FORMATTER)
    log_handler.setLevel('INFO')
    return logger
Then in the file that you want to make logs, you can do as follows:
logger = get_logger('filename')
If you want to make a logging message, you can then do as follows:
logger.info('logging information!')
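The log location is determined by the path passed to logging.FileHandler, so a nested or absolute directory can be used as well. The variant below is a sketch under assumptions: it creates missing parent directories, appends a .log suffix, and skips adding a second handler when get_logger is called twice for the same file. The target directory and the logger name location_demo are hypothetical.

```python
import logging
import os
import tempfile

LOG_FORMATTER = logging.Formatter("[%(asctime)s] %(levelname)s %(name)s: %(message)s")


def get_logger(log_name, log_dir):
    """Return a logger writing to <log_dir>/<log_name>.log, creating log_dir if needed."""
    log_dir = os.path.abspath(log_dir)
    if not os.path.isdir(log_dir):
        os.makedirs(log_dir)  # also creates missing parent directories

    logger = logging.getLogger(log_name)
    logger.setLevel(logging.INFO)

    log_path = os.path.join(log_dir, log_name + ".log")
    # Avoid attaching a second handler for the same file on repeated calls.
    already_attached = any(
        isinstance(h, logging.FileHandler) and h.baseFilename == log_path
        for h in logger.handlers
    )
    if not already_attached:
        handler = logging.FileHandler(log_path)
        handler.setFormatter(LOG_FORMATTER)
        logger.addHandler(handler)
    return logger


# Hypothetical target directory; any writable path works.
target_dir = os.path.join(tempfile.mkdtemp(), "logs", "app")
logger = get_logger("location_demo", target_dir)
logger = get_logger("location_demo", target_dir)  # second call adds no handler
logger.info("logging information!")

log_file = os.path.join(target_dir, "location_demo.log")
with open(log_file) as f:
    lines = f.read().splitlines()
```

Without the duplicate-handler check, each call to get_logger would add another FileHandler and every message would be written once per call.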

Python logger - multiple logger instances with multiple levels - best practice

I have the following requirements:
To have one global logger which you can configure (setup level, additional handlers,..)
To have per module logger which you can configure (setup level, additional handlers,..)
In other words we need more logs with different configuration
Therefore I did the following
create method to setup logger:
def setup_logger(module_name=None, level=logging.INFO, add_stdout_logger=True):
    print("Clear all loggers")
    for _handler in logging.root.handlers:
        logging.root.removeHandler(_handler)
    if add_stdout_logger:
        print("Add stdout logger")
        stdout_handler = logging.StreamHandler(sys.stdout)
        stdout_handler.setLevel(level)
        stdout_handler.setFormatter(logging.Formatter(fmt='%(asctime)-11s [%(levelname)s] [%(name)s] %(message)s'))
        logging.root.addHandler(stdout_handler)
    print("Set root level log")
    logging.root.setLevel(level)
    if module_name:
        return logging.getLogger(module_name)
    else:
        return logging.getLogger('global')
Then I create logger as following:
logger_global = setup_logger(level=logging.DEBUG)
logger_module_1 = setup_logger(module_name='module1', level=logging.INFO)
logger_module_2 = setup_logger(module_name='module2', level=logging.DEBUG)
logger_global.debug("This is global log and will be visible because it is setup to DEBUG log")
logger_module_1.debug("This is logger_module_1 log and will NOT be visible because it is setup to INFO log")
logger_module_2.debug("This is logger_module_2 log and will be visible because it is setup to DEBUG log")
Before I test this more deeply to see what works and what doesn't, I want to ask whether this is good practice, or whether you have any other recommendation for how to achieve our requirements?
Thanks for help
Finally I found how to do it:
import logging
import sys

def setup_logger(module_name=None, level=logging.INFO, add_stdout_logger=True):
    custom_logger = logging.getLogger('global')
    if module_name:
        custom_logger = logging.getLogger(module_name)
    print("Clear all handlers in logger")  # prevent multiple handler creation
    custom_logger.handlers.clear()
    if add_stdout_logger:
        print("Add stdout logger")
        stdout_handler = logging.StreamHandler(sys.stdout)
        stdout_handler.setLevel(level)
        stdout_handler.setFormatter(logging.Formatter(fmt='%(asctime)-11s [%(levelname)s] [%(name)s] %(message)s'))
        custom_logger.addHandler(stdout_handler)
    # here you can add other handlers, ...
    # because we use custom handlers that each have their own log level,
    # our logger has to have the lowest level of logging
    custom_logger.setLevel(logging.DEBUG)
    return custom_logger
Then simply call the following
logger_module_1 = setup_logger(module_name='module1', level=logging.INFO)
logger_module_2 = setup_logger(module_name='module2', level=logging.DEBUG)
logger_module_1.debug("This is logger_module_1 log and will NOT be visible because it is setup to INFO log")
logger_module_2.debug("This is logger_module_2 log and will be visible because it is setup to DEBUG log")
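Another way to meet the same requirements, shown here only as a sketch, relies on the logger hierarchy: a single handler sits on a common parent logger, and each module logger sets nothing but its own level. The parent name app and the ListHandler class (which collects messages in a list instead of printing them) are assumptions made for the demo.

```python
import logging


class ListHandler(logging.Handler):
    """Collect formatted records in a list so the result can be inspected."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(self.format(record))


# One handler on a common parent logger; child loggers reach it via propagation.
app_logger = logging.getLogger("app")  # hypothetical top-level name
app_logger.setLevel(logging.DEBUG)
app_logger.propagate = False  # stop here instead of continuing to the root logger
collector = ListHandler()
collector.setFormatter(logging.Formatter("[%(levelname)s] [%(name)s] %(message)s"))
app_logger.addHandler(collector)

# Per-module loggers only set their own level.
logger_module_1 = logging.getLogger("app.module1")
logger_module_1.setLevel(logging.INFO)
logger_module_2 = logging.getLogger("app.module2")
logger_module_2.setLevel(logging.DEBUG)

logger_module_1.debug("hidden: module1 is at INFO")
logger_module_1.info("shown: module1 info")
logger_module_2.debug("shown: module2 debug")
```

The level check happens on the logger that receives the call, so module1 drops its DEBUG record before it ever reaches the shared handler.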

Logging to two files with different settings

I am already using a basic logging config where all messages across all modules are stored in a single file. However, I need a more complex solution now:
Two files: the first remains the same.
The second file should have some custom format.
I have been reading the docs for the module, but they are very complex for me at the moment. Loggers, handlers...
So, in short:
How can I log to two files in Python 3, i.e.:
import logging
# ...
logging.file1.info('Write this to file 1')
logging.file2.info('Write this to file 2')
You can do something like this:
import logging

formatter = logging.Formatter('%(asctime)s %(levelname)s %(message)s')

def setup_logger(name, log_file, level=logging.INFO):
    """To setup as many loggers as you want"""
    handler = logging.FileHandler(log_file)
    handler.setFormatter(formatter)
    logger = logging.getLogger(name)
    logger.setLevel(level)
    logger.addHandler(handler)
    return logger

# first file logger
logger = setup_logger('first_logger', 'first_logfile.log')
logger.info('This is just info message')

# second file logger
super_logger = setup_logger('second_logger', 'second_logfile.log')
super_logger.error('This is an error message')

def another_method():
    # using logger defined above also works here
    logger.info('Inside method')
def setup_logger(logger_name, log_file, level=logging.INFO):
    l = logging.getLogger(logger_name)
    formatter = logging.Formatter('%(message)s')
    fileHandler = logging.FileHandler(log_file, mode='w')
    fileHandler.setFormatter(formatter)
    streamHandler = logging.StreamHandler()
    streamHandler.setFormatter(formatter)
    l.setLevel(level)
    l.addHandler(fileHandler)
    l.addHandler(streamHandler)

# txtName is assumed to be defined elsewhere (base name for the output files)
setup_logger('log1', txtName+"txt")
setup_logger('log2', txtName+"small.txt")
logger_1 = logging.getLogger('log1')
logger_2 = logging.getLogger('log2')
logger_1.info('111messasage 1')
logger_2.info('222ersaror foo')
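Since the question asks for a custom format in the second file, the same pattern can take the format string as a parameter. This is a sketch under assumptions: the logger names file1 and file2, the format strings, and the temporary output directory are all illustrative.

```python
import logging
import os
import tempfile


def setup_logger(name, log_file, fmt, level=logging.INFO):
    """Create a named logger that writes to its own file with its own format."""
    handler = logging.FileHandler(log_file, mode="w")
    handler.setFormatter(logging.Formatter(fmt))
    logger = logging.getLogger(name)
    logger.setLevel(level)
    logger.propagate = False  # keep each logger's records in its own file only
    logger.addHandler(handler)
    return logger


out_dir = tempfile.mkdtemp()  # hypothetical output directory
first_path = os.path.join(out_dir, "first_logfile.log")
second_path = os.path.join(out_dir, "second_logfile.log")

file1 = setup_logger("file1", first_path, "%(levelname)s %(message)s")
file2 = setup_logger("file2", second_path, "custom | %(name)s | %(message)s")

file1.info("Write this to file 1")
file2.info("Write this to file 2")

with open(first_path) as f:
    first_text = f.read()
with open(second_path) as f:
    second_text = f.read()
```

Each logger owns one handler with one formatter, so the two files never share content or format.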

Python logging.get_Logger(name) with FileHandler does not write to file

If I get the logger with a name and add a FileHandler it does not write to the file.
This works and writes correctly to the file:
log = logging.getLogger()
fh = logging.FileHandler(logfile)
log.addHandler(fh)
fh_fmt = logging.Formatter("%(asctime)s (%(levelname)s)\t: %(message)s")
fh.setFormatter(fh_fmt)
log.setLevel(logging.INFO)
This does not write to the file:
log = logging.getLogger(name)
fh = logging.FileHandler(logfile)
log.addHandler(fh)
fh_fmt = logging.Formatter("%(asctime)s (%(levelname)s)\t: %(message)s")
fh.setFormatter(fh_fmt)
log.setLevel(logging.INFO)
The only difference is that I get a 'named' logger.
This is a rather old question, but I believe I found the underlying problem and solution, at least with a newer version of Python.
The second code example starts with log = logging.getLogger(name), where name is presumed to be a string representing the name of the logger. Since a name is provided, this log will not be the root logger. According to Logger.setLevel(level) docs for Python 3.6+,
When a logger is created, the level is set to NOTSET (which causes all messages to be processed when the logger is the root logger, or delegation to the parent when the logger is a non-root logger).
This tells us that we have to set the level of our named logger ourselves. Otherwise it defers to the root logger's level, which is WARNING by default, so INFO messages are discarded before they reach the file handler.
This is a code example I wrote (in Python 3.7) that does not work:
from pathlib import Path
import logging
formatter = logging.Formatter('%(name)s [%(levelname)s] %(message)s')
log_file_dir = Path('./log/')
config_file = 'config_file.txt'
config_file_path = log_file_dir / config_file
logger = logging.getLogger('example_logger')
fh = logging.FileHandler(config_file_path, mode='w')
fh.setLevel(logging.INFO)
fh.setFormatter(formatter)
logger.addHandler(fh)
logger.info('Start Configuration Log')
And this one works by adding one line:
from pathlib import Path
import logging
formatter = logging.Formatter('%(name)s [%(levelname)s] %(message)s')
log_file_dir = Path('./log/')
config_file = 'config_file_2.txt'
config_file_path = log_file_dir / config_file
logger = logging.getLogger('example_logger')
logger.setLevel(logging.INFO) # <------ Or the applicable level for your use-case
fh = logging.FileHandler(config_file_path, mode='w')
fh.setLevel(logging.INFO)
fh.setFormatter(formatter)
logger.addHandler(fh)
logger.info('Start Configuration Log')
Note: The first code example does create the chosen log file, but does not write 'Start Configuration Log' to the file.
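The delegation described in the quoted documentation can be observed directly with Logger.getEffectiveLevel() and Logger.isEnabledFor(). In this small sketch the logger name effective_level_demo is hypothetical, and the root level is set to WARNING explicitly (its default) so the result does not depend on earlier configuration.

```python
import logging

# WARNING is the root logger's default level; set it explicitly for the demo.
logging.getLogger().setLevel(logging.WARNING)

named = logging.getLogger("effective_level_demo")

# A freshly created named logger has level NOTSET and defers to its parent.
own_level_before = named.level
effective_before = named.getEffectiveLevel()
info_enabled_before = named.isEnabledFor(logging.INFO)

# After setting a level on the named logger, INFO records are processed.
named.setLevel(logging.INFO)
effective_after = named.getEffectiveLevel()
info_enabled_after = named.isEnabledFor(logging.INFO)
```

Before the setLevel call, logger.info(...) returns without creating a record, which is why the file is created by the FileHandler but stays empty.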
