Note: This is an old post and will be updated for Python 3 soon.
Introduction
One of the reasons I got a little curious about the logging module was thanks to pylint. While running pylint on some code, I saw the following warning message at quite a few places -
logging-not-lazy (W1201): *Specify string format arguments as logging function parameters*
My code had a few instances equivalent to the following -
log.info("Log Something val %d" % (val))
whereas the message was saying I should probably use something like
log.info("Log Something val %d", val)
Now, the natural question is: why should that matter? The answer is that the former string substitution is evaluated regardless of whether the particular log level is enabled, that is, regardless of whether this log record will be emitted or not, while with the latter the formatting is performed only when the log record is actually emitted.
The suggested alternative of using string.format is not an option here either, for the same reason, and you would see a similar warning -
logging-format-interpolation (W1202): *Use % formatting in logging functions and pass the % parameters as arguments*
This prompted me to take a closer look at the documentation and figure out some of the subtleties. We discuss the findings below in a little more detail.
A Bit More About These Warnings
As we have seen in the previous section, it is important to pass parameters as arguments to the logging function rather than using '%' substitution or string.format. Let's take a look at this in a little more detail by profiling the code with a context-manager based profiler. Here's the relevant code and the corresponding details -
import logging
import time

import profiler

iterations = 1000000

l = logging.getLogger(__name__)
l.addHandler(logging.NullHandler())
l.setLevel(logging.INFO)

class Foo(object):
    def __str__(self):
        return "%s" % self.__class__

with profiler.Profiler(enabled=True, contextstr="non lazy logging") as p:
    for i in range(iterations):
        l.debug("Hello There. %s" % Foo())
p.print_profile_data()

with profiler.Profiler(enabled=True, contextstr="lazy logging") as p:
    for i in range(iterations):
        l.debug("Hello There. %s", Foo())
p.print_profile_data()
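The profiler module imported above is a small helper of mine and its implementation isn't shown in this post. For completeness, here is a minimal sketch of what such a context-manager wrapper around cProfile might look like; the Profiler interface (enabled, contextstr, print_profile_data) is assumed purely from its usage above.

# Hypothetical sketch of the profiler helper used above (the actual
# implementation is not included in the post); wraps cProfile in a
# context manager and prints cumulative stats on demand.
import cProfile
import pstats
import sys

class Profiler(object):
    def __init__(self, enabled=False, contextstr=""):
        self.enabled = enabled
        self.contextstr = contextstr
        self.profile = cProfile.Profile()

    def __enter__(self):
        if self.enabled:
            print("profile: %s: enter" % self.contextstr)
            self.profile.enable()
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        if self.enabled:
            self.profile.disable()
        return False  # do not suppress exceptions

    def print_profile_data(self):
        if self.enabled:
            stats = pstats.Stats(self.profile, stream=sys.stdout)
            stats.sort_stats('cumulative').print_stats()
            print("profile: %s: exit" % self.contextstr)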
And here is the output of running the above code
profile: non lazy logging: enter
4000003 function calls in 1.058 seconds
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(function)
1000000 0.231 0.000 0.655 0.000 /usr/lib/python2.7/logging/__init__.py:1145(debug)
1000000 0.272 0.000 0.424 0.000 /usr/lib/python2.7/logging/__init__.py:1360(isEnabledFor)
1000000 0.385 0.000 0.385 0.000 lazy_logging.py:13(__str__) <-------------- Check this line
1000000 0.152 0.000 0.152 0.000 /usr/lib/python2.7/logging/__init__.py:1346(getEffectiveLevel)
1 0.018 0.018 0.018 0.018 {range}
1 0.000 0.000 0.000 0.000 /home/gabhijit/backup/personal-code/gabhijit-github-io-blog/sources/misc/python/logging/profiler.py:29(__exit__)
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
profile: non lazy logging: exit
profile: lazy logging: enter
3000003 function calls in 0.591 seconds
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(function)
1000000 0.195 0.000 0.580 0.000 /usr/lib/python2.7/logging/__init__.py:1145(debug)
1000000 0.245 0.000 0.385 0.000 /usr/lib/python2.7/logging/__init__.py:1360(isEnabledFor)
1000000 0.140 0.000 0.140 0.000 /usr/lib/python2.7/logging/__init__.py:1346(getEffectiveLevel)
1 0.010 0.010 0.010 0.010 {range}
1 0.000 0.000 0.000 0.000 /home/gabhijit/backup/personal-code/gabhijit-github-io-blog/sources/misc/python/logging/profiler.py:29(__exit__)
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
profile: lazy logging: exit
So our __str__ for the class Foo was called even though logging.debug is not enabled. This might look like a convoluted example, but it proves an important point: especially when we are logging in a critical path, extra cycles are spent evaluating the string even when the relevant level is not enabled. When the parameters are themselves expensive to evaluate, this can be an unnecessary penalty, so it should be avoided.
Clearly, it's important to understand the logging module in more detail, as it is going to be widely used in any program or library that we develop.
Let's look at the logging architecture in a little more detail through some examples and see what exactly is happening under the hood.
Logging Architecture
The complete logger flow is described in the flow diagram from the excellent logging HOWTO.
We quickly go through the main classes, Logger and Handler, to provide a quick overview; reading the HOWTO linked above in more detail is highly recommended. We are not going to look at some of the other standard classes here a) as they are well covered in the documentation above and b) typically you'd just use what the standard library provides. The intent here is not to reproduce documentation, but to look at gotchas that are worth keeping an eye on.
Logger Class
From the HOWTO above -
Logger objects have a threefold job. First, they expose several methods to application code so that applications can log messages at runtime. Second, logger objects determine which log messages to act upon based upon severity (the default filtering facility) or filter objects. Third, logger objects pass along relevant log messages to all interested log handlers.
So basically, Logger objects are the main entry point into the logging system. A simpler description of the Logger class is that it creates LogRecord objects depending upon the current configuration and handles them. The main function that does this looks like the following (taken from /usr/lib/python2.7/logging/__init__.py) -
def _log(self, level, msg, args, exc_info=None, extra=None):
    """
    Low-level logging routine which creates a LogRecord and then calls
    all the handlers of this logger to handle the record.
    """
    if _srcfile:
        #IronPython doesn't track Python frames, so findCaller raises an
        #exception on some versions of IronPython. We trap it here so that
        #IronPython can use logging.
        try:
            fn, lno, func = self.findCaller()
        except ValueError:
            fn, lno, func = "(unknown file)", 0, "(unknown function)"
    else:
        fn, lno, func = "(unknown file)", 0, "(unknown function)"
    if exc_info:
        if not isinstance(exc_info, tuple):
            exc_info = sys.exc_info()
    record = self.makeRecord(self.name, level, fn, lno, msg, args, exc_info, func, extra)
    self.handle(record)

def handle(self, record):
    """
    Call the handlers for the specified record.

    This method is used for unpickled records received from a socket, as
    well as those created locally. Logger-level filtering is applied.
    """
    if (not self.disabled) and self.filter(record):
        self.callHandlers(record)
A couple of points to note here - the _log function is called after checking the current logging LEVEL of the logger, and the handlers are called only if this logger is enabled and the 'filters' associated with this logger allow the record to pass. Question - why do all the work of creating the record when the logger is disabled? This was actually acknowledged as an issue and is fixed in Python 3.8.
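A practical corollary: if even constructing the arguments is expensive, you can perform the level check yourself before calling the logging method at all. isEnabledFor is the same check that debug() performs internally; in the sketch below, expensive_state_dump is a hypothetical costly helper used just for illustration.

import logging

logger = logging.getLogger(__name__)

def expensive_state_dump():
    # Hypothetical placeholder for some costly computation.
    return "...lots of state..."

# Guard explicitly so that neither the argument nor the record is built
# when DEBUG is disabled.
if logger.isEnabledFor(logging.DEBUG):
    logger.debug("state: %s", expensive_state_dump())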
Handler Class
This class actually 'handles' the LogRecords emitted, typically by calling the Formatter for the LogRecord and then emitting the record. A word about handle here is important - the Handler's handle method calls emit while holding the Handler's lock (see below), so it's important to pay attention to emit and ensure, as far as possible, that it is not blocking. Some handler implementations actually have an emit function that is blocking. From Python 3.2 onwards there is a QueueHandler that implements a non-blocking emit, so it might be a good idea to consider using that, especially if you are logging in a critical path (see the sketch after the code below). Here is the Handler's handle function (taken from /usr/lib/python2.7/logging/__init__.py) -
def handle(self, record):
    """
    Conditionally emit the specified logging record.

    Emission depends on filters which may have been added to the handler.
    Wrap the actual emission of the record with acquisition/release of
    the I/O thread lock. Returns whether the filter passed the record for
    emission.
    """
    rv = self.filter(record)
    if rv:
        self.acquire()
        try:
            self.emit(record)
        finally:
            self.release()
    return rv
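As mentioned above, from Python 3.2 onwards the standard library offers QueueHandler (paired with QueueListener in logging.handlers) to keep emit from blocking the logging thread. A minimal Python 3 sketch of wiring them together:

# Python 3.2+ only: offload blocking emit() calls to a listener thread.
import logging
import logging.handlers
import queue

log_queue = queue.Queue(-1)  # unbounded queue

# QueueHandler.emit() only enqueues the record, so the caller never
# blocks on I/O; QueueListener drains the queue on its own thread and
# dispatches to the real (possibly blocking) handlers.
queue_handler = logging.handlers.QueueHandler(log_queue)
listener = logging.handlers.QueueListener(log_queue, logging.StreamHandler())

logger = logging.getLogger(__name__)
logger.addHandler(queue_handler)
logger.setLevel(logging.INFO)

listener.start()
logger.info("This call only enqueues; I/O happens on the listener thread.")
listener.stop()  # flushes the queue and joins the listener thread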
Creating Loggers
What happens upon import logging
When you do import logging, a number of things happen under the hood. For example, a RootLogger instance is created when the logging module is imported. In fact, all the loggers created in a Python process form a hierarchy, with the RootLogger at the 'root' of the hierarchy. We will look at this in more detail a bit later through a somewhat convoluted example, but it helps us understand what is really happening.
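A quick way to see this: the root logger already exists right after the import, and named loggers slot in beneath it.

import logging

# The RootLogger instance is created as a side effect of the import;
# getLogger() with no arguments returns that same instance.
root = logging.getLogger()
print(root is logging.root)      # True
print(root.name)                 # 'root'

# Named loggers become children of the root logger.
foo = logging.getLogger("foo")
print(foo.parent is root)        # True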
Logging instead of 'print'
If you often use print for debugging and for providing some information to the user, it's often a good idea to simply create a logger and call basicConfig, so that moving to proper logging later on becomes easier and you don't have to manually convert every print call to a logging call. So something like -
import logging
logger = logging.getLogger() # This will return the `RootLogger` above
logging.basicConfig() # output to stderr
logger.setLevel(logging.INFO)
# And then in the code -
logger.info("Doing something...")
Typically, print should be used when you are developing a command-line utility and want to output to stdout. Note that in the snippet above, logging.basicConfig's default configuration sends the output (internally it's a StreamHandler) to stderr, not stdout.
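If you do want a command-line tool's log output on stdout instead, basicConfig accepts a stream argument:

import logging
import sys

# Point the root handler at stdout instead of the default stderr.
logging.basicConfig(stream=sys.stdout, level=logging.INFO)
logging.getLogger().info("this goes to stdout")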
Creating Loggers for Packages and Sub-packages
A Logger can be created by -
logger = logging.getLogger("something")
Usually it's a good idea to pass the __name__ of the current module to the logger. This has an important benefit: the loggers of a package are arranged in a hierarchy. Let's say you have a package structure that looks like foo.bar.baz; then, if inside each __init__.py the logger is created through logging.getLogger(__name__), the logger corresponding to package foo will be the parent of the logger corresponding to package foo.bar, which in turn will be the parent of the logger for foo.bar.baz. Some of the advantages of this are - when the log records are emitted, one knows precisely which module/package the log came from, and one doesn't have to worry about what name to assign to the logger. There's another detail here: each Logger has a property called propagate (default True). If this property is True, then whenever a LogRecord is to be handled, the handlers attached to that particular logger as well as those attached to its 'parent' loggers are called, regardless of the level of the parent logger(s). Thus, what one can do is create a handler at the package level, and all the sub-packages and modules within that package will just use this handler; no separate handler needs to be created and assigned per logger. This serves as a convenience and helps when defining loggers for a library (see below for details, and the short demo that follows).
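Here is a short demo of that behavior; the names foo and foo.bar are just placeholders.

import logging

parent = logging.getLogger("foo")
child = logging.getLogger("foo.bar")

# The dotted names alone establish the parent/child relationship.
print(child.parent is parent)    # True

# A handler attached only to the parent still receives the child's
# records, because propagate is True by default.
parent.addHandler(logging.StreamHandler())
child.setLevel(logging.INFO)
child.info("handled by the handler attached to 'foo'")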
Loggers for libraries
All the discussion above about packages and sub-packages applies here as well. Another thing to keep in mind when working on a library that will potentially be used by someone else: it's a good idea to leave the actual logging configuration to the application that will be using the library. So what we should really do is simply create loggers and not attach any handlers; whatever handlers are required will be attached by the application code that uses the library. To avoid the 'no handlers could be found' warning, it's a good idea to attach a NullHandler to the logger created by the main package.
In summary - the following is a good example of creating loggers for a library you are developing. Let's say you are developing a library that will be available as a package foo with a sub-package bar, and baz inside bar. The following simple logging setup should be good.
# Inside foo/__init__.py
import logging
logger = logging.getLogger(__name__)  # logger named "foo"
logger.addHandler(logging.NullHandler())

# Inside foo/bar/__init__.py
import logging
logger = logging.getLogger(__name__)  # logger named "foo.bar"

# Inside foo/bar/baz/__init__.py
import logging
logger = logging.getLogger(__name__)  # logger named "foo.bar.baz"
And then inside your application, you can simply -
# Get access to RootLogger
_logger = logging.getLogger()
_logger.addHandler(logging.StreamHandler()) # Handler is attached to root logger.
our_logger = logging.getLogger("app")
It's possible to configure the logging in much more detail, but the above should often be good enough.
What this will do is create a structure like the following (and the default propagate=True will make sure the RootLogger's handler gets called):
RootLogger (StreamHandler)
 \
  \_ AppLogger
 \
  \_ Foo.logger
      \_ Bar.logger
          \_ Baz.logger
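With this structure in place, a record logged deep inside the library surfaces through the root logger's StreamHandler without any handler being attached to the library's own loggers:

import logging

# warning() is used here since the root logger's default level is WARNING.
logging.getLogger("foo.bar.baz").warning("something happened")
# The record propagates foo.bar.baz -> foo.bar -> foo -> root, where the
# StreamHandler attached above finally emits it.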
Summary
We explored some of the details of Python's logging module and looked at some good practices worth following while implementing logging for your application. This should come in handy while reading and understanding the documentation, and perhaps while dealing with the occasionally unexpected behavior.