Translate

вівторок, 15 лютого 2011 р.

10 Tips for Proper Application Logging


 Our latest JCP partnerTomasz Nurkiewicz, has submitted a number of posts describing
 the basic principles of proper application logging. I found them quite interesting, thus
 I decided to aggregate them in a more compact format and present them to you. 
So, here are his suggestions for clean and helpful logs:

(NOTE: The original posts have been slightly edited to improve readability)

1) Use the appropriate tools for the job

Many programmers seem to forget how important is logging an application's behavior 
and its current activity. When somebody puts:

1
log.info("Happy and carefree logging");

happily somewhere in the code, he probably doesn't realize the importance of 
application logs during maintenance, tuning and failure identification. Underestimating
 the value of good logs is a terrible mistake.

In my opinion, SLF4J is the best logging API available, mostly because of a great pattern 
substitution support:

1
log.debug("Found {} records matching filter: '{}'", records, filter);

In Log4j you would have to use:

1
log.debug("Found " + records + " records matching filter: '" + filter + "'");

This is not only longer and less readable, but also inefficient because of extensive use
 of string concatenation. SLF4J adds a nice {} substitution feature. Also, because 
string concatenation is avoided and toString() is not called if the logging statement is
 filtered, there is no need for isDebugEnabled() anymore. BTW, have you noticed
 single quotes around filter string parameter?

SLF4J is just a façade. As an implementation I would recommend the Logback
 framework, already advertised, instead of the well established Log4J. It has many
 interesting features and, in contrary to Log4J, is actively developed.

The last tool to recommend is Perf4J. To quote their motto:

Perf4J is to System.currentTimeMillis() as log4j is to System.out.println()

I've added Perf4J to one existing application under heavy load and seen it in
 action in few other. Both administrators and business users were impressed by
 the nice graphs produced by this simple utility. Also we were able to discover performance
 flaws in no time. Perf4J itself deserves its own article, but for now just check
 their Developer Guide.

Additionally, note that Ceki Gülcü (founder of the Log4JSLF4J and Logback projects)
 suggested a simple approach to get rid of commons-logging dependency (see his comment).



2) Don't forget, logging levels are there for you

Every time you make a logging statement, you think hard which logging level is
 appropriate for this type of event, don't you? Somehow 90% of programmers never 
pay attention to logging levels, simply logging everything on the same level, typically
 INFO or DEBUG. Why? Logging frameworks have two major benefits over
 System.out., i.e. categories and levels. Both allow you to selectively filter logging
 statements permanently or only for diagnostics time. If you really can’t see the
 difference, print this table and look at it every time you start typing “log.” in your IDE:

ERROR – something terribly wrong had happened, that must be investigated
 immediately. No system can tolerate items logged on this level. Example: NPE, 
database unavailable, mission critical use case cannot be continued.

WARN – the process might be continued, but take extra caution. Actually I 
always wanted to have two levels here: one for obvious problems where 
work-around exists (for example: "Current data unavailable, using cached 
values") and second (name it: ATTENTION) for potential problems and
 suggestions. Example: "Application running in development mode" 
or "Administration console is not secured with a password". 
The application can tolerate warning messages, but they should always
 be justified and examined.

INFO – Important business process has finished. In ideal world, administrator
 or advanced user should be able to understand INFO messages and quickly
 find out what the application is doing. For example if an application is all about
 booking airplane tickets, there should be only one INFO statement per each ticket 
saying "[Who] booked ticket from [Where] to [Where]". Other definition of INFO 
message: each action that changes the state of the application significantly (database
 update, external system request).

DEBUG – Developers stuff. I will discuss later what sort of information deserves to 
be logged.

TRACE – Very detailed information, intended only for development. You might keep trace
 messages for a short period of time after deployment on production environment, 
but treat these log statements as temporary, that should or might be turned-off eventually.
 The distinction between DEBUG and TRACE is the most difficult, but if you put logging 
statement and remove it after the feature has been developed and tested, it should 
probably be on TRACE level.

The list above is just a suggestion, you can create your own set of instructions to follow,
 but it is important to have some. My experience is that always everything is logged 
without filtering (at least from the application code), but having the ability to quickly 
filter logs and extract the information with proper detail level, might be a life-saver.

The last thing worth mentioning is the infamous is*Enabled() condition. Some put
 it before every logging statement:

1
if(log.isDebugEnabled())
2
    log.debug("Place for your commercial");

Personally, I find this idiom being just clutter that should be avoided. The performance
 improvement (especially when using SLF4J pattern substitution discussed previously)
 seems irrelevant and smells like a premature optimization. Also, can you spot the 
duplication? There are very rare cases when having explicit condition is justified – 
when we can prove that constructing logging message is expensive. In other situations,
 just do your job of logging and let logging framework do its job (filtering).

3) Do you know what you are logging?

Every time you issue a logging statement, take a moment and have a look at what 
exactly will land in your log file. Read your logs afterwards and spot malformed 
sentences. First of all, avoid NPEs like this:

1
log.debug("Processing request with id: {}", request.getId());

Are you absolutely sure that request is not null here?

Another pitfall is logging collections. If you fetched collection of domain objects from
 the database using Hibernate and carelessly log them like here:

1
log.debug("Returning users: {}", users);

SLF4J will call toString() only when the statement is actually printed, which is quite nice.
 But if it does... Out of memory error, N+1 select problem, thread starvation (logging is
 synchronous!), lazy initialization exception, logs storage filled completely – each of these 
might occur.

It is a much better idea to log, for example, only ids of domain objects (or even only size of 
the collection). But making a collection of ids when having a collection of objects having 
getId() method is unbelievably difficult and cumbersome in Java. Groovy has a 
great spread operator (users*.id), in Java we can emulate it using theCommons Beanutils
 library:

1
log.debug("Returning user ids: {}", collect(users, "id"));

Where collect() method can be implemented as follows:

1
public static Collection collect(Collection collection, String propertyName) {
2
    return CollectionUtils.collect(collection, newBeanToPropertyValueTransformer(propertyName));
3
}

The last thing to mention is the improper implementation or usage of toString(). First, 
create toString() for each class that appears anywhere in logging statements, preferably 
using ToStringBuilder (but not its reflectivecounterpart). Secondly, watch out for 
arrays and non-typical collections. Arrays and some strange collections might not have
 toString() implemented calling toString() of each item. Use Arrays #deepToString 
JDK utility method. And read your logs often to spot incorrectly formatted messages.

4) Avoid side effects

Logging statements should have no or little impact on the application's behavior.
 Recently a friend of mine gave an example of a system that threw Hibernates' 
LazyInitializationException only when running on some particular environment. 
As you’ve probably guessed from the context, some logging statement caused 
lazy initialized collection to be loaded when session was attached. On this environment 
the logging levels were increased and collection was no longer initialized. 
Think how long would it take you to find a bug without knowing this context?

Another side effect is slowing the application down. Quick answer: if you log
 too much or improperly use toString() and/or string concatenation, logging has a 
performance side effect. How big? Well, I have seen server restarting every 15
 minutes because of a thread starvation caused by excessive logging. Now this
 is a side effect! From my experience, few hundreds of MiB is probably the upper limit 
of how much you can log onto disk per hour.

Of course if logging statement itself fails and causes business process to terminate 
due to exception, this is also a huge side effect. I have seen such a construct to avoid this:

1
try {
2
    log.trace("Id=" + request.getUser().getId() + " accesses " + manager.getPage().getUrl().toString())
3
catch(NullPointerException e) {}

This is a real code, but please make the world a bit better place and don’t do it, ever.

5) Be concise and descriptive

Each logging statement should contain both data and description. Consider the following examples:

1
log.debug("Message processed");
2
log.debug(message.getJMSMessageID());
3
4
log.debug("Message with id '{}' processed", message.getJMSMessageID());

Which log would you like to see while diagnosing failure in an unknown application?
 Believe me, all the examples above are almost equally common. Another anti-pattern:

1
if(message instanceof TextMessage)
2
    //...
3
else
4
    log.warn("Unknown message type");

Was it so hard to include thee actual message type, message id, etc. in the warning 
string? I know something went wrong, but what? What was the context?

A third anti-pattern is the "magic-log". Real life example: most programmers in 
the team knew that 3 ampersands followed by exclamation mark, followed by hash, 
followed by pseudorandom alphanumeric string log means "Message with XYZ id
 received". Nobody bothered to change the log, simply someone hit the keyboard 
and chose some unique "&&&!#" string, so that it can be easily found by himself.

As a consequence, the whole logs file looks like a random sequence of characters. 
Somebody might even consider that file to be a valid Perl program. Instead, a log file 
should be readable, clean and descriptive. Don't use magic numbers, log values,
 numbers, ids and include their context. Show the data being processed and show
 its meaning. Show what the program is actually doing. Good logs can serve as a 
great documentation of the application code itself.

Did I mention not to log passwords and any personal information? Don't!


6) Tune your pattern

Logging pattern is a wonderful tool, that transparently adds a meaningful context to 
every logging statement you make. But you must consider very carefully which 
information to include in your pattern. For example, logging date when your logs roll 
every hour is pointless as the date is already included in the log file name. On the 
contrary, without logging the thread name you would be unable to track any
 process using logs when two threads work concurrently – the logs will overlap. 
This might be fine in single-threaded applications – that are almost dead nowadays.

From my experience, the ideal logging pattern should include (of course except the
 logged message itself): current time (without date, milliseconds precision), logging
 level, name of the thread, simple logger name (not fully qualified) and the message.
 In Logback it is something like:

1
<appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
2
    <encoder>
3
        <pattern>%d{HH:mm:ss.SSS} %-5level [%thread][%logger{0}] %m%npattern>
4
    encoder>
5
appender>

You should never include file name, class name and line number, although it’s very
 tempting. I have even seen empty log statements issued from the code:

1
log.info("");

because the programmer assumed that the line number will be a part of the logging pattern
 and he knew that "If empty logging message appears in 67th line of the file (in authenticate() 
method), it means that the user is authenticated". Besides, logging class name, method 
name and/or line number has a serious performance impact.

A somewhat more advanced feature of a logging frameworks is the concept of 
Mapped Diagnostic Context.MDC is simply a map managed on a thread-local basis.
 You can put any key-value pair in this map and since then every logging statement
 issued from this thread is going to have this value attached as part of the pattern.

7) Log method arguments and return values

When you find a bug during development, you typically run a debugger trying to track
 down the potential cause. Now imagine for a while that you can’t use a debugger. For
 example, because the bug manifested itself on a customer environment few days ago 
and everything you have is logs. Would you be able to find anything in them?

If you follow the simple rule of logging each method input and output (arguments and return
 values), you don’t even need a debugger any more. Of course, you must be reasonable but 
every method that: accesses external system (including database), blocks, waits, etc. 
should be considered. Simply follow this pattern:

1
public String printDocument(Document doc, Mode mode) {
2
    log.debug("Entering printDocument(doc={}, mode={})", doc, mode);
3
    String id = //Lengthy printing operation
4
    log.debug("Leaving printDocument(): {}", id);
5
    return id;
6
}

Because you are logging both the beginning and the end of method invocation, you 
can manually discover inefficient code and even detect possible causes of deadlocks 
and starvation – simply by looking after "entering" without corresponding "leaving". 
If your methods have meaningful names, reading logs would be a pleasure. Also, analyzing 
what went wrong is much simpler, since on each step you know exactly what has been
 processed. You can even use a simple AOP aspect to log a wide range of methods in 
your code. This reduces code duplication, but be careful, since it may lead to enormous 
amount of huge logs.

You should consider DEBUG or TRACE levels as best suited for these types of logs. And 
if you discover some method are called too often and logging might harm performance, 
simply decrease logging level for that class or remove the log completely (maybe leaving just 
one for the whole method invocation?) But it is always better to have too much rather than
 too few logging statements. Treat logging statements with the same respect as unit tests – 
your code should be covered with logging routines as it is with unit tests. No part of the system 
should stay with no logs at all. Remember, sometimes observing logs rolling by is the only way 
to tell whether your application is working properly or hangs forever.

8) Watch out for external systems

This is the special case of the previous tip: if you communicate with an external system, consider
 logging every piece of data that comes out from your application and gets in. Period. Integration
 is a tough job and diagnosing problems between two applications (think two different vendors, 
environments, technology stacks and teams) is particularly hard. Recently, for example, we've 
discovered that logging full messages contents, includingSOAP and HTTP headers in 
Apache CXF web services is extremely useful during integration and system testing.

This is a big overhead and if performance is an issue, you can always disable logging. But
 what is the point of having a fast, but broken application, that no one can fix? Be extra careful
 when integrating with external systems and prepare to pay that cost. If you are lucky and all
 your integration is handled by an ESB, then the bus is probably the best place to log every
 incoming request and response. See for example Mules' log-component.

Sometimes the amount of information exchanged with external systems makes it 
unacceptable to log everything. On the other hand during testing and for a short period of 
time on production (for example when something wrong is happening), we would like to 
have everything saved in logs and are ready to pay performance cost. This can be achieved 
by carefully using logging levels. Just take a look at the following idiom:

1
Collection requestIds = //...
2
if(log.isDebugEnabled())
3
    log.debug("Processing ids: {}", requestIds);
4
else
5
    log.info("Processing ids size: {}", requestIds.size());

If this particular logger is configured to log DEBUG messages, it will print the whole
 requestIds collection contents. But if it is configured to print INFO messages, only the size
 of collection will be outputted. If you are wondering why I forgot about isInfoEnabled()
 condition, go back to tip #2. One thing worth mentioning is that requestIds collection should 
not be null in this case. Although it will be logged correctly as null if DEBUG is 
enabled, but big fat NullPointerException will be thrown if logger is configured to INFO. 
Remember my lesson about side effects in tip #4?

9) Log exceptions properly

First of all, avoid logging exceptions, let your framework or container (whatever it is) do it for
 you. There is one, ekhem, exception to this rule: if you throw exceptions from some remote
 service (RMI, EJB remote session bean, etc.), that is capable of serializing exceptions, 
make sure all of them are available to the client (are part of the API). Otherwise the 
client will receive NoClassDefFoundError: SomeFancyException instead of the "true" error.

Logging exceptions is one of the most important roles of logging at all, but many programmers
 tend to treat logging as a way to handle the exception. They sometimes return default value
 (typically null, 0 or empty string) and pretend that nothing has happened. Other times 
they first log the exception and then wrap it and throw it back:

1
log.error("IO exception", e);
2
throw new MyCustomException(e);

This construct will almost always print the same stack trace two times, because something
 will eventually catch MyCustomException and log its cause. Log, or wrap and throw back
(which is preferable), never both, otherwise your logs will be confusing.

But if we really do WANT to log the exception? For some reason (because we don’t
 read APIs and documentation?), about half of the logging statements I see are wrong. 
Quick quiz, which of the following log statements will log the NPE properly?






01 try {
02 Integer x = null;
03 ++x;
04 } catch (Exception e) {
05 log.error(e); //A
06 log.error(e, e); //B
07 log.error("" + e); //C
08 log.error(e.toString()); //D
09 log.error(e.getMessage()); //E
10 log.error(null, e); //F
11 log.error("", e); //G
12 log.error("{}", e); //H
13 log.error("{}", e.getMessage()); //I
14 log.error("Error reading configuration file: " + e); //J
15 log.error("Error reading configuration file: " + e.getMessage()); //K
16 log.error("Error reading configuration file", e); //L
17 }

Surprisingly, only G and preferably L are correct! A and B don’t even compile in SLF4J, 
others discard stack traces and/or print improper messages. For example, E will not print 
anything as NPE typically doesn't provide any exception message and the stack trace won’t
 be printed as well. Remember, the first argument is always the text message, write 
something about the nature of the problem. Don’t include exception message, as it will be 
printed automatically after the log statement, preceding stack trace. But in order to do so, 
you must pass the exception itself as the second argument.

10) Logs easy to read, easy to parse

There are two groups of receivers particularly interested in your application logs: human
 beings (you might disagree, but programmers belong to this group as well) and computers
 (typically shell scripts written by system administrators). Logs should be suitable for both 
of these groups. If someone looking from behind your back at your application logs sees
 (source Wikipedia):


then you probably have not followed my tips. Logs should be readable and easy to understand
 just like the code should.

On the other hand, if your application produces half GB of logs each hour, no man and no
 graphical text editor will ever manage to read them entirely. This is where old-school grep
sed and awk come in handy. If it is possible, try to write logging messages in such a way, 
that they could be understood both by humans and computers, e.g. avoid formatting of numbers,
 use patterns that can be easily recognized by regular expressions, etc. If it is not possible, 
print the data in two formats:

1
log.debug("Request TTL set to: {} ({})"new Date(ttl), ttl);
2
// Request TTL set to: Wed Apr 28 20:14:12 CEST 2010 (1272478452437)
3
4
final String duration = DurationFormatUtils.formatDurationWords(durationMillis, 
truetrue);
5
log.info("Importing took: {}ms ({})", durationMillis, duration);
6
//Importing took: 123456789ms (1 day 10 hours 17 minutes 36 seconds)

Computers will appreciate "ms after 1970 epoch" time format, while people would be delighted
 seeing "1 day 10 hours 17 minutes 36 seconds" text. BTW take a look at DurationFormatUtils,
 nice tool.

That's all guys, a “logging tips extravaganza” from our JCP partnerTomasz Nurkiewicz. Don't 
forget to share!
Взято звідси
P.S.
Особисто раджу звернути увагу на Logback framework. Log4J  вчорашній день :)


3 коментарі: