
XML Logging

I don't think it's an overstatement to say that XML is one of the most misused and overengineered technologies of the past decade. XML is good for markup of text documents, but it's clearly a bad format for most other purposes. Nevertheless, many people (me included!) misuse XML for things like configuration files and data export, simply because the XML tools are so ubiquitous today. If XML is used by some part of the application, there is a temptation to use it for all sorts of other things as well, whenever the need for a representation syntax comes along. So rather than bundling, e.g., the more appropriate JSON library for data transport, a clumsier XML representation is often used simply because the tools are already in place.

Structured log files are, however, one of those cases where I think the XML format makes sense. It's actually quite reasonable to think of log files as heavily structured text documents, for which XML markup is a good fit. The classical Unix syslog format, with long lines composed of space-separated strings, becomes difficult to handle for both machines and humans whenever the content's structural complexity is non-trivial.

But there is one big problem with using XML for logging. Log files are typically written incrementally, where each new entry is appended separately to the end of the file. This doesn't fit in at all with the XML document structure which mandates that all content must be enclosed within a single top-level "root element". This is one of the many deficiencies of XML that disqualifies it as a "general-purpose" syntax. But it turns out that there are workarounds for this problem. It wasn't obvious to me at first how to do this, so I have made a note of it here.

First: Write the actual log file by appending XML fragments, without any root element, where each fragment is an XML element that represents a log entry. Such log files can be rotated and concatenated without any need to worry about XML document boundaries. The JAXB library supports writing XML fragments instead of documents: just set the Marshaller property Marshaller.JAXB_FRAGMENT to Boolean.TRUE, which suppresses the XML declaration that would otherwise precede every entry.
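As a rough illustration of the fragment-appending idea, here is a minimal sketch that uses the JDK's built-in StAX writer instead of JAXB, so it runs without any extra library. The class name FragmentLogger, the element name "entry" and the attribute "level" are made up for this example; a JAXB-based logger would marshal its own entry class with JAXB_FRAGMENT set instead.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical helper: appends one XML element per log entry, with no root element.
public class FragmentLogger {

    // Serializes a single <entry> element. writeStartDocument() is never
    // called, so no XML declaration is written in front of the element.
    static String toFragment(String level, String message) throws XMLStreamException {
        StringWriter out = new StringWriter();
        XMLStreamWriter w = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        w.writeStartElement("entry");
        w.writeAttribute("level", level);
        w.writeCharacters(message); // markup characters in the message are escaped
        w.writeEndElement();
        w.flush();
        w.close();
        return out.toString();
    }

    // Appends the fragment to the log file, creating the file if necessary.
    static void append(Path logFile, String level, String message)
            throws IOException, XMLStreamException {
        String line = toFragment(level, message) + "\n";
        Files.write(logFile, line.getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    public static void main(String[] args) throws Exception {
        Path logFile = Paths.get("logfile.txt");
        append(logFile, "INFO", "server started");
        append(logFile, "WARN", "cache miss & fallback");
    }
}
```

Since every call writes a complete element and nothing else, the resulting file can be rotated or concatenated freely, which is the property the first step relies on.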

Second: Enclose the log file's content within a root element before reading, so an XML document is presented to the parser. This was the tricky part, because it wasn't obvious to me how to do this without copying the whole log file each time some program wants to read the log document. But a small wrapper document with a special DOCTYPE declaration makes it possible:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE log [<!ENTITY data SYSTEM "logfile.txt">]>
<log>
   &data;
</log>
The ENTITY definition makes the XML parser (here, the one JAXB uses underneath) fill in the contents of "logfile.txt" where it says "&data;". Such a wrapper document can be used as it is, for example as input to an XSLT processor. Alternatively, the wrapper document can be created on the fly by a Java program. Here is an example:
String filename = "logfile.txt";

// Build the wrapper document in memory; the external entity points at the log file.
String x = "<?xml version='1.0' encoding='UTF-8' standalone='yes'?>\n" +
   "<!DOCTYPE log [<!ENTITY data SYSTEM '" + filename + "'>]>\n" +
   "<log>&data;</log>\n";

// Unmarshal the wrapper; the parser expands &data; while reading.
InputStream s = new ByteArrayInputStream(x.getBytes("UTF-8"));
JAXBContext jc = JAXBContext.newInstance("mypkg.jaxb.log");
Unmarshaller u = jc.createUnmarshaller();
JAXBElement<Log> root = (JAXBElement<Log>) u.unmarshal(s);
Log log = root.getValue();

for (LogEntry e : log.getLogEntry()) {
   ...
}
The technique can be extended to wrap multiple log files at once. Just add one ENTITY definition per file and list the corresponding entity references, in the desired order, inside the root element of the wrapper document.
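To make the multi-file variant concrete, here is a small sketch that generates a wrapper with one entity per log file and parses it with the JDK's DOM parser rather than JAXB, so it runs without generated classes. The class name MultiLogWrapper, the entity names data0, data1, ..., the element name "entry" and the file names are assumptions for illustration only. It also assumes the parser is allowed to load external entities, which is the default in the JDK versions I know of but may be restricted by configuration; since external entities read arbitrary files, this should only be done with wrapper documents you generate yourself.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper: wraps several fragment-only log files in one document.
public class MultiLogWrapper {

    // Builds a wrapper with one external entity declaration and one
    // entity reference per log file, in the order given.
    static String wrapperFor(List<String> fileNames) {
        StringBuilder dtd = new StringBuilder();
        StringBuilder body = new StringBuilder();
        for (int i = 0; i < fileNames.size(); i++) {
            dtd.append("<!ENTITY data").append(i)
               .append(" SYSTEM '").append(fileNames.get(i)).append("'>\n");
            body.append("&data").append(i).append(";\n");
        }
        return "<?xml version='1.0' encoding='UTF-8'?>\n"
             + "<!DOCTYPE log [\n" + dtd + "]>\n"
             + "<log>\n" + body + "</log>\n";
    }

    // Parses the wrapper and counts <entry> elements across all log files.
    static int countEntries(Path dir, List<String> fileNames) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        InputSource src = new InputSource(new StringReader(wrapperFor(fileNames)));
        // Pretend the wrapper lives in the log directory, so that the
        // relative file names in the entity declarations resolve against it.
        src.setSystemId(dir.resolve("wrapper.xml").toUri().toString());
        Document doc = builder.parse(src);
        return doc.getElementsByTagName("entry").getLength();
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("logs");
        Files.write(dir.resolve("a.log"),
                "<entry>one</entry>\n<entry>two</entry>\n".getBytes(StandardCharsets.UTF_8));
        Files.write(dir.resolve("b.log"),
                "<entry>three</entry>\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(countEntries(dir, Arrays.asList("a.log", "b.log")));
    }
}
```

Setting a system ID on the input source is the one detail that differs from parsing an in-memory string directly: without it, the parser has no obvious base location for relative file names.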


