Andre's Blog
Perfection is when there is nothing left to take away
SSW, subway edition

It has been over a year since I released the last version of Stone Steps Webalizer. The main reason is a total lack of time - the other projects I'm involved in keep me busy day to day. However, I was not about to give up on SSW, so about a month ago I started looking for ways to continue the development. After some thinking, it dawned on me that I waste about 40 minutes on the subway every day, and I began looking for a netbook.

Variable argument lists on x64

People have been reporting x64 builds of Stone Steps Webalizer crashing on Linux for about a year. Even though I could see from the stack trace that the problem was related to the variable argument list passed into vsnprintf, I couldn't figure out what exactly was going on, because I don't have 64-bit hardware to reproduce the problem in a debugger.

The call stack always ended up in strlen called for a bad string with an invalid address, usually 0x3:
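For readers who have not run into this class of bug, here is a minimal sketch of one well-known way a variable argument list can go bad on x64. It is an assumption made for illustration, not necessarily the cause of the crashes described above. On the x86-64 System V ABI, va_list is an array type, so a function that receives it advances the caller's copy. Reusing the same va_list for a second vsnprintf call then reads garbage, which a %s conversion will happily hand to strlen. The function name format_string is hypothetical and is not part of Stone Steps Webalizer.

```cpp
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper, for illustration only: formats printf-style
// arguments into a std::string using two vsnprintf passes.
std::string format_string(const char *fmt, ...)
{
    va_list args;
    va_start(args, fmt);

    // The first pass only measures the output. It runs on a copy because
    // vsnprintf leaves the va_list it was given in an indeterminate state.
    va_list args_copy;
    va_copy(args_copy, args);
    int len = std::vsnprintf(nullptr, 0, fmt, args_copy);
    va_end(args_copy);

    if (len < 0) {
        va_end(args);
        return std::string();
    }

    // The second pass formats for real, using the untouched original list.
    std::string out(static_cast<std::size_t>(len) + 1, '\0');
    std::vsnprintf(&out[0], out.size(), fmt, args);
    va_end(args);

    out.resize(static_cast<std::size_t>(len));   // drop the terminating null
    return out;
}
```

Without the va_copy, the same code often appears to work on 32-bit x86, where va_list is typically a plain pointer passed by value, which is why this kind of bug tends to surface only in 64-bit builds.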

A look back and plans for 2009

I made nine releases of Stone Steps Webalizer in 2008. The most notable feature added in 2008 was XML/XSL reporting, which gives website administrators full control over the generated HTML. About six thousand people downloaded various numbers of copies in 2008.

One of the challenges of 2008 was the lack of funding - not a single donation was contributed to help the project in 2008. Hardware, some commercial software and co-location are not cheap, and I hope to see more support in 2009.

Time to think about new features. Here is what I have in mind, ordered by priority. If you think something is missing, leave a comment or start a discussion thread in the forums.

Viewing all items in XML reports

Those who tried XML reports have noticed that there are no links at the bottom of the reports when the number of items, such as hosts or referrers, is greater than the configured top number of items. The reason for this is that, unlike with HTML reports, it does not make much sense to generate the same XML data twice (i.e. once in the top items report and again in the report listing all items). I have been experimenting with various approaches to this problem and have finally found a solution I like.

Moving Linux installation to new hardware

I finally decided to abandon the old 700 MHz box I was using as a CVS repository and for Fedora builds of Stone Steps Webalizer. The replacement machine is not new, but I just cannot complain about a 2.8 GHz CPU and extra storage! Before this weekend, I had never restored a Linux backup onto new hardware, and I learned a thing or two about Linux in the past couple of days.

Flash charting - not too flashy

The original Webalizer PNG graphs look quite small on a high-resolution screen, which is pretty much any screen nowadays, and they are not very easy to work on because there is no layout engine. Poor antialiasing in the underlying GD library does not help quality either. Being able to produce better graphs was one of the reasons I added XML reports to Stone Steps Webalizer. For the last couple of months I have mostly been working on making sure that the XSL templates that will be included in the Stone Steps Webalizer package are easy to use with various Flash charting packages.

I asked for a count, not a life story!

One of the main reasons I switched the state file to use Berkeley DB was to allow Stone Steps Webalizer to generate reports without loading the entire monthly data set into memory, which may be hundreds of megabytes in size for high-traffic web sites and proxies. Once this was implemented, report generators just had to open the database and traverse a few records at the top of each table, such as hosts or URLs, in order to generate a relevant report, which took almost no time. Soon, however, I noticed that generating top-x reports using a 600+ MB database took minutes and used as much memory as if the entire database was being read.

A blast from the past

A couple of weeks ago I was looking at the Stone Steps Webalizer website stats and noticed a sharp drop in visits. Usually that would mean that there was something wrong with the infrastructure, but taking a closer look at the server and the network, I couldn't find anything that would point to the problem. The next day was similar, which made me wonder what kind of world event had happened to draw traffic away from my site.

Mangling user agents is a good thing!

User agent strings come in all shapes and sizes, and showing full user agent strings in reports results in too much fragmentation, as every little detail, such as a service pack or a minor version change, results in a new user agent string in the report.

MangleAgents is a configuration parameter that has been around for a while and is designed, despite its name, to tidy up user agent strings and leave only those parts of the user agent string that are interesting from the analysis point of view.
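To make the idea concrete, here is a minimal sketch of this kind of tidying. It is a hypothetical illustration only and is not the MangleAgents algorithm: the function name simplify_agent and the rule it applies (keep the last product token and cut its version down to the major number) are assumptions chosen to keep the example short.

```cpp
#include <cstddef>
#include <string>

// Hypothetical illustration only; this is NOT how MangleAgents works.
// Reduces a user agent string to "Product/major" so that minor versions
// and platform details collapse into a single report row.
std::string simplify_agent(const std::string& agent)
{
    // Take the last space-separated token, which for many browsers
    // names the product, e.g. "Firefox/3.0.5".
    std::size_t space = agent.find_last_of(' ');
    std::string token = (space == std::string::npos) ? agent
                                                     : agent.substr(space + 1);

    // Without a product/version pair there is nothing safe to trim,
    // so the original string is returned unchanged.
    std::size_t slash = token.find('/');
    if (slash == std::string::npos)
        return agent;

    // Keep everything up to the first dot after the slash (major version).
    std::size_t dot = token.find('.', slash + 1);
    return (dot == std::string::npos) ? token : token.substr(0, dot);
}
```

With this rule, two Firefox user agents that differ only in a minor version or in the operating system both come out as Firefox/3 and are counted together, while strings that do not end in a product/version token are left as they are.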

XML Reports in Stone Steps Webalizer

Generating reports in XML has been on my list of things to do for a while and I finally got around to working on it. One might ask, what is so significant about XML reports and why would an average webmaster be interested in them? Good question.

XML and related technologies provide a neat and powerful way to separate what reports contain, such as hit and visit counts or a list of hosts and URLs, from how reports are presented. As simple as this sounds (and maybe cryptic to some readers), this separation is the basis for better-looking and much more customizable reports.