Andre's Blog
Perfection is when there is nothing left to take away
Here be dragons

Working with source code for years, I have seen much of good code and code that is, well, not quite thought through. The ratio between the two has always been in favor of good code, in my opinion. However, something is changing in the software development landscape and the overall code quality seems to be consistently going down. Now more than ever I find bad code where it should not be.

I see more than one reason for this trend - partially it is because the hardware has become faster and cheaper and people just prefer to buy more of it rather than to produce better code, partially it is because many of the modern languages favor language simplicity over versatility and partially it is because education seems to be trying to cover more technologies rather than to explore any one of them in depth.

Once is more than enough

One of the main indicators of bad code is that it does something unnecessary and wastes resources. Consider the following example, which obtains a pointer to some address and then checks whether the returned address string is empty or not before processing it.

const char *get_address(void);
if(strlen(get_address()) != 0)
   process_address(get_address());

This code wastes CPU cycles because strlen has to scan the string in order to find the terminating null character. The resulting length is simply discarded and the string would have to be traversed again when processed later by process_address.

It is also important to know what exactly get_address has to do in order to return an address string. If it's known that the returned address is readily available and is guaranteed to be returned by get_address in fixed time, then calling get_address one more time will not slow down this code. However, if retrieving an address requires any additional work, such as querying database or a network component, then calling get_address twice is wasteful.

Finally, if get_address can fail and would return a NULL pointer, this code would simply crash.

The first two issues make this code wasteful, which means it's bad code, even though many people will think there's nothing wrong with the code, since it it does produce expected results. The third issue, not checking the returned pointer for NULL, is just a bug.

Let's rewrite this example to make it good, bug-free code. Call get_address only once and hold on to the returned pointer, which can be checked for NULL and whether the string it points to is empty or not without wasting CPU or resources used by get_address.

const char *get_address(void);
const char *address = get_address();
if(address && *address)
   process_address(address);

Yes, it is mutable!

Another common example of wasteful code is excessive use of the STL substr method. Some developers cannot comprehend the concept of a string being made of characters and instead view a string as an unmutable entity.

The basic algorithm to replace a portion of a string is quite simple - locate the part you want to replace, shift the remainder of the string to the left or right and copy the replacement into the space that was just created. Simple, right?

Well, that works only if you see the characters inside a string. Otherwise you have to take the part of the string the precedes the text being replaced, then the part that follows it and add all three strings, the first part, the replacement and the last part together:

std::string& replace_example(std::string& str, 
               const std::string& pat, const std::string& repl)
{
	size_t rpos;
	
	if((rpos = str.find(pat)) == std::string::npos) 
		return str;
		
	std::string p1 = str.substr(0, rpos);
	std::string p2 = str.substr(rpos+repl.length());
	
	return str = p1 + repl + p2;
}

Performance of such code is obviously substandard because of all the copying and possible memory allocations. The alternative is very simple and works several times faster:

std::string& replace_example(std::string& str, 
               const std::string& pat, const std::string& repl)
{
	size_t rpos;
	
	if((rpos = str.find(pat)) == std::string::npos) 
		return str;
		
	return str.replace(rpos, repl.length(), repl);
}

Rewritten code is also easier to read, which brings me to the next point.

What the heck is that?!

Code readability is a very important factor in writing maintainable code. Granted, sometimes we decide to bypass readability in favor of speed or memory gains (in which case code comments must be provided), but in most cases hard-to-read code means it is bad code.

Unreadable code doesn't always look naive and amateurish. Sometimes it looks quite scientific and thought through. Once I came across of the code similar to the fragment shown below. The comment above simply said: "Here be dragons".

template <typename T, typename U>
class derived : public T, public U {};

class color : public std::string {};
class bgcolor : public std::string {};
class font : public std::string {};
class nothing {};

class style : public derived<color, 
	derived<bgcolor, 
	derived<font, nothing> > > {};

It took me a couple of moments to realize what exactly the author of the code was trying to do and what benefit it would provide. Apparently, the developer was trying to use the names of base classes as if they were data members, so that the code would be used this way:

style md;
((color&) md).assign("red");
((bgcolor&) md).append("white");

After I wrapped my brain around the approach, I couldn't help but smile - it was obvious that the developer was trying to write something exotic in C++, which is, by itself, is a great exercise, but only when done in throw-away test code.

I'm going just to tweak it a bit...

The const qualifier is a great feature in C++, which allows interface designers to enforce read-only parameters, return values or variables. Unfortunately, some developers see the const qualifier as a mere obstacle on their way to a compiled executable and try to resolve compiler errors by casting away const qualifiers instead of writing good code.

Consider the following example. The function below returns a reference to a const STL string object containing a path to the application home directory. Such string can be safely used in a multi-threaded environment because it cannot be changed by any of the callers.

const std::string& get_base_path(void);

If one needs to create a fully-qualified path to a file located underneath the base path, a copy of the base path should be made and then the copy can be updated to contain the remainder of the final path:

std::string path(get_base_path());
path.append("/file.ext");

Inexperienced developers choose to simply cast the return value to a non-const version to avoid the compiler error that would be triggered by this code otherwise:

std::string& path = (std::string&) get_base_path();
path.append("/file.ext");

This trick almost always works in debug environments because there is no concurrency and the shared base path is modified, creating the impression that the application is working as expected. However, executing string::append concurrently will almost certainly result in corrupted memory.

If only there was a variable that is destroyed when out of scope...

Some developers become auto_ptr happy and use it left and right, whether it is needed or not. For example, the following code allocates an instance of an STL string and calls a function that may throw an exception. The auto_ptr ensures that if an exception is indeed thrown, the instance of the string will be deallocated during stack unwinding.

void etf(const std::string*) throw(std::exception);
void f(void)
{
   std::auto_ptr<std::string> sptr(new std::string("abc"));
   etf(sptr.get());
}

This code works as expected functionally, but makes an unnecessary memory allocation. As long as the pointer to the string is not stored anywhere while inside the etf call, the same effect can be achieved using a good old local variable:

void etf(const std::string*) throw(std::exception);
void f(void)
{
   std::string sptr("abc");
   etf(&sptr);
}

This code will run considerably faster and is easier to maintain because it is less ambiguous about the ownership of the string object.

However, watch out for morbidly-obese objects, as they can cause stack overflows. A typical C++ object will be well under 1KB in size. Objects larger than 1 KB are still safe when defined as local variables, but you have to be aware of how many of such objects are allocated throughout the relevant call sequence. For example, if a 10KB object is defined as a local variable every time a binary tree node is traversed using recursion, up to 20 objects can be defined for a reasonably-balanced tree containing 1M nodes, allocating 200KB in total. If such tree is used within an application with a relatively small stack, such as IIS6, whose default stack is 256KB on x86, allocating 20 objects may cause stack to overflow.

goto is C++. Live with it!

The goto statement and, sometimes, the continue statement have always been frowned upon by many developers who consider themselves senior. Most of the corporate coding style documents chant the same mantra: "Thou shalt not use goto!".

It is true, goto may be misused and make code less maintainable, but so is the rest of the language. Here is one example of self-evident bad code using goto.

   if(cond1) {
      if(cond2)
         goto l1;
   }
   else {
l1:   printf("cond1 is false\n");
   }

However, in many cases using goto will improve code readability and will run faster. For example, a function with multiple validation conditions may become hard to maintain when there is more than just a couple of conditions. It is not unusual for such functions to exceed the width of the screen, which obviously doesn't help readability of the code.

void handler(const char *input)
{
   char *buf1 = new char[BUFSIZE];
   char *buf2 = new char[BUFSIZE];

   if(!check1(input, buf1, buf2)) goto funcexit;
   if(!check2(input, buf1, buf2)) goto funcexit;

   proc(input, buf1, buf2);

funcexit:
   delete [] buf1;
   delete [] buf2;
}

Using goto in this case makes code cleaner and, in some cases, faster (e.g. if exception handling would be used instead of goto's).

Conclusion

It is true that good code may contain bugs. It's still good code, though. Buggy, but good. When bugs are fixed, good code becomes working good code. Bad code, on the other hand is bad until it is rewritten or removed altogether. One wise man once said that there is one thing worse than an absurd idea - that would be an absurd idea that becomes a reality. Many of them do. Too many.

Comments:
Name:

Comment: