Code Style

Entity naming style

There are many different styles of naming the entities in your code, but the most important part of it is that you define what your style is and stick to it consistently, naming all the entities of the same type in the same way.

Naming style example

Here’s one example of the code style, the one that I personally use in my current projects.

Before I give it, let me explain what camel-casing is, since I refer to it in almost every line 🙂

Camel-casing is one of the methods of creating identifiers that combine several words. Following the camel-casing style:

  • the words are written in small letters;
  • each word (but the first in many situations) starts with a capital letter;
  • no underscores are used.

So, now to the example:

Variable names:

  • start with a small letter, camel-casing in case of multiple words (e.g. var, myVariable, count, numberOfPoints);
  • one-letter i,j,k,l for the simple integer loop indices (I tend to use i in cases of signed int and j in cases of unsigned when there is one loop without nesting);
  • name starts with p for a pointer, with pp for a pointer to pointer, etc. (e.g. pMyPointer, pSize, ppData, ppMyArray).

Structure and class names:

  • start with a capital letter, camel-casing (e.g. MyClass, HeatColorizer, PointInfo).

Interface names:

  • start with a capital letter I, camel-casing (e.g. IMyInterface, IColorizer).

Member method names:

  • start with a small letter, represent an ACTION, camel-casing (e.g. MyClass::doAction, IColorizer::colorizeVal, Car::startEngine).

Free function names:

  • start with a capital letter, represent an ACTION, camel-casing (e.g. PerformProcedure, CreateColorizer, SendMessage).

Named constants and preprocessor defines

Named constants and preprocessor #defines (including header protection, see section II of my last post) are the only exceptions to camel-casing: they are written in all caps with underscores between the words (e.g. MY_DEFINE, MAX_LENGTH, ICOLORIZER_H_).
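To see the conventions side by side, here is a minimal sketch that applies them in one place. All the names in it (MAX_SIDE, IShape, Square, SumAreas) are made up for this illustration and do not come from any real project.

```cpp
#include <cassert>
#include <cstddef>

// Named constant: all caps with underscores.
const float MAX_SIDE = 100.0f;

// Interface: capital letter I, camel-casing.
struct IShape {
	// Member method: small first letter, represents an action.
	virtual float computeArea() const = 0;
	virtual ~IShape() {}
};

// Class: capital first letter, camel-casing.
class Square : public IShape {
public:
	explicit Square(float sideLength) : side(sideLength) {
		assert(side >= 0.0f && side <= MAX_SIDE);
	}
	virtual float computeArea() const { return side * side; }
private:
	float side;	// variable: small first letter
};

// Free function: capital first letter, represents an action.
// ppShapes is a pointer to pointers, hence the pp prefix.
float SumAreas(const IShape* const* ppShapes, std::size_t numberOfShapes) {
	assert(ppShapes != 0);
	float totalArea = 0.0f;
	for (std::size_t j = 0; j < numberOfShapes; j++) {	// j: unsigned loop index
		totalArea += ppShapes[j]->computeArea();
	}
	return totalArea;
}
```

With this in place, a reader can tell from the name alone whether something is a constant, a type, a method or a free function.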

Once again, it is not that important which style you use for your solo projects: just pick one and stick to it. However, it is very important that bigger projects have a consistent style across all source files and packages; thus, the naming style is one of the decisions that must be made at the group/team level.

General Tips

Public interfaces in C++

I mentioned class interfaces a couple of posts ago, when discussing the importance of having verbs in method names. Today I would like to expand on that and show how you can separate the public interface of a class from the class definition and/or implementation to minimize dependencies. (This is also the first post in a series on minimizing dependencies.)

A public interface of a class is simply the set of its public methods. (In C++, it also influences the actual binary structure of the class, but that gets quite complicated and I will refrain from going into details here.) You would like to separate the class interface from the actual class definition/implementation whenever you have several different classes that PROVIDE the same set of methods because they serve some common function, even if they are not in an IS-A relationship that is expressed via classic inheritance.

Let’s demonstrate that by studying a simple example in detail.

Case study: transfer function

The example is an implementation of a predefined transfer function, that converts a data value into a color for the visualization purposes. Essentially, all that a visualization widget needs to know is the color which is assigned to each of the data values, and it does not really care if the transfer function object represents the black body radiation (“heat”) colormap, the rainbow colormap or any other. All that it really wants is to have an object and be able to call a method Color getColorForDataValue(float dataValue). The interface that I actually have may look a bit more complicated, but please stay with me 🙂 Take a look at the header file:

/*! @file icolorizer.h
 * @author Anatoliy Antonov
 * @brief IColorizer interface
 */

#ifndef ICOLORIZER_H_
#define ICOLORIZER_H_

enum ColorizerType {
	BLUE_WHITE_RED,
	HEAT,
	RAINBOW
};

/*! @brief Interface for colorizers
 */
struct IColorizer {

	/*! @brief Get type of the colorizer.
	 *  @return Value of enum ColorizerType 
		corresponding to the current colorizer.
	 */
	virtual ColorizerType getType() = 0;

	/*! @brief Colorize a single value.
	 * @pre 0 <= val <= 1
	 * @param[in] val Floating-point value in range [0,1]
	 * @param[out] pr Pointer to float variable 
		receiving red channel value (in [0,1]).
	 * @param[out] pg Pointer to float variable 
		receiving green channel value (in [0,1]).
	 * @param[out] pb Pointer to float variable 
		receiving blue channel value (in [0,1]).
	 */
	virtual void colorize(float val, 
				float* pr, 
				float *pg, 
				float *pb) = 0;

	/*! @brief Virtual destructor for 
	 * proper memory deallocation. */
	virtual ~IColorizer() {}

	/// Factory method to create colorizers
	static IColorizer* createColorizer(ColorizerType type);
};

#endif /* ICOLORIZER_H_ */

Let’s examine all the parts.

I. Doxygen file comment

First, we have a Doxygen file comment that provides in-code documentation for this header file:

/*! @file icolorizer.h
 * @author Anatoliy Antonov
 * @brief IColorizer interface
 */

It is self-explanatory: the name of the file, the file’s author, and a brief description. It is a multiline comment /* ... */, where the exclamation mark indicates that this is a Doxygen comment block (this is the Qt style; putting a second star instead gives the JavaDoc style). I have chosen the @tag notation for similarity with JavaDoc (there is also the \tag notation, similar to TeX). Each tag marks the beginning of a paragraph with the tag-related information, which ends either at the next tag or at an empty line (i.e. line breaks are OK). It is permitted to use HTML within the paragraphs.

II. Header protection

After the file comment, we have a protection against multiple inclusion of the header file. Microsoft Visual C++ and most other modern compilers understand the non-standard #pragma once for that purpose, but traditionally it is specified in the classical C-preprocessor style:

#ifndef ICOLORIZER_H_
#define ICOLORIZER_H_

	/* Header contents */

#endif /* ICOLORIZER_H_ */

This checks whether the preprocessor macro ICOLORIZER_H_ has been defined. If it is NOT defined at this point, the preprocessor defines it and includes the header contents in place of the caller’s #include "icolorizer.h" line. If it is already defined, the complete block between #ifndef and #endif is omitted.

The reason behind it is that the compiler does not allow multiple definitions of the same class in one compilation unit. Skipping repeated includes is not the default behavior of the preprocessor, because in some cases repeated inclusion is exactly what you need. E.g. multiple includes of the same header file with different preprocessor #defines were used extensively to create generic C structures before templates were adopted in C++.
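The effect of the guard can be tried out in a single file. In the sketch below, the same guarded block is pasted twice to mimic what the preprocessor sees when a header is included two times. POINT_INFO_H_, PointInfo and ReadValue are made-up names used only for this demonstration.

```cpp
#include <cassert>

// First "inclusion": POINT_INFO_H_ is not defined yet, so the block is kept.
#ifndef POINT_INFO_H_
#define POINT_INFO_H_
struct PointInfo {
	float value;
};
#endif /* POINT_INFO_H_ */

// Second "inclusion": the macro is already defined, so the block is skipped.
// Without the guard, this would be a redefinition error.
#ifndef POINT_INFO_H_
#define POINT_INFO_H_
struct PointInfo {
	float value;
};
#endif /* POINT_INFO_H_ */

// Uses the single surviving definition of PointInfo.
float ReadValue(const PointInfo* pInfo) {
	assert(pInfo != 0);
	return pInfo->value;
}
```

Removing the #ifndef/#define/#endif lines from the second copy should make the compiler complain about a redefinition of PointInfo.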

III. Colorizer type enumeration

First up in the header contents, we have the enumeration that specifies the type of the colorizer.

enum ColorizerType {
	BLUE_WHITE_RED,
	HEAT,
	RAINBOW
};

I must admit that it was the first obvious solution that came to my mind to provide a means for the visualization widget to find out which colorizer is currently in use. However, it does not seem to be the best solution: it is not easily extensible. A much better solution would be to omit the enumeration here and return text strings from the colorizer classes (the enumeration would still be required for creating a colorizer object through a factory method, see below, but this method could be moved to a separate header, included only by those compilation units that actually create colorizers). But this is what we have at the moment.
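For the curious, the string-based alternative could look roughly like the following. This is only a sketch of the idea, not the code I actually have: INamedColorizer, HeatNamedColorizer, getName and MakeLabel are hypothetical names.

```cpp
#include <cassert>
#include <string>

// Hypothetical variant of the interface: the colorizer reports its own name.
struct INamedColorizer {
	virtual const char* getName() const = 0;
	virtual ~INamedColorizer() {}
};

class HeatNamedColorizer : public INamedColorizer {
public:
	virtual const char* getName() const { return "heat"; }
};

// The widget builds its label from the name and needs no enumeration.
std::string MakeLabel(const INamedColorizer* pColorizer) {
	assert(pColorizer != 0);
	return std::string("Colormap: ") + pColorizer->getName();
}
```

Adding a new colorizer then means adding a new class that returns a new string; the interface header itself stays unchanged.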

IV. Interface

Finally, we have the interface itself.

/// Interface for colorizers
struct IColorizer {

	/*! @brief Get type of the colorizer.
	 *  @return Value of enum ColorizerType 
		corresponding to the current colorizer.
	 */
	virtual ColorizerType getType() = 0;

	/*! @brief Colorize a single value.
	 * @pre 0 <= val <= 1
	 * @param[in] val Floating-point value in range [0,1]
	 * @param[out] pr Pointer to float variable 
		receiving red channel value (in [0,1]).
	 * @param[out] pg Pointer to float variable 
		receiving green channel value (in [0,1]).
	 * @param[out] pb Pointer to float variable 
		receiving blue channel value (in [0,1]).
	 */
	virtual void colorize(float val, 
				float* pr, 
				float *pg, 
				float *pb) = 0;

	/*! @brief Virtual destructor for 
	 * proper memory deallocation. */
	virtual ~IColorizer() {}

	/// Factory method to create colorizers
	static IColorizer* createColorizer(ColorizerType type);
};

#endif /* ICOLORIZER_H_ */

It starts with a Doxygen single-line comment (triple slash), giving a brief description for the class.

Then, keyword struct indicates that all the members of the entity are public by default. (Remember: following the core object-oriented programming principle of encapsulation, NO data members can be public; an interface specifies exactly the public part, which in OOP can contain only methods).

IV.1 Colorizer type

The first method is the previously mentioned means of providing the type of the current colorizer to the visualization widget for the user’s reference.

/*! @brief Get type of the colorizer.
 *  @return Value of enum ColorizerType 
	corresponding to the current colorizer.
 */
virtual ColorizerType getType() = 0;

It has a Doxygen comment block giving the brief description of the method, as well as the description of the return value. This method is specified as virtual, meaning that when the caller has a pointer to the interface, calling the method on it will execute the method’s implementation in the concrete subclass. This mechanism is called overriding, and it is required for the core object-oriented programming principle of polymorphism.

Additionally, it is an abstract, or pure virtual, method, which is indicated by = 0 at the end of the method’s declaration. This means that the interface IColorizer provides NO implementation for this method, and this method MUST be implemented in subclasses. This is a requirement for an interface, as an interface specifies only a set of methods. If you need a basic implementation for some of the methods, then instead of an interface you need a similar concept called an abstract base class, and when you provide a default implementation for all of the methods, it is simply a base class. But in both of those cases, we are already in the territory of another core object-oriented programming principle: inheritance.
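The difference between the three concepts is easier to see in code. Here is a small sketch under my own assumptions; IScaler, ClampingScaler and SquareScaler are invented for this illustration and have nothing to do with the colorizer code.

```cpp
#include <cassert>

// Interface: only pure virtual methods (plus the virtual destructor).
struct IScaler {
	virtual float scale(float val) const = 0;
	virtual ~IScaler() {}
};

// Abstract base class: some behavior is implemented,
// but at least one method is still pure virtual.
class ClampingScaler : public IScaler {
public:
	// Shared implementation: clamp to [0,1], then delegate to the subclass.
	virtual float scale(float val) const {
		if (val < 0.0f) { val = 0.0f; }
		if (val > 1.0f) { val = 1.0f; }
		return scaleClamped(val);
	}
protected:
	virtual float scaleClamped(float val) const = 0;
};

// Concrete class: implements everything that was still missing.
class SquareScaler : public ClampingScaler {
protected:
	virtual float scaleClamped(float val) const {
		assert(val >= 0.0f && val <= 1.0f);	// guaranteed by ClampingScaler::scale
		return val * val;
	}
};
```

Only SquareScaler can be instantiated; trying to create an IScaler or a ClampingScaler object directly is a compile-time error.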

IV.2 Transfer function

Second method is the essence of the transfer function: converting data values to colors.

/*! @brief Colorize a single value.
 * @pre 0 <= val <= 1
 * @param[in] val Floating-point value in range [0,1]
 * @param[out] pr Pointer to float variable 
	receiving red channel value (in [0,1]).
 * @param[out] pg Pointer to float variable 
	receiving green channel value (in [0,1]).
 * @param[out] pb Pointer to float variable 
	receiving blue channel value (in [0,1]).
 */
virtual void colorize(float val, 
			float* pr, 
			float *pg, 
			float *pb) = 0;

Its Doxygen comment looks a bit more complicated than all the others, but it is still self-explanatory.

It starts with a brief description of the method.

Then, the tag @pre specifies the precondition, i.e. the condition on the input(s) of the method for it to function properly. In this case, we state that the method expects the data value to be normalized to the [0,1] interval. The result of calling the method with a value outside this interval is unspecified and is purely the implementor’s decision (alternatives: clamp to [0,1] and give an output, try to convert the given value, throw an exception, or simply crash).

Next up, we have a description of an input parameter: tag @param with a direction specifier [in], name of the parameter and its description.

Finally, we have three lines describing output parameters – pointers to float variables that will receive the color in red, green and blue channels respectively. Their description explicitly states that the color channel values are returned in [0,1] interval. (The last point may be also specified with the tag @post: post-condition, the condition on the outputs of the method in case the inputs had satisfied the precondition).

Note that this method is also pure virtual (see above).
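To make the precondition and the output range concrete, here is a sketch of what one possible implementation could look like. ColorizeGray is a hypothetical grayscale transfer function with the same parameters as colorize; it is not one of the colorizers from my project, and checking the precondition with assert is just one of the alternatives listed above.

```cpp
#include <cassert>

// Hypothetical grayscale transfer function (free function for brevity).
void ColorizeGray(float val, float* pr, float* pg, float* pb) {
	// Precondition from the @pre tag: violating it is the caller's mistake.
	assert(val >= 0.0f && val <= 1.0f);
	assert(pr != 0 && pg != 0 && pb != 0);

	// Every channel receives the data value itself, so the outputs
	// stay in [0,1] whenever the precondition holds.
	*pr = val;
	*pg = val;
	*pb = val;
}
```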

IV.3 Destructor

Next method is a virtual destructor, which is a general requirement for polymorphically used objects.

/*! @brief Virtual destructor for 
* proper memory deallocation. */
virtual ~IColorizer() {}

The keyword virtual ensures that the destructor of the concrete implementation class will be called when the user calls delete on the pointer to the interface. Unlike the other methods, the destructor cannot be left without a body, even in the specification of a public interface, because the destructors of the subclasses always call it. (C++ does allow declaring it pure virtual with = 0, but a definition must then still be provided outside the class, so I stick with just an empty body.)
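A small experiment shows the virtual destructor at work. The names here (IResource, CountingResource, DestroyThroughInterface) are made up for the demonstration.

```cpp
#include <cassert>

struct IResource {
	virtual ~IResource() {}
};

// Implementation that reports its own destruction through a counter.
class CountingResource : public IResource {
public:
	explicit CountingResource(int* pDestroyedCount) : pCounter(pDestroyedCount) {
		assert(pCounter != 0);
	}
	virtual ~CountingResource() { ++(*pCounter); }
private:
	int* pCounter;
};

// Deletes the object through the interface pointer and returns
// how many implementation destructors have run.
int DestroyThroughInterface() {
	int destroyedCount = 0;
	IResource* pResource = new CountingResource(&destroyedCount);
	delete pResource;	// dispatches to ~CountingResource first
	return destroyedCount;
}
```

If ~IResource were not virtual, deleting through the interface pointer would be undefined behavior, and in practice the implementation's destructor would typically not run.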

IV.4 Factory method

Finally, we have a static method (it belongs to the interface itself, not to any instance of its implementations), which is a factory method (reference: design patterns, e.g. on Wikipedia). It creates a concrete object of the required implementation class based on the given type from the enumeration and returns it via a pointer to the common interface, thus (IMPORTANT!!!) completely freeing the client of dependencies on the implementations of the particular transfer functions.

This method will be implemented in the CPP file icolorizer.cpp as follows:

//include all the particular implementation headers
#include "bluewhiteredcolorizer.h"
#include "heatcolorizer.h"
#include "rainbowcolorizer.h"

IColorizer* IColorizer::createColorizer(enum ColorizerType type) {
	IColorizer* pColorizer = 0; // ALWAYS initialize 
				// your local variables!!!
	
	switch (type) {
	case BLUE_WHITE_RED:
		pColorizer = new BlueWhiteRedColorizer();
		break;
	case HEAT:
		pColorizer = new HeatColorizer();
		break;
	case RAINBOW:
		pColorizer = new RainbowColorizer();
		break;
	default:	
	// (required by compiler at higher warnings level)
		break;	// do nothing
 
			// ^^^^^^^^^^ (this comment 
			// indicates that the empty 
			// default section of switch 
			// operator is intentional 
			// and not a mistake)
	}
	
	return pColorizer;
}

V. Using the interface in the class definition (and implementation)

A header file for the definition of a particular implementation of the interface will look like the following (comments and header protection mechanism are omitted to keep things short):

#include "icolorizer.h"

class HeatColorizer: public IColorizer {
public:
	virtual ColorizerType getType() { return HEAT; }
	virtual void colorize(float val, 
				float* pr, 
				float *pg, 
				float *pb);
	virtual ~HeatColorizer() {}

	static void colorizeVal(float val, 
				float* pr, 
				float *pg, 
				float *pb);
};

This header includes the header file with the definition of the interface IColorizer and declares all the newly implemented methods. Notice that the virtual methods are no longer pure virtual: they have an implementation either directly in the class definition or in heatcolorizer.cpp.

The static method allows the clients who only use HeatColorizer (and directly depend on heatcolorizer.h) to avoid creating an object for the transfer function, and simply execute the transfer function using HeatColorizer::colorizeVal(val, &r, &g, &b);

VI. Using the interface in code

In the code, using the interface is pretty simple:

#include "icolorizer.h"

void Visualization::DrawData() {
	IColorizer* pColorizer = 0;
	pColorizer = IColorizer::createColorizer(HEAT);

	if (pColorizer != 0) {
		for (unsigned i=0; 
		i<points.size(); 
		i++) {

			float r=0, g=0, b=0;

			pColorizer->colorize(
				points[i].value(), 
				&r, &g, &b);

			/* draw the point 
			 * with given colors */
		}
	}
	
	delete pColorizer;
	pColorizer = 0;
}

Conclusions

Summing up: in this post we have examined all the required parts of separating the public interfaces of classes from the particular implementations, and how this allows the users of your classes to avoid dependencies on these particular implementations.

As I noted, this example is still not perfect: it is possible to extract the enumeration of colorizer types and the factory method to a separate header, so that only the code actually requiring the creation of new colorizer objects depends on the changes, and only that code needs to be recompiled when a new transfer function is implemented (which is quite likely to happen).
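Such a split could be sketched as follows. Everything is shown in one listing, with comments marking where each file would begin; the file name colorizerfactory.h, the free function CreateColorizer and the GrayColorizer class are assumptions made for this sketch, and the interface is reduced to a single method to keep it short.

```cpp
#include <cassert>

// --- icolorizer.h: the only header the drawing code needs ---
struct IColorizer {
	virtual void colorize(float val, float* pr, float* pg, float* pb) = 0;
	virtual ~IColorizer() {}
};

// --- colorizerfactory.h (hypothetical): included only where colorizers are created ---
enum ColorizerType {
	GRAY
};
IColorizer* CreateColorizer(ColorizerType type);

// --- colorizerfactory.cpp (hypothetical): the only place that knows the implementations ---
class GrayColorizer : public IColorizer {
public:
	virtual void colorize(float val, float* pr, float* pg, float* pb) {
		assert(pr != 0 && pg != 0 && pb != 0);
		*pr = val;
		*pg = val;
		*pb = val;
	}
};

IColorizer* CreateColorizer(ColorizerType type) {
	IColorizer* pColorizer = 0;
	switch (type) {
	case GRAY:
		pColorizer = new GrayColorizer();
		break;
	default:
		break;	// unknown type: return a null pointer
	}
	return pColorizer;
}
```

With this layout, adding a new value to ColorizerType touches only colorizerfactory.h and its includers, not every widget that merely calls colorize.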

At the same time, it depends on the size and complexity of the calling code. If your application really has just one window which does both visualization and changes in the transfer function, then it might be unreasonable to do that separation. However, in case of big projects with multiple visualization widgets that only use colorizers and ONE widget for preferences and selecting the transfer functions, it would be beneficial to keep creation and usage completely separate.

So, refer to your own judgment!

Enjoy 😉
and now go write some good code!!! 😀

General Tips

Assertions vs. Error handling

Many programmers are not sure or not aware when to use assertions and when to use conditions-checking and error handling code. Here’s a tip on this.

But first, what is an assertion? An assertion is a line in your code which looks like this:

assert(x>0);

When execution reaches this point, if the condition is true, the program continues with the next statement. However, if the condition is false, the execution is terminated, and an error is reported to the debugger or to the command shell.

Important: assertions are intended for debugging, and usually there is an easy way of excluding them from the release version (e.g. #define NDEBUG for the assert macro from the C standard library).
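Here is a minimal sketch of such a developer-side check, using the standard assert macro. ComputeRoot is a made-up function for illustration.

```cpp
#include <cassert>
#include <cmath>

// Developer contract: callers must never pass a negative value.
double ComputeRoot(double x) {
	// Terminates the program in debug builds if the contract is broken;
	// compiled out when NDEBUG is defined before <cassert> is included.
	assert(x >= 0.0);
	return std::sqrt(x);
}
```

Defining NDEBUG is typically done through a compiler option (for example -DNDEBUG) in the release configuration, so the source code itself does not change.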

By their definition and implementation, assertions are targeted at catching the possible mistakes of the DEVELOPER (of the application or of the library code), in situations like “this variable should NEVER be negative (at this place in the code)”. Use assertions to check conditions in the code for cases when the user does everything correctly, but your application does not give the intended result (and that is essentially the definition of debugging).

However, when the condition you are trying to avoid CAN arise during normal execution because of USER mistakes (e.g. you ask for an integer number but the user inputs text or a floating-point number), you need to implement proper error handling (a set of if/else statements, plus reporting wrong data with the help of error codes or exceptions, plus reporting back to the user with the help of console output or message boxes).

Sometimes, for production code you need to combine both. Assertions check the conditions during development and testing (and force you to notice and correct the errors immediately). For the release version the assertions are removed, and the error handling code makes sure that the program will not crash if things do go wrong during execution.
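A short sketch of the combination, under my own assumptions: ParseInteger is a hypothetical helper that treats bad text as a normal, expected situation (error handling) and a null output pointer as a programming mistake (assertion).

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Returns true and stores the number in *pValue only if the whole text
// is an integer; returns false for anything else the USER may type.
bool ParseInteger(const std::string& text, int* pValue) {
	assert(pValue != 0);	// DEVELOPER mistake: the caller passed a null pointer

	std::istringstream stream(text);
	int parsed = 0;
	char leftover = 0;
	if (!(stream >> parsed)) {
		return false;	// not a number at all, e.g. "abc"
	}
	if (stream >> leftover) {
		return false;	// trailing characters, e.g. "3.5" or "12abc"
	}
	*pValue = parsed;
	return true;
}
```

The caller is then expected to check the returned flag and report the problem back to the user, while the assertion disappears from the release build.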

General Tips

What ToDO-lists have in common with class interfaces

NOTE: I have completely forgotten that I had started this blog! A couple of days ago, I shared with a new colleague the documents on software quality essentials that I made two years ago for our laboratory, and suddenly a thought of this blog popped up.

During my work, I do quite some refactoring and documenting of the legacy code. The code pieces represent implementations of scientific algorithms by former PhD students, mainly with a background in mathematics. That means: only a vague understanding of object-oriented methodology, preferring the speed of development and reaching the “it’s working” stage over code quality, and a less than minimal level of comments or documentation, or none at all.

I find it very unfortunate that so little attention is paid to software quality aspects in the education of mathematics and physics students in general. Very unfortunate, because for the next person who comes to work on and extend the code it takes a huge amount of effort to understand what is done and how it is implemented (assuming the why is clear from the tool/library purpose itself).

In this case, fortunately, this student is me, and I am investing the effort in making the code better (which is absolutely undervalued by my supervisor and current colleagues, and that makes it even harder). But so often the next person spends a huge amount of time to understand the code, then reuses it as it is. Then the next person after that one will again need to invest a lot of time and effort, and the next, and the next. Just writing good code from the very beginning, or taking the time to clean it up and document it at the end of a project that used it, will pay off many times over in the future. I believe in leaving everything in a better state than we found it, so I do it anyway, despite the lack of support in the present.

This time, I have been struggling with understanding a class interface, and I decided to share my insights here. I have found a wonderful analogy in a TEDx talk! ( Surprise-surprise! 🙂 ).

In his wonderful talk at TEDx Claremont Colleges, David Allen demonstrates a fundamental mistake which many people make when they write ToDo-lists. They write NOUNS. “Mother”, “dog”, “kids”. Then, an hour later you look at the list and you have no idea what you actually wanted to do. Certainly, that was something about the kids. Pick them up from school? Take them to a movie? Talk to them about the plans for holidays? You spend a lot of extra mental effort each time you look at such a ToDo-list. It does not serve its purpose of being a reminder of the ACTION you want to take.

Well, a class interface is essentially the same as a ToDo-list! It specifies the list of OPERATIONS (actions) the object of this class can perform. Thus, writing it in nouns generates the very same struggle as in a ToDo-list. Let’s show it with an example (the return types and arguments are omitted for clarity):

class Car {
public:
	Doors();	
	Belts();
	Engine();
	WindshieldWiper();
	Mirrors();	
	drive();
};

Obviously, the class Car has something to do with all of these things. But take Engine(). Is it actually startEngine()? Or maybe it is serviceEngine()? Maybe you take a look at the implementation (and this is already a sign of bad code quality) and realize that this method is actually checkEngineAndMakeSureOilLevelIsOK()? 🙂 And the same goes for every other method in the class: you need verbs that clearly explain the performed operation. And our Car will become:

class Car {
public:
	closeDoors();
	plugBelts();
	adjustMirrors();
	turnOnWindshieldWiper();
	startEngine();
	drive();
};

I would say that the only exceptions to this rule are properties and getter/setter methods. For properties in C# or Objective-C, you don’t even need to think about them: the compiler generates the accessors for you automatically. And for a variable named x, the getter may be getX() or simply x(). The setter, however, should be setX(). Again, in some languages this is done automatically, and all you need to write is

	obj.x = 5;
	drawTickMark(obj.x);

In all other cases, use the rule above. Have a verb in the method name that tells you the action, and try to make the performed operation as clear as possible from the method name. Bonus point: add a comment that says it in plain text. But we’ll talk about comments and documentation in another post (in the meantime, you can check the document I have written for the lab about the documentation essentials).

Enjoy,
and write the good code!

P.S. The full talk is “The Art of Stress-Free Productivity” by David Allen.

Code Style

One-line if/for/while/do

I recommend always putting curly brackets (begin/end) in control statements like conditions and loops, i.e. instead of

if ( a < 0 ) cout << "a is negative" << endl;
for ( int i=0; i<N; i++ ) cout << i << ' ';

write

if ( a < 0 ) {
    cout << "a is negative" << endl;
}
for ( int i=0; i<N; i++ ) {
    cout << i << ' ';
}

Why?

First, it makes your code clearer. In this way, you explicitly define the scope, the block of code to be executed when the respective condition is true and control flow enters the statement.

Second, there will often come a moment when you would like to add a debug print, to perform an additional operation, or to split the single complex line you have into several ones to make it easier to understand. At this time, there is a very high risk that you’ll forget to enclose them in a block by adding the curly brackets. In this case, if you don’t test your program thoroughly, you’ll miss the effect of that change on the spot. Later, even half an hour later, it may become incredibly difficult to trace back, to find the part of the code with the mistake and to see the mistake itself, because everything seems to be fine, especially if there are a lot of statements without curly brackets spread through the code.
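Here is a small sketch of that second scenario. Both functions are made up for the illustration: the intention is to count negative values and reset them to zero. The "buggy" version is indented the way the compiler actually understands it, which is exactly what is hard to see when the added line is indented as if it belonged to the if.

```cpp
#include <cassert>

// Brace-less if after a second statement was added:
// only the first statement is conditional.
int ResetIfNegativeBuggy(int a, int* pResetCount) {
	assert(pResetCount != 0);
	if (a < 0)
		(*pResetCount)++;
	a = 0;	// meant to be inside the if, but always executes
	return a;
}

// With curly brackets, both statements belong to the condition.
int ResetIfNegativeFixed(int a, int* pResetCount) {
	assert(pResetCount != 0);
	if (a < 0) {
		(*pResetCount)++;
		a = 0;
	}
	return a;
}
```

For a positive input such as 5, the first version returns 0 while the second one returns 5, and nothing in the compiler output has to point at the difference.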
