Octopull/C++: Ending with the grin

In C++ "private" is sometimes not private enough. Alan Griffiths examines an old technique and adds some recent ideas.

I started using C++ in the late '80s - these were the early days of MS-Windows on the corporate desktop and MFC had not yet been offered to the developer community. However, there was a small Irish company, Glockenspiel, which not only offered an early C++ compiler for DOS/Windows (based on CFront), but also a GUI class library called "CommonView". (This was also available for a number of platforms - including OS/2 and a variety of UNIX variants.)

As well as being a productive tool for writing portable GUI applications, this class library was the first place I encountered the venerable "Cheshire Cat" technique for hiding the implementation of a C++ class. I know John Carolan was associated with this library, and suspect he first coined the phrase.

Why hide the implementation?

In C++ the whole of the class body is seen by client code that #includes the corresponding header file. Even those parts of the class marked "private" are compiled along with the client code. This has three undesirable effects. First, these parts may require headers to be included that wouldn't otherwise be compiled with the client code. Second, if the private parts (or the additional headers) are changed then the client code must be recompiled. Third, in the case of library code, it becomes impossible to substitute a new version of the library without recompiling all of the client code, even when only the implementation and not the interface of a class changes. In a small project these effects are small and are often ignored, but because each additional include can drag in further includes of its own, the cost grows much faster than the number of files in the system.

There are a number of well known techniques for addressing this issue, but here I am going to concentrate on the "Cheshire Cat" idiom. (This is also known as "pimpl" [sic] and is a special case of the "Bridge" pattern described in "Design Patterns" by Gamma et al.) There is a brief overview of the alternatives on my website. The canonical implementation of a "Cheshire Cat" class looks like listing 1. Note that, in addition to any constructors and methods required for the interface to the class, it is also necessary to implement the copy constructor, assignment operator and destructor in order to manage the "hidden" body referred to by the "rep" pointer.

Listing 1: Example of a "Cheshire Cat" class

	// ------------ The header file ------------

	#include <string>   // std::string
	#include <utility>  // std::pair

	/** Telephone list. Example of implementing a minimal telephone
	*   list using "Cheshire Cat" to hide the implementation.
	*/
	class phone_list
	{
	public:
		explicit phone_list(std::string name);

		phone_list(const phone_list& rhs);

		~phone_list();

		std::string name() const;

		std::pair<bool, std::string> number(std::string person) const;

		phone_list& add(std::string name, std::string number);

		phone_list& operator=(const phone_list& rhs);

	private:
		class implementation;
		implementation* rep;
	};


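	// ------------ The implementation file ------------

	// (also #include the header shown above)
	#include <map>      // std::map - needed only here, not by client code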

	using std::string;
	using std::pair;

	class phone_list::implementation
	{
	public:
		implementation(std::string initialName) :
			list_name(initialName) 	{}

		string name() const	{ return list_name; }

		pair<bool, string> number(string person) const
			{
				pair<bool, string> rval(false, string());

				dictionary::const_iterator i = dict.find(person);

				if (i != dict.end())
				{
					rval = std::make_pair(true, (*i).second);
				}

				return rval;
			}

		void add(string name, string number)
			{
			    dict[name] = number;
			}
		
	private:
		typedef std::map<string, string> dictionary;

		string      list_name;
		dictionary  dict;
	};


	phone_list::phone_list(string name) 
	: rep(new implementation(name))
	{
	}

	phone_list::phone_list(const phone_list& rhs)
	: rep(new implementation(*rhs.rep))
	{
	}

	phone_list& phone_list::operator=(const phone_list& rhs)
	{
		// Allocate...
		implementation* tmp(new implementation(*rhs.rep));

		// ...before release...
		delete rep;

		// ...and update
		rep = tmp;
		return *this;
	}

	phone_list::~phone_list()   
	{
		delete rep;
	}

	string phone_list::name() const
	{
		return rep->name();
	}

	pair<bool, string> phone_list::number(string person) const
	{
		return rep->number(person);
	}

	phone_list& phone_list::add(string name, string number)
	{
		rep->add(name, number);
		return *this;
	}

Considering the phone_list class as seen by the client code the private parts consist of a pointer to an incomplete type: this requires no additional headers (a naive implementation with list_name and dict as private members would require <map>). If a new implementation were provided (e.g. to use a name comparison method that ignored case and understood "Mc..." == "Mac..."), then the header wouldn't need to change. A new version of the class with methods added to the interface will remain link-compatible with existing code - even if these methods require changes to the implementation. These advantages are gained at the cost of writing an extra class containing a few forwarding functions.
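As a rough illustration of the kind of change that stays hidden behind the header, the sketch below swaps the dictionary's ordering for one that ignores letter case. The comparator name no_case_less is an assumption made for this example, and it makes no attempt at the "Mc..." == "Mac..." rule. Only the typedef inside phone_list::implementation would need to change.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <map>
#include <string>

// Hypothetical comparator (not part of the original listing):
// orders names without regard to letter case.
struct no_case_less
{
    static bool char_less(char a, char b)
    {
        return std::tolower(static_cast<unsigned char>(a)) <
               std::tolower(static_cast<unsigned char>(b));
    }

    bool operator()(const std::string& lhs, const std::string& rhs) const
    {
        return std::lexicographical_compare(lhs.begin(), lhs.end(),
                                            rhs.begin(), rhs.end(),
                                            char_less);
    }
};

// The one line that would differ inside phone_list::implementation.
typedef std::map<std::string, std::string, no_case_less> dictionary;

// Small check: a lookup succeeds whatever case the caller uses.
bool finds_regardless_of_case()
{
    dictionary dict;
    dict["Alice"] = "555-0100";   // made-up entry for illustration
    return dict.find("ALICE") != dict.end()
        && dict.find("bob") == dict.end();
}
```

Client code that includes the phone_list header never sees the comparator, so it needs relinking but not recompiling.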

In need of modernisation?

The code in listing 1 isn't quite as I'd have written it in the '80s - there was no standard string class (or pair<> template) then. But it definitely has the flavour of the past - each time this idiom is used one needs to explicitly write out the copy constructor, assignment operator and destructor in the same way. Only the class names change.

At the beginning of the '90s a tool was introduced into the language to address this sort of generic programming: templates. In the years that followed, variable compiler support for templates put off a lot of programmers. However, things have improved a lot since then - largely because the standard library makes heavy use of them.

So what are we looking for? A template that looks after an object allocated on the heap, and ensures it is copied and deleted when appropriate.

As you may have guessed this is coming soon, but before proceeding to my solution we need a short diversion to discuss the standard library's auto_ptr<> template. Although auto_ptr<> doesn't meet our current needs it is instructive to know why since the reasons are not immediately obvious - a notable expert (Herb Sutter) stepped into this trap in print recently ("Using auto_ptr Effectively" C/C++ Users Journal October 1999).

On `std::auto_ptr<>`

By historical accident the standard library provides a single smart pointer template known as auto_ptr<>. auto_ptr<> has what I will politely describe as "interesting" copy semantics. Specifically, if one auto_ptr<> is assigned (or copy constructed) from another then they are both changed - the auto_ptr<> that originally owned the object loses ownership and becomes 0. This is a trap for the unwary! There are situations that call for this behaviour, but on most occasions that require a smart pointer the copy semantics cause a problem.
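The copy behaviour just described can be seen in a few lines. Because not every toolchain still ships auto_ptr<>, the sketch uses a hand-written stand-in, transfer_ptr<>, which is a hypothetical name invented here; it mimics only the ownership-transferring copy and is not the standard library's code.

```cpp
#include <cassert>

// Hypothetical stand-in that copies the way the text describes
// auto_ptr<> copying: the source gives up ownership.
template<typename T>
class transfer_ptr
{
public:
    explicit transfer_ptr(T* pointee = 0) : p(pointee) {}

    // The "copy" takes a non-const reference because it modifies rhs.
    transfer_ptr(transfer_ptr& rhs) : p(rhs.release()) {}

    transfer_ptr& operator=(transfer_ptr& rhs)
    {
        if (this != &rhs)
        {
            delete p;
            p = rhs.release();
        }
        return *this;
    }

    ~transfer_ptr() { delete p; }

    T* get() const { return p; }

    // Give up ownership and hand back the raw pointer.
    T* release() { T* old = p; p = 0; return old; }

private:
    T* p;
};

bool source_is_emptied_by_copy()
{
    transfer_ptr<int> first(new int(42));
    int* raw = first.get();

    transfer_ptr<int> second(first);   // looks like a copy...

    // ...but 'first' now holds 0 and 'second' owns the original int.
    return first.get() == 0 && second.get() == raw;
}
```

A class holding such a member and relying on the compiler-generated copy operations would silently hand its body to each copy, which is the trap discussed next.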

If we tried to replace implementation* with auto_ptr<implementation> in the phone_list class we'd find that we still need to write the copy constructor and assignment operator carefully to avoid an implementation being passed from one phone_list to another (with the consequence that one of the phone_lists loses its implementation). Worse than this, and the point I believe Herb Sutter missed: we would also need to write the destructor - if we don't, the one the compiler generates for us will not correctly destroy the implementation. This is because the client code causes the generation of the phone_list destructor and consequently instantiates the auto_ptr<> destructor without having seen the class definition for implementation. As a result the destructor of implementation is never called. (A good compiler may give a warning about deleting an incomplete type, but the language standard requires that the code compiles - although the results are unlikely to be what the programmer intended.)

OK, I could rewrite the code using auto_ptr<> but I'm not going to since it saves us nothing.

`arg::grin_ptr<>`

Although the standard library doesn't support our needs it is possible to write a smart pointer which does. To prove it I've put one into the "arglib" library on my website. This allows the phone_list class to be rewritten as shown in listing 2.

Listing 2: Using arg::grin_ptr<>

	class phone_list
	{
	public:
		explicit phone_list(std::string name);

		std::string name() const;

		std::pair<bool, std::string> number(std::string person) const;

		phone_list& add(std::string name, std::string number);

	private:
		class implementation;
		arg::grin_ptr<implementation> rep;
	};

Note that there is no longer a need to supply the copy constructor, assignment operator, or destructor as the necessary logic is supplied by the grin_ptr<> template. (I've not shown the implementation file again since it only differs from listing 1 by the removal of these methods.)

In most cases it doesn't matter, but in addition to the convenience of not having to rewrite these methods, grin_ptr<> is what Kevlin Henney refers to as a QUALIFIEDSMARTPOINTER ("Coping with Copying in C++" - Overload 33 ISSN 1354 3172). The particular convenience this has is that methods on phone_list and implementation may be overloaded on const since dereferencing a const grin_ptr<implementation> provides a const implementation. Naturally it is rare for const and non-const methods to have the same name, but it does happen. (For example containers overload begin() and end() on const so as to provide const_iterators when the container is const.)
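To make the const point concrete, here is a minimal sketch. The names qualified_ptr, body and handle are assumptions for illustration, and the pointer is cut down to the two operator-> overloads; it is not the arglib code. A const handle reaches the const overload of body::which(), a non-const handle reaches the other.

```cpp
#include <cassert>

// Hypothetical cut-down pointer showing only const-qualified access.
template<typename T>
class qualified_ptr
{
public:
    explicit qualified_ptr(T* pointee) : p(pointee) {}
    ~qualified_ptr() { delete p; }

    const T* operator->() const { return p; }  // const pointer -> const pointee
    T* operator->()             { return p; }

private:
    qualified_ptr(const qualified_ptr&);             // copying left out of this sketch
    qualified_ptr& operator=(const qualified_ptr&);
    T* p;
};

struct body
{
    // Two methods with the same name, overloaded on const.
    int which() const { return 1; }
    int which()       { return 2; }
};

class handle
{
public:
    handle() : rep(new body) {}

    int which() const { return rep->which(); }  // rep is const here: body::which() const
    int which()       { return rep->which(); }  // rep is non-const: body::which()

private:
    qualified_ptr<body> rep;
};

int via_const_handle()   { const handle h; return h.which(); }
int via_mutable_handle() { handle h;       return h.which(); }
```

With a raw body* member instead, both forwarding functions would call the non-const overload, because a const pointer member still points at a non-const body.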

Implementing `arg::grin_ptr<>`

If you have what you are looking for I'll understand if you leave me now, but if you want to understand how grin_ptr<> works then there are some interesting issues to be addressed.

Let us start with the last point mentioned in the discussion of auto_ptr<> - how to cope with deleting an incomplete type. The destructor can't be a simple "delete p;" because at the point of instantiation the pointer is to an "incomplete type" and the compiler won't call the destructor.

To avoid this I make use of the fact that the constructor for grin_ptr<> is instantiated in the implementation file, where the class definition for implementation resides. At this point I force the compiler to generate a deletion function using a trick I first saw in Richard Hickey's article "Callbacks in C++ Using Template Functors" (C++ Gems ISBN 1 884842 37 2): the constructor stores a pointer to this function in the grin_ptr<>. This provides a safe method for the destructor to delete the object it owns. The point of passing around function pointers instead of the apparently more natural use of virtual member functions is that everything can be done "by value" and no dynamic allocation is required.
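The deletion trick on its own can be sketched as follows. The names owning_ptr and widget are hypothetical, copying is left out, and header and implementation file are squeezed into one translation unit with comments marking where the boundary would fall. The key point is that the destructor only calls through a stored pointer, so it never needs the complete type.

```cpp
#include <cassert>

// Hypothetical minimal owner: just the stored-deleter idea.
template<typename T>
class owning_ptr
{
    typedef void (*delete_ftn)(T*);

public:
    // Instantiated where T is complete, so my_delete sees the full type.
    explicit owning_ptr(T* pointee) : p(pointee), do_delete(&my_delete) {}

    // May be instantiated where T is incomplete: no "delete" appears here.
    ~owning_ptr() { do_delete(p); }

private:
    owning_ptr(const owning_ptr&);             // copying omitted from this sketch
    owning_ptr& operator=(const owning_ptr&);

    T*         p;
    delete_ftn do_delete;

    static void my_delete(T* q) { delete q; }
};

// ---- what the header would contain ----
class widget
{
public:
    widget();
    // No destructor declared: the compiler-generated one is enough.

private:
    class body;              // incomplete as far as clients are concerned
    owning_ptr<body> rep;
};

// ---- what the implementation file would contain ----
static int bodies_destroyed = 0;

class widget::body
{
public:
    ~body() { ++bodies_destroyed; }
};

widget::widget() : rep(new body) {}

// Returns how many body destructors ran while one widget lived and died.
int destructors_run_for_one_widget()
{
    const int before = bodies_destroyed;
    {
        widget w;
    }
    return bodies_destroyed - before;
}
```

The cost is one extra pointer per object, which is the trade-off against a virtual function and a heap-allocated helper.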

A similar function is used for copying the object. Because it contains two function pointers, grin_ptr<> is a bit bigger than a raw pointer, but this isn't likely to be an issue in most uses. The complete code for grin_ptr<> is shown in listing 3.

Listing 3: The code for arg::grin_ptr<>

	template<typename p_type>
	class grin_ptr
	{
		// Pointers to utility functions
		typedef void (*delete_ftn)(p_type* p);
		typedef p_type* (*copy_ftn)(const p_type* p);

	public:
		explicit grin_ptr(p_type* pointee) 
			: do_copy(&my_copy_ftn), p(pointee), do_delete(&my_delete_ftn) {}
		
		grin_ptr(const grin_ptr& rhs);
		
		~grin_ptr() throw()              { do_delete(p); }

		const p_type* get() const        { return p; }

		p_type* get()                    { return p; }

		const p_type* operator->() const { return p; }

		p_type* operator->()             { return p; }

		const p_type& operator*() const  { return *p; }

		p_type& operator*()              { return *p; }

		void swap(grin_ptr& with) throw()
			{ p_type* pp = p; p = with.p; with.p = pp; }
		
		grin_ptr& operator=(const grin_ptr& rhs);

	private:
		copy_ftn	do_copy;
		p_type*		p;
		delete_ftn	do_delete;

		static void my_delete_ftn(p_type* p)
		{
			delete p;
		}

		static p_type* my_copy_ftn(const p_type* p)
		{
			return deep_copy(p);
		}
	};

Copying is in fact a bit more flexible than the current discussion would indicate, since the copying method has been factored out into the "arg::deep_copy()" algorithm. If you do nothing special, though, copying will take place using "new implementation(*p);" as you probably expect.
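Listing 3 declares the copy constructor and assignment operator without showing their bodies. One plausible way to write them is sketched below on a cut-down class, mini_grin_ptr<>, a hypothetical name. The copy-then-swap assignment is my assumption rather than the arglib source, and the copy function here calls "new" directly instead of going through deep_copy().

```cpp
#include <cassert>

template<typename T>
class mini_grin_ptr
{
    typedef void (*delete_ftn)(T*);
    typedef T*   (*copy_ftn)(const T*);

public:
    explicit mini_grin_ptr(T* pointee)
        : do_copy(&my_copy), p(pointee), do_delete(&my_delete) {}

    // Copy construction: the stored function duplicates the pointee.
    mini_grin_ptr(const mini_grin_ptr& rhs)
        : do_copy(rhs.do_copy), p(rhs.do_copy(rhs.p)), do_delete(rhs.do_delete) {}

    ~mini_grin_ptr() { do_delete(p); }

    // Assignment: copy first, then swap, so *this is unchanged if copying throws.
    mini_grin_ptr& operator=(const mini_grin_ptr& rhs)
    {
        mini_grin_ptr tmp(rhs);
        swap(tmp);
        return *this;       // tmp's destructor releases the old pointee
    }

    void swap(mini_grin_ptr& with) { T* pp = p; p = with.p; with.p = pp; }

    const T* get() const { return p; }
    T* get()             { return p; }

private:
    copy_ftn   do_copy;
    T*         p;
    delete_ftn do_delete;

    static void my_delete(T* q)    { delete q; }
    static T* my_copy(const T* q)  { return q ? new T(*q) : 0; }
};

struct counter
{
    int value;
    explicit counter(int v) : value(v) {}
};

// Each copy owns its own pointee, so changing one leaves the others alone.
bool copies_are_independent()
{
    mini_grin_ptr<counter> a(new counter(1));
    mini_grin_ptr<counter> b(a);          // copy construction
    b.get()->value = 2;

    mini_grin_ptr<counter> c(new counter(0));
    c = b;                                // assignment

    return a.get()->value == 1
        && b.get()->value == 2
        && c.get()->value == 2
        && a.get() != b.get()
        && b.get() != c.get();
}
```

Swapping only the raw pointer is enough because two objects of the same mini_grin_ptr<T> type always hold the same pair of function pointers.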

The arg::deep_copy<>() Algorithm

As mentioned in the main text this defaults to copying using "new implementation(*p);" but if a clone() method (or any other name) needs to be called this is easy to achieve. The code for deep_copy() is shown in listing 4.

Listing 4: The code for `arg::deep_copy()`

	struct cloneable {};
	struct Cloneable {};

	template<class p_type>
	inline p_type* deep_copy(const p_type* p, const void*) 
	{
		return p ? new p_type(*p) : 0;
	}

	template<class p_type>
	inline p_type* deep_copy(const p_type *p, const cloneable *)
	{
		return p ? p->clone() : 0;
	}

	template<class p_type>
	inline p_type* deep_copy(const p_type *p, const Cloneable *)
	{
		return p ? p->makeClone() : 0;
	}

	template<class p_type>
	inline p_type* deep_copy(const p_type* p) 
	{
		return deep_copy(p, p);
	}

This means that any class that inherits from arg::cloneable will be copied by "->clone();", and any class that inherits from arg::Cloneable will be copied using "->makeClone()". (These two versions exist to support both my currently favoured naming style and that I am required to use at work.) In addition any class that has a "deep_copy()" algorithm defined in a suitable namespace will be copied using it.
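The dispatch can be tried out with a small polymorphic hierarchy. In this sketch the overloads are repeated in a namespace called sketch so that the example stands alone; shape, square and point are hypothetical classes, and the Cloneable/makeClone() variant is left out for brevity.

```cpp
#include <cassert>

namespace sketch
{
    struct cloneable {};

    // Fallback: copy construct. The body is only instantiated if selected.
    template<class p_type>
    inline p_type* deep_copy(const p_type* p, const void*)
    {
        return p ? new p_type(*p) : 0;
    }

    // Preferred for classes derived from cloneable: a derived-to-base
    // pointer conversion beats a conversion to const void*.
    template<class p_type>
    inline p_type* deep_copy(const p_type* p, const cloneable*)
    {
        return p ? p->clone() : 0;
    }

    template<class p_type>
    inline p_type* deep_copy(const p_type* p)
    {
        return deep_copy(p, p);
    }
}

struct shape : sketch::cloneable
{
    virtual ~shape() {}
    virtual shape* clone() const = 0;
    virtual int sides() const = 0;
};

struct square : shape
{
    virtual shape* clone() const { return new square(*this); }
    virtual int sides() const    { return 4; }
};

struct point { int x; };   // plain type: copied with "new point(*p)"

// Copying through a base pointer keeps the dynamic type.
int sides_of_copy()
{
    square s;
    const shape* base = &s;
    shape* copy = sketch::deep_copy(base);
    const int n = copy->sides();
    delete copy;
    return n;
}

int x_of_copy()
{
    point a = { 7 };
    point* b = sketch::deep_copy(&a);
    const int x = b->x;
    delete b;
    return x;
}
```

Note that shape is abstract, so the copy-constructing overload could not even compile for it; it is never instantiated because overload resolution picks the clone() version first.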

In conclusion

arg::grin_ptr<> removes some of the repetitive work from the development of "Cheshire Cat" classes whilst remaining flexible enough to support applications that are considerably more advanced (making use of polymorphic implementations and/or overloading on const) than the phone_list example considered here.