Monday, May 28, 2012

What's wrong with C++ tutorials?

Quite often on ##c++ or ##programming on Freenode there'll be some poor newbie who comes in with a piece of awful C++ code.  The kind of code that shows a severe misunderstanding of the language: usage of new where a vector would be appropriate, nasty things being done to temporaries, attempts to access a variable from a completely different scope, no understanding of const-correctness and why it is necessary, and many more things that show a misunderstanding of some fundamental language feature.  The question of how people could get the language so wrong arises, and the answer is usually the same: the newbie is learning from a tutorial.

It may be somewhat surprising that just by reading texts that are (mostly) technically correct one could get such a perverse view of the language.  The technical mistakes the texts make probably do contribute somewhat to this confusion but they can't be responsible for it all -- after all, just looking at the most common ones shows that the cases where it would be a problem aren't all that common:
  1. By far the most widespread error is that of streams being read incorrectly.  The state of the stream is checked before the read, and not after, meaning that often one read is made past the end of the stream.
  2. Another common one is the idea that structs cannot have inheritance and member functions.
  3. People tend to make operator overloads of operator+ and friends members instead of free functions.
  4. The preprocessor is abused for const values and inline functions a lot.
Tutorials tend to get at least one of these wrong.  However, by themselves, they're not really a big deal: they may lead to code that isn't quite as good, but these are all fairly isolated problems that can be corrected individually, whereas many of the misconceptions mentioned earlier need a thorough explanation of the language basics to fix.
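To make the first and third items concrete, here is a minimal sketch.  The function names and the Meters type are made up for illustration and do not come from any particular tutorial.  It contrasts a read loop that checks the stream state before the read with one that checks it after, and shows why operator+ is usually better off as a free function.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <vector>

// Broken: the stream state is checked *before* the read, so the failed
// final read still pushes a meaningless value.  (On non-numeric input this
// loop never terminates at all, because eof() never becomes true.)
std::vector<int> read_all_wrong(std::istream& in) {
    std::vector<int> values;
    while (!in.eof()) {
        int x = 0;
        in >> x;              // may fail, but the result is used regardless
        values.push_back(x);
    }
    return values;
}

// Correct: the read itself is the loop condition, so the body only runs
// when a value was actually extracted.
std::vector<int> read_all(std::istream& in) {
    std::vector<int> values;
    int x;
    while (in >> x) {
        values.push_back(x);
    }
    return values;
}

// Hypothetical value type.  Note that it is a struct with a constructor.
struct Meters {
    double value;
    Meters(double v) : value(v) {}  // implicit conversion is intended here
};

// Free function: both operands can be converted from double.
Meters operator+(Meters lhs, Meters rhs) {
    return Meters(lhs.value + rhs.value);
}

// Usage sketch for the pieces above.
void demo_streams_and_operators() {
    std::istringstream a("1 2 3\n");
    assert(read_all_wrong(a).size() == 4);  // one phantom element at the end

    std::istringstream b("1 2 3\n");
    assert(read_all(b).size() == 3);

    // With a member operator+ this would not compile, because the left
    // operand of a member operator is never implicitly converted.
    assert((2.0 + Meters(1.0)).value == 3.0);
}
```

The phantom element only shows up when the input ends in whitespace (a trailing newline, say), which is part of why this bug survives casual testing.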

The problem, as far as I can tell, isn't in what the tutorials explain: it's all the context they leave out while explaining it.  Tutorials tend to focus on explaining the syntax, breaking it down into separate chunks and coming up with contrived examples that demonstrate just that one language feature in isolation.  Worse yet, they tell too much about the useless technical details in the process.  How often do beginners use nothrow new?  What about the unary scope resolution operator?

For example, if we look at the cplusplus.com tutorial on dynamic memory, we see an example of how to ask the user for how much memory they want to allocate and then allocate that much.  However, nowhere does it mention that actually doing this in your code would be a bad idea, and that std::vector should be used for this purpose.  Did the author want to spare the readers the introduction to more complicated matters?  Had that been the case, the discussion of exceptions would probably not have found its way to that chapter.  As things are, the chapter spends a lot of time on checking whether the allocation succeeded (which is going to be obvious when using the throwing version in a program that doesn't handle exceptions anyway) yet fails to show the actual benefits and drawbacks of dynamic allocation, or why "dynamic memory" is called what it is!
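For comparison, this is roughly what the same task looks like with std::vector.  It is a sketch of the general idea rather than a rewrite of the tutorial's code, and read_numbers is a name I made up.  The size is still chosen at run time, which is the whole point of "dynamic" memory, but there is no new, no delete, and nothing to leak.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <vector>

// Reads up to `count` integers from `in`.  Returns fewer if the input runs
// out or contains something that is not a number.
std::vector<int> read_numbers(std::istream& in, std::size_t count) {
    std::vector<int> numbers;
    numbers.reserve(count);  // throws if the request cannot be satisfied
    int value;
    while (numbers.size() < count && in >> value) {
        numbers.push_back(value);
    }
    return numbers;  // the storage is freed when the last owner goes away
}

// Usage sketch: the element count is only known at run time.
void demo_read_numbers() {
    std::istringstream input("10 20 30");
    const std::vector<int> numbers = read_numbers(input, 2);
    assert(numbers.size() == 2);
    assert(numbers[1] == 20);
}
```

In a real program `in` would be std::cin and `count` would come from the user; a string stream just makes the sketch easy to try out.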

Speaking of exceptions, the two tutorials on classes have one sentence on RAII in total.  It's not mentioned by name, the connection to exceptions is not made (neither in the classes tutorial nor in the later exceptions tutorial), and the rule of three (now five) is not mentioned.  Instead, it explains operator overloads -- a worthy subject, definitely, but nowhere near as fundamental and important to sane development as RAII and proper exception handling are.  No mention of const member functions.  No realistic examples of when or why one would use a class, or how code from earlier in the tutorial could be made to use one.
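A small sketch of what that missing connection looks like.  Handle is a hypothetical class, and its counter merely stands in for a real resource (a file, a socket, a lock) so that acquisition and release can be observed.  The point is that the destructor runs even when the scope is left by an exception, and that a class owning a resource has to decide what copying and moving mean.

```cpp
#include <cassert>
#include <stdexcept>
#include <utility>

// Hypothetical resource owner, used only for illustration.
class Handle {
public:
    Handle() : owns_(true) { ++live_count(); }     // acquire in constructor
    ~Handle() { if (owns_) --live_count(); }       // release in destructor

    // Rule of five: copying a unique resource makes no sense, so forbid it...
    Handle(const Handle&) = delete;
    Handle& operator=(const Handle&) = delete;

    // ...and let moves transfer ownership instead.
    Handle(Handle&& other) noexcept : owns_(other.owns_) {
        other.owns_ = false;
    }
    Handle& operator=(Handle&& other) noexcept {
        if (this != &other) {
            if (owns_) --live_count();  // release what we currently hold
            owns_ = other.owns_;
            other.owns_ = false;
        }
        return *this;
    }

    // const member function: inspecting the object does not modify it.
    bool owns() const { return owns_; }

    // Number of resources currently held anywhere in the program.
    static int& live_count() {
        static int count = 0;
        return count;
    }

private:
    bool owns_;
};

// Throws while a Handle is alive.  There is no cleanup code here, yet
// ~Handle() still runs during stack unwinding.
void work_that_fails() {
    Handle h;
    throw std::runtime_error("something went wrong");
}

// Returns true if nothing leaked after the exception was handled.
bool cleanup_survives_exception() {
    try {
        work_that_fails();
    } catch (const std::runtime_error&) {
        // nothing to release by hand
    }
    return Handle::live_count() == 0;
}
```

Had work_that_fails() used a manual acquire/release pair instead, the release would have been skipped by the throw; that is the connection between classes and exceptions that the tutorials never make.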

Next, the inheritance and friendship tutorial is just a joke.  The purpose of friendship is not explained, at all.  This is exactly the "explain the syntax, fail to explain the purpose" pattern that I'm referring to -- the example given about a square and a rectangle just shouldn't be written that way; the rectangle should have had a constructor that takes a width and a height, because as things are it is utterly useless except to compute the areas of squares.  This being necessary for the purpose of the example is no excuse -- there are plenty of cases where both friend functions (operator overloads) and friend classes (iterators) are appropriate.  Using a case that doesn't come anywhere close to practice as an example is silly.
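Here is a sketch of the kind of example I would have preferred: a rectangle that has a proper constructor, plus a friend whose purpose is obvious.  The class below is my own illustration, not the tutorial's code.  operator<< has to be a free function because its left operand is the stream, and friendship is what lets it read the private members without widening the public interface.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>

class Rectangle {
public:
    Rectangle(int width, int height) : width_(width), height_(height) {}

    int area() const { return width_ * height_; }

    // Friend free functions: part of the class's interface, but not members.
    friend std::ostream& operator<<(std::ostream& os, const Rectangle& r) {
        return os << r.width_ << 'x' << r.height_;
    }
    friend bool operator==(const Rectangle& a, const Rectangle& b) {
        return a.width_ == b.width_ && a.height_ == b.height_;
    }

private:
    int width_;
    int height_;
};

// Usage sketch.
void demo_rectangle() {
    const Rectangle r(3, 4);
    std::ostringstream out;
    out << r;
    assert(out.str() == "3x4");
    assert(r.area() == 12);
}
```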

The inheritance part of the tutorial fares little better.  The clich√© explanation using shapes is completed in the next part about polymorphism which makes it an acceptable introduction, but the mother-daughter example drags it right down.  A mother-daughter relationship should never, ever be described through inheritance!  While this goes more into the realm of bad examples of object oriented programming than of C++, there's no reason people should be exposed to such blatantly misleading code.  

The last few tutorials go on to cover some more syntax.  What don't they cover?  Well, for example, there's still no mention of the usage of the standard library containers and algorithms.  There's still no mention of smart pointers or how they relate to exception safety, nor any realistic code pieces being taken apart step by step and explained.  The poor reader is left to figure out how to put all of this together into something that won't crash and burn all by himself, and unsurprisingly, it takes a lot more effort than it would have if these things were explained in the first place.
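As a rough idea of what "putting it together" could look like, here is the shapes example again, this time stored in a standard container, owned by smart pointers and processed with standard algorithms.  All names are illustrative.  I've stuck to C++11 here; from C++14 onward std::make_unique would be the preferred way to create the objects.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <numeric>
#include <vector>

class Shape {
public:
    virtual ~Shape() {}
    virtual double area() const = 0;
};

class Rect : public Shape {
public:
    Rect(double width, double height) : width_(width), height_(height) {}
    double area() const override { return width_ * height_; }
private:
    double width_;
    double height_;
};

class Square : public Shape {
public:
    explicit Square(double side) : side_(side) {}
    double area() const override { return side_ * side_; }
private:
    double side_;
};

typedef std::vector<std::unique_ptr<Shape>> Shapes;

// Each object is owned by a unique_ptr from the moment it is created, so if
// push_back throws, nothing leaks.
Shapes make_shapes() {
    Shapes shapes;
    shapes.push_back(std::unique_ptr<Shape>(new Rect(2.0, 3.0)));
    shapes.push_back(std::unique_ptr<Shape>(new Square(1.0)));
    return shapes;
}

// std::accumulate replaces a hand-written summing loop.
double total_area(const Shapes& shapes) {
    return std::accumulate(
        shapes.begin(), shapes.end(), 0.0,
        [](double sum, const std::unique_ptr<Shape>& s) {
            return sum + s->area();
        });
}

// std::sort with a comparison on area; the pointers are moved, not copied.
void sort_by_area(Shapes& shapes) {
    std::sort(shapes.begin(), shapes.end(),
              [](const std::unique_ptr<Shape>& a,
                 const std::unique_ptr<Shape>& b) {
                  return a->area() < b->area();
              });
}

// Usage sketch: no delete anywhere, and nothing leaks on any path.
void demo_shapes() {
    Shapes shapes = make_shapes();
    assert(total_area(shapes) == 7.0);
    sort_by_area(shapes);
    assert(shapes.front()->area() == 1.0);
}
```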

I could of course go on.  I've mostly focused on cplusplus.com here as it is the most commonly used source (I started with it myself), and is in text which makes these things easier to spot.  Sadly, most other tutorials are no better, although there are exceptions; most notably www.learncpp.com, which is much more thorough.  However, it teaches things in a very unusual order, which I don't think is a good idea.  Whether things will be corrected is questionable; I haven't seen any work done in quite a while, and would not rely on it.

Update 12 September, 2012:  There is a tutorial that may eventually become good enough for me to recommend it, but it is not yet at that stage.  Hopefully, the author will update it fairly soon, and it'll be possible to recommend it at least as a secondary source.

Furthermore, I am considering writing some annotated C++ code that illustrates the development of a simple project.  However, I do not currently have time to do this well, so it will probably have to wait until around the new year.
