Paper, Coffee, Pencil


NULL considered harmful?

Oh great, let's dig this corpse up again for an undignified session
of prodding and jabbing. Well, the reason I bring this up again
is that I've found myself confused by the issue.
NULL, I feel, surely cannot be all that bad? Right?

First off, let's make a clarification: in various discussions,
NULL often confusingly refers to both NULL-references and
NULL-values. A NULL-reference is a pointer to a region of memory
that is considered "nothing". A NULL-value is a special value
that is itself considered "nothing". So far there's not that much
of a difference between the concepts.
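
To make the difference concrete in C terms: a NULL-reference is
simply a pointer holding the special "points at nothing" value,

    #include <stddef.h>

    int *ref = NULL;   /* a NULL-reference: the pointer says "nothing" */

...while a NULL-value lives in the value domain itself, the way NULL
does in SQL or null does in JavaScript, where any slot can hold the
special "nothing" directly with no pointer involved. C has no such
thing natively, which becomes relevant further down.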

Problems start creeping in when a language promotes heavy use of
pointers and does little to help out when a NULL-reference is
dereferenced. The most notable example is C, which allows its
NULL-reference to be cast arbitrarily and drops dead when it's
dereferenced. (The language spec actually lists this as "undefined
behavior", prompting most compiler implementations to ignore the
issue outright and treat the end-user to a crash to the screams
of "Segmentation fault".)
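
A minimal, hypothetical program is all it takes to see it in action:

    #include <stdio.h>

    int main(void)
    {
        char *p = NULL;   /* a NULL-reference, happily typed as char * */
        putchar(*p);      /* undefined behavior: on most hosted systems
                             the OS kills the process with a
                             "Segmentation fault" */
        return 0;
    }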

Another issue arises when a language supports pointer arithmetic
the way C does: addresses become tangible values accessible to the
programmer, which leads to assumptions about the location of NULL.
On many architectures NULL is equal to 0, which manifests itself
in code like

    ...
    if (!my_ptr) {
        ...
    }
    ...

...which is elegant, concise and, perhaps surprisingly, perfectly
well-defined: a literal 0 in pointer context means "the null
pointer", whatever its actual bit pattern, so the compiler does the
right thing even on an architecture where NULL != 0. The real harm
is the habit it breeds of treating NULL as plain old zero, an
assumption that does break the moment you zero out raw memory and
expect to find NULL pointers in it.
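
A hypothetical sketch of that failure mode (struct node is invented
for illustration):

    #include <string.h>

    struct node {
        struct node *next;
        int          payload;
    };

    int main(void)
    {
        struct node n;
        /* Zeroing the bytes is not the same as assigning NULL: on an
           architecture where the null pointer isn't all-bits-zero,
           n.next now holds a non-null garbage address, and a later
           if (!n.next) check will happily walk right past it. */
        memset(&n, 0, sizeof n);
        return 0;
    }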

The issues with NULL-references can be mitigated with proper
handling in the compiler and runtime exceptions on faulty
dereferences. Still, this is more of a band-aid over the glaring
design fault that NULL-references are. A better solution is to use
NULL-values.
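
For the record, here's a rough sketch of what such a band-aid looks
like (the names are invented; this is not any particular runtime's
mechanism):

    #include <stdio.h>
    #include <stdlib.h>

    /* A hypothetical checked dereference: instead of undefined
       behavior, a faulty dereference fails loudly and predictably,
       much like a managed runtime throwing a NullPointerException. */
    static int deref_checked(const int *p, const char *where)
    {
        if (p == NULL) {
            fprintf(stderr, "null dereference in %s\n", where);
            abort();
        }
        return *p;
    }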

However, NULL-values are far from a silver bullet. They usually cut
across the whole type system, meaning that any value can be either
the expected value or a NULL, so you can never truly know what will
be passed beforehand. In object-oriented languages this can be
solved with NULL-objects that can, for example, embed proper error
handling. In languages with static typing and sum types, such as
Haskell or ML, the same idea can be represented by the Maybe-type
(ML calls it 'option'), which makes it explicit that a value can be
either 'Just a' or 'Nothing'.
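
C has nothing like that built in, but the shape of the idea maps
onto a tagged struct. The sketch below is purely illustrative
(maybe_int, just, nothing and find_user_id are all invented names);
the point is that the "is there a value at all?" question travels
with the value instead of hiding inside a pointer:

    #include <stdio.h>
    #include <string.h>

    /* A poor man's Maybe: the tag rides along with the payload. */
    struct maybe_int {
        int has_value;   /* 0 = Nothing, 1 = Just value */
        int value;
    };

    static struct maybe_int just(int v)
    {
        return (struct maybe_int){ 1, v };
    }

    static struct maybe_int nothing(void)
    {
        return (struct maybe_int){ 0, 0 };
    }

    /* Hypothetical lookup that may or may not find anything. */
    static struct maybe_int find_user_id(const char *name)
    {
        if (strcmp(name, "alice") == 0)
            return just(42);
        return nothing();
    }

    int main(void)
    {
        struct maybe_int id = find_user_id("bob");
        if (id.has_value)
            printf("found %d\n", id.value);
        else
            printf("no such user\n");
        return 0;
    }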

So, in summation, what's the verdict? Is NULL considered harmful?
Weerrll, yes and no. NULL-references make little sense and are
generally a hassle that can blow your leg off. I'd say that they
are harmful. NULL-values and friends, on the other hand, are not
necessarily a bad idea, considering that there are plenty of times
where you need to express something that is or isn't. I'd say
that the chief problem with NULL-values is that they convey a
lot of different meanings in different contexts. Sometimes a
NULL means a failed operation, sometimes it's an empty set, etc.,
etc.

As usual when it comes to the art of programming, harm is in the
hands of the fool and the careless.