DevTalk.net

A blog on .Net, C#, F#, architecture and the web

Chained null checks and the Maybe monad

with 14 comments

A great many programmers have met a situation where, while accessing a nested object property (e.g., person.Address.PostCode), they have to do several null checks. This requirement frequently pops up in XML parsing, where missing elements and attributes return null when you attempt to access them (and a subsequent attempt to access Value throws a NullReferenceException). In this article, I’ll show how a take on the Maybe monad in C#, coupled with extension methods, can be used to improve readability.

Problem Description

So, to start with, let’s look at the way to get a person’s post code (just imagine you’re working with XML or something). The code shown below does several null checks and assigns the value only if it is available.

string postCode = null;
if (person != null && person.Address != null && person.Address.PostCode != null)
{
  postCode = person.Address.PostCode.ToString();
}

What you’ve got up there is some fairly unreadable (and unmaintainable) code. Actually, we’re lucky to have all of our code fall under a single if – something that might not be possible in a more complex scenario. Let’s imagine a more complicated situation – say we need to perform some operation between the if evaluations. What do we get? That’s right – a chain of ifs.

string postCode = null;
if (person != null)
{
  if (HasMedicalRecord(person) && person.Address != null)
  {
    CheckAddress(person.Address);
    if (person.Address.PostCode != null)
      postCode = person.Address.PostCode.ToString();
    else
      postCode = "UNKNOWN";
  }
}

The code presented above contains a lot of repetition – for example, person.Address.PostCode is mentioned twice. There’s nothing incorrect about the code per se, it just has a few too many symbols. To sum up, we want our code to communicate better that

  • If the value is null, no further evaluations should be done; if the value is not null, then this is the value we’re going to work with
  • If we perform some action, it only happens on a valid object

So what am I suggesting? I propose that we create a fluent interface that will satisfy the above conditions without any nesting. To do that, we are going to employ the Maybe monad.

For those of you who know F#, the Maybe monad will be familiar as the Option type. For C# developers, let’s just assume that you can have a variable that either has Some value or no value (None). Of course, C# doesn’t directly support this none-some duality except by using null. Which is precisely why I’m proposing the chained extension solution presented below.

With

Our primary concern is to ‘shorten’ the null checks so they don’t pollute our code. For that, we’ll define a With() extension method:

public static TResult With<TInput, TResult>(this TInput o, Func<TInput, TResult> evaluator)
  where TResult : class where TInput : class
{
  if (o == null) return null;
  return evaluator(o);
}

The above method can be attached to any reference type (because TInput is constrained only to class). As a parameter, this method takes a function which defines the next value in the chain. If the value passed in is null, we get null back and the function is never called. Let’s rewrite our first example using this method:

string postCode = this.With(x => person)
                      .With(x => x.Address)
                      .With(x => x.PostCode);
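To make the null propagation concrete, here is a minimal, self-contained sketch. The Person and Address classes are made up for illustration (the article never shows them), and PostCode is assumed to be a plain string. The chain starts from the person argument rather than from this, which amounts to the same check.

```csharp
using System;

// Hypothetical model types, assumed for illustration only.
public class Address { public string PostCode { get; set; } }
public class Person { public Address Address { get; set; } }

public static class WithSketch
{
    // Same With() as defined above, repeated so the sketch compiles on its own.
    public static TResult With<TInput, TResult>(this TInput o, Func<TInput, TResult> evaluator)
        where TResult : class where TInput : class
    {
        if (o == null) return null;
        return evaluator(o);
    }

    // Yields the post code, or null if any link in the chain is null.
    public static string GetPostCode(Person person)
    {
        // Calling an extension method on a null reference is legal:
        // it is an ordinary static call with null as the first argument.
        return person.With(x => x.Address)
                     .With(x => x.PostCode);
    }
}
```

Calling WithSketch.GetPostCode(null), or passing a Person whose Address is null, yields null without throwing; only a fully populated object graph yields the post code.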

I suppose, in the above example, we could replace Func<> with Expression<> and try to pull properties, but I’ve seen this done and the resulting code is too slow and it’s also somewhat limiting – it assumes that you’re working with just one object, whereas my Maybe chains can (and do) drag in many objects.

Return

Here comes another piece of syntactic sugar – the Return() method. This method will return the ‘current’ value just like With() does, but in case null was passed, it will return a different value that we supply. Consider this a kind of ‘With() with fallback’ method.

public static TResult Return<TInput,TResult>(this TInput o,
  Func<TInput, TResult> evaluator, TResult failureValue) where TInput: class
{
  if (o == null) return failureValue;
  return evaluator(o);
}

So let’s assume now that, in the absence of a post code, we want to return, say, string.Empty. Here’s how:

string postCode = this.With(x => person).With(x => x.Address)
                      .Return(x => x.PostCode, string.Empty);

By the way, you could rewrite the extension method so that failureValue would also be computed via a Func<> – I have yet to meet a scenario where this is required, though. It is typically the case that we never know at which stage the chain failed (and yielded null), so the terminal Return() is typically an indicator (either true/false or null/not null).
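For the curious, a Func<>-based fallback might look roughly like the sketch below. This overload is not part of the original set of methods; the failureFactory parameter name and the demo class are assumptions made for illustration. The delegate is invoked only when the chain has actually yielded null.

```csharp
using System;

public static class LazyReturnSketch
{
    // Hypothetical overload: the fallback is produced by a delegate
    // that runs only if the incoming value is null.
    public static TResult Return<TInput, TResult>(this TInput o,
        Func<TInput, TResult> evaluator, Func<TResult> failureFactory)
        where TInput : class
    {
        if (o == null) return failureFactory();
        return evaluator(o);
    }
}

public static class LazyReturnDemo
{
    // Counts how many times the fallback delegate was actually invoked.
    public static int FallbackCalls;

    public static string Describe(string input)
    {
        return input.Return(x => x.ToUpperInvariant(),
                            () => { FallbackCalls++; return "UNKNOWN"; });
    }
}
```

This only pays off when the fallback is expensive to build or has side effects; for constants such as string.Empty the plain failureValue version is simpler.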

If & Unless

Going through the call chain, you sometimes need to do checks not related to null. Theoretically, you could suspend the chain and use an if, or you could use an if in one of the delegates, but… you can simply define an If() extension method (and an Unless() if you feel like it) and plug it into the chain:

public static TInput If<TInput>(this TInput o, Func<TInput, bool> evaluator)
  where TInput : class
{
  if (o == null) return null;
  return evaluator(o) ? o : null;
}

public static TInput Unless<TInput>(this TInput o, Func<TInput, bool> evaluator)
  where TInput : class
{
  if (o == null) return null;
  return evaluator(o) ? null : o;
}
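As a quick illustration of how these two plug into a chain, here is a small sketch. The Person and Address classes, along with the HasMedicalRecord and IsArchived properties, are hypothetical and exist only to give the predicates something to test.

```csharp
using System;

// Hypothetical model types, assumed for illustration only.
public class Address { public string PostCode { get; set; } }
public class Person
{
    public bool HasMedicalRecord { get; set; }
    public bool IsArchived { get; set; }
    public Address Address { get; set; }
}

public static class IfUnlessSketch
{
    // With(), If() and Unless() are repeated from the article
    // so the sketch compiles on its own.
    public static TResult With<TInput, TResult>(this TInput o, Func<TInput, TResult> evaluator)
        where TResult : class where TInput : class
    {
        if (o == null) return null;
        return evaluator(o);
    }

    public static TInput If<TInput>(this TInput o, Func<TInput, bool> evaluator)
        where TInput : class
    {
        if (o == null) return null;
        return evaluator(o) ? o : null;
    }

    public static TInput Unless<TInput>(this TInput o, Func<TInput, bool> evaluator)
        where TInput : class
    {
        if (o == null) return null;
        return evaluator(o) ? null : o;
    }

    // Yields the post code only for a non-archived person with a medical record.
    public static string PostCodeFor(Person person)
    {
        return person.If(x => x.HasMedicalRecord)
                     .Unless(x => x.IsArchived)
                     .With(x => x.Address)
                     .With(x => x.PostCode);
    }
}
```

A failed predicate is indistinguishable from a null further up the chain: in both cases everything downstream sees null and does nothing.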

Do

Seeing how we’re having a party here, let’s add yet another method that simply calls a delegate – and that’s it. Of course, this method is best used for one-line calls and not for evaluating 20-line algorithms with convoluted logic. Nevertheless, the call is quite useful in practice.

public static TInput Do<TInput>(this TInput o, Action<TInput> action)
  where TInput: class
{
  if (o == null) return null;
  action(o);
  return o;
}

So, we’re done: we’ve got the infrastructure we need to get our post code extraction to be a bit more readable. Here is the end result:

string postCode = this.With(x => person)
    .If(x => HasMedicalRecord(x))
    .With(x => x.Address)
    .Do(x => CheckAddress(x))
    .With(x => x.PostCode)
    .Return(x => x.ToString(), "UNKNOWN");

As you can see, the depth of nesting has fallen to zero – no more curly braces!

Discussion

I use these Maybe-monadic-chain-null-extension-methods (call them what you will) in my R2P software product. Here’s an example of real-life use of these constructs:

public override void VisitInvocationExpression(IInvocationExpression expression)
{
  base.VisitInvocationExpression(expression);
  string typeName = this.With(x => expression)
    .With(x => x.InvokedExpression)
    .With(x => x as IReferenceExpression)
    .With(x => x.Reference)
    .With(x => x.Resolve())
    .With(x => x.DeclaredElement)
    .With(x => x.GetContainingType())
    .Return(x => x.CLRName, null);
  this.If(x => Array.IndexOf(types, typeName) != -1)
    .With(x => ExpressionStatementNavigator.GetByExpression(expression))
    .Do(x =>
          {
            var suggestion = new SideEffectSuggestion(typeName);
            var highlightInfo = new HighlightingInfo(
              expression.GetDocumentRange(),
              suggestion);
            context.HighlightingInfos.Add(highlightInfo);
          });
}

I have to point out here that, at any point, you can stop the chain and start a new one. Why would you want that? Well, for example, you cannot define shared variables within the chain (unless you refactor it all to have a Dictionary<string,object>-like parameter).

By the way, quite frequently I find myself making additional, domain-specific methods to plug into this chain. For example:

public static IElement IsWithin<TContainingType>(this IElement self)
  where TContainingType: class, IElement
{
  if (self == null) return null;
  var owner = self.GetContainingElement<TContainingType>(false);
  // keep the element in the chain only if it has a containing element of the given type
  return owner != null ? self : null;
}

One more thing: this type of notation is actually light obfuscation because, as I’m sure you’ve guessed, each extension method call will be shown as a static method call in Reflector:

public override void VisitInvocationExpression(IInvocationExpression expression)
{
    base.VisitInvocationExpression(expression);
    string typeName = this.With<SideEffectAnalyser, IInvocationExpression>(
    delegate (SideEffectAnalyser x) {
        return expression;
    }).With<IInvocationExpression, ICSharpExpression>(delegate (IInvocationExpression x) {
        return x.InvokedExpression;
    }).With<ICSharpExpression, IReferenceExpression>(delegate (ICSharpExpression x) {
        return (x as IReferenceExpression);
    }).With<IReferenceExpression, IReference>(delegate (IReferenceExpression x) {
        return x.Reference;
    })
    ⋮
    // and so on
}

This approach is easily extensible – for example, a colleague of mine does try-catch checks in his chains, too. Hey, this is kind of like AOP, but without post-build or dynamic proxies. Oh, and the performance hit for these chains is negligible compared to if statements.
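I don’t have my colleague’s code to hand, so the following is only a guess at what such a try-catch link could look like. The TryWith name and the decision to swallow every exception are assumptions, not his actual implementation.

```csharp
using System;

public static class TryCatchSketch
{
    // Hypothetical helper: behaves like With(), but an exception thrown
    // by the evaluator is treated as "no value" and becomes null.
    public static TResult TryWith<TInput, TResult>(this TInput o, Func<TInput, TResult> evaluator)
        where TInput : class where TResult : class
    {
        if (o == null) return null;
        try
        {
            return evaluator(o);
        }
        catch (Exception)
        {
            // Blunt on purpose: any failure ends the chain.
            return null;
        }
    }

    // new Uri(string) throws UriFormatException on malformed input,
    // which the chain turns into null.
    public static Uri ParseUri(string text)
    {
        return text.TryWith(x => new Uri(x));
    }
}
```

Catching Exception wholesale hides genuine bugs, so in real code you would probably want to narrow the catch to the exception types you expect.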

That’s it! Comments are, as always, welcome! Oh, and if you like this article, please vote for it on CodeProject.  ■

Written by Dmitri Nesteruk

September 12th, 2010 at 9:05 am

Posted in CSharp

  • bsnote

    One thing is bad here: it doesn’t matter at what level the check fails, all of them are executed. Do you have an idea, how this can be avoided?

  • http://devtalk.net Dmitri Nesteruk

    I think the point is to go through with them. You could certainly break out of the execution by throwing an exception, but then you would have to catch it somewhere, which would lead to uglier top-level code. Not to mention the fact that throwing exceptions in this instance is just plain wrong.

    I imagine this could be done at the pure IL level (ehh… somehow). But personally I’m happy to execute the whole chain knowing it’s resistant to failure.

  • Андрей

    Your articles inspire me to study, study and study again! :)

  • Rafi

    Maybe I missed something here, but it looks as ugly

  • http://blog.threenine.co.uk Gary Woodfine

    Excellent post. I agree with you in that it is probably better to execute the entire chain. However, off the top of my head, you could probably use the yield statement if you are looking to return errors.
    Sorry I can’t supply more information, it’s late and I really need to get to bed :-)

  • http://devtalk.net Dmitri Nesteruk

    Well that would imply that the method would have to return IEnumerable, which isn’t exactly what we want.

  • Shrike

    Nice.
    I also sometimes use the NullObject pattern.
    Then person.Address never returns null, but returns a NullAddress instead.
    It makes sense if the types are generated. Writing them by hand is tedious, of course.
    It’s not as flexible, but it plays well with WPF data binding.

  • Tim Collinson

    Dmitri,

    Great post. I was just playing around with this and noticed that while it works fine for things like a string, it falls apart with ints which are not nullable. Any ideas? Or is that the point to begin with since we wouldn’t be checking for nulls in ints?

  • http://devtalk.net Dmitri Nesteruk

    This is happening because of the :class constraint which is, IIRC, necessary for all of this to work. Naturally, you cannot compare int to null and so propagation of numeric data via these chains becomes problematic. The simplest solution I can think of is creating yet another chain method that foregoes the null check, e.g.:

    public static TResult WithValue<TInput, TResult>(this TInput o, Func<TInput, TResult> evaluator)
      where TInput:struct
    {
      return evaluator(o);
    }
    
  • karlssberg

    I wouldn’t worry too much about subsequent null test being executed when the first fails. If you could see how a compiler reorders the flow of your code to improve performance while still maintaining the program logic then you’d simply give up trying these little speed hacks. Compilers generally do a better job of tweaking our code to make it faster.

  • http://activeengine.wordpress.com David Robbins

    Wow – you just blew me away. Really nice work.

  • http://devtalk.net Dmitri Nesteruk

    Thanks! Glad you like it.

  • Guest

    just … why?!

  • http://nesteruk.wordpress.com/2011/12/31/year-2011-results/ 2011 Year in Review « Dmitri Nesteruk

    [...] the same thing, over and over. I also actively use the Maybe monad for [...]