Types First

Backend Oct 29, 2021

Have you ever come across a piece of code wrapped in seemingly random validation conditions? Consider this example:

public void SaveNewCustomer(string firstName, string lastName, string email)
{
	if (string.IsNullOrWhiteSpace(lastName))
	{
		throw new ArgumentException("Last name is required");
	}
	if (string.IsNullOrWhiteSpace(email))
	{
		throw new ArgumentException("Invalid email");
	}

	var customer = new Customer(firstName, lastName, email);
	database.Save(customer);
}

This method validates some of its arguments, uses them to instantiate a new Customer entity, and saves the result to a database. That seems fine, but can we be sure the validations are correct? What if you stumbled across something similar a while later, but without the first condition? Would that mean lastName is not always mandatory? Or worse, are the two implementations supposed to be identical? And which one is correct?

Spoiler alert: There is a better way.

The problem with validations

Our recent blog post about functional principles calls out Statements as one of the common coding pitfalls. And I’d argue they especially suck in combination with validations. Don’t get me wrong; it’s absolutely correct to check inputs, but there is a caveat.

Take for example the code above. I’ve mentioned some concerns already, and there really is quite a lot that can go wrong with it. So, how can we improve it?

Well, we can start with the email, because the current check barely validates anything. So, let’s add some conditions, shall we?

if (string.IsNullOrWhiteSpace(email) || !email.Contains("@") || !email.Contains("."))

Now the string must at least contain a few specific characters, but the check is still pretty weak. For example, “@.” would count as a valid email according to it.

So, let’s employ some regular expressions. They are meant for this sort of thing anyway. Fortunately, Microsoft so handily provided us with an example of their own:

if (email == null || !Regex.IsMatch(
		email,
		@"^[^@\s]+@[^@\s]+\.[^@\s]+$",
		RegexOptions.IgnoreCase, TimeSpan.FromMilliseconds(250)
))
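Before wiring the pattern in, it may help to see what it accepts and rejects. Here is a standalone sketch; the sample inputs and the `LooksLikeEmail` helper name are my own illustrations, not part of the original code:

```csharp
using System;
using System.Text.RegularExpressions;

public static class EmailCheckDemo
{
	// Same pattern and timeout as above; the timeout guards against
	// pathological backtracking on hostile inputs.
	public static bool LooksLikeEmail(string email) =>
		email != null && Regex.IsMatch(
			email,
			@"^[^@\s]+@[^@\s]+\.[^@\s]+$",
			RegexOptions.IgnoreCase, TimeSpan.FromMilliseconds(250));

	public static void Main()
	{
		Console.WriteLine(LooksLikeEmail("john@example.com")); // True
		Console.WriteLine(LooksLikeEmail("@."));               // False: each part needs at least one character
		Console.WriteLine(LooksLikeEmail("john@example"));     // False: no dot after the @
	}
}
```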

If we split the code into separate methods for the sake of readability, we end up with something like this:

public void SaveNewCustomer(string firstName, string lastName, string email)
{
	CheckIsEmpty(lastName, nameof(lastName));
	CheckIsEmail(email);

	var customer = new Customer(firstName, lastName, email);
	database.Save(customer);
}

private void CheckIsEmpty(string input, string parameterName)
{
	if (string.IsNullOrWhiteSpace(input))
	{
		throw new ArgumentException($"{parameterName} is required");
	}
}

private void CheckIsEmail(string email)
{
	if (email == null || !Regex.IsMatch(
			email,
			@"^[^@\s]+@[^@\s]+\.[^@\s]+$",
			RegexOptions.IgnoreCase, TimeSpan.FromMilliseconds(250)
	))
	{
		throw new ArgumentException("Invalid email");
	}
}

Nice. This is better. But have we solved the concerns we had in the beginning? Not at all, because the validations took our focus off what’s important, and that is creating valid customers.

I’ve taken this detour to highlight that validations have this nasty habit of distracting you while you try to understand the code, but that’s not all…

Parsing over validation

Let’s say the Customer constructor looks like this:

public Customer(string firstName, string lastName, string email)
{
	FirstName = firstName;
	LastName = lastName;
	Email = email;
}

There is nothing out of the ordinary here, but this is THE source of all our problems as there is virtually nothing preventing us from simply skipping the validations altogether and calling this constructor directly.

We could move all the validations down here into the constructor, but what if we wanted to use them elsewhere? Do we just copy them? Or do we introduce some standalone classes to handle all this? What if the properties in the class must be mutable for some reason; wouldn’t that allow us to skip validations in the constructor? So, we move them to setters, right?

At this point it’s becoming clearer that the issue is with the property types themselves. Everything else up to this point was trying to limit the data that can end up here. The basic types simply allow data states we do not wish to support.

Now this is the enlightening realization that opens our eyes to a simple, yet very effective solution – we just need better types! Let’s redesign our Customer class while keeping all those validations in mind. We need a non-empty string type and an email type.

Microsoft has us covered again with the MailAddress class from the System.Net.Mail namespace. For the last name, we can create our own custom type that handles nulls as well as whitespace. It can look like this:

public sealed class NonEmptyString : IEquatable<string>
{
	private NonEmptyString(string value)
	{
		Value = value.Trim();
	}

	public string Value { get; }

	public static NonEmptyString CreateUnsafe(string value)
	{
		if (string.IsNullOrWhiteSpace(value))
		{
			throw new ArgumentException("You cannot create a NonEmptyString from whitespaces, empty strings or nulls.");
		}

		return new NonEmptyString(value);
	}

	public static NonEmptyString? Create(string value)
	{
		return !string.IsNullOrWhiteSpace(value) ? new NonEmptyString(value) : null;
	}

	public override int GetHashCode()
	{
		return Value.GetHashCode();
	}

	public override bool Equals(object obj)
	{
		// Compare by the underlying value; obj may be another NonEmptyString or a plain string.
		return obj is NonEmptyString other ? Value.Equals(other.Value) : Value.Equals(obj);
	}

	public bool Equals(string other)
	{
		return Value.Equals(other);
	}

	public override string ToString()
	{
		return Value;
	}
}

Note: It is always good practice to handle null and invalid cases explicitly: CreateUnsafe throws for callers that expect valid input, while Create returns null so the caller can decide how to react.
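To see the difference between the two factories in action, here is a small self-contained sketch. The class is condensed from the one above so the snippet compiles on its own; the sample values are illustrative:

```csharp
using System;

// Condensed copy of the NonEmptyString type above, kept minimal for a runnable sketch.
public sealed class NonEmptyString
{
	public string Value { get; }
	private NonEmptyString(string value) => Value = value.Trim();

	public static NonEmptyString CreateUnsafe(string value) =>
		string.IsNullOrWhiteSpace(value)
			? throw new ArgumentException("You cannot create a NonEmptyString from whitespaces, empty strings or nulls.")
			: new NonEmptyString(value);

	public static NonEmptyString Create(string value) =>
		string.IsNullOrWhiteSpace(value) ? null : new NonEmptyString(value);
}

public static class FactoryDemo
{
	public static void Main()
	{
		// Create is null-returning: suited to optional or user-supplied input.
		Console.WriteLine(NonEmptyString.Create("  Smith  ")?.Value); // Smith (trimmed)
		Console.WriteLine(NonEmptyString.Create("   ") == null);      // True

		// CreateUnsafe throws: suited to inputs that must already be valid.
		try { NonEmptyString.CreateUnsafe(null); }
		catch (ArgumentException) { Console.WriteLine("rejected"); }  // rejected
	}
}
```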

Now we can use these types in our Customer class, and there it is: the compiler forces us to parse inputs as early as possible, which also conveniently complies with the Fail Fast principle:

public Customer(string firstName, NonEmptyString lastName, MailAddress email)
{
	FirstName = firstName;
	LastName = lastName;
	Email = email;
}
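To make the new signature concrete, here is a runnable sketch of the whole flow. The condensed NonEmptyString and the Customer class mirror the code above; the Demo class and its sample values are illustrative assumptions:

```csharp
using System;
using System.Net.Mail;

// Condensed copy of the NonEmptyString type above.
public sealed class NonEmptyString
{
	public string Value { get; }
	private NonEmptyString(string value) => Value = value.Trim();

	public static NonEmptyString CreateUnsafe(string value) =>
		string.IsNullOrWhiteSpace(value)
			? throw new ArgumentException("You cannot create a NonEmptyString from whitespaces, empty strings or nulls.")
			: new NonEmptyString(value);
}

public sealed class Customer
{
	public string FirstName { get; }
	public NonEmptyString LastName { get; }
	public MailAddress Email { get; }

	public Customer(string firstName, NonEmptyString lastName, MailAddress email)
	{
		FirstName = firstName;
		LastName = lastName;
		Email = email;
	}
}

public static class Demo
{
	public static void Main()
	{
		// Parsing happens here, at the boundary; invalid data fails immediately,
		// and a Customer holding invalid state simply cannot be constructed.
		var customer = new Customer(
			"John",
			NonEmptyString.CreateUnsafe("Smith"),
			new MailAddress("john@example.com")); // throws FormatException if malformed

		Console.WriteLine($"{customer.LastName.Value} <{customer.Email.Address}>");
	}
}
```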

Benefits

Before we wrap this up, let’s mention some other aspects of strong typing I find very useful:

  • Less code duplication – Types are nicely reusable. Validation conditions less so.
  • Self-documenting code – A big part of development is reading, and with proper strong types, method signatures often provide enough information on their own. There is no need for lengthy (and often outdated) comments explaining things.
  • Compile-time safety – Because the data model doesn't allow invalid states, you can focus on what is important instead of keeping track of all the different edge cases.

Summary

When it comes to code, we believe it should be Simple, Correct and Fast. Type Driven Development is one design philosophy that brings these three together in a beautiful fashion. Gone is the endless head-scratching over whether this or that should be allowed here or there. The compiler does much of the heavy lifting for you, and you can simply let yourself be guided by your types.

If this has caught your fancy, here are some more resources that dive even deeper into this topic:

Parse, don’t validate by Alexis King
Type Driven Development by Mark Seemann



David Müller

Backend Engineer at Mews
