Composable Error Handling¶
Sentinel, being a framework that interacts with a database, faces common problems. It needs to validate user-supplied data, write to and read from the database, and so on. These are not new problems, and other frameworks and languages have each solved them in their own way.
Since Sentinel is based on Scala, we’ll take a look at how we can actually achieve these things nicely in Scala. It may look daunting at first, but the result is worth the effort: composable clean code.
We’ll cover three topics in particular:
- Dealing with expected failures such as user-input errors, in this guide.
- Launching tasks asynchronously using Future, in the next guide: Asynchronous Processing.
- How Sentinel composes error handling asynchronously, also in the next guide: Asynchronous Processing.
Note
This guide is geared towards use cases in Sentinel and is by no means comprehensive. Readers are expected to be familiar with Scala, particularly using for comprehensions and pattern matching. It is also a good idea to be familiar with the scalaz library as Sentinel also makes considerable use of it.
Dealing with expected failures¶
Upon receiving a file upload, Sentinel parses the contents into a JSON object and extracts all the metrics contained within. This means that Sentinel needs to ensure that:
- The file is valid JSON.
- The JSON file contains the expected metrics values.
If any of these requirements are not met, Sentinel should notify the uploader of the error since this is something that we can expect the uploader to correct.
Let’s assume that the uploaded data is stored as an Array[Byte] object and the data is parsed into a JValue object, as defined by the Json4s library. We can use Json4s’s parse function to extract our JSON. It has the following (simplified) signature:
// `JsonInput` can be many different types,
// `java.io.ByteArrayInputStream` among them.
// `Formats` is a Json4s-exported object that
// determines how the parsing should be done.
def parse(in: JsonInput)(implicit formats: Formats): JValue
We can then write a function to extract JSON out of our byte array using parse.
import java.io.ByteArrayInputStream
import org.json4s.JValue
import org.json4s.jackson.JsonMethods.parse
// For now, we can use the default `Formats`
// as the implicit value
implicit val formats = org.json4s.DefaultFormats
// first attempt
def extractJson(contents: Array[Byte]): JValue =
parse(new ByteArrayInputStream(contents))
In an ideal world, our function would always return a JValue given a byte array. In reality, our function will be faced with user input, which should be treated with extreme caution. That includes expecting corner cases, for example a byte array that is not valid JSON or one that is empty. Those cases will cause parse to throw an exception that must be dealt with, otherwise our normal program flow is interrupted.
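To make the problem concrete, here is a minimal sketch using only the standard library. `parseNumber` is a hypothetical stand-in for an exception-throwing parser such as `parse` (it reads an integer rather than JSON), and `describe` is likewise made up for illustration.

```scala
import scala.util.{Failure, Success, Try}

// Hypothetical stand-in for an exception-throwing parser such as `parse`:
// it turns a byte array into an Int and throws on empty or malformed input.
def parseNumber(contents: Array[Byte]): Int =
  new String(contents, "UTF-8").trim.toInt

// Catching the exception at the call site keeps the program running,
// but nothing in `parseNumber`'s signature tells callers they must do so.
def describe(contents: Array[Byte]): String =
  Try(parseNumber(contents)) match {
    case Success(n) => s"parsed $n"
    case Failure(e) => s"failed: ${e.getClass.getSimpleName}"
  }

val goodInput = describe("42".getBytes("UTF-8"))
val emptyInput = describe(Array.empty[Byte])
```

The caller has to know, from documentation or experience, that `parseNumber` can throw. The types introduced in the following sections move that knowledge into the signature.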
The Option type¶
How should we handle the exception? Having a strong functional flavour, Scala offers a nice alternative: the Option[T] type. In short, it allows us to encode return values that may or may not exist. It has two concrete values: Some(_) or None. For example, a function with return type Option[Int] can return either Some(3) (which means it has the value 3) or None (which means no number was returned).
While exceptions may be more suitable for some cases, Option offers interesting benefits of its own. With Option, we explicitly state in the function’s signature the possibility that it may not return its intended value. Consequently, any caller of the function is prompted to deal with this possibility, making the code less prone to errors.
It also turns out that the Option pattern occurs frequently in code. Functions that perform integer division, for example, need to acknowledge that division by zero may occur. Another example is a function that returns the first item of an array. What should the function do when the array is empty? Option fits well into these and many other cases. Over time, this has resulted in a set of common operations on Option objects that we can use to make our code more concise without sacrificing functionality. You can check out the official documentation for a glance of what these operations are.
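The two cases mentioned above can be sketched as follows. `safeDivide` and `firstItem` are illustrative names; `headOption`, `map`, `flatMap`, and `getOrElse` are standard library methods.

```scala
// Integer division that acknowledges division by zero in its signature.
def safeDivide(a: Int, b: Int): Option[Int] =
  if (b == 0) None else Some(a / b)

// `headOption` is the standard library's "first item, if there is one".
def firstItem(xs: Array[Int]): Option[Int] = xs.headOption

// A few of the common operations on `Option`:
val doubled: Option[Int] = safeDivide(10, 2).map(_ * 2)       // transform the value, if present
val fallback: Int = safeDivide(1, 0).getOrElse(-1)            // supply a default for None
val firstOfEmpty: Option[Int] = firstItem(Array.empty[Int])   // no first item to return
val chained: Option[Int] = firstItem(Array(8, 3)).flatMap(safeDivide(_, 4)) // chain two optional steps
```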
Tip
For a more in-depth treatment of Option, we find the guide here informative.
Some operations worth highlighting are flatMap and withFilter. They are used by Scala’s for comprehensions, which means you can chain code that returns Option like this:
def alpha(): Option[Int] = { ... }
def beta(n: Int): Option[String] = { ... }
def gamma(s: String): Option[Double] = { ... }
val finalResult: Option[Double] = for {
  result1 <- alpha()
  result2 <- beta(result1)
  result3 <- gamma(result2)
  squared = result3 * result3
} yield squared
In the code snippet above, alpha is called first, then its result is used for calling beta, whose result is used for calling gamma. The beauty of the chain is that if any of the functions returns None, subsequent functions will not be called and we get None as the value of finalResult. There is no need to do manual checks using if blocks. Furthermore, the for comprehension automatically unwraps result1 and result2 out of Option when they are used for calling beta and gamma. Finally, we can slip in an extra value declaration (squared), which is only evaluated if our chain produces a result3 value.
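Here is a runnable variant of the chain, with made-up bodies for `alpha`, `beta`, and `gamma` (the originals are elided, so these are assumptions for illustration only). `chainDesugared` shows roughly what the compiler turns the for comprehension into.

```scala
import scala.util.Try

// Illustrative bodies; each step may or may not produce a value.
def alpha(raw: String): Option[Int] = Try(raw.trim.toInt).toOption
def beta(n: Int): Option[String] = if (n >= 0) Some(n.toString + ".5") else None
def gamma(s: String): Option[Double] = Try(s.toDouble).toOption

def chain(raw: String): Option[Double] = for {
  result1 <- alpha(raw)
  result2 <- beta(result1)
  result3 <- gamma(result2)
  squared = result3 * result3
} yield squared

// Approximately the same chain written with explicit flatMap/map calls.
def chainDesugared(raw: String): Option[Double] =
  alpha(raw).flatMap { r1 =>
    beta(r1).flatMap { r2 =>
      gamma(r2).map(r3 => r3 * r3)
    }
  }
```

Calling `chain("3")` runs all three steps, while `chain("-1")` stops at `beta` and `chain("abc")` stops at `alpha`; in both failing cases the later functions are never invoked.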
Tip
flatMap and withFilter are not the only methods that for comprehensions desugar into. Check out the official FAQ on other possible methods.
Going back to our JSON extractor function, we need to update it so that it returns Option[JValue]. Luckily, Json4s already has us covered here. In addition to parse, it provides a function called parseOpt, which only returns a JValue if the given input can be parsed into JSON. It has the following (simplified) signature:
def parseOpt(in: JsonInput)(implicit formats: Formats): Option[JValue]
Our function then becomes:
import java.io.ByteArrayInputStream
import org.json4s.JValue
import org.json4s.jackson.JsonMethods.parseOpt
implicit val formats = org.json4s.DefaultFormats
// second attempt
def extractJson(contents: Array[Byte]): Option[JValue] =
parseOpt(new ByteArrayInputStream(contents))
The Either type¶
Our function now has a better return type for its callers. Notice, however, that Option is black and white: either our function returns the expected JValue or it does not. In contrast, there is more than one way the parsing can fail. The JSON file could be malformed, maybe containing an extra comma or missing a bracket somewhere. There could also be network errors that cause no bytes to be uploaded, resulting in an empty byte array. This information is potentially useful for uploaders, so it would be desirable for Sentinel to report what kind of error caused the parsing to fail. In short, we would like to encode the possibility that our function may fail in multiple ways.
Enter the Either type. Either allows us to encode two return types into one, unlike Option which only allows one. Its two concrete values are Right, conventionally used to encode the type returned for successful function calls, and Left, for encoding errors.
This should be clearer with an example. We will use a function that returns the sum of the first and last item of a List[Int] to illustrate this. Here, the given list must contain at least two items. If that’s not the case, we would like to notify the caller. One way to write this with Either is like so:
def sumFirstLast(list: List[Int]): Either[String, Int] =
  if (list.isEmpty) Left("List has no items.")
  else if (list.size == 1) Left("List only has one item.")
  else Right(list.head + list.last)
The type that encodes the error (the left type) can be anything. We use String here for convenience, but other types such as List[String] or even a custom type can be used.
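As a sketch of the custom-type option, the variant below replaces the String messages with a small sealed hierarchy. `ListError`, `EmptyList`, `TooShort`, and `report` are hypothetical names introduced only for this example.

```scala
// Hypothetical error type for illustration; not part of Sentinel.
sealed trait ListError
case object EmptyList extends ListError
final case class TooShort(size: Int) extends ListError

def sumFirstLast(list: List[Int]): Either[ListError, Int] =
  list match {
    case Nil      => Left(EmptyList)
    case _ :: Nil => Left(TooShort(1))
    case _        => Right(list.head + list.last)
  }

// Callers pattern match on both sides; because `ListError` is sealed,
// the compiler can warn when an error case is left unhandled.
def report(result: Either[ListError, Int]): String = result match {
  case Right(sum)        => s"Sum is $sum."
  case Left(EmptyList)   => "List has no items."
  case Left(TooShort(n)) => s"List only has $n item(s)."
}
```

A structured error type keeps the decision about wording with the caller, at the cost of a few extra definitions.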
We can now further improve our extractJson function using Either. Since Json4s does not provide a parsing function that returns Either, we need to modify our own function a bit:
import java.io.ByteArrayInputStream
import org.json4s.JValue
import org.json4s.jackson.JsonMethods.parseOpt
implicit val formats = org.json4s.DefaultFormats
// third attempt
def extractJson(contents: Array[Byte]): Either[String, JValue] =
  if (contents.isEmpty) Left("Nothing to parse.")
  else parseOpt(new ByteArrayInputStream(contents)) match {
    case None => Left("Invalid syntax.")
    case Some(jv) => Right(jv)
  }
The disjunction type: \/¶
Our iterations are looking better, but we are not there yet. Either, as provided by the Scala standard library before version 2.12, unfortunately does not play very well with for comprehensions like Option does. Scala does not enforce that Either’s Left encodes the error return type (and consequently, that Right encodes the success type). What this means is that in for comprehensions, we have to state whether we expect the Right or Left type for each call. This is done by calling the Either.right or Either.left method.
def uno(): Either[String, Int] = { ... }
def dos(n: Int): Either[String, String] = { ... }
def tres(s: String): Either[String, Double] = { ... }
val finalResult: Either[String, Double] = for {
  result1 <- uno().right
  result2 <- dos(result1).right
  result3 <- tres(result2).right
} yield result3
It seems like a minor inconvenience to add .right, but there is something going on under the hood with .right and .left. They do not actually create Right and Left, but RightProjection and LeftProjection, which are different types with different properties. The practical consequence is that the code below will not compile (unlike its Option counterpart):
val finalResult: Either[String, Double] = for {
  result1 <- uno().right
  result2 <- dos(result1).right
  result3 <- tres(result2).right
  squared = result3 * result3
} yield squared
To get it working, we need to manually wrap the squared declaration inside an Either, invoke .right, and replace the value assignment operator:
val finalResult: Either[String, Double] = for {
  result1 <- uno().right
  result2 <- dos(result1).right
  result3 <- tres(result2).right
  squared <- Right(result3 * result3).right
} yield squared
This is getting unnecessarily verbose. We have to invoke .right every time and we lose the ability to declare values inside for comprehensions. To remedy this, we need to use the scalaz library.
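As an aside, the limitation described above applies to older Scala versions. From Scala 2.12 onwards the standard library’s Either is itself right-biased, so the for comprehension works without .right. The sketch below assumes Scala 2.12 or later, and the bodies of `uno`, `dos`, and `tres` are made up for illustration. Sentinel’s own code uses the scalaz type described next.

```scala
import scala.util.Try

// Illustrative bodies; each step either fails with a message or succeeds.
def uno(): Either[String, Int] = Right(3)
def dos(n: Int): Either[String, String] =
  if (n > 0) Right(n.toString + ".5") else Left("n must be positive.")
def tres(s: String): Either[String, Double] =
  Try(s.toDouble).toOption.toRight(s"'$s' is not a number.")

// On Scala 2.12+ no projection is needed and value declarations are allowed.
val finalResult: Either[String, Double] = for {
  result1 <- uno()
  result2 <- dos(result1)
  result3 <- tres(result2)
  squared = result3 * result3
} yield squared

// The first Left short-circuits the rest of the chain.
val failedResult: Either[String, Double] = for {
  result2 <- dos(0)
  result3 <- tres(result2)
} yield result3
```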
Scalaz is a third-party Scala library that provides many useful functional programming abstractions. One that we will use now is called \/ (often called the disjunction type, since it is inspired by the mathematical disjunction operator ∨). It is very similar to Either, except for the fact that it is right-biased. This means that it expects the error type to be encoded as the left type and the expected type to be encoded as the right type.
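To illustrate what right bias means in practice, here is a toy disjunction written from scratch. It is not scalaz’s implementation; `Disj`, `Good`, `Bad`, and `half` are made-up names, and the point is only that `map` and `flatMap` act on the right side and pass the left side through unchanged.

```scala
// A toy right-biased disjunction, for illustration only (not scalaz's code).
sealed trait Disj[+E, +A] {
  // Transform the success value; an error is passed through untouched.
  def map[B](f: A => B): Disj[E, B] = this match {
    case Good(a) => Good(f(a))
    case Bad(e)  => Bad(e)
  }
  // Run the next step only when this step succeeded.
  def flatMap[EE >: E, B](f: A => Disj[EE, B]): Disj[EE, B] = this match {
    case Good(a) => f(a)
    case Bad(e)  => Bad(e)
  }
}
final case class Bad[+E](error: E) extends Disj[E, Nothing]
final case class Good[+A](value: A) extends Disj[Nothing, A]

def half(n: Int): Disj[String, Int] =
  if (n % 2 == 0) Good(n / 2) else Bad(s"$n is odd.")

// Because map/flatMap are defined directly on the type, for comprehensions
// need no projection and may contain value declarations.
val halvedTwice: Disj[String, Int] = for {
  a <- half(12)
  b <- half(a)
  doubled = b * 2
} yield doubled

val stoppedEarly: Disj[String, Int] = for {
  a <- half(6)
  b <- half(a)
} yield b
```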
Here is a quick comparison between Either and \/:
import scalaz._, Scalaz._
// Type declaration.
// We can use the `\/` type as an infix
// operator as well, as shown in `value3`
// declaration below
def value1: Either[String, Int] // Scala
def value2: \/[String, Int] // scalaz
def value3: String \/ Int // scalaz
// Right instance creation.
// The scalaz constructor is the type name,
// plus the side we use: `\/` appended with `-`
val value4: Either[String, Int] = Right(3) // Scala
val value5: String \/ Int = \/-(3) // scalaz
// Left instance creation.
// The scalaz constructor is analogous to its
// right type counterpart: `\/` prepended with `-`
val value6: Either[String, Int] = Left("err") // Scala
val value7: String \/ Int = -\/("err") // scalaz
Our earlier example can now be made more concise using the disjunction type:
def uno(): String \/ Int = { ... }
def dos(n: Int): String \/ String = { ... }
def tres(s: String): String \/ Double = { ... }
val finalResult: String \/ Double = for {
  result1 <- uno()
  result2 <- dos(result1)
  result3 <- tres(result2)
  squared = result3 * result3
} yield squared
One more thing: notice that we always encode the error type (the left type) as String and we need to redeclare it every time. We can make this even shorter by creating a type alias to disjunction whose left type is always String. Let’s call this alias Perhaps:
type Perhaps[+T] = String \/ T
def uno(): Perhaps[Int] = { ... }
def dos(n: Int): Perhaps[String] = { ... }
def tres(s: String): Perhaps[Double] = { ... }
val finalResult: Perhaps[Double] = for {
  result1 <- uno()
  result2 <- dos(result1)
  result3 <- tres(result2)
  squared = result3 * result3
} yield squared
And finally, going back to our JSON extractor example, we need to update it like so:
import java.io.ByteArrayInputStream
import org.json4s.JValue
import org.json4s.jackson.JsonMethods.parseOpt
import scalaz._, Scalaz._
implicit val formats = org.json4s.DefaultFormats
type Perhaps[+T] = String \/ T
// fourth attempt
def extractJson(contents: Array[Byte]): Perhaps[JValue] =
  if (contents.isEmpty) -\/("Nothing to parse.")
  else parseOpt(new ByteArrayInputStream(contents)) match {
    case None => -\/("Invalid syntax.")
    case Some(jv) => \/-(jv)
  }
Going even further, we can replace the pattern match with a call to scalaz’s .toRightDisjunction. This can be done on the Option[JValue] value that parseOpt returns. The argument is the error value: the value that we would like to return in case parseOpt evaluates to None.
...
// fourth attempt
def extractJson(contents: Array[Byte]): Perhaps[JValue] =
  if (contents.isEmpty) -\/("Nothing to parse.")
  else parseOpt(new ByteArrayInputStream(contents))
    .toRightDisjunction("Invalid syntax.")
We can further shorten this using the \/> function, which is basically an alias to .toRightDisjunction:
...
// fourth attempt
def extractJson(contents: Array[Byte]): Perhaps[JValue] =
  if (contents.isEmpty) -\/("Nothing to parse.")
  // As before, the byte array must be wrapped in a stream for `parseOpt`
  else parseOpt(new ByteArrayInputStream(contents)) \/> "Invalid syntax."
This is functionally the same, and some would prefer the clarity of .toRightDisjunction instead of \/>’s brevity. We will stick to .toRightDisjunction for now.
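For comparison, the standard library offers the same kind of conversion for Either through Option’s `toRight` method. The sketch below uses it with a hypothetical stand-in parser, `parseIntOpt`, which reads an integer instead of JSON so that the example has no external dependencies.

```scala
import scala.util.Try

// Hypothetical stand-in for `parseOpt`: returns None instead of throwing.
def parseIntOpt(contents: Array[Byte]): Option[Int] =
  Try(new String(contents, "UTF-8").trim.toInt).toOption

// `toRight` turns Some(x) into Right(x) and None into Left(<the argument>).
def extractInt(contents: Array[Byte]): Either[String, Int] =
  if (contents.isEmpty) Left("Nothing to parse.")
  else parseIntOpt(contents).toRight("Invalid syntax.")
```

As with .toRightDisjunction, the argument is the error value to use when the Option is empty.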
Comprehensive value extraction¶
We did not use any for comprehensions in extractJson, though, so why did we bother to use \/ at all? Remember that creating the JSON object is only the first part of our task. The next part is to extract the necessary metrics from the created JSON object. At this point it is still possible to have a valid JSON object that does not contain our expected metrics.
Let’s assume that our expected JSON is simple:
{
"nSnps": 100,
"nReads": 10000
}
There are only two values we expect, nSnps and nReads. Using Json4s, extracting these values would look something like this:
// `json` is our parsed JSON
val nSnps: Int = (json \ "nSnps").extract[Int]
val nReads: Int = (json \ "nReads").extract[Int]
We can also use .extractOpt to extract the values into an Option type:
// By doing `.extractOpt[Int]`, not only do we expect
// `nSnps` to be present, but we also check that it is
// parseable into an `Int`.
val nSnps: Option[Int] = (json \ "nSnps").extractOpt[Int]
val nReads: Option[Int] = (json \ "nReads").extractOpt[Int]
Now let’s put them together in a single function. We’ll also create a case class to contain the results in a single object. Since we are doing two extractions, it’s a good idea to use the disjunction type instead of Option so that we can see which error occurred, if any.
...
case class Metrics(nSnps: Int, nReads: Int)
def extractMetrics(json: JValue): Perhaps[Metrics] = for {
  nSnps <- (json \ "nSnps")
    .extractOpt[Int]
    .toRightDisjunction("nSnps not found.")
  nReads <- (json \ "nReads")
    .extractOpt[Int]
    .toRightDisjunction("nReads not found.")
  metrics = Metrics(nSnps, nReads)
} yield metrics
Both extraction steps now combine nicely in one for comprehension. The code is concise and we can still immediately see that both nSnps and nReads must be present in the parsed JSON object. If either of them is not present, the corresponding error message is returned.
What’s even nicer is that extractMetrics composes well with our previous extractJson. We can now write one function that does both:
...
def processUpload(contents: Array[Byte]): Perhaps[Metrics] = for {
  json <- extractJson(contents)
  metrics <- extractMetrics(json)
} yield metrics
That’s it. Our processUpload function extracts JSON from a byte array and then extracts the expected metrics from the JSON object. If an error occurs within any of these steps, we get the corresponding error message. If we ever want to add additional steps afterwards (maybe checking whether the uploaded metrics are already in a database), we can simply add another line in the for comprehension, so long as our function call returns a Perhaps type.
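The sketch below shows what adding such a step could look like. To stay self-contained it makes several assumptions: `Perhaps` is aliased to the standard library’s Either (right-biased from Scala 2.12) rather than scalaz’s \/, the upload is a list of "key=value" lines instead of JSON, and `extractFields`, `field`, and `checkNotStored` are hypothetical helpers.

```scala
import scala.util.Try

// Standard-library stand-in for `String \/ T`.
type Perhaps[+T] = Either[String, T]

final case class Metrics(nSnps: Int, nReads: Int)

// Stand-in for `extractJson`: parses "key=value" lines instead of JSON.
def extractFields(contents: Array[Byte]): Perhaps[Map[String, String]] =
  if (contents.isEmpty) Left("Nothing to parse.")
  else {
    val pairs = new String(contents, "UTF-8").split("\n").toList.map(_.split("=", 2))
    if (pairs.forall(_.length == 2)) Right(pairs.map(p => p(0).trim -> p(1).trim).toMap)
    else Left("Invalid syntax.")
  }

// Looks up one field and checks that it is an integer.
def field(fields: Map[String, String], name: String): Perhaps[Int] =
  fields.get(name).flatMap(v => Try(v.toInt).toOption).toRight(s"$name not found.")

def extractMetrics(fields: Map[String, String]): Perhaps[Metrics] = for {
  nSnps  <- field(fields, "nSnps")
  nReads <- field(fields, "nReads")
} yield Metrics(nSnps, nReads)

// Hypothetical extra step: reject metrics that were already stored.
def checkNotStored(metrics: Metrics, stored: Set[Metrics]): Perhaps[Metrics] =
  if (stored.contains(metrics)) Left("Metrics already uploaded.") else Right(metrics)

// The new step is just one more line in the for comprehension.
def processUpload(contents: Array[Byte], stored: Set[Metrics]): Perhaps[Metrics] = for {
  fields  <- extractFields(contents)
  metrics <- extractMetrics(fields)
  fresh   <- checkNotStored(metrics, stored)
} yield fresh

val upload: Array[Byte] = "nSnps=100\nnReads=10000".getBytes("UTF-8")
```

Because every step returns the same wrapper type, the existing lines of `processUpload` do not change when the extra check is added.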
Sentinel’s Error Type¶
While String is a useful error type in some cases, it is not the most suitable error type for ours. Consider a case where our uploaded JSON contains neither nSnps nor nReads. In that case, the user would first get an error message saying ‘nSnps not found.’. Assuming they fix the JSON by only adding nSnps, they would then get another error on the second attempt, saying ‘nReads not found.’. This should have been displayed on the first upload, since the error was already present then.
This approach of failing on the first error we see (often called failing fast) is therefore not quite suitable for our extractMetrics function. Another approach, where we accumulate the errors first (failing slow) before displaying them, seems more appropriate. To do so, we need to tweak our error type to be a List[String] instead of the current String. We can then add error messages to the list and eventually return it to the user.
This applies only to extractMetrics, though. We would still like to fail fast in extractJson, as the two errors we expect to encounter there cannot occur simultaneously. If the JSON file is empty, it cannot contain any syntax errors, and vice versa.
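A minimal sketch of the failing-slow idea, using only the standard library: every check is run first, and the messages of all failed checks are collected into one list. The `required` and `extractBoth` helpers are hypothetical, and a plain Map stands in for the parsed JSON.

```scala
// Each check yields either an error message (Left) or the value (Right).
def required(fields: Map[String, Int], name: String): Either[String, Int] =
  fields.get(name).toRight(s"$name not found.")

// Failing slow: run every check, then gather all Left messages.
def extractBoth(fields: Map[String, Int]): Either[List[String], (Int, Int)] = {
  val nSnps = required(fields, "nSnps")
  val nReads = required(fields, "nReads")
  (nSnps, nReads) match {
    case (Right(s), Right(r)) => Right((s, r))
    case _ => Left(List(nSnps, nReads).collect { case Left(msg) => msg })
  }
}
```

With an empty input both messages are reported at once, so the uploader can fix everything in a single round trip.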
Sentinel reconciles this by having a custom type for its error type, called ApiPayload. It is a case class that contains both a String and a List[String]. The ApiPayload type is also associated with specific HTTP status codes. This is because the error messages that Sentinel displays must be sent via HTTP and thus must be associated with a specific code.
Its simplified signature is:
// `httpFunc` defaults to a function
// that returns HTTP 500
sealed case class ApiPayload(
  message: String,
  hints: List[String],
  httpFunc: ApiPayload => ActionResult)
The idea here is that we always have a single error message that we want to display to users (the message attribute). Accumulated errors can be grouped in hints, if there are any. We also associate a specific error message with a specific HTTP error code in one place.
Note
Being based on the Scalatra framework, Sentinel uses Scalatra’s ActionResult to denote HTTP actions. Scalatra already associates the canonical HTTP status message with the error code (for example, InternalServerError has the 500 code). Check out the Scalatra documentation if you need more background on ActionResult.
Additionally, ApiPayload objects are transformed into plain JSON that is then sent back to the user. The JSON representation displays only message and hints, since httpFunc is only for internal Sentinel use.
An example of an ApiPayload would look something like this:
// `BadRequest` is Scalatra's function
// that evaluates to HTTP 400.
val JsonInputError = ApiPayload(
  message = "JSON input can not be parsed.",
  hints = List("Input is empty."),
  httpFunc = (ap) => BadRequest(ap))
It can get a bit tedious, as you can see. Fortunately, some HTTP error messages occur more frequently than others, so Sentinel already creates some predefined ApiPayload objects that you can use. They are all defined in nl.lumc.sasc.sentinel.models.Payloads.
In our case, we can use JsonValidationError. It is always associated with HTTP 400 and its message attribute is hard-coded to “JSON is invalid.”. We only need to supply the hints inside a List[String]. Moreover, our disjunction type ApiPayload \/ T is also already defined by Sentinel in nl.lumc.sasc.sentinel.models.Perhaps, so we can use that.
Let’s now update our functions to use ApiPayload (along with some style updates). We will also show in full everything we have written so far:
// We import a mutable list for collecting our errors
import collection.mutable.ListBuffer
import java.io.ByteArrayInputStream
import org.json4s.JValue
import org.json4s.jackson.JsonMethods.parseOpt
import scalaz._, Scalaz._
import nl.lumc.sasc.sentinel.models.{ Payloads, Perhaps }, Payloads._
implicit val formats = org.json4s.DefaultFormats
case class Metrics(nSnps: Int, nReads: Int)
// Our change here is mostly to replace
// `String` with `ApiPayload`.
def extractJson(contents: Array[Byte]): Perhaps[JValue] =
  if (contents.isEmpty) {
    val hints = JsonValidationError("Nothing to parse.")
    -\/(hints)
  } else {
    val stream = new ByteArrayInputStream(contents)
    val hints = JsonValidationError("Invalid syntax.")
    // Parse the stream we just created
    parseOpt(stream).toRightDisjunction(hints)
  }
// This is where most of our changes happen
def extractMetrics(json: JValue): Perhaps[Metrics] = {
  val maybe1 = (json \ "nSnps").extractOpt[Int]
  val maybe2 = (json \ "nReads").extractOpt[Int]
  (maybe1, maybe2) match {
    // When both values are defined, we can create
    // our desired return type. Remember we need
    // to wrap it inside `\/` still.
    case (Some(nSnps), Some(nReads)) =>
      \/-(Metrics(nSnps, nReads))
    // Otherwise we further check on what's missing
    case otherwise =>
      val errors: ListBuffer[String] = ListBuffer()
      // `+=` appends in place; `:+` would build a new
      // collection and leave `errors` empty
      if (!maybe1.isDefined) errors += "nSnps not found."
      if (!maybe2.isDefined) errors += "nReads not found."
      -\/(JsonValidationError(errors.toList))
  }
}
// This function remains the same.
def processUpload(contents: Array[Byte]): Perhaps[Metrics] = for {
  json <- extractJson(contents)
  metrics <- extractMetrics(json)
} yield metrics
And there we have it. Notice that even though we fiddled with the internals of extractJson and extractMetrics, our processUpload function stays the same. This is one of the biggest wins of keeping our API stable. Our functions all follow the pattern of accepting concrete values and returning them wrapped in Perhaps. This is all intentional, so that we can keep processUpload clean and extendable.
Fitting the JSON Schema in¶
Our extractMetrics function looks good now, but notice that it is already quite verbose even for a small JSON. This is why we recommend that you define JSON schemas for your expected summary files. Sentinel can then validate based on that schema, accumulating all the errors it sees.
The Sentinel validation function is called validateJson, which has the following signature:
def validateJson(json: JValue): Perhaps[JValue]
You can see that it expects a parsed JSON object as its input. This means that we need to create a JSON object first before we validate it. To this end, Sentinel also provides an extractJson function. Its signature is the same as the extractJson function you have been writing. We can then combine extraction and validation together in one function like so:
def extractAndValidateJson(contents: Array[Byte]): Perhaps[JValue] =
  for {
    json <- extractJson(contents)
    validatedJson <- validateJson(json)
  } yield validatedJson
Sentinel provides extractAndValidateJson as well. In fact, that is also how Sentinel composes JSON extraction and JSON validation internally: using a single for comprehension.
Next Steps¶
We hope we have convinced you that encoding errors as the return type instead of throwing exceptions can make our code cleaner and more composable. In the next section, Asynchronous Processing, we will be combining our Perhaps type with Scala’s Future so that we can process data asynchronously.