Unconditional Code for Mapping of Protobuf in Golang

Unconditional code is code that just works, without errors or exceptions, without funny edge cases represented with tedious IF statements. In this article I will share how this can be applied to situations in Golang where you need to map incoming protobuf messages to internal or external (i.e. outgoing) structures.

A Word about Golang

Golang is particularly sensitive to “ify” code due to the existence of nil and the decision of having errors as values returned by functions. If you have a function like:

func Foo() (*Result, error) {
  ...
}

func UseFoo() error {
  res, err := Foo()
  if err != nil {
    // handle, wrap or at least return `err`
  }
  if res == nil {
    // handle nil
  }
  // Do something with *res
}

You are forced by the author of Foo to handle both an error and a nil case.

If you are dealing with data intensive applications, you might be in a situation where you want to map some upstream/incoming data to some internal representations for further processing.

Example

I will use eCommerce as an example. Imagine we have a system composed of several services communicating over gRPC. We have a Checkout service that calls Document service, passing an order.Order object. The order contains all necessary information to issue a purchase document, and the Document service needs to map the upstream data to its internal model.Document struct to render it and send it to the customer.

Here is a segment that deals with the user part of the order.Order:

func FromUser(u *user.User) (*model.Customer, error) {
  if u == nil {
    return nil, nil
  }
  var customer model.Customer
  customer.Email = u.Email
  if u.FirstName == "" && u.LastName == "" {
    return nil, fmt.Errorf("customer must have first or last name")
  }
  customer.Name = strings.TrimSpace(strings.Join([]string{u.FirstName, u.LastName}, " "))
  if u.BillingAddress == nil {
    return nil, fmt.Errorf("customer must have billing address")
  }
  customer.Country = u.BillingAddress.Country
  customer.City = u.BillingAddress.City
  return &customer, nil
}

There are several problems with this code, the most obvious being it is difficult to read. It might be convenient to put the blame on the language. However, in my experience programming languages have rarely been the cause of trouble.

I see the the following issues with this code:

  • It does mapping and validation at the same time.
  • Calling function will have to deal with both the error and the nil case.
  • It is really hard to read.
  • It is hard to test – we always need to provide a complete user.User object and assert the entire mapped structure, vs having a table test case which asserts the mapping of name.

The last problem can easily explode throughout the codebase. Consider a FromOrder function that maps the deep structure of order.Order to an internal Document one:

func FromOrder(order *order.Order) (*model.Document, error) {
  if order == nil {
    return nil, fmt.Errorf("order is nil")
  }
  if order.User == nil {
    return nil, fmt.Errorf("user is missing")
  }
  customer, err = FromUser(order.User)
  if err != nil {
    // ...
  }
  // ...
}

func TestFromOrder(t *testing.T) {
  testCases := []struct {
    desc string
    in   *order.Order
    want model.Document
  }{
    {
      desc: "should create document",
      in:   &order.Order{
        // ...
      },
      want: &model.Document{
        // ...
      },
    },
  }
  for _, tC := range testCases {
    t.Run(tC.desc, func(t *testing.T) {
      // ...
    })
  }
}

We have three options to make the test works:

  • Initialize the complete order.Order and user.User objects to get passed the mapper validations. This will make our tests long and hard to maintain – any changes to the smaller mappers (e.g. FromUser) will break the TestFromOrder.
  • Use indirection – hide the mapping of different parts behind interfaces that can be mocked. This will create clatter in our mapper package and we will take a performance hit.
  • Not test the FromOrder function.

Luckily, there is a forth option – unconditional code.

Unconditional Go

We will take advantage of two very nice features of Golang and protobuf:

  • nil receivers. I fell in love with the idea of sending an empty message to an object from Objective-C and I was thrilled to see this in Golang. Since the methods in Golang are functions with receivers (which can be considered just another parameter), nothing prevents us from having the receiver parameter nil when it is a pointer. Probobuf code generation takes full advantage of this fact. It is quite simple actually (as Golang is always aiming for simplicity):
func (s *Struct) GetData() string {
  if s == nil {
    return ""
  }
  return s.Data
}

var ptr *Struct // nil
fmt.Println(ptr.GetData()) // this works fine
  • Range over nil. Iterating over an empty or nil slice would render the same result – the body of the loop will not get executed at all. In Golang, you don’t need to explicitly check for nil in those situations. Moreover, I can append to a nil slice.
func get() []string {
  return nil
}

func main() {
  for _, s := range get() {
    fmt.Println("never", s)
  }
  var slice []string // nil
  slice = append(slice, "first element")
}

Armed with those tools, we can now apply unconditional code to our mapping, i.e. make the code always work, without explicit checks, errors, nil returned values or panics:

func FromOrder(order *order.Order) model.Document {
  return model.Document{
    Customer:  FromUser(order.GetUser()),
    LineItems: FromItems(order.GetItems()),
  }
}
func FromItems(items []*order.Item) []model.LineItem {
  var lineItems []model.LineItem
  for _, item := range items {
    lineItems = append(lineItems, FromItem(item))
  }
  return lineItems
}
func FromItem(item *order.Item) model.LineItem {
  price := decimal.New(item.GetProduct().GetPriceE2(), -2)
  qty := decimal.New(item.GetQuantity(), 0)
  return model.LineItem{
    Description: item.GetProduct().GetDescription(),
    Total:       price.Mul(qty),
  }
}

func FromUser(u *user.User) model.Customer {
  return model.Customer{
    Name:    CombineNames(u.GetFirstName(), u.GetLastName()),
    Email:   u.GetEmail(),
    Country: u.GetBillingAddress().GetCountry(),
    City:    u.GetBillingAddress().GetCity(),
  }
}
func CombineNames(names ...string) string {
  return strings.TrimSpace(
    strings.Join(names, " "))
}

Now the code is much cleaner (and shorter), as it is doing one thing – mapping. It is also much easier to read, test and faster to write. Notice how the lack of error return value liberates us from all the clatter of Golang’s idiomatic error handling.

Invariants

So far so good, but what can we do about the validation rules that used to be embedded in the mapper? We can’t just delete them, right?

An invariant is a property of an object that is always true. In our example, we might say that an order.Item always has an order.Product inside, or that a model.LineItem always has a description.

The mapper is a bridge between two models and usually both models would have some invariants. Putting the invariants logic in that bridge might seem like a good idea, but is it really?

Let’s look at a segment of the original FromUser example:

if u.BillingAddress == nil {
  return nil, fmt.Errorf("customer must have billing address")
}
customer.Country = u.BillingAddress.Country
customer.City = u.BillingAddress.City

What is the invariant here? Is it that all order.User always have billing address or that a model.Customer will always have country and city? If it is the second one, it is very weak invariant, because both Country and City can be empty strings, so by checking for nil we are enforcing a technical (i.e. internal), not domain invariant.

The only advantage of having the validation logic in the mapper is that we can see why we cannot satisfy the output model’s invariants, e.g. “Line item should always have description, but it needs to come from the product description, which is missing”. Wow, this is complex in English, how can it be simple in Golang?

I think that a much better option is to use either pre checks or post checks.

Pre Checks

I prefer keeping the mapping code clean and move all assumptions about the input data to one place.

In our example we have the Document service accepting an order.Order over gRPC, we have two options, the first one being to add a validator to the handler accepting the gRPC request (aka pre check):

func (h *OrderHandler) handle(ctx context.Context, order *order.Order) error {
  err := h.validator.Validate(order)
  if err != nil {
    return fmt.Errorf("cannot handle invalid order: %w", err)
  }

  err = h.controller.CreateDocument(order)
  if err != nil {
    return fmt.Errorf("cannot create document from order: %w", err)
  }

  return nil
}

This is a good approach if the Document service is having specific, non-universal requirements about “orders that can have a purchase document”.

For example, the business might dictate that people receiving the purchased goods is more important than receiving a purchase document and the Delivery service in our system might have more relaxed view on the order.Order data.

However, in many other circumstances the order.Order invariants must be owned by the Checkout service. In those situations the Document or Delivery services shouldn’t make further assumptions, but rely on the invariants guaranteed by the upstream.

Which brings us to the next topic – post checks.

Post Checks

Now that each service is responsible to contribute the data along with the consistency (i.e. invariants) associated with it, we can have a validation at the end of the pipeline.

This means that the Document service should guarantee that all line items have description:

func (c controller) CreateDocument(o *order.Order) error {
  doc := mapper.FromOrder(o)
  err := c.validator.Validate(doc)
  if err != nil {
    return fmt.Errorf("cannot issue invalid document: %w", err)
  }
  // ...
  return nil
}

You can imagine a similar piece of code in the Checkout service in regards to the order.Order structs.

The post check guarantees that we will not issue a “bad” document because of “bad” upstream data and provokes the right question. Instead of asking “why do we see products without description” we will be asking “how should we issue a document for products without description”. Maybe we can have a default description?

Outcome

One might argue that having the Document service relying on the invariants of the Checkout service is coupling.

I would argue the opposite. Having pre-checks in the document service or having it embedded in the mapping is creating coupling of the worst kind. The Checkout service might accidentally break the Document service. Yes, the Document service will break with a mapping error, but it would still be broken, wouldn’t it?.

Now imagine the Checkout service is publishing orders to a message broker (e.g. Kafka) and there are several other products relying on that stream of events. We have a situation where the Checkout team can break any of those downstream services at any time and the only way to understand if a change is breaking is to go through the code of each one. The team owning the Checkout service has a choice – be careful and slow or be faster and break things.

I believe sometimes it is better to forget about microservices and tap into the decades of experience in the industry. If I have a monolith with components for orders, documents and delivery, would I repeat the invariants of the orders in each dependency or try to minimize duplication?

Regardless whether you agree with the statement above or not, the Document team must make a decision about checking incoming orders. Having this decision spread through the mapping code and potentially other parts of the service would make revising it later very, very expensive.

Conclusion

We make decisions about our systems every day and there is no formula on how to always make the right one. Therefore, we as engineers must make sure that our decisions are as cheap as possible and we can easily revise them in the future.

Unconditional code is one way to get there.

References