Bursty producer, Laggy consumer

A concurrency exercise in Go

This is a problem I've encountered in programming more than just a few times. Recently I was faced with it again and I decided to look at it with fresh eyes and tackle it with renewed vigor.

The Problem

The problem is the Producer-Consumer problem. It's well understood and you will find implementations of it in most software systems. But the problem I was trying to address is not that of ensuring mutually exclusive access to a shared buffer. Go, which is my weapon of choice in this particular instance, provides some very simple and powerful concurrency tools out of the box, most notably its CSP approach to concurrency through the use of channels. You can also make use of mutexes or semaphores, but this is a perfect example of where channels and synchronised goroutines shine.

In short

The essence of the Producer-Consumer problem is that you have two processes, one that produces messages, and another that consumes messages. The messages are delivered from the producer to the consumer in a FIFO manner over a shared buffer of sorts. The challenge is to:

  1. add messages to the buffer only if the buffer is not full,
  2. remove messages from the buffer only if it is non-empty, and
  3. maintain mutually exclusive access to the buffer so that it is never modified by two processes simultaneously.

What is new for me?

As I mentioned before, Go takes care of the third point quite nicely. Even the first two come pretty much for free, given the way Go writes to and reads from buffered channels.
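
To make that concrete, here is a minimal sketch of those semantics (the channel name and capacity are just illustrative): a send on a buffered channel blocks only once the buffer is full, a receive blocks only while it is empty, and messages come out in FIFO order.

package main

import "fmt"

func main() {
    // A buffered channel with capacity 2: sends block only once it is full,
    // and receives block only while it is empty.
    bufferChan := make(chan string, 2)

    bufferChan <- "first"  // returns immediately, the buffer has room
    bufferChan <- "second" // returns immediately, the buffer is now full
    // A third send here would block until the consumer drains a message.

    fmt.Println(<-bufferChan) // "first": messages come out in FIFO order
    fmt.Println(<-bufferChan) // "second"
    // A further receive here would block until a producer sends again.
}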

The problem I was trying to solve is how to choose a meaningful size for the buffer. You don't want the buffer to be too big, as that might result in the application chomping up more resources than you can afford. You don't want to make it too small either, as that might result in hold-ups on the producer's side more frequently than your SLA allows.

In my case, as is probably true in 90% of Producer-Consumer cases, the producer is less predictable than the consumer. The producer might dump a large number of messages on the buffer and then be quiet again for a period of time. The consumer on the other hand processes messages at a fairly constant rate, albeit slower than the producer can generate messages when it's running at full tilt. What you're relying on is that, over longer time frames, the consumer will outperform the producer, even though the producer might outperform the consumer over short time frames.
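As a rough sketch of that shape (the buffer size, burst sizes and timings below are arbitrary illustrations, not numbers from any real workload), a bursty producer and a steady consumer sharing a buffered channel might look something like this:

package main

import (
    "fmt"
    "time"
)

func main() {
    bufferChan := make(chan int, 8) // deliberately small, illustrative buffer
    done := make(chan struct{})

    // Producer: dumps a burst of messages, then goes quiet for a while.
    go func() {
        for burst := 0; burst < 3; burst++ {
            for i := 0; i < 10; i++ {
                bufferChan <- burst*10 + i // blocks whenever the buffer is full
            }
            time.Sleep(500 * time.Millisecond) // quiet period between bursts
        }
        close(bufferChan)
    }()

    // Consumer: processes messages at a slower but constant rate.
    go func() {
        for msg := range bufferChan {
            time.Sleep(20 * time.Millisecond) // simulated steady processing time
            fmt.Println("consumed", msg)
        }
        close(done)
    }()

    <-done
}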

The approach I followed was for the producer to always attempt a non-blocking write to the channel first and, if the channel is full, to increment a counter and then perform a blocking write. Of course, in Go, a simple bufferChan <- message will be non-blocking if the channel is not full, and blocking if it is, but my approach can be achieved with the following code:

select {
case bufferChan <- message:
    // The buffer had room: the write succeeded without blocking.
default:
    // The buffer is full: record it, then fall back to a blocking write.
    reachedCapacity++
    bufferChan <- message
}

The idea behind all of this is that you now have visibility into how many times, or how often, the buffer reached capacity. The variable that is incremented every time the channel write is blocking can be exposed as an expvar or fed to your metrics monitoring tool of choice (statsd, Prometheus, Datadog, etc.).
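
A minimal sketch of the expvar route might look like the following. The variable name, buffer size and HTTP address are all illustrative, and expvar.Int has the nice property of being safe for concurrent use if you ever have more than one producer goroutine:

package main

import (
    "expvar"
    "log"
    "net/http"
)

// reachedCapacity is incremented every time a write to the buffer would block.
// expvar.Int is safe for concurrent use, so several producers can share it.
var reachedCapacity = expvar.NewInt("bufferReachedCapacity")

func produce(bufferChan chan<- string, message string) {
    select {
    case bufferChan <- message:
    default:
        reachedCapacity.Add(1) // the buffer was full; record it
        bufferChan <- message  // fall back to a blocking write
    }
}

func main() {
    bufferChan := make(chan string, 64)

    // A stand-in consumer that simply drains the buffer.
    go func() {
        for range bufferChan {
        }
    }()

    produce(bufferChan, "hello")

    // expvar registers itself on the default mux, so the counter is
    // visible at http://localhost:8080/debug/vars.
    log.Fatal(http.ListenAndServe(":8080", nil))
}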

That way you can make an informed decision on whether it is necessary to increase your buffer size.

A repository with a sample implementation of this model, illustrating a producer outperforming the consumer and the consumer then gradually catching up, can be found on my GitHub.