Rowanto Luo

Just another blog. or log.

Event Driven Architecture

Why do we need it?

In the early stages of development, an event driven architecture is not needed: everything is still small and fits in one place. As the system grows, though, keeping everything in one place (a.k.a. a monolith) does not scale well. The logic of the whole system also keeps growing in complexity, and eventually we need to split the giant system into domains.

Let's take an online shop as an example. In the early stages, all the logic can live inside the online shop itself. For simplicity, assume there is only one extra piece of logic after a user buys something: we check what they just bought and what they bought before, and if a certain condition is met, we send them a voucher. When the system gets big enough and we want to split it up, it is very tempting to do it like this.

1. direct http call

There are some problems with this. First, it doesn't scale well. Second, it doesn't decouple the logic into separate domains. Imagine what happens when the system gets more complex.

2. direct http call on scale

Now, when a purchase is made in the online shop, it has to make three HTTP calls:

  1. Tell the campaign service about it so that a voucher can be sent when the condition is met
  2. Tell the user service that this user bought product x at time y
  3. Tell the billing system to send a bill to the user

On the performance level, this obviously doesn't scale very well, and on the domain separation level, we are mixing logic everywhere. Does the online shop, which shows the products and lets the user click the buy button, have to know every other service that depends on that action so that it can tell them about it? The answer is an obvious no. Each domain is responsible for its own area. The billing system should not know when to send a voucher. The online shop should not know that the next button click will trigger a voucher to the user's inbox.

We need to decouple the system and logic in a better way.

3. event stream

In this setup, the online shop just does its job, which is displaying products and letting the user click on buy, and then sends an event to a stream. Any client that is interested just listens to the stream. This decouples the system and lets it scale.
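To make the idea concrete, here is a minimal sketch in Python. The `EventStream` class is a hypothetical in-memory stand-in, not a real streaming product, and the topic name and event fields are made up for illustration. It only shows the shape of the decoupling: the shop publishes once, and each service registers itself.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

Event = Dict[str, object]
Handler = Callable[[Event], None]


class EventStream:
    """In-memory stand-in for a real event stream (hypothetical)."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._handlers[topic].append(handler)

    def publish(self, topic: str, event: Event) -> None:
        # The producer has no idea who is listening, or how many.
        for handler in self._handlers[topic]:
            handler(event)


stream = EventStream()
vouchers_checked: List[Event] = []
purchase_history: List[Event] = []
bills_sent: List[Event] = []

# Each service subscribes on its own; the shop code never changes.
stream.subscribe("purchase", vouchers_checked.append)  # campaign service
stream.subscribe("purchase", purchase_history.append)  # user service
stream.subscribe("purchase", bills_sent.append)        # billing system

# The online shop only reports what happened.
stream.publish("purchase", {"user_id": 7, "product_id": 42})
```

Adding a fourth interested service is one more `subscribe` call on the consumer side; the shop is not touched.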

Do it right, do it well

Event driven architecture is actually well known to a lot of people. While writing this post, I wondered whether I was just wasting my time describing what it is. Unfortunately, some people have tried to use event driven architecture in very weird ways, which led to some very obvious problems.

Just having a client, a producer, and an event stream doesn't automatically fix all your problems; we also have to do it properly. Event driven architecture is not a silver bullet either, and there are cases where we should not use it. If we are going to do event driven architecture, there are some properties we need to fulfill or pay attention to: the so-called best practices.

1. Fire and Forget

One very important aspect of event driven architecture is what we send in the event. An event should never be used as an orchestration tool, that is, as a way to send commands to other services. If we do that, there is a very high chance that we are mixing logic between two different domains.

If the two services are in the same domain, they probably should not be communicating via events in the first place; consider a direct REST call or RPC instead. The event producer should never have to care about who consumes its events or how. For every single event it sends, the producer only has to make sure that it arrives in the event bus.
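The difference between a command and an event is easier to see side by side. The sketch below is only an illustration: the class names are hypothetical, and the "third purchase" rule is an invented stand-in for whatever condition the campaign service really uses.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SendVoucherCommand:
    """Anti-pattern: the shop tells the campaign service what to do."""
    user_id: int


@dataclass(frozen=True)
class PurchaseCompleted:
    """A fact about something that happened. No instruction to anyone."""
    user_id: int
    product_id: int


def should_send_voucher(
    event: PurchaseCompleted, earlier: List[PurchaseCompleted]
) -> bool:
    # Hypothetical campaign rule, owned by the campaign service alone:
    # reward a user's third purchase.
    same_user = [e for e in earlier if e.user_id == event.user_id]
    return len(same_user) == 2


earlier = [
    PurchaseCompleted(user_id=7, product_id=1),
    PurchaseCompleted(user_id=7, product_id=2),
    PurchaseCompleted(user_id=8, product_id=1),
]
decision = should_send_voucher(PurchaseCompleted(user_id=7, product_id=3), earlier)
```

With `PurchaseCompleted`, the voucher rule can change without the shop ever knowing. With `SendVoucherCommand`, the shop would have to contain that rule itself.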

2. Self Contained Event

This should also be obvious: the message in the event should be self-contained. It should contain all the relevant information about what happened.

Think about the online shop from before. Imagine that the shop only sends an event containing the product id, the user id, and the text "BUY". Most consumers will then have to call the shop to find out what kind of product was bought. If there are 100 consumers, one event generates 100 HTTP GET calls to the service for the same product information. This easily turns into a mini DDoS attack on the producer.
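As a rough sketch, a self-contained purchase event could look like the following. All field names and values here are assumptions made up for the example, and JSON is used only because it is in the standard library. The point is that the consumer can do its work from the payload alone.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class PurchaseEvent:
    # Hypothetical fields: everything a consumer is likely to need.
    event_id: str
    user_id: int
    product_id: int
    product_name: str
    unit_price_cents: int
    quantity: int
    purchased_at: str  # ISO 8601 timestamp


event = PurchaseEvent(
    event_id="evt-0001",
    user_id=7,
    product_id=42,
    product_name="Mechanical keyboard",
    unit_price_cents=1999,
    quantity=2,
    purchased_at="2024-01-01T12:00:00Z",
)

# Producer side: serialize once and put it on the stream.
payload = json.dumps(asdict(event))

# Consumer side (e.g. billing): no call back to the shop is needed.
received = json.loads(payload)
total_cents = received["unit_price_cents"] * received["quantity"]
```

Whether to embed the full product details or just the fields most consumers need is a judgment call; the sketch errs on the side of including more.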

3. Idempotent Consumer

A very reliable system usually guarantees at-least-once delivery. This basically means that the same message can be delivered, and therefore processed, more than once. Because of this, it is very important that all consumers are idempotent: if the exact same event is processed twice, nothing should happen the second time.

This approach is much better than having non-idempotent consumers and guaranteeing at-most-once delivery instead. At-most-once delivery also means that a message might not arrive at all. For tasks that really need that behaviour, we should not use an event stream; something like a queue is a better fit.

Making the consumers idempotent also enables the producer to replay all events in case of a major outage. This allows easy recovery of all the dependent systems.
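One common way to get idempotency is to remember which event ids have already been handled. The sketch below assumes every event carries a unique `event_id` (an assumption, not something every stream gives you for free), and keeps the seen ids in memory. A real consumer would need durable storage and would have to record the id together with the side effect.

```python
from typing import Any, Dict, List, Set


class VoucherConsumer:
    """Hypothetical consumer that sends at most one voucher per event."""

    def __init__(self) -> None:
        self._seen: Set[str] = set()
        self.vouchers_sent: List[int] = []

    def handle(self, event: Dict[str, Any]) -> None:
        event_id = event["event_id"]
        if event_id in self._seen:
            return  # duplicate delivery or replay: do nothing
        self.vouchers_sent.append(event["user_id"])
        self._seen.add(event_id)


consumer = VoucherConsumer()
log = [
    {"event_id": "evt-1", "user_id": 7},
    {"event_id": "evt-2", "user_id": 8},
]

for e in log:
    consumer.handle(e)

# At-least-once delivery: evt-1 shows up again.
consumer.handle(log[0])

# Full replay after an outage: still no extra vouchers.
for e in log:
    consumer.handle(e)
```

After the duplicate and the full replay, `consumer.vouchers_sent` still holds exactly one entry per original event.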

4. Schema Evolution

This is just a cooler term for guaranteeing backwards compatibility. Since one event from the producer can be consumed by any number of consumers, the event format should always stay backwards compatible. When a field name changes, the clients should ideally not need to do anything.

There are already plenty of libraries for this: Protobuf, Thrift, Avro, etc.
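Those libraries enforce compatibility through their own schema rules. Just to show the underlying idea without any of them, here is a hand-rolled sketch with plain dictionaries: the consumer ignores fields it does not know and falls back to a default for fields that older events lack. The field names, the `currency` field, and its default are all invented for the example.

```python
from typing import Any, Dict


def read_purchase(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Tolerant reader: works for both old and new event versions."""
    return {
        "user_id": raw["user_id"],
        "product_id": raw["product_id"],
        # Hypothetical field added in v2; v1 events get a default.
        "currency": raw.get("currency", "EUR"),
        # Anything else in the payload is simply ignored.
    }


v1_event = {"user_id": 7, "product_id": 42}
v2_event = {
    "user_id": 7,
    "product_id": 42,
    "currency": "USD",
    "coupon_code": "SPRING",  # unknown to this consumer
}

old = read_purchase(v1_event)
new = read_purchase(v2_event)
```

The same consumer code handles both versions, so the producer can roll out the new format without coordinating with every client.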

Queue vs Event Stream

This is just a clarification. Many people seem to confuse queues with streams. A queue is not a message stream, and a message stream is not a queue.

A queue is something like SQS, RabbitMQ, or a JMS broker such as ActiveMQ. One message is only meant to be consumed by one consumer. If we want to send the same message to multiple consumers, it has to be replicated somehow (by the producer or by the queue itself). A queue is more suitable for offline processing where all messages have to be processed, one way or another, by the same type of consumer. Usually the producer knows the consumer of the queue, and can even pre-filter the messages that go into it.

An event stream is something like Kafka or Kinesis. One message can be consumed by an unlimited number of consumers without replication. The producer usually just writes every event into the stream, and each consumer decides which ones it has to process.
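The difference can be sketched with two tiny in-memory models. Neither is how SQS or Kafka is actually implemented; the names and the offset bookkeeping are simplifications for illustration. In the queue, taking a message removes it. In the stream, the log stays put and every consumer tracks its own position.

```python
from collections import deque
from typing import Deque, Dict, List

# Queue: each message goes to exactly one worker.
queue: Deque[str] = deque(["m1", "m2", "m3"])
worker_a = [queue.popleft()]
worker_b = [queue.popleft(), queue.popleft()]

# Stream: an append-only log plus one read offset per consumer.
log: List[str] = ["m1", "m2", "m3"]
offsets: Dict[str, int] = {"campaign": 0, "billing": 0}


def poll(consumer: str) -> List[str]:
    """Return everything this consumer has not read yet."""
    start = offsets[consumer]
    batch = log[start:]
    offsets[consumer] = len(log)
    return batch


campaign_batch = poll("campaign")
billing_batch = poll("billing")
```

Here the two workers split the queue's messages between them, while both stream consumers receive all three messages without anything being copied.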

The event driven architecture covered in this post is about the event stream.