An overview of message brokers

Mar 9, 2022

Introduction

Message brokers have gained a lot of popularity over the past few years, and several companies have been increasing their efforts to offer a message queuing solution and get a piece of the cake. But what are message brokers and message queues? In this article I’ll show you what they are meant for and some of the solutions you can find out there.

How does application communication work?

First of all, before getting into the details and understanding why message brokers exist, we need to know how different applications can communicate and exchange information.

Let’s assume that we want App #1 to send data to App #2. Here we must use a connection strategy, which can be a TCP connection, a regular RPC call, etc.

But this leads to several limitations. For example, what happens if the network slows down? What happens to the data if App #2 goes down? And what if we have different products that do not speak the same language, which makes those solutions hard to integrate? A lot of questions come up with this old way of communicating.

Figure #1. Regular application communication

Message Broker

The message broker helps us avoid the issues raised by the questions above. It acts as a middleman: it receives events from a source and forwards them to the destinations, and it has its own storage to save events if the destinations are not available. Some of the benefits are high availability, avoiding the “spoken language” limitation, and working properly with firewalls.

Here are some basic examples:

If App #2 goes down, the message broker will write those events or streams to its storage until App #2 is available again. This is really important, since it allows us to avoid event loss and provides High Availability (HA) for our systems.

If the connection to App #2 is slow, the message broker will not send those streams or events as fast as it could; instead it will “delay” the messages to avoid saturating the network. The same applies if App #2 is so busy that it cannot handle the load as fast as it arrives.
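The first example can be sketched in a few lines of Python. This is only a toy model, not how any particular broker is implemented: the class name `StoreAndForwardBroker`, the `destination_up` flag and the in-memory `deque` standing in for the broker's storage are all assumptions made for illustration.

```python
from collections import deque


class StoreAndForwardBroker:
    """Toy broker that buffers events while the destination is unavailable."""

    def __init__(self):
        self.storage = deque()      # stands in for the broker's own storage
        self.delivered = []         # stands in for what App #2 has received
        self.destination_up = True  # hypothetical health flag for App #2

    def publish(self, event):
        # Every event is stored first, then forwarded if possible.
        self.storage.append(event)
        self.flush()

    def flush(self):
        # Forward stored events in arrival order while App #2 is reachable.
        while self.destination_up and self.storage:
            self.delivered.append(self.storage.popleft())


broker = StoreAndForwardBroker()
broker.publish("order-1")        # App #2 is up: forwarded right away

broker.destination_up = False    # App #2 goes down
broker.publish("order-2")
broker.publish("order-3")
pending_while_down = list(broker.storage)

broker.destination_up = True     # App #2 comes back
broker.flush()
print(pending_while_down)
print(broker.delivered)
```

While App #2 is down, the two new events stay in the broker's storage; once it is back, they are forwarded in the order they arrived, so nothing is lost.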

Message Queues

Now that we understand how the message broker works, let’s take a deep dive into message queues, which the brokers rely on. A message queue follows the First In, First Out (FIFO) concept: the first message that arrives in the queue is the first one to be sent. It is just like waiting at a red traffic light on a single-lane road: all the cars form a queue, and the first car in that queue is the first one to leave.

In the message broker we do not have only one queue; we have multiple queues with different names, allowing us to receive from and send to specific applications.
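A minimal sketch of named FIFO queues could look like the following. The `NamedQueues` class and the queue names `billing` and `shipping` are hypothetical, chosen only to illustrate the idea.

```python
from collections import defaultdict, deque


class NamedQueues:
    """Toy set of FIFO queues, addressed by name."""

    def __init__(self):
        self._queues = defaultdict(deque)

    def send(self, queue_name, message):
        # New messages always go to the back of the named queue.
        self._queues[queue_name].append(message)

    def receive(self, queue_name):
        # The oldest message leaves first; None means the queue is empty.
        queue = self._queues[queue_name]
        return queue.popleft() if queue else None


queues = NamedQueues()
queues.send("billing", "invoice-1")
queues.send("billing", "invoice-2")
queues.send("shipping", "parcel-1")

first = queues.receive("billing")
second = queues.receive("billing")
other = queues.receive("shipping")
empty = queues.receive("billing")
```

Here `first` is the invoice that arrived first, and the `shipping` queue is not affected by what happens in `billing`.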

We can also have a queue that receives from several applications and forwards to several applications. When this occurs, the events are sent in the order they arrived, and we get what is called “Competing consumers”: there are several destinations sharing the event load, and each consumer takes the next available message in the queue. The message broker is able to manage this as well.
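To make the Competing consumers idea more concrete, here is a small sketch using Python's standard `queue` and `threading` modules. The thread-based consumers and the names `work_queue` and `consumer` are assumptions for illustration; a real broker would hand out messages over the network.

```python
import queue
import threading

work_queue = queue.Queue()
for n in range(10):
    work_queue.put(f"event-{n}")

processed = []                      # (consumer_name, message) pairs
processed_lock = threading.Lock()


def consumer(name):
    # Each consumer keeps taking the next available message until none are left.
    while True:
        try:
            message = work_queue.get_nowait()
        except queue.Empty:
            return
        with processed_lock:
            processed.append((name, message))
        work_queue.task_done()


workers = [
    threading.Thread(target=consumer, args=(f"consumer-{i}",))
    for i in range(3)
]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()

handled = sorted(message for _, message in processed)
```

Which consumer gets which event can change from run to run, but every event is handled exactly once, and that is the point of the pattern.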

Topics & Subscriptions

These may be the most important concepts when building distributed systems. They follow the same logic as the queues we have just discussed, but there are a few differences, and the details can also vary between message brokers.

Topics and subscriptions are needed when we want to achieve more specific goals. For example, how can we replicate events between two consumers (receiving applications)? How can we scale the queues and implement the Competing consumers approach? That is why topics and subscriptions are here!

You may be wondering what topics and subscriptions actually are. Well… A topic can be described as a way to organize messages. Each topic has a name that is unique across the entire message broker. In the diagram below we have a topic called customers. The messages or events are sent to that topic by the producers (or sending applications), and then we can have a subscription queue, which is basically a queue that copies what was received by that specific topic. That is how we achieve replication across the consumers (or receiving applications). It is important to know that topics are based on business functions and are created according to business needs.
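The customers topic described above can be sketched like this. The `Topic` class and the subscription names `crm` and `email` are hypothetical, and real brokers differ in how they implement subscriptions.

```python
import copy
from collections import deque


class Topic:
    """Toy topic that copies each message into every subscription queue."""

    def __init__(self, name):
        self.name = name
        self.subscriptions = {}   # subscription name -> its own queue

    def subscribe(self, subscription_name):
        self.subscriptions[subscription_name] = deque()
        return self.subscriptions[subscription_name]

    def publish(self, message):
        # Every subscription queue receives its own copy of the message.
        for subscription_queue in self.subscriptions.values():
            subscription_queue.append(copy.deepcopy(message))


customers = Topic("customers")
crm_queue = customers.subscribe("crm")
email_queue = customers.subscribe("email")

customers.publish({"id": 1, "name": "Ada"})
customers.publish({"id": 2, "name": "Linus"})

# Reading from one subscription does not affect the other one.
crm_first = crm_queue.popleft()
```

After the CRM consumer reads its first message, the email subscription still holds both messages, so each consumer sees the full stream.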

Publish & Subscribe

Now that we have understood the basics of message brokers, the next thing to understand is publish and subscribe. These are the basis of the Publish / Subscribe pattern, which allows us to decouple the senders and receivers of messages.

The decoupling follows this logic:

The producers (sending applications) only need to know the name of the topic to send to.
The consumers (receiving applications) only need to know the topic to subscribe to.

This is extremely powerful, since we can scale our system faster by adding more consumers and producers while caring only about the topic name and the topic subscription.

Distributed Log topics

When working with distributed topics we do not use subscriptions; things work a little differently here. First of all, when we talk about distributed topics we are talking about clusters in which several servers run our message broker, for example Apache Kafka.

Inside a topic, each message in the queue has an index and follows the same FIFO logic. Instead of the broker keeping track of which index each consumer has reached, the client asks for the events it wants to receive and from which offset.

An offset is like telling the topic from which position in the index you want to start reading. For example, if the offset is two, we will read the message at that position and the rest of the messages after it in the queue.

One more thing to mention here is that a message does not get deleted once it has been read. It stays in the topic until the message broker decides to delete it.
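Offsets and non-destructive reads can be illustrated with a very small sketch. The `LogTopic` class is hypothetical, and it assumes zero-based offsets, so offset two points at the third message.

```python
class LogTopic:
    """Toy append-only log: reading never removes messages."""

    def __init__(self):
        self._log = []

    def append(self, message):
        self._log.append(message)
        return len(self._log) - 1   # offset assigned to the new message

    def read_from(self, offset):
        # The client chooses the starting position; the log is left untouched.
        return self._log[offset:]


events = LogTopic()
for name in ["a", "b", "c", "d"]:
    events.append(name)

from_two = events.read_from(2)     # start at offset two
again = events.read_from(2)        # the same read can be repeated
everything = events.read_from(0)   # older messages are still there
```

Because nothing is removed on read, a second consumer (or the same one after a restart) can ask for the same offset and get the same messages.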

Each topic can also have multiple partitions. These partitions allow a topic to be parallelized by splitting its data across multiple brokers (servers in the cluster). One important thing here is that you cannot have more active consumers (receiving applications) than partitions. If there are fewer consumers than partitions, some consumers will receive more than one partition.
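The relationship between partitions and consumers can be sketched with a simple round-robin assignment. This is only an illustration; the function `assign_partitions` is hypothetical, and real brokers use their own assignment strategies.

```python
def assign_partitions(partitions, consumers):
    """Spread partitions over consumers in round-robin order."""
    if not consumers:
        return {}
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


# Fewer consumers than partitions: each consumer gets more than one partition.
few = assign_partitions([0, 1, 2, 3], ["c1", "c2"])

# More consumers than partitions: the extra consumer has nothing to read.
many = assign_partitions([0, 1], ["c1", "c2", "c3"])
```

In the second case `c3` ends up with an empty list, which is the sketch's way of showing that a consumer beyond the number of partitions is not active.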