Aiven Kafka comes with a number of predefined plans, each specifying the number of brokers and the capacity of each broker. The predefined plans consist of 3, 6, 9, or 15 brokers, but we can also create larger custom plans based on your requirements.
There are two ways to increase the capacity of an existing Kafka cluster:
- Vertically scaling the performance of individual brokers
- Horizontally scaling the cluster by adding more brokers
Both scaling options are available to all Aiven Kafka users and can be performed as an online operation, keeping your cluster up and running while it is being upgraded. The two options offer different benefits, which we describe below.
Vertical scaling of a Kafka cluster means replacing the existing broker nodes with higher-capacity nodes while keeping the number of brokers the same. If you're unable to increase the partition or topic count of your Kafka cluster due to application constraints, this is usually the only available option.
In practice, vertical scaling means moving from, for example, the Aiven Kafka Business-4 plan to the Business-8 plan. When such a service plan change is performed in Aiven, we automatically start adding new brokers with the new specifications to the existing cluster. Once the new brokers are online and data has been replicated to them from the older nodes, the old brokers are retired one by one.
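The replacement sequence described above can be illustrated with a small Python sketch. This is a simplified model only, not Aiven's actual orchestration code: the function name `rolling_replacement` and the broker names are hypothetical, and data replication is not modelled.

```python
from __future__ import annotations


def rolling_replacement(old_brokers: list[str], new_brokers: list[str]) -> list[list[str]]:
    """Return the cluster membership after each step of a simplified plan change."""
    cluster = list(old_brokers)
    steps = [list(cluster)]

    # Step 1: brokers with the new specification join the existing cluster.
    cluster.extend(new_brokers)
    steps.append(list(cluster))

    # Step 2: after data has been replicated (assumed, not modelled here),
    # the old brokers are retired one at a time.
    for broker in old_brokers:
        cluster.remove(broker)
        steps.append(list(cluster))

    return steps


# Hypothetical broker names for a 3-broker cluster moving to larger nodes.
for members in rolling_replacement(["old-1", "old-2", "old-3"],
                                   ["new-1", "new-2", "new-3"]):
    print(len(members), members)
```

In this model the cluster never has fewer members than it started with, which is what allows the change to run as an online operation.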
Horizontal scaling means adding more brokers to an existing Kafka cluster. This spreads the load across a larger number of nodes, allowing the cluster as a whole to serve more requests. Horizontal scaling also makes the cluster more resilient to the failure of a single node: if one broker in a 3-node cluster fails, each of the remaining two nodes sees a 50% increase in its load, which may cause availability issues for the cluster. If one broker in a 9-node cluster fails, the remaining 8 nodes see only a ~13% increase in their load.
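These percentages follow from simple arithmetic, assuming the load is spread evenly across the brokers and is redistributed evenly over the survivors when one fails. A minimal Python sketch (the function name `load_increase_after_failure` is illustrative and not part of any Aiven or Kafka API):

```python
def load_increase_after_failure(brokers: int, failed: int = 1) -> float:
    """Fractional load increase per surviving broker, assuming evenly spread load."""
    if not 0 < failed < brokers:
        raise ValueError("need at least one failed and one surviving broker")
    survivors = brokers - failed
    # Before the failure each broker carries 1/brokers of the total load;
    # afterwards each survivor carries 1/survivors of it.
    return brokers / survivors - 1.0


for n in (3, 6, 9, 15):
    increase = load_increase_after_failure(n)
    print(f"{n} brokers: +{increase:.1%} load per surviving broker")
```

For 3 and 9 brokers this gives 50% and 12.5%, matching the figures in the text. Real clusters are rarely perfectly balanced, so treat the result as a rough estimate.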
When an Aiven Kafka cluster is upgraded from, for example, the 3-broker Business-8 plan to the 6-broker Premium-6x-8 plan, we immediately launch three new brokers and add them to the existing cluster. The existing cluster nodes stay online, and once the new brokers are online and included in the cluster configuration, Kafka starts placing partition replicas on the new nodes. It may take some time until the cluster is fully balanced.
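To get a rough feel for what "fully balanced" means, the sketch below estimates how many partition replicas each broker holds before and after brokers are added. It assumes replicas end up spread evenly; the partition count and replication factor are hypothetical example values, and `replicas_per_broker` is an illustrative helper, not a Kafka or Aiven API.

```python
def replicas_per_broker(partitions: int, replication_factor: int, brokers: int) -> float:
    """Average number of partition replicas per broker in an evenly balanced cluster."""
    if replication_factor > brokers:
        raise ValueError("replication factor cannot exceed the number of brokers")
    return partitions * replication_factor / brokers


# Hypothetical workload: 300 partitions, each replicated 3 times.
before = replicas_per_broker(300, 3, brokers=3)
after = replicas_per_broker(300, 3, brokers=6)
print(f"3 brokers: {before:.0f} replicas each; 6 brokers: {after:.0f} replicas each")
```

Until the balance is reached, the original brokers hold more than this average and the new brokers hold less, which is why the benefit of the added brokers arrives gradually.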
We recommend using both the vertical and horizontal scaling capabilities of Aiven Kafka to achieve the best possible performance and fault tolerance. For production clusters we recommend a minimum of 6 cluster nodes, so that the failure of a single node does not cause a sharp increase in load on the remaining nodes (with 6 evenly loaded nodes, a single failure raises the load on each remaining node by about 20%).