April 22, 2024
Scaling Node.js to Infinity and Beyond: Cluster and Load Balancing for Galactic Performance
Ashutosh Kumar

Node.js, the superhero of web development, has swooped in to save the day for building high-performance and scalable web applications. With its lightning-fast event-driven, non-blocking powers, it tackles even the most demanding concurrent connections. But just like our favourite caped crusader, Node.js realises that even superheroes need a sidekick when the going gets tough. That's where the dynamic duo of cluster and load balancing steps in, working hand in hand to give Node.js the ability to scale to infinity and beyond, achieving performance that's out of this world. Let's dive into the exciting world of scaling Node.js to infinity and beyond!

1. Node.js Cluster: Harnessing the Power of Multiprocessing

When it comes to scaling and parallel processing, Node.js has a trusty sidekick built right in: the Cluster module. This powerful companion enables developers to create a cluster of worker processes that share the same server port. With its superpowers of leveraging multiprocessor systems, the Cluster module enables your application to tap into the full potential of every CPU core, distributing the workload among the team. Let's take a look at how we can utilise this Robin.

First of all, we import the necessary modules for our server application.

const cluster = require('cluster');
const express = require('express');
const numCPUs = require('os').cpus().length;

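The cluster module forks and manages worker processes, express serves HTTP requests, and os reports the number of CPU cores, which we store in numCPUs.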

After importing the required modules, we'll use the cluster module to form a worker process cluster.

if (cluster.isMaster) {
 // Create a worker process for each CPU core
 for (let i = 0; i < numCPUs; i++) {
   cluster.fork();
 }

 // Event listener for when a worker process exits
 cluster.on('exit', (worker, code, signal) => {
   console.log(`Worker ${worker.process.pid} died`);
   // Fork a new worker to replace the exited one
   cluster.fork();
 });
}

We start by checking whether the current process is the master process through cluster.isMaster (Node.js 16 and later call this cluster.isPrimary and deprecate isMaster). The master process creates and manages the worker processes.

If the current process is the master process, we can initiate the creation of the cluster. In this case, we aim to have one process per CPU core, and we utilise the 'exit' event emitted by the cluster module to automatically restart any process that exits unexpectedly. This way, the desired number of worker processes is consistently maintained.
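The 'exit' handler described above replaces every worker that exits, including workers that were shut down on purpose. Here is a minimal sketch of one way to tell the two cases apart, assuming the worker.exitedAfterDisconnect flag from the cluster module is used as the signal. The helper name shouldRestart and the rule that a clean exit (code 0) needs no replacement are illustrative assumptions, not part of the original code.

```javascript
// Hypothetical helper: decide whether the master should fork a replacement.
// worker.exitedAfterDisconnect is true when the exit followed
// worker.disconnect() or worker.kill(), i.e. the shutdown was intentional.
function shouldRestart(worker, code) {
  if (worker.exitedAfterDisconnect) {
    return false; // intentional shutdown, do not respawn
  }
  // Assumption: exit code 0 means the worker finished cleanly.
  // A crash gives a non-zero code; a kill by signal gives code === null.
  return code !== 0;
}

// Usage inside the master process (sketch):
// cluster.on('exit', (worker, code, signal) => {
//   if (shouldRestart(worker, code)) cluster.fork();
// });
```

Keeping the decision in a small pure function makes it easy to check without forking real processes.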

If the current process is not the master process, it should execute the main code, which could involve running the Express server or any other relevant tasks for the application.

else {
 // Create an Express app
 const app = express();

 // Define routes
 app.get('/', (req, res) => {
   res.send('Hello, world!');
 });

 // Start the server; every worker listens on the same port
 const server = app.listen(8000, () => {
   console.log(`Worker ${process.pid} started`);
 });

When the SIGTERM signal is received, we implement graceful server shutdown. This involves closing the server, allowing ongoing requests to complete, and terminating the worker process.

 // Gracefully handle server shutdown
 process.on('SIGTERM', () => {
   server.close(() => {
     console.log(`Worker ${process.pid} terminated`);
     process.exit(0);
   });
 });
}
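server.close() only calls back once open connections have ended, so a shutdown like the one above can stall if clients hold connections open. Here is a minimal sketch of one way to bound the wait, assuming a forced exit after a timeout is acceptable for your application. The helper name closeWithTimeout, the injectable exit parameter, and the 10-second figure in the usage comment are illustrative assumptions, not part of the original code.

```javascript
// Hypothetical helper: close the server, but force an exit if it takes too long.
// `exit` is injectable so the helper can be exercised without ending the process.
function closeWithTimeout(server, timeoutMs, exit = (code) => process.exit(code)) {
  // Fallback: exit with a failure code if connections do not drain in time
  const timer = setTimeout(() => exit(1), timeoutMs);
  timer.unref(); // the timer alone should not keep the process alive

  // Normal path: stop accepting connections and wait for in-flight requests
  server.close(() => {
    clearTimeout(timer);
    exit(0);
  });
}

// Usage inside a worker (sketch):
// process.on('SIGTERM', () => closeWithTimeout(server, 10000));
```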

2. Intergalactic Load Balancing: Handling Massive Traffic Flows

Load balancing is a pivotal technique for scaling Node.js applications. It distributes incoming requests across multiple servers or worker processes, which makes better use of available resources, guards against bottlenecks, and improves the performance and availability of your application.

The Importance of Load Balancing in Node.js

Node.js is known for its event-driven, non-blocking I/O model, which makes it very efficient at handling concurrent connections. Even so, a single instance can be overwhelmed when too many users hit the application at once. That's where load balancing comes into play.

Implementing Load Balancer in Express + Node.js

First, make sure you have Express and http-proxy-middleware installed. If not, install them with one of the following commands:

npm install express http-proxy-middleware

# or

yarn add express http-proxy-middleware

# or

pnpm install express http-proxy-middleware

In your main.js file, where we have the logic to spin up our server, import the required modules:

const express = require('express');
const { createProxyMiddleware } = require('http-proxy-middleware');

Now create an Express app and define the target servers. These target servers are instances of your Node.js application running on different ports.

const app = express();

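// Hosts are assumed to be local here; adjust them to your deployment
const targetServers = ['http://localhost:3000', 'http://localhost:3001', 'http://localhost:3002'];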

This array contains the URLs of the target servers (instances of your Node.js server) that will handle the incoming requests. In this example, we assume there are three target servers running on ports 3000, 3001, and 3002.
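To try the load balancer locally you need something listening on those three ports. Here is a minimal sketch of a stand-in target instance, using Node's built-in http module instead of Express so that it runs without extra dependencies. The factory name createTargetServer and the response text are hypothetical; in practice each target would be an instance of your real application.

```javascript
const http = require('http');

// Hypothetical factory: build one backend instance that reports its own port,
// which makes it easy to see which target served a given request.
function createTargetServer(port) {
  return http.createServer((req, res) => {
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    res.end(`Handled by instance on port ${port}\n`);
  });
}

// Usage (sketch): start the three instances assumed by the example
// [3000, 3001, 3002].forEach((port) => createTargetServer(port).listen(port));
```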

Set up the load balancer middleware using the createProxyMiddleware function.

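const loadBalancer = createProxyMiddleware({
 // target must be a single URL, not an array; it serves as the default
 target: targetServers[0],
 // Random strategy: pick one target server per request
 router: () => targetServers[Math.floor(Math.random() * targetServers.length)],
 changeOrigin: true,
 // onError is the v2 option name; in v3 error handlers go under on: { error }
 onError: (err, req, res) => {
   console.error('Proxy error:', err);
   res.writeHead(500, { 'Content-Type': 'text/plain' });
   res.end('Something went wrong. Please try again later.');
 },
});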

The loadBalancer middleware thus created can now be applied to our Express server. Mount it so that every incoming request passes through it, then start the server.

// Use the load balancer middleware for all incoming requests
app.use(loadBalancer);

// Start the load balancer server on port 8080
const PORT = 8080;
app.listen(PORT, () => {
 console.log(`Load balancer with random strategy listening on port ${PORT}`);
});

The strategy implemented above is called random load balancing. The world of load balancing offers a variety of other techniques, such as round robin, least connections, and IP hash, each with its own strengths and advantages.
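As an illustration of one alternative, round robin hands requests to the targets in a fixed rotation rather than at random. Here is a minimal sketch of such a selector; the name createRoundRobin is hypothetical, and the wiring shown in the usage comment assumes the router option of http-proxy-middleware is used to choose a target per request.

```javascript
// Hypothetical helper: returns a function that cycles through the targets in order.
function createRoundRobin(targets) {
  if (targets.length === 0) {
    throw new Error('at least one target is required');
  }
  let next = 0; // index of the target to hand out on the next call
  return () => {
    const target = targets[next];
    next = (next + 1) % targets.length; // wrap around after the last target
    return target;
  };
}

// Usage (sketch):
// const pickTarget = createRoundRobin(targetServers);
// createProxyMiddleware({ target: targetServers[0], router: () => pickTarget(), changeOrigin: true });
```

Compared with random selection, round robin spreads requests evenly over any short window, although it still ignores how busy each target actually is.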

In conclusion, load balancing and clustering are a powerful duo in Node.js, improving performance and scalability. Load balancing shares incoming requests among servers, while clustering utilises CPU cores effectively. Together, they create an efficient and resilient system capable of handling high traffic.
