Sharing State Across Node.js Cluster Workers with IPC Sockets
When you scale a Node.js application using cluster mode (or PM2), you spin up multiple worker processes to utilize all CPU cores. But there's a catch: each worker has its own isolated memory space. Your in-memory cache becomes useless.
In this post, I'll show you how to solve this using Unix domain sockets for inter-process communication (IPC), creating a shared cache that all workers can access.
The Problem with Cluster Mode Caching
You might think this would work:
// lib/cache.ts
const cache = new Map<string, any>();

export function get(key: string) {
  return cache.get(key);
}

export function set(key: string, value: any) {
  cache.set(key, value);
}
It won't, at least not across workers. When running in cluster mode:
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ Worker 1 │ │ Worker 2 │ │ Worker 3 │
│ cache: {} │ │ cache: {} │ │ cache: {} │
└─────────────┘ └─────────────┘ └─────────────┘
↑ sets ↑ gets ↑ gets
user:123 user:123 user:123
= data = undefined! = undefined!
Worker 1 caches some data, but Workers 2 and 3 have completely separate Map instances. They'll never see each other's cached values. Every worker ends up fetching the same data independently.
The Solution: External Cache Process
The fix is to move the cache outside the worker processes entirely. We'll create a small daemon that:
- Runs as a separate process (independent of cluster workers)
- Stores cache data in memory
- Communicates via Unix domain sockets (fast, no network overhead)
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ Worker 1 │ │ Worker 2 │ │ Worker 3 │
└──────┬──────┘ └──────┬──────┘ └──────┬──────┘
│ │ │
└───────────────────┼───────────────────┘
│ Unix Socket
┌──────▼──────┐
│ IPC Cache │
│ Server │
└─────────────┘
Why Unix Domain Sockets?
- Speed: No TCP/IP stack overhead, just direct kernel-level IPC
- Simplicity: No ports to manage, just a file path
- Security: Socket file permissions control access
- Reliability: The OS handles connection management
Implementation
The Cache Server
First, let's create a simple cache server:
// cache-server.ts
import { createServer, Socket } from 'net';
import { existsSync, unlinkSync } from 'fs';

const SOCKET_PATH = '/tmp/app-cache.sock';
const cache = new Map<string, { value: any; expires: number }>();

// Clean up a stale socket file left by a previous run
if (existsSync(SOCKET_PATH)) {
  unlinkSync(SOCKET_PATH);
}

const server = createServer((socket: Socket) => {
  // NOTE: this assumes each request arrives in a single chunk, which holds
  // for small messages on a Unix socket; see the binary protocol section
  // below for proper message framing
  socket.on('data', (data) => {
    try {
      const { action, key, value, ttl } = JSON.parse(data.toString());
      let response: any;
      switch (action) {
        case 'get': {
          const entry = cache.get(key);
          if (entry && entry.expires > Date.now()) {
            response = { ok: true, value: entry.value };
          } else {
            if (entry) cache.delete(key); // Clean up the expired entry
            response = { ok: true, value: null };
          }
          break;
        }
        case 'set': {
          const expires = Date.now() + (ttl || 60000); // Default 60s TTL
          cache.set(key, { value, expires });
          response = { ok: true };
          break;
        }
        case 'delete': {
          cache.delete(key);
          response = { ok: true };
          break;
        }
        case 'clear': {
          cache.clear();
          response = { ok: true };
          break;
        }
        case 'stats': {
          response = { ok: true, size: cache.size };
          break;
        }
        default:
          response = { ok: false, error: 'Unknown action' };
      }
      // end() flushes the response and closes the connection,
      // which fires the 'end' event the client is waiting on
      socket.end(JSON.stringify(response));
    } catch {
      socket.end(JSON.stringify({ ok: false, error: 'Parse error' }));
    }
  });
});

server.listen(SOCKET_PATH, () => {
  console.log(`Cache server listening on ${SOCKET_PATH}`);
});

// Graceful shutdown
process.on('SIGTERM', () => {
  server.close();
  if (existsSync(SOCKET_PATH)) unlinkSync(SOCKET_PATH);
  process.exit(0);
});
The Cache Client
Now the client that your Next.js app will use:
// lib/ipc-cache.ts
import { createConnection, Socket } from 'net';

const SOCKET_PATH = '/tmp/app-cache.sock';

function sendCommand(command: object): Promise<any> {
  return new Promise((resolve, reject) => {
    const socket: Socket = createConnection(SOCKET_PATH);
    let data = '';

    // Time out after 1 second so a dead cache server can't hang requests
    const timer = setTimeout(() => {
      socket.destroy();
      reject(new Error('Cache timeout'));
    }, 1000);

    socket.on('connect', () => {
      socket.write(JSON.stringify(command));
    });

    socket.on('data', (chunk) => {
      data += chunk.toString();
    });

    socket.on('end', () => {
      clearTimeout(timer);
      try {
        const response = JSON.parse(data);
        if (response.ok) {
          // 'get' responses carry a value field (null on a miss);
          // other actions resolve with the raw response
          resolve('value' in response ? response.value : response);
        } else {
          reject(new Error(response.error));
        }
      } catch {
        reject(new Error('Invalid response'));
      }
    });

    socket.on('error', (err) => {
      clearTimeout(timer);
      reject(err);
    });
  });
}

export const ipcCache = {
  async get<T>(key: string): Promise<T | null> {
    try {
      return await sendCommand({ action: 'get', key });
    } catch {
      return null; // Graceful fallback
    }
  },

  async set(key: string, value: any, ttl?: number): Promise<void> {
    try {
      await sendCommand({ action: 'set', key, value, ttl });
    } catch {
      // Silent fail - cache is optional
    }
  },

  async delete(key: string): Promise<void> {
    try {
      await sendCommand({ action: 'delete', key });
    } catch {
      // Silent fail
    }
  },

  async clear(): Promise<void> {
    try {
      await sendCommand({ action: 'clear' });
    } catch {
      // Silent fail
    }
  },
};
Usage in Your App
Now you can use it anywhere; every worker shares the same cache:
// routes/users.ts
import { ipcCache } from './lib/ipc-cache';

export async function getUser(id: string) {
  const cacheKey = `user:${id}`;

  // Try cache first - works across ALL workers
  const cached = await ipcCache.get<User>(cacheKey);
  if (cached) {
    return cached;
  }

  // Fetch fresh data
  const user = await db.users.findById(id);

  // Cache for 5 minutes - available to all workers instantly
  await ipcCache.set(cacheKey, user, 5 * 60 * 1000);
  return user;
}
Running with Cluster Mode
Here's a complete example with Node.js cluster:
// server.ts
import cluster from 'cluster';
import { cpus } from 'os';
import { spawn } from 'child_process';

if (cluster.isPrimary) {
  // Start the cache server first
  const cacheServer = spawn('node', ['--import', 'tsx', 'cache-server.ts'], {
    stdio: 'inherit',
  });

  // Fork workers
  const numWorkers = cpus().length;
  for (let i = 0; i < numWorkers; i++) {
    cluster.fork();
  }

  cluster.on('exit', (worker) => {
    console.log(`Worker ${worker.process.pid} died, restarting...`);
    cluster.fork();
  });

  process.on('SIGTERM', () => {
    cacheServer.kill();
    process.exit(0);
  });
} else {
  // Worker process - start your app
  import('./app');
}
Or with PM2, add to your ecosystem.config.js:
module.exports = {
  apps: [
    {
      name: 'cache-server',
      script: 'cache-server.ts',
      interpreter: 'tsx',
      instances: 1, // Single instance!
    },
    {
      name: 'api',
      script: 'app.ts',
      interpreter: 'tsx',
      instances: 'max', // One per CPU
      exec_mode: 'cluster',
    },
  ],
};
Performance Considerations
Connection Pooling
For high-throughput scenarios, you might want to maintain persistent connections. Note that this also requires the server to keep connections open and frame messages, rather than closing after each response:
// lib/ipc-cache-pooled.ts
import { createConnection, Socket } from 'net';

const SOCKET_PATH = '/tmp/app-cache.sock';
const pool: Socket[] = [];
const POOL_SIZE = 5;

function getConnection(): Promise<Socket> {
  const socket = pool.pop();
  if (socket && !socket.destroyed) {
    return Promise.resolve(socket);
  }
  return new Promise((resolve, reject) => {
    const newSocket = createConnection(SOCKET_PATH);
    newSocket.on('connect', () => resolve(newSocket));
    newSocket.on('error', reject);
  });
}

function releaseConnection(socket: Socket) {
  if (!socket.destroyed && pool.length < POOL_SIZE) {
    pool.push(socket);
  } else {
    socket.destroy();
  }
}
Binary Protocol
For even better performance, consider using a binary protocol instead of JSON:
// Simple length-prefixed binary protocol
function encodeMessage(obj: object): Buffer {
  const json = JSON.stringify(obj);
  const length = Buffer.byteLength(json);
  const buffer = Buffer.alloc(4 + length);
  buffer.writeUInt32BE(length, 0); // 4-byte length header
  buffer.write(json, 4);
  return buffer;
}

// Counterpart: read the length header, then parse exactly that many bytes
function decodeMessage(buffer: Buffer): object {
  const length = buffer.readUInt32BE(0);
  return JSON.parse(buffer.toString('utf8', 4, 4 + length));
}
Alternatives Considered
Redis
Redis is the obvious choice for shared caching, but it:
- Requires running another service (and managing it)
- Has network overhead (even on localhost, TCP adds latency)
- Is overkill when you just need simple cross-worker state
For large-scale deployments or when you need persistence, Redis is still the right choice. But for single-server cluster deployments, IPC sockets are simpler.
Node.js Built-in IPC
Node.js cluster has built-in IPC via worker.send(), but:
- Messages go through the primary process (bottleneck)
- No built-in request/response pattern
- Primary process becomes a single point of failure
Shared Memory
Node.js doesn't have great native shared memory support. Libraries exist, but they're:
- Platform-specific
- Complex to set up correctly
- Often require native modules
Unix domain sockets hit the sweet spot: fast, simple, and universally supported.
Conclusion
When running Node.js in cluster mode and you need shared state between workers, a separate cache process with Unix domain sockets is an elegant solution. It's fast (kernel-level IPC), simple (just a socket file), and avoids the complexity of Redis for single-server deployments.
This pattern also works for:
- Rate limiting across workers (shared counters)
- Session coordination (sticky sessions without load balancer support)
- Distributed locks (ensure only one worker processes a job)
- Real-time counters (live user counts, etc.)
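As a sketch of the first item: the cache server could grow a hypothetical `incr` action alongside `get`/`set` (it is not part of the implementation above), and its core logic is just a windowed counter:

```typescript
// Windowed counter - the core of a hypothetical 'incr' action
// the cache server could handle alongside get/set
const counters = new Map<string, { count: number; resetAt: number }>();

// Returns the number of hits for `key` in the current window
function incr(key: string, windowMs: number): number {
  const now = Date.now();
  const entry = counters.get(key);
  if (!entry || entry.resetAt <= now) {
    // First hit, or the previous window expired: start a new one
    counters.set(key, { count: 1, resetAt: now + windowMs });
    return 1;
  }
  entry.count += 1;
  return entry.count;
}

// A worker would then consult the shared count before serving a request
const LIMIT = 100; // max requests per window (arbitrary)
function allowRequest(ip: string): boolean {
  return incr(`rate:${ip}`, 60_000) <= LIMIT;
}
```

Because the counter lives in the single cache process, the limit holds across all workers instead of being multiplied by the worker count.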
The implementation above is intentionally minimal. For production use, you might want to add:
- Health checks and automatic cache server restart
- Automatic reconnection in the client
- Cache eviction policies (LRU, etc.)
- Metrics and logging
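Client-side reconnection, for instance, can be as small as a retry wrapper around sendCommand. `withRetry` is a hypothetical helper with arbitrary backoff numbers, not part of the client above:

```typescript
// Hypothetical retry helper: run an async operation, retrying on failure
// with a simple linear backoff (50ms, 100ms, 150ms, ...)
async function withRetry<T>(fn: () => Promise<T>, attempts = 3): Promise<T> {
  let lastErr: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastErr = err;
      await new Promise((resolve) => setTimeout(resolve, 50 * (i + 1)));
    }
  }
  throw lastErr;
}

// Usage: wrap the cache call so a briefly-restarting server isn't fatal
// const value = await withRetry(() => sendCommand({ action: 'get', key: 'user:123' }));
```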
But for many use cases, this simple approach is all you need.