Node.js memory leak in production — detection and resolution

How to detect, diagnose, and fix memory leaks in production Node.js applications.

Key facts

Topic: Production error triage
Stack: Node.js / Linux

TL;DR

A Node.js memory leak manifests as steadily increasing RSS (Resident Set Size) over hours or days, until the Linux OOM killer terminates the process or PM2 restarts it at its max_memory_restart threshold. Unlike a one-off OOM spike, a leak grows gradually and often goes unnoticed until production traffic amplifies it.

Common leak patterns

  • Event listeners accumulating — attaching listeners in a request handler without removing them
  • Global caches without eviction — objects stored in module-level Maps or Sets that grow indefinitely
  • Closures retaining large scopes — callbacks holding references to request/response objects
  • Uncleared timers — setInterval or setTimeout callbacks that capture outer variables and are never cleared
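
To make the first pattern concrete, here is a minimal sketch of listener accumulation. The `bus` emitter and the `leakyHandler` / `safeHandler` names are hypothetical; they stand in for any long-lived emitter that a request handler touches.

```javascript
const { EventEmitter } = require('node:events');

// Hypothetical long-lived emitter shared by every request.
const bus = new EventEmitter();
bus.setMaxListeners(0); // disable the listener-count warning, for this demo only

// Leaky: each call adds a listener that is never removed.
function leakyHandler(requestId) {
  bus.on('update', () => requestId);
}

// Safe: returns a cleanup function that detaches the listener.
function safeHandler(requestId) {
  const onUpdate = () => requestId;
  bus.on('update', onUpdate);
  return () => bus.off('update', onUpdate);
}

for (let i = 0; i < 100; i++) leakyHandler(i);
const leakedCount = bus.listenerCount('update'); // 100 listeners retained

for (let i = 0; i < 100; i++) safeHandler(i)(); // attach, then clean up
const afterSafeCount = bus.listenerCount('update'); // still 100, no new growth
```

By default Node.js prints a MaxListenersExceededWarning once more than 10 listeners are attached to one event on an emitter. Seeing that warning in production logs is often an early hint of this pattern.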

Detection

Monitor heap usage over time. A healthy process has a sawtooth pattern (grows, GC clears, grows again). A leaking process trends upward:

pm2 monit
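
If you want the trend in your own logs as well, the following sketch samples `process.memoryUsage()` and estimates the heap growth rate with a least-squares slope. The function names, the 30-second interval, and the one-hour window are assumptions for illustration, not recommended values.

```javascript
// Convert bytes to mebibytes, rounded to one decimal place.
const toMiB = (bytes) => Math.round((bytes / 1024 / 1024) * 10) / 10;

// Take one reading from Node's built-in process.memoryUsage().
function sampleMemory(now = Date.now()) {
  const { rss, heapUsed } = process.memoryUsage();
  return { t: now, rssMiB: toMiB(rss), heapUsedMiB: toMiB(heapUsed) };
}

// Least-squares slope of heapUsedMiB over time, in MiB per minute.
// A sawtooth averages out near zero; a leak gives a persistently positive slope.
function heapSlopePerMinute(samples) {
  const n = samples.length;
  if (n < 2) return 0;
  const xs = samples.map((s) => (s.t - samples[0].t) / 60000);
  const ys = samples.map((s) => s.heapUsedMiB);
  const meanX = xs.reduce((a, b) => a + b, 0) / n;
  const meanY = ys.reduce((a, b) => a + b, 0) / n;
  let num = 0;
  let den = 0;
  for (let i = 0; i < n; i++) {
    num += (xs[i] - meanX) * (ys[i] - meanY);
    den += (xs[i] - meanX) ** 2;
  }
  return den === 0 ? 0 : num / den;
}

// Hypothetical wiring: sample periodically, keep a bounded window, log the trend.
function startMemoryLog(intervalMs = 30000, maxSamples = 120) {
  const samples = [];
  const timer = setInterval(() => {
    samples.push(sampleMemory());
    if (samples.length > maxSamples) samples.shift(); // bounded, so the logger cannot leak
    const latest = samples[samples.length - 1];
    console.log(latest, 'slope MiB/min:', heapSlopePerMinute(samples).toFixed(2));
  }, intervalMs);
  timer.unref(); // do not keep the process alive just for logging
  return () => clearInterval(timer);
}
```

Calling `startMemoryLog()` once at startup is enough. The slope is only meaningful over a window that spans several GC cycles, so judge it over tens of minutes rather than individual readings.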

For precise diagnosis, take two heap snapshots 10–15 minutes apart:

node --inspect app.js
# In Chrome DevTools: Memory > Take heap snapshot (twice)
# Compare retained size differences between snapshots
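
When attaching DevTools to a production process is not practical, snapshots can also be written from inside the process with Node's built-in `v8.writeHeapSnapshot()`. The sketch below is one possible wiring; the choice of SIGUSR2, the output directory, and the helper names are assumptions.

```javascript
const v8 = require('node:v8');
const path = require('node:path');
const os = require('node:os');

// Build a timestamped file name so successive snapshots can be told apart.
function snapshotPath(dir, pid = process.pid, now = new Date()) {
  const stamp = now.toISOString().replace(/[:.]/g, '-');
  return path.join(dir, `heap-${pid}-${stamp}.heapsnapshot`);
}

// Write a snapshot each time the process receives the given signal.
// Writing is synchronous: it blocks the event loop and needs substantial
// extra memory, so trigger it sparingly on a loaded server.
function installSnapshotSignal(signal = 'SIGUSR2', dir = os.tmpdir()) {
  const onSignal = () => {
    const file = v8.writeHeapSnapshot(snapshotPath(dir));
    console.log(`heap snapshot written to ${file}`);
  };
  process.on(signal, onSignal);
  return () => process.off(signal, onSignal); // uninstall hook
}
```

With `installSnapshotSignal()` called at startup, running `kill -USR2 <pid>` twice, 10–15 minutes apart, produces two files that can be loaded into the DevTools Memory tab and compared.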

Alternatively, use Clinic.js for a high-level overview:

npx clinic doctor -- node app.js

Common fixes

Remove listeners properly:

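// `stream` is assumed to be a long-lived emitter shared across requests
function handler(req, res) {
  const onData = (chunk) => { /* process chunk */ };
  stream.on('data', onData);
  // Detach on close so the listener does not outlive the request
  res.once('close', () => stream.removeListener('data', onData));
}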

Add eviction to in-process caches:

const { LRUCache } = require('lru-cache');
const cache = new LRUCache({ max: 500, ttl: 1000 * 60 * 15 });

Clear intervals when they are no longer needed and avoid storing request-scoped data at module level.
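
One way to make timer cleanup hard to forget is to have whatever starts the timer hand back a stop function, then tie that function to the lifetime of the owning object. This is a sketch only; `startPolling`, `watchConnection`, and the 30-second heartbeat are hypothetical names and values.

```javascript
// Start a periodic task and return a function that stops it.
// stop() returns true the first time and false if already stopped.
function startPolling(task, intervalMs) {
  const timer = setInterval(task, intervalMs);
  let stopped = false;
  return function stop() {
    if (stopped) return false;
    clearInterval(timer); // releases the callback and everything it captured
    stopped = true;
    return true;
  };
}

// Hypothetical per-connection usage: the timer lives only as long as the socket.
function watchConnection(socket, sendHeartbeat) {
  const stop = startPolling(() => sendHeartbeat(socket), 30000);
  socket.once('close', stop);
  return stop;
}
```

Because the interval callback closes over `socket`, forgetting to stop it would keep every closed connection reachable. Binding `stop` to the `close` event removes that reference.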

Production safety net

Configure PM2 to restart before the leak causes a crash:

{ max_memory_restart: '1500M' }

This buys time while you find and fix the root cause.
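
For context, the option above normally lives inside an app entry of a PM2 ecosystem file. A minimal sketch follows; the app name, script path, and instance count are assumptions, and only `max_memory_restart` comes from the text above.

```javascript
// ecosystem.config.js (sketch; name, script, and instances are placeholders)
const config = {
  apps: [
    {
      name: 'api',
      script: './app.js',
      exec_mode: 'cluster', // multiple workers, so one restart does not drop all traffic
      instances: 2,
      max_memory_restart: '1500M', // restart a worker once it exceeds this size
    },
  ],
};

module.exports = config;
```

Start it with `pm2 start ecosystem.config.js`. If the process runs in a container or under a cgroup limit, keep the threshold comfortably below that limit so PM2 acts before the OOM killer does.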

Where Reflex helps

Reflex tracks Node.js memory trends and detects leak patterns — steadily rising RSS without corresponding GC reclamation. When it identifies a leak, it can trigger a rolling restart across cluster workers, preserving availability while your team investigates the root cause. See How it works.