Next.js server crash on Linux — recovery guide
TL;DR
How to diagnose and recover a self-hosted Next.js server that crashes repeatedly on Linux.
Key facts
- Topic: Production error triage
- Stack: Next.js / Linux
A self-hosted Next.js server crashing on Linux leaves your entire application unreachable. Unlike on Vercel's managed infrastructure, there is no automatic failover on your own server unless you have explicitly configured process management and health monitoring.
Common crash sources
- Uncaught exceptions in getServerSideProps: an unhandled database timeout or API failure in a server-side data fetch, particularly one raised in a detached promise or callback that Next.js cannot catch, can bring down the entire Node.js process rather than returning a 500 to that single request
- API route errors: an error thrown outside a try/catch in a Next.js API route handler, for example from a promise that is never awaited, can crash the server process
- ISR revalidation failures: when Incremental Static Regeneration fails during revalidation and the error propagates uncaught, the revalidation worker can take the process with it
- Middleware infinite loops: a redirect chain in middleware.ts that matches its own destination creates an infinite loop, exhausting the stack
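The middleware loop described in the last item can be caught before it ships by checking whether a redirect's destination would match the same rule again. The following is a minimal sketch; `safeRedirectTarget` and both example rules are hypothetical names made up for illustration, not part of Next.js.

```javascript
// Hypothetical helper, not a Next.js API: returns the redirect destination,
// or null when no redirect should happen. It refuses any redirect whose
// destination would match the same rule again, which is what causes the loop.
function safeRedirectTarget(pathname, rule) {
  if (!rule.matches(pathname)) return null;         // rule does not apply
  if (pathname === rule.destination) return null;   // already at the destination
  if (rule.matches(rule.destination)) return null;  // destination re-triggers the rule
  return rule.destination;
}

// Assumed example of a looping rule: /login itself is outside /public,
// so every redirect to /login would be redirected again.
const loopingRule = {
  matches: (p) => !p.startsWith('/public'),
  destination: '/login',
};

// Assumed example of a fixed rule: the destination is excluded from the match.
const fixedRule = {
  matches: (p) => !p.startsWith('/public') && p !== '/login',
  destination: '/login',
};
```

In middleware.ts, the returned value could decide whether to issue a redirect at all; a `null` result means the request should continue unchanged.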
Diagnosing the crash
Check PM2 for crash history and recent logs:
pm2 status
pm2 logs nextapp --lines 200 --err
If the process exited without PM2 catching it, check the systemd journal (here the PM2 unit for a user named deploy) and the kernel log for OOM kills:
journalctl -u pm2-deploy --since "1 hour ago"
dmesg | grep -i "oom\|killed process"
For deeper analysis, make Node.js abort and dump core on an uncaught exception by setting NODE_OPTIONS before starting the app:
NODE_OPTIONS='--abort-on-uncaught-exception' pm2 start ecosystem.config.js
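Look for a core file (commonly named core or core.<pid>) in the working directory. Core dumps are written only if the core size limit allows it (ulimit -c unlimited), and their location is governed by kernel.core_pattern; on systemd-based distributions, coredumpctl list may show them instead. JSON diagnostic reports come from a different flag, --report-uncaught-exception.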
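The kernel-log check can also be scripted so that OOM kills are reported in a structured form. This is a rough sketch: `findOomKills` is a hypothetical helper, and it parses only the "Killed process <pid> (<name>)" fragment of the kernel message, since the surrounding fields differ between kernel versions.

```javascript
// Hypothetical helper: extracts OOM-killer victims from kernel log text,
// such as the output of `dmesg`. Only the pid and process name are parsed.
function findOomKills(kernelLog) {
  const pattern = /killed process (\d+) \(([^)]+)\)/i;
  const kills = [];
  for (const line of kernelLog.split('\n')) {
    const match = pattern.exec(line);
    if (match) {
      kills.push({ pid: Number(match[1]), name: match[2], line: line.trim() });
    }
  }
  return kills;
}

// Usage (reading dmesg may require root; the 'node' process name is an assumption):
//   const { execSync } = require('child_process');
//   const kills = findOomKills(execSync('dmesg', { encoding: 'utf8' }));
//   console.log(kills.filter((k) => k.name === 'node'));
```

If the list is non-empty for your server's process, the crash is more likely memory pressure than an application exception, and the PM2 error log may show nothing useful.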
Restart strategies
Configure PM2 with exponential backoff to prevent rapid restart loops:
module.exports = {
  apps: [{
    name: 'nextapp',
    script: '.next/standalone/server.js',
    instances: 2,
    exec_mode: 'cluster',
    max_restarts: 10,
    min_uptime: '10s',
    exp_backoff_restart_delay: 100,
    max_memory_restart: '1G',
  }],
};
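To get a feel for what an initial delay of 100 ms means across repeated crashes, the schedule can be modelled in a few lines. This is an illustrative sketch only: the 1.5x growth factor and the 15000 ms cap are assumptions about how PM2 grows the delay, not values stated in this guide, so check them against the PM2 documentation.

```javascript
// Illustrative model of an exponential backoff schedule.
// ASSUMPTIONS: growth factor 1.5 and a 15000 ms cap; both are parameters
// so they can be adjusted to match the process manager's real behavior.
function backoffDelays(initialMs, restarts, factor = 1.5, capMs = 15000) {
  const delays = [];
  let delay = initialMs;
  for (let i = 0; i < restarts; i += 1) {
    delays.push(delay);
    delay = Math.min(capMs, Math.floor(delay * factor));
  }
  return delays;
}

// Delays before the first three restarts with exp_backoff_restart_delay: 100
console.log(backoffDelays(100, 3)); // prints [ 100, 150, 225 ]
```

Under these assumptions the first few restarts happen almost immediately, so a crash that occurs at startup can still use up several restarts within a few seconds.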
Graceful shutdown handling
Add a custom server wrapper that handles process signals correctly:
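const { createServer } = require('http');
const next = require('next');

const app = next({ dev: false });
const handle = app.getRequestHandler();

app.prepare().then(() => {
  const server = createServer(handle);

  server.listen(3000, () => {
    // Tell PM2 the app is ready (only used when wait_ready: true is set)
    if (process.send) process.send('ready');
  });

  // Stop accepting connections, let in-flight requests finish, then exit.
  // PM2 sends SIGINT by default; systemd and Docker send SIGTERM.
  const shutdown = () => server.close(() => process.exit(0));
  process.on('SIGINT', shutdown);
  process.on('SIGTERM', shutdown);
});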
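One gap in a plain server.close() is that it waits for open connections to end, so a lingering connection can hold the process open until the process manager kills it. A possible refinement is a deadline after which shutdown is forced. The helper below is a sketch; `shutdownWithTimeout` is a hypothetical name, and the timeout value in the usage note is an assumption.

```javascript
// Hypothetical helper: closes a server and calls onDone(code) exactly once,
// with 0 after a clean close or 1 if the close did not finish in time.
function shutdownWithTimeout(server, timeoutMs, onDone) {
  let finished = false;
  const finish = (code) => {
    if (finished) return; // ignore whichever of close/timeout fires second
    finished = true;
    clearTimeout(timer);
    onDone(code);
  };
  // Deadline: give up waiting for open connections after timeoutMs
  const timer = setTimeout(() => finish(1), timeoutMs);
  server.close(() => finish(0));
}

// Usage inside the wrapper (1000 ms is an assumed value):
//   const shutdown = () =>
//     shutdownWithTimeout(server, 1000, (code) => process.exit(code));
//   process.on('SIGINT', shutdown);
//   process.on('SIGTERM', shutdown);
```

The deadline should stay below the process manager's own kill timeout (PM2's kill_timeout option), otherwise the process is killed before the fallback runs.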
Prevention
Wrap all getServerSideProps and API route handlers in try/catch blocks. Use Next.js error boundaries for rendering failures. For ISR, set fallback: 'blocking' and handle revalidation errors explicitly in your data-fetching layer rather than letting them propagate.
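Rather than repeating try/catch in every API route, the handling can live in one wrapper. The sketch below assumes the pages-router API handler signature (req, res) with res.status() and res.json(); `withErrorHandling` is a hypothetical name, and the error payload is an illustrative choice.

```javascript
// Hypothetical wrapper for a Next.js API route handler: any error thrown or
// rejected inside the handler is logged and turned into a 500 response
// instead of escaping the handler.
function withErrorHandling(handler) {
  return async function wrapped(req, res) {
    try {
      await handler(req, res);
    } catch (err) {
      console.error('API route failed:', err);
      // Only respond if the handler has not already started a response
      if (!res.headersSent) {
        res.status(500).json({ error: 'Internal Server Error' });
      }
    }
  };
}

// Usage in an API route file (file name assumed for illustration):
//   // pages/api/orders.js
//   export default withErrorHandling(async (req, res) => {
//     const orders = await fetchOrders(); // fetchOrders is hypothetical
//     res.status(200).json(orders);
//   });
```

The wrapper only covers errors that surface through the awaited handler. Work started without await inside the handler still needs its own catch.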
Where Reflex helps
Reflex monitors your Next.js process health and detects crash loops before they exhaust PM2's restart limit. When a crash occurs, Reflex captures the error context and executes a recovery playbook (restarting the process, verifying the health endpoint, and rolling back to the previous build if the crash persists), all with a documented audit trail. See How it works.