PHP-FPM tuning guide — optimal worker settings for production
The Reflex Team · 9 min · 14 May 2026
Every PHP application running on nginx talks to the world through PHP-FPM. And yet most production servers run the default pm.max_children = 5 from the package maintainer, or worse, a number copied from a Stack Overflow answer for a completely different workload.
This guide gives you the arithmetic, the mental model, and the monitoring checks to tune PHP-FPM from evidence, instead of guessing until the OOM killer makes the decision for you.
The fundamental constraint
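PHP-FPM spawns child processes. Each child handles one request at a time. Each child consumes memory, and the kernel reports a resident set size (RSS) for every child individually. RSS also counts pages that workers share with each other (OPcache shared memory, shared libraries), so summing it overstates the true total somewhat; for budgeting, that errs on the safe side.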
The constraint is simple:
pm.max_children × average_worker_rss ≤ memory_available_for_fpm
If the left side exceeds the right side, the OOM killer arrives. Every tuning decision flows from this equation.
Measuring worker memory
Do not guess. Measure:
ps -C php-fpm8.3 -o pid,rss --no-headers | awk '{sum+=$2; count++} END {if (count) {print "Average RSS:", sum/count/1024, "MB"; print "Processes (including master):", count}}'
Run this under realistic load — not during a quiet Tuesday morning. Peak traffic is what matters because that is when all workers are active simultaneously.
For a typical Laravel application with Eloquent, queue listeners, and a few packages: 40-80 MB per worker is common. Heavy applications with image processing, PDF generation, or large dataset operations can reach 150-250 MB per worker.
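An average hides outliers, so it can help to look at the peak as well. The following is a minimal sketch that summarizes the same `ps -o pid,rss --no-headers` output in Python. The function name `summarize_rss` and the sample values are illustrative assumptions, not real measurements.

```python
def summarize_rss(ps_output):
    """Return (average MB, peak MB, process count) from `pid rss` lines."""
    rss_kb = []
    for line in ps_output.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        rss_kb.append(int(fields[1]))  # ps reports RSS in KiB
    if not rss_kb:
        raise ValueError("no processes found in ps output")
    average_mb = sum(rss_kb) / len(rss_kb) / 1024
    peak_mb = max(rss_kb) / 1024
    return average_mb, peak_mb, len(rss_kb)


# Illustrative sample only: three processes at 60, 70 and 50 MB.
sample = """\
 1201 61440
 1202 71680
 1203 51200
"""

average_mb, peak_mb, count = summarize_rss(sample)
print(f"Average RSS: {average_mb:.1f} MB, peak: {peak_mb:.1f} MB, processes: {count}")
```

If the peak sits far above the average, budgeting against the average alone leaves less headroom than the numbers suggest.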
The memory budget
Total server RAM is not your budget. Subtract everything else first:
Total RAM:                        4096 MB
Operating system + kernel:       - 400 MB
MySQL/PostgreSQL:                - 800 MB
Redis:                           - 256 MB
nginx:                            - 50 MB
reflexd + other agents:          - 100 MB
Safety buffer (15%):             - 615 MB
Available for PHP-FPM:          = 1875 MB
Now apply the formula. If your average worker RSS is 60 MB:
max_children = 1875 / 60 ≈ 31
Round down, not up. Set pm.max_children = 30 and leave headroom for the worst-case request that allocates more than average.
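The budget arithmetic above is easy to script so it can be rerun whenever an input changes. This is a sketch under the same example numbers; the function names `fpm_budget_mb` and `max_children_for` are hypothetical helpers, and the safety buffer is assumed to be a fraction of total RAM, rounded up.

```python
import math


def fpm_budget_mb(total_ram_mb, reserved_mb, safety_fraction):
    """Memory left for PHP-FPM after other services and a safety buffer."""
    safety_mb = math.ceil(total_ram_mb * safety_fraction)
    return total_ram_mb - sum(reserved_mb.values()) - safety_mb


def max_children_for(budget_mb, avg_worker_rss_mb):
    """Largest worker count that fits the budget, always rounded down."""
    return math.floor(budget_mb / avg_worker_rss_mb)


# The example figures from the budget table above.
reserved = {
    "os_and_kernel": 400,
    "database": 800,
    "redis": 256,
    "nginx": 50,
    "agents": 100,
}

budget = fpm_budget_mb(4096, reserved, 0.15)
ceiling = max_children_for(budget, 60)
print(f"Budget: {budget} MB, max_children ceiling: {ceiling}")
```

The result is a ceiling, not a recommendation: the text above still rounds it down further to leave room for the occasional heavy request.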
Dynamic vs static vs on-demand
PHP-FPM offers three process management modes:
pm = static — spawns max_children workers at startup, keeps them alive permanently.
Best for: dedicated servers with predictable traffic. No spawn latency on traffic spikes. Memory usage is constant and predictable — the arithmetic above applies exactly.
pm = dynamic — starts with pm.start_servers, scales between pm.min_spare_servers and pm.max_spare_servers, up to pm.max_children.
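Best for: servers with variable traffic where idle workers waste memory. The trade-off is spawn latency during sudden spikes: forking a single worker takes only milliseconds, but the master adds workers at intervals rather than instantly, so requests can wait in the listen queue in the meantime and cause brief upstream timeouts under burst load.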
Sensible defaults for a 4 GB server:
pm = dynamic
pm.max_children = 30
pm.start_servers = 8
pm.min_spare_servers = 4
pm.max_spare_servers = 12
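The four dynamic-mode numbers have to be ordered consistently: the start count sits between the spare bounds, and the spare bounds sit under the ceiling. PHP-FPM performs its own validation at startup; the sketch below only checks the ordering implied by the description above, and `check_dynamic_pool` is a hypothetical helper name.

```python
def check_dynamic_pool(max_children, start_servers, min_spare, max_spare):
    """Return a list of human-readable problems; empty means consistent."""
    problems = []
    if min_spare < 1:
        problems.append("pm.min_spare_servers should be at least 1")
    if min_spare > max_spare:
        problems.append("pm.min_spare_servers exceeds pm.max_spare_servers")
    if not (min_spare <= start_servers <= max_spare):
        problems.append("pm.start_servers is outside the spare range")
    if max_spare > max_children:
        problems.append("pm.max_spare_servers exceeds pm.max_children")
    return problems


# The 4 GB example from above passes the check.
print(check_dynamic_pool(max_children=30, start_servers=8, min_spare=4, max_spare=12))

# A start count below the minimum spare count is flagged.
print(check_dynamic_pool(max_children=30, start_servers=2, min_spare=4, max_spare=12))
```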
pm = ondemand — no workers at idle, spawns on demand, kills after pm.process_idle_timeout.
Best for: development servers, shared hosting, or boxes running many low-traffic pools. Not recommended for production with any real traffic — the spawn latency on every request after idle is noticeable.
pm.max_requests: the memory leak safety valve
pm.max_requests = 500
After serving 500 requests, a worker gracefully exits and the master spawns a replacement. This prevents slow memory leaks from accumulating indefinitely. Without it, a worker that leaks 0.5 MB per request carries an extra 250 MB after 500 requests, an extra 500 MB after 1,000, and keeps growing, because it never gives that memory back until it exits.
The default of 0 (unlimited) is dangerous for any application with third-party packages. Set it between 200 and 1000 depending on how much you trust your dependency tree.
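Caution: setting this too low increases worker churn. Every respawn has a cost: per-process state such as the realpath cache and any persistent connections is lost and has to be rebuilt, and there is a brief moment where that worker slot is unavailable. OPcache itself lives in shared memory and survives a worker recycle, so the cost is real but modest. Balance leak prevention against churn overhead.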
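The leak arithmetic can be turned around: given a per-worker growth you are willing to tolerate, it yields an upper bound for pm.max_requests. This is a sketch that assumes a constant leak per request, which real applications rarely have; both function names are hypothetical.

```python
def worst_case_rss_mb(base_rss_mb, leak_mb_per_request, max_requests):
    """RSS of a worker just before it is recycled, assuming a steady leak."""
    return base_rss_mb + leak_mb_per_request * max_requests


def max_requests_for_cap(base_rss_mb, leak_mb_per_request, cap_mb):
    """Largest pm.max_requests that keeps a worker at or under cap_mb."""
    if leak_mb_per_request <= 0:
        raise ValueError("leak_mb_per_request must be positive")
    if cap_mb <= base_rss_mb:
        raise ValueError("cap_mb must be above the base RSS")
    return int((cap_mb - base_rss_mb) // leak_mb_per_request)


# A 60 MB worker leaking 0.5 MB per request, recycled after 500 requests.
print(worst_case_rss_mb(60, 0.5, 500))

# The same worker, held to a 160 MB cap.
print(max_requests_for_cap(60, 0.5, 160))
```

Budgeting max_children against the worst-case figure rather than the fresh-worker figure is the conservative choice when a leak is known to exist.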
pm.process_idle_timeout (ondemand only)
pm.process_idle_timeout = 10s
How long an idle worker survives before the master kills it to reclaim memory. This setting is used only when pm = ondemand; in dynamic mode the pool shrinks instead when the number of idle workers exceeds pm.max_spare_servers. Ten seconds is a reasonable default: long enough to absorb brief pauses between requests, short enough to reclaim memory within a minute of traffic dropping.
Slow log: your free profiler
request_slowlog_timeout = 5s
slowlog = /var/log/php-fpm/slow.log
Any request that takes longer than 5 seconds gets a full stack trace dumped to the slow log. This is the cheapest performance diagnostic tool available — it tells you exactly which function was executing when the timeout hit, without any application instrumentation.
Review the slow log weekly. Patterns emerge: the same Eloquent query, the same HTTP client call, the same PDF generation path. Fix the top offender and your p99 latency drops.
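A weekly review goes faster with a tally of which call was on top of the stack when each timeout hit. The sketch below assumes the usual slow log layout (a timestamped header with pool and pid, a script_filename line, then one frame per line starting with an address); the sample entries and paths are invented for illustration, and `top_frames` is a hypothetical helper.

```python
import re
from collections import Counter

# Header line, e.g. "[14-May-2026 10:15:02]  [pool www] pid 2211"
HEADER = re.compile(r"^\[\d{2}-[A-Za-z]{3}-\d{4} [\d:]+\]\s+\[pool [^\]]+\] pid \d+")
# Frame line, e.g. "[0x00007f2a1c2130a0] curl_exec() /path/File.php:41"
FRAME = re.compile(r"^\[0x[0-9a-fA-F]+\] (?P<func>\S+) (?P<location>.+)$")


def top_frames(slowlog_text):
    """Count the first (innermost) frame of every slow log entry."""
    counts = Counter()
    awaiting_frame = False
    for line in slowlog_text.splitlines():
        if HEADER.match(line):
            awaiting_frame = True
            continue
        match = FRAME.match(line)
        if match and awaiting_frame:
            counts[f"{match['func']} {match['location']}"] += 1
            awaiting_frame = False
    return counts


# Invented sample entries, for illustration only.
sample_log = """\
[14-May-2026 10:15:02]  [pool www] pid 2211
script_filename = /var/www/app/public/index.php
[0x00007f2a1c2130a0] curl_exec() /var/www/app/app/Services/Rates.php:41
[0x00007f2a1c213020] fetch() /var/www/app/app/Http/Controllers/QuoteController.php:18

[14-May-2026 10:16:40]  [pool www] pid 2214
script_filename = /var/www/app/public/index.php
[0x00007f2a1c2130a0] curl_exec() /var/www/app/app/Services/Rates.php:41

[14-May-2026 10:19:11]  [pool www] pid 2209
script_filename = /var/www/app/public/index.php
[0x00007f2a1c2130a0] file_get_contents() /var/www/app/app/Reports/Export.php:77
"""

counts = top_frames(sample_log)
for frame, hits in counts.most_common(3):
    print(f"{hits:3d}  {frame}")
```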
Monitoring the pool in production
PHP-FPM exposes a status page that should be your primary monitoring data source:
location /fpm-status {
    access_log off;
    allow 127.0.0.1;
    deny all;
    fastcgi_pass unix:/run/php/php8.3-fpm.sock;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    include fastcgi_params;
}
Enable it in the pool config:
pm.status_path = /fpm-status
The status page exposes: active processes, idle processes, total processes, max active processes (high-water mark), listen queue length, and max listen queue length.
The critical metric is listen queue. If this number is consistently above zero, requests are waiting for a free worker. Either your workers are too slow (fix the app) or you do not have enough workers (increase max_children if memory allows, or add a server).
max active processes tells you the peak concurrent worker usage. If this equals max_children, you have hit the ceiling and requests have probably queued; the status page's max children reached counter confirms how often that happened. If it is consistently below 50% of max_children on a static pool, you are wasting memory on idle workers, so switch to dynamic.
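The checks described above can be automated against the plain-text status output. This is a minimal sketch: it assumes the default "key: value" text format, the sample output is invented, and `parse_status` and `pool_warnings` are hypothetical helper names. Fetching the page over HTTP is left out on purpose.

```python
def parse_status(text):
    """Parse the plain-text status page into a dict of stripped strings."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")  # split on the first colon only
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def pool_warnings(fields, max_children):
    """Apply the listen-queue and ceiling checks described in the text."""
    warnings = []
    max_active = int(fields["max active processes"])
    if int(fields["listen queue"]) > 0:
        warnings.append("requests are waiting in the listen queue")
    if max_active >= max_children:
        warnings.append("peak usage reached pm.max_children")
    if fields.get("process manager") == "static" and max_active < 0.5 * max_children:
        warnings.append("static pool peaks below 50% of pm.max_children")
    return warnings


# Invented sample output, for illustration only.
sample_status = """\
pool:                 www
process manager:      dynamic
listen queue:         3
max listen queue:     11
idle processes:       0
active processes:     30
total processes:      30
max active processes: 30
max children reached: 4
"""

fields = parse_status(sample_status)
warnings = pool_warnings(fields, max_children=30)
for warning in warnings:
    print("WARNING:", warning)
```

A single non-zero listen queue reading is not alarming on its own; the text above is concerned with values that stay above zero across repeated samples.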
The tuning cycle
- Measure worker RSS under production load
- Calculate max_children from available memory
- Set pm.max_requests as a leak safety net
- Monitor the FPM status page for listen queue growth
- Review the slow log for application-level bottlenecks
- Repeat after significant code changes or traffic pattern shifts
This is not a one-time configuration. Every major release, every new package, every traffic milestone should trigger a review of the arithmetic. The math does not change — only the inputs do.
How Reflex approaches FPM tuning
Reflex's reflexd agent collects FPM pool metrics continuously — active workers, idle workers, listen queue depth, memory per worker, and slow request counts. The Brain uses these signals to detect pool pressure before it becomes an outage: rising listen queue depth, workers approaching memory limits, or slow log frequency increasing after a deploy. When the Brain determines FPM is in trouble, it can restart the pool gracefully — the same reload signal you would send manually, but triggered by data instead of a 3am page.