PHP-FPM tuning guide — optimal worker settings for production

The Reflex Team · 9 min · 14 May 2026

Every PHP application running on nginx talks to the world through PHP-FPM. And yet most production servers run the default pm.max_children = 5 from the package maintainer, or worse, a number copied from a Stack Overflow answer for a completely different workload.

This guide gives you the arithmetic, the mental model, and the monitoring checks to tune PHP-FPM properly, based on evidence, instead of guessing until the OOM killer makes the decision for you.

The fundamental constraint

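PHP-FPM spawns child processes. Each child handles one request at a time. Each child consumes resident memory of its own. Some pages, such as the opcache segment and shared libraries, are shared between workers but still counted in every worker's RSS, so sizing from RSS errs on the safe side.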

The constraint is simple:

pm.max_children × average_worker_rss ≤ memory_available_for_fpm

If the left side exceeds the right side, the OOM killer arrives. Every tuning decision flows from this equation.

Measuring worker memory

Do not guess. Measure:

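ps -C php-fpm8.3 -o pid,rss --no-headers | awk '{sum+=$2; count++} END {if (count) {printf "Average RSS: %.1f MB\nWorkers: %d\n", sum/count/1024, count} else {print "No php-fpm8.3 processes found"}}'

The process name php-fpm8.3 matches the Debian and Ubuntu packages for PHP 8.3; adjust it to your version. The count includes the FPM master process, which has little effect on the average once the pool has more than a handful of workers. Run this under realistic load, not during a quiet Tuesday morning. Peak traffic is what matters because that is when all workers are active simultaneously.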

For a typical Laravel application with Eloquent, queue listeners, and a few packages: 40-80 MB per worker is common. Heavy applications with image processing, PDF generation, or large dataset operations can reach 150-250 MB per worker.
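The average hides the worst case, so it helps to look at the largest worker too. The sketch below does the same summary as the ps one-liner in Python and adds the maximum. The function name summarize_rss and the sample PIDs and values are hypothetical; it assumes input in the two-column pid,rss format, with RSS in KiB as ps reports it.

```python
def summarize_rss(ps_output: str) -> dict:
    """Summarize the output of `ps -o pid,rss --no-headers` (RSS in KiB)."""
    rss_kb = []
    for line in ps_output.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        rss_kb.append(int(fields[1]))
    if not rss_kb:
        raise ValueError("no php-fpm processes found in ps output")
    return {
        "workers": len(rss_kb),
        "average_mb": sum(rss_kb) / len(rss_kb) / 1024,
        "max_mb": max(rss_kb) / 1024,
    }


# Hypothetical sample: three workers at 60, 50 and 70 MB.
sample = """\
 1201 61440
 1202 51200
 1203 71680
"""
print(summarize_rss(sample))
# {'workers': 3, 'average_mb': 60.0, 'max_mb': 70.0}
```

If max_mb sits far above average_mb, a few heavy requests dominate, and sizing purely from the average leaves less headroom than it appears.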

The memory budget

Total server RAM is not your budget. Subtract everything else first:

Total RAM:                       4096 MB
Operating system + kernel:     -  400 MB
MySQL/PostgreSQL:              -  800 MB
Redis:                         -  256 MB
nginx:                         -   50 MB
reflexd + other agents:        -  100 MB
Safety buffer (15% of total):  -  615 MB
Available for PHP-FPM:         = 1875 MB

Now apply the formula. If your average worker RSS is 60 MB:

max_children = 1875 / 60 ≈ 31

Round down, not up. Set pm.max_children = 30 and leave headroom for the worst-case request that allocates more than average.
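The budget arithmetic above is easy to script so it can be rerun whenever an input changes. This is a minimal sketch using the figures from this guide; the function names and the dictionary keys are illustrative, and the safety buffer is assumed to be a fraction of total RAM rounded up to a whole megabyte.

```python
import math


def fpm_budget_mb(total_mb: int, reserved_mb: dict, safety_fraction: float = 0.15) -> int:
    """Memory left for PHP-FPM after other services and a safety buffer."""
    buffer_mb = math.ceil(total_mb * safety_fraction)  # 15% of 4096 -> 615
    return total_mb - sum(reserved_mb.values()) - buffer_mb


def max_children(budget_mb: float, avg_worker_rss_mb: float) -> int:
    """Largest worker count that fits the budget, always rounded down."""
    if budget_mb <= 0 or avg_worker_rss_mb <= 0:
        raise ValueError("budget and worker RSS must be positive")
    return math.floor(budget_mb / avg_worker_rss_mb)


# Reservations from the example server in this guide.
reserved = {"os": 400, "database": 800, "redis": 256, "nginx": 50, "agents": 100}
budget = fpm_budget_mb(4096, reserved)
print(budget)                    # 1875
print(max_children(budget, 60))  # 31
```

The result is an upper bound. Dropping to 30, as the guide does, is a judgment call about headroom rather than something the formula produces.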

Dynamic vs static vs on-demand

PHP-FPM offers three process management modes:

pm = static — spawns max_children workers at startup, keeps them alive permanently.

Best for: dedicated servers with predictable traffic. No spawn latency on traffic spikes. Memory usage is constant and predictable — the arithmetic above applies exactly.

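pm = dynamic: starts with pm.start_servers workers, then adds or removes workers to keep the number of idle ones between pm.min_spare_servers and pm.max_spare_servers, never exceeding pm.max_children in total.

Best for: servers with variable traffic where idle workers waste memory. The trade-off is spawn latency during sudden spikes: the master adds workers in batches on a periodic maintenance cycle rather than instantly, so a sudden burst can leave requests waiting in the listen queue, and occasionally timing out upstream, until the pool catches up.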

Sensible defaults for a 4 GB server:

pm = dynamic
pm.max_children = 30
pm.start_servers = 8
pm.min_spare_servers = 4
pm.max_spare_servers = 12
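The four dynamic settings have to be consistent with each other, and PHP-FPM refuses to start a pool when they are not. The sketch below mirrors the kind of sanity checks involved; it is an approximation for use before a deploy, not a copy of PHP-FPM's own validation, and the function names are hypothetical. The midpoint helper reflects how the php-fpm documentation describes the default for pm.start_servers, which is why 8 pairs with a spare range of 4 to 12.

```python
def check_dynamic_pool(max_children: int, start_servers: int,
                       min_spare: int, max_spare: int) -> list[str]:
    """Return a list of problems with a pm = dynamic configuration."""
    problems = []
    if min_spare < 1:
        problems.append("pm.min_spare_servers must be at least 1")
    if max_spare < min_spare:
        problems.append("pm.max_spare_servers is below pm.min_spare_servers")
    if max_spare > max_children:
        problems.append("pm.max_spare_servers exceeds pm.max_children")
    if not (min_spare <= start_servers <= max_spare):
        problems.append("pm.start_servers is outside the spare range")
    return problems


def midpoint_start_servers(min_spare: int, max_spare: int) -> int:
    """Midpoint of the spare range, a reasonable starting value."""
    return min_spare + (max_spare - min_spare) // 2


print(check_dynamic_pool(30, 8, 4, 12))  # []
print(midpoint_start_servers(4, 12))     # 8
```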

pm = ondemand — no workers at idle, spawns on demand, kills after pm.process_idle_timeout.

Best for: development servers, shared hosting, or boxes running many low-traffic pools. Not recommended for production with any real traffic — the spawn latency on every request after idle is noticeable.

pm.max_requests: the memory leak safety valve

pm.max_requests = 500

After serving 500 requests, a worker gracefully exits and the master spawns a replacement. This prevents slow memory leaks from accumulating indefinitely. Without it, a worker that leaks 0.5 MB per request will consume an extra 250 MB after 500 requests — and it will never give that memory back.

The default of 0 (unlimited) is dangerous for any application with third-party packages. Set it between 200 and 1000 depending on how much you trust your dependency tree.

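Caution: setting this too low increases worker churn. Every respawn has a cost: the master has to fork a replacement, per-process state such as the realpath cache and persistent database connections is lost, and there is a brief moment where that worker slot is unavailable. The opcache lives in shared memory and survives worker recycling, so it does not need to warm again. Balance leak prevention against churn overhead.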
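To pick a value rather than guess one, work backwards from how much growth each worker can afford. This is a rough sketch that assumes a constant leak per request, which real leaks rarely are; the function names and the 100 MB headroom figure are illustrative.

```python
def worst_case_rss_mb(base_rss_mb: float, leak_mb_per_request: float,
                      max_requests: int) -> float:
    """RSS of a worker just before it is recycled, assuming a steady leak."""
    return base_rss_mb + leak_mb_per_request * max_requests


def max_requests_for_headroom(headroom_mb: float, leak_mb_per_request: float) -> int:
    """Largest pm.max_requests that keeps leak growth within the headroom."""
    if leak_mb_per_request <= 0:
        raise ValueError("leak per request must be positive")
    return int(headroom_mb // leak_mb_per_request)


# A 60 MB worker leaking 0.5 MB per request, recycled after 500 requests:
print(worst_case_rss_mb(60, 0.5, 500))      # 310.0
# If each worker may only grow by 100 MB before recycling:
print(max_requests_for_headroom(100, 0.5))  # 200
```

The first number is the point: a worker budgeted at 60 MB that is allowed to reach 310 MB breaks the max_children arithmetic, so a leaky application needs either a lower pm.max_requests or a larger per-worker figure in the budget.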

pm.process_idle_timeout (ondemand only)

pm.process_idle_timeout = 10s

How long an idle worker survives before the master kills it to reclaim memory. This setting applies only to pm = ondemand; in dynamic mode the pool shrinks whenever the number of idle workers exceeds pm.max_spare_servers, regardless of this value. Ten seconds is the default and a reasonable one: long enough to absorb brief pauses between requests, short enough to reclaim memory within a minute of traffic dropping.

Slow log: your free profiler

request_slowlog_timeout = 5s
slowlog = /var/log/php-fpm/slow.log

Any request that takes longer than 5 seconds gets a full stack trace dumped to the slow log. This is the cheapest performance diagnostic tool available — it tells you exactly which function was executing when the timeout hit, without any application instrumentation.

Review the slow log weekly. Patterns emerge: the same Eloquent query, the same HTTP client call, the same PDF generation path. Fix the top offender and your p99 latency drops.
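The weekly review is quicker when the log is tallied first. The sketch below counts the topmost stack frame of each slow-log entry. It assumes the usual entry layout (a header line, a script_filename line, then frames prefixed with a hex address); check it against your own log before relying on it. The function name and the sample paths are hypothetical.

```python
import re
from collections import Counter

# A frame line looks like: [0x00007f2b1c2130a0] curl_exec() /path/File.php:88
FRAME = re.compile(r"^\[0x[0-9a-fA-F]+\]\s+(\S+)\s+(.+)$")


def top_slow_frames(slowlog_text: str, limit: int = 5) -> list:
    """Count the first (innermost) frame of every slow-log entry."""
    counts = Counter()
    expecting_top = False
    for raw in slowlog_text.splitlines():
        line = raw.strip()
        if line.startswith("script_filename"):
            expecting_top = True  # the next frame line is the top of the stack
            continue
        match = FRAME.match(line)
        if match and expecting_top:
            counts[f"{match.group(1)} {match.group(2)}"] += 1
            expecting_top = False
    return counts.most_common(limit)


sample_log = """\
[14-May-2026 10:15:02]  [pool www] pid 1201
script_filename = /var/www/app/public/index.php
[0x00007f2b1c2130a0] curl_exec() /var/www/app/app/Services/Billing.php:88
[0x00007f2b1c213020] charge() /var/www/app/app/Http/Checkout.php:31

[14-May-2026 10:16:40]  [pool www] pid 1207
script_filename = /var/www/app/public/index.php
[0x00007f2b1c2140a0] execute() /var/www/app/app/Reports/Monthly.php:41

[14-May-2026 10:18:11]  [pool www] pid 1203
script_filename = /var/www/app/public/index.php
[0x00007f2b1c2150a0] curl_exec() /var/www/app/app/Services/Billing.php:88
"""
for frame, hits in top_slow_frames(sample_log):
    print(hits, frame)
```

On this sample the curl_exec() call in Billing.php comes first with two hits, which is the "top offender" the review is meant to surface.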

Monitoring the pool in production

PHP-FPM exposes a status page that should be your primary monitoring data source:

location /fpm-status {
    access_log off;
    allow 127.0.0.1;
    deny all;
    fastcgi_pass unix:/run/php/php8.3-fpm.sock;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    include fastcgi_params;
}

Enable it in the pool config:

pm.status_path = /fpm-status

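The status page exposes: active processes, idle processes, total processes, max active processes (high-water mark), listen queue (requests currently waiting), max listen queue (its high-water mark), listen queue len (the size of the socket backlog), max children reached, and slow requests.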

The critical metric is listen queue. If this number is consistently above zero, requests are waiting for a free worker. Either your workers are too slow (fix the app) or you do not have enough workers (increase max_children if memory allows, or add a server).

max active processes tells you the peak concurrent worker usage. If this equals max_children, you have hit the ceiling and requests have probably queued; on dynamic and ondemand pools, the status page's max children reached counter records how often that happened. If it is consistently below 50% of max_children on a static pool, you are wasting memory on idle workers, so switch to dynamic.
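These checks are simple enough to automate. The status page returns JSON when requested with ?json appended to the path, and the sketch below applies the rules from this section to that output. The function name, the messages, and the sample values are hypothetical. The max children reached field is read defensively in case your version does not report it.

```python
import json


def assess_pool(status: dict, max_children: int) -> list[str]:
    """Apply this section's rules of thumb to a parsed FPM status document."""
    findings = []
    if status.get("listen queue", 0) > 0:
        findings.append("requests are waiting in the listen queue")
    peak = status.get("max active processes", 0)
    hit_ceiling = peak >= max_children or status.get("max children reached", 0) > 0
    if hit_ceiling:
        findings.append("pool has hit max_children")
    elif status.get("process manager") == "static" and peak < 0.5 * max_children:
        findings.append("static pool peaks below 50% of max_children; consider dynamic")
    return findings


# Hypothetical response body from /fpm-status?json
body = """{"pool": "www", "process manager": "static", "listen queue": 0,
"max listen queue": 3, "idle processes": 22, "active processes": 8,
"total processes": 30, "max active processes": 12, "max children reached": 0}"""
print(assess_pool(json.loads(body), max_children=30))
```

A single sample only shows the current listen queue. The section's advice is about values that are consistently above zero, so run a check like this on a schedule and look at the trend.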

The tuning cycle

1. Measure worker RSS under production load
2. Calculate max_children from available memory
3. Set pm.max_requests as a leak safety net
4. Monitor the FPM status page for listen queue growth
5. Review the slow log for application-level bottlenecks
6. Repeat after significant code changes or traffic pattern shifts

This is not a one-time configuration. Every major release, every new package, every traffic milestone should trigger a review of the arithmetic. The math does not change — only the inputs do.

How Reflex approaches FPM tuning

Reflex's reflexd agent collects FPM pool metrics continuously — active workers, idle workers, listen queue depth, memory per worker, and slow request counts. The Brain uses these signals to detect pool pressure before it becomes an outage: rising listen queue depth, workers approaching memory limits, or slow log frequency increasing after a deploy. When the Brain determines FPM is in trouble, it can restart the pool gracefully — the same reload signal you would send manually, but triggered by data instead of a 3am page.
