Celery worker not processing tasks: fix guide

How to diagnose and fix Celery workers that stop processing tasks from the queue.

Topic: Production error triage
Stack: Python / Celery / Redis / Linux

TL;DR

When Celery workers stop processing tasks, the application keeps accepting work and jobs pile up in the queue, but nothing completes. This can be harder to spot than a crash because there is no obvious error: tasks simply sit unprocessed.

Common causes

The message broker (Redis or RabbitMQ) is down or unreachable
Workers are stuck on a long-running or deadlocked task
Task routing is misconfigured — tasks are sent to a queue no worker is consuming
Workers hit their memory limit and were killed without replacement
Prefetch count is too high, causing one worker to hoard tasks it cannot process in time
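
The routing cause in the list above can be checked from Python by comparing the queues you send tasks to against the queues that live workers report consuming. The following is a minimal sketch, not a definitive tool: the helper names unconsumed_queues and check_routing are hypothetical, and it assumes the reply has the shape returned by app.control.inspect().active_queues() (a dict of worker name to a list of queue dicts, or None when no worker answers).

```python
def unconsumed_queues(expected_queues, active_queues_reply):
    """Return the expected queue names that no responding worker consumes.

    active_queues_reply is assumed to look like the result of
    app.control.inspect().active_queues(): {worker_name: [{"name": ...}, ...]}
    or None when no worker replied within the timeout.
    """
    consumed = set()
    for queues in (active_queues_reply or {}).values():
        for queue in queues or []:
            consumed.add(queue["name"])
    return sorted(set(expected_queues) - consumed)


def check_routing(app, expected_queues, timeout=2.0):
    """Ask live workers which queues they consume (needs a reachable broker)."""
    reply = app.control.inspect(timeout=timeout).active_queues()
    return unconsumed_queues(expected_queues, reply)
```

Any name returned is a queue that tasks may be routed to while nothing reads from it. If every expected queue comes back, either no worker is running or none answered before the timeout.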

Diagnosis workflow

Check worker status:

celery -A myapp inspect active
celery -A myapp inspect reserved
celery -A myapp inspect ping
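
The same three checks can be run from Python and condensed into one summary per worker. This is a sketch under stated assumptions: summarize_workers and inspect_workers are hypothetical helper names, and the reply shapes are assumed to match what ping(), active() and reserved() return on app.control.inspect() (dicts keyed by worker name, or None when nothing answers).

```python
def summarize_workers(ping_reply, active_reply, reserved_reply):
    """Condense inspect replies into per-worker counts.

    Each argument is assumed to be a dict keyed by worker name, or None
    when no worker answered within the timeout.
    """
    ping_reply = ping_reply or {}
    active_reply = active_reply or {}
    reserved_reply = reserved_reply or {}
    summary = {}
    for worker in sorted(set(ping_reply) | set(active_reply) | set(reserved_reply)):
        summary[worker] = {
            # A healthy worker answers ping with {"ok": "pong"}.
            "responding": ping_reply.get(worker, {}).get("ok") == "pong",
            # Tasks currently executing on this worker.
            "active": len(active_reply.get(worker) or []),
            # Tasks prefetched by this worker but not yet started.
            "reserved": len(reserved_reply.get(worker) or []),
        }
    return summary


def inspect_workers(app, timeout=2.0):
    """Query live workers (needs a reachable broker and running workers)."""
    inspector = app.control.inspect(timeout=timeout)
    return summarize_workers(inspector.ping(), inspector.active(), inspector.reserved())
```

An empty summary means no worker answered at all. A worker whose active tasks never change between runs points to a stuck task, and a large reserved count next to a small active count points to prefetch hoarding.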

Check broker connectivity and queue depths:

redis-cli ping
redis-cli llen celery

For RabbitMQ:

rabbitmqctl list_queues name messages consumers
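
A single queue length says little on its own; what matters is whether the backlog keeps growing. For the Redis case, the sketch below samples LLEN a few times and applies a simple heuristic. The function names, the sample count and the interval are illustrative assumptions, and client is assumed to be a redis-py client such as redis.Redis().

```python
import time


def is_backlog_growing(depths, min_samples=3):
    """True if the depth never decreased and ended higher than it started."""
    if len(depths) < min_samples:
        return False
    never_shrinks = all(later >= earlier for earlier, later in zip(depths, depths[1:]))
    return never_shrinks and depths[-1] > depths[0]


def sample_queue_depth(client, queue="celery", samples=5, interval=10.0):
    """Read LLEN for `queue` `samples` times, `interval` seconds apart.

    `client` is assumed to expose llen(), as a redis-py client does.
    """
    depths = []
    for i in range(samples):
        depths.append(client.llen(queue))
        if i < samples - 1:
            time.sleep(interval)
    return depths
```

A backlog that grows over several samples while workers report no active tasks is a strong hint that nothing is consuming the queue. A backlog that grows while workers are busy may only mean that more workers are needed.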

Review worker logs:

journalctl -u celery-worker --since "30 minutes ago"

Quick fixes

Restart the workers:

sudo systemctl restart celery-worker

Purge a specific queue (use with caution: this permanently deletes every waiting task in that queue, not only the stuck ones):

celery -A myapp purge -Q queue_name

Prevention

Configure sensible worker settings:

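# app is your Celery application instance
app.conf.update(
    worker_max_tasks_per_child=200,       # recycle each child process after 200 tasks
    worker_max_memory_per_child=256_000,  # in kilobytes (about 250 MB); the child is replaced after its current task
    task_time_limit=300,                  # hard limit in seconds; the task's process is killed
    task_soft_time_limit=240,             # raises SoftTimeLimitExceeded inside the task first
    worker_prefetch_multiplier=4,         # 4 is the default; use 1 if tasks are long-running
    task_acks_late=True,                  # acknowledge after the task finishes, not when it starts
)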

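Setting task_acks_late=True delays the acknowledgement until the task finishes, so a task that was running when its worker shut down or lost the broker connection can be redelivered instead of lost. Two caveats apply. If the child process executing the task is killed abruptly (for example by the OOM killer), Celery still acknowledges the task unless task_reject_on_worker_lost=True is also set. With Redis, an unacknowledged task is only redelivered after the visibility timeout, which defaults to one hour. Because a task may run more than once, make tasks idempotent.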
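
Late acknowledgement interacts with a few other settings, so it can help to sanity-check the combination before deploying. The sketch below is one possible check, not a complete validator: check_worker_settings is a hypothetical helper, while task_reject_on_worker_lost and the visibility_timeout key under broker_transport_options are existing Celery settings.

```python
def check_worker_settings(settings):
    """Return a list of warnings for a dict of Celery settings."""
    warnings = []
    hard = settings.get("task_time_limit")
    soft = settings.get("task_soft_time_limit")
    acks_late = settings.get("task_acks_late", False)

    # The soft limit must fire first so the task has time to clean up.
    if hard is not None and soft is not None and soft >= hard:
        warnings.append("task_soft_time_limit should be lower than task_time_limit")

    # Without this, a task whose child process is killed is still acknowledged.
    if acks_late and not settings.get("task_reject_on_worker_lost", False):
        warnings.append("task_acks_late is set but task_reject_on_worker_lost is not")

    # With Redis, a task still running past the visibility timeout is
    # redelivered to another worker and may execute twice.
    visibility = settings.get("broker_transport_options", {}).get("visibility_timeout")
    if acks_late and visibility is not None and hard is not None and visibility <= hard:
        warnings.append("visibility_timeout should exceed task_time_limit")

    return warnings
```

Usage: build the settings as a dict, call check_worker_settings(settings), and pass the dict to app.conf.update(**settings) once the list is empty. Note that task_reject_on_worker_lost=True can requeue a task that itself crashes the worker, so the same task may loop; pair it with a memory limit or a retry cap.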

Where Reflex helps

Reflex monitors Celery queue depths and worker heartbeats. When queue depth grows while worker throughput drops, Reflex detects the mismatch, restarts stuck workers, verifies tasks begin processing again, and sends a detailed incident report with the timeline. See How it works.
