Don't Let Failed Tasks Stop Your Celery Chord Callback

The problem

Let’s start with a simplified real-world scenario: you need to process a large number of contracts in parallel, and once they’re done, generate a report. The reporting task analyzes the outcome of each contract, generates some fancy charts for management and sends them via email.

So, the workflow looks like this:

  1. Schedule one task per contract and wait for them all to complete.
  2. Run a report task that aggregates the results and performs the analysis.

If you’re familiar with Celery, this may sound like a perfect use case for a chord: run multiple tasks in parallel (the header), then execute a callback once they all finish.

And indeed, a basic version of that might look like this:

    from celery import chord

    @app.task
    def process_single_contract(contract: Contract):
        try:
            result = contract.process()
            return {"success": True, "result": result}
        except ContractProcessingException as e:
            return {"success": False, "error": str(e)}

    @app.task
    def analyze_results_callback(results):
        # render_fancy_graphics()
        # send_email()
        ...

    def schedule_contracts_processing(contracts: list[Contract]):
        tasks = [
            process_single_contract.s(contract)
            for contract in contracts
        ]
        # The chord body must be a signature, hence .s()
        chord(tasks)(analyze_results_callback.s())

However, Celery chords have one significant limitation:

If any task in the chord’s header fails, the entire chord fails — and the callback is never executed.

There are many reasons a task might fail:

  • uncaught exceptions,
  • timeouts (soft or hard),
  • task revocation,
  • worker crashes or OOM,
  • manual termination (e.g., via Flower).

This behavior can be problematic: if even one contract’s task fails, the report is never generated, and the results of every contract that did succeed go unreported. Fortunately, there is a workaround to handle this more gracefully.

What is an errback?

Celery provides an error callback mechanism called an errback — a function that’s called if a task fails.

Here’s a simplified example from the Celery documentation:

    @app.task
    def on_chord_error(request, exc, traceback):
        print(f"Task {request.id!r} raised: {exc!r}")

    c = chord([
        add.s(4, 4),
        raising_task.s(),
        add.s(8, 8),
    ])(
        successful_callback.s().on_error(on_chord_error.s())
    )

Here’s what’s happening:

  • If all header tasks succeed, successful_callback is executed.
  • If any header task fails, on_chord_error is triggered instead.

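According to the Celery documentation, an errback linked to the chord body like this is invoked once if any task in the header (or the body itself) fails, while the callback runs once if everything succeeds. In other words, with the default settings you get one or the other, never both.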
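
The either/or dispatch described above can be modeled in a few lines of plain Python. This is a toy, single-process sketch and not Celery’s implementation: `run_chord_model` and its parameters are hypothetical names made up for illustration, and the real chord runs its header tasks on workers and tracks them through the result backend.

```python
from typing import Any, Callable, Optional


def run_chord_model(
    header: list[Callable[[], Any]],
    callback: Callable[[list[Any]], Any],
    errback: Optional[Callable[[Exception], Any]] = None,
) -> Any:
    """Toy model of chord dispatch (hypothetical helper, not a Celery API)."""
    results = []
    first_error: Optional[Exception] = None

    # Every header task runs, even if an earlier one has already failed.
    for task in header:
        try:
            results.append(task())
        except Exception as exc:
            if first_error is None:
                first_error = exc

    if first_error is None:
        # All header tasks succeeded: the callback receives every result.
        return callback(results)
    if errback is not None:
        # At least one failure: the callback is skipped and the errback
        # sees only the error, not the results that were collected.
        return errback(first_error)
    # Simplification: with no errback, the model just re-raises.
    raise first_error


def raising_task():
    raise ValueError("contract 2 is malformed")


outcome = run_chord_model(
    header=[lambda: 8, raising_task, lambda: 16],
    callback=lambda results: ("callback", sum(results)),
    errback=lambda exc: ("errback", str(exc)),
)
print(outcome)  # ('errback', 'contract 2 is malformed')
```

Note that the two successful results (8 and 16) never reach the errback in this model, which mirrors the gap the rest of this post works around.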

Can the errback trigger the callback?

Yes — and this gives us a clean workaround.

If we invoke the callback from the errback (after collecting results from successful tasks), we essentially guarantee that the callback logic runs, even if some header tasks failed.

Updated workflow

Let’s visualize this with a flowchart:

---
title: "Updated Workflow: Errback Calls Callback"
---
flowchart LR
    header_done["Header tasks finished"]
    callback["Callback executed"]
    errback["Errback triggered — retrieve successful results"]

    header_done -->|All succeeded| callback
    header_done -->|Some failed| errback
    errback --> callback

Notice that in the failure case, the errback is responsible for retrieving successful results and then triggering the callback. That’s necessary because Celery does not pass the header results to the errback; it only receives the error context (the request, the exception, and the traceback).

To retrieve the results, the errback needs access to the task IDs, which we can freeze before the chord is dispatched.

Final implementation

Here’s how this looks in code:

    from celery import chord, group
    from celery.result import AsyncResult

    @app.task
    def process_single_contract(contract: Contract):
        try:
            result = contract.process()
            return {"success": True, "result": result}
        except ContractProcessingException as e:
            return {"success": False, "error": str(e)}

    @app.task
    def analyze_results_callback(results):
        # render_fancy_graphics()
        # send_email()
        ...

    @app.task
    def on_chord_error(request, exc, traceback, task_ids: list[str]):
        # Celery supplies (request, exc, traceback) to the errback;
        # task_ids comes from the signature built below.
        tasks = [AsyncResult(task_id, app=app) for task_id in task_ids]
        successful_results = [
            task.result for task in tasks if task.successful()
        ]
        analyze_results_callback.delay(results=successful_results)

    def schedule_contracts_processing(contracts: list[Contract]):
        header_group = group(
            process_single_contract.s(contract)
            for contract in contracts
        )

        # Freeze the header tasks so their IDs are known before dispatch.
        # Plain ID strings are passed on because they serialize with any serializer.
        task_ids = [task.freeze().id for task in header_group.tasks]

        callback = analyze_results_callback.s().on_error(
            on_chord_error.s(task_ids=task_ids)
        )

        chord(header_group)(callback)

Key changes in this version:

  • The on_chord_error errback receives references to the frozen header tasks, allowing it to look up which tasks succeeded and collect their results.
  • The callback logic now runs whether or not every header task succeeds. In the failure case it receives only the results of the tasks that completed successfully.
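
Because the callback may now receive fewer results than the number of contracts scheduled, the report should account for the gap. The following is a minimal sketch of how the analysis step might count outcomes, assuming the result dicts returned by process_single_contract above; `summarize_results` and its `expected_total` parameter are hypothetical names and not part of the original implementation.

```python
from typing import Any


def summarize_results(
    results: list[dict[str, Any]], expected_total: int
) -> dict[str, int]:
    """Count outcomes for the report (hypothetical helper).

    results: dicts shaped like the return value of process_single_contract.
    expected_total: number of contracts that were scheduled (assumed to be
    known to the caller).
    """
    succeeded = sum(1 for r in results if r.get("success"))
    # Failures the task caught itself and reported as {"success": False, ...}.
    handled_errors = len(results) - succeeded
    # Tasks that crashed, timed out, or were revoked returned nothing at all.
    missing = expected_total - len(results)
    return {
        "succeeded": succeeded,
        "handled_errors": handled_errors,
        "missing": missing,
    }


partial = [
    {"success": True, "result": "ok"},
    {"success": False, "error": "invalid signature"},
]
summary = summarize_results(partial, expected_total=3)
print(summary)  # {'succeeded': 1, 'handled_errors': 1, 'missing': 1}
```

Reporting the `missing` count separately keeps hard failures visible to management instead of silently shrinking the totals.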

Conclusion

By chaining the callback from within the errback, and freezing task references ahead of time, you can work around one of Celery’s limitations: losing your chord callback when any task fails.

This pattern ensures that partial results still get reported and that your system remains resilient — even when individual tasks fail.