FastAPI Webhook Receiver Template with Retry

Webhooks are the backbone of event-driven architectures, allowing different services to communicate asynchronously. Whether you're integrating with payment gateways like Stripe, version control systems like GitHub, or SaaS platforms like Salesforce, you'll inevitably encounter webhooks. They're powerful, but building a robust webhook receiver isn't always straightforward. You need to handle transient network issues, service unavailability, and ensure data integrity, all while responding quickly to the sender.

In this article, we'll walk through building a FastAPI webhook receiver that incorporates robust retry mechanisms. We'll cover why synchronous retries are a bad idea, how to implement asynchronous processing with task queues, and the importance of idempotency and security.

The Basic FastAPI Webhook Receiver

Let's start with a minimal FastAPI application that can receive a webhook.

```python
# main.py
from fastapi import FastAPI, Request
import logging

app = FastAPI()
logger = logging.getLogger(__name__)

@app.post("/webhook")
async def receive_webhook(request: Request):
    try:
        payload = await request.json()
        logger.info(f"Received webhook payload: {payload}")
        # In a real application, you'd process this payload.
        # For now, let's simulate some processing.
        # This is where things can go wrong!

        # Example: Simulate a processing error
        if "data" in payload and payload["data"] == "error":
            raise ValueError("Simulated processing error")

        return {"status": "success", "message": "Webhook received and processed."}
    except Exception as e:
        logger.error(f"Error processing webhook: {e}", exc_info=True)
        # It's generally good practice to respond with a 2xx status
        # even if internal processing fails, to prevent the sender from retrying
        # immediately and aggressively. The actual retry logic should be internal.
        return {"status": "error", "message": f"Webhook received, but internal error: {e}"}

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8000)
```

To run this, save it as main.py and install FastAPI and Uvicorn:

```bash
pip install fastapi uvicorn
```

Then execute:

```bash
uvicorn main:app --reload
```

You can test it with curl:

```bash
curl -X POST -H "Content-Type: application/json" \
  -d '{"event": "user_created", "data": {"id": 123, "name": "Alice"}}' \
  http://localhost:8000/webhook
```

This basic setup works, but it has a critical flaw: if the processing logic inside receive_webhook fails (e.g., a database connection drops, an external API is down, or there's a transient network issue), the webhook sender might consider it a failure and retry. More importantly, if the processing takes a long time, the sender might time out and retry anyway, leading to duplicate events.

Introducing Retries: Why and How

Why Retry?

Modern distributed systems are inherently unreliable. Transient errors are common:

  • Network glitches: Brief disconnections or high latency.
  • Service unavailability: A dependent microservice or third-party API is temporarily down or overloaded.
  • Database contention: Deadlocks or temporary connection pool exhaustion.

Retrying these operations after a short delay is often enough to succeed.
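As a rough illustration of this idea, here is a minimal retry helper with exponential backoff and jitter. The `operation` callable is a stand-in for whatever flaky step you need to protect (a database write, an outbound API call); the exception types and delay values are assumptions you would tune for your own dependencies:

```python
import random
import time

def call_with_retries(operation, max_attempts=4, base_delay=0.5):
    """Call a flaky operation, retrying with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except (ConnectionError, TimeoutError):
            if attempt == max_attempts:
                raise  # Out of attempts: let the caller see the failure
            # Backoff doubles each attempt (0.5s, 1s, 2s, ...) with a little
            # random jitter so many clients don't retry in lockstep.
            delay = base_delay * (2 ** (attempt - 1)) + random.uniform(0, 0.1)
            time.sleep(delay)
```

The jitter matters in practice: without it, many clients that failed at the same moment will all retry at the same moment, hammering the recovering service again.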

Synchronous vs. Asynchronous Retries

When it comes to webhooks, how you handle retries is crucial:

  • Synchronous Retries (Bad Idea): If your FastAPI endpoint tries to process the webhook and retry failed operations within the same request-response cycle, you're tying up the webhook sender. This is problematic because:

    • Timeouts: Most webhook senders have strict timeout limits (e.g., 5-10 seconds). If your retries exceed this, the sender will time out and likely retry the entire webhook, regardless of your internal retries.
    • Resource Blocking: You're holding open an HTTP connection for potentially a long time, consuming resources on both your server and the sender's.
    • Lack of Durability: If your FastAPI application crashes during a retry loop, the webhook processing is lost.
  • Asynchronous Retries (The Right Way): The best practice is to acknowledge the webhook immediately (with a 2xx HTTP status code), then hand off the actual processing and any retry logic to a background task. This decouples the webhook reception from its processing.

How to Implement Asynchronous Retries

For asynchronous processing and retries, you typically use a task queue.

Option 1: Task Queues (Recommended for Production)

Task queues like Celery, Redis Queue (RQ), or Apache Kafka (for more complex streaming scenarios) are designed for this. They provide:

  • Durability: Tasks are persisted, so if your FastAPI app or worker crashes, the task isn't lost.
  • Scalability: You can run multiple workers to process tasks in parallel.
  • Retry Mechanisms: Built-in support for retries, exponential backoff, and dead-letter queues.

Let's integrate with Celery as a concrete example.

First, install Celery and a broker (e.g., Redis):

```bash
pip install "celery[redis]"
```

Next, create a celery_app.py:

```python
# celery_app.py
from celery import Celery
import logging

logger = logging.getLogger(__name__)

# Configure Celery with a Redis broker
celery_app = Celery(
    "webhook_tasks",
    broker="redis://localhost:6379/0",
    backend="redis://localhost:6379/0",
)

# Optional: Configure Celery to retry by default
celery_app.conf.task_acks_late = True  # Acknowledge task only after completion
celery_app.conf.task_reject_on_worker_lost = True  # Re-queue if worker dies during processing

@celery_app.task(bind=True, max_retries=5, default_retry_delay=60)
def process_webhook_payload(self, payload: dict, event_id: str = None):
    """
    Celery task to process the webhook payload with retries.
    """
    try:
        logger.info(f"Processing task for event_id={event_id}, payload={payload}")

        # --- Idempotency Check (Crucial for retries!) ---
        # Before doing any critical work