Behind CVFactory's Backend: Celery, FastAPI, and Playwright at Scale

Introduction

In this follow-up post I’ll lift the hood on CVFactory’s backend: the directory you can find at @/Backend in the repo.

The service may look small, but it orchestrates a surprising number of moving parts:

  • FastAPI for a thin HTTP interface
  • Celery for asynchronous, fault-tolerant job queues
  • Playwright for scraping job descriptions that hide behind login walls
  • LangChain for prompt templating and LLM orchestration
  • Docker + Supervisord to bundle everything into a single, reproducible container

My goal is to share the design decisions, code snippets, and gotchas so that you can reuse or extend the pattern in your own projects.

1. Why split the backend from Django?

Django excels at session-based web apps, but long-running AI calls and headless browser automation can block the event loop and exhaust Gunicorn workers. Off-loading these tasks to a dedicated FastAPI + Celery stack keeps the main web app snappy and horizontally scalable. The request flow looks roughly like this:

graph TD
  Browser[User Browser]
  Django[Django Frontend]
  API[FastAPI Service]
  Queue[Celery \n Redis Broker]
  Worker[Celery Worker]
  LLM[OpenAI GPT-4o]
  Playwright[Playwright Crawler]

  Browser -->|POST /generate| Django
  Django -->|HTTP ↔ JSON| API
  API -->|send_task| Queue
  Queue --> Worker
  Worker --> LLM
  Worker --> Playwright
  Worker -->|result| API
  API --> Django
  Django --> Browser
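
To make the hand-off concrete, here is a minimal sketch of the FastAPI side. The route path matches the diagram, but the task name, payload fields, and result endpoint are illustrative assumptions rather than the exact CVFactory code.

# api/main.py (sketch; task name, payload fields, and result route are assumptions)
from celery import Celery
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
celery_app = Celery(
    "cvfactory",
    broker="redis://redis:6379/0",   # Redis broker, as in the diagram
    backend="redis://redis:6379/1",
)

class GenerateRequest(BaseModel):
    job_url: str
    profile: dict

@app.post("/generate")
def generate(req: GenerateRequest):
    # Enqueue the pipeline and return immediately; the web worker never blocks
    # on scraping or LLM calls.
    result = celery_app.send_task(
        "tasks.pipeline.run_pipeline",  # hypothetical task name
        kwargs={"job_url": req.job_url, "profile": req.profile},
    )
    return {"task_id": result.id}

@app.get("/result/{task_id}")
def get_result(task_id: str):
    # Poll-style retrieval; the real service could just as well stream or push.
    res = celery_app.AsyncResult(task_id)
    return {"status": res.status, "data": res.result if res.ready() else None}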

2. Task pipeline in depth

  1. extract_html.py — Given a URL, spin up a Playwright context, authenticate if needed, and extract the raw HTML.
  2. text_extraction.py — Clean the HTML with BeautifulSoup and remove boilerplate like nav bars.
  3. content_filtering.py — Apply a profanity filter and redact PII (Personally Identifiable Information).
  4. cover_letter_generation.py — Build a prompt, call the LLM, and stream tokens back to the client.
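
Wired together, the four steps can run as a single Celery chain, with each task handing its result to the next. This is a sketch under the assumption that each module exposes a shared_task with the function names shown; the real signatures may differ.

# tasks/pipeline.py (sketch; task function names and signatures are assumptions)
from celery import chain

from tasks.extract_html import extract_html
from tasks.text_extraction import extract_text
from tasks.content_filtering import filter_content
from tasks.cover_letter_generation import generate_cover_letter

def run_pipeline(job_url: str, profile: dict):
    # In a chain, each task's return value is prepended to the next task's arguments.
    workflow = chain(
        extract_html.s(job_url),           # URL -> raw HTML
        extract_text.s(),                  # raw HTML -> clean text
        filter_content.s(),                # clean text -> sanitized text
        generate_cover_letter.s(profile),  # sanitized text + profile -> draft
    )
    return workflow.apply_async()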

Each step is idempotent and logged to /Backend/logs so that reruns don’t re-crawl the same page unnecessarily.
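
One cheap way to get that behaviour is to key the crawl on a hash of the URL and reuse the stored HTML when it already exists. A minimal sketch, assuming a file cache and a hypothetical Playwright helper:

# tasks/extract_html.py (idempotency guard sketch; cache path and helper are assumptions)
import hashlib
from pathlib import Path

CACHE_DIR = Path("/Backend/logs/html_cache")  # hypothetical location under the log dir

def fetch_html_cached(url: str) -> str:
    CACHE_DIR.mkdir(parents=True, exist_ok=True)
    cache_file = CACHE_DIR / (hashlib.sha256(url.encode()).hexdigest() + ".html")
    if cache_file.exists():
        return cache_file.read_text()   # rerun: skip the crawl entirely
    html = _crawl_with_playwright(url)  # hypothetical Playwright helper
    cache_file.write_text(html)
    return html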

# tasks/cover_letter_generation.py (snippet)
from celery import shared_task
from utils.logging_utils import task_logger
from utils.celery_utils import get_openai_client

@shared_task(bind=True, acks_late=True, autoretry_for=(Exception,), retry_backoff=True)
def generate_cover_letter(self, clean_text: str, profile: dict) -> str:
    """Generates a cover-letter draft from sanitized JD + user profile."""
    task_logger.info("Starting generation task", extra={"task_id": self.request.id})

    client = get_openai_client()
    # _build_prompt (defined elsewhere in this module) assembles the chat-message
    # list expected by the OpenAI client.
    prompt = _build_prompt(clean_text, profile)
    response = client.chat.completions.create(model="gpt-4o", messages=prompt)

    task_logger.info("LLM finished", extra={"usage": response.usage})
    return response.choices[0].message.content

3. Robust logging & error handling

Every function is wrapped with structured logging and granular exception catching so that a failure in Playwright doesn’t bring down the entire worker. Logs are shipped to CloudWatch in production and to files locally.

  • Use retry_backoff=True so transient errors are retried with exponential backoff.
  • Capture full tracebacks, but redact sensitive env vars before shipping logs (a redaction sketch follows).
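
For the redaction piece, one option is a logging.Filter that scrubs known secret values before a record leaves the process. A sketch, with placeholder env var names:

# utils/logging_utils.py (redaction sketch; env var names are placeholders)
import logging
import os

SENSITIVE_KEYS = ("OPENAI_API_KEY", "REDIS_PASSWORD")  # extend as needed

class RedactSecretsFilter(logging.Filter):
    def filter(self, record: logging.LogRecord) -> bool:
        message = record.getMessage()
        for key in SENSITIVE_KEYS:
            value = os.environ.get(key)
            if value and value in message:
                message = message.replace(value, "***REDACTED***")
        record.msg, record.args = message, ()
        return True  # never drop the record, only scrub it

task_logger = logging.getLogger("cvfactory.tasks")
task_logger.addFilter(RedactSecretsFilter())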

4. Local development in one command

The whole stack (API, Celery worker, and Redis broker) comes up with:

docker compose -f docker-compose.yml up --build --remove-orphans

5. Lessons learned

  • Keep tasks small & serializable — Pass only JSON-serializable payloads to Celery.
  • Don’t scrape inside the web worker — Off-load any I/O-heavy scraping to dedicated workers to avoid timeouts.
  • Leverage typed settings: pydantic.BaseSettings in core/config.py catches mis-configured env vars at startup (see the sketch below).
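
For reference, a typed-settings sketch along those lines; the field names are illustrative, and with pydantic v2 the same class comes from the pydantic-settings package instead.

# core/config.py (sketch; field names are illustrative)
from pydantic import BaseSettings  # pydantic v1; in v2: from pydantic_settings import BaseSettings

class Settings(BaseSettings):
    openai_api_key: str  # required, so startup fails fast if it is missing
    celery_broker_url: str = "redis://redis:6379/0"
    celery_result_backend: str = "redis://redis:6379/1"
    log_level: str = "INFO"

    class Config:
        env_file = ".env"

settings = Settings()  # raises ValidationError on mis-configured env vars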

Closing thoughts

The Backend may sit quietly behind the scenes, but it enables the AI magic users see on the frontend. By modularizing each concern—HTTP I/O, task queuing, scraping, and LLM calls—you gain a pipeline that’s easier to observe, scale, and extend.

Questions or feedback? Reach out to me directly—I’d love to hear your thoughts.

Written on June 29, 2025