Show HN: LogCost – map logging cost to specific log lines

github.com

1 point by random_round 3 hours ago

We kept running into the same pattern with logging costs:

  - CloudWatch / GCP Logging / Datadog tell you which log group/index is expensive
  - But not which specific log statements in your code are responsible

  So the response is always:

  - tweak retention / tiers
  - add filters and sampling

  …and only much later do you discover it was a couple of DEBUG lines in a hot path, verbose HTTP tracing, or payload dumps in loops.

  At some point we wanted a simple answer to:

  > “For the code that’s deployed right now, which log call sites are burning most of the budget?”

  ———

  ### What LogCost does

  LogCost is a small Python library + CLI that:

  - wraps the standard logging module (and optionally print)
  - tracks per‑call‑site metrics:
    {file, line, level, message_template, count, total_bytes}
  - applies provider pricing (e.g. GCP/AWS) to estimate cost
  - periodically exports aggregated stats only (no raw log payloads)
  - can send Slack notifications with the top N most expensive log lines

  It’s intended as a snapshot for the current deploy: run it under normal load, see which lines dominate cost, change them, redeploy, repeat.

  ———

  ### How it works (high level)

  - It wraps logging.Logger._log and records a key per call site using file, line, and level.
  - Message size is estimated from the formatted string length plus a configurable per‑event overhead, and accumulated per call site.
  - A background thread periodically flushes aggregates to JSON on disk.
  - The CLI reads that JSON and prints:
      - a cost summary (based on current provider pricing), and
      - a “top cost drivers” table per call site.

  By design it never stores raw log payloads, only aggregates like:

  {
    "file": "src/memory_utils.py",
    "line": 338,
    "level": "DEBUG",
    "message_template": "Processing step: %s",
    "count": 1200000,
    "bytes": 630000000000,
    "estimated_cost": 315.0
  }

  That’s partly for privacy, and partly because this is meant to complement your log platform, not replace it.

  ———

  ### Example output

  A report might say:

  - Provider: GCP, Currency: USD
  - Total bytes: 900,000,000,000
  - Estimated cost: 450.00

  Top cost drivers:

  - src/memory_utils.py:338 [DEBUG] Processing step: %s → $157.50
  - src/api.py:92 [INFO] Request: %s → $73.20
  - …
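The summary figures are just volume times a flat rate. As a sanity check, here is that arithmetic with an illustrative $0.50/GB ingestion rate (not a quote of any provider's actual pricing, which varies by tier and region):

```python
def estimate_cost(total_bytes: int, rate_per_gb: float = 0.50) -> float:
    """Estimate ingestion cost as byte volume times a flat per-GB rate."""
    return total_bytes / 1e9 * rate_per_gb

# 900 GB at $0.50/GB matches the $450.00 report total above.
print(estimate_cost(900_000_000_000))  # -> 450.0
```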

  Slack notifications are just a formatted version of the same data, on a configurable interval (with an optional early “test” ping so you can verify wiring).
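A notifier along those lines is easy to sketch with the standard library. The report formatting and webhook helper below are illustrative, not LogCost's actual notifier; Slack incoming webhooks accept a JSON body of the form {"text": ...}:

```python
import json
import urllib.request

def format_report(top_lines):
    """Render 'top cost drivers' rows as Slack-friendly text."""
    return "\n".join(
        f"{row['file']}:{row['line']} [{row['level']}] "
        f"{row['message_template']} -> ${row['estimated_cost']:.2f}"
        for row in top_lines
    )

def notify_slack(webhook_url, top_lines):
    """POST the formatted report to a Slack incoming webhook."""
    req = urllib.request.Request(
        webhook_url,
        data=json.dumps(
            {"text": "Top log cost drivers:\n" + format_report(top_lines)}
        ).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status  # Slack returns 200 on success
```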

  ———

  ### Scope and status

  - Python‑only for now (with Flask, FastAPI, Django, and Kubernetes sidecar examples)
  - MIT‑licensed, no backend service required
  - Export format is simple JSON, so it could feed a central aggregator later if needed

  Repo:
  https://github.com/ubermorgenland/LogCost

  I’d be interested in feedback from people who’ve debugged “mysterious” log bills:

  - Do you already solve this mapping (bill → specific log sites) in a cleaner way?
  - Is per‑line aggregation actually useful in your setups, or is this overkill compared to just better log group conventions?