tokenpayback — your AI activity ledger

Where the waste hides — NVA share

Every session is also tagged as Value-Added (you'd pay for it), Non-Value-Added (retry / failed / harmful: pure waste), or Business-Value-Added (necessary overhead like research). The tagging is borrowed from Cooper & Kaplan's ABM (1991). Most of the easy savings sit in NVA.
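As a rough illustration of what an NVA share means, the sketch below sums cost per tag and divides the NVA total by overall spend. The record shape (a tag plus a dollar cost) and the function name `nva_share` are assumptions made for this example, not tokenpayback's actual data model.

```python
from collections import defaultdict

def nva_share(sessions):
    """Return the fraction of total cost spent on Non-Value-Added sessions.

    `sessions` is a hypothetical list of (tag, cost_in_dollars) pairs,
    where tag is one of "VA", "NVA", "BVA".
    """
    totals = defaultdict(float)
    for tag, cost in sessions:
        totals[tag] += cost
    total = sum(totals.values())
    if total == 0:
        return 0.0  # no spend at all, so there is no waste to report
    return totals["NVA"] / total

sessions = [("VA", 6.0), ("NVA", 3.0), ("BVA", 1.0)]
print(f"NVA share: {nva_share(sessions):.0%}")  # prints: NVA share: 30%
```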

Drift detector — week over week

Latest week vs trailing 3-week baseline. From ACCA's variance analysis — the same machinery factories use to catch silent quality regressions.
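One way to read "latest week vs trailing 3-week baseline" is as a relative variance against the mean of the three preceding weeks. The sketch below assumes that reading. The tracked metric (weekly cost), the function name `weekly_drift`, and the 20% alert threshold are all hypothetical.

```python
from statistics import mean

DRIFT_THRESHOLD = 0.20  # hypothetical: flag moves of more than 20% either way

def weekly_drift(weekly_values):
    """Compare the latest week with the mean of the three weeks before it.

    Returns the relative variance, e.g. 0.25 means 25% above baseline,
    or None when the baseline is zero and the ratio is undefined.
    """
    if len(weekly_values) < 4:
        raise ValueError("need at least 4 weeks: 3 for the baseline, 1 latest")
    baseline = mean(weekly_values[-4:-1])
    latest = weekly_values[-1]
    if baseline == 0:
        return None
    return (latest - baseline) / baseline

costs = [40.0, 50.0, 60.0, 75.0]  # oldest first, latest week last
drift = weekly_drift(costs)
if drift is not None and abs(drift) > DRIFT_THRESHOLD:
    print(f"drift detected: {drift:+.0%} vs baseline")
```

Using a relative rather than absolute difference keeps the same threshold meaningful for both small and large weekly totals.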

Cost split — proportional vs fixed

Per Resource Consumption Accounting, pricing decisions belong to proportional cost (per-call API spend). Cancel/keep decisions belong to fixed cost (subscriptions). Most dashboards mash these together. We don't.

Proportional (this period)—

Fixed (weekly avg)—

Causality score—

Subscription utilization (last 30 days)

Per project — value stream cost

Each project = one value stream: end-to-end cost of one workflow you actually care about. The NVA % column shows what share of that cost was waste.

Project	Sessions	Agents	Cost	Est. value	ROI	NVA %

Tool overlap — $— potentially recoverable

Projects where two or more agents touched the same work. The smaller-spend agent is usually the cancel candidate. Each row shows the primary agent (the one you'd keep) and the secondary spend you might be able to drop.
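The overlap rule can be sketched as follows: for each project touched by two or more agents, treat the highest-spend agent as primary and sum the rest as potentially recoverable. The input shape (`{project: {agent: dollars}}`) and the function name `overlap_rows` are assumptions for illustration only.

```python
def overlap_rows(spend_by_project):
    """Build one overlap row per project that two or more agents touched.

    `spend_by_project` is a hypothetical mapping {project: {agent: dollars}}.
    """
    rows = []
    for project, by_agent in spend_by_project.items():
        if len(by_agent) < 2:
            continue  # a single agent means there is no overlap to recover
        # Rank agents by spend, largest first; the top one is the keeper.
        ranked = sorted(by_agent.items(), key=lambda kv: kv[1], reverse=True)
        (primary, primary_cost), secondary = ranked[0], ranked[1:]
        rows.append({
            "project": project,
            "primary_agent": primary,
            "primary_cost": primary_cost,
            "also_used": [agent for agent, _ in secondary],
            "recoverable": sum(cost for _, cost in secondary),
        })
    return rows

spend = {
    "api": {"Claude Code": 30.0, "Cursor": 8.0},
    "docs": {"Codex CLI": 5.0},
}
for row in overlap_rows(spend):
    print(row["project"], row["primary_agent"], row["recoverable"])
```

The recoverable figure is an upper bound: the secondary agent may have done work the primary one could not.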

Project	Primary agent	Primary $	Also used	Recoverable $

What your AI did this period

Every session is categorized. Each row shows total cost, total human-equivalent value, and ROI. Value uses the mid estimates of human time and market rate, multiplied by the quality factor. Hover over the columns for low/high ranges.

Across your AI agents

tokenpayback auto-detects Claude Code, Codex CLI, Hermes, OpenClaw, OpenHuman, and Cursor on your machine.

Agent	Sessions	$ spent	est. value	ROI

Last 4 weeks (engineering output)

Week	Cost	Value	ROI	PRs	Commits	+Lines	Reverts

🚨 Loss-makers — sessions where AI didn't pay off

Sessions with ROI < 1× (AI cost more than the human time it saved) or with failed / harmful quality. Look at these before declaring victory. A loss-making session burned tokens AND may have added cleanup work for you. Those are the cases that should make you change your prompt, your tool choice, or your task framing.
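The loss-maker filter described above can be sketched as a single predicate. This assumes ROI is the ratio of human-equivalent value to AI cost, and that quality is a string label. The function name `is_loss_maker` and the handling of zero-cost sessions are hypothetical choices.

```python
LOSS_QUALITIES = {"failed", "harmful"}

def is_loss_maker(ai_cost, value, quality):
    """Return True when a session belongs in the loss-makers table.

    A session qualifies if its quality is failed or harmful, or if its
    ROI (assumed here to be value / ai_cost) is below 1.
    """
    if quality in LOSS_QUALITIES:
        return True
    if ai_cost == 0:
        return False  # assumption: nothing was spent, so nothing was lost
    return value / ai_cost < 1.0

# Hypothetical sessions as (ai_cost, value, quality) tuples.
sessions = [
    (2.00, 140.0, "with-edits"),   # pays off
    (5.00, 3.0, "draft-only"),     # ROI below 1
    (1.50, 0.0, "failed"),         # failed quality
]
losers = [s for s in sessions if is_loss_maker(*s)]
print(len(losers), "loss-making sessions")
```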

Date	Agent	Project	Category	What happened	Quality	$ AI cost	$ Value	ROI

Recent sessions — with human-equivalent valuation

Every row shows the closest professional role, a time range (low — mid — high) for that human, the market rate range for that role, and the quality factor reflecting whether the AI output was directly usable. Final value = mid_hours × mid_rate × quality_factor; the range comes from combining the low/high estimates. Nothing leaves your machine.
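The text does not spell out how the low and high estimates are combined. One plausible reading, shown below as a sketch, pairs low hours with low rate and high hours with high rate, then scales all three points by the same quality factor. The `Range` type, the function names, and that pairing are assumptions.

```python
from typing import NamedTuple

class Range(NamedTuple):
    low: float
    mid: float
    high: float

def value_range(hours: Range, rate: Range, quality_factor: float) -> Range:
    """Human-equivalent value as a low / mid / high range in dollars."""
    return Range(
        low=hours.low * rate.low * quality_factor,
        mid=hours.mid * rate.mid * quality_factor,   # the headline value
        high=hours.high * rate.high * quality_factor,
    )

def roi_range(value: Range, ai_cost: float) -> Range:
    """ROI range, assuming ROI = value / AI cost and ai_cost > 0."""
    return Range(*(v / ai_cost for v in value))

hours = Range(low=1.0, mid=2.0, high=4.0)       # hypothetical estimate
rate = Range(low=80.0, mid=100.0, high=120.0)   # hypothetical $/hour
value = value_range(hours, rate, quality_factor=0.7)
roi = roi_range(value, ai_cost=2.0)
print(f"value mid ${value.mid:.2f}, ROI mid {roi.mid:.0f}x")
```

Pairing low with low and high with high gives the widest possible band, which errs on the side of showing more uncertainty rather than less.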

Date	Agent	Via	Category	Project	Equivalent role	Human time low → mid → high	Hourly rate low → mid → high	Quality ×factor	$ AI cost	$ Human-equiv low — mid — high	ROI low — mid — high

How we estimate value

Per session, the LLM picks the closest professional role that would do this work and estimates hours + market rate (anchored to your benchmark: us-west / senior). Value = hours × rate × quality_factor. Quality factor: full-replacement 1.0 · with-edits 0.7 · draft-only 0.4 · failed 0.0.
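To show how much the quality label moves the result, the sketch below applies the four factors listed above to one fixed estimate. The factor values come from the text. The function name `session_value` and the sample hours and rate are hypothetical.

```python
# Quality factors as listed above.
QUALITY_FACTOR = {
    "full-replacement": 1.0,
    "with-edits": 0.7,
    "draft-only": 0.4,
    "failed": 0.0,
}

def session_value(hours, rate, quality):
    """Value = hours x rate x quality_factor, in dollars."""
    return hours * rate * QUALITY_FACTOR[quality]

# Same hypothetical estimate (2 hours at $100/hour) under each label.
for label in QUALITY_FACTOR:
    print(f"{label:>16}: ${session_value(2.0, 100.0, label):.2f}")
```

An unknown label raises `KeyError` here, which is deliberate in this sketch: silently defaulting to 1.0 would overstate value.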

Estimates are uncertain by design — that's why every number is a range, not a single value. Tune the benchmark in ~/.tokenpayback/config.yaml (region + seniority).