While MMLU and HumanEval refresh the "best model" headline every week, production teams vote on a different scoreboard: OpenRouter weekly token throughput. Dollars spent and tokens processed do not spin narratives—they measure what actually ships.
This piece is for developers and tech leads running OpenClaw, Cursor, or Claude Code on Mac. Using the public snapshot for May 18–24, 2026 (confirm the cutoff on OpenRouter's site), we unpack the 28.9 trillion weekly token landscape, the Top 10 model board, DeepSeek's multi-model dominance, and Anthropic's token-versus-dollar paradox, then deliver a six-step weekly routing checklist. You should leave knowing whether to trust benchmarks or invoices, how China–US model share is shifting, and how to iterate default agent models on a seven-day cadence.
01 Why weekly token billing beats benchmark leaderboards: three selection traps
OpenRouter is one of the largest neutral AI API aggregators: 300+ models across 60+ providers, with a single endpoint for OpenAI, Anthropic, Google, DeepSeek, and others. Its Rankings page sorts by weekly token throughput (input + output), updated every seven days and free to inspect.
- Benchmarks drift from production: Leaderboards often stress single-shot reasoning peaks, but agent workflows care about stability, tool-call success rate, API latency, and unit price. The a16z × OpenRouter 2025 AI Usage Report (built on roughly 100 trillion tokens of anonymized metadata) found benchmark scores and real market share move in near-inverse directions—developers optimize for inference cost. Coding workloads rose from about 11% of traffic in early 2025 to over 50%, now the largest single use case.
- Keynote hype vs wallet votes: Vendor stages sell "state of the art," but OpenRouter's weekly board reflects sustained paid calls worldwide. Platform volume was near 2.4 trillion tokens per week a year ago; by late May 2026 it reached 28.9 trillion—roughly 12× growth. Invoice velocity says more about adoption than any one-off eval run.
- Single-model myopia: Ranking one SKU hides vendor strategy. DeepSeek placed V4-Flash, V4-Pro, and V3.2 in the same week's Top 10, with combined throughput near 5.74 trillion tokens—closer to ecosystem control than a one-hit wonder.
Core thesis: weekly token volume is the thermometer of real AI adoption. In an agent- and batch-heavy era, the invoice beats MMLU for picking your default route.
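The growth figures quoted above can be sanity-checked with a few lines of arithmetic. This is a minimal sketch, assuming "a year ago" means 52 weeks (the source does not give an exact start date); the function names are illustrative only.

```python
# Figures quoted in the text above, in trillions of tokens per week.
VOLUME_YEAR_AGO_T = 2.4
VOLUME_NOW_T = 28.9
WEEKS_ELAPSED = 52  # assumption: "a year ago" is treated as 52 weeks


def growth_multiple(start: float, end: float) -> float:
    """Return how many times larger `end` is than `start`."""
    return end / start


def average_weekly_growth(start: float, end: float, weeks: int) -> float:
    """Return the constant week-over-week rate that compounds `start` into `end`."""
    return (end / start) ** (1 / weeks) - 1


multiple = growth_multiple(VOLUME_YEAR_AGO_T, VOLUME_NOW_T)
weekly = average_weekly_growth(VOLUME_YEAR_AGO_T, VOLUME_NOW_T, WEEKS_ELAPSED)

print(f"growth multiple: {multiple:.1f}x")
print(f"implied average weekly growth: {weekly:.1%}")
```

On these inputs the multiple lands at roughly 12, and the implied average weekly rate comes out a little under 5%, which gives a baseline for judging whether any single week's gain is above or below trend.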
Official rankings and methodology live on OpenRouter's Rankings page; re-open it after publication to confirm the latest numbers.
02 May 18–24, 2026: 28.9 trillion weekly tokens and the Top 10 board
Reporting window: May 18–24, 2026 (OpenRouter's rolling seven-day window). Traffic across the platform totaled 28.9 trillion tokens, up 7.4% week over week, the fifth consecutive weekly gain.
| Metric | Value | WoW | Read |
|---|---|---|---|
| Global weekly token volume | 28.9T | +7.4% | Fifth straight weekly rise; demand still accelerating |
| China-origin model volume | 9.223T | +19.89% | Fourth consecutive week ahead of US models |
| US-origin model volume | 4.93T | +16.27% | Strong growth, but share lost to China lineup |
| China traffic share | ~45%+ | — | Under 2% in early 2025; landscape reshaped in under two years |
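The week-over-week percentages in the table also imply last week's volumes and the absolute gains behind each row. The sketch below backs those out; it uses only the table's own figures, and the helper names are made up for illustration.

```python
# Weekly figures copied from the table above, in trillions of tokens.
TOTAL_T, TOTAL_WOW = 28.9, 0.074
CHINA_T, CHINA_WOW = 9.223, 0.1989
US_T, US_WOW = 4.93, 0.1627


def prior_week(current: float, wow: float) -> float:
    """Back out last week's volume from this week's volume and its WoW change."""
    return current / (1 + wow)


def weekly_gain(current: float, wow: float) -> float:
    """Absolute week-over-week increase implied by the WoW percentage."""
    return current - prior_week(current, wow)


# China-origin volume as a multiple of US-origin volume.
china_vs_us = CHINA_T / US_T
# China-origin volume as a fraction of the platform-wide weekly total.
china_of_total = CHINA_T / TOTAL_T

print(f"China vs US volume: {china_vs_us:.2f}x")
print(f"China volume / weekly total: {china_of_total:.1%}")
for label, volume, wow in [("total", TOTAL_T, TOTAL_WOW),
                           ("China", CHINA_T, CHINA_WOW),
                           ("US", US_T, US_WOW)]:
    print(f"{label}: prior week {prior_week(volume, wow):.2f}T, "
          f"gain {weekly_gain(volume, wow):.2f}T")
```

One thing this surfaces: 9.223T over 28.9T is about 32%, so the ~45%+ share in the table presumably rests on a different denominator or window. Treat the live Rankings page as the reference when quoting a share figure.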
Model-level Top 10 for the week (sorted by token volume; cross-check OpenRouter's public board and press coverage—some rows are estimates; treat the live site as source of truth):
| Rank | Model | Vendor | Weekly tokens | Notes |
|---|---|---|---|---|
| 1 | DeepSeek-V4-Flash | DeepSeek (China) | 3.43T (+66%) | Default for agent flows; rock-bottom unit price |
| 2 | Tencent Hy3 Preview | Tencent (China) | 3.07T (+16%) | Still climbing after promo window ended |
| 3 | Claude Sonnet 4.6 | Anthropic (US) | 1.35T | 1M context; enterprise coding workhorse |
| 4 | DeepSeek-V3.2 | DeepSeek (China) | 1.31T | Low-cost long tail; roleplay-heavy traffic |
| 5 | Owl Alpha | OpenRouter | 1.15T (+29%) | Free agent-tuned tier; 1M context |
| 6 | Gemini 3 Flash Preview | Google (US) | 1.06T | Multimodal; academic and medical pipelines |
| 7 | DeepSeek-V4-Pro | DeepSeek (China) | 1.00T | Matrix flagship (~5.74T series total) |
| 8 | MiniMax M2.7 | MiniMax (China) | 806B | Long-context value tier |
| 9 | Grok 4.1 Fast | xAI (US) | 721B | 2M context; legal-document workloads |
| 10 | Step 3.5 Flash | StepFun (China) | 673B | Fast, cheap batch processing |
DeepSeek's multi-model matrix: three SKUs in the Top 10, combined weekly volume near 5.74 trillion tokens (about +25.9% WoW), beating Anthropic and Google on vendor-level share for the second straight week. More than half of the Top 10 are China-origin models—a stark shift from under 2% China traffic share in early 2025. Ultra-cheap open-weight routes are rewriting global call patterns.
Note: Kimi K2.6 ranked sixth the prior week but dropped out this cycle; some reports derive V4-Pro volume by subtracting Flash and V3.2 from series totals. If you read this weeks later, trust the live Rankings page over any static table.
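Rolling the model board up to vendor level is a small aggregation, and the subtraction mentioned in the note is the same arithmetic in reverse. A minimal sketch using the Top 10 rows as printed above (the origin labels follow the table's Vendor column; the function names are illustrative):

```python
from collections import defaultdict

# (model, vendor, origin, weekly tokens in trillions), copied from the Top 10 table.
TOP10 = [
    ("DeepSeek-V4-Flash", "DeepSeek", "China", 3.43),
    ("Tencent Hy3 Preview", "Tencent", "China", 3.07),
    ("Claude Sonnet 4.6", "Anthropic", "US", 1.35),
    ("DeepSeek-V3.2", "DeepSeek", "China", 1.31),
    ("Owl Alpha", "OpenRouter", None, 1.15),  # the table lists no origin for this row
    ("Gemini 3 Flash Preview", "Google", "US", 1.06),
    ("DeepSeek-V4-Pro", "DeepSeek", "China", 1.00),
    ("MiniMax M2.7", "MiniMax", "China", 0.806),
    ("Grok 4.1 Fast", "xAI", "US", 0.721),
    ("Step 3.5 Flash", "StepFun", "China", 0.673),
]


def tokens_by_vendor(rows):
    """Sum weekly tokens per vendor and return (vendor, total) pairs, largest first."""
    totals = defaultdict(float)
    for _model, vendor, _origin, tokens in rows:
        totals[vendor] += tokens
    return sorted(totals.items(), key=lambda pair: pair[1], reverse=True)


def residual_volume(series_total, known_parts):
    """Derive one SKU's volume by subtracting the known SKUs from a series total."""
    return series_total - sum(known_parts)


vendor_totals = tokens_by_vendor(TOP10)
china_count = sum(1 for row in TOP10 if row[2] == "China")

for vendor, total in vendor_totals:
    print(f"{vendor}: {total:.3f}T")
print(f"China-origin models in the Top 10: {china_count}")
print(f"V4-Pro by subtraction: {residual_volume(5.74, [3.43, 1.31]):.2f}T")
```

Because a residual inherits every rounding error in the series total, a figure derived this way is softer than a row read directly off the board.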
03 Token share vs dollar revenue: the Anthropic premium paradox
Weekly tokens answer "who gets called most." Dollar share answers "who earns most." Stack both sheets to see how AI commercialization actually layers.
| Vendor / tier | Token share trend | Dollar revenue profile | Typical workloads |
|---|---|---|---|
| Anthropic Claude | ~12% (was ~25% a year ago) | ~46% of dollar revenue | Enterprise reasoning; buyers pay for quality |
| Google Gemini Flash | Mid-tier traffic | Mid-tier unit economics | Multimodal, academic, and clinical flows |
| DeepSeek / Tencent / MiniMax / StepFun | High volume, fast growth | Ultra-low price points | Agents, coding, batch pipelines |
The Anthropic premium paradox: flagship SKUs such as Claude Opus 4.6 can generate on the order of $25 million per month in platform revenue (public finance and aggregator estimates), yet token throughput is a fraction of the DeepSeek family. Enterprise buyers still pay a quality premium while traffic leadership tilts toward ultra-cheap China models—the second lesson of "billing does not lie." The market simultaneously buys capability markup and scale efficiency; it is not a winner-take-all race on either axis alone.
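The gap between token share and dollar share can be turned into a rough price index. The sketch below divides dollar share by token share, so 1.0 means "earns the platform-average revenue per token". It assumes the ~12% and ~46% figures describe the same platform and period, which the source does not state explicitly; the function name is illustrative.

```python
def price_index(token_share: float, dollar_share: float) -> float:
    """Revenue per token relative to the platform-wide average (1.0 = average)."""
    if not 0 < token_share <= 1:
        raise ValueError("token_share must be in (0, 1]")
    if not 0 <= dollar_share <= 1:
        raise ValueError("dollar_share must be in [0, 1]")
    return dollar_share / token_share


# Approximate shares from the table above.
ANTHROPIC_TOKEN_SHARE = 0.12
ANTHROPIC_DOLLAR_SHARE = 0.46

anthropic_index = price_index(ANTHROPIC_TOKEN_SHARE, ANTHROPIC_DOLLAR_SHARE)
# Everything that is not Anthropic, treated as one bucket.
rest_index = price_index(1 - ANTHROPIC_TOKEN_SHARE, 1 - ANTHROPIC_DOLLAR_SHARE)
premium = anthropic_index / rest_index

print(f"Anthropic index: {anthropic_index:.2f}")
print(f"rest-of-platform index: {rest_index:.2f}")
print(f"implied premium per token: {premium:.1f}x")
```

On these rounded inputs Anthropic earns a bit under four times the average revenue per token and roughly six times what the rest of the platform earns per token. The inputs are approximate, so read the result as an order of magnitude.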
Three market layers: [high value · low volume] Anthropic Opus for hard reasoning; [balanced · mid volume] Gemini Flash for multimodal; [ultra-cheap · high volume] DeepSeek matrix for agents and batch. Pick one layer per default route—mixing without intent invites runaway spend.
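One way to enforce "one layer per default route" is to make the layer an explicit lookup, so a workload nobody has classified fails loudly instead of drifting to an expensive model. A minimal sketch: the workload labels are hypothetical, the Gemini model ID is a guess at the naming pattern and must be checked against the live model list, and the other two IDs are the ones used in this article's config.

```python
from enum import Enum


class Layer(Enum):
    HIGH_VALUE = "high value, low volume"
    BALANCED = "balanced, mid volume"
    ULTRA_CHEAP = "ultra-cheap, high volume"


# Hypothetical workload labels; replace them with your own taxonomy.
LAYER_FOR_WORKLOAD = {
    "hard_reasoning": Layer.HIGH_VALUE,
    "multimodal": Layer.BALANCED,
    "agent": Layer.ULTRA_CHEAP,
    "batch": Layer.ULTRA_CHEAP,
}

# One default model per layer. The Gemini ID is an assumption, not a verified ID.
MODEL_FOR_LAYER = {
    Layer.HIGH_VALUE: "anthropic/claude-sonnet-4.6",
    Layer.BALANCED: "google/gemini-3-flash-preview",
    Layer.ULTRA_CHEAP: "deepseek/deepseek-v4-flash",
}


def route(workload: str) -> str:
    """Return the default model ID for a workload, refusing unclassified workloads."""
    try:
        layer = LAYER_FOR_WORKLOAD[workload]
    except KeyError:
        raise ValueError(f"unclassified workload: {workload!r}") from None
    return MODEL_FOR_LAYER[layer]


print(route("agent"))
print(route("hard_reasoning"))
```

Raising on an unknown workload is a deliberate choice here: it forces someone to pick a layer before new traffic starts spending money.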
For investors, OpenRouter's weekly board is a window into AI monetization velocity (platform valuation discussions have cited roughly 26× price-to-sales multiples in press). For builders, it is a vendor-neutral thermometer when you refuse to bet on one lab. For researchers, it is among the clearest public time series tracking China–US model share.
04 Six steps: track OpenRouter weekly and retune model routing
- Lock a Monday Rankings review: Open the OpenRouter Rankings page, log global weekly total, Top 10 moves, and WoW arrows. Drop screenshots or CSV exports into team docs—do not route from memory.
- Split token and dollar ledgers: In the OpenRouter dashboard or your own billing system, track weekly tokens and weekly USD spend per model. High token share with low business value is a downgrade candidate.
- Map workloads to the three layers: Default agents and batch jobs to DeepSeek-V4-Flash or peers; reserve Claude Sonnet/Opus for hard enterprise reasoning; route multimodal chains through Gemini Flash. Do not bind the entire stack to whichever model tops the board that week.
- Watch "signal" entrants: Models such as Hy3 Preview and Owl Alpha that spike after promos or agent-specific launches often keep growing—treat them as A/B routes, not instant fleet-wide swaps.
- Make Mac gateway routes hot-swappable: OpenClaw, Cursor, and Claude Code should read model IDs from env vars or config files, not hard-coded Skills. The macOS host running the gateway must stay online 24/7; a sleeping laptop drops both agents and routing policy.
- Monthly benchmark-vs-bill reconciliation: Compare your team's SWE-bench (or equivalent) favorites against OpenRouter weekly share. If high-score models stay underrepresented on invoices, production cares more about cost and latency than press headlines—trust the bill.
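The split ledger from step 2 can be sketched as a small table of per-model rows plus a filter for downgrade candidates. Everything in the sample is invented for illustration: the token counts, the dollar amounts, the `business_value` score (a hypothetical 0 to 1 rating your team assigns), and both thresholds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LedgerRow:
    model: str
    weekly_tokens: int
    weekly_usd: float
    business_value: float  # hypothetical 0-1 score assigned by the team


def usd_per_million_tokens(row: LedgerRow) -> float:
    """Effective blended price the team actually paid for this model."""
    return row.weekly_usd / (row.weekly_tokens / 1_000_000)


def downgrade_candidates(rows, min_token_share=0.30, max_value=0.40):
    """Models that take a large share of tokens but deliver little business value."""
    total = sum(row.weekly_tokens for row in rows)
    if total == 0:
        return []
    return [
        row.model
        for row in rows
        if row.weekly_tokens / total >= min_token_share
        and row.business_value <= max_value
    ]


# Invented sample week: an expensive model doing low-value summarization work.
sample_ledger = [
    LedgerRow("anthropic/claude-sonnet-4.6", 600_000_000, 2700.0, 0.2),
    LedgerRow("deepseek/deepseek-v4-flash", 400_000_000, 60.0, 0.7),
]

for row in sample_ledger:
    print(f"{row.model}: ${usd_per_million_tokens(row):.2f} per 1M tokens")
print("downgrade candidates:", downgrade_candidates(sample_ledger))
```

Keeping tokens and dollars in the same row makes the effective price per million tokens fall out directly, which is usually the number that settles a routing argument.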
OPENROUTER_DEFAULT_MODEL=deepseek/deepseek-v4-flash
OPENROUTER_FALLBACK_MODEL=anthropic/claude-sonnet-4.6
OPENROUTER_WEEKLY_REVIEW_CRON="0 9 * * 1"
curl -s https://openrouter.ai/api/v1/models | jq '.data[].id' | head
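Reading those variables at startup is what makes the route hot-swappable. The sketch below loads the two model IDs from the environment and retries once on the fallback. The `send` callable is a hypothetical stand-in for whatever client your gateway uses; no real SDK call is shown or implied.

```python
import os
from dataclasses import dataclass
from typing import Callable, Mapping


@dataclass(frozen=True)
class RouteConfig:
    default_model: str
    fallback_model: str


def load_route_config(env: Mapping[str, str] = os.environ) -> RouteConfig:
    """Build the routing config from environment variables, failing fast if one is missing."""
    try:
        return RouteConfig(
            default_model=env["OPENROUTER_DEFAULT_MODEL"],
            fallback_model=env["OPENROUTER_FALLBACK_MODEL"],
        )
    except KeyError as missing:
        raise RuntimeError(f"missing routing variable: {missing}") from None


def call_with_fallback(config: RouteConfig, send: Callable[[str], str]) -> str:
    """Try the default model first; on any error, retry once on the fallback."""
    try:
        return send(config.default_model)
    except Exception:  # broad on purpose for the sketch; narrow this in real code
        return send(config.fallback_model)


# Usage with an explicit mapping, so the example does not depend on the real environment.
demo_env = {
    "OPENROUTER_DEFAULT_MODEL": "deepseek/deepseek-v4-flash",
    "OPENROUTER_FALLBACK_MODEL": "anthropic/claude-sonnet-4.6",
}
demo_config = load_route_config(demo_env)


def demo_send(model_id: str) -> str:
    """Hypothetical client: pretends the default model is unavailable."""
    if model_id == demo_config.default_model:
        raise TimeoutError("simulated outage")
    return f"answered by {model_id}"


print(call_with_fallback(demo_config, demo_send))
```

Swapping the default model then means editing one variable and restarting the gateway, with no change to Skills or agent code.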
05 Citable figures, sources, and CALMVPS fit
- Reporting window: Core figures use OpenRouter's rolling seven-day window, snapshot through May 24, 2026; global weekly volume 28.9 trillion tokens, +7.4% WoW.
- DeepSeek-V4-Flash: About 3.43 trillion tokens for the week, roughly +66% WoW, ranked first on the model board (cited in financial press via OpenRouter/Bloomberg channels).
- China vs US: China-origin models 9.223T (+19.89%) versus US 4.93T (+16.27%); China led for the fourth straight week.
- DeepSeek series total: About 5.74 trillion tokens per week, roughly +25.9% WoW, leading Anthropic and Google on vendor-level share.
- a16z × OpenRouter report: Coding exceeds 50% of usage; benchmark scores and market share trend in opposite directions (2025 publication—cite the original PDF for exact wording).
The weekly board states the uncomfortable truth plainly: adoption is driven by who gets called, not who scores highest in a lab. China open models are capturing global traffic on razor-thin unit economics while Anthropic defends a high-margin enterprise pool. Teams that chase benchmarks alone can burn through an agent budget inside two weeks.
Running model routes on a Mac surfaces familiar gaps: laptop sleep kills gateway uptime; a Linux VPS cannot host native macOS agent toolchains; virtualized Mac instances often pay a Metal and Xcode tax. For 24/7 agent control planes and CI nodes rented by the day, week, or month, CALMVPS bare-metal Mac rental delivers dedicated Apple Silicon, roughly 120-second provisioning, and flexible billing, so you can trust weekly invoices for model choice while keeping OpenClaw or Cursor orchestration on a macOS host that never sleeps. See pricing for hardware tiers and help center for remote access setup.