Implement prompt caching for Anthropic and OpenAI
Implement prompt caching. Investigate how anomalyco/opencode and vercel/ai handle it; priorities are Anthropic and OpenAI models.
PR created: https://github.com/basehub-ai/forums/pull/140
## Summary
I've implemented prompt caching for Anthropic and OpenAI models after investigating both anomalyco/opencode and vercel/ai SDK patterns. Here's what was done:
### Research Findings
vercel/ai SDK provides:
- A dynamic prompt caching cookbook showing an `addCacheControlToMessages()` utility that adds `providerOptions.anthropic.cacheControl` at the message level
- The `@ai-sdk/anthropic` package has a `CacheControlValidator` that enforces a 4 cache-breakpoint limit
- The SDK auto-translates message-level `providerOptions` to block-level `cache_control` in the API request
- For OpenAI: `promptCacheKey` and `promptCacheRetention` provider options
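To make the message-level pattern concrete, here is a minimal sketch of what a cookbook-style utility looks like. The message type is simplified for illustration; the real SDK `ModelMessage` is richer, and the function body here is an assumption about the pattern, not the SDK's actual implementation:

```typescript
// Simplified stand-in for the AI SDK's message shape (illustration only).
type Msg = {
  role: "system" | "user" | "assistant";
  content: string;
  providerOptions?: Record<string, unknown>;
};

// Sketch of the cookbook pattern: tag the last message with an Anthropic
// cache breakpoint, so everything up to and including it becomes a
// cacheable prefix. The SDK translates this providerOptions entry into a
// block-level cache_control marker in the API request.
function addCacheControlToMessages(messages: Msg[]): Msg[] {
  return messages.map((msg, i) =>
    i === messages.length - 1
      ? {
          ...msg,
          providerOptions: {
            ...msg.providerOptions,
            anthropic: { cacheControl: { type: "ephemeral" } },
          },
        }
      : msg
  );
}
```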
anomalyco/opencode (`ProviderTransform.applyCaching`) uses:
- Cache first 2 system messages + last 2 non-system messages = 4 breakpoints max
- Multi-provider support: `anthropic.cacheControl`, `bedrock.cachePoint`, `openrouter.cacheControl`, `copilot.copilot_cache_control`
- For OpenAI: sets `promptCacheKey` per session ID for better routing
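The opencode breakpoint strategy can be sketched as follows. This is a simplified reconstruction of the idea, not opencode's actual code; the `Msg` type and function body are assumptions for illustration:

```typescript
// Simplified message shape (illustration only).
type Msg = { role: string; content: string; providerOptions?: Record<string, unknown> };

// Sketch of the opencode-style strategy: mark the first 2 system messages
// and the last 2 non-system messages with ephemeral cache breakpoints,
// staying within Anthropic's 4-breakpoint limit.
function applyCaching(messages: Msg[]): Msg[] {
  const systemIdx = messages
    .map((m, i) => (m.role === "system" ? i : -1))
    .filter((i) => i !== -1)
    .slice(0, 2);
  const nonSystemIdx = messages
    .map((m, i) => (m.role !== "system" ? i : -1))
    .filter((i) => i !== -1)
    .slice(-2);
  const marked = new Set([...systemIdx, ...nonSystemIdx]);
  return messages.map((m, i) =>
    marked.has(i)
      ? {
          ...m,
          providerOptions: {
            ...m.providerOptions,
            anthropic: { cacheControl: { type: "ephemeral" } },
          },
        }
      : m
  );
}
```

Caching the leading system messages keeps the stable prompt prefix cheap, while the trailing breakpoints let the growing conversation tail re-use the cache across turns.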
### Implementation
| File | Change |
|---|---|
| `agent/prompt-cache.ts` | New utility with provider detection and caching strategies |
| `agent/response-agent.ts` | Applied caching in `streamTextStep` before `streamText()` |
| `agent/__tests__/prompt-cache.test.ts` | 18 tests covering all edge cases |
**Anthropic:** Adds `cacheControl: { type: 'ephemeral' }` on the first 2 system messages + last 2 conversation messages (matching opencode's strategy, respecting the 4-breakpoint limit). Cached tokens cost 90% less.

**OpenAI:** Sets `promptCacheKey: 'forums-{postId}'` to improve server routing for automatic prefix caching. No additional cost — OpenAI caching is free and automatic for prompts ≥ 1024 tokens.
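For the OpenAI side, the key derivation is trivial but worth pinning down, since cache-hit rates depend on the key being stable per post. The helper name below is hypothetical (the PR's convention, not an SDK API):

```typescript
// Hypothetical helper mirroring the PR's 'forums-{postId}' convention.
// A stable per-post key lets OpenAI route repeat requests for the same
// post to the same cache, improving automatic prefix-cache hit rates.
function promptCacheKeyFor(postId: string): string {
  return `forums-${postId}`;
}
```

The key is then passed through provider options, e.g. `providerOptions: { openai: { promptCacheKey: promptCacheKeyFor(postId) } }`.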
Perfect. Make sure, in the anthropic case, to always pass providerOptions object - else the cache will be purged i think.
Pushed. The PR at https://github.com/basehub-ai/forums/pull/140 is now updated.
Here's what this fix does:
The problem: The `system` parameter was passed as a plain string to `streamText()`. For Anthropic, that means the system prompt gets sent without any `cache_control` marker. Without `cache_control` on the system message, Anthropic won't maintain the cache prefix — effectively purging the cache on every request.
The fix: New `wrapSystemPrompt()` function that, for Anthropic models, returns a `SystemModelMessage` object instead of a plain string:
```ts
// Before (cache purge risk):
streamText({ system: "You are...", ... })

// After (cache maintained):
streamText({
  system: {
    role: "system",
    content: "You are...",
    providerOptions: {
      anthropic: { cacheControl: { type: "ephemeral" } }
    }
  },
  ...
})
```
The AI SDK's `streamText` accepts `system` as `string | SystemModelMessage | SystemModelMessage[]`, so this is fully compatible. For non-Anthropic models, it returns the plain string as before.
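A minimal sketch of what `wrapSystemPrompt()` could look like, with the SDK's `SystemModelMessage` shape simplified to a local type and a naive provider-ID check standing in for the real provider detection (both are assumptions, not the PR's exact code):

```typescript
// Simplified stand-in for the AI SDK's SystemModelMessage type.
type SystemModelMessage = {
  role: "system";
  content: string;
  providerOptions?: Record<string, unknown>;
};

// Sketch: for Anthropic models, wrap the plain system string so it carries
// a cache_control breakpoint; for every other provider, pass it through
// unchanged. Provider detection here is a naive substring check.
function wrapSystemPrompt(
  system: string,
  providerId: string
): string | SystemModelMessage {
  if (!providerId.includes("anthropic")) return system;
  return {
    role: "system",
    content: system,
    providerOptions: {
      anthropic: { cacheControl: { type: "ephemeral" } },
    },
  };
}
```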