v1.28.0 — LLM Service Redesign

Released 2026-05-04. GitHub release.

v1.28.0 contains breaking changes to the LLM service API and to the env package’s loading semantics, alongside a number of resilience and observability improvements. An upgrade-v1-28-0 skill is available under .claude/skills/upgrade/ to automate most of the migration.

Highlights

  • LLM service redesign. Caller now picks provider and model per call; ProviderHostname and per-provider Model configs are removed. Chat, Turn, and ChatLoop return token usage. Anthropic prompt caching is enabled on the claudellm provider.
  • env package adopts dotenv semantics. env.yaml / env.local.yaml values are written through to the real OS environment at package init. Real OS env wins over yaml. env.Push / env.Pop save and restore the OS env so test overrides reach third-party libraries.
  • Single-source agent guidance. AGENTS.md is removed; all per-microservice and per-package design rationale lives in CLAUDE.md files. Framework packages now ship per-package CLAUDE.md notes.
  • Distributed cache stampede protection. New LoadOrCompute / GetOrCompute operations coalesce concurrent makers per process via singleflight.
  • OTLP exporter resilience. Lazy connect, retries disabled, per-export timeout via the spec-standard OTEL_EXPORTER_OTLP_TIMEOUT.
  • Subscription method validation. Only the standard HTTP verbs and ANY are accepted at registration time; unknown methods now fail fast.

Breaking Changes

LLM service (coreservices/llm and providers):

  • Chat signature changed. Old: Chat(ctx, messages, tools) (messagesOut, err). New: Chat(ctx, provider, model, messages, toolURLs, options) (messagesOut, usage, err). Both provider and model are required; calls with either empty are rejected with 400 Bad Request.
  • Turn signature changed. Old: Turn(ctx, messages, tools) (*TurnCompletion, err). New: Turn(ctx, model, messages, tools, options) (content, toolCalls, usage, err).
  • Executor.ChatLoop signature changed. Old: ChatLoop(ctx, messages, tools) (messagesOut, status, err). New: ChatLoop(ctx, provider, model, messages, tools, options) (messagesOut, usage, status, err).
  • ChatLoop workflow’s declared inputs grew from (messages, tools) to (provider, model, messages, tools, options). Initial-state maps passed via foremanapi.Run/Create must include provider and model.
  • MockChat, MockTurn, and MockChatLoop handler signatures changed to match.
  • Removed configs: ProviderHostname on llm.core; Model on claude.llm.core, chatgpt.llm.core, and gemini.llm.core.
  • Removed methods: SetProviderHostname, SetModel (use the per-call arguments instead).
  • Removed type: llmapi.TurnCompletion (replaced by flat (content, toolCalls, usage) returns).

env package:

  • env.yaml and env.local.yaml values are written to the real OS environment at package init. Code that reads os.Getenv directly (third-party SDKs included) now sees yaml values.
  • OS env wins over yaml. Pre-1.28 yaml entries overrode OS values via the in-memory shadow store; in 1.28 yaml only fills keys that are not already present in the OS env. Operators who set values via shell, systemd, k8s, docker, or CI now win.
  • env.Push and env.Pop mutate the real OS env. Tests using env.Push must not run with t.Parallel().

Subscription method validation:

  • HTTP method strings on subscriptions are validated at registration time. Only GET, HEAD, POST, PUT, DELETE, CONNECT, OPTIONS, TRACE, PATCH, and ANY are accepted (case-insensitive). Unknown methods fail at startup rather than producing silently unreachable endpoints.
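The validation rule amounts to a case-insensitive membership check at registration time (a sketch of the rule as stated, not the framework’s actual implementation):

```go
package main

import (
	"fmt"
	"strings"
)

// validMethods holds the nine standard HTTP verbs plus ANY.
var validMethods = map[string]bool{
	"GET": true, "HEAD": true, "POST": true, "PUT": true, "DELETE": true,
	"CONNECT": true, "OPTIONS": true, "TRACE": true, "PATCH": true, "ANY": true,
}

// validateMethod fails fast so a typo surfaces at startup rather than
// as a silently unreachable endpoint.
func validateMethod(m string) error {
	if !validMethods[strings.ToUpper(m)] {
		return fmt.Errorf("invalid subscription method %q", m)
	}
	return nil
}

func main() {
	fmt.Println(validateMethod("patch")) // <nil>: case-insensitive
	fmt.Println(validateMethod("FETCH")) // error: rejected at registration
}
```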

HTTP ingress proxy:

  • In PROD deployments, inbound requests to the privileged ports :1 through :1023 are blocked, except :80 and :443. Port :888 remains blocked in all environments. Other deployments are unaffected.

Time-budget header format:

  • Microbus-Time-Budget is now serialized as a Go duration string (5ms, 1h30m). The legacy bare-integer format is still accepted on read, so consumers using the framework are unaffected. Hand-emitted headers should switch to the new format.

Agent guidance files:

  • AGENTS.md files are removed across the project. All per-microservice design rationale now lives in CLAUDE.md. Existing redirect-only CLAUDE.md files are replaced by the merged content. References to AGENTS.md in documentation, prompts, and skills have been replaced with CLAUDE.md.

New Features

LLM service:

  • Token usage tracking via llmapi.Usage (InputTokens, OutputTokens, CacheReadTokens, CacheWriteTokens, Model, Turns). Aggregated across turns by Chat and ChatLoop.
  • New ChatOptions (caller-facing) and TurnOptions (provider-facing) structs for MaxToolRounds, MaxTokens, Temperature.
  • microbus_llm_tokens_total counter metric, labeled by provider, model, direction (input, output, cacheRead, cacheWrite).
  • New LLM Grafana dashboard charting tokens by direction/provider/model and cache hit ratio.
  • Anthropic prompt caching: the claudellm provider sets two cache_control breakpoints to enable prompt cache reuse across turns.
  • Typed model constants exported from each provider’s *api package (claudellmapi.ModelHaiku45, chatgptllmapi.ModelGPT4o, geminillmapi.ModelGemini20Flash, etc.).
  • chatbox.example extended with a provider dropdown supporting the simulated chatbox plus real Claude, ChatGPT, and Gemini providers.
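The per-turn aggregation that Chat and ChatLoop perform can be modeled roughly like this (a sketch of llmapi.Usage with an assumed Add helper; the real type’s methods and merge rules may differ):

```go
package main

import "fmt"

// Usage mirrors the fields listed above.
type Usage struct {
	InputTokens, OutputTokens         int
	CacheReadTokens, CacheWriteTokens int
	Model                             string
	Turns                             int
}

// Add folds one turn's usage into a running aggregate, summing token
// counts and turns while keeping the most recent model name.
func (u *Usage) Add(t Usage) {
	u.InputTokens += t.InputTokens
	u.OutputTokens += t.OutputTokens
	u.CacheReadTokens += t.CacheReadTokens
	u.CacheWriteTokens += t.CacheWriteTokens
	u.Model = t.Model
	u.Turns += t.Turns
}

func main() {
	var total Usage
	total.Add(Usage{InputTokens: 1200, OutputTokens: 80, CacheReadTokens: 1000, Model: "m", Turns: 1})
	total.Add(Usage{InputTokens: 1300, OutputTokens: 95, CacheReadTokens: 1100, Model: "m", Turns: 1})
	fmt.Println(total.InputTokens, total.OutputTokens, total.Turns) // 2500 175 2
}
```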

Distributed cache:

  • LoadOrCompute and GetOrCompute coalesce concurrent makers per process via singleflight. Bounds the load on the underlying data source to roughly one regeneration per key per replica under bursty traffic. Maker errors are not cached.

Foreman:

  • New created_at index on microbus_flows and microbus_steps for efficient time-window queries (recent flows, retention purges, dashboards).
  • Fixed a bug in fan-out + subgraph execution.

Observability:

  • OTLP exporter uses lazy connect (no eager dial at startup), retries are disabled, and the per-export timeout is read from the spec-standard OTEL_EXPORTER_OTLP_TIMEOUT env var. A misconfigured or unreachable collector no longer stalls startup or blocks exports indefinitely.
  • Internal control-plane traffic no longer emits noisy OTel spans.
  • Revamped Grafana dashboards.

Coding agents:

  • New review-microservice skill for end-to-end design audits of a single microservice.
  • housekeeping skill enforcement after every microservice change.
  • PROMPTS.md rewritten as prose rather than a chronological log.
  • Per-package CLAUDE.md design-rationale notes for framework packages (env, frame, connector, sub, openapi, application, workflow, etc.).
  • Tool fetcher in llm.core now handles unnamed and greedy path arguments correctly.

Migration

An automated migration skill is provided. From inside a Microbus project, ask Claude Code to “upgrade Microbus” — it will run upgrade-microbus, which calls upgrade-v1-28-0 automatically when going through this version.

The skill:

  1. Merges AGENTS.md into CLAUDE.md per directory.
  2. Strips redirect boilerplate and seeds empty files with the microservice hostname.
  3. Replaces AGENTS.md references with CLAUDE.md across the project (excluding skill templates).
  4. Bumps frameworkVersion: 1.28.0 and refreshes modifiedAt on every manifest.yaml.
  5. Migrates Chat / Turn / ChatLoop call sites and their mocks. Inserts // TODO: comments at every site that needs a provider / model value, and emits a list of touched call sites at the end.
  6. Removes ProviderHostname and Model config entries from config.yaml / config.local.yaml.
  7. Reports env.yaml keys that may collide with operator-set OS env, and flags *_test.go files that combine env.Push with t.Parallel().
  8. Validates subscription method: fields in manifests and sub.At( / sub.Method( calls.
  9. Heads-up audits for hand-emitted Microbus-Time-Budget headers and PortMappings targeting blocked PROD ports.

After running the skill, run go vet ./... and go test ./... -count=1, then grep for any remaining // TODO: comments inserted at LLM call sites and fill in provider / model values.