what's this?
## getsentry/junior — A Slack Bot Runtime for Next.js
Junior is a Slack bot runtime built for Next.js apps, created by Sentry. It lets you investigate issues, summarize context, and take action directly from Slack using connected tools.
### Packages
The repo is a monorepo with three main packages:
| Package | Purpose |
|---|---|
| @sentry/junior | Core Slack bot runtime for Next.js |
| @sentry/junior-github | GitHub plugin for issue workflows |
| @sentry/junior-sentry | Sentry plugin for issue workflows |
### Structure
- packages/junior — The core runtime
- packages/junior-github — GitHub integration plugin
- packages/junior-sentry — Sentry integration plugin
- packages/docs — Documentation site (hosted at junior.sentry.dev)
- apps/example — Example app
- specs/ — Design specs covering agents, plugins, OAuth, security, and more
- scripts/ — Release and dev helper scripts
### Key Links
- 📖 Docs: junior.sentry.dev
- 🚀 Quickstart: Getting started
- 🔌 Plugins: Plugin setup
Want me to dive deeper into any part of the codebase?
what does it use for the agent loop?
Now I have a clear picture. Here's the summary:
## Agent Loop: @mariozechner/pi-agent-core (from the Pi toolkit)
Junior uses Pi (@mariozechner/pi-agent-core) for its agent loop — specifically the Agent class. Pi is a TypeScript AI agent toolkit created by Mario Zechner (@badlogic), the same stack that powers OpenClaw.
### How it works in Junior
The core agent loop lives in packages/junior/src/chat/respond.ts (~line 969):
```typescript
import { Agent, type AgentTool } from "@mariozechner/pi-agent-core";

const agent = new Agent({
  getApiKey: () => getGatewayApiKey(),
  initialState: {
    systemPrompt: baseInstructions,
    model: resolveGatewayModel(botConfig.modelId), // via @mariozechner/pi-ai
    tools: createAgentTools(...) // Junior's tools adapted to AgentTool format
  }
});
```
Then it calls agent.prompt(...) with the user's message and awaits the result. The Pi agent loop internally handles the tool-calling cycle — it calls the LLM, the LLM requests tool calls, Pi executes them, feeds results back, and repeats until the model is done.
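That internal cycle can be sketched in a few lines. This is a stand-in illustration, not Pi's actual implementation: `callModel`, `ModelTurn`, and the tool registry shape are all hypothetical names for the sake of the sketch.

```typescript
// Minimal sketch of a tool-calling agent loop with stand-in types.
// The model either requests a tool call or returns a final answer.
type ToolCall = { name: string; args: Record<string, unknown> };
type ModelTurn = { toolCall?: ToolCall; done?: string };
type Tool = (args: Record<string, unknown>) => Promise<string>;

async function runLoop(
  callModel: (history: string[]) => Promise<ModelTurn>,
  tools: Record<string, Tool>,
  userMessage: string
): Promise<string> {
  const history: string[] = [`user: ${userMessage}`];
  // Loop until the model stops requesting tools -- no max-steps knob,
  // mirroring the "loop just loops until the agent says it's done" design.
  for (;;) {
    const turn = await callModel(history);
    if (turn.done !== undefined) return turn.done;
    const call = turn.toolCall!;
    const result = await tools[call.name](call.args);
    history.push(`tool(${call.name}): ${result}`); // feed result back
  }
}
```

Everything else (streaming, timeouts, resumability) is layered on top of this basic prompt → tool call → result cycle.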
### Key pieces of the stack
| Layer | Package | Role |
|---|---|---|
| LLM communication | @mariozechner/pi-ai | Unified multi-provider LLM API (resolves models via Vercel AI Gateway) |
| Agent loop | @mariozechner/pi-agent-core | The Agent class — manages state, tool calling loop, event subscriptions, message queuing |
| Model routing | @ai-sdk/gateway + Vercel AI Gateway | Routes to the actual model (default: anthropic/claude-sonnet-4.6) |
### Notable features of the loop
- Streaming: Junior subscribes to agent events (agent.subscribe()) for real-time text deltas piped back to Slack
- Timeout handling: A configurable turn timeout (default 12 min) races against the prompt — if it fires, the agent is aborted via agent.abort()
- Session resumability: Agent messages can be checkpointed and resumed across turns (replaceMessages + continue()) for long-running conversations
- Tool adaptation: Junior's own ToolDefinition objects (Slack tools, web search, bash sandbox, skill loading, etc.) are adapted into Pi's AgentTool format via createAgentTools()
The Pi agent loop is intentionally minimal — as its creator describes it: "The loop just loops until the agent says it's done." No max-steps knobs or complex orchestration — just a straightforward prompt → tool call → result → repeat cycle.
> Timeout handling
tell me about this. why does it do this? what happens on timeout? does it re-call itself? it is because it's deployed to vercel, right?
Now I have the full picture. Here's the explanation:
## Timeout Handling: Why, What, and How
### Why — Yes, it's because of Vercel
Junior is a Next.js app deployed on Vercel, and Vercel serverless functions have hard execution time limits. The agent loop (LLM calls + tool execution) can easily take many minutes for complex tasks, so it needs a way to survive being killed by the platform.
The timeout budget is derived directly from the Vercel Queue callback's max duration:
```typescript
// config.ts
const DEFAULT_QUEUE_CALLBACK_MAX_DURATION_SECONDS = 800; // ~13.3 min
const TURN_TIMEOUT_BUFFER_SECONDS = 20; // safety margin

// turnTimeout = min(800 - 20, 720) seconds = 720s = 12 minutes
const DEFAULT_AGENT_TURN_TIMEOUT_MS = 12 * 60 * 1000;
```
The 20-second buffer ensures the agent has time to checkpoint its state before Vercel kills the function.
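As a sanity check on that arithmetic, here is the derivation written as a standalone function. This is a sketch; the parameter names are mine, not actual exports from the config module.

```typescript
// Derive the agent turn timeout (in ms) from the queue callback budget:
// subtract the checkpoint buffer, then cap at the 12-minute default.
function turnTimeoutMs(
  queueCallbackMaxSeconds: number, // 800 in the default config
  bufferSeconds: number,           // 20-second safety margin
  capSeconds: number               // 720 = 12 minutes
): number {
  return Math.min(queueCallbackMaxSeconds - bufferSeconds, capSeconds) * 1000;
}

// turnTimeoutMs(800, 20, 720) === 720_000, i.e. 12 minutes
```

With the defaults, the cap (720 s) wins over the budget minus buffer (780 s); with a tighter queue budget, the buffer term would dominate instead.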
### What happens on timeout
Here's the sequence:
1. Promise.race — The agent prompt races against a timeout timer. When the timer fires:

   ```typescript
   agent.abort(); // stops the Pi agent loop mid-execution
   reject(new AgentTurnTimeoutError(botConfig.turnTimeoutMs));
   ```

2. Snapshot messages — After aborting, it captures the agent's current message history:

   ```typescript
   timeoutResumeMessages = [...(agent.state.messages as unknown[])];
   ```

3. Checkpoint — The outer catch block writes a durable checkpoint with state "awaiting_resume":

   ```typescript
   await upsertAgentTurnSessionCheckpoint({
     conversationId,
     sessionId,
     sliceId: nextSliceId,
     state: "awaiting_resume",
     piMessages, // the full message history so far
     resumedFromSliceId: currentSliceId,
   });
   ```

4. Throw RetryableTurnError — This propagates all the way up through replyToThread → app-runtime → the queue callback handler.
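Steps 1 and 2 together amount to a race-and-snapshot pattern. Here is a runnable sketch with a stand-in agent interface (not Pi's real Agent class, and `AgentLike` is a name I made up):

```typescript
// Race the agent's prompt against a timeout; on timeout, abort the
// agent and snapshot its message history for the checkpoint step.
class AgentTurnTimeoutError extends Error {
  constructor(readonly timeoutMs: number) {
    super(`agent turn timed out after ${timeoutMs}ms`);
  }
}

interface AgentLike {
  prompt(): Promise<string>;
  abort(): void;
  state: { messages: unknown[] };
}

async function runTurnWithTimeout(
  agent: AgentLike,
  timeoutMs: number
): Promise<string | { resumeMessages: unknown[] }> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => {
      agent.abort(); // stop the loop mid-execution
      reject(new AgentTurnTimeoutError(timeoutMs));
    }, timeoutMs);
  });
  try {
    return await Promise.race([agent.prompt(), timeout]);
  } catch (err) {
    if (err instanceof AgentTurnTimeoutError) {
      // Snapshot the history so the next slice can resume from here.
      return { resumeMessages: [...agent.state.messages] };
    }
    throw err;
  } finally {
    if (timer !== undefined) clearTimeout(timer);
  }
}
```

In the real code the snapshot feeds the checkpoint write (step 3) rather than being returned directly, but the control flow is the same.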
### Yes, it re-calls itself
The RetryableTurnError is thrown as an unhandled error from the Vercel Queue callback handler (@vercel/queue's handleCallback). The queue's retry policy catches this and re-delivers the message with a linearly increasing, capped backoff:
```typescript
// queue/client.ts
retry: (_error, metadata) => {
  if (metadata.deliveryCount >= MAX_DELIVERY_ATTEMPTS) { // max 10 retries
    return { acknowledge: true }; // give up
  }
  const backoffSeconds = Math.min(300, Math.max(5, metadata.deliveryCount * 5));
  return { afterSeconds: backoffSeconds }; // retry after 5s, 10s, 15s...
}
```
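The schedule works out to 5 s, 10 s, 15 s, and so on, with a ceiling of 300 s. Extracted into a standalone function (a sketch mirroring the snippet above):

```typescript
// Retry delay from the delivery count: linear growth with a 5 s floor
// and a 300 s ceiling, matching min(300, max(5, deliveryCount * 5)).
function backoffSeconds(deliveryCount: number): number {
  return Math.min(300, Math.max(5, deliveryCount * 5));
}

// backoffSeconds(1) === 5, backoffSeconds(2) === 10, backoffSeconds(100) === 300
```

Note that this is linear rather than truly exponential backoff, which suits the use case: each retry is a deliberate "next slice" rather than a congestion signal.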
When the next delivery runs, generateAssistantReply detects the existing checkpoint:
```typescript
const existingTurnCheckpoint = await getAgentTurnSessionCheckpoint(conversationId, sessionId);
const resumedFromCheckpoint = Boolean(
  existingTurnCheckpoint?.state === "awaiting_resume" &&
  existingTurnCheckpoint.piMessages.length > 0
);
```
And resumes the agent from where it left off instead of starting over:
```typescript
// Restore the Pi agent's message history
await maybeReplaceAgentMessages(agent, existingTurnCheckpoint.piMessages);

// Then continue the loop instead of re-prompting
const promptPromise = resumedFromCheckpoint
  ? runAgentContinuation(agent) // agent.continue() — picks up where it stopped
  : agent.prompt({ ... }); // normal first-time prompt
```
### The full architecture
The design is called "multi-slice execution" in their spec:
```text
User message → Vercel Queue → Slice 1 (up to ~12 min)
    ↓ timeout
  checkpoint state="awaiting_resume"
  throw RetryableTurnError
    ↓
Queue re-delivers → Slice 2 (up to ~12 min)
    ↓ timeout
  checkpoint again...
    ↓
Queue re-delivers → Slice N
    ↓ finishes
  checkpoint state="completed"
```
Each "slice" gets a fresh 12-minute Vercel function invocation. The Pi agent's message history (including all tool calls and results so far) is checkpointed to durable storage and replayed into a new Agent instance on the next slice. This allows unbounded execution time across up to 10 slices (so theoretically ~2 hours of agent work per turn).
a new user message essentially creates a queue item?
Now I have the full picture. Here's the answer:
## Yes — every incoming Slack message becomes a queue item
The flow is:
### 1. Slack webhook hits Vercel
Slack sends an HTTP POST to /api/webhooks/slack. This is a normal Vercel serverless function with a short execution budget — Slack requires a response within 3 seconds or it retries.
### 2. Monkey-patch intercepts the message
Junior installs a prototype patch on the Chat class (chat-background-patch.ts) that replaces processMessage. Instead of processing the message inline, it:
- Uses waitUntil() (from Next.js's after()) to do work after the 200 response is sent to Slack
- Inside that background work, calls routeIncomingMessageToQueue()
### 3. Route → enqueue
routeIncomingMessageToQueue() does several checks before enqueuing:
- Ignores self-messages (bot's own messages)
- Ignores messages without an ID
- Checks if the thread is subscribed or if the message is a mention — if neither, ignores it
- For subscribed threads (non-mention), runs a routing check (shouldReplyInSubscribedThread) to decide if it should even respond
- Deduplicates via a 24-hour TTL key to prevent Slack retries from creating duplicate work
- Adds an 👀 reaction to the message (visual "I'm working on it" feedback)
- Enqueues the serialized message + thread to @vercel/queue:

```typescript
queueMessageId = await deps.enqueueThreadMessage(payload, dedupKey);
```
This calls send() from @vercel/queue to the topic "junior-thread-message".
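The dedup step can be sketched with an in-memory TTL map. This is a stand-in only: the real implementation presumably uses durable storage, and `isDuplicate` is a hypothetical helper name.

```typescript
// In-memory dedup keyed by message ID with a TTL, so Slack's webhook
// retries don't enqueue the same message twice.
const seen = new Map<string, number>(); // key -> expiry timestamp (ms)
const DEDUP_TTL_MS = 24 * 60 * 60 * 1000; // 24 hours

function isDuplicate(key: string, now: number = Date.now()): boolean {
  const expiry = seen.get(key);
  if (expiry !== undefined && expiry > now) return true; // seen recently
  seen.set(key, now + DEDUP_TTL_MS); // record (or refresh) the key
  return false;
}
```

The TTL matters because Slack may retry a webhook minutes later; a per-request in-memory set would not survive across serverless invocations, which is why a shared store with expiry is needed.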
### 4. Queue delivers to the callback handler
The queue delivers the item to /api/queue/callback (configured in vercel.json), which is a separate Vercel function invocation with a much longer execution budget (up to ~800 seconds). That's where the actual agent loop runs.
### Why this two-step design?
```text
Slack POST ──→ Webhook handler (respond 200 in <3s)
                 └─ waitUntil: enqueue to Vercel Queue
                 └─ 👀 reaction added

Queue delivery ──→ /api/queue/callback (up to ~13 min budget)
                     └─ remove 👀 reaction
                     └─ run agent loop (Pi Agent)
                     └─ post reply to Slack
                     └─ if timeout → checkpoint → RetryableTurnError
                     └─ queue re-delivers → resume from checkpoint
```
It solves two Vercel constraints:
- Slack's 3-second webhook timeout — you can't do LLM work inline in the webhook handler
- Serverless function time limits — the queue callback gets a long budget, and if even that isn't enough, the checkpoint/resume mechanism splits work across multiple invocations
> Uses waitUntil() (from Next.js's after())
how's this code
Here's how the two pieces connect:
### The bridge
In webhooks.ts, the Slack adapter's webhook handler is called with a waitUntil option:
```typescript
const response = await handler(request, {
  waitUntil: (task) =>
    after(() => {
      const runTask = () => {
        const taskOrFactory = task as Promise<unknown> | (() => Promise<unknown>);
        return typeof taskOrFactory === "function" ? taskOrFactory() : taskOrFactory;
      };
      if (activeSpan) {
        return Sentry.withActiveSpan(activeSpan, runTask);
      }
      return runTask();
    })
});
```
The Slack adapter (from the @chat-adapter/slack package) parses the webhook, returns a 200 response to Slack immediately, and calls waitUntil(task) with the actual processing work. That waitUntil is wired to Next.js's after() — which schedules a callback to run after the HTTP response has been sent to the client.
Then in chat-background-patch.ts, the patched processMessage passes that run function (which does routeIncomingMessageToQueue) into scheduleBackgroundWork, which just calls options.waitUntil(run).
### So the chain is:
```text
Slack adapter calls waitUntil(task)
  → which calls after(() => task())     // Next.js after()
  → response is sent to Slack (200 OK)
  → then task() runs in the background
  → routeIncomingMessageToQueue()
  → enqueueThreadMessage()              // @vercel/queue send()
```
The after() import is from next/server (line 1 of webhooks.ts). It also wraps the background work with Sentry.withActiveSpan so the Sentry trace context carries over — otherwise the enqueue work would be orphaned from the original request span.
It's a fairly standard pattern for serverless Slack bots — the only slightly unusual part is the monkey-patching of Chat.prototype.processMessage to intercept what the chat library would normally do (process the message inline) and redirect it through the queue instead.
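The prototype-patch trick itself is plain TypeScript. A minimal sketch with a stand-in class (not the real `chat` internals):

```typescript
// Stand-in Chat class whose default behavior is to process inline,
// mimicking what the library would normally do.
class MiniChat {
  processed: string[] = [];
  processMessage(msg: string): void {
    this.processed.push(msg); // original inline behavior
  }
}

// The queue that patched instances route into instead.
const queue: string[] = [];

// Patch the prototype so *every* instance -- including ones the library
// constructs internally -- enqueues instead of processing inline.
MiniChat.prototype.processMessage = function (msg: string): void {
  queue.push(msg);
};
```

Because the patch lives on the prototype rather than an instance, it catches messages no matter where the library creates its objects, which is presumably why Junior patches Chat.prototype.processMessage instead of wrapping a single bot instance.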
does it use chat npm library?
Got it. It's not a random npm package — it's Vercel's own Chat SDK.
The chat package at version 4.17.0 is vercel/chat — Vercel's unified TypeScript SDK for building chat bots across Slack, Teams, Google Chat, Discord, and more. It's published on npm under the bare name chat.
Junior uses it as its platform abstraction layer. Here's how the pieces fit:
| Dependency | What it is |
|---|---|
chat (v4.17.0) | Core SDK — the Chat class, Thread, Message types, state management |
@chat-adapter/slack (v4.17.0) | Slack adapter — handles webhook parsing, signing, posting messages |
@chat-adapter/state-redis (v4.17.0) | Redis state adapter for thread subscriptions and persistence |
@chat-adapter/state-memory (v4.17.0) | In-memory state for dev/testing |
You can see it wired up in bot.ts:
```typescript
import { Chat } from "chat";
import { createSlackAdapter } from "@chat-adapter/slack";

const createdBot = new Chat<{ slack: SlackAdapter }>({
  userName: botConfig.userName,
  adapters: {
    slack: createSlackAdapter({ signingSecret, botToken, ... })
  },
  state: getStateAdapter()
});
```
The Chat SDK provides the event system (bot.onNewMention, bot.onSubscribedMessage, etc.), thread/message abstractions, and the webhook handling. Junior then monkey-patches the Chat.prototype.processMessage method (in chat-background-patch.ts) to redirect incoming messages through the Vercel Queue instead of processing them inline — which is the whole webhook → queue flow we looked at earlier.