
**Claude Opus 4.6** commented:

## getsentry/junior — A Slack Bot Runtime for Next.js

Junior is a Slack bot runtime built for Next.js apps, created by Sentry. It lets you investigate issues, summarize context, and take action directly from Slack using connected tools.

### Packages

The repo is a monorepo with three main packages:

| Package | Purpose |
| --- | --- |
| `@sentry/junior` | Core Slack bot runtime for Next.js |
| `@sentry/junior-github` | GitHub plugin for issue workflows |
| `@sentry/junior-sentry` | Sentry plugin for issue workflows |

### Structure

- `packages/junior` — The core runtime
- `packages/junior-github` — GitHub integration plugin
- `packages/junior-sentry` — Sentry integration plugin
- `packages/docs` — Documentation site (hosted at junior.sentry.dev)
- `apps/example` — Example app
- `specs/` — Design specs covering agents, plugins, OAuth, security, and more
- `scripts/` — Release and dev helper scripts

Want me to dive deeper into any part of the codebase?

**Julian Benegas** commented:

what does it use for the agent loop?

**Claude Opus 4.6** commented:

Now I have a clear picture. Here's the summary:


## Agent Loop: `@mariozechner/pi-agent-core` (from the Pi toolkit)

Junior uses Pi (@mariozechner/pi-agent-core) for its agent loop — specifically the Agent class. Pi is a TypeScript AI agent toolkit created by Mario Zechner (@badlogic), the same stack that powers OpenClaw.

### How it works in Junior

The core agent loop lives in packages/junior/src/chat/respond.ts (~line 969):

```typescript
import { Agent, type AgentTool } from "@mariozechner/pi-agent-core";

const agent = new Agent({
  getApiKey: () => getGatewayApiKey(),
  initialState: {
    systemPrompt: baseInstructions,
    model: resolveGatewayModel(botConfig.modelId),   // via @mariozechner/pi-ai
    tools: createAgentTools(...)                     // Junior's tools adapted to AgentTool format
  }
});
```

Then it calls agent.prompt(...) with the user's message and awaits the result. The Pi agent loop internally handles the tool-calling cycle — it calls the LLM, the LLM requests tool calls, Pi executes them, feeds results back, and repeats until the model is done.
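That cycle can be sketched as a minimal loop. This is illustrative only, not Pi's actual source; `callLlm` and the tool signature are stand-ins for the real provider APIs:

```typescript
// Illustrative sketch of a tool-calling agent loop; not Pi's actual source.
// `callLlm` and `Tool` are stand-ins for real LLM/tool interfaces.
type ToolCall = { name: string; args: unknown };
type LlmReply = { text: string; toolCalls: ToolCall[] };
type Tool = (args: unknown) => Promise<string>;

async function runAgentLoop(
  callLlm: (history: string[]) => Promise<LlmReply>,
  tools: Record<string, Tool>,
  userMessage: string
): Promise<string> {
  const history: string[] = [userMessage];
  for (;;) {
    const reply = await callLlm(history); // call the LLM
    history.push(reply.text);
    if (reply.toolCalls.length === 0) return reply.text; // model is done
    for (const call of reply.toolCalls) {
      const result = await tools[call.name](call.args); // execute requested tool
      history.push(result); // feed the result back
    }
  }
}
```

The loop has no step counter: it terminates only when the model stops requesting tools.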

### Key pieces of the stack

| Layer | Package | Role |
| --- | --- | --- |
| LLM communication | `@mariozechner/pi-ai` | Unified multi-provider LLM API (resolves models via Vercel AI Gateway) |
| Agent loop | `@mariozechner/pi-agent-core` | The `Agent` class — manages state, tool-calling loop, event subscriptions, message queuing |
| Model routing | `@ai-sdk/gateway` + Vercel AI Gateway | Routes to the actual model (default: `anthropic/claude-sonnet-4.6`) |

### Notable features of the loop

  1. Streaming: Junior subscribes to agent events (agent.subscribe()) for real-time text deltas piped back to Slack
  2. Timeout handling: A configurable turn timeout (default 12 min) races against the prompt — if it fires, the agent is aborted via agent.abort()
  3. Session resumability: Agent messages can be checkpointed and resumed across turns (replaceMessages + continue()) for long-running conversations
  4. Tool adaptation: Junior's own ToolDefinition objects (Slack tools, web search, bash sandbox, skill loading, etc.) are adapted into Pi's AgentTool format via createAgentTools()
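Feature 1 (streaming) boils down to a subscribe/emit pattern. A self-contained sketch — the event names and shapes here are assumptions, not Pi's actual event types:

```typescript
// Sketch of the subscribe-for-deltas pattern (feature 1 above).
// Event names and shapes are assumptions, not Pi's actual types.
type AgentEvent = { type: "text_delta"; delta: string } | { type: "turn_end" };
type Listener = (e: AgentEvent) => void;

class DeltaEmitter {
  private listeners: Listener[] = [];
  subscribe(fn: Listener): () => void {
    this.listeners.push(fn);
    // Return an unsubscribe function
    return () => {
      this.listeners = this.listeners.filter((l) => l !== fn);
    };
  }
  emit(e: AgentEvent): void {
    for (const l of this.listeners) l(e);
  }
}

// Accumulate deltas and hand the full text to a sink (e.g. a Slack message updater).
function pipeDeltas(agent: DeltaEmitter, flush: (text: string) => void): void {
  let buffer = "";
  agent.subscribe((e) => {
    if (e.type === "text_delta") buffer += e.delta;
    else flush(buffer);
  });
}
```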

The Pi agent loop is intentionally minimal — as its creator describes it: "The loop just loops until the agent says it's done." No max-steps knobs or complex orchestration — just a straightforward prompt → tool call → result → repeat cycle.

**Julian Benegas** commented:

> Timeout handling

tell me about this. why does it do this? what happens on timeout? does it re-call itself? it is because it's deployed to vercel, right?

**Claude Opus 4.6** commented:

Now I have the full picture. Here's the explanation:


## Timeout Handling: Why, What, and How

### Why — Yes, it's because of Vercel

Junior is a Next.js app deployed on Vercel, and Vercel serverless functions have hard execution time limits. The agent loop (LLM calls + tool execution) can easily take many minutes for complex tasks, so it needs a way to survive being killed by the platform.

The timeout budget is derived directly from the Vercel Queue callback's max duration:

```typescript
// config.ts
const DEFAULT_QUEUE_CALLBACK_MAX_DURATION_SECONDS = 800;  // ~13.3 min
const TURN_TIMEOUT_BUFFER_SECONDS = 20;                    // safety margin

// turnTimeout = min(800 - 20, 720) seconds = 720s = 12 minutes
const DEFAULT_AGENT_TURN_TIMEOUT_MS = 12 * 60 * 1000;
```

The 20-second buffer ensures the agent has time to checkpoint its state before Vercel kills the function.

### What happens on timeout

Here's the sequence:

1. **`Promise.race`** — The agent prompt races against a timeout timer. When the timer fires:

   ```typescript
   agent.abort();  // stops the Pi agent loop mid-execution
   reject(new AgentTurnTimeoutError(botConfig.turnTimeoutMs));
   ```

2. **Snapshot messages** — After aborting, it captures the agent's current message history:

   ```typescript
   timeoutResumeMessages = [...(agent.state.messages as unknown[])];
   ```

3. **Checkpoint** — The outer catch block writes a durable checkpoint with state `"awaiting_resume"`:

   ```typescript
   await upsertAgentTurnSessionCheckpoint({
     conversationId, sessionId,
     sliceId: nextSliceId,
     state: "awaiting_resume",
     piMessages,                    // the full message history so far
     resumedFromSliceId: currentSliceId,
   });
   ```

4. **Throw `RetryableTurnError`** — This propagates all the way up through `replyToThread` → app-runtime → the queue callback handler.
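Steps 1–2 can be condensed into a sketch of the race itself. The agent interface below is a minimal stand-in, not Junior's real types:

```typescript
// Minimal stand-in for the turn-timeout race; not Junior's real types.
interface AbortableAgent {
  prompt(text: string): Promise<string>;
  abort(): void;
}

class AgentTurnTimeout extends Error {}

async function raceTurn(agent: AbortableAgent, text: string, timeoutMs: number): Promise<string> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => {
      agent.abort(); // stop the loop mid-execution
      reject(new AgentTurnTimeout(`turn exceeded ${timeoutMs}ms`));
    }, timeoutMs);
  });
  try {
    return await Promise.race([agent.prompt(text), timeout]);
  } finally {
    clearTimeout(timer); // avoid a dangling timer when the prompt wins
  }
}
```

The `finally` cleanup matters on serverless: a still-pending timer can keep an invocation alive after the prompt has already resolved.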

### Yes, it re-calls itself

The RetryableTurnError is thrown as an unhandled error from the Vercel Queue callback handler (@vercel/queue's handleCallback). The queue's retry policy catches this and re-delivers the message with exponential backoff:

```typescript
// queue/client.ts
retry: (_error, metadata) => {
  if (metadata.deliveryCount >= MAX_DELIVERY_ATTEMPTS) {  // max 10 retries
    return { acknowledge: true };  // give up
  }
  const backoffSeconds = Math.min(300, Math.max(5, metadata.deliveryCount * 5));
  return { afterSeconds: backoffSeconds };  // retry after 5s, 10s, 15s...
}
```
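Plugging delivery counts into that formula gives the schedule — same math, extracted into a standalone function:

```typescript
// The backoff formula from queue/client.ts above, isolated:
// clamp deliveryCount * 5 seconds to the range [5, 300].
function backoffSeconds(deliveryCount: number): number {
  return Math.min(300, Math.max(5, deliveryCount * 5));
}

// Deliveries 1..3 back off 5s, 10s, 15s; the 300s (5 min) cap kicks in at count 60.
const schedule = [1, 2, 3, 10, 60, 100].map(backoffSeconds);
console.log(schedule); // [5, 10, 15, 50, 300, 300]
```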

When the next delivery runs, generateAssistantReply detects the existing checkpoint:

```typescript
const existingTurnCheckpoint = await getAgentTurnSessionCheckpoint(conversationId, sessionId);
const resumedFromCheckpoint = Boolean(
  existingTurnCheckpoint?.state === "awaiting_resume" &&
  existingTurnCheckpoint.piMessages.length > 0
);
```

And resumes the agent from where it left off instead of starting over:

```typescript
// Restore the Pi agent's message history
await maybeReplaceAgentMessages(agent, existingTurnCheckpoint.piMessages);

// Then continue the loop instead of re-prompting
const promptPromise = resumedFromCheckpoint
  ? runAgentContinuation(agent)   // agent.continue() — picks up where it stopped
  : agent.prompt({ ... });        // normal first-time prompt
```

### The full architecture

The design is called "multi-slice execution" in their spec:

```
User message → Vercel Queue → Slice 1 (up to ~12 min)
                                  ↓ timeout
                              checkpoint state="awaiting_resume"
                              throw RetryableTurnError
                                  ↓
                              Queue re-delivers → Slice 2 (up to ~12 min)
                                  ↓ timeout
                              checkpoint again...
                                  ↓
                              Queue re-delivers → Slice N
                                  ↓ finishes
                              checkpoint state="completed"
```

Each "slice" gets a fresh 12-minute Vercel function invocation. The Pi agent's message history (including all tool calls and results so far) is checkpointed to durable storage and replayed into a new Agent instance on the next slice. This allows unbounded execution time across up to 10 slices (so theoretically ~2 hours of agent work per turn).
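A condensed model of one slice, with an in-memory map standing in for the checkpoint store — Junior's real types and storage are richer than this:

```typescript
// Condensed model of the slice lifecycle; the store and agent interface are
// in-memory stand-ins, not Junior's actual checkpoint schema.
type Checkpoint = { state: "awaiting_resume" | "completed"; messages: string[] };

class RetryableTurnError extends Error {}

interface SliceAgent {
  // Runs until done or the slice budget is exhausted; returns history either way.
  run(messages: string[]): Promise<{ timedOut: boolean; messages: string[] }>;
}

async function runSlice(
  store: Map<string, Checkpoint>,
  key: string,
  agent: SliceAgent,
  firstPrompt: string
): Promise<string[]> {
  const prior = store.get(key);
  const history =
    prior?.state === "awaiting_resume" && prior.messages.length > 0
      ? prior.messages   // resume from checkpoint
      : [firstPrompt];   // first slice: start fresh
  const result = await agent.run(history);
  if (result.timedOut) {
    store.set(key, { state: "awaiting_resume", messages: result.messages }); // checkpoint
    throw new RetryableTurnError("slice timed out; queue should re-deliver");
  }
  store.set(key, { state: "completed", messages: result.messages });
  return result.messages;
}
```

Each queue re-delivery calls `runSlice` again with the same key, so history accumulates across slices instead of restarting.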

**Julian Benegas** commented:

a new user message essentially creates a queue item?

**Claude Opus 4.6** commented:

Now I have the full picture. Here's the answer:


## Yes — every incoming Slack message becomes a queue item

The flow is:

### 1. Slack webhook hits Vercel

Slack sends an HTTP POST to /api/webhooks/slack. This is a normal Vercel serverless function with a short execution budget — Slack requires a response within 3 seconds or it retries.

### 2. Monkey-patch intercepts the message

Junior installs a prototype patch on the Chat class (chat-background-patch.ts) that replaces processMessage. Instead of processing the message inline, it:

  1. Uses `waitUntil()` (wired to Next.js's `after()`) to defer the work until after the 200 response has been sent to Slack
  2. Inside that background work, calls routeIncomingMessageToQueue()

###3. Route → enqueue

routeIncomingMessageToQueue() does several checks before enqueuing:

- Ignores self-messages (the bot's own messages)
- Ignores messages without an ID
- Checks whether the thread is subscribed or the message is a mention — if neither, ignores it
- For subscribed threads (non-mention), runs a routing check (`shouldReplyInSubscribedThread`) to decide if it should even respond
- Deduplicates via a 24-hour TTL key to prevent Slack retries from creating duplicate work
- Adds an 👀 reaction to the message (visual "I'm working on it" feedback)
- Enqueues the serialized message + thread to `@vercel/queue`:

  ```typescript
  queueMessageId = await deps.enqueueThreadMessage(payload, dedupKey);
  ```

This calls `send()` from `@vercel/queue` to the topic `"junior-thread-message"`.
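The 24-hour dedup key can be illustrated with a TTL check-and-set. This uses an in-memory map as a stand-in for whatever durable store Junior actually uses; only the shape of the logic carries over:

```typescript
// In-memory stand-in for the 24-hour dedup key; Junior's store is durable,
// but the check-and-set shape is the same.
const DEDUP_TTL_MS = 24 * 60 * 60 * 1000;

class TtlDedup {
  private seen = new Map<string, number>(); // dedup key -> expiry timestamp (ms)

  // Returns true the first time a key is claimed within the TTL window;
  // false for duplicates (e.g. Slack webhook retries).
  claim(key: string, now: number = Date.now()): boolean {
    const expiry = this.seen.get(key);
    if (expiry !== undefined && expiry > now) return false; // duplicate
    this.seen.set(key, now + DEDUP_TTL_MS);
    return true;
  }
}
```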

###4. Queue delivers to the callback handler

The queue delivers the item to /api/queue/callback (configured in vercel.json), which is a separate Vercel function invocation with a much longer execution budget (up to ~800 seconds). That's where the actual agent loop runs.

###Why this two-step design?

```
Slack POST ──→ Webhook handler (respond 200 in <3s)
                  └─ waitUntil: enqueue to Vercel Queue
                        └─ 👀 reaction added

Queue delivery ──→ /api/queue/callback (up to ~13 min budget)
                     └─ remove 👀 reaction
                     └─ run agent loop (Pi Agent)
                     └─ post reply to Slack
                        └─ if timeout → checkpoint → RetryableTurnError
                           └─ queue re-delivers → resume from checkpoint
```

It solves two Vercel constraints:

  1. Slack's 3-second webhook timeout — you can't do LLM work inline in the webhook handler
  2. Serverless function time limits — the queue callback gets a long budget, and if even that isn't enough, the checkpoint/resume mechanism splits work across multiple invocations

**Julian Benegas** commented:

> Uses `waitUntil()` (from Next.js's `after()`)

how's this code

**Claude Opus 4.6** commented:

Here's how the two pieces connect:

### The bridge

In webhooks.ts, the Slack adapter's webhook handler is called with a waitUntil option:

```typescript
const response = await handler(request, {
  waitUntil: (task) =>
    after(() => {
      const runTask = () => {
        const taskOrFactory = task as Promise<unknown> | (() => Promise<unknown>);
        return typeof taskOrFactory === "function" ? taskOrFactory() : taskOrFactory;
      };
      if (activeSpan) {
        return Sentry.withActiveSpan(activeSpan, runTask);
      }
      return runTask();
    })
});
```

The Slack adapter (from the @chat-adapter/slack package) parses the webhook, returns a 200 response to Slack immediately, and calls waitUntil(task) with the actual processing work. That waitUntil is wired to Next.js's after() — which schedules a callback to run after the HTTP response has been sent to the client.

Then in chat-background-patch.ts, the patched processMessage passes that run function (which does routeIncomingMessageToQueue) into scheduleBackgroundWork, which just calls options.waitUntil(run).

### So the chain is:

```
Slack adapter calls waitUntil(task)
  → which calls after(() => task())        // Next.js after()
    → response is sent to Slack (200 OK)
    → then task() runs in the background
      → routeIncomingMessageToQueue()
        → enqueueThreadMessage()           // @vercel/queue send()
```

The after() import is from next/server (line 1 of webhooks.ts). It also wraps the background work with Sentry.withActiveSpan so the Sentry trace context carries over — otherwise the enqueue work would be orphaned from the original request span.

It's a fairly standard pattern for serverless Slack bots — the only slightly unusual part is the monkey-patching of Chat.prototype.processMessage to intercept what the chat library would normally do (process the message inline) and redirect it through the queue instead.
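The prototype-patch pattern itself, in isolation, looks like this — a generic sketch, not Junior's actual patch code:

```typescript
// Generic sketch of the prototype-patch pattern: replace a class method so
// every instance redirects through new behavior. Not Junior's actual code.
class ChatLike {
  processed: string[] = [];
  processMessage(text: string): void {
    this.processed.push(text); // the library's default inline behavior
  }
}

// Swap in a redirecting implementation; return a function that restores the original.
function patchProcessMessage(redirect: (text: string) => void): () => void {
  const original = ChatLike.prototype.processMessage;
  ChatLike.prototype.processMessage = function (text: string): void {
    redirect(text); // e.g. enqueue to a background queue instead
  };
  return () => {
    ChatLike.prototype.processMessage = original; // unpatch
  };
}
```

Keeping a reference to the original method is what makes the patch reversible, which matters for tests and for falling back to the library's inline behavior.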

**Julian Benegas** commented:

does it use chat npm library?

**Claude Opus 4.6** commented:

Got it. It's not a random npm package — it's Vercel's own Chat SDK.

The chat package at version 4.17.0 is vercel/chat — Vercel's unified TypeScript SDK for building chat bots across Slack, Teams, Google Chat, Discord, and more. It's published on npm under the bare name chat.

Junior uses it as its platform abstraction layer. Here's how the pieces fit:

| Dependency | What it is |
| --- | --- |
| `chat` (v4.17.0) | Core SDK — the `Chat` class, `Thread`, `Message` types, state management |
| `@chat-adapter/slack` (v4.17.0) | Slack adapter — handles webhook parsing, signing, posting messages |
| `@chat-adapter/state-redis` (v4.17.0) | Redis state adapter for thread subscriptions and persistence |
| `@chat-adapter/state-memory` (v4.17.0) | In-memory state for dev/testing |

You can see it wired up in bot.ts:

```typescript
import { Chat } from "chat";
import { createSlackAdapter } from "@chat-adapter/slack";

const createdBot = new Chat<{ slack: SlackAdapter }>({
  userName: botConfig.userName,
  adapters: {
    slack: createSlackAdapter({ signingSecret, botToken, ... })
  },
  state: getStateAdapter()
});
```

The Chat SDK provides the event system (bot.onNewMention, bot.onSubscribedMessage, etc.), thread/message abstractions, and the webhook handling. Junior then monkey-patches the Chat.prototype.processMessage method (in chat-background-patch.ts) to redirect incoming messages through the Vercel Queue instead of processing them inline — which is the whole webhook → queue flow we looked at earlier.

