How is the agent (ghost) implemented in the stack?


How's the agent (Ghost) implemented, in terms of the stack? And specifically, how does it handle the "navigate" behavior?

Claude Opus 4.6
commented

Here's a full breakdown of how the Ghost agent is implemented and how navigation works:


## Ghost Architecture Overview

Ghost is an AI agent embedded as a side panel in this GitHub client app (a Next.js app). The stack:

| Layer | Tech |
| --- | --- |
| LLM Gateway | OpenRouter via `@openrouter/ai-sdk-provider` — defaults to `moonshotai/kimi-k2.5` (or `google/gemini-2.5-pro-preview` for merge conflicts). Users can bring their own API key. |
| AI Framework | Vercel AI SDK v6 (`ai@^6.0.97`) — `streamText()` on the server, `useChat()` on the client. |
| Server Route | A single Next.js API route at `/api/ai/ghost/route.ts` (~3,400 lines), which is the brain. |
| Client UI | `ai-chat.tsx` (the chat component), `global-chat-provider.tsx` (React context for global state), `global-chat-panel.tsx` (the sliding panel), `floating-ghost-button.tsx` (the trigger button). |
| Persistence | Chat history persisted to a DB via `chat-store.ts` with resumable streams. Tab state persisted via `/api/ai/ghost-tabs`. |
| Memory | Long-term memory via Supermemory (`saveMemory`/`recallMemory` tools). Past memories are auto-recalled into the system prompt. |
| Semantic Search | Embeddings via Mixedbread for searching previously viewed PRs/issues. |
| Sandbox | E2B cloud sandboxes for running code, tests, builds, and previews. |
| GitHub API | Via Octokit, authenticated through the user's session. |

## How the Agent Works

The route handler (POST in route.ts) does:

  1. Authenticates the user and gets an Octokit client.
  2. Detects context mode — PR, Issue, or General — based on explicit context from the client or by auto-parsing pageContext.pathname.
  3. Assembles tools by merging several tool groups: generalTools, codeEditTools, sandboxTools, searchTools, memoryTools, plus context-specific tools (prTools, issueTools, mergeConflictTools).
  4. Builds a system prompt that varies by mode (PR, Issue, General), including the current page context, inline code selections, and recalled memories.
  5. Wraps all tools with withSafeTools() — a try/catch wrapper around every tool's execute so a single tool failure doesn't crash the stream.
  6. Calls streamText() with the resolved model, system prompt, messages, tools, up to 50 agentic steps (stopWhen: stepCountIs(50)), and streams the response back.

## How "Navigate" Works — The `_clientAction` Pattern

Navigation is implemented as a server-returns-instruction, client-executes pattern. It's a two-phase process:

### Phase 1: Server-side (tool definitions)

There are 9 navigation tools defined in getGeneralTools(), all following the same pattern — they do no server-side work, just return a marker object:

```ts
// apps/web/src/app/api/ai/ghost/route.ts, line 731
navigateTo: tool({
    description: "Navigate the user to a top-level page within the app...",
    inputSchema: z.object({
        page: z.enum(["dashboard", "repos", "prs", "issues",
                      "notifications", "settings", "search", "trending", "orgs"]),
        description: z.string(),
    }),
    execute: async (input) => ({
        _clientAction: "navigate" as const,   // ← the marker
        ...input,
    }),
}),
```

The other navigation tools (openRepo, openRepoTab, openWorkflowRun, openCommit, openIssue, openPullRequest, openUser, openUrl) work identically — each returns { _clientAction: "<actionName>", ...params }.

The key insight: these tools have no side effects on the server. Their execute functions simply echo back the input with a _clientAction tag. The LLM decides which tool to call and with what parameters; the actual routing happens on the client.
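Because every navigation tool's `execute` has the same shape, the pass-through could be factored into one tiny helper. This is a hypothetical sketch (`clientActionResult` is not a name from the codebase), just to make the "structured intent" idea concrete:

```typescript
// Hypothetical helper: every navigation tool's execute reduces to
// tagging the validated input with its _clientAction marker.
function clientActionResult<A extends string, P extends object>(
  action: A,
  params: P,
): { _clientAction: A } & P {
  return { _clientAction: action, ...params };
}

// e.g. the body of openRepo's execute could then be:
// execute: async (input) => clientActionResult("openRepo", input)
```

Nothing here touches GitHub or the router; the tool result is pure data that rides the stream back to the client.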

### Phase 2: Client-side (effect in ai-chat.tsx)

In ai-chat.tsx (line ~974), a useEffect watches the messages array. When a new assistant message contains a tool output with _clientAction, it dispatches the navigation:

```ts
// apps/web/src/components/shared/ai-chat.tsx, line 1005
if (output._clientAction) {
    executedActionsRef.current.add(actionKey);  // prevent re-execution
    const action = output._clientAction as string;

    setTimeout(() => {  // 600ms delay for UX smoothness
        if (action === "refreshPage") {
            router.refresh();
        } else if (action === "navigate") {
            const pageMap = {
                dashboard: "/dashboard", repos: "/repos", prs: "/prs",
                issues: "/issues", notifications: "/notifications",
                settings: "/settings", search: "/search",
                trending: "/trending", orgs: "/orgs",
            };
            router.push(pageMap[output.page] ?? "/dashboard");
        } else if (action === "openRepo") {
            router.push(`/${output.owner}/${output.repo}`);
        } else if (action === "openRepoTab") {
            router.push(`/${output.owner}/${output.repo}/${output.tab}`);
        } else if (action === "openIssue") {
            router.push(`/${output.owner}/${output.repo}/issues/${output.issueNumber}`);
        } else if (action === "openPullRequest") {
            router.push(`/${output.owner}/${output.repo}/pulls/${output.pullNumber}`);
        } else if (action === "openUser") {
            router.push(`/users/${output.username}`);
        } else if (action === "openUrl") {
            window.open(output.url, "_blank");
        }
        // ... etc for openCommit, openWorkflowRun
    }, 600);
}
```

It uses Next.js's useRouter().push() for in-app navigation and window.open() for external URLs. An executedActionsRef (a Set) keyed by ${messageId}-${toolCallId} prevents the same navigation from firing twice as the messages array re-renders.
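That dedup guard can be isolated as a small sketch (the function name and standalone `Set` are assumptions; in the real component the `Set` lives in a `useRef`):

```typescript
// Sketch of the dedup guard: a Set keyed by `${messageId}-${toolCallId}`
// ensures each _clientAction fires exactly once, even as the effect
// re-runs on every messages-array re-render.
const executedActions = new Set<string>();

function shouldExecute(messageId: string, toolCallId: string): boolean {
  const key = `${messageId}-${toolCallId}`;
  if (executedActions.has(key)) return false; // already fired for this tool call
  executedActions.add(key);
  return true;
}
```

Without it, a single `navigateTo` result would push the route again on every re-render of the chat.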

### Why this design?

The system prompt explicitly instructs: "ALWAYS navigate within the app — use navigation tools, not github.com links." This ensures the LLM uses in-app routing instead of generating raw URLs. The _clientAction convention is a clean way to bridge the server-side tool execution (where the LLM decides what to do) with client-side effects (where router.push actually works), without needing a separate RPC layer.
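One way to see why the convention scales: the client dispatch is just a lookup from action name to path builder. A declarative sketch (not the actual code, which uses the if/else chain shown above; the `navigate` case here omits the real `pageMap` fallback to `/dashboard` for brevity):

```typescript
// Map each _clientAction to a path builder instead of an if/else chain.
// Adding a new navigation tool on the server then only needs one new entry.
type ActionOutput = Record<string, string>;

const actionRoutes: Record<string, (o: ActionOutput) => string> = {
  navigate: (o) => `/${o.page}`,
  openRepo: (o) => `/${o.owner}/${o.repo}`,
  openRepoTab: (o) => `/${o.owner}/${o.repo}/${o.tab}`,
  openIssue: (o) => `/${o.owner}/${o.repo}/issues/${o.issueNumber}`,
  openPullRequest: (o) => `/${o.owner}/${o.repo}/pulls/${o.pullNumber}`,
  openUser: (o) => `/users/${o.username}`,
};

// Returns the in-app path for a known action, or null for actions
// that need special handling (openUrl, refreshPage, ...).
function resolvePath(action: string, output: ActionOutput): string | null {
  const build = actionRoutes[action];
  return build ? build(output) : null;
}
```

The caller would then do `const path = resolvePath(action, output); if (path) router.push(path);`.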

Julian Benegas
commented

clever

Claude Opus 4.6
commented

Yeah, it's a nice pattern. The server-side tools are essentially just structured intents — the LLM picks the right one and fills in the params, but the execute is a pure pass-through. All the real side effects live on the client where router.push actually exists. Keeps things clean and avoids any awkward server→client callback plumbing.



How is the agent (ghost) implemented in the stack? — better-auth/better-hub