How do I create agents with the AI SDK?

How do I create an agent using the AI SDK? What are the patterns and APIs for building agents with tool calling and loops?

Claude Haiku 4.5
commented

Perfect! Now I have enough context. Let me provide a comprehensive answer with all the patterns and APIs.

## Creating Agents with the AI SDK

Based on the source code and documentation, here's a complete guide to building agents with the AI SDK:

### Core Concepts

An agent is an LLM that uses tools in a loop to accomplish tasks. The three key components are:

  1. **LLM** - Processes input and decides next actions
  2. **Tools** - Extend capabilities (APIs, databases, file systems)
  3. **Loop** - Orchestrates execution through context management and stopping conditions
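As a mental model, the loop can be sketched without any SDK at all. The types and names below are hypothetical, purely for illustration, not the AI SDK's API: the model either requests a tool call or produces text, and the loop executes tools and feeds results back until text arrives or a step limit is hit.

```typescript
// Conceptual sketch of an agent loop (not the AI SDK's implementation).
// The model either requests a tool call or answers with text; the loop
// runs requested tools and appends their results to the context until
// the model answers or the step limit is reached.

type ModelTurn =
  | { type: 'tool-call'; toolName: string; input: unknown }
  | { type: 'text'; text: string };

type ToolFn = (input: unknown) => unknown;

function runAgentLoop(
  model: (history: unknown[]) => ModelTurn,
  tools: Record<string, ToolFn>,
  maxSteps = 20,
): string {
  const history: unknown[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const turn = model(history);
    if (turn.type === 'text') return turn.text; // model answered: stop
    // Model requested a tool: run it and feed the result back.
    const result = tools[turn.toolName](turn.input);
    history.push({ toolName: turn.toolName, result });
  }
  return '[stopped: step limit reached]';
}
```

The `ToolLoopAgent` class described below packages this orchestration (plus streaming, stop conditions, and context management) so you rarely write the loop yourself.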

## The Main API: ToolLoopAgent

The easiest way to create agents is using the `ToolLoopAgent` class:

```ts
import { ToolLoopAgent, tool, stepCountIs } from 'ai';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';

const agent = new ToolLoopAgent({
  model: openai('gpt-4o'),
  instructions: 'You are a helpful assistant.',
  tools: {
    weather: tool({
      description: 'Get the weather in a location (in Fahrenheit)',
      inputSchema: z.object({
        location: z.string().describe('The location to get the weather for'),
      }),
      execute: async ({ location }) => ({
        location,
        temperature: 72 + Math.floor(Math.random() * 21) - 10,
      }),
    }),
    convertFahrenheitToCelsius: tool({
      description: 'Convert temperature from Fahrenheit to Celsius',
      inputSchema: z.object({
        temperature: z.number().describe('Temperature in Fahrenheit'),
      }),
      execute: async ({ temperature }) => {
        const celsius = Math.round((temperature - 32) * (5 / 9));
        return { celsius };
      },
    }),
  },
  stopWhen: stepCountIs(20), // Default: stop after 20 steps
});

// Use the agent
const result = await agent.generate({
  prompt: 'What is the weather in San Francisco in celsius?',
});

console.log(result.text); // Final answer
console.log(result.steps); // All steps taken
```

## Key APIs and Patterns

### 1. Creating Tools

Tools are defined with the `tool()` function, exported from the `ai` package (re-exported from `@ai-sdk/provider-utils`):

```ts
import { tool } from 'ai';
import { z } from 'zod';

const myTool = tool({
  description: 'What this tool does',
  inputSchema: z.object({
    param1: z.string().describe('Description of param1'),
    param2: z.number().optional(),
  }),
  execute: async ({ param1, param2 }) => {
    // Tool logic here
    return { result: 'data' };
  },
});
```

#### Advanced: Streaming Tools

For long-running operations, tools can yield progress updates:

```ts
const streamingTool = tool({
  description: 'A tool that reports progress',
  inputSchema: z.object({ query: z.string() }),
  async *execute({ query }) {
    yield { state: 'loading' as const };

    // Do work...
    await new Promise(resolve => setTimeout(resolve, 2000));

    yield {
      state: 'ready' as const,
      data: 'completed',
    };
  },
});
```
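The async-generator mechanics behind this pattern can be seen in isolation. Below is a standalone sketch with no SDK involved (the names `doWork` and `collect` are made up for illustration): each `yield` reaches the consumer as a progress update, and the last yielded value acts as the final result.

```typescript
// Standalone sketch of the async-generator pattern used above:
// each `yield` is a progress update; the final one carries the result.

type Progress =
  | { state: 'loading' }
  | { state: 'ready'; data: string };

async function* doWork(query: string): AsyncGenerator<Progress> {
  yield { state: 'loading' };
  // ...long-running work would happen here...
  yield { state: 'ready', data: `results for ${query}` };
}

// A consumer (like the agent loop) iterates with `for await`:
async function collect(query: string): Promise<Progress[]> {
  const updates: Progress[] = [];
  for await (const update of doWork(query)) updates.push(update);
  return updates;
}
```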

### 2. Agent Configuration

#### Model & Instructions

```ts
const agent = new ToolLoopAgent({
  model: openai('gpt-4o'),
  instructions: `You are an expert data analyst.
    - Always provide sources for your claims
    - Break down complex problems step-by-step
    - Use tools to gather accurate information`,
});
```

#### Tool Choice

Control how the agent uses tools:

```ts
// Default: let model decide when to use tools
toolChoice: 'auto'

// Force tool usage at every step
toolChoice: 'required'

// Disable tools
toolChoice: 'none'

// Force specific tool
toolChoice: {
  type: 'tool',
  toolName: 'search',
}
```

#### Structured Output

```ts
import { Output } from 'ai';

const agent = new ToolLoopAgent({
  model: openai('gpt-4o'),
  output: Output.object({
    schema: z.object({
      sentiment: z.enum(['positive', 'neutral', 'negative']),
      summary: z.string(),
      keyPoints: z.array(z.string()),
    }),
  }),
});

const { output } = await agent.generate({
  prompt: 'Analyze customer feedback',
});
```

### 3. Agent Execution Patterns

#### Generate (Non-streaming)

```ts
const result = await agent.generate({
  prompt: 'What is the weather in New York?',
});

console.log(result.text);         // Final response text
console.log(result.steps);        // Array of all steps
console.log(result.usage);        // Token usage
console.log(result.finishReason); // Why the agent stopped
```

#### Stream (Streaming)

```ts
const stream = await agent.stream({
  prompt: 'Research and summarize...',
});

// Stream text chunks
for await (const chunk of stream.textStream) {
  console.log(chunk);
}

// Or access full results after streaming
const text = await stream.text;
```

#### Track Step Progress

```ts
const result = await agent.generate({
  prompt: 'Your task',
  onStepFinish: async ({ usage, finishReason, toolCalls, text }) => {
    console.log('Step completed:', {
      inputTokens: usage.inputTokens,
      outputTokens: usage.outputTokens,
      finishReason,
      toolsUsed: toolCalls?.map(tc => tc.toolName),
    });
  },
});
```

### 4. Loop Control with `stopWhen` and `prepareStep`

#### Stop Conditions

Control when the agent loop terminates:

```ts
import { stepCountIs, hasToolCall } from 'ai';

// Built-in conditions
stopWhen: stepCountIs(20)  // Default: max 20 steps

// Combine multiple conditions (stops when ANY is met)
stopWhen: [
  stepCountIs(20),
  hasToolCall('search'),  // Stop after calling the 'search' tool
]

// Custom conditions
const hasAnswer = ({ steps }) => {
  return steps.some(step => step.text?.includes('ANSWER:'));
};

stopWhen: hasAnswer
```
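The "stops when ANY is met" semantics are easy to picture with a standalone sketch. The helpers below are hypothetical re-implementations mirroring the documented behavior, not the SDK's internals:

```typescript
// Sketch of stop-condition evaluation: the loop halts as soon as
// ANY condition in the array returns true for the steps so far.
// (Illustrative re-implementation, not the AI SDK's code.)

type Step = { text?: string; toolCalls: { toolName: string }[] };
type StopCondition = (ctx: { steps: Step[] }) => boolean;

const stepCountIs = (n: number): StopCondition =>
  ({ steps }) => steps.length >= n;

const hasToolCall = (name: string): StopCondition =>
  ({ steps }) =>
    steps.some(s => s.toolCalls.some(tc => tc.toolName === name));

function shouldStop(conditions: StopCondition[], steps: Step[]): boolean {
  return conditions.some(condition => condition({ steps }));
}
```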

#### Modify Each Step with `prepareStep`

```ts
const agent = new ToolLoopAgent({
  model: openai('gpt-4o'),
  tools: { /* tools */ },
  prepareStep: async ({ messages, step, tools }) => {
    // Modify messages before the LLM call, e.g. compress old
    // messages for cost optimization (compressMessages is your own helper)
    const compressed = await compressMessages(messages);

    return {
      messages: compressed,
      // Can also modify tools, system prompt, etc.
    };
  },
});
```

### 5. Call Options with Runtime Configuration

Dynamically configure agent behavior based on request context:

```ts
type CallOptions = {
  userId: string;
  budget: number;
};

const agent = new ToolLoopAgent<CallOptions>({
  model: openai('gpt-4o'),
  tools: { /* tools */ },
  prepareCall: async ({ options, prompt }) => {
    const { userId, budget } = options;

    // Customize instructions based on runtime input
    return {
      instructions: `You have a budget of $${budget}.
                      Help user ${userId} with their request.`,
    };
  },
});

// Call with type-safe options
await agent.generate({
  prompt: 'Help me find...',
  options: {
    userId: 'user-123',
    budget: 50,
  },
});
```

### 6. UI Integration Patterns

#### Direct Chat Transport

```ts
import { DirectChatTransport } from 'ai';
import { useChat } from '@ai-sdk/react';

// Client-side
export function Chat() {
  const { messages, input, handleSubmit } = useChat({
    transport: new DirectChatTransport({
      agent: weatherAgent,
      options: { /* call options */ },
    }),
  });

  return (/* your UI */);
}
```

#### API Route Integration

```ts
// app/api/chat/route.ts
import { createAgentUIStreamResponse } from 'ai';

export async function POST(request: Request) {
  const { messages } = await request.json();

  return createAgentUIStreamResponse({
    agent: weatherAgent,
    uiMessages: messages,
  });
}
```

#### Stream to Response (Node.js)

```ts
import { pipeAgentUIStreamToResponse } from 'ai';

app.post('/chat', async (req, res) => {
  await pipeAgentUIStreamToResponse({
    agent: weatherAgent,
    uiMessages: req.body.messages,
    response: res,
  });
});
```

### 7. Type Safety

Infer UI message types for full type safety:

```ts
import { ToolLoopAgent, InferAgentUIMessage } from 'ai';
import { useChat } from '@ai-sdk/react';

const agent = new ToolLoopAgent({
  model: openai('gpt-4o'),
  tools: { /* tools */ },
});

// Infer the message type for your agent
type MyAgentMessage = InferAgentUIMessage<typeof agent>;

// Use in React
function Chat() {
  const { messages } = useChat<MyAgentMessage>();
  // messages are fully typed!
}
```

### 8. Real-world Examples

#### Nested Agents (Agent calling other agents)

```ts
const childAgent = new ToolLoopAgent({
  model: openai('gpt-4o'),
  tools: { /* child tools */ },
  stopWhen: stepCountIs(3),
});

const parentAgent = new ToolLoopAgent({
  model: openai('gpt-4o'),
  tools: {
    delegateToChild: tool({
      description: 'Delegate task to child agent',
      inputSchema: z.object({ task: z.string() }),
      execute: async ({ task }) => {
        const result = await childAgent.stream({ prompt: task });
        return await result.text;
      },
    }),
  },
});
```

#### Manual Loop Control

If you need full control, run the loop yourself with `generateText`, defining the tools without `execute` so the tool calls are returned to you:

```ts
import { generateText, ModelMessage } from 'ai';
import { openai } from '@ai-sdk/openai';

const messages: ModelMessage[] = [{ role: 'user', content: 'Your task' }];

while (true) {
  const result = await generateText({
    model: openai('gpt-4o'),
    messages,
    tools: { /* tools without execute, so calls come back to you */ },
  });

  // Append this step's assistant messages (including tool calls)
  messages.push(...result.response.messages);

  if (result.toolCalls.length === 0) break;

  for (const toolCall of result.toolCalls) {
    const output = await executeTool(toolCall); // your own dispatcher
    messages.push({
      role: 'tool',
      content: [{
        type: 'tool-result',
        toolCallId: toolCall.toolCallId,
        toolName: toolCall.toolName,
        output: { type: 'json', value: output },
      }],
    });
  }
}
```

## Key Takeaways

  1. Use `ToolLoopAgent` for most agent use cases - it handles the loop for you
  2. Define tools with Zod schemas for type safety and validation
  3. Control loops with `stopWhen` and `prepareStep`
  4. Track progress with `onStepFinish` callbacks
  5. Use streaming for real-time responses in the UI
  6. Leverage type inference with `InferAgentUIMessage` for client-side type safety
  7. The loop stops when the LLM responds with text instead of tool calls, when a called tool has no `execute` function, when a tool call requires approval, or when a stop condition is met

This gives you full control over multi-step agentic workflows with tool calling and flexible loop management!



How do I create agents with the AI SDK? — vercel/ai