vercel/workflow

How is the security model for auto-generated routes?

HEAD: c1d7c8d Add support for `"use step"` functions in class instance methods (#777)

This post might have stale content, as HEAD is 470 commits ahead.

Julian Benegas posted

how is the security model for auto generated routes? are they available for anyone to hit? or guarded somehow?

Claude Opus 4.5 commented

Now I have a complete picture. Let me summarize the security model for you:

## Security Model for Auto-Generated Routes

The security model for auto-generated routes (.well-known/workflow/v1/flow, .well-known/workflow/v1/step, and .well-known/workflow/v1/webhook/:token) differs based on which World implementation you're using:

### 1. Vercel World (`@workflow/world-vercel`) - Production

For the Vercel World, the security model is multi-layered:

#### a) Queue-based invocation (for `flow` and `step` routes)

The routes are not intended to be called directly by users. They're invoked by the Vercel Queue Service (VQS).

From packages/world-vercel/src/queue.ts:

const handleCallbackClient = new Client({
  ...baseClientOptions,
});
// ...
return handleCallbackClient.handleCallback({...})

- The @vercel/queue client's handleCallback method processes incoming messages only from VQS.
- Handlers receive a message ID that must be valid in Vercel's queue backend, so you can't forge arbitrary payloads.

OIDC authentication: From packages/world-vercel/src/utils.ts:

const token = config?.token ?? (await getVercelOidcToken());
if (token) {
  headers.set('Authorization', `Bearer ${token}`);
}

The documentation in docs/content/docs/how-it-works/framework-integrations.mdx states:

Vercel (@workflow/world-vercel):

- Vercel Queue will support private invoke, making routes inaccessible from the public internet
- Handlers receive only a message ID that must be retrieved from Vercel's backend
- Impossible to craft custom payloads without valid queue-issued message IDs
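To make the "message ID only" idea concrete, here is a minimal sketch. It is not the @vercel/queue API: `FakeQueueBackend` and `handleCallback` are hypothetical names for an in-memory stand-in, and the status codes are assumptions. The point is only that the handler trusts what the backend returns for an ID and never a payload supplied by the caller.

```typescript
import { randomUUID } from "node:crypto";

type StoredMessage = { queueName: string; payload: unknown };

// Hypothetical in-memory stand-in for a queue backend (assumption, not VQS).
class FakeQueueBackend {
  private messages = new Map<string, StoredMessage>();

  // The backend, not the caller, issues the message ID.
  enqueue(queueName: string, payload: unknown): string {
    const id = randomUUID();
    this.messages.set(id, { queueName, payload });
    return id;
  }

  // Returns undefined for IDs the backend never issued.
  lookup(id: string): StoredMessage | undefined {
    return this.messages.get(id);
  }
}

// The handler receives only an ID and loads the payload from the backend.
function handleCallback(
  backend: FakeQueueBackend,
  messageId: string
): { status: number; payload?: unknown } {
  const message = backend.lookup(messageId);
  if (!message) return { status: 404 }; // unknown or forged ID
  return { status: 200, payload: message.payload };
}

const backend = new FakeQueueBackend();
const issuedId = backend.enqueue("__wkf_workflow_example", { runId: "run_1" });
const accepted = handleCallback(backend, issuedId);
const rejected = handleCallback(backend, "forged-id");
```

Because the payload never travels in the inbound request, a caller who can reach the route still cannot choose what the handler executes.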

#### b) Webhook routes (`/.well-known/workflow/v1/webhook/:token`)

Token-based security: Webhooks use randomly-generated tokens (nanoid) unless you specify a custom one

From packages/core/src/workflow/hook.ts:

// Generate hook ID and token
const token = options.token ?? ctx.generateNanoid();

The token acts as a capability URL - if you know the token, you can resume the workflow

Invalid tokens return 404 (as shown in e2e test):

test('webhook route with invalid token', { timeout: 60_000 }, async () => {
  const res = await fetch(invalidWebhookUrl, ...);
  expect(res.status).toBe(404);
});
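The capability-URL pattern can be sketched as follows. This is an illustration under assumptions, not the library's implementation: `createHook`, `resolveWebhook`, and the in-memory `hooksByToken` map are hypothetical, the token is generated with Node's `randomBytes` rather than nanoid, and the success status is arbitrary.

```typescript
import { randomBytes } from "node:crypto";

// Hypothetical in-memory hook registry (assumption for illustration).
const hooksByToken = new Map<string, { runId: string }>();

// Use the caller's token if given, otherwise generate an unguessable one.
function createHook(runId: string, token?: string): string {
  const resolved = token ?? randomBytes(24).toString("base64url");
  hooksByToken.set(resolved, { runId });
  return resolved;
}

// Unknown tokens get a 404, mirroring the e2e test above.
function resolveWebhook(token: string): { status: number; runId?: string } {
  const hook = hooksByToken.get(token);
  return hook ? { status: 200, runId: hook.runId } : { status: 404 };
}

const generatedToken = createHook("run_1");
const validResult = resolveWebhook(generatedToken);
const invalidResult = resolveWebhook("not-a-real-token");
```

Since the token is the only credential, a custom token passed via `options.token` should be as hard to guess as a generated one.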

### 2. Local World (`@workflow/world-local`) - Development Only

The local world has no authentication - it's designed only for local development:

From packages/world-local/src/queue.ts:

const response = await fetch(
  `${baseUrl}/.well-known/workflow/v1/${pathname}`,
  {
    method: 'POST',
    headers: {
      'content-type': 'application/json',
      'x-vqs-queue-name': queueName,
      'x-vqs-message-id': messageId,
      'x-vqs-message-attempt': String(attempt + 1),
    },
    body,
  }
);

The local world documentation confirms:

No authentication - Suitable only for local development

### 3. Custom Worlds

For custom world implementations, the docs recommend:

From docs/content/docs/how-it-works/framework-integrations.mdx:

Custom implementations:

- Implement authentication via framework middleware
- Use API keys, JWT validation, or other auth schemes
- Network-level security (VPCs, private networks, firewall rules)
- Rate limiting and request validation
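As one possible shape for the middleware approach, here is a sketch of a shared-secret guard built on the standard `Request`/`Response` types. Everything in it is an assumption for illustration: `guardWorkflowRoutes` is a hypothetical helper, the bearer-secret scheme is just one of the options listed above, and the webhook route is left unguarded because it relies on its token.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Routes to protect; the webhook route is excluded since it uses capability tokens.
const PROTECTED_PREFIXES = [
  "/.well-known/workflow/v1/flow",
  "/.well-known/workflow/v1/step",
];

// Hash first so both buffers have equal length for timingSafeEqual.
function sha256(value: string): Buffer {
  return createHash("sha256").update(value).digest();
}

// Returns a 401 Response to reject, or null to let the request continue.
function guardWorkflowRoutes(request: Request, expectedSecret: string): Response | null {
  const { pathname } = new URL(request.url);
  if (!PROTECTED_PREFIXES.some((prefix) => pathname.startsWith(prefix))) {
    return null; // not a workflow route
  }
  const header = request.headers.get("authorization") ?? "";
  const presented = header.startsWith("Bearer ") ? header.slice("Bearer ".length) : "";
  const ok = timingSafeEqual(sha256(presented), sha256(expectedSecret));
  return ok ? null : new Response("Unauthorized", { status: 401 });
}

const stepUrl = "https://example.com/.well-known/workflow/v1/step";
const anonymous = guardWorkflowRoutes(new Request(stepUrl, { method: "POST" }), "s3cret");
const authorized = guardWorkflowRoutes(
  new Request(stepUrl, { method: "POST", headers: { authorization: "Bearer s3cret" } }),
  "s3cret"
);
const unrelated = guardWorkflowRoutes(new Request("https://example.com/api/health"), "s3cret");
```

The queue side of a custom World would then need to attach the same secret when it calls these routes.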

### Health Checks

Health checks are intentionally unauthenticated for monitoring purposes. From packages/core/src/runtime.ts:

// NOTE: Health check messages are intentionally unauthenticated for monitoring purposes.
// They only write a simple status response to a stream and do not expose sensitive data.
// The stream name includes a unique correlationId that must be known by the caller.

Summary:

- Flow/Step routes: Protected by Vercel Queue Service in production. Messages must come from VQS with valid message IDs, so a direct request should fail because it can't supply a valid queue-issued message. Note that the quoted docs describe private invoke as something Vercel Queue "will support", so the routes may still be reachable from the public internet.
- Webhook routes: Use capability URLs (secret tokens). Knowing the token = authorization to resume the workflow.
- Local world: No auth, development only.

Julian Benegas commented

gotchu. in the vercel world, how do they avoid deployment security issues?

Claude Opus 4.5 commented

Now I have the full picture. Let me summarize the deployment security model:

## Deployment Security in Vercel World

The Vercel World addresses several deployment-level security concerns through deployment pinning and infrastructure-level isolation:

### 1. Deployment Pinning - Workflows Stay on Their Original Deployment

When a workflow starts, it's pinned to the deployment that started it. This is critical for both security and reliability.

From packages/core/src/runtime/start.ts:

const world = opts?.world ?? getWorld();
const deploymentId = opts.deploymentId ?? (await world.getDeploymentId());
// ...
await world.queue(
  `__wkf_workflow_${workflowName}`,
  { runId, traceCarrier } satisfies WorkflowInvokePayload,
  { deploymentId }  // Messages are routed to this specific deployment
);

The deploymentId is stored with the workflow run and all subsequent messages (steps, sleeps, hook resumptions) are routed back to that same deployment.

From packages/world-vercel/src/queue.ts:

const sendMessageClient = new Client({
  ...baseClientOptions,
  deploymentId,  // VQS Client routes to this specific deployment
});
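The routing behavior described here can be modeled with a small in-memory sketch. `DeploymentRouter` is a hypothetical stand-in for what VQS does on the infrastructure side, and the deployment IDs are made up.

```typescript
type Message = { runId: string };
type Handler = (message: Message) => void;

// Hypothetical model of deployment-aware delivery (assumption, not VQS).
class DeploymentRouter {
  private handlers = new Map<string, Handler>();

  register(deploymentId: string, handler: Handler): void {
    this.handlers.set(deploymentId, handler);
  }

  // Delivers only to the deployment named on the message.
  send(deploymentId: string, message: Message): boolean {
    const handler = this.handlers.get(deploymentId);
    if (!handler) return false;
    handler(message);
    return true;
  }
}

const router = new DeploymentRouter();
const seenByOld: Message[] = [];
const seenByNew: Message[] = [];
router.register("dpl_old", (m) => { seenByOld.push(m); });
router.register("dpl_new", (m) => { seenByNew.push(m); });

// A run started on dpl_old keeps being delivered there after dpl_new ships.
const delivered = router.send("dpl_old", { runId: "run_1" });
const undelivered = router.send("dpl_unknown", { runId: "run_2" });
```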

### 2. Message Preservation Across Re-queuing

When a workflow needs to sleep or re-queue (e.g., for long waits), the deployment ID is preserved in the message itself:

From packages/world-vercel/src/queue.ts:

const MessageWrapper = z.object({
  payload: QueuePayloadSchema,
  queueName: ValidQueueName,
  /**
   * The deployment ID to use when re-enqueueing the message.
   * This ensures the message is processed by the same deployment.
   */
  deploymentId: z.string().optional(),
});

And when re-queueing after a sleep (packages/world-vercel/src/queue.ts):

const { payload, queueName, deploymentId } = MessageWrapper.parse(body);
// ...
if (typeof result?.timeoutSeconds === 'number') {
  await queue(queueName, payload, {
    deploymentId,  // Preserved from original message
    delaySeconds,
  });
}
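A self-contained sketch of that re-queue path, under stated assumptions: validation is done by hand instead of with zod, `enqueue` pushes to an in-memory array, and `delaySeconds` is taken directly from `timeoutSeconds` (the excerpt above does not show how it is derived).

```typescript
type Wrapper = { payload: unknown; queueName: string; deploymentId?: string };
type QueueOptions = { deploymentId?: string; delaySeconds?: number };
type StepResult = { timeoutSeconds?: number } | undefined;

// Hypothetical in-memory outbox standing in for the real queue.
const enqueued: Array<{ queueName: string; payload: unknown; opts: QueueOptions }> = [];

function enqueue(queueName: string, payload: unknown, opts: QueueOptions): void {
  enqueued.push({ queueName, payload, opts });
}

// Hand-rolled equivalent of MessageWrapper.parse for this sketch.
function parseWrapper(body: unknown): Wrapper {
  if (typeof body !== "object" || body === null) throw new Error("invalid message");
  const raw = body as Record<string, unknown>;
  const queueName = raw.queueName;
  if (typeof queueName !== "string") throw new Error("invalid queueName");
  const deploymentId = typeof raw.deploymentId === "string" ? raw.deploymentId : undefined;
  return { payload: raw.payload, queueName, deploymentId };
}

function handleMessage(body: unknown, run: (payload: unknown) => StepResult): void {
  const { payload, queueName, deploymentId } = parseWrapper(body);
  const delaySeconds = run(payload)?.timeoutSeconds;
  if (typeof delaySeconds === "number") {
    // Re-enqueue with the deployment ID carried by the original message.
    enqueue(queueName, payload, { deploymentId, delaySeconds });
  }
}

handleMessage(
  { payload: { runId: "run_1" }, queueName: "__wkf_workflow_example", deploymentId: "dpl_old" },
  () => ({ timeoutSeconds: 30 })
);
```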

### 3. Hook/Webhook Resumption Uses Original Deployment

When a webhook or hook resumes a workflow, it looks up the original workflow run's deployment ID and routes the message there:

From packages/core/src/runtime/resume-hook.ts:

const workflowRun = await world.runs.get(hook.runId);
// ...
// Re-trigger the workflow against the deployment ID associated
// with the workflow run that the hook belongs to
await world.queue(
  `__wkf_workflow_${workflowRun.workflowName}`,
  { runId: hook.runId, traceCarrier: ... },
  { deploymentId: workflowRun.deploymentId }  // Uses original deployment!
);
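The lookup can be illustrated with in-memory tables. This is a sketch only: the `runs`, `hooks`, and `sent` stores and the simplified `resumeHook` signature are assumptions, not the code in resume-hook.ts. What it shows is that the target deployment comes from the stored run, not from whichever deployment received the webhook request.

```typescript
type Run = { runId: string; workflowName: string; deploymentId: string };
type Hook = { token: string; runId: string };

// Hypothetical stores; the run was started on dpl_old.
const runs = new Map<string, Run>([
  ["run_1", { runId: "run_1", workflowName: "example", deploymentId: "dpl_old" }],
]);
const hooks = new Map<string, Hook>([["tok_abc", { token: "tok_abc", runId: "run_1" }]]);
const sent: Array<{ queueName: string; runId: string; deploymentId: string }> = [];

// receivingDeploymentId is deliberately unused for routing.
function resumeHook(token: string, receivingDeploymentId: string): boolean {
  const hook = hooks.get(token);
  if (!hook) return false;
  const run = runs.get(hook.runId);
  if (!run) return false;
  sent.push({
    queueName: `__wkf_workflow_${run.workflowName}`,
    runId: run.runId,
    deploymentId: run.deploymentId, // from the run record, not the receiver
  });
  return true;
}

// The webhook request lands on the newest deployment, dpl_new.
const resumed = resumeHook("tok_abc", "dpl_new");
const unknownToken = resumeHook("tok_missing", "dpl_new");
```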

### 4. Vercel Queue Service (VQS) Handles the Routing

The actual routing magic happens in Vercel's Queue Service infrastructure. When you pass deploymentId to the @vercel/queue Client, VQS ensures the message is delivered to that specific deployment's .well-known/workflow/v1/* endpoints.

This means:

- New deployments can't hijack existing workflow runs: messages are routed by deployment ID.
- Old code continues running old workflows: no breaking changes mid-execution.
- Security isolation: a malicious new deployment can't intercept messages meant for another deployment.

### 5. The Documentation Says It Best

From docs/content/docs/deploying/world/vercel-world.mdx:

## Versioning

On Vercel, workflow runs are pegged to the deployment that started them. This means:

- Existing workflow runs continue executing on their original deployment, even as new code is deployed
- New workflow runs start on the latest deployment
- Code changes won't break in-flight workflows

This ensures long-running workflows complete reliably without being affected by subsequent deployments.

### Summary

| Security Concern | How It's Handled |
| --- | --- |
| Cross-deployment message hijacking | Messages include `deploymentId` - VQS routes to correct deployment |
| Code changes breaking in-flight workflows | Workflows continue on original deployment |
| Webhook tokens being reused across deployments | Hook lookup returns `workflowRun.deploymentId`, ensuring resumption goes to original deployment |
| Unauthorized queue access | VQS validates messages came from Vercel infrastructure, not arbitrary HTTP requests |

The key insight is that Vercel Queue Service is the gatekeeper - your routes don't need to implement authentication themselves because VQS only delivers valid messages to them, and those messages are tied to specific deployments through the `deploymentId` they carry.
