From REST to Agentic: A Pragmatic Migration Guide
Moving from traditional REST APIs to agentic AI systems doesn't have to be a risky rewrite. Here's how we've successfully migrated production systems while keeping the lights on.
The Problem with Big-Bang Migrations
Most teams think they need to:
- Shut down the old system
- Build the agent from scratch
- Hope everything works
- Deal with the fallout
This is backwards. Here's what actually works.
The Strangler Fig Pattern for AI
Named after a tree that gradually replaces its host, this pattern lets you:
- Run old and new systems side-by-side
- Route traffic incrementally
- Roll back at any point
- Measure before/after metrics
Step 1: Shadow Mode (Week 1-2)
Start by having your agent run in parallel without affecting production.
// Original REST call (still in production)
const result = await fetch("/api/bookings", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify(booking),
});
// Shadow agent call (logging only; it must not trigger side effects).
// Not awaited, so a slow agent cannot delay the production response.
void agent.processBooking(booking).catch((err) => {
  logger.warn("Agent shadow mode failed", { err, booking });
  // Don't throw - this is informational only
});
What you're testing:
- Agent response time vs REST
- Accuracy of agent decisions
- Error patterns and edge cases
- Cost per request
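One way to turn those shadow-mode logs into numbers is to summarize them per batch. The sketch below is a minimal illustration, not the original implementation: the `ShadowSample` shape, `percentile`, and `summarizeShadow` are hypothetical names, and it assumes you already record latency, success, and cost for each shadowed request.

```typescript
// Hypothetical shape for one shadow-mode observation (not from the original code).
interface ShadowSample {
  restMs: number; // REST latency in milliseconds
  agentMs: number; // agent latency in milliseconds
  agentOk: boolean; // whether the agent call resolved without error
  costUSD: number; // agent cost for this request
}

// Nearest-rank percentile: p in (0, 1]; input does not need to be sorted.
function percentile(values: number[], p: number): number {
  if (values.length === 0) return NaN;
  const sorted = values.slice().sort((a, b) => a - b);
  const rank = Math.ceil(p * sorted.length);
  return sorted[Math.max(0, rank - 1)];
}

// Summarize a batch of samples into the four things shadow mode is testing.
function summarizeShadow(samples: ShadowSample[]) {
  const restP95 = percentile(samples.map((s) => s.restMs), 0.95);
  const agentP95 = percentile(samples.map((s) => s.agentMs), 0.95);
  const failures = samples.filter((s) => !s.agentOk).length;
  const totalCost = samples.reduce((sum, s) => sum + s.costUSD, 0);
  return {
    restP95,
    agentP95,
    slowdown: agentP95 / restP95, // e.g. 3 means the agent is 3x slower at p95
    agentErrorRate: samples.length ? failures / samples.length : 0,
    avgCostUSD: samples.length ? totalCost / samples.length : 0,
  };
}
```

Comparing p95 rather than the mean is a design choice here: a few very slow agent calls can hide behind a healthy average.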
Step 2: Validation Mode (Week 2-3)
Now compare outputs. The REST API is still authoritative, but you're building confidence.
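const [restResult, agentResult] = await Promise.all([
  makeRESTCall(booking),
  // Catch agent errors here so an agent failure cannot reject the REST result
  agent.processBooking(booking).catch((err) => ({ agentError: String(err) })),
]);
// Compare and log differences.
// Note: JSON.stringify is sensitive to key order, so normalize both shapes first.
if (JSON.stringify(restResult) !== JSON.stringify(agentResult)) {
  await metrics.recordDrift({
    rest: restResult,
    agent: agentResult,
    booking,
  });
}
// Still return REST result
return restResult;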
Red flags to watch:
- Drift on more than 5% of compared requests means the agent needs tuning
- Agent latency above 2x REST means performance issues
- Pattern failures (works for simple cases, fails for complex)
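To check the 5% drift threshold you need a drift rate, and a comparison that does not count key order as a difference. The following is a sketch under assumptions: `canonicalize` and `DriftTracker` are hypothetical helpers, and it treats any structural difference between the two results as drift, which may be stricter than you want for free-text fields.

```typescript
// Serialize with object keys sorted so key order alone never counts as drift.
function canonicalize(value: unknown): string {
  if (Array.isArray(value)) {
    return "[" + value.map(canonicalize).join(",") + "]";
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const entries = Object.keys(obj)
      .sort()
      .map((key) => JSON.stringify(key) + ":" + canonicalize(obj[key]));
    return "{" + entries.join(",") + "}";
  }
  // Primitives; String() covers the undefined case.
  return String(JSON.stringify(value));
}

// Hypothetical in-memory tracker; in production this would live in your metrics store.
class DriftTracker {
  private compared = 0;
  private drifted = 0;

  // Returns true when the two results differ after canonicalization.
  record(rest: unknown, agent: unknown): boolean {
    this.compared += 1;
    const isDrift = canonicalize(rest) !== canonicalize(agent);
    if (isDrift) this.drifted += 1;
    return isDrift;
  }

  rate(): number {
    return this.compared === 0 ? 0 : this.drifted / this.compared;
  }

  // Default threshold of 5% mirrors the red-flag list above.
  needsTuning(threshold = 0.05): boolean {
    return this.rate() > threshold;
  }
}
```

Grouping drift by input type (simple vs. complex bookings, for example) is a natural extension for spotting the pattern failures mentioned above.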
Step 3: Percentage Rollout (Week 3-4)
Start routing real traffic, but control the blast radius.
const useAgent =
  isFeatureFlagEnabled("agentic-booking") &&
  (await shouldUseAgent(user, booking));
if (useAgent) {
  try {
    return await agent.processBooking(booking);
  } catch (err) {
    logger.error("Agent failed, falling back to REST", { err });
    metrics.recordFallback();
    return await makeRESTCall(booking); // Automatic fallback
  }
}
return await makeRESTCall(booking);
Rollout strategy:
- 1% for 48 hours
- 10% if no issues
- 50% with monitoring
- 100% only after a week of stability
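The percentages only mean something if each user lands on the same side of the split every time. One common approach, sketched here as an assumption about what a helper like `shouldUseAgent` might do internally, is to hash a stable user ID into one of 100 buckets. `hashToBucket`, `inRollout`, and `ROLLOUT_STAGES` are hypothetical names.

```typescript
// Rollout stages from the list above, as percentages of users.
const ROLLOUT_STAGES = [1, 10, 50, 100];

// FNV-1a style 32-bit hash; any stable hash would work here.
function hashToBucket(key: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < key.length; i++) {
    hash ^= key.charCodeAt(i);
    // Multiply by the FNV prime 16777619 via shifts, keeping an unsigned 32-bit result.
    hash =
      (hash + (hash << 1) + (hash << 4) + (hash << 7) + (hash << 8) + (hash << 24)) >>> 0;
  }
  return hash % 100; // bucket in [0, 99]
}

// A user is in the rollout when their bucket falls below the current percentage.
function inRollout(userId: string, percent: number): boolean {
  return hashToBucket(userId) < percent;
}
```

Because the bucket is fixed per user, anyone included at 1% stays included at 10% and 50%, so users do not flip between the agent and REST as the rollout widens.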
Common Pitfalls
Pitfall 1: No Escape Hatch
Always have a fallback. We use a feature flag that can be toggled instantly:
// config/features.ts
export const FEATURES = {
  agenticBooking: {
    enabled: env("AGENTIC_BOOKING_ENABLED", "false") === "true",
    fallbackOnError: true,
    timeout: 5000, // milliseconds
  },
};
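The config above declares a `timeout` and `fallbackOnError`, so something has to enforce them. Here is a minimal sketch of one way to do that. `withTimeout` and `callWithFallback` are hypothetical helpers, not part of the original codebase, and `primary` and `fallback` stand in for the agent call and the REST call.

```typescript
// Reject if `promise` does not settle within `ms` milliseconds.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error("Timed out after " + ms + "ms")), ms);
    promise.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (err) => {
        clearTimeout(timer);
        reject(err);
      },
    );
  });
}

// Try `primary` (the agent) under a timeout; use `fallback` (REST) when the
// feature is disabled, or when the agent fails and fallbackOnError is set.
async function callWithFallback<T>(
  primary: () => Promise<T>,
  fallback: () => Promise<T>,
  opts: { enabled: boolean; fallbackOnError: boolean; timeout: number },
): Promise<T> {
  if (!opts.enabled) return fallback();
  try {
    return await withTimeout(primary(), opts.timeout);
  } catch (err) {
    if (!opts.fallbackOnError) throw err;
    return fallback();
  }
}
```

One caveat: the timeout stops waiting but does not cancel the agent call. If the agent can create a booking after the timeout fires, the fallback may create a second one, so the write path should be idempotent or the agent call should be cancellable.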
Pitfall 2: Ignoring Cost
Agents are more expensive than REST calls. Track this from day one:
metrics.recordAgentCost({
  inputTokens: response.usage.input,
  outputTokens: response.usage.output,
  model: "gpt-4o",
  costUSD: calculateCost(response.usage),
});
Rule of thumb: If agent costs > 10x REST and provides < 2x value, reconsider.
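For illustration, here is roughly what a `calculateCost` helper and the rule of thumb could look like in code. The prices are placeholders, not real provider rates, and `shouldReconsider` and `valueMultiplier` are hypothetical: how you quantify "value" (completion-rate lift, for instance) is your call.

```typescript
interface TokenUsage {
  input: number; // input tokens
  output: number; // output tokens
}

// PLACEHOLDER prices in USD per million tokens. These are not real rates;
// substitute your provider's current pricing.
const PRICE_PER_MILLION = { input: 2.0, output: 8.0 };

function calculateCost(usage: TokenUsage, prices = PRICE_PER_MILLION): number {
  return (usage.input * prices.input + usage.output * prices.output) / 1e6;
}

// Rule of thumb from above: more than 10x the REST cost for less than 2x the value.
function shouldReconsider(
  agentCostUSD: number,
  restCostUSD: number,
  valueMultiplier: number,
): boolean {
  return agentCostUSD > 10 * restCostUSD && valueMultiplier < 2;
}
```

With the placeholder prices, a request using 1,000,000 input tokens and 500,000 output tokens would come to (1,000,000 × 2.0 + 500,000 × 8.0) / 1,000,000 = 6 USD.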
Pitfall 3: Treating Agents Like APIs
Agents aren't deterministic. Don't write tests expecting exact matches:
// ❌ Bad: Brittle test
expect(agent.response).toEqual("Booking confirmed for 2pm");
// ✅ Good: Test for intent
expect(agent.response).toMatch(/confirm|booked|scheduled/i);
// objectContaining tolerates extra fields on the action
expect(agent.actions).toContainEqual(
  expect.objectContaining({
    type: "create_booking",
    time: "14:00",
  }),
);
Real Example: docsquad.my Migration
We migrated their booking system from a simple form POST to an agentic assistant. Here's the timeline:
Week 1: Shadow mode revealed the agent was 3x slower than REST. We added caching and parallel processing.
Week 2: Validation mode showed 8% drift in address parsing. We refined the prompts using real examples.
Week 3: 5% rollout caught an edge case with recurring bookings. Added explicit handling.
Week 4: 100% agent traffic. Response time improved to 1.2x REST (acceptable for better UX).
Result: 3x increase in booking completion rate because the agent handled ambiguous inputs gracefully.
Rollback Strategy
Always have a plan to go backwards:
// Emergency rollback process (error rate as a fraction, latency in ms)
if (metrics.errorRate > 0.05 || metrics.avgLatency > 10000) {
  await featureFlags.disable("agentic-booking");
  await alerts.pagerDuty("Agent degraded, reverted to REST");
}
Keep your REST endpoints running for at least 3 months post-migration. We've needed them.
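The rollback check needs an error rate and an average latency from somewhere. One option is a rolling window over recent agent requests, sketched below. `HealthWindow` and `RequestOutcome` are hypothetical names, and the `minSamples` guard is an added assumption so that a single early failure does not trigger a rollback.

```typescript
interface RequestOutcome {
  ok: boolean;
  latencyMs: number;
}

// Keeps the most recent `size` outcomes and derives rollback signals from them.
class HealthWindow {
  private outcomes: RequestOutcome[] = [];
  private size: number;

  constructor(size: number) {
    this.size = size;
  }

  record(outcome: RequestOutcome): void {
    this.outcomes.push(outcome);
    if (this.outcomes.length > this.size) this.outcomes.shift();
  }

  errorRate(): number {
    if (this.outcomes.length === 0) return 0;
    const failures = this.outcomes.filter((o) => !o.ok).length;
    return failures / this.outcomes.length;
  }

  avgLatencyMs(): number {
    if (this.outcomes.length === 0) return 0;
    const total = this.outcomes.reduce((sum, o) => sum + o.latencyMs, 0);
    return total / this.outcomes.length;
  }

  // Thresholds mirror the rollback check above: 5% errors or 10 s average latency.
  shouldRollBack(maxErrorRate = 0.05, maxAvgLatencyMs = 10000, minSamples = 20): boolean {
    if (this.outcomes.length < minSamples) return false;
    return this.errorRate() > maxErrorRate || this.avgLatencyMs() > maxAvgLatencyMs;
  }
}
```

Calling `record` after every agent request and checking `shouldRollBack()` on a timer would then drive the feature-flag toggle.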
Measuring Success
Don't just track errors. Track outcomes:
- Completion rate: Did users finish the flow?
- Support tickets: Fewer questions?
- Time to complete: Faster?
- Edge case handling: Better?
If agents don't improve these, they're just expensive middleware.
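As a rough sketch, the first three outcome metrics can be computed from per-cohort counters and compared between the REST and agent cohorts. The `FlowStats` shape and the decision rule in `agentImprovesOutcomes` are assumptions for illustration. Edge-case handling is harder to reduce to a counter and is left out here.

```typescript
// Hypothetical per-cohort counters collected over the same time period.
interface FlowStats {
  started: number; // users who began the flow
  completed: number; // users who finished it
  supportTickets: number; // tickets attributed to this flow
  totalSecondsToComplete: number; // summed over completed flows
}

function outcomeMetrics(stats: FlowStats) {
  return {
    completionRate: stats.started ? stats.completed / stats.started : 0,
    ticketsPer1000: stats.started ? (stats.supportTickets * 1000) / stats.started : 0,
    avgSecondsToComplete: stats.completed
      ? stats.totalSecondsToComplete / stats.completed
      : 0,
  };
}

// Assumed rule: completion must improve, and tickets and time must not get worse.
function agentImprovesOutcomes(rest: FlowStats, agent: FlowStats): boolean {
  const r = outcomeMetrics(rest);
  const a = outcomeMetrics(agent);
  return (
    a.completionRate > r.completionRate &&
    a.ticketsPer1000 <= r.ticketsPer1000 &&
    a.avgSecondsToComplete <= r.avgSecondsToComplete
  );
}
```

During a percentage rollout the two cohorts run side by side, which makes this comparison fairer than a before/after snapshot.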
Next Steps
- Pick one low-risk flow (not checkout, not payments)
- Implement shadow mode this week
- Measure for 7 days
- Decide based on data, not hype
Need help? Book a discovery call and we'll audit your migration plan.