We needed to migrate off Railway. A cloud storage platform with AI-powered search—Next.js frontend, Express backend, Inngest worker, PostgreSQL with pgvector, Redis—all running on a single provider. Time to split things up and optimize the architecture.
The first instinct was Vercel. We already use it for client projects, the DX is excellent, and deployment is seamless. But as we dug into the architecture, reality set in: this wasn't a simple "deploy to Vercel" situation.
Why Vercel wasn't enough
The backend uses Server-Sent Events for streaming chat responses. Every API request hits Redis four times for rate limiting. The entire codebase maintains persistent ioredis connections. None of this plays well with serverless cold starts and execution limits.
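Concretely, the hot path looks something like this; a minimal sketch, not our actual handler, with the rate-limit numbers and token source made up:

import express from "express";
import Redis from "ioredis";

const app = express();

// One persistent ioredis connection, reused across requests. On serverless,
// this gets re-established on every cold start.
const redis = new Redis(process.env.REDIS_URL ?? "redis://localhost:6379");

app.get("/api/chat", async (req, res) => {
  // Rate limiting: a Redis round-trip before any work happens.
  const key = `rl:${req.ip}`;
  const hits = await redis.incr(key);
  if (hits === 1) await redis.expire(key, 60);
  if (hits > 100) {
    res.status(429).end();
    return;
  }

  // SSE: the response stays open while tokens stream, easily outliving
  // serverless execution limits.
  res.setHeader("Content-Type", "text/event-stream");
  res.setHeader("Cache-Control", "no-cache");
  res.flushHeaders();
  for (const token of ["streamed", " chat", " response"]) {
    res.write(`data: ${token}\n\n`);
  }
  res.end();
});

app.listen(3001);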
┌─────────────────────────────────────────┐
│ Railway (before) │
├─────────────────────────────────────────┤
│ • Next.js web app │
│ • Express backend (SSE streaming) │
│ • Express worker (Inngest functions) │
│ • Self-hosted Inngest server │
│ • PostgreSQL + pgvector │
│ • Redis (persistent connections) │
└─────────────────────────────────────────┘
We needed a split architecture. Vercel for what it does best (Next.js at the edge), dedicated infrastructure for everything else.
The new architecture
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ Vercel │────▶│ Fly.io │────▶│ Inngest Cloud│
│ (Next.js) │ │ (Backend + │ │ (Events) │
│ │ │ Worker) │ │ │
└──────────────┘ └──────────────┘ └──────────────┘
│ │ │
│ │ │
▼ ▼ ▼
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ Neon │ │ Fly Redis │ │ S3 │
│ (PostgreSQL) │ │ (Upstash) │ │ (Files) │
└──────────────┘ └──────────────┘ └──────────────┘
Vercel handles the web app. Fast builds with pnpm --filter web... build to pull in workspace dependencies.
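The trailing dots matter: they're pnpm filter syntax, not an ellipsis.

pnpm --filter web build      # builds apps/web only
pnpm --filter web... build   # builds apps/web plus its workspace dependencies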
Fly.io runs the backend API and worker as separate apps. Dockerfiles already existed from Railway, but we needed fly.toml configs for each service. The monorepo structure complicated this—Fly expects the Dockerfile at the project root, but ours lived in apps/backend/Dockerfile and apps/worker/Dockerfile.
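For reference, roughly what a per-service config looks like; a sketch rather than our exact file, with app name, region, and port illustrative, and the Dockerfile path assuming the deploy runs from the repo root (the fix below):

# apps/backend/fly.toml
app = "disposal-backend"
primary_region = "arn"

[build]
  dockerfile = "apps/backend/Dockerfile"

[http_service]
  internal_port = 3001
  force_https = true
  auto_stop_machines = true
  auto_start_machines = true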
The fix: deploy from root with build context flags:
flyctl deploy --config apps/backend/fly.toml --build-context .

Inngest Cloud replaced our self-hosted Inngest server. The SDK auto-discovers via INNGEST_EVENT_KEY, so we removed all the Railway-specific baseUrl logic. The old code normalized .up.railway.app vs .railway.internal URLs; that logic is now unnecessary.
Neon hosts PostgreSQL with pgvector. Fly.io's managed Redis (Upstash running on Fly's internal network) replaced our previous Upstash setup, so Redis can sit in the same region as the backend for minimal latency.
The Inngest migration
The worker's Inngest client still had Railway URL normalization:
// Before: Railway-specific URL handling
import { Inngest } from "inngest";

export const inngest = new Inngest({
  id: "disposal-space",
  // normalizeServiceUrl: local Railway-era helper
  baseUrl: normalizeServiceUrl(process.env.INNGEST_URL, 8288),
});

With Inngest Cloud, the SDK handles everything if you omit baseUrl:
// After: SDK auto-discovery
import { Inngest } from "inngest";

export const inngest = new Inngest({
  id: "disposal-space",
});

Same change in the web app's client. Update INNGEST_EVENT_KEY and INNGEST_SIGNING_KEY on Vercel, redeploy, done.
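From the CLI, that's roughly (values entered at the prompt; redeploying from the dashboard works too):

vercel env add INNGEST_EVENT_KEY production
vercel env add INNGEST_SIGNING_KEY production
vercel --prod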
The audit
Migration complete, services running. Then we audited user-facing content. Terms of Service still said "AI-powered search uses self-hosted Inngest on Railway." Privacy Policy listed Railway as a service provider for "background job processing and database hosting."
Two marketing pages had "Powered by Railway" cards with descriptions about "edge hosting and auto-scaling"—Railway's marketing language, not Fly.io's. We updated them to reflect Fly.io's actual value: micro VMs, global regions, auto-suspend.
- Railway
- Lightning-fast edge hosting with automatic scaling
+ Fly.io
+ High-performance micro VMs close to users with instant auto-suspend

Code comments throughout referenced Railway's proxy behavior, .railway.internal domains, port 8288. We commented out the old functions with deprecation notices and added migration context.
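The notices looked roughly like this (function body elided; the wording is illustrative):

/**
 * @deprecated Railway-era URL normalization. Railway exposed services as
 * <name>.up.railway.app (public) and <name>.railway.internal (private),
 * and the self-hosted Inngest server listened on port 8288. Inngest
 * Cloud's SDK discovery makes all of this unnecessary.
 */
// export function normalizeServiceUrl(url?: string, port?: number): string {
//   ...
// }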
Performance tuning
After launch, search was slow. The backend in Stockholm, Redis in EU West, Postgres in Frankfurt. Cross-region latency killed response times.
We moved Redis to Fly.io's internal network in the same Stockholm region:
flyctl redis create --name disposal-redis --region arn

Latency dropped. What was 800ms became 150ms. Redis now communicates over Fly's private IPv6 network at fdaa:....
The embedding pipeline gap
Uploads worked. But PDF embeddings weren't generating. We traced the flow: upload → triggerEmbedding() → POST /api/embeddings/trigger → inngest.send() → Inngest Cloud → worker.
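The backend end of that chain is small; a sketch, with the event name and payload as assumptions rather than our exact schema:

import { inngest } from "./inngest";

export async function triggerEmbedding(fileId: string) {
  // Hands off to Inngest Cloud; the worker's function consumes the event.
  await inngest.send({
    name: "file/embedding.requested",
    data: { fileId },
  });
}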
The backend needed OPENAI_API_KEY to generate query embeddings for semantic search. We'd set it on the worker but forgot the backend. One flyctl secrets set later, AI search worked.
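That one-liner (app name assumed):

flyctl secrets set OPENAI_API_KEY=sk-... --app disposal-backend

Fly restarts the app when secrets change, so the fix was live immediately.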
Then we noticed: Google Drive imports bypassed embeddings entirely. The import route created database records but never called triggerEmbedding(). Regular uploads did. Inconsistent data flow.
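In sketch form (helper names are stand-ins, not our actual route code):

import { triggerEmbedding } from "./embeddings";

declare function createFileRecord(name: string): Promise<{ id: string }>;

// Regular upload route: record created, embedding enqueued.
async function handleUpload(name: string) {
  const record = await createFileRecord(name);
  await triggerEmbedding(record.id);
}

// Google Drive import route: record created, nothing enqueued.
// No error, no log line; search just never finds the file.
async function handleDriveImport(name: string) {
  await createFileRecord(name);
}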
Lessons
Serverless has limits. SSE streaming and persistent connections don't fit the execution model. Know your constraints before committing.
Colocation matters. Cross-region latency compounds with every service call. Keep hot paths in the same region, use internal networks where possible.
Monorepo deployments need build context. Fly's --build-context flag saved us from restructuring the entire repo.
Audit everything after migration. Terms of service, privacy policies, marketing copy, code comments—all referenced the old infrastructure. Users don't care about your hosting provider until it's wrong in legal docs.
Different code paths need different testing. Regular uploads triggered embeddings. Google Drive imports didn't. Two paths to the same outcome, different implementations, silent failure.
The platform runs faster now than on Railway. Costs are similar—Fly.io's auto-suspend means we pay for compute we use, Neon scales to zero, Inngest Cloud charges per function run. The architecture is more explicit, services have clear boundaries, and we control the blast radius of each component.
Migration complete.