Users started hitting a wall. Clicking "Upgrade to Premium" would fail silently. The Stripe checkout modal would show a generic error. Support tickets mentioned "something broke", but we couldn't reproduce it.
The error in Railway logs told the story:
Error: Failed to find Server Action "x". This request might be from an older or newer deployment.
This is deployment skew — when client-side JavaScript references server-side functions that no longer exist.
The Root Cause
Next.js 15 compiles each Server Action to an opaque action ID, and the client bundle uses that ID to call the function. The IDs are not stable across builds: Next.js 15 generates unguessable IDs and recalculates them between builds. When the server redeploys, new IDs are generated, while clients with cached JavaScript bundles still reference the old ones.
Vercel handles this with built-in skew protection — old deployments stay warm and requests are routed based on client deployment headers. Railway doesn't have this. Old deployments are gone. Clients with stale bundles hit a wall.
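To make the failure mode concrete, here is a toy model of the mismatch. It is a sketch only: `toyHash`, `toyActionId`, and `createToyServer` are hypothetical names, and the ID scheme is not how Next.js derives action IDs. It only mimics the one property that matters here, namely that IDs from one build are unknown to the next.

```typescript
// Toy model of deployment skew. NOT the real Next.js ID algorithm.

// Small FNV-1a string hash, used only to produce opaque-looking IDs.
function toyHash(input: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

// Hypothetical scheme: every action ID depends on a per-build salt.
function toyActionId(buildSalt: string, actionName: string): string {
  return toyHash(buildSalt) + toyHash(actionName);
}

// The "server" only knows the IDs produced by its own build.
function createToyServer(buildSalt: string, actionNames: string[]) {
  const registry = new Map<string, string>();
  for (const name of actionNames) {
    registry.set(toyActionId(buildSalt, name), name);
  }
  return {
    invoke(actionId: string): string {
      const name = registry.get(actionId);
      if (name === undefined) {
        throw new Error(`Failed to find Server Action "${actionId}".`);
      }
      return name;
    },
  };
}

const toyActions = ["createCheckoutSession", "cancelSubscription"];

// A browser tab loaded its bundle from build A and still holds that ID.
const staleClientId = toyActionId("build-A", "createCheckoutSession");

// After the redeploy, build A is gone and only build B answers requests.
const serverB = createToyServer("build-B", toyActions);

let skewMessage = "";
try {
  serverB.invoke(staleClientId);
} catch (error) {
  skewMessage = (error as Error).message;
}
console.log(skewMessage);

// A freshly loaded client uses build B's ID and succeeds.
const freshResult = serverB.invoke(toyActionId("build-B", "createCheckoutSession"));
console.log(freshResult);
```

With skew protection, the stale request would be routed to a server that still holds build A's registry. Without it, the lookup fails before any application code runs.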
Step 1: Add Telemetry
First step was understanding when this happens. Added logging to all server actions:
// lib/stripe/actions.ts
import { headers } from "next/headers";
import { currentUser } from "@clerk/nextjs/server";
// Logs every invocation so we can see which actions actually reach our code.
async function logServerAction(actionName: string) {
  const headersList = await headers();
  console.log(`[Server Action] ${actionName} invoked`, {
    timestamp: new Date().toISOString(),
    userAgent: headersList.get("user-agent")?.slice(0, 100),
    referer: headersList.get("referer"),
  });
}
export async function createCheckoutSession(
  currentPath?: string,
  billingPeriod: "monthly" | "yearly" = "monthly",
) {
  await logServerAction("createCheckoutSession");
  const clerkUser = await currentUser();
  if (!clerkUser) {
    throw new Error("User not authenticated");
  }
  // ... rest of implementation
}
Applied the same pattern to createBillingPortalSession and cancelSubscription. Railway logs now show:
[Server Action] createCheckoutSession invoked {
  timestamp: '2026-01-11T10:23:45.123Z',
  userAgent: 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)...',
  referer: 'https://example.com/dashboard'
}
When the deployment skew error occurred, the action was not logged at all: the failure happens at the routing layer, before our code executes. This confirmed the issue wasn't in our logic.
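One possible extension, sketched here rather than taken from our codebase, is to stamp each log entry with a deployment identifier so entries can be lined up against deploy times. The helper name `buildActionLogEntry`, the `HeaderReader` interface, and the use of a `RAILWAY_DEPLOYMENT_ID` environment variable are assumptions for illustration. Check which variables your platform actually provides.

```typescript
// Minimal header reader so the sketch does not depend on next/headers.
interface HeaderReader {
  get(name: string): string | null;
}

interface ActionLogEntry {
  action: string;
  timestamp: string;
  deploymentId: string;
  userAgent: string | null;
  referer: string | null;
}

// Builds the structured entry. Pure, so it can be unit tested.
// Assumption: the platform exposes RAILWAY_DEPLOYMENT_ID at runtime.
function buildActionLogEntry(
  actionName: string,
  requestHeaders: HeaderReader,
  env: Record<string, string | undefined>,
  now: Date = new Date(),
): ActionLogEntry {
  return {
    action: actionName,
    timestamp: now.toISOString(),
    deploymentId: env["RAILWAY_DEPLOYMENT_ID"] ?? "unknown",
    userAgent: requestHeaders.get("user-agent")?.slice(0, 100) ?? null,
    referer: requestHeaders.get("referer"),
  };
}

// Usage with a stand-in for the real request headers.
const sampleHeaders = new Map([
  ["user-agent", "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)"],
  ["referer", "https://example.com/dashboard"],
]);
const sampleEntry = buildActionLogEntry(
  "createCheckoutSession",
  { get: (name) => sampleHeaders.get(name) ?? null },
  { RAILWAY_DEPLOYMENT_ID: "deploy-123" },
  new Date("2026-01-11T10:23:45.123Z"),
);
console.log("[Server Action]", JSON.stringify(sampleEntry));
```

In a real server action, the second argument would be the result of `await headers()` and the third would be `process.env`.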
Step 2: Detect Stale Client Errors
The error message is predictable. We can catch it:
// lib/contexts/stale-client-context.tsx
export function isStaleClientError(error: unknown): boolean {
  if (error instanceof Error) {
    const message = error.message.toLowerCase();
    return (
      message.includes("failed to find server action") ||
      // Broad catch-all: also matches any other error mentioning "server action"
      message.includes("server action") ||
      message.includes("older or newer deployment")
    );
  }
  return false;
}
Multiple patterns because Next.js error messages vary between versions. The function checks all known variants.
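One trade-off worth noting: the bare "server action" substring also matches unrelated errors that merely mention server actions. A narrower variant could look like the sketch below. It assumes the two phrases from the logged error are enough for your Next.js version, which may not hold. `isLikelyStaleClientError` and `STALE_CLIENT_PATTERNS` are hypothetical names, not part of our codebase.

```typescript
// Phrases taken from the error quoted earlier in this post. Other Next.js
// versions may word it differently, so treat this list as an assumption.
const STALE_CLIENT_PATTERNS: readonly string[] = [
  "failed to find server action",
  "older or newer deployment",
];

function isLikelyStaleClientError(error: unknown): boolean {
  // Accept Error instances and plain strings; ignore everything else.
  const message =
    error instanceof Error ? error.message : typeof error === "string" ? error : "";
  const normalized = message.toLowerCase();
  return STALE_CLIENT_PATTERNS.some((pattern) => normalized.includes(pattern));
}

const staleSample = new Error(
  'Failed to find Server Action "x". This request might be from an older or newer deployment.',
);
const unrelatedSample = new Error("Server action rejected: user not authenticated");

console.log(isLikelyStaleClientError(staleSample));
console.log(isLikelyStaleClientError(unrelatedSample));
console.log(isLikelyStaleClientError(undefined));
```

The stale sample is detected, while the unrelated error and the non-error value are not. The cost is that a differently worded stale error would fall through to the generic error toast instead of the refresh banner.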
Step 3: Create a Context for Global State
When any component detects a stale client, the whole app should know. React Context handles this:
// lib/contexts/stale-client-context.tsx
"use client";
import { createContext, useContext, useState, useCallback, ReactNode } from "react";
interface StaleClientContextType {
  isStale: boolean;
  markAsStale: () => void;
  refresh: () => void;
}
const StaleClientContext = createContext<StaleClientContextType | undefined>(undefined);
export function StaleClientProvider({ children }: { children: ReactNode }) {
  const [isStale, setIsStale] = useState(false);
  const markAsStale = useCallback(() => {
    setIsStale(true);
  }, []);
  const refresh = useCallback(() => {
    window.location.reload(); // Full reload, not router.refresh()
  }, []);
  return (
    <StaleClientContext.Provider value={{ isStale, markAsStale, refresh }}>
      {children}
    </StaleClientContext.Provider>
  );
}
export function useStaleClient() {
  const context = useContext(StaleClientContext);
  if (context === undefined) {
    throw new Error("useStaleClient must be used within a StaleClientProvider");
  }
  return context;
}
The key detail: window.location.reload() instead of Next.js router.refresh(). We need a full page reload to fetch new JavaScript bundles, not just refresh the RSC tree.
Step 4: Build the User-Facing Banner
The banner provides a non-intrusive notification:
// blocks/shared/StaleClientBanner.tsx
"use client";
import { useStaleClient } from "@/lib/contexts/stale-client-context";
import { RefreshCw } from "lucide-react";
export function StaleClientBanner() {
  const { isStale, refresh } = useStaleClient();
  if (!isStale) return null;
  return (
    <div className="fixed inset-x-0 bottom-0 z-50 p-4 sm:bottom-4 sm:left-auto sm:right-4 sm:p-0">
      <div className="mx-auto max-w-md rounded-xl border border-(--warning)/30 bg-(--surface) p-4 shadow-lg sm:mx-0">
        <div className="flex items-start gap-3">
          <div className="flex h-10 w-10 shrink-0 items-center justify-center rounded-full bg-(--warning)/10">
            <RefreshCw className="h-5 w-5 text-(--warning)" />
          </div>
          <div className="flex-1">
            <h3 className="text-sm font-semibold text-(--heading)">
              Update Available
            </h3>
            <p className="mt-1 text-sm text-(--body)">
              A new version is available. Please refresh to continue.
            </p>
            <button
              onClick={refresh}
              className="mt-3 inline-flex items-center gap-2 rounded-full bg-(--primary) px-4 py-2 text-sm font-medium text-white transition-colors hover:bg-(--primary-hover)"
            >
              <RefreshCw className="h-4 w-4" />
              Refresh Now
            </button>
          </div>
        </div>
      </div>
    </div>
  );
}
UX decisions:
- Warning color (amber), not error red — it's not the user's fault
- "Update Available" — positive framing instead of "Something broke"
- No dismiss option — the user genuinely needs to refresh
- Bottom-right on desktop, full-width on mobile — consistent with toast notifications
Step 5: Wire It Into Components
Every component calling server actions needs the error handling pattern. The Membership Modal has three server actions:
// blocks/modals/MembershipModal.tsx
import {
  isStaleClientError,
  useStaleClient,
} from "@/lib/contexts/stale-client-context";
export function MembershipModal({ /* props */ }) {
  const { markAsStale } = useStaleClient();
  const handleUpgrade = async () => {
    setIsLoading(true);
    try {
      await createCheckoutSession(currentPath, isYearly ? "yearly" : "monthly");
    } catch (error) {
      console.error("Failed to create checkout session:", error);
      if (isStaleClientError(error)) {
        markAsStale();
        onClose();
      } else {
        customToast.error("Checkout Failed", "Failed to start checkout process");
      }
      setIsLoading(false);
    }
  };
  const handleBillingPortal = async () => {
    setIsLoading(true);
    try {
      await createBillingPortalSession();
    } catch (error) {
      console.error("Failed to open billing portal:", error);
      if (isStaleClientError(error)) {
        markAsStale();
        onClose();
      } else {
        customToast.error("Error", "Failed to open billing portal");
      }
      setIsLoading(false);
    }
  };
  const handleDowngrade = async () => {
    // ... confirmation logic ...
    try {
      await cancelSubscription();
      customToast.success("Subscription Canceled", "...");
      onClose();
    } catch (error) {
      console.error("Failed to cancel subscription:", error);
      if (isStaleClientError(error)) {
        markAsStale();
        onClose();
      } else {
        customToast.error("Cancellation Failed", "Failed to cancel subscription");
      }
    } finally {
      setIsLoading(false);
    }
  };
  // ... rest of component
}
Same pattern applied to EmbedModal.tsx, which also calls createCheckoutSession.
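Since the three handlers share the same try/catch shape, the pattern could be factored into a small helper. The following is a sketch, not what we shipped: `runStaleAware` and `StaleAwareOptions` are hypothetical names, and the callbacks stand in for `isStaleClientError`, `markAsStale`, `onClose`, `customToast.error`, and `setIsLoading`.

```typescript
// Hypothetical helper: names and option shape are illustrative only.
interface StaleAwareOptions {
  isStale: (error: unknown) => boolean; // e.g. isStaleClientError
  onStale: () => void; // e.g. markAsStale() followed by onClose()
  onError: (error: unknown) => void; // e.g. customToast.error(...)
  onSettled?: () => void; // e.g. setIsLoading(false)
}

// Runs an action and routes failures to the stale or generic handler.
// Resolves to the action's result, or undefined if the action threw.
async function runStaleAware<T>(
  action: () => Promise<T>,
  options: StaleAwareOptions,
): Promise<T | undefined> {
  try {
    return await action();
  } catch (error) {
    if (options.isStale(error)) {
      options.onStale();
    } else {
      options.onError(error);
    }
    return undefined;
  } finally {
    options.onSettled?.();
  }
}

// Demo with fake actions; records which callbacks fire, in order.
async function demoStaleAware(): Promise<string[]> {
  const events: string[] = [];
  const options: StaleAwareOptions = {
    isStale: (error) =>
      error instanceof Error && error.message.includes("older or newer deployment"),
    onStale: () => { events.push("stale"); },
    onError: () => { events.push("error"); },
    onSettled: () => { events.push("settled"); },
  };
  await runStaleAware(async () => {
    throw new Error("This request might be from an older or newer deployment.");
  }, options);
  await runStaleAware(async () => {
    throw new Error("Stripe is unavailable");
  }, options);
  await runStaleAware(async () => "ok", options);
  return events;
}

demoStaleAware().then((events) => console.log(events.join(",")));
```

Note that this sketch always calls `onSettled`, which matches the `finally` block in `handleDowngrade` but differs from `handleUpgrade`, where the loading state is only reset on failure.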
Step 6: Dashboard Layout Integration
The provider wraps the entire dashboard. The banner lives at the layout level:
// app/dashboard/layout.tsx
import { Suspense } from "react";
import { StaleClientProvider } from "@/lib/contexts/stale-client-context";
import { StaleClientBanner } from "@/blocks/shared";
// ... imports for Loading and the other providers ...
export default function DashboardLayout({
  children,
}: {
  children: React.ReactNode;
}) {
  return (
    <Suspense fallback={<Loading />}>
      <StaleClientProvider>
        <SelectionProvider>
          <UploadProgressProvider>
            <ValidationModalProvider>
              <FileProcessingProvider>
                <DashboardLayoutContent>
                  {children}
                </DashboardLayoutContent>
              </FileProcessingProvider>
            </ValidationModalProvider>
          </UploadProgressProvider>
        </SelectionProvider>
      </StaleClientProvider>
    </Suspense>
  );
}
function DashboardLayoutContent({ children }: { children: React.ReactNode }) {
  return (
    <div className="flex h-screen bg-(--background)">
      {/* ... sidebar, header, content ... */}
      <StaleClientBanner />
    </div>
  );
}
The new additions are StaleClientProvider wrapping the existing providers, and StaleClientBanner at the end of the layout content.
The Tools Apps: A Different Problem
The separate tools apps (compress, convert, resize, share, feedback) were also hitting the error. But they don't have custom server actions — they're mostly client-side tools.
The issue, as far as we could tell: Next.js 15 uses internal server actions for router transitions. Prefetching and navigation are all internal actions with IDs that change between deployments.
These apps were using output: "standalone" builds:
// next.config.ts (BEFORE)
const nextConfig: NextConfig = {
  output: "standalone",
};
Standalone builds produce a minimal, self-contained server (server.js plus only the traced dependencies), which is meant for container deployments. In our case they seemed to increase deployment skew exposure, because each image is self-contained with its own pinned action IDs.
The fix was removing standalone output entirely:
// next.config.ts (AFTER)
const nextConfig: NextConfig = {
  // No standalone output - standard Next.js build
};
And updating Dockerfiles from:
# BEFORE
CMD ["node", "server.js"]
To:
# AFTER
CMD ["pnpm", "start"]
The apps are simple enough that standard Next.js deployment works fine. The complexity reduction eliminates the skew window.
Files Modified
lib/stripe/actions.ts - Added logServerAction() telemetry
lib/usage-tracking.ts - Added logging for updateBilledCredits
lib/contexts/stale-client-context.tsx - NEW: Context + detection utility
lib/contexts/index.ts - Export stale client utilities
blocks/shared/StaleClientBanner.tsx - NEW: User-facing refresh prompt
blocks/shared/index.ts - Export StaleClientBanner
app/dashboard/layout.tsx - Provider wrapper + banner
blocks/modals/MembershipModal.tsx - Stale error handling (3 actions)
blocks/modals/EmbedModal.tsx - Stale error handling (1 action)
tools/compress/next.config.ts - Removed output: "standalone"
tools/convert/next.config.ts - Removed output: "standalone"
tools/resize/next.config.ts - Removed output: "standalone"
tools/share/next.config.ts - Removed output: "standalone"
tools/feedback/next.config.ts - Removed output: "standalone"
tools/compress/Dockerfile - Changed to pnpm start
tools/convert/Dockerfile - Changed to pnpm start
tools/resize/Dockerfile - Changed to pnpm start
tools/share/Dockerfile - Changed to pnpm start
tools/feedback/Dockerfile - Changed to pnpm start
Lessons
- Server Actions = Deployment Coupling: Action IDs are baked into client bundles. Any deployment creates potential for mismatch. This is the trade-off for the convenience of server actions.
- Railway ≠ Vercel: No built-in skew protection. We have to handle it ourselves. The error message is helpful though: "older or newer deployment" is clear about the cause.
- Graceful Degradation > Cryptic Errors: Users don't understand "Server Action x not found". They understand "Update Available".
- Telemetry First: Adding logs confirmed when errors occurred relative to deployments. Without timestamps and referers, we'd be guessing.
- Consistent Patterns: Once we established the isStaleClientError() pattern, applying it to all server action call sites was mechanical. A function that takes an error, returns a boolean. Simple to test, simple to use.
- Standalone Builds Add Complexity: For simple apps, the deployment optimization isn't worth the skew exposure. Standard builds are fine.
The fix was deployed by late afternoon. No more cryptic errors, no more silent failures. Users see a friendly banner, click refresh, and continue with their upgrade.