Handling Next.js server action deployment skew

Users started hitting a wall. Clicking "Upgrade to Premium" would fail silently. The Stripe checkout modal would show a generic error. Support tickets mentioned "something broke", but we couldn't reproduce it.

The error in Railway logs told the story:

Error: Failed to find Server Action "x". This request might be from an older or newer deployment.

This is deployment skew — when client-side JavaScript references server-side functions that no longer exist.

The Root Cause

Next.js 15 compiles each Server Action to a unique action ID that is baked into the client bundle. These IDs are not stable across builds: Next.js recalculates them between builds, so a redeploy produces a new set. Clients with cached JavaScript bundles still reference the old IDs.

[Figure: Deployment skew timeline. T=0: user loads the page (v1 JS bundle). T=1: deploy starts (v2 server). T=2: old server dies (v1 gone). T=3: user clicks the action (v1 action ID). T=4: 404, mismatch. The client (browser) holds cached JS with action ID "abc123_createCheckout"; the server (v2) only knows action ID "xyz789_createCheckout".]
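
To make the mismatch concrete, here is a toy model of the lookup. It is not how Next.js generates or resolves action IDs: `toyHash`, `buildRegistry`, `invoke`, and the `buildSalt` parameter are all invented for illustration. The only assumption it encodes is that IDs differ per build, so an ID cached from one build has no entry in the next build's registry.

```typescript
// Toy model only: real action IDs are generated by the framework.
type Action = () => string;

// Small FNV-1a style hash, used purely as a stand-in ID generator.
function toyHash(input: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    h ^= input.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16);
}

// Map build-specific IDs to handlers, one registry per build.
function buildRegistry(buildSalt: string, actions: Record<string, Action>): Map<string, Action> {
  const registry = new Map<string, Action>();
  for (const name of Object.keys(actions)) {
    registry.set(toyHash(`${buildSalt}:${name}`), actions[name]);
  }
  return registry;
}

// The server resolves the ID sent by the client before any handler runs.
function invoke(registry: Map<string, Action>, actionId: string): string {
  const handler = registry.get(actionId);
  if (!handler) {
    throw new Error(`Failed to find Server Action "${actionId}".`);
  }
  return handler();
}

const actions = { createCheckout: () => "checkout-session" };
const v1Registry = buildRegistry("build-v1", actions);
const v2Registry = buildRegistry("build-v2", actions);

// The ID a browser cached from the v1 bundle.
const cachedId = toyHash("build-v1:createCheckout");
const servedByV1 = v1Registry.has(cachedId); // true
const servedByV2 = v2Registry.has(cachedId); // false: v2 registered a different ID
```

Calling `invoke(v2Registry, cachedId)` throws before the handler body is reached, which is the same shape as the failure in the logs: the request is rejected at lookup time, not inside the action.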

Vercel handles this with built-in skew protection — old deployments stay warm and requests are routed based on client deployment headers. Railway doesn't have this. Old deployments are gone. Clients with stale bundles hit a wall.

Step 1: Add Telemetry

First step was understanding when this happens. Added logging to all server actions:

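// lib/stripe/actions.ts
"use server";

import { headers } from "next/headers";
import { currentUser } from "@clerk/nextjs/server"; // assumes Clerk, as `clerkUser` suggests
 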
async function logServerAction(actionName: string) {
  const headersList = await headers();
  console.log(`[Server Action] ${actionName} invoked`, {
    timestamp: new Date().toISOString(),
    userAgent: headersList.get("user-agent")?.slice(0, 100),
    referer: headersList.get("referer"),
  });
}
 
export async function createCheckoutSession(
  currentPath?: string,
  billingPeriod: "monthly" | "yearly" = "monthly",
) {
  await logServerAction("createCheckoutSession"); 
 
  const clerkUser = await currentUser();
  if (!clerkUser) {
    throw new Error("User not authenticated");
  }
  // ... rest of implementation
}

Applied the same pattern to createBillingPortalSession and cancelSubscription. Railway logs now show:

[Server Action] createCheckoutSession invoked {
  timestamp: '2026-01-11T10:23:45.123Z',
  userAgent: 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)...',
  referer: 'https://example.com/dashboard'
}

When the deployment skew error occurs, the action is never logged: the failure happens at the routing layer, before our code executes. That confirmed the issue wasn't in our logic.
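
Since the skew errors never reach `logServerAction`, lining them up against deploys has to use the platform's own timestamps. The sketch below assumes you can pull error times and deploy times out of your logs as ISO strings. `msSinceLastDeploy` is a hypothetical helper and the timestamps are made up.

```typescript
// Hypothetical helper: how long after the most recent deploy did an event occur?
// Returns null when the event predates every known deploy.
function msSinceLastDeploy(eventIso: string, deployIsos: string[]): number | null {
  const eventMs = Date.parse(eventIso);
  let latest: number | null = null;
  for (const iso of deployIsos) {
    const deployMs = Date.parse(iso);
    // Keep the latest deploy that happened at or before the event.
    if (deployMs <= eventMs && (latest === null || deployMs > latest)) {
      latest = deployMs;
    }
  }
  return latest === null ? null : eventMs - latest;
}

// Made-up timestamps for illustration only.
const deploys = ["2026-01-11T09:00:00.000Z", "2026-01-11T10:20:00.000Z"];
const gapMs = msSinceLastDeploy("2026-01-11T10:23:45.123Z", deploys); // 225123 (3m 45.123s)
```

Errors that cluster shortly after a deploy and then taper off point at stale bundles rather than at a bug in the action itself.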

Step 2: Detect Stale Client Errors

The error message is predictable. We can catch it:

// lib/contexts/stale-client-context.tsx
 
export function isStaleClientError(error: unknown): boolean {
  if (error instanceof Error) {
    const message = error.message.toLowerCase();
    return (
      message.includes("failed to find server action") ||
      message.includes("server action") ||
      message.includes("older or newer deployment")
    );
  }
  return false;
}

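Multiple patterns because Next.js error messages vary between versions. The function checks all known variants. Note that the bare "server action" check is deliberately broad: it subsumes the first pattern and will also match unrelated errors that happen to mention server actions, so tighten it if false positives show up.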
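
A quick way to sanity-check the detector is to run it against a few sample errors. The sketch below repeats the function so it runs on its own. Only the first message comes from the Railway log above; the others are invented for illustration.

```typescript
function isStaleClientError(error: unknown): boolean {
  if (error instanceof Error) {
    const message = error.message.toLowerCase();
    return (
      message.includes("failed to find server action") ||
      message.includes("server action") ||
      message.includes("older or newer deployment")
    );
  }
  return false;
}

const fromLogs = new Error(
  'Failed to find Server Action "x". This request might be from an older or newer deployment.',
);
const unrelated = new Error("Network request failed"); // invented message
const broadMatch = new Error("Server action timed out"); // invented message

const results = {
  fromLogs: isStaleClientError(fromLogs), // true
  unrelated: isStaleClientError(unrelated), // false
  broadMatch: isStaleClientError(broadMatch), // true: the bare "server action" check matches
  nonError: isStaleClientError("Failed to find Server Action"), // false: not an Error instance
};
```

The third case is worth knowing about: any error whose message mentions "server action" is treated as stale, whether or not a deployment changed.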

Step 3: Create a Context for Global State

When any component detects a stale client, the whole app should know. React Context handles this:

// lib/contexts/stale-client-context.tsx
 
"use client";
 
import { createContext, useContext, useState, useCallback, ReactNode } from "react";
 
interface StaleClientContextType {
  isStale: boolean;
  markAsStale: () => void;
  refresh: () => void;
}
 
const StaleClientContext = createContext<StaleClientContextType | undefined>(undefined);
 
export function StaleClientProvider({ children }: { children: ReactNode }) {
  const [isStale, setIsStale] = useState(false);
 
  const markAsStale = useCallback(() => {
    setIsStale(true);
  }, []);
 
  const refresh = useCallback(() => {
    window.location.reload(); // Full reload, not router.refresh()
  }, []);
 
  return (
    <StaleClientContext.Provider value={{ isStale, markAsStale, refresh }}>
      {children}
    </StaleClientContext.Provider>
  );
}
 
export function useStaleClient() {
  const context = useContext(StaleClientContext);
  if (context === undefined) {
    throw new Error("useStaleClient must be used within a StaleClientProvider");
  }
  return context;
}

The key detail: window.location.reload() instead of Next.js router.refresh(). We need a full page reload to fetch new JavaScript bundles, not just refresh the RSC tree.

[Figure: Stale client detection architecture. Layer 1, detection: isStaleClientError(error), pattern matching on known error messages. Layer 2, state: StaleClientContext with isStale: boolean and markAsStale(): void. Layer 3, UI: StaleClientBanner, shown when isStale, with a "Refresh Now" button. Integration flow: (1) component calls server action; (2) action fails, check isStaleClientError(); (3) if yes, markAsStale() and close the modal; (4) banner appears and the user clicks refresh.]

Step 4: Build the User-Facing Banner

The banner provides a non-intrusive notification:

// blocks/shared/StaleClientBanner.tsx
 
"use client";
 
import { useStaleClient } from "@/lib/contexts/stale-client-context";
import { RefreshCw } from "lucide-react";
 
export function StaleClientBanner() {
  const { isStale, refresh } = useStaleClient();
 
  if (!isStale) return null;
 
  return (
    <div className="fixed inset-x-0 bottom-0 z-50 p-4 sm:bottom-4 sm:left-auto sm:right-4 sm:p-0">
      <div className="mx-auto max-w-md rounded-xl border border-(--warning)/30 bg-(--surface) p-4 shadow-lg sm:mx-0">
        <div className="flex items-start gap-3">
          <div className="flex h-10 w-10 shrink-0 items-center justify-center rounded-full bg-(--warning)/10">
            <RefreshCw className="h-5 w-5 text-(--warning)" />
          </div>
          <div className="flex-1">
            <h3 className="text-sm font-semibold text-(--heading)">
              Update Available
            </h3>
            <p className="mt-1 text-sm text-(--body)">
              A new version is available. Please refresh to continue.
            </p>
            <button
              onClick={refresh}
              className="mt-3 inline-flex items-center gap-2 rounded-full bg-(--primary) px-4 py-2 text-sm font-medium text-white transition-colors hover:bg-(--primary-hover)"
            >
              <RefreshCw className="h-4 w-4" />
              Refresh Now
            </button>
          </div>
        </div>
      </div>
    </div>
  );
}

UX decisions:

  • Warning color (amber), not error red — it's not the user's fault
  • "Update Available" — positive framing instead of "Something broke"
  • No dismiss option — the user genuinely needs to refresh
  • Bottom-right on desktop, full-width on mobile — consistent with toast notifications

[Figure: User experience flow. User clicks "Upgrade" → server action fails → isStaleClientError()? Yes: markAsStale(), close modal, banner appears, user refreshes. No: regular error, show toast "Checkout Failed".]

Step 5: Wire It Into Components

Every component that calls server actions needs the same error-handling pattern. The Membership Modal calls three:

// blocks/modals/MembershipModal.tsx
 
import {
  isStaleClientError,
  useStaleClient,
} from "@/lib/contexts/stale-client-context";
 
export function MembershipModal({ /* props */ }) {
  const { markAsStale } = useStaleClient();
 
  const handleUpgrade = async () => {
    setIsLoading(true);
    try {
      await createCheckoutSession(currentPath, isYearly ? "yearly" : "monthly");
    } catch (error) {
      console.error("Failed to create checkout session:", error);
      if (isStaleClientError(error)) { 
        markAsStale(); 
        onClose(); 
      } else { 
        customToast.error("Checkout Failed", "Failed to start checkout process");
      } 
      setIsLoading(false);
    }
  };
 
  const handleBillingPortal = async () => {
    setIsLoading(true);
    try {
      await createBillingPortalSession();
    } catch (error) {
      console.error("Failed to open billing portal:", error);
      if (isStaleClientError(error)) { 
        markAsStale(); 
        onClose(); 
      } else { 
        customToast.error("Error", "Failed to open billing portal");
      } 
      setIsLoading(false);
    }
  };
 
  const handleDowngrade = async () => {
    // ... confirmation logic ...
    try {
      await cancelSubscription();
      customToast.success("Subscription Canceled", "...");
      onClose();
    } catch (error) {
      console.error("Failed to cancel subscription:", error);
      if (isStaleClientError(error)) { 
        markAsStale(); 
        onClose(); 
      } else { 
        customToast.error("Cancellation Failed", "Failed to cancel subscription");
      } 
    } finally {
      setIsLoading(false);
    }
  };
 
  // ... rest of component
}

Same pattern applied to EmbedModal.tsx — anywhere we call createCheckoutSession.
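
Because the try/catch branch is identical at every call site, it could be factored into a small wrapper. The following is a sketch, not what the codebase does: `runServerAction` and `ActionHandlers` are hypothetical names, and the callbacks stand in for `isStaleClientError`, `markAsStale()` plus `onClose()`, and the toast call.

```typescript
type ActionHandlers = {
  isStale: (error: unknown) => boolean; // e.g. isStaleClientError
  onStale: () => void; // e.g. markAsStale() followed by onClose()
  onError: (error: unknown) => void; // e.g. show an error toast
};

// Hypothetical helper: run an action and route failures to the right handler.
// Resolves to undefined when the action throws.
async function runServerAction<T>(
  action: () => Promise<T>,
  handlers: ActionHandlers,
): Promise<T | undefined> {
  try {
    return await action();
  } catch (error) {
    if (handlers.isStale(error)) {
      handlers.onStale();
    } else {
      handlers.onError(error);
    }
    return undefined;
  }
}

// Example wiring with stand-in callbacks and a deliberately failing action.
const events: string[] = [];
const demo = runServerAction(
  () => Promise.reject(new Error("This request might be from an older or newer deployment.")),
  {
    isStale: (e) => e instanceof Error && e.message.includes("older or newer deployment"),
    onStale: () => events.push("stale"),
    onError: () => events.push("toast"),
  },
);
```

With a wrapper like this, each handler would shrink to one call plus its loading-state bookkeeping. The trade-off is one more layer of indirection for readers of the modal code.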

Step 6: Dashboard Layout Integration

The provider wraps the entire dashboard. The banner lives at the layout level:

// app/dashboard/layout.tsx
 
import { StaleClientProvider } from "@/lib/contexts/stale-client-context"; 
import { StaleClientBanner } from "@/blocks/shared"; 
 
export default function DashboardLayout({
  children,
}: {
  children: React.ReactNode;
}) {
  return (
    <Suspense fallback={<Loading />}>
      <StaleClientProvider>
        <SelectionProvider>
          <UploadProgressProvider>
            <ValidationModalProvider>
              <FileProcessingProvider>
                <DashboardLayoutContent>
                  {children}
                </DashboardLayoutContent>
              </FileProcessingProvider>
            </ValidationModalProvider>
          </UploadProgressProvider>
        </SelectionProvider>
      </StaleClientProvider>
    </Suspense>
  );
}
 
function DashboardLayoutContent({ children }: { children: React.ReactNode }) {
  return (
    <div className="flex h-screen bg-(--background)">
      {/* ... sidebar, header, content ... */}
      <StaleClientBanner />
    </div>
  );
}

The new additions are StaleClientProvider wrapping the existing providers, and StaleClientBanner at the end of the layout content.

The Tools Apps: A Different Problem

The separate tools apps (compress, convert, resize, share, feedback) were also hitting the error. But they don't have custom server actions — they're mostly client-side tools.

Our working theory: Next.js 15 relies on internal server actions for router transitions. Prefetching and navigation would then carry internal action IDs that change between deployments.

These apps were using output: "standalone" builds:

// next.config.ts (BEFORE)
const nextConfig: NextConfig = {
  output: "standalone", 
};

Standalone builds are optimized for small, self-contained Docker images, but in our setup they increased deployment skew exposure because each image is self-contained with pinned action IDs.

The fix was removing standalone output entirely:

// next.config.ts (AFTER)
const nextConfig: NextConfig = {
  // No standalone output - standard Next.js build
};

And updating Dockerfiles from:

# BEFORE
CMD ["node", "server.js"]

To:

# AFTER
CMD ["pnpm", "start"]

The apps are simple enough that standard Next.js deployment works fine. The complexity reduction eliminates the skew window.

[Figure: Two different solutions. Main web app: has real server actions (Stripe checkout, billing portal, subscription cancel); solution: detect + banner + refresh. Tools apps: only internal router actions (compress, convert, resize, share, feedback); solution: remove standalone builds.]

Files Modified

lib/stripe/actions.ts                    - Added logServerAction() telemetry
lib/usage-tracking.ts                    - Added logging for updateBilledCredits
lib/contexts/stale-client-context.tsx    - NEW: Context + detection utility
lib/contexts/index.ts                    - Export stale client utilities
blocks/shared/StaleClientBanner.tsx      - NEW: User-facing refresh prompt
blocks/shared/index.ts                   - Export StaleClientBanner
app/dashboard/layout.tsx                 - Provider wrapper + banner
blocks/modals/MembershipModal.tsx        - Stale error handling (3 actions)
blocks/modals/EmbedModal.tsx             - Stale error handling (1 action)

tools/compress/next.config.ts            - Removed output: "standalone"
tools/convert/next.config.ts             - Removed output: "standalone"
tools/resize/next.config.ts              - Removed output: "standalone"
tools/share/next.config.ts               - Removed output: "standalone"
tools/feedback/next.config.ts            - Removed output: "standalone"

tools/compress/Dockerfile                - Changed to pnpm start
tools/convert/Dockerfile                 - Changed to pnpm start
tools/resize/Dockerfile                  - Changed to pnpm start
tools/share/Dockerfile                   - Changed to pnpm start
tools/feedback/Dockerfile                - Changed to pnpm start

Lessons

  1. Server Actions = Deployment Coupling — Action IDs are baked into client bundles. Any deployment creates potential for mismatch. This is the trade-off for the convenience of server actions.

  2. Railway ≠ Vercel — No built-in skew protection. We have to handle it ourselves. The error message is helpful though — "older or newer deployment" is clear about the cause.

  3. Graceful Degradation > Cryptic Errors — Users don't understand "Server Action x not found". They understand "Update Available".

  4. Telemetry First — Adding logs confirmed when errors occurred relative to deployments. Without timestamps and referers, we'd be guessing.

  5. Consistent Patterns — Once we established the isStaleClientError() pattern, applying it to all server action call sites was mechanical. A function that takes an error, returns a boolean. Simple to test, simple to use.

  6. Standalone Builds Add Complexity — For simple apps, the deployment optimization isn't worth the skew exposure. Standard builds are fine.

The fix was deployed by late afternoon. No more cryptic errors, no more silent failures. Users see a friendly banner, click refresh, and continue with their upgrade.