Juliano Alves

Streaming and Suspense in the Next.js App Router

7 min read
By Juliano Alves

The App Router fundamentally changes how you think about time: a route is no longer “one big await” before paint. Instead, React can emit HTML incrementally while slow segments of the tree are still resolving. That behavior is built on Suspense on the server and on the client.

This post walks through the mental model, concrete patterns, caching interactions, and pitfalls I watch for in production apps.

The problem: all-or-nothing rendering

In a classic SSR model, the server computes the entire page HTML before sending bytes. If one subtree depends on a slow upstream (database, third-party API, heavy CMS query), every user waits—even for static chrome like headers, navigation, and layout.

Streaming flips the contract: send early bytes that include layout and placeholders, then append HTML as data arrives. Browsers can paint progressively, so paint metrics such as First Contentful Paint and LCP often improve because meaningful content reaches the browser sooner.
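To make the difference concrete, here is a toy timeline model in TypeScript. It is not how React or Next.js is implemented; the `Segment` shape, the segment names, and the latencies are invented for illustration only.

```typescript
// Hypothetical model: each part of the page becomes ready after some latency.
interface Segment {
  name: string;
  readyAtMs: number;
}

// One write to the response: when it happens and which segments it carries.
interface Flush {
  atMs: number;
  names: string[];
}

// Blocking SSR: nothing is sent until the slowest segment is ready.
function blockingTimeline(segments: Segment[]): Flush[] {
  const atMs = Math.max(0, ...segments.map((s) => s.readyAtMs));
  return [{ atMs, names: segments.map((s) => s.name) }];
}

// Streaming: segments are flushed as they become ready.
// Segments that are ready at the same moment share one flush.
function streamingTimeline(segments: Segment[]): Flush[] {
  const sorted = [...segments].sort((a, b) => a.readyAtMs - b.readyAtMs);
  const flushes: Flush[] = [];
  for (const segment of sorted) {
    const last = flushes[flushes.length - 1];
    if (last && last.atMs === segment.readyAtMs) {
      last.names.push(segment.name);
    } else {
      flushes.push({ atMs: segment.readyAtMs, names: [segment.name] });
    }
  }
  return flushes;
}

const page: Segment[] = [
  { name: 'layout', readyAtMs: 0 },
  { name: 'header', readyAtMs: 0 },
  { name: 'posts', readyAtMs: 800 }, // the slow upstream
];

console.log(blockingTimeline(page)); // a single flush, delayed until the slow segment is ready
console.log(streamingTimeline(page)); // layout and header first, posts later
```

In this model the total work is identical; only the moment the first useful bytes leave the server changes.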

How Suspense participates

Suspense declares a region of the tree that may suspend while reading async data. When the child suspends, React renders the fallback and can later replace that subtree without discarding outer layout.

import { Suspense } from 'react';
import { Posts } from './posts'; // async Server Component
import { PostsSkeleton } from './posts-skeleton';

export default function BlogPage() {
  return (
    <main className="mx-auto max-w-3xl space-y-8 p-6">
      <header>
        <h1>Blog</h1>
        <p className="text-muted-foreground">Latest writing on front-end and platform work.</p>
      </header>

      <Suspense fallback={<PostsSkeleton />}>
        <Posts />
      </Suspense>
    </main>
  );
}

Posts might be implemented as:

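// posts.tsx
import Link from 'next/link';

type Post = { id: string; title: string };

// Async Server Component: while this await is pending, the surrounding
// <Suspense> boundary shows its fallback.
export async function Posts() {
  const res = await fetch('https://api.example.com/posts', {
    next: { revalidate: 60 }, // reuse the cached response for up to 60 seconds
  });
  // Throwing hands the failure to the nearest error boundary.
  if (!res.ok) throw new Error('Failed to load posts');
  const items: Post[] = await res.json();

  return (
    <ul className="space-y-4">
      {items.map((post) => (
        <li key={post.id}>
          {/* Link gives client-side navigation instead of a full page load */}
          <Link href={`/blog/${post.id}`}>{post.title}</Link>
        </li>
      ))}
    </ul>
  );
}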

The key observation: the header renders immediately in the first chunk; the list arrives when fetch completes.

loading.tsx vs inline Suspense

Next.js supports a file convention loading.tsx beside page.tsx. It automatically wraps the page segment in Suspense and is ideal for route-level loading UI during navigation.

Inline <Suspense> boundaries are better when:

  • Only part of the page is slow and you want granular skeletons.
  • You need different fallbacks for different subtrees.
  • You are composing parallel data-fetching regions (e.g. dashboard widgets).

Use both: loading.tsx for fast route feedback, nested Suspense for fine-grained streaming inside the page.
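One reason to give dashboard widgets their own boundaries is that sibling async components can start their work in parallel, whereas awaiting several sources one after another inside a single component creates a waterfall. The sketch below shows that difference in plain TypeScript, outside React. The loaders, the widget names, and the peak-concurrency counter are hypothetical stand-ins for real data fetching.

```typescript
type Loader<T> = () => Promise<T>;

// Waterfall: each loader starts only after the previous one has finished.
async function loadSequentially<T>(loaders: Loader<T>[]): Promise<T[]> {
  const results: T[] = [];
  for (const load of loaders) {
    results.push(await load());
  }
  return results;
}

// Parallel: every loader is started before any result is awaited.
function loadInParallel<T>(loaders: Loader<T>[]): Promise<T[]> {
  return Promise.all(loaders.map((load) => load()));
}

// Fake loaders that record how many of them are in flight at the same time.
function makeTrackedLoaders(names: string[]) {
  let inFlight = 0;
  let peak = 0;
  const loaders: Loader<string>[] = names.map((name) => async () => {
    inFlight += 1;
    peak = Math.max(peak, inFlight);
    await Promise.resolve(); // stand-in for network latency
    inFlight -= 1;
    return name;
  });
  return { loaders, peak: () => peak };
}

const waterfall = makeTrackedLoaders(['revenue', 'signups', 'alerts']);
loadSequentially(waterfall.loaders).then(() => {
  console.log('sequential peak in flight:', waterfall.peak());
});

const concurrent = makeTrackedLoaders(['revenue', 'signups', 'alerts']);
loadInParallel(concurrent.loaders).then(() => {
  console.log('parallel peak in flight:', concurrent.peak());
});
```

With real latency, the sequential version costs roughly the sum of the individual waits, while the parallel version costs roughly the longest one.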

Caching and streaming interact

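When you call fetch in a Server Component, the cache, next.revalidate, and next.tags options determine how the response is cached and when it is revalidated (ISR-style behavior). Defaults have changed between major versions (Next.js 15 stopped caching fetch responses by default), which is one more reason to set them explicitly. Streaming does not remove the need to think about staleness: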

  • Force dynamic: export const dynamic = 'force-dynamic' or fetch(..., { cache: 'no-store' }) when data must be fresh per request.
  • Static with revalidation: fetch(url, { next: { revalidate: 3600 } }) for content that can be slightly stale.
  • Tag-based revalidation: fetch(url, { next: { tags: ['posts'] } }) paired with revalidateTag('posts') from a webhook or server action.
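One way to keep these choices explicit is to name the freshness policy and derive the fetch options from it. The helper below is a minimal sketch, not part of Next.js: `Freshness`, `fetchOptionsFor`, and the `NextFetchOptions` interface are hypothetical names, and the interface only mirrors the subset of options mentioned in the list above.

```typescript
// Assumed, non-exhaustive shape of the caching options discussed above.
interface NextFetchOptions {
  cache?: 'force-cache' | 'no-store';
  next?: { revalidate?: number | false; tags?: string[] };
}

// A named freshness contract for one data source (hypothetical type).
type Freshness =
  | { kind: 'per-request' }
  | { kind: 'time-based'; seconds: number }
  | { kind: 'tagged'; tags: string[] };

// Map the contract to fetch options so every call site states its intent.
function fetchOptionsFor(policy: Freshness): NextFetchOptions {
  switch (policy.kind) {
    case 'per-request':
      return { cache: 'no-store' };
    case 'time-based':
      return { next: { revalidate: policy.seconds } };
    case 'tagged':
      return { next: { tags: policy.tags } };
  }
}

console.log(fetchOptionsFor({ kind: 'per-request' }));
console.log(fetchOptionsFor({ kind: 'time-based', seconds: 3600 }));
console.log(fetchOptionsFor({ kind: 'tagged', tags: ['posts'] }));
```

Inside a Server Component the result would be passed as the second argument to fetch, for example `fetch(url, fetchOptionsFor({ kind: 'tagged', tags: ['posts'] }))`.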

Misconfigured caching can stream “fast” but wrong data. Always validate whether the streamed HTML matches your freshness contract.

Skeleton design matters

Poor fallbacks hurt CLS (Cumulative Layout Shift). Match the skeleton’s dimensions and hierarchy to the final UI:

  • Reserve image aspect ratios.
  • Use similar heading line counts.
  • Avoid skeletons that collapse from 400px height to 40px when real content arrives.
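A rough worked example shows why the collapsing skeleton is costly. A layout shift score is the impact fraction multiplied by the distance fraction. The sketch below applies that to a single full-width block sitting directly under the skeleton. The 400×800 viewport and the 200px block are assumptions, and the browser's own measurement remains authoritative.

```typescript
interface Viewport {
  width: number;
  height: number;
}

// Visible height of the vertical interval [top, bottom) inside the viewport.
function visibleHeight(top: number, bottom: number, vp: Viewport): number {
  return Math.max(0, Math.min(vp.height, bottom) - Math.max(0, top));
}

// Score for one full-width block that moves vertically from fromTop to toTop.
function verticalShiftScore(
  vp: Viewport,
  blockHeight: number,
  fromTop: number,
  toTop: number,
): number {
  const before = visibleHeight(fromTop, fromTop + blockHeight, vp);
  const after = visibleHeight(toTop, toTop + blockHeight, vp);
  const overlap = visibleHeight(
    Math.max(fromTop, toTop),
    Math.min(fromTop, toTop) + blockHeight,
    vp,
  );
  // Impact: union of the before and after areas, as a share of the viewport.
  const impactFraction = (before + after - overlap) / vp.height;
  // Distance: how far the block moved, relative to the larger viewport side.
  const distanceFraction =
    Math.abs(fromTop - toTop) / Math.max(vp.width, vp.height);
  return impactFraction * distanceFraction;
}

const phone: Viewport = { width: 400, height: 800 }; // assumed viewport

// Skeleton shrinks from 400px to 40px, so the block below jumps up by 360px.
const collapsing = verticalShiftScore(phone, 200, 400, 40);

// Skeleton matches the final content height, so the block below stays put.
const matched = verticalShiftScore(phone, 200, 400, 400);

console.log({ collapsing, matched });
```

Under these assumptions the collapsing case scores about 0.225, more than twice the 0.1 that is usually treated as the upper bound for a good CLS, while the matched skeleton contributes nothing.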

Error boundaries and partial failure

Streaming does not remove failure modes. Pair regions that suspend with error.tsx at appropriate segment levels so one failing widget does not blank the entire route unless that is intentional.

For recoverable UI, you can also isolate fetches in smaller Server Components and let Next surface segment errors while keeping unrelated siblings on screen.
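The containment idea can be sketched outside React: render each region independently and turn a failure into that region's fallback instead of failing the whole page. This is an analogy for what a segment-level error boundary gives you, not the Next.js mechanism itself. `renderWithFallback`, `renderDashboard`, and the region names are hypothetical.

```typescript
type RegionRenderer = () => Promise<string>;

// Render one region; a thrown error becomes that region's fallback markup.
async function renderWithFallback(
  render: RegionRenderer,
  fallback: string,
): Promise<string> {
  try {
    return await render();
  } catch {
    return fallback;
  }
}

// Render all regions concurrently; one failure never rejects the whole page.
async function renderDashboard(
  regions: Record<string, RegionRenderer>,
): Promise<Record<string, string>> {
  const output: Record<string, string> = {};
  await Promise.all(
    Object.entries(regions).map(async ([name, render]) => {
      output[name] = await renderWithFallback(
        render,
        `<p>${name} is unavailable.</p>`,
      );
    }),
  );
  return output;
}

renderDashboard({
  revenue: async () => '<p>Revenue: up</p>',
  alerts: async () => {
    throw new Error('alerts service timed out'); // simulated upstream failure
  },
}).then((html) => console.log(html));
```

The revenue region still renders while the alerts region degrades to its fallback, which is the behavior to aim for when deciding where errors should be contained.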

When not to stream

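Streaming makes debugging harder and constrains anything that must be decided before the first byte: once the response has started, the status code and headers can no longer change, which matters for redirects, cookies, and middleware-style logic in edge cases. For tiny pages or purely static marketing routes, a single blocking render may be simpler.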

Also, some crawlers and proxies buffer aggressively; most modern environments handle streaming well, but validate SEO-critical pages in your hosting stack.

Checklist before shipping

  1. Identify the slow leaves in the React tree (DB, remote APIs, filesystem).
  2. Wrap them in tight Suspense boundaries with honest skeletons.
  3. Set explicit fetch caching (no-store, revalidate, or tags).
  4. Measure TTFB, LCP, and CLS in Lighthouse and in the field (RUM).
  5. Add error.tsx at the segment where failures should be contained.
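For the measurement step, it can help to compare samples against an explicit budget. The sketch below uses the commonly published "good" thresholds (TTFB up to 800 ms, LCP up to 2.5 s, CLS up to 0.1) as defaults. The `Metrics` shape and `overBudget` helper are hypothetical, and the sample values are invented.

```typescript
interface Metrics {
  ttfbMs: number;
  lcpMs: number;
  cls: number;
}

// Default budget: the commonly published "good" thresholds.
const GOOD_THRESHOLDS: Metrics = { ttfbMs: 800, lcpMs: 2500, cls: 0.1 };

// Return the names of the metrics that exceed the budget.
function overBudget(
  sample: Metrics,
  budget: Metrics = GOOD_THRESHOLDS,
): (keyof Metrics)[] {
  const names: (keyof Metrics)[] = ['ttfbMs', 'lcpMs', 'cls'];
  return names.filter((name) => sample[name] > budget[name]);
}

// Invented sample: fast first byte, slow largest paint, stable layout.
const labRun: Metrics = { ttfbMs: 300, lcpMs: 3100, cls: 0.05 };
console.log(overBudget(labRun));
```

Running the same check on lab and field samples makes it obvious when a change helps in Lighthouse but not for real users.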

Summary

Streaming + Suspense is the default tool in the App Router for graceful degradation under latency. It shines when combined with thoughtful caching, skeleton design, and segment-level error handling—treating perceived performance as part of the architecture, not a last-minute polish pass.

© 2026 Juliano Alves. All rights reserved.