A Brief Discussion on Data Hydration and Persistent Data in NextJS RSC/SSR (Part One)

Because I recently rewrote my personal site and encountered many pitfalls while trying out the brand new RSC architecture of NextJS, I plan to document some practices in this article.

In an SSR architecture, when a component rendered on the server is hydrated on the client, the data used during client-side rendering (CSR) must match the data the server used. Only then do both ends render the same output and hydration succeeds; otherwise React reports: Hydration failed because the initial UI does not match what was rendered on the server. This error does not crash the page and does not noticeably hurt LCP, but it is frustrating: during development it triggers a stream of red NextJS error overlays, and in production it floods Sentry (if Sentry is integrated). The following image shows the current state of kami. Since I could no longer keep patching it, I decided to rewrite it.

[Image: Sentry reported interface 429 throttled]

In the RSC architecture, which is still built on SSR, routing is handled entirely by the server, so the original NextJS router has been completely replaced. A route is rendered on the server from the top-level component downwards, and the result is returned to the client. In theory, if no component in the tree is marked 'use client', the browser does not need to render anything. Most projects are not that simple, though; for example, my data changes in response to events pushed from the server.

The key point is that the data must be consistent when the browser hydrates. If that cannot be guaranteed for a component, server rendering has to be abandoned for that component. The most conventional way to keep the data consistent is to pass it down through props, as in the following code, but this approach only goes so far.

// app/pageImpl.tsx
'use client'

export default (props: { data: PageData }) => {
	// ...
}

// app/page.tsx
import PageImpl from './pageImpl'
export default async () => {
  const data = await fetch('...').then((res) => res.json())
  
  return <PageImpl data={data} />
}

The above is the data passing method I initially tried. Using this method, there are no issues as long as the passed data is JSON serializable.
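To make "JSON serializable" concrete, here is a minimal sketch. It assumes the props cross the server/client boundary as a plain JSON round trip, which is a simplification of the real RSC payload format. The helper names survivesJsonRoundTrip and deepEqual are hypothetical and not part of NextJS or React.

```typescript
// Hypothetical helper: checks whether a value comes back unchanged from a
// JSON round trip. This only approximates what can cross the boundary safely.
function survivesJsonRoundTrip(value: unknown): boolean {
  try {
    const copy: unknown = JSON.parse(JSON.stringify(value))
    return deepEqual(value, copy)
  } catch {
    // JSON.stringify throws on circular references and BigInt values
    return false
  }
}

// Structural comparison that also requires both sides to have the same prototype.
function deepEqual(a: unknown, b: unknown): boolean {
  if (a === b) return true
  if (typeof a !== 'object' || typeof b !== 'object' || a === null || b === null) {
    return false
  }
  // Class instances (Date, Map, ...) do not come back as the same type
  if (Object.getPrototypeOf(a) !== Object.getPrototypeOf(b)) return false
  const keysA = Object.keys(a)
  const keysB = Object.keys(b)
  if (keysA.length !== keysB.length) return false
  return keysA.every((key) =>
    deepEqual((a as Record<string, unknown>)[key], (b as Record<string, unknown>)[key]),
  )
}

const plainProps = { title: 'hello', tags: ['a', 'b'], count: 1 }
const propsWithDate = { createdAt: new Date(0) }
const propsWithCallback = { onClick: () => {} }
```

Under this assumption, plainProps passes the check, while propsWithDate (the Date comes back as a string) and propsWithCallback (the function is dropped) do not.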

However, using the above method, the data passed through props is immutable, and the page's components are driven by this data. If this data needs to change based on subsequent events, state management must be introduced.

Returning to the already problematic kami, here is how it works. Once the data a page needs has been fetched, the server renders the page from that data and returns HTML to the browser. The first frame the browser paints is the complete page, but the page is not yet interactive. Once the JS has loaded, React begins to hydrate. However, because the page's data is read from the store rather than passed through props, and the store has not been hydrated yet, the first render during hydration falls into a loading state with no data. React then reports a hydration error and falls back to client rendering.
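The failure mode described above can be reproduced without React. The following is a minimal sketch: createToyStore and renderNote are hypothetical stand-ins for a global store and a component, not the Zustand or Jotai API. It shows only that an empty client store produces different output from the server, and a seeded one does not.

```typescript
type NoteData = { title: string }

// A tiny stand-in for a global store (hypothetical, for illustration only).
function createToyStore(initial: NoteData | null) {
  let state = initial
  return {
    get: () => state,
    set: (next: NoteData | null) => {
      state = next
    },
  }
}

// The "component": its output depends only on what the store holds.
function renderNote(store: { get: () => NoteData | null }): string {
  const note = store.get()
  return note ? `<h1>${note.title}</h1>` : '<p>Loading...</p>'
}

// Server: the store is filled before rendering.
const serverStore = createToyStore(null)
serverStore.set({ title: 'Hello' })
const serverHtml = renderNote(serverStore)

// Client, first render during hydration: the store starts empty.
const emptyClientStore = createToyStore(null)
const mismatched = renderNote(emptyClientStore) !== serverHtml

// Client with the server data seeded into the store before the first render.
const seededClientStore = createToyStore({ title: 'Hello' })
const matched = renderNote(seededClientStore) === serverHtml
```

Whatever state library is used, the goal is the second case: the store must already hold the server's data when the first client render runs.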

The Zustand I previously used did not seem to provide a good solution. This time, I plan to use Jotai to complete this part of the migration. Our page data is still driven by the Store rather than passed through props.

React Query Solution

I tried using React Query as an intermediary. React Query ships a Hydrate component, which solves this problem to some extent. However, once React Query manages the data, it becomes hard to control update granularity per component. React Query's select option is not very flexible, and in my attempts even select could not limit re-renders to exactly the components whose data had changed.

Is it really simple?

If using the React Query solution, simple scenarios only require the following operations.

Establish the ReactQueryProvider and Hydrate components, which are two client components.

// app/provider.tsx
'use client'
import { QueryClient, QueryClientProvider } from '@tanstack/react-query'
import { PropsWithChildren } from 'react'

export const queryClient = new QueryClient({
  defaultOptions: {
    queries: {
      staleTime: 1000 * 60 * 5, // 5 minutes
      refetchInterval: 1000 * 60 * 5, // 5 minutes
      refetchOnWindowFocus: false,
      refetchIntervalInBackground: false,
    },
  },
})
export const ReactQueryProvider = ({ children }: PropsWithChildren) => {
  return (
    <QueryClientProvider client={queryClient}>{children}</QueryClientProvider>
  )
}

// app/hydrate.tsx
'use client'

import { Hydrate as RQHydrate } from '@tanstack/react-query'
import type { HydrateProps } from '@tanstack/react-query'

export function Hydrate(props: HydrateProps) {
  return <RQHydrate {...props} />
}

Then import it in layout.tsx.

import { QueryClient, dehydrate } from '@tanstack/react-query'
import { Hydrate } from './hydrate'

import { ReactQueryProvider } from './provider'
import { QueryKeys } from './query-key'
import { sleep } from '@/utils'

const queryClient = new QueryClient({
  defaultOptions: {
    queries: {
      cacheTime: 1000,
      staleTime: 1000,
    },
  },
})
export default async function RootLayout({
  children,
}: {
  children: React.ReactNode
}) {
  await queryClient.fetchQuery({
    queryKey: [QueryKeys.ROOT],
    queryFn: async () => {
      await sleep(1000)
      const random = Math.random()
      console.log(random)
      return random
    },
  })
  const state = dehydrate(queryClient, {
    shouldDehydrateQuery: (query) => {
      if (query.state.error) return false
      if (!query.meta) return true
      const {
        shouldHydration,
        skipHydration,
      } = query.meta

      if (skipHydration) return false

      return (shouldHydration as Function)?.(query.state.data as any) ?? false
    },
  })

  return (
    <ReactQueryProvider>
      <Hydrate state={state}>
        <html lang="en">
          <body>{children}</body>
        </html>
      </Hydrate>
    </ReactQueryProvider>
  )
}

Note that a QueryClient must also be created on the server side. Server Components and Client Components do not share a QueryClient instance, and the dehydrate step runs against the server-side one, so layout.tsx creates a second, server-only QueryClient. In RootLayout we define a query that simulates fetching random data and wait for it to finish before dehydrating. Note the cacheTime set above; it is discussed later. Next, we verify that hydration works: if no hydration error occurs, everything is fine.
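The hand-off between the two clients can be pictured with a toy model. This is a sketch only: TinyQueryCache is a hypothetical class that imitates the shape of the dehydrate and hydrate steps and is not React Query's implementation.

```typescript
type DehydratedEntry = { queryKey: unknown[]; data: unknown }
type DehydratedPayload = { queries: DehydratedEntry[] }

// Toy cache keyed by the serialized query key (hypothetical, for illustration).
class TinyQueryCache {
  private entries = new Map<string, DehydratedEntry>()

  setQueryData(queryKey: unknown[], data: unknown): void {
    this.entries.set(JSON.stringify(queryKey), { queryKey, data })
  }

  getQueryData(queryKey: unknown[]): unknown {
    return this.entries.get(JSON.stringify(queryKey))?.data
  }

  // Server side: turn the cache into plain, serializable data.
  dehydrate(): DehydratedPayload {
    return { queries: Array.from(this.entries.values()) }
  }

  // Client side: seed the cache before any component reads from it.
  hydrate(payload: DehydratedPayload): void {
    for (const entry of payload.queries) {
      this.setQueryData(entry.queryKey, entry.data)
    }
  }
}

// Server: a separate instance fetches the data and serializes its state.
const serverSideCache = new TinyQueryCache()
serverSideCache.setQueryData(['root'], 0.42)
const wirePayload = JSON.stringify(serverSideCache.dehydrate())

// Client: a different instance starts empty and is seeded from the payload.
const clientSideCache = new TinyQueryCache()
const beforeHydrate = clientSideCache.getQueryData(['root'])
clientSideCache.hydrate(JSON.parse(wirePayload))
const afterHydrate = clientSideCache.getQueryData(['root'])
```

The two caches never share memory; the only thing that travels from server to client is the serialized payload.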

Create page.tsx and convert it into a Client Component.

'use client'
import { useQuery } from '@tanstack/react-query'
import { QueryKeys } from './query-key'

export default function Home() {
  const { data } = useQuery({
    queryKey: [QueryKeys.ROOT],
    queryFn: async () => {
      return 0
    },
    enabled: false,
  })
  return <p>Hydrate Number: {data}</p>
}

Here we set enabled: false so that the query never fetches on its own and the hydrated data is not refreshed. In this example, as long as the page does not display 0, hydration worked.

We see that the random number matches what the server printed, and there are no Hydrate errors in the browser.

Data Caching

Earlier, we set cacheTime on the server-only QueryClient. This parameter is not the lifetime of the cached data, as you might expect, but the lifetime of an inactive Query instance. In React Query, all Queries live in the QueryCache, and a Query with no observers is destroyed once this time has passed. With the useQuery hook, the Query stays attached to the mounted component, so you rarely need to think about this value. However, data fetched manually through the QueryClient also creates Query instances, and no component keeps those alive. Therefore, on the server side, do not set this time too short if the same Query needs to be hit more than once. The default is 5 minutes in the browser; on the server, React Query v4 defaults cacheTime to Infinity.

For example, if I set the server-side QueryClient's cacheTime to 10 milliseconds and another asynchronous task runs between the fetch and the dehydrate call, the Query instance may be destroyed before dehydrate is reached.


const queryClient = new QueryClient({
  defaultOptions: {
    queries: {
      cacheTime: 10, // Set to 10ms, perhaps to avoid letting the Server hit the API cache for too long to ensure data is up-to-date.
    },
  },
})

export default async function RootLayout({
  children,
}: {
  children: React.ReactNode
}) {
  await queryClient.fetchQuery({
    queryKey: [QueryKeys.ROOT],
    queryFn: async () => {
      await sleep(1000)
      const random = Math.random()
      console.log(random)
      return random
    },
  })
  await sleep(10) // Simulate an asynchronous task that exceeds 10ms
  
  const state = dehydrate(queryClient, {
    shouldDehydrateQuery: (query) => {
      if (query.state.error) return false
      if (!query.meta) return true
      const {
        shouldHydration,
        skipHydration,
      } = query.meta

      if (skipHydration) return false

      return (shouldHydration as Function)?.(query.state.data as any) ?? false
    },
  })

  return (
    <ReactQueryProvider>
      <Hydrate state={state}>
        <html lang="en">
          <body>{children}</body>
        </html>
      </Hydrate>
    </ReactQueryProvider>
  )
}

At this point, the page in the browser no longer shows any data.
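Why the entry disappears can be illustrated with a toy model. This is a sketch under stated assumptions: ToyGcCache is hypothetical, it takes the current time as an explicit argument instead of using real timers, and it simplifies React Query's garbage collection to "remove entries that have had no observers for at least cacheTime".

```typescript
type ToyEntry = { data: unknown; observers: number; inactiveSince: number }

// Toy model of an observer-aware cache (not React Query's implementation).
class ToyGcCache {
  private entries = new Map<string, ToyEntry>()
  private cacheTime: number

  constructor(cacheTime: number) {
    this.cacheTime = cacheTime
  }

  // Like a manual fetch on the server: data is stored, nothing subscribes.
  storeFetched(key: string, data: unknown, now: number): void {
    this.entries.set(key, { data, observers: 0, inactiveSince: now })
  }

  // Like a mounted useQuery: an observer keeps the entry alive.
  subscribe(key: string): void {
    const entry = this.entries.get(key)
    if (entry) entry.observers += 1
  }

  // Garbage collection pass at time `now` (milliseconds).
  collect(now: number): void {
    this.entries.forEach((entry, key) => {
      if (entry.observers === 0 && now - entry.inactiveSince >= this.cacheTime) {
        this.entries.delete(key)
      }
    })
  }

  dehydrateKeys(): string[] {
    return Array.from(this.entries.keys())
  }
}

// cacheTime = 10ms: the fetch finishes at t=1000, another task takes 10ms.
const shortLived = new ToyGcCache(10)
shortLived.storeFetched('root', 0.42, 1000)
shortLived.collect(1010)
const shortLivedKeys = shortLived.dehydrateKeys()

// cacheTime = 1000ms: the entry is still there when dehydrate runs.
const longLived = new ToyGcCache(1000)
longLived.storeFetched('root', 0.42, 1000)
longLived.collect(1010)
const longLivedKeys = longLived.dehydrateKeys()
```

In this model shortLivedKeys is empty and longLivedKeys still contains the root entry, which mirrors the empty page seen with a 10ms cacheTime.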

Clearly, it is somewhat tricky to use React Query on the server while also not wanting the server to hold on to cached API responses.

Potential Data Leakage

If you are not running Next.js in serverless mode, there is only one QueryClient on the server while many users access your site, so that single QueryClient caches data from different requests.

When User A accesses the site, the response may contain hydration data produced by User B's visit.

For example, let's write a demo. We comment out the server-side cacheTime so that it reverts to the default.

Create pages A and B.

// app/A/layout.tsx
import { queryClient } from '../queryClient.server'

export default async () => {
  await queryClient.fetchQuery({
    queryKey: ['A'],
    queryFn: async () => {
      return 'This is A'
    },
  })
  return null
}
// app/A/page.tsx
export default () => {
  return null
}

B is similar, changing all instances of A to B.

Access /A and /B. Refresh the page and check the HTML source of /A.

We can see that accessing /A carries the hydration data of /B.
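The same effect can be reproduced in a few lines without Next.js. This is a minimal sketch: handleRequestShared and handleRequestIsolated are hypothetical handlers, and a plain Map stands in for the QueryClient cache.

```typescript
type RouteCache = Map<string, string>

// Module-level singleton: lives as long as the server process.
const sharedRouteCache: RouteCache = new Map()

// Every request writes into the same cache, and everything in the cache
// ends up in the dehydrated state of the response.
function handleRequestShared(route: string): string[] {
  sharedRouteCache.set(route, `This is ${route}`)
  return Array.from(sharedRouteCache.keys())
}

// A fresh cache per request only ever contains that request's data.
function handleRequestIsolated(route: string): string[] {
  const requestCache: RouteCache = new Map()
  requestCache.set(route, `This is ${route}`)
  return Array.from(requestCache.keys())
}

const sharedFirstVisit = handleRequestShared('A')
const sharedSecondVisit = handleRequestShared('B')
const isolatedSecondVisit = handleRequestIsolated('B')
```

Here sharedSecondVisit contains both A and B, while isolatedSecondVisit contains only B.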

As traffic increases, this hydration data can become quite large, which is something we do not want. Moreover, if you forward cookies to the server, visitors may be able to see data they should not.

To avoid this, my solution is to decide based on meta. You can add custom meta keys to a query definition to indicate whether the query should be hydrated, and then hydrate only the data that belongs to the current route. For sensitive content (for example, content that requires authentication or is only partially visible), hydration is skipped outright.


  const dehydratedState = dehydrate(queryClient, {
    shouldDehydrateQuery: (query) => {
      if (query.state.error) return false
      if (!query.meta) return true
      const {
        shouldHydration,
        hydrationRoutePath,
        skipHydration,
        forceHydration,
      } = query.meta

      if (forceHydration) return true
      if (hydrationRoutePath) {
        const pathname = headers().get(REQUEST_PATHNAME)

        if (pathname === query.meta?.hydrationRoutePath) {
          if (!shouldHydration) return true
          return (shouldHydration as Function)(query.state.data as any)
        }
      }

      if (skipHydration) return false

      return (shouldHydration as Function)?.(query.state.data as any) ?? false
    },
  })

You only need to modify the dehydrate call. Here, I use shouldHydration, hydrationRoutePath, skipHydration, and forceHydration to control what gets hydrated.
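To make the decision order easier to check, the same logic can be pulled out into a pure function. This is a sketch: the names shouldDehydrate, HydrationMeta, and QueryLike are hypothetical, and the pathname is passed in as an argument instead of being read from headers().

```typescript
type HydrationMeta = {
  shouldHydration?: (data: unknown) => boolean
  hydrationRoutePath?: string
  skipHydration?: boolean
  forceHydration?: boolean
}

// Only the parts of a query that the decision needs (hypothetical shape).
type QueryLike = {
  state: { data: unknown; error: unknown }
  meta?: HydrationMeta
}

// Mirrors the decision order of the shouldDehydrateQuery callback above.
function shouldDehydrate(query: QueryLike, pathname: string | null): boolean {
  if (query.state.error) return false
  if (!query.meta) return true
  const { shouldHydration, hydrationRoutePath, skipHydration, forceHydration } = query.meta

  if (forceHydration) return true
  if (hydrationRoutePath && pathname === hydrationRoutePath) {
    // On the matching route, hydrate unless the predicate vetoes it.
    return shouldHydration ? shouldHydration(query.state.data) : true
  }
  if (skipHydration) return false
  return shouldHydration?.(query.state.data) ?? false
}

const routeBoundQuery: QueryLike = {
  state: { data: 'This is A', error: null },
  meta: { hydrationRoutePath: '/A' },
}
const onOwnRoute = shouldDehydrate(routeBoundQuery, '/A')
const onOtherRoute = shouldDehydrate(routeBoundQuery, '/B')
```

With a route-bound query and no predicate, onOwnRoute is true and onOtherRoute is false, so the data for /A no longer shows up in the HTML of /B.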

Reference usage:

   defineQuery({
      queryKey: ['note', nid],
      meta: {
        hydrationRoutePath: routeBuilder(Routes.Note, { id: nid }),
        shouldHydration: (data: NoteWrappedPayload) => {
          const note = data?.data
          const isSecret = note?.secret
            ? dayjs(note?.secret).isAfter(new Date())
            : false
          return !isSecret
        },
      },
      queryFn: async ({ queryKey }) => {
        const [, id] = queryKey

        if (id === LATEST_KEY) {
          return (await apiClient.note.getLatest()).$serialized
        }
        const data = await apiClient.note.getNoteById(+queryKey[1], password!)

        return { ...data }
      },
    })

At this point you might ask whether it really needs to be this complicated. Can't we just create a new QueryClient inside the RootLayout component so that each request's data stays unpolluted? Indeed, that is the approach described in the React Query documentation, but it only fits traditional SSR architectures, and it has limitations of its own. A QueryClient created inside RootLayout cannot be reached from other Layouts, so data fetching in a child Layout has to create yet another QueryClient and wrap its subtree in another Hydrate component, which adds a lot of overhead.

React's cache function may solve this issue (it was expected in React 18.3 at the time of writing, and Next.js already supports it). Using cache(() => new QueryClient()) ensures that the same QueryClient is returned everywhere within the server render of a single request. Although this avoids cross-request state pollution, it gives up the request deduplication that a single shared instance provides under high concurrency, and the load caused by sending too many requests at once also has to be considered.
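The scoping idea can be imitated in plain TypeScript. This is only a sketch of the concept, not how React implements cache: RequestToken and ToyClient are hypothetical stand-ins for an incoming request and a QueryClient, and a WeakMap keyed by the request replaces React's per-render memoization.

```typescript
// Hypothetical stand-ins: RequestToken represents one incoming request,
// ToyClient represents a QueryClient with its own private cache.
type RequestToken = { id: number }

class ToyClient {
  readonly data = new Map<string, unknown>()
}

// One client per request object; entries vanish when the request is collected.
const clientsByRequest = new WeakMap<RequestToken, ToyClient>()

function getClientFor(request: RequestToken): ToyClient {
  let client = clientsByRequest.get(request)
  if (!client) {
    client = new ToyClient()
    clientsByRequest.set(request, client)
  }
  return client
}

const requestFromUserA: RequestToken = { id: 1 }
const requestFromUserB: RequestToken = { id: 2 }

// Any layout handling the same request gets the same client.
const sameWithinRequest = getClientFor(requestFromUserA) === getClientFor(requestFromUserA)
// Different requests never share a client.
const isolatedAcrossRequests = getClientFor(requestFromUserA) !== getClientFor(requestFromUserB)
```

The trade-off mentioned above is visible here too: since each request gets its own client, two concurrent requests for the same data cannot share one upstream fetch.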

I won't elaborate further on this here.

In summary, React Query still leaves many issues to deal with, and the added complexity prompted me to look at other solutions.

Jotai

I'm tired of writing. Let's break it down next time.

This article is synchronized and updated to xLog by Mix Space. The original link is https://innei.ren/posts/programming/nextjs-rsc-ssr-data-hydration-persistence