How to Enable 64KB Page Sizes on 4KB Kernel Systems: Two Approaches

Introduction

Running applications with larger base page sizes can yield performance benefits by reducing TLB misses and improving memory access patterns. However, many Linux kernels are built with a 4KB base page size, and on x86 a 4KB base page is the only option the hardware offers natively. At the 2026 Linux Storage, Filesystem, Memory Management, and BPF Summit, two distinct methods were discussed to allow processes to use 64KB pages even when the kernel itself uses 4KB pages. This guide walks through both approaches: per-process page size selection and bringing 64KB pages to x86 systems. Whether you are a kernel developer or a system administrator tuning for high-performance workloads, follow these steps to evaluate and implement the right solution for your environment.

What You Need

Step-by-Step Implementation

We cover two independent paths. Choose the one that fits your architecture and requirements.

Path A: Per-Process Page Size Selection

This approach lets individual processes request a different base page size (e.g., 64KB) while the kernel continues to use the default 4KB pages. It requires kernel support and a way for the process to indicate its preference.

  1. Step 1: Obtain the kernel patches for per-process page size. The summit sessions described prototype work to extend the madvise() or prctl() system calls. Check the Linux memory management mailing list or the kernel’s -mm tree for patches labeled something like "per-process base page size". As of early 2026, this is experimental—so be prepared to apply patches manually.
  2. Step 2: Apply patches and rebuild the kernel. Download the kernel source, apply the patch series with git am or patch -p1. Configure the kernel to enable the feature (likely under "Processor type and features" or "Memory management options"). Build and install the kernel: make -j$(nproc) && sudo make modules_install install. Reboot into the new kernel.
  3. Step 3: Create a test program that requests 64KB pages. Write a simple C program (or modify your workload) that calls the new interface—for instance, prctl(PR_SET_PAGE_SIZE, 65536) or a similar suggested API. The process must call this early, before allocating significant memory. Example snippet:
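    #include <stdio.h>
    #include <sys/prctl.h>

    /* PR_SET_PAGE_SIZE is a placeholder name used in this guide, not a mainline
       prctl option; take the real name and value from the patched headers. */
    #ifndef PR_SET_PAGE_SIZE
    #error "PR_SET_PAGE_SIZE is undefined: build against the patched kernel headers"
    #endif

    int main(void) {
        /* Request 64KB pages early, before any significant allocation. */
        if (prctl(PR_SET_PAGE_SIZE, 65536) == -1) {
            perror("prctl");
            return 1;
        }
        /* Subsequent mmap() and malloc() calls are expected to use 64KB pages. */
        return 0;
    }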
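  4. Step 4: Compile and run the test. Use gcc -o test test.c and run with ./test. Verify the page sizes used by the process with its smaps file or a custom tool. Because /proc/self/smaps describes whichever process reads it, either have the program read its own /proc/self/smaps before exiting, or keep it running and inspect /proc/<pid>/smaps from another shell. You should see KernelPageSize entries of 64 kB for newly allocated regions.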
  5. Step 5: Measure performance. Run your actual workload with and without the page-size hint. Compare metrics like runtime, TLB misses (using perf stat -e dTLB-load-misses), and memory consumption. Note that the kernel still uses 4KB pages internally; the larger pages are only for user-space mappings. This may lead to increased memory usage due to fragmentation—monitor carefully.

Path B: Bringing 64KB Pages to x86 Systems (Through a Compatibility Mode)

The second approach aims to make the entire system behave as if it had 64KB base pages, even on x86 hardware that natively only offers 4KB pages. This is achieved via a kernel mechanism that emulates large pages in software, possibly using contiguity hints or special page table tricks. The summit discussed a scheme where the kernel groups 16 consecutive 4KB pages into a single 64KB “pseudo-page.”

  1. Step 1: Check hardware and kernel support. This feature requires a kernel that can manage large page coalescing. Look for config options like CONFIG_64KB_PAGE_EMULATION or similar in the memory management menu. As of the summit, it was not yet upstream—you may need to retrieve patches from the same mailing list as Path A but labelled for x86 64KB emulation.
  2. Step 2: Apply the patches and rebuild. As in Path A, apply the relevant patch series, enable the config option, build, install, and reboot. Note that this changes the page size that user space sees from 4KB to an emulated 64KB across the whole system. Internally the kernel still manages memory in 4KB units, but user-visible mappings become 64KB-aligned and 64KB-sized.
  3. Step 3: Verify system-wide page size. After booting, check getconf PAGE_SIZE. If successful, it should report 65536. Also examine /proc/meminfo for HugePages_* fields—this mechanism may coexist with or replace the traditional huge pages subsystem.
  4. Step 4: Adapt any system tools or libraries. Some software hard-codes 4KB page sizes (e.g., in memory allocators or file systems). Recompile critical libraries (glibc, jemalloc) to handle 64KB pages. Or, use environment variables like LD_PRELOAD with a wrapper that corrects assumptions. This is a major compatibility step—expect more work.
  5. Step 5: Benchmark and tune. Run your complete application stack. Pay attention to memory overhead: each memory mapping is rounded up to a multiple of 64KB, so many small mappings waste space. Tune allocation strategies accordingly. Use perf stat to compare TLB performance. This approach is more invasive but can benefit all processes without code changes.

Tips for Success

By following either path, you can unlock the performance advantages of 64KB base pages on systems where the kernel traditionally uses 4KB. Choose based on whether you need per-process control or system-wide compatibility, and always validate with real-world workloads.
