Kernel Page Tables

  • The kernel maintains a set of page tables for its own use
    • It is rooted at the “master kernel Page Global Directory”
  • The highest entries in the master kernel Page Global Directory serve as the reference model for the corresponding entries in the Page Global Directory of every regular process
  • Right after the kernel image is loaded into memory, the CPU is still in real mode and paging is not enabled
  • The kernel therefore sets up its own address space in 2 steps
    1. Create a limited address space to store the kernel and its data structures
      • Contains the code and data segments, the initial page tables, and 128 KB for dynamic data structures
    2. Use all of the existing RAM and set up the page tables properly (PAGE_OFFSET and above)

Provisional kernel Page Tables (Step 1: Still Real Mode no PAE)

  • Note: Assume the kernel's segments, the provisional Page Tables, and the 128 KB memory area fit in the first 8 MB of RAM (two Page Tables are needed to map 8 MB). The mapping must be easy to address in both real mode and protected mode
  • The linear address space is set up as follows
    1. The provisional PGD is initialized statically at compilation
      • Contained in the swapper_pg_dir variable
    2. The provisional Page Tables are initialized by startup_32()
      • Stored starting at pg0, right after _end (end of the kernel's uninitialized data)
  • This mapping is achieved by filling all entries in swapper_pg_dir with zeros except for entries 0, 1, 768 (0x300), and 769 (0x301), which point to the two Page Tables and cover the address ranges listed below
    • The following flags must be set: Present, Read/Write, and User/Supervisor
    • The following flags must be cleared: Accessed, Dirty, PCD, PWT, and Page Size
  • Recall that in real mode addresses map one to one onto physical addresses.
    • Therefore the mappings to be made are
      • linear 0x00000000 through 0x007fffff to physical 0x00000000 through 0x007fffff (identity mapping)
      • linear 0xc0000000 through 0xc07fffff to physical 0x00000000 through 0x007fffff (kernel mode addresses)
  • First this provisional mapping is created
  • Second all other page tables are initialized (Step 2)
  • Recall that the provisional mapping is set up by the startup_32() function
    • This function also enables the paging unit
    • It does so by loading the physical address of swapper_pg_dir into cr3 and setting the PG (paging) flag in cr0
movl $swapper_pg_dir-0xc0000000,%eax
movl %eax,%cr3   /* set the page table pointer.. */
movl %cr0,%eax
orl $0x80000000,%eax
movl %eax,%cr0   /* ..and set paging (PG) bit */
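
To make the choice of entries 0, 1, 768, and 769 concrete, here is a minimal user-space sketch in C. It is not kernel code: fill_provisional_pgd() and the PDE_* names are hypothetical, and the sketch only assumes the non-PAE layout described in these notes (1024 entries of 4 MB each, with flag bits Present = 0x001, Read/Write = 0x002, User/Supervisor = 0x004).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PGDIR_SHIFT   22            /* without PAE each PGD entry covers 4 MB */
#define PTRS_PER_PGD  1024
#define PAGE_OFFSET   0xc0000000u

/* Page Directory entry flag bits (hypothetical names for this sketch). */
#define PDE_PRESENT   0x001u
#define PDE_RW        0x002u
#define PDE_USER      0x004u

/* Index of the PGD entry that translates a given linear address. */
static unsigned pgd_index(uint32_t linear)
{
    return linear >> PGDIR_SHIFT;
}

/*
 * Hypothetical helper: zero the directory, then point entries 0, 1, 768
 * and 769 at the two provisional Page Tables. pt_phys[] holds the
 * page-aligned physical addresses of those two tables.
 */
static void fill_provisional_pgd(uint32_t pgd[PTRS_PER_PGD],
                                 const uint32_t pt_phys[2])
{
    const uint32_t flags = PDE_PRESENT | PDE_RW | PDE_USER;  /* 0x007 */
    const unsigned kernel_base = pgd_index(PAGE_OFFSET);     /* 768 */

    memset(pgd, 0, PTRS_PER_PGD * sizeof pgd[0]);
    for (unsigned i = 0; i < 2; i++) {
        assert((pt_phys[i] & 0xfffu) == 0);         /* must be page aligned */
        pgd[i] = pt_phys[i] | flags;                /* identity mapping */
        pgd[kernel_base + i] = pt_phys[i] | flags;  /* PAGE_OFFSET mapping */
    }
}
```

Because the same two Page Tables are referenced from both pairs of entries, the low linear range and the range starting at PAGE_OFFSET resolve to the same first 8 MB of physical memory.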

The highest 128 MB of linear addresses are left available for several kinds of mappings (see the sections “Fix-Mapped Linear Addresses” later in this chapter and “Linear Addresses of Noncontiguous Memory Areas” in Chapter 8). The kernel address space left for mapping the RAM is thus 1 GB – 128 MB = 896 MB.

Final Kernel Page Table when RAM is < 896 MB (Step 2)

  • The final mapping provided by the kernel page tables must transform linear addresses starting from 0xc0000000 into physical addresses starting from 0
    • This is done using two macros
      • __pa: converts a linear address starting from PAGE_OFFSET into the corresponding physical address
      • __va: converts a physical address into the corresponding linear address starting from PAGE_OFFSET
  • The master kernel Page Global Directory is still stored in swapper_pg_dir. It is initialized by paging_init(), which does the following
    1. Invokes pagetable_init() to set up the Page Table entries properly
      • What pagetable_init() does depends on the system configuration, namely the amount of RAM and the CPU model
      • In this case there is less than 896 MB of RAM
    2. Writes the physical address of swapper_pg_dir into the cr3 control register
    3. If the CPU supports PAE and the kernel is compiled with PAE support, sets the PAE flag in the cr4 control register
    4. Invokes __flush_tlb_all() to invalidate all TLB entries
  • In this case PAE is not needed, so the initialization of swapper_pg_dir is equivalent to the following
pgd = swapper_pg_dir + pgd_index(PAGE_OFFSET);  /* entry 768 */
phys_addr = 0x00000000; /* start counting from physical address 0 */
while (phys_addr < (max_low_pfn * PAGE_SIZE)) { /* until the end of low memory */
    pmd = one_md_table_init(pgd); /* returns pgd itself */
    set_pmd(pmd, __pmd(phys_addr | pgprot_val(__pgprot(0x1e3))));
    /* 0x1e3 == Present, Accessed, Dirty, Read/Write, Page Size, Global */
    phys_addr += PTRS_PER_PTE * PAGE_SIZE; /* 0x400000 */
    /* PTRS_PER_x is the number of entries in a table at level x */
    ++pgd; /* move on to the next Page Global Directory entry */
}
  • The identity mapping of the first megabytes of physical memory (8 MB in the example above), built by startup_32(), is required to complete the initialization phase of the kernel.
  • When this mapping is no longer necessary, the kernel clears the corresponding page table entries by invoking the zap_low_mappings() function.
  • Note: this description does not yet cover fix-mapped linear addresses
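
The __pa/__va conversion and the 4 MB stepping of the loop can be sketched in user-space C as follows. This is an illustration only: va_sketch(), pa_sketch(), and large_entries_needed() are hypothetical stand-ins rather than kernel functions, and the constants assume the non-PAE case covered in this section.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_OFFSET   0xc0000000u
#define PAGE_SIZE     4096u
#define PTRS_PER_PTE  1024u          /* non-PAE: entries per Page Table */
#define LOWMEM_LIMIT  0x38000000u    /* 896 MB of directly mapped RAM */

/* Stand-in for __va(): physical address to kernel linear address. */
static uint32_t va_sketch(uint32_t phys)
{
    assert(phys < LOWMEM_LIMIT);     /* only the direct mapping is covered */
    return phys + PAGE_OFFSET;
}

/* Stand-in for __pa(): kernel linear address to physical address. */
static uint32_t pa_sketch(uint32_t linear)
{
    assert(linear >= PAGE_OFFSET);
    return linear - PAGE_OFFSET;
}

/*
 * Number of 4 MB entries the initialization loop writes when low memory
 * ends at page frame max_low_pfn: one iteration per started 4 MB chunk.
 */
static unsigned large_entries_needed(uint32_t max_low_pfn)
{
    const uint64_t bytes = (uint64_t)max_low_pfn * PAGE_SIZE;
    const uint64_t step = (uint64_t)PTRS_PER_PTE * PAGE_SIZE;  /* 0x400000 */

    return (unsigned)((bytes + step - 1) / step);
}
```

For example, a machine with 128 MB of RAM has max_low_pfn = 32768, so the loop runs 32 times and fills Page Global Directory entries 768 through 799.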

Final Kernel Page Table when RAM is between 896 MB and 4GB

  • RAM cannot be mapped entirely into the kernel linear address space
  • The solution is to map an 896 MB window of RAM into the kernel address space, as before
    • When a program needs to address RAM outside those 896 MB, some other linear address interval must be mapped to the required RAM
      • This means changing the value of some page table entries
      • (Dynamic remapping, covered in Chapter 8)
  • The 896 MB that are mapped are initialized in the same way as above
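
As a rough illustration of the split, the sketch below computes how much RAM falls inside the 896 MB window and how much must be reached through dynamic remapping. It is plain user-space C; struct mem_split and split_ram() are hypothetical names, and the only inputs are the figures from these notes (a 1 GB kernel linear range with the top 128 MB reserved).

```c
#include <assert.h>
#include <stdint.h>

#define MB            (1024ull * 1024ull)
#define KERNEL_SPACE  (1024 * MB)                     /* kernel linear range */
#define RESERVED_TOP  (128 * MB)                      /* kept for other mappings */
#define LOWMEM_LIMIT  (KERNEL_SPACE - RESERVED_TOP)   /* 896 MB */

/* How a given amount of RAM divides between the two kinds of access. */
struct mem_split {
    uint64_t direct;    /* bytes covered by the permanent kernel mapping */
    uint64_t remapped;  /* bytes reachable only through dynamic remapping */
};

static struct mem_split split_ram(uint64_t ram_bytes)
{
    struct mem_split s;

    assert(ram_bytes > 0);
    s.direct = ram_bytes < LOWMEM_LIMIT ? ram_bytes : LOWMEM_LIMIT;
    s.remapped = ram_bytes - s.direct;
    return s;
}
```

With 2 GB installed, for instance, 896 MB is mapped directly and the remaining 1152 MB is left for dynamic remapping; with 512 MB installed nothing is left over, which is the previous case.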

Final Kernel Page Table when RAM is > 4GB

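  • This case assumes 3 things
    1. The CPU supports PAE
    2. More than 4 GB of RAM is installed
    3. The kernel is compiled with PAE support
  • With PAE a three-level paging model is used
    • Linear addresses are still 32 bits wide, so as before only an 896 MB window of RAM is mapped directly and the rest is handled by dynamic remapping
    • The Page Global Directory is initialized as follows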
pgd_idx = pgd_index(PAGE_OFFSET); /* 3 */
for (i=0; i<pgd_idx; i++)
    set_pgd(swapper_pg_dir + i, __pgd(__pa(empty_zero_page) + 0x001));
    /* 0x001 == Present */
pgd = swapper_pg_dir + pgd_idx;
phys_addr = 0x00000000;

for (; i<PTRS_PER_PGD; ++i, ++pgd) {
    pmd = (pmd_t *) alloc_bootmem_low_pages(PAGE_SIZE);
    set_pgd(pgd, __pgd(__pa(pmd) | 0x001)); /* 0x001 == Present */

    if (phys_addr < max_low_pfn * PAGE_SIZE)
        for (j=0; j < PTRS_PER_PMD /* 512 */
                && phys_addr < max_low_pfn*PAGE_SIZE; ++j) {
            set_pmd(pmd + j, __pmd(phys_addr |  /* j-th PMD entry */
                    pgprot_val(__pgprot(0x1e3))));
            /* 0x1e3 == Present, Accessed, Dirty, Read/Write, Page Size, Global */
            phys_addr += PTRS_PER_PTE * PAGE_SIZE; /* 0x200000 */
        }
}
swapper_pg_dir[0] = swapper_pg_dir[pgd_idx];

  • The kernel fills the first three entries in the Page Global Directory, which correspond to the user linear address space, with the address of an empty page (empty_zero_page).
  • The fourth entry is initialized with the address of a Page Middle Directory (pmd) allocated by invoking alloc_bootmem_low_pages().
  • The first 448 entries in the PMD are filled with the physical addresses of the first 896 MB of RAM.
    • (there are 512 entries, but the last 64 are reserved for noncontiguous memory allocation; see the section “Noncontiguous Memory Area Management” in Chapter 8)
  • PAE also supports large 2 MB pages and global pages. Whenever possible, Linux uses large pages to reduce the number of Page Tables, so 2 MB pages are chosen here
  • The fourth Page Global Directory entry is then copied into the first entry, so as to mirror the mapping of the low physical memory in the first 896 MB of the linear address space.
    • This mapping is required to complete the initialization of SMP (symmetric multiprocessing) systems. When it is no longer necessary, the kernel clears the corresponding page table entries by invoking the zap_low_mappings() function, as in the previous cases
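
The figures quoted above (PGD index 3 for PAGE_OFFSET, 448 PMD entries for 896 MB) follow from how PAE splits a 32-bit linear address. The user-space C sketch below reproduces that arithmetic. The pae_* helper names are hypothetical, and the shifts assume the layout described in these notes: 4 PGD entries of 1 GB each and 512 PMD entries of 2 MB each.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_OFFSET       0xc0000000u
#define PAE_PGDIR_SHIFT   30      /* each PGD entry covers 1 GB */
#define PAE_PMD_SHIFT     21      /* each PMD entry covers a 2 MB page */
#define PAE_PTRS_PER_PGD  4u
#define PAE_PTRS_PER_PMD  512u

/* Index into the 4-entry Page Global Directory. */
static unsigned pae_pgd_index(uint32_t linear)
{
    return linear >> PAE_PGDIR_SHIFT;
}

/* Index into the 512-entry Page Middle Directory selected by the PGD. */
static unsigned pae_pmd_index(uint32_t linear)
{
    return (linear >> PAE_PMD_SHIFT) & (PAE_PTRS_PER_PMD - 1);
}

/* Number of 2 MB PMD entries needed to cover the given number of bytes. */
static unsigned pae_pmd_entries_for(uint64_t bytes)
{
    const uint64_t large_page = 1ull << PAE_PMD_SHIFT;
    const unsigned n = (unsigned)((bytes + large_page - 1) / large_page);

    assert(n <= PAE_PTRS_PER_PGD * PAE_PTRS_PER_PMD);  /* fits in 4 GB */
    return n;
}
```

Covering 896 MB takes 448 entries, so the last directly mapped linear address, 0xf7ffffff, lands in PMD entry 447 of the fourth Page Global Directory entry.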
