Thursday, September 1, 2022

[SOLVED] On x64, how does the Linux kernel access the data segment? Does it use -mcmodel=large during compilation?

Issue

I'm writing a minimal x86-64 kernel from scratch and I am having some design issues.

Following the comments and the link provided by stark, I have rephrased my question. I want to model my own kernel's design on the Linux kernel and would like some advice.

I know that compiled C++ code uses RIP-relative addressing by default to access the data segment of the executable (for all global/static variables). RIP-relative addressing is limited to a signed 32-bit offset, which leaves a maximum reach of 2GB in either direction from the instruction doing the access.
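To make the 2GB limit concrete, here is a minimal sketch of the reachability check implied by a signed 32-bit displacement. The function name reachable_rip_relative is made up for illustration; it is not something the compiler or the kernel provides. It assumes the displacement is measured from the address of the following instruction, which is how x86-64 defines RIP-relative operands.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true if `target` can be encoded as a signed
 * 32-bit displacement from `next_rip`, the address of the instruction that
 * follows the memory access. The arithmetic is unsigned on purpose so that
 * wraparound is well defined in C.
 */
bool reachable_rip_relative(uint64_t next_rip, uint64_t target)
{
    uint64_t delta = target - next_rip;            /* two's-complement distance */
    return delta <= UINT64_C(0x7fffffff)           /* forward: up to +2GB - 1 */
        || delta >= UINT64_C(0xffffffff80000000);  /* backward: down to -2GB */
}
```

With next_rip at 0xffffffff81000000, a target at 0xffffffff82000000 is in range, while a target at 0xffff888000000000 is not.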

I also know (from stark's comment) that the Linux kernel starts its code segment at 0xffff_ffff_8000_0000 (https://www.kernel.org/doc/html/latest/x86/x86_64/mm.html):

ffffffff80000000 |   -2    GB | ffffffff9fffffff |  512 MB | kernel text mapping, mapped to physical address 0

If the code segment of the Linux kernel is more than 2GB away from most of its data segment, how does it access that data other than with RIP-relative addressing?

I think that the -mcmodel=kernel code model can sign-extend a 32-bit absolute address to 64 bits, which allows the executable to access the upper 2GB of the virtual address space without using -mcmodel=large. That doesn't help, since the data segment of the kernel is not located in that region. Meanwhile, -mcmodel=large makes the executable access the data segment with 64-bit absolute addresses, which slows down the kernel and makes it much bigger.
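The sign-extension behaviour described here can be sketched in a few lines of C. This is only an illustration of the arithmetic, not code taken from gcc or Linux, and the names sign_extend_32 and fits_sign_extended_32 are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Sign-extend a 32-bit immediate to 64 bits: if bit 31 is set, the upper
 * 32 bits of the result are all ones, otherwise they are all zeros.
 */
uint64_t sign_extend_32(uint32_t imm)
{
    if (imm & UINT32_C(0x80000000))
        return UINT64_C(0xffffffff00000000) | imm;
    return imm;
}

/*
 * An address can be used as a sign-extended 32-bit absolute address only
 * if extending its low 32 bits reproduces the full 64-bit value.
 */
bool fits_sign_extended_32(uint64_t addr)
{
    return sign_extend_32((uint32_t)addr) == addr;
}
```

Under this check, 0xffffffff81000000 (inside the top 2GB) fits, while 0xffff800000000000 does not, because its low 32 bits extend to 0 rather than to the original address.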

How does the Linux kernel access the data segment and does it use a large code model to access the 0xffff_8000_0000_0000 region of the virtual address space?


Solution

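I think the confusion is between the gcc code model and the 64-bit CPU's MMU. Linux on x86-64 is built with -mcmodel=kernel, not -mcmodel=large. The kernel code model generates code that uses signed 32-bit offsets, which means all symbols in the kernel, code and data alike, must fit in the top 2GB of the address space. The kernel's data sections are linked there alongside its text, so they stay within reach. This does not change the fact that virtual address pointers in the kernel are 64-bit, of which 48 bits are significant with 4-level paging (57 with 5-level paging). Anything in the kernel or the current user space, including the 0xffff_8000_0000_0000 region, can therefore be accessed indirectly through a pointer, via the page tables and MMU.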
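As a rough sketch of the two kinds of access, the C fragment below contrasts a global reached through its symbol with an address computed at run time. Every name in it is hypothetical. The base 0xffff888000000000 is an assumption taken from the direct-mapping row of the kernel's documented 4-level layout, and the real base can differ (for example when it is randomized at boot).

```c
#include <stdint.h>

/*
 * ASSUMPTION: start of the direct mapping of physical memory under 4-level
 * paging, per the kernel's x86_64 memory map documentation. Illustrative
 * only; the real kernel keeps this base in a variable.
 */
#define SKETCH_DIRECT_MAP_BASE UINT64_C(0xffff888000000000)

/*
 * A global lives in the kernel image's data section. Under -mcmodel=kernel
 * the compiler can reach it by symbol with a 32-bit offset, because the
 * whole image is linked into the top 2GB.
 */
int sketch_boot_count;

void sketch_note_boot(void)
{
    sketch_boot_count++;   /* symbol-based access, no 64-bit immediate needed */
}

/*
 * Memory outside the image is not named by a symbol. Its address is
 * computed at run time and held in an ordinary 64-bit value, so the code
 * model's 2GB limit does not apply to it.
 */
uint64_t sketch_phys_to_virt(uint64_t phys)
{
    return SKETCH_DIRECT_MAP_BASE + phys;
}
```

The point of the split is that the code model only constrains where symbols may be placed at link time. It puts no limit on the values a 64-bit pointer can hold at run time.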



Answered By - stark
Answer Checked By - Robin (WPSolving Admin)