Issue
While debugging Linux booting on a virtual machine, I found that the stack pointer registers look strange. I was in a printk routine early in the start_kernel function (the serial port is not connected, so the data is just written to the log buffer at this point).
When I'm in the function vsnprintf (linux-5.4.21) and examine the stack pointers, I see the following (this is an arm64 virtual machine running on QEMU).
(gdb) info reg sp
sp 0xffffffc0105d3bf0 0xffffffc0105d3bf0
(gdb) info reg SP_EL2
SP_EL2 0x407cd70 67620208
(gdb) info reg SP_EL1
SP_EL1 0x0 0
(gdb) info reg SP_EL0
SP_EL0 0xffffffc0105d9f00 -274603335936
In my case the kernel runs at EL2, so I expected the sp value to be equal to SP_EL2, but it is not. Instead, SP_EL2 still holds the value set by the u-boot program, a physical address (u-boot was running in the 0x4000000 ~ range, and the Linux kernel is running in the 0x80000000 ~ range, with only 8MB).
I know SP_EL0 contains the address of init_task when the kernel is running at EL2. But why is SP_EL2 not being used, and what is this real sp value which is decremented as I go down the function calls?
ADD: I checked that the SPSel register is set to 1, so SP_EL2 should be in use. So my question is: why isn't SP_EL2 being used, and what is this sp register which is actually being used?
Solution
This is just a minor bug in QEMU's display of the SP_ELx system registers to the gdbstub: you should ignore the SP_ELx value for the exception level the guest is currently at, and look at the normal sp register instead.
The reason for the bug is architectural: code running on the guest CPU cannot access the SP_ELx system register for an exception level equal to or higher than the one it is running at. So EL2 code cannot itself read SP_EL2; only EL3 can use the SP_EL2 system register. The only way EL2 code can look at its own stack pointer is to use SP. QEMU takes advantage of this to avoid the extra work of keeping the SP_ELx system register value that corresponds to the current SP in sync with the real SP register. But the gdbstub accessors let you, as the gdb user, read system registers that the currently running code cannot access. Mostly that works OK, but occasionally you get nonsense results, like this one.
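To make the "not kept in sync" point concrete, here is a toy Python model of the behaviour described above. It is only a sketch and not QEMU's actual code: the class `ToyCpu` and its methods are hypothetical names, and the assumption that the banked copy is refreshed only on an exception level change is a simplification of what the answer describes. The addresses are the ones from the question.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCpu:
    """Toy model of one AArch64 CPU's stack pointers (hypothetical, not QEMU code)."""
    current_el: int = 2   # exception level the guest is running at
    spsel: int = 1        # 1 selects SP_ELx for the current level, 0 selects SP_EL0
    sp: int = 0           # the live stack pointer the guest actually uses
    banked: dict = field(default_factory=lambda: {0: 0, 1: 0, 2: 0, 3: 0})

    def active_bank(self) -> int:
        """Index of the banked SP_ELx copy that the live sp corresponds to."""
        return self.current_el if self.spsel else 0

    def save_sp(self) -> None:
        """Copy the live sp into its banked slot."""
        self.banked[self.active_bank()] = self.sp

    def restore_sp(self) -> None:
        """Load the live sp from the banked slot selected by EL and SPSel."""
        self.sp = self.banked[self.active_bank()]

    def switch_el(self, new_el: int, spsel: int = 1) -> None:
        """Assumed simplification: banked copies are synced only on a level change."""
        self.save_sp()
        self.current_el = new_el
        self.spsel = spsel
        self.restore_sp()

    def debugger_read(self, name: str) -> int:
        """What a naive debugger stub would report for 'sp' or 'SP_ELn'."""
        if name == "sp":
            return self.sp
        return self.banked[int(name[-1])]  # "SP_EL2" -> banked[2]


# The bootloader's stack is live at EL2; a round trip through EL3 syncs banked[2].
cpu = ToyCpu(current_el=2, spsel=1, sp=0x407cd70)
cpu.switch_el(3)
cpu.switch_el(2)

# The kernel then installs its own stack without leaving EL2: only the live sp changes.
cpu.sp = 0xffffffc0105d3bf0

print(hex(cpu.debugger_read("sp")))      # 0xffffffc0105d3bf0 (the value to trust)
print(hex(cpu.debugger_read("SP_EL2")))  # 0x407cd70 (stale banked copy)
```

In this model the stale SP_EL2 value only catches up once `save_sp()` runs, i.e. the next time the CPU leaves EL2, which is why the debugger should read `sp` for the level the guest is currently at.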
Answered By - Peter Maydell