Issue
64-bit Linux on x86_64 employs 4 segment descriptors: code and data segments for userspace and for kernelspace.
AFAIK a call to the system API from a 64-bit process is made by executing the syscall instruction from the ring 3 (usermode) code segment. Then sysret, executed in kernel mode, returns control to the calling 64-bit process, switching the CS segment selector back to point to the 64-bit usermode code segment.
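To make the "CS segment selector" part concrete, here is a minimal sketch that splits a selector into its architectural fields (descriptor index, table indicator, requested privilege level). The function and constant names are hypothetical, and the two selector values are an assumption taken from my recollection of the x86-64 Linux kernel's segment definitions, not from the question; check them against your own kernel before relying on them.

```python
from typing import NamedTuple


class Selector(NamedTuple):
    index: int  # descriptor index within the GDT or LDT
    ti: int     # table indicator: 0 = GDT, 1 = LDT
    rpl: int    # requested privilege level (0 = kernel, 3 = user)


def decode_selector(sel: int) -> Selector:
    """Split a 16-bit x86 segment selector into its three fields."""
    return Selector(index=sel >> 3, ti=(sel >> 2) & 1, rpl=sel & 3)


# ASSUMPTION: user-space CS values on x86-64 Linux (not stated in the question).
ASSUMED_USER_CS_64 = 0x33  # 64-bit user code segment
ASSUMED_USER_CS_32 = 0x23  # 32-bit (compat mode) user code segment

if __name__ == "__main__":
    for name, sel in [("64-bit user CS", ASSUMED_USER_CS_64),
                      ("32-bit user CS", ASSUMED_USER_CS_32)]:
        print(name, hex(sel), decode_selector(sel))
```

Under that assumption, both selectors have RPL 3 and differ only in which GDT entry they pick, which is what the mode switch between 64-bit and compat-mode user code comes down to.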
What about 32-bit processes? According to the Intel SDM, syscall is not supported in IA-32e compatibility mode.
P.S. I'm aware that on 64-bit Windows, 32-bit applications are supported via the Wow64 subsystem, and a call into the system API is made by switching the logical CPU/core from 32-bit compatibility mode to 64-bit mode in userspace. The call into the system API is actually made from 64-bit mode, and upon returning from it the logical CPU is switched back to compatibility mode.
Solution
See https://blog.packagecloud.io/the-definitive-guide-to-linux-system-calls/.
The recommended way to make system calls from 32-bit code in Linux is to call into the VDSO, a "library" of code+data that the kernel maps into the address-space of every executable. The kernel chooses at bootup which instructions to put into it, depending on what the CPU supports.
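As a quick way to see that mapping, the sketch below looks for the [vdso] entry in /proc/self/maps. The helper names are hypothetical, and it assumes a Linux system with procfs mounted; it only locates the mapping and does not call into it.

```python
from typing import Iterable, Optional, Tuple


def parse_vdso_range(lines: Iterable[str]) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the [vdso] mapping from maps-style lines, or None."""
    for line in lines:
        fields = line.split()
        # The mapping name, when present, is the last whitespace-separated field.
        if fields and fields[-1] == "[vdso]":
            start, end = (int(addr, 16) for addr in fields[0].split("-"))
            return start, end
    return None


def find_vdso(maps_path: str = "/proc/self/maps") -> Optional[Tuple[int, int]]:
    """Locate the VDSO in this process, or return None if it can't be found."""
    try:
        with open(maps_path) as maps:
            return parse_vdso_range(maps)
    except OSError:
        # No procfs (or not Linux): report "not found" rather than failing.
        return None


if __name__ == "__main__":
    vdso = find_vdso()
    if vdso is None:
        print("no [vdso] mapping found")
    else:
        print("vdso mapped at %#x-%#x" % vdso)
```

This reports the mapping for the Python interpreter's own process; any other process on the same kernel gets a VDSO too, possibly at a different address.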
64-bit kernel, 64-bit user-space (64-bit mode)
- Glibc uses syscall directly, only calling a VDSO wrapper for system calls like clock_gettime and gettimeofday that can run purely in user-space. All x86-64 CPUs support syscall from 64-bit user-space.
64-bit kernel, 32-bit user-space (compat mode sub-mode of long mode)
- On Intel CPUs, x86-64 Linux's 32-bit VDSO system-call wrapper uses sysenter. AMD only supports sysenter in legacy mode, if at all.
- On AMD CPUs, x86-64 Linux's 32-bit VDSO system-call wrapper uses syscall. The 64-bit kernel side has similar semantics between 64-bit and compat mode user-space. Intel CPUs only support syscall in full 64-bit mode.
- I think all x86-64 CPUs support one or the other, so the fallback to slow int 0x80 is never needed. I don't know which one is supported by CPUs from Via or Zhaoxin or other vendors.
32-bit kernels (legacy mode)
- (Intel and AMD): The VDSO uses sysenter if available. Intel CPUs were the first to support this. AMD added support for sysenter (in legacy mode only, not compat mode) to their CPUs some time after adding legacy-mode syscall.
- Otherwise it uses int 0x80. Only ancient CPUs from either vendor are stuck with this.
AMD CPUs support syscall in legacy mode with different semantics from long mode, but Linux doesn't use that even if available. According to kernel comments in entry_64_compat.S, which defines the entry points from compat mode into a 64-bit kernel, Linux disables the SYSCALL instruction on 32-bit kernels because the SYSCALL instruction in legacy/native 32-bit mode (as opposed to compat mode) is sufficiently poorly designed as to be essentially unusable. This is part of a long comment on the compat-mode entry point for syscall on AMD CPUs, which Linux does use.
Intel CPUs support sysenter from 64-bit user-space, but Linux never uses that, only syscall. (Intel manual: https://www.felixcloutier.com/x86/sysenter)
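The choices described so far can be condensed into a small lookup. This is only a sketch of the summary in this answer, not the kernel's actual selection code: the function name and parameters are hypothetical, and vendors other than Intel and AMD are rejected because I don't know what their CPUs support in compat mode.

```python
def linux_syscall_instruction(kernel_bits: int, user_bits: int,
                              vendor: str, has_sysenter: bool = True) -> str:
    """Return the instruction Linux user-space normally uses to enter the kernel.

    has_sysenter only matters for 32-bit kernels, where ancient CPUs lack it.
    """
    vendor = vendor.lower()
    if kernel_bits == 64 and user_bits == 64:
        # All x86-64 CPUs support syscall from 64-bit user-space.
        return "syscall"
    if kernel_bits == 64 and user_bits == 32:
        # Compat mode: each vendor only supports its own fast instruction.
        if vendor == "intel":
            return "sysenter"
        if vendor == "amd":
            return "syscall"
        raise ValueError("compat-mode support unknown for vendor %r" % vendor)
    if kernel_bits == 32 and user_bits == 32:
        # Legacy mode: sysenter if available, else the slow int 0x80 path.
        return "sysenter" if has_sysenter else "int 0x80"
    raise ValueError("unsupported combination: %d-bit user-space on a %d-bit kernel"
                     % (user_bits, kernel_bits))


if __name__ == "__main__":
    for args in [(64, 64, "intel"), (64, 32, "intel"),
                 (64, 32, "amd"), (32, 32, "amd")]:
        print(args, "->", linux_syscall_instruction(*args))
```

Note that int 0x80 still works from 32-bit code under every kernel listed here; the lookup only reports what the VDSO (or glibc, in 64-bit mode) would pick.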
The history here is that before AMD64 existed, both vendors added their own fast-system-call instructions as extensions. AMD's was apparently not well designed, which is why it has different kernel-side semantics in 64-bit mode.
For x86-64, each vendor kept more support for their own fast-system-call instruction across modes, although AMD later added support for sysenter in legacy mode because at least some OSes (like Linux) weren't using syscall in 32-bit kernels.
Related
- https://blog.packagecloud.io/the-definitive-guide-to-linux-system-calls/
- OsDev syscall/sysret and sysenter/sysexit instructions enabling all the various choices, including fun facts like that an invalid-instruction trap was the fastest way into the kernel on 386, faster than int, and Windows actually used it.
- Syscall or sysenter on 32 bits Linux?
- https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/programmer-references/24594.pdf#page=510 AMD documentation for syscall. And scroll down a couple pages in the PDF for sysenter, where it still clearly states in this 2023 manual that sysenter is only supported in legacy mode on AMD CPUs.
- What is better "int 0x80" or "syscall" in 32-bit code on Linux?
- Intel x86 vs x64 system call - quotes the Linux kernel source's if(vdso32_syscall()) etc. that chooses which system-call instruction sequence to use for __kernel_vsyscall.
- How to invoke a system call via syscall or sysenter in inline assembly?
- Fastest Linux system call shows some of the kernel side of sysenter under a 64-bit kernel.
- Why int80h instead of sysenter is used to invoke system calls? - an old VMware on an old Intel CPU apparently didn't expose sysenter support. But that was legacy mode.
- https://reverseengineering.stackexchange.com/questions/16454/struggling-between-syscall-or-sysenter-windows has an answer describing what's supported by Intel vs. AMD in what mode, but doesn't say what Linux actually does.
I didn't think any of those explained the compat mode vs. legacy mode differences clearly enough, so they aren't duplicates.
Answered By - Peter Cordes Answer Checked By - David Marino (WPSolving Volunteer)