Issue
I've been looking at a tutorial for assembly, and I'm trying to get a hello world program to run. I am using Bash on Ubuntu on Windows.
Here is the assembly:
section .text
global _start ;must be declared for linker (ld)
_start: ;tells linker entry point
mov edx,len ;message length
mov ecx,msg ;message to write
mov ebx,1 ;file descriptor (stdout)
mov eax,4 ;system call number (sys_write)
int 0x80 ;call kernel
mov eax,1 ;system call number (sys_exit)
int 0x80 ;call kernel
section .data
msg db 'Hello, world!', 0xa ;string to be printed
len equ $ - msg ;length of the string
I am using these commands to create the executable:
nasm -f elf64 hello.asm -o hello.o
ld -o hello hello.o -m elf_x86_64
And I run it using:
./hello
The program then seems to run without a segmentation fault or error, but it produces no output.
I can't figure out why the code won't produce an output, but I wonder if using Bash on Ubuntu on Windows has anything to do with it? Why doesn't it produce output and how can I fix it?
Solution
Related: WSL2 does allow 32-bit user-space programs, WSL1 doesn't. See Does WSL 2 really support 32 bit program? re: making sure you're actually using WSL2. The rest of this answer was written before WLS2 existed.
The issue is with Ubuntu for Windows (Windows Subsystem for Linux version 1). It only supports the 64-bit syscall
interface and not the 32-bit x86 int 0x80
system call mechanism.
Besides not being able to use int 0x80
(32-bit compatibility) in 64-bit binaries, Ubuntu on Windows (WSL1) doesn't support running 32-bit executables either. (Same as if you'd built a real Linux kernel without CONFIG_IA32_EMULATION
, like some Gentoo users do.)
You need to convert from using int 0x80
to syscall
. It's not difficult. A different set of registers are used for a syscall
and the system call numbers are different from their 32-bit counterparts. Ryan Chapman's blog has information on the syscall
interface, the system calls, and their parameters. Sys_write
and Sys_exit
are defined this way:
%rax System call %rdi %rsi %rdx %r10 %r8 %r9 ---------------------------------------------------------------------------------- 0 sys_read unsigned int fd char *buf size_t count 1 sys_write unsigned int fd const char *buf size_t count 60 sys_exit int error_code
Using syscall
also clobbers RCX and the R11 registers. They are considered volatile. Don't rely on them being the same value after the syscall
.
Your code could be modified to be:
section .text
global _start ;must be declared for linker (ld)
_start: ;tells linker entry point
mov edx,len ;message length
mov rsi,msg ;message to write
mov edi,1 ;file descriptor (stdout)
mov eax,edi ;system call number (sys_write)
syscall ;call kernel
xor edi, edi ;Return value = 0
mov eax,60 ;system call number (sys_exit)
syscall ;call kernel
section .data
msg db 'Hello, world!', 0xa ;string to be printed
len equ $ - msg ;length of the string
Note: in 64-bit code if the destination register of an instruction is 32-bit (like EAX, EBX, EDI, ESI etc) the processor zero extends the result into the upper 32-bits of the 64-bit register. mov edi,1
has the same effect as mov rdi,1
.
This answer isn't a primer on writing 64-bit code, only about using the syscall
interface. If you are interested in the nuances of writing code that calls the C library, and conforms to the 64-bit System V ABI there are reasonable tutorials to get you started like Ray Toal's NASM tutorial. He discusses stack alignment, the red zone, register usage, and a basic overview of the 64-bit System V calling convention.
Answered By - Michael Petch Answer Checked By - Terry (WPSolving Volunteer)