Issue
I'm trying to embed arbitrary data into an ELF executable and have Linux map it in automatically at load time. I recently asked another question about this, which culminated in support for this use case being added to the mold linker.
I have written a tool that appends arbitrary data at the end of the executable and patches in a PT_LOAD ELF program header that points to the appended data. This is the patching logic:
appended_data_file_offset = /* ... seek(elf file, SEEK_END) ... */;
appended_data_size = /* ... stat(data file) ... */;
phdr->p_type = PT_LOAD;
phdr->p_filesz = phdr->p_memsz = appended_data_size;
size_t base = phdr->p_vaddr - phdr->p_offset; // calculate program's base load address
phdr->p_vaddr = phdr->p_paddr = base + appended_data_file_offset;
phdr->p_offset = appended_data_file_offset;
phdr->p_align = 1;
phdr->p_flags = PF_R;
Running my patcher results in an ELF file with this data appended to it at offset 0xAD78:
0000ad70: 00 00 00 00 00 00 00 00 74 65 73 74 20 64 61 74 ........test dat
0000ad80: 61 0a a.
And this PT_LOAD segment added:
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
LOAD 0x000000000000ad78 0x000000000020ad78 0x000000000020ad78
0x000000000000000a 0x000000000000000a R 0x1
This new segment and 10 byte block at the end are the only changes made to the perfectly good, working ELF executable. Confirmed by binary comparison.
At runtime, the program is supposed to reach that data. It does so via the auxiliary vector:
Elf64_Phdr *header = (Elf64_Phdr *) getauxval(AT_PHDR);
size_t count = getauxval(AT_PHNUM);
size_t size = getauxval(AT_PHENT);
assert(size == sizeof(Elf64_Phdr));
for (size_t i = 0; i < count; ++header, ++i) {
    if (header->p_type != PT_LOAD) { continue; }
    // p_vaddr is an integer, so it has to be cast to a pointer before memcmp
    if (0 == memcmp((const void *) header->p_vaddr, "test", sizeof("test") - 1)) {
        // found it
    }
}
I used libc functions for clarity. My actual program is a static EXEC ELF file written in freestanding C. It does not link to libc and uses Linux system calls directly.
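For the freestanding case, the getauxval calls above can be replaced by a walk over the auxiliary vector itself. The following is only a sketch: auxv_lookup is a hypothetical helper name, and it assumes the program has already obtained a pointer to the vector (on Linux it sits on the initial stack, after the NULL that terminates envp) and that the type definitions from <elf.h> are available at build time.

```c
#include <assert.h>
#include <elf.h>
#include <stdint.h>

/*
 * Hypothetical helper: return the value stored for `type` in an
 * auxiliary vector terminated by an AT_NULL entry, or 0 if absent.
 * Needs no libc functions, only the type definitions from <elf.h>.
 */
static uint64_t auxv_lookup(const Elf64_auxv_t *auxv, uint64_t type)
{
    for (; auxv->a_type != AT_NULL; ++auxv) {
        if (auxv->a_type == type) {
            return auxv->a_un.a_val;
        }
    }
    return 0;
}
```

With this, the header pointer would be obtained as (Elf64_Phdr *) auxv_lookup(auxv, AT_PHDR), and the count as auxv_lookup(auxv, AT_PHNUM).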
After patching the executable in this manner, I intended for this to happen:
- Linux automatically loads into memory the data appended to the executable.
  - The 10 byte block located at offset 0xAD78 in the file.
- Program finds the program header table via the AT_PHDR value in the auxiliary vector.
- Program scans PT_LOAD segments until it finds the data.
  - The p_vaddr of one of these headers should point to a memory block containing "test data\n"
Instead, this program just completely crashes. It does not execute a single instruction. It does not even reach the entry point. Not even gdb can debug it:
(gdb) run
Starting program: exe.patched
During startup program terminated with signal SIGSEGV, Segmentation fault.
(gdb) info registers
The program has no registers now.
(gdb) step
The program is not being run.
It runs without any problems without that PT_LOAD header though. It also works if I change the type to PT_LOOS or any other type.
I can't figure it out. Just what am I doing wrong?
The complete readelf printout as requested:
$ readelf --file-header --program-headers program.patched
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: AArch64
Version: 0x1
Entry point address: 0x2037d8
Start of program headers: 64 (bytes into file)
Start of section headers: 43512 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
Number of program headers: 5
Size of section headers: 64 (bytes)
Number of section headers: 8
Section header string table index: 6
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
LOAD 0x000000000000abf8 0x000000000020abf8 0x000000000020abf8
0x000000000000000a 0x000000000000000a R 0x1
LOAD 0x0000000000000000 0x0000000000200000 0x0000000000200000
0x00000000000027d8 0x00000000000027d8 R 0x1000
LOAD 0x00000000000027d8 0x00000000002037d8 0x00000000002037d8
0x0000000000005ed8 0x0000000000005ed8 R E 0x1000
LOAD 0x00000000000086b0 0x000000000020a6b0 0x000000000020a6b0
0x0000000000000000 0x0000000000100015 RW 0x1000
GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000000000 0x0000000000000000 RW 0x0
Section to Segment mapping:
Segment Sections...
00
01 .rodata
02 .text
03 .bss
04
Solution
PT_LOAD headers must appear in ascending order of virtual address. Your new program header is first in the table, but it has a higher p_vaddr (0x20abf8) than all the PT_LOAD headers that follow it.
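As a rough illustration of the ordering rule, a patcher could run a check like the one below over the program header table before writing the file. The function name load_headers_sorted is made up for this sketch; it only looks at p_type and p_vaddr.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>

/*
 * Hypothetical helper: return 1 if every PT_LOAD entry has a p_vaddr
 * greater than or equal to that of the previous PT_LOAD entry,
 * 0 otherwise. Entries of other types are skipped.
 */
static int load_headers_sorted(const Elf64_Phdr *phdr, size_t count)
{
    const Elf64_Phdr *prev = NULL;
    for (size_t i = 0; i < count; ++i) {
        if (phdr[i].p_type != PT_LOAD) {
            continue;
        }
        if (prev != NULL && phdr[i].p_vaddr < prev->p_vaddr) {
            return 0; /* out of order */
        }
        prev = &phdr[i];
    }
    return 1;
}
```

Applied to the table in the question, the check fails at the second entry, because 0x200000 comes after 0x20abf8.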
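Also, the segments' virtual address ranges shouldn't overlap, but your new segment lies inside the last one: the RW segment starts at 0x20a6b0 and has a p_memsz of 0x100015, so it extends to 0x30a6c5 and covers 0x20abf8. The relevant size of a mapped segment is the larger of p_filesz and p_memsz.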
This is documented in man 5 elf.
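One possible way to act on this, sketched below, is to place the new segment past the end of every existing PT_LOAD segment and to append its header after the other PT_LOAD entries so the ascending order holds. The helper names load_end, loads_overlap and pick_free_vaddr are invented for this example, and the page size is passed in as an assumption (0x1000 here, matching the alignment of the existing segments). The offset within the page is carried over from the file offset so that p_vaddr and p_offset stay congruent modulo the page size.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/* End address of a segment, using the larger of p_filesz and p_memsz. */
static uint64_t load_end(const Elf64_Phdr *ph)
{
    uint64_t size = ph->p_memsz > ph->p_filesz ? ph->p_memsz : ph->p_filesz;
    return ph->p_vaddr + size;
}

/* Return 1 if the byte ranges of two segments intersect, 0 otherwise. */
static int loads_overlap(const Elf64_Phdr *a, const Elf64_Phdr *b)
{
    return a->p_vaddr < load_end(b) && b->p_vaddr < load_end(a);
}

/*
 * Hypothetical helper: choose a virtual address above all existing
 * PT_LOAD segments. page_size is assumed to be a power of two.
 */
static uint64_t pick_free_vaddr(const Elf64_Phdr *phdr, size_t count,
                                uint64_t file_offset, uint64_t page_size)
{
    uint64_t highest = 0;
    for (size_t i = 0; i < count; ++i) {
        if (phdr[i].p_type == PT_LOAD && load_end(&phdr[i]) > highest) {
            highest = load_end(&phdr[i]);
        }
    }
    /* Round up to the next page boundary, then keep the in-page offset. */
    uint64_t page = (highest + page_size - 1) & ~(page_size - 1);
    return page + (file_offset & (page_size - 1));
}
```

For the headers in the question, the highest existing end address is 0x30a6c5, so with a file offset of 0xabf8 and 4 KiB pages this sketch would pick 0x30bbf8.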
Answered By - user17732522