Issue
I'm having trouble finding appropriate documentation for a problem I'm having generating consistent HMACs in the kernel and in user space. According to Robert Love in Linux Kernel Development (https://www.oreilly.com/library/view/linux-kernel-development/9780768696974/), the memory descriptor fields mm->start_code and mm->end_code are supposed to contain the .text segment. Finding the .text segment in a static executable is well defined in the ELF documentation and is easy to get at. So, given the following two code snippets, one would expect to get a matching HMAC:
Kernel:
__mm = get_task_mm(__task);
__retcode = ntru_crypto_hmac_init(__crypto_context);
if(__retcode != NTRU_CRYPTO_HMAC_OK)
    return 1;
__retcode = ntru_crypto_hmac_update(__crypto_context, (const uint8_t*)__mm->start_code,
                                    __mm->end_code - __mm->start_code);
if(__retcode != NTRU_CRYPTO_HMAC_OK)
    return 1;
__retcode = ntru_crypto_hmac_final(__crypto_context, __hmac);
if(__retcode != NTRU_CRYPTO_HMAC_OK)
    return 1;
return 0;
Userland:
for (j = 0; j < file_hdr32.e_shnum; j++)
{
    if (!strcmp(".text", strIndex + section_hdr32[j]->sh_name))
    {
        retcode = ntru_crypto_hmac_init(__crypto_context());
        if(retcode != NTRU_CRYPTO_HMAC_OK)
        {
            syslog(LOG_ERR, "ntru_crypto_hmac_init error: retcode = %d, TID(0x%lx)",
                   retcode, pthread_self());
            return 0;
        }
        retcode = ntru_crypto_hmac_update(__crypto_context(),
                      filebuf + section_hdr32[j]->sh_offset, section_hdr32[j]->sh_size);
        if(retcode != NTRU_CRYPTO_HMAC_OK)
        {
            syslog(LOG_ERR, "Internal crypto error (%d)", retcode);
            return 0;
        }
        retcode = ntru_crypto_hmac_final(__crypto_context(), _hmac);
        if(retcode != NTRU_CRYPTO_HMAC_OK)
        {
            syslog(LOG_ERR, "Failed to finalize HMAC, TID(0x%lx)", pthread_self());
            return 0;
        }
        return 1;
    }
}
In both cases the .text segment is exactly where it's documented to be, but the HMACs never match. I've generated userland HMACs for all 17,000 executable files on the system, so even if the code segment in the kernel memory descriptor were pointing to a dependency rather than the primary executable, I should still get a match. But no dice. There's something fundamentally different between the two .text segments, and I was wondering if anyone out there knows what it is so I can save some time. Any clues?
Solution
There's something fundamentally different between the two ".text" segments
Your problem is that you are ignoring the difference between segments and sections.
The ELF format is an executable and linking format. Segments are used for the former, sections for the latter (and linking here means static linking, i.e. build-time). Once the binary is linked, sections can be completely discarded from it, and only segments are needed at runtime. Segments are mmaped, not sections.
Now let's look at the difference between the two.
readelf -l /bin/date
Elf file type is EXEC (Executable file)
Entry point 0x402000
There are 9 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  PHDR           0x0000000000000040 0x0000000000400040 0x0000000000400040
                 0x00000000000001f8 0x00000000000001f8  R E    8
  INTERP         0x0000000000000238 0x0000000000400238 0x0000000000400238
                 0x000000000000001c 0x000000000000001c  R      1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
                 0x000000000000d5ac 0x000000000000d5ac  R E    200000
  LOAD           0x000000000000de10 0x000000000060de10 0x000000000060de10
                 0x0000000000000440 0x0000000000000610  RW     200000
  DYNAMIC        0x000000000000de38 0x000000000060de38 0x000000000060de38
                 0x00000000000001a0 0x00000000000001a0  RW     8
  NOTE           0x0000000000000254 0x0000000000400254 0x0000000000400254
                 0x0000000000000044 0x0000000000000044  R      4
  GNU_EH_FRAME   0x000000000000c700 0x000000000040c700 0x000000000040c700
                 0x00000000000002a4 0x00000000000002a4  R      4
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RW     8
  GNU_RELRO      0x000000000000de10 0x000000000060de10 0x000000000060de10
                 0x00000000000001f0 0x00000000000001f0  R      1

 Section to Segment mapping:
  Segment Sections...
   00
   01     .interp
   02     .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt .init .plt .text .fini .rodata .eh_frame_hdr .eh_frame
   03     .ctors .dtors .jcr .dynamic .got .got.plt .data .bss
   04     .dynamic
   05     .note.ABI-tag .note.gnu.build-id
   06     .eh_frame_hdr
   07
   08     .ctors .dtors .jcr .dynamic .got
Above you can see that multiple sections (.interp, .note.ABI-tag, ... .text, ...) all got mapped into a single PT_LOAD segment. All these sections have the same protections, and all are "covered" by a single [mm->start_code, mm->end_code) region.
Compare this to the .text section:
readelf -WS /bin/date | grep '\.text'
[13] .text PROGBITS 0000000000401900 001900 0077f8 00 AX 0 0 16
You'll note that the section is smaller and begins at a different offset.
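To put numbers on that, here is a minimal C sketch of the containment check, roughly what sits behind readelf's "Section to Segment mapping". The helper names section_in_segment and extra_bytes_in_segment are made up for illustration, and the check is simplified to allocated sections compared by virtual address only.

```c
#include <elf.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if an allocated section lies entirely inside
 * the memory image of a segment. Simplified: compares virtual addresses
 * only, which is enough for sections such as .text. */
static bool section_in_segment(const Elf64_Shdr *sh, const Elf64_Phdr *ph)
{
    return sh->sh_addr >= ph->p_vaddr &&
           sh->sh_addr + sh->sh_size <= ph->p_vaddr + ph->p_memsz;
}

/* Hypothetical helper: how many bytes of the segment's file image are NOT
 * part of the given section. Only meaningful when the section is inside
 * the segment. */
static uint64_t extra_bytes_in_segment(const Elf64_Shdr *sh, const Elf64_Phdr *ph)
{
    return ph->p_filesz - sh->sh_size;
}
```

Plugging in the values from the two readelf outputs (segment at 0x400000 with FileSiz 0xd5ac, .text at 0x401900 with size 0x77f8), the section is inside the first LOAD segment, and the segment carries 0xd5ac - 0x77f8 = 0x5db4 (23988) bytes that the section does not. Those extra bytes are hashed on the kernel side but not on the userland side.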
No wonder you get a different HMAC. Try computing the HMAC in userland over the segment instead of the section, and you should get a match.
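A rough sketch of the userland side, in C: walk the program headers instead of the section headers and pick the executable PT_LOAD segment. The function name find_exec_segment is hypothetical, and the sketch uses the 64-bit ELF types to match the readelf output above (the question's code uses the 32-bit ones, so swap in Elf32_Ehdr and Elf32_Phdr there). It also assumes that [mm->start_code, mm->end_code) corresponds to p_vaddr through p_vaddr + p_filesz of a single executable segment. That may not hold on every kernel version or for binaries with more than one executable segment, so check it against your kernel's ELF loader.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: locate the file range backing the first executable
 * PT_LOAD segment of a 64-bit ELF image held in buf.
 * Returns 0 and fills *offset / *size on success, -1 if the image is
 * malformed or has no executable PT_LOAD segment. */
static int find_exec_segment(const uint8_t *buf, size_t len,
                             uint64_t *offset, uint64_t *size)
{
    Elf64_Ehdr eh;

    if (len < sizeof eh)
        return -1;
    memcpy(&eh, buf, sizeof eh);            /* memcpy avoids unaligned reads */

    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64 ||
        eh.e_phentsize != sizeof(Elf64_Phdr))
        return -1;

    /* The whole program header table must lie inside the buffer. */
    if (eh.e_phoff > len ||
        (uint64_t)eh.e_phnum * sizeof(Elf64_Phdr) > len - eh.e_phoff)
        return -1;

    for (unsigned i = 0; i < eh.e_phnum; i++) {
        Elf64_Phdr ph;

        memcpy(&ph, buf + eh.e_phoff + (size_t)i * sizeof ph, sizeof ph);
        if (ph.p_type != PT_LOAD || !(ph.p_flags & PF_X))
            continue;

        /* The segment's file image must lie inside the buffer too. */
        if (ph.p_offset > len || ph.p_filesz > len - ph.p_offset)
            return -1;

        *offset = ph.p_offset;
        *size = ph.p_filesz;
        return 0;
    }
    return -1;
}
```

With the result in hand, the userland loop from the question would pass filebuf + offset and size to ntru_crypto_hmac_update instead of the .text section's sh_offset and sh_size. For /bin/date above that means hashing 0xd5ac bytes starting at file offset 0.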
Answered By - Employed Russian