Issue
Ok, this is gonna be a long question. I'm trying to understand how "buffer overflow" works. I am reading Smashing the stack for fun and profit by aleph1 and have just got the disassembly of the following code:
void function(int a, int b, int c) {
char buffer1[5];
char buffer2[10];
}
void main() {
function(1,2,3);
}
The assembly generated with the -S flag of GCC gives me:
.file "example1.c"
.text
.globl function
.type function, @function
function:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
subq $48, %rsp
movl %edi, -36(%rbp)
movl %esi, -40(%rbp)
movl %edx, -44(%rbp)
movq %fs:40, %rax
movq %rax, -8(%rbp)
xorl %eax, %eax
movq -8(%rbp), %rax
xorq %fs:40, %rax
je .L2
call __stack_chk_fail
.L2:
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size function, .-function
.globl main
.type main, @function
main:
.LFB1:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movl $3, %edx
movl $2, %esi
movl $1, %edi
call function
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE1:
.size main, .-main
.ident "GCC: (Ubuntu 4.8.2-19ubuntu1) 4.8.2"
.section .note.GNU-stack,"",@progbits
The .cfi directives are not in the paper by Aleph1, and I guess they were not used back then. I have read this question on SO and I get that they are used by GCC for exception handling. I have also read another question on SO and I get that .LFB0, .LFE0, .LFE1 and .LFB1 are labels. However, I have the following doubts:
- I get that .cfi directives are used for exception handling however I don't understand what they mean. I have been here and I see some definitions like:
.cfi_def_cfa register, offset
.cfi_def_cfa defines a rule for computing CFA as: take address from register and add offset to it.
However, if you look at the assembly I have posted above, you don't find any register name (like EAX, EBX and so on); instead you find a number (I have generally found '6'), and I don't know how that is supposed to be a register. In particular, can anyone explain what .cfi_def_cfa_offset 16, .cfi_offset 6, -16, .cfi_def_cfa_register 6 and .cfi_def_cfa 7, 8 mean? Also, what does CFA mean? I am asking this because in most books/papers the procedure prolog looks like:
pushl %ebp
movl %esp,%ebp
subl $20,%esp
However, now I think the procedure prolog in modern computers is as follows:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
subq $48, %rsp
Initially I thought that the CFI directives were used instead of the sub mnemonic to set the offset, but that's not the case; the sub instruction is still used in spite of the CFI directives.
- I understood that there are labels for each procedure. However, why are there multiple nested labels inside a procedure? In my case main has the .LFB1 and .LFE1 labels. What is the need for multiple labels? Similarly, the function procedure has the labels .LFB0, .L2 and .LFE0.
- The last 3 lines of both procedures seem to be used for some housekeeping (telling the size of the procedure, maybe?), but I am not sure what they mean. Can anyone explain what they mean and what their use is?
EDIT:
(adding one more question)
Do the CFI directives take up any space? In the procedure "function", each int parameter takes up 4 bytes and there are 3 of them, so the parameters take 12 bytes in memory. Next, the first char array takes 8 bytes (5 bytes rounded up to 8), and the next char array takes 12 bytes (10 bytes rounded up to 12), so the char arrays take 20 bytes in total. Summing these, the parameters and local variables need only 12+20=32 bytes. But in the procedure "function", the compiler subtracts 48 bytes to store values. Why?
Solution
Lindy Dancer answered what CFI and CFA mean (call frame information and canonical frame address, respectively).
.L<num> denotes a label. As per various tidbits on Google, x64 GCC names all local labels in the following format: they start with .L and end with a numeral, so .L1, .L2, and so on are labels.
According to Google and some earlier SO answers, FB<num> indicates function begin and FE<num> indicates function end, so .LFB0, .LFB1, ... and .LFE0, .LFE1, ... mark where each function begins and ends. The compiler probably requires them to take care of some internal needs, so you should forget them for the moment unless there is a very grave need to dig into compiler internals.
The other label, .L2, exists as the target of the branching instruction je in your function:
je .L2
Also, every compiler aligns and pads the storage for arguments and locals to a certain boundary. I can't be sure, but I think the default x64 alignment for GCC is 16 bytes, so if you request an odd reservation like char foo[5] or BYTE blah[10], the sizes 5 and 10 are not aligned, even for x86. For 5 an x86 compiler will assign 8 bytes, and for 10 it will assign 16 bytes; likewise, x64 GCC might assign 16 bytes for each of your requests.
You actually shouldn't worry about why the compiler does what it does. When you are trying to understand the logic of the assembly, just concentrate on addresses: if the compiler decided that it will put x at rbp +/- X, it will also access it at the same location throughout the scope or lifetime of that variable.
Answered By - blabb Answer Checked By - Clifford M. (WPSolving Volunteer)