Issue
If I compile two C programs that differ only in the return value, I'd expect the binaries to differ only in the bits of that value. However, when I compile the following programs with GCC, dump the bits of each binary (using xxd), and diff the dumps, I get an additional difference.
Files
return127.c
int main() {
    return 127;
}
return128.c
int main() {
    return 128;
}
Compile, Dump And Diff
# compile
gcc -Os -fdata-sections -ffunction-sections -fipa-pta -Wl,--gc-sections -Wl,-O1 -Wl,--as-needed -Wl,--strip-all return127.c -o return127
gcc -Os -fdata-sections -ffunction-sections -fipa-pta -Wl,--gc-sections -Wl,-O1 -Wl,--as-needed -Wl,--strip-all return128.c -o return128
# dump
xxd -b return127 > return127.xxd-bits
xxd -b return128 > return128.xxd-bits
# diff
diff return127.xxd-bits return128.xxd-bits
Note: I use the compile command from this comment on a question about the smallest binary of a C program.
Diff
108,111c108,111
< 00000282: 01010101 00000000 01101011 11011010 11101100 11100011 U.k...
< 00000288: 00111010 10001111 00101111 00101100 01100001 00111100 :./,a<
< 0000028e: 10010010 11001011 00011000 11101010 11100111 00100011 .....#
< 00000294: 01001010 00111011 11111001 11111010 00000001 00000000 J;....
---
> 00000282: 01010101 00000000 00011101 11000011 10101000 00011001 U.....
> 00000288: 11011011 00110001 10100000 01001101 01000110 10010011 .1.MF.
> 0000028e: 00101101 01011101 11101001 00001000 01010101 11111101 -]..U.
> 00000294: 11011011 01000011 11010100 10101011 00000001 00000000 .C....
211c211
< 000004ec: 00000000 00000000 00000000 00000000 10111000 01111111 ......
---
> 000004ec: 00000000 00000000 00000000 00000000 10111000 10000000 ......
There are two differences. The difference at the bottom shows the (expected) difference of the return values. The lines differ only in the last byte/block. Binary 01111111 is decimal 127. Binary 10000000 is decimal 128.
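As a quick check of that arithmetic, here is a minimal sketch of a binary-to-decimal conversion in POSIX shell. The function name bin_to_dec is hypothetical and only used for illustration. It also shows that the byte just before the changed one, 10111000, is 0xb8 (decimal 184), which on x86 is the opcode for moving a 32-bit immediate into eax, so the changed byte is the low byte of that immediate.

```shell
# Convert a string of 0s and 1s to decimal (hypothetical helper).
bin_to_dec() {
    bits=$1
    value=0
    while [ -n "$bits" ]; do
        digit=${bits%"${bits#?}"}      # first remaining character
        bits=${bits#?}                 # drop that character
        value=$((value * 2 + digit))   # shift left by one bit and add it
    done
    printf '%s\n' "$value"
}

bin_to_dec 01111111   # prints 127
bin_to_dec 10000000   # prints 128
bin_to_dec 10111000   # prints 184, i.e. 0xb8
```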
What is the difference at the top?
Solution
What is the difference at the top?
It's a build ID difference. Install diffoscope (or compare the readelf --wide --notes output of both binaries) and you'll see it clearly:
│ Displaying notes found in: .note.gnu.build-id
│ Owner Data size Description
│ - GNU 0x00000014 NT_GNU_BUILD_ID (unique build ID bitstring) Build ID: 817d41c45a09c3822337307250bdb9410a1959b4
│ + GNU 0x00000014 NT_GNU_BUILD_ID (unique build ID bitstring) Build ID: de5fb81907549af3332e8136d6bd7ab4d884e0ce
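If diffoscope is not available, the readelf route can be sketched roughly as follows. The helper name extract_build_id is hypothetical, and the sed pattern assumes the "Build ID: <hex>" text shown in the output above; other binutils versions may format the note differently.

```shell
# Read `readelf --wide --notes` output on stdin and print only the
# build ID hex string (hypothetical helper; prints nothing if absent).
extract_build_id() {
    sed -n 's/.*Build ID: //p'
}

# Intended use, assuming the two binaries from the question exist:
#   readelf --wide --notes return127 | extract_build_id
#   readelf --wide --notes return128 | extract_build_id
```

Two different hex strings would confirm that the first hunk of the xxd diff comes from the .note.gnu.build-id section.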
How to compile C programs such that binaries differ only in different return value?
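- You have to set __TIME__ and __DATE__ to the same time in both gcc invocations (the script below does this via SOURCE_DATE_EPOCH).
- You have to make the build ID identical for both calls, or disable it entirely (the script below uses --build-id=none).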
The following script:
export SOURCE_DATE_EPOCH=$(date +%s)
f() {
gcc -Wl,--build-id=none \
-Os -fdata-sections -ffunction-sections -fipa-pta \
-Wl,--gc-sections -Wl,--as-needed -Wl,--strip-all \
-xc - -o "$1"
}
echo 'int main(){return 127;}' | f /tmp/1
echo 'int main(){return 128;}' | f /tmp/2
diffoscope /tmp/1 /tmp/2
and diffoscope outputs:
│ 0000000000001020 <.text>:
│ - mov $0x7f,%eax
│ + mov $0x80,%eax
│ retq
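To confirm at the byte level that nothing else changed, a small sketch using cmp (from diffutils) can count the differing bytes. The function name count_diff_bytes is hypothetical; it assumes both files have the same length, since cmp -l stops at the end of the shorter file.

```shell
# Print how many byte positions differ between two files
# (hypothetical helper built on `cmp -l`, which lists one line per
# differing byte: offset, then both byte values in octal).
count_diff_bytes() {
    # cmp exits non-zero when the files differ, so ignore its status.
    { cmp -l "$1" "$2" || true; } | wc -l | tr -d ' '
}

# Intended use with the binaries built by the script above:
#   count_diff_bytes /tmp/1 /tmp/2
```

For the two binaries built above, a result of 1 would mean the immediate byte of the mov instruction is the only difference.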
Answered By - KamilCuk