Wednesday, January 5, 2022

[SOLVED] Understanding GCC's un-optimized assembly for UB n = ++n + ++n - why increment twice before shifting?

Issue

I understand this is undefined behaviour and no one actually writes code like this. However, I'm curious about what the compiler would do with this piece of code.

int n = 3;
n = ++n + ++n;

I compiled it with both clang and gcc for comparison, without optimizations. Here's the assembly generated by clang:

# clang -O0
movl    $3, -4(%rbp)
movl    -4(%rbp), %ecx
addl    $1, %ecx
movl    %ecx, -4(%rbp)
movl    -4(%rbp), %edx
addl    $1, %edx
movl    %edx, -4(%rbp)
addl    %edx, %ecx
movl    %ecx, -4(%rbp)

It copies the 3 into a register, increments it, then copies this incremented value again and increments it once more, then adds up (3+1) + (3+1+1). This seems pretty straightforward.
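
For reference, that reading of the clang listing can be written as well-defined C. This is only a sketch: the helper name clang_like and the temporaries lhs and rhs are made up for illustration, and they stand in for the values held in %ecx and %edx.

```c
/* Hypothetical helper: mirrors the clang -O0 listing step by step,
 * using only well-defined C (every update is its own statement). */
int clang_like(int n)
{
    n = n + 1;       /* first ++n: increment and store back  */
    int lhs = n;     /* value kept in %ecx                   */
    n = n + 1;       /* second ++n: increment and store back */
    int rhs = n;     /* value kept in %edx                   */
    n = lhs + rhs;   /* addl %edx, %ecx, then store to n     */
    return n;
}
```

Calling clang_like(3) computes 4 + 5, which matches the 9 produced by the clang build.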

However, I'm having trouble understanding what GCC is doing. Here's the generated assembly:

# gcc -O0
movl    $3, -4(%rbp)
addl    $1, -4(%rbp)
addl    $1, -4(%rbp)
sall    -4(%rbp)

From what I understand, it increments twice and then left-shifts (sall) once, which means multiplying by 2.

I thought it noticed ++n being the same on both sides of the operator, so it took it as a common factor. However, in that case, why did it increment twice?

Clang's version gives 9 and GCC gives 10. (Any result is acceptable, given the UB, but that confirms that the end result of the compilers' internal logic was actually different.)

Can anyone explain what GCC is trying to accomplish here?


Solution

The prefix ++ operator indicates that its operand is to be incremented before its value is used. Clang interprets your expression like this:

n = n + 1
tmp1 = n
n = n + 1
tmp2 = n
n = tmp1 + tmp2

whereas GCC does something like this, processing the preincrements before descending into the expression:

n = n + 1
n = n + 1
tmp1 = n
tmp2 = n
n = tmp1 + tmp2

Then, realising that both operands to + are the same expression, it performs a strength reduction, yielding

n = n + 1
n = n + 1
n = n << 1
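
The GCC ordering can likewise be modelled in well-defined C. This is a minimal sketch rather than a description of GCC's internals: the function names gcc_like and gcc_like_unreduced are hypothetical, and the second one simply keeps the addition so the two forms can be compared.

```c
/* Hypothetical helper: both increments first, then n + n
 * expressed as a left shift by one, as in the gcc -O0 listing. */
int gcc_like(int n)
{
    n = n + 1;     /* addl $1, -4(%rbp) */
    n = n + 1;     /* addl $1, -4(%rbp) */
    n = n << 1;    /* sall -4(%rbp): same as n + n for small non-negative n */
    return n;
}

/* Same ordering, but with the addition left as an addition. */
int gcc_like_unreduced(int n)
{
    n = n + 1;
    n = n + 1;
    n = n + n;     /* both operands read the same, already updated n */
    return n;
}
```

Both gcc_like(3) and gcc_like_unreduced(3) return 10, the value produced by the GCC build: the shift only changes how 5 + 5 is computed, not the result.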

This strength reduction is likely performed despite the lack of optimisation flags because GCC is known to perform certain strength reductions very early in the compilation process, before optimisation flags affect the result.

Note however that the result may differ with different compiler options.



Answered By - fuz