Tuesday, April 12, 2022

[SOLVED] How to make gcc optimize function call with constant parameters at compile time?

Issue

I am trying to parse user input and execute some tasks according to the commands given by user. As, in C, switch doesn't work with strings I have decided to use switch of hash values of strings to compare which command to execute.

Now, as maintaining the list of all hashes of all available commands in something like this

#define EXIT 6385204799
...

is really tedious task, I was wandering if there exist way to convince gcc to evaluate the hash function with constant parameter on compile time so I could use something like this

switch(hash(command)){
    case hash("exit"): exit();
    // I know, case labels must be compile time constants
    // but that should be fulfilled in my case
}

I know I could use for example metaprogramming but I am interested more in the solution with gcc.

Is it even possible?

#include <stdio.h>


unsigned long hash(const char *str)
{
        unsigned long hash = 5381;
        int c;

        while ((c = *str++))
                hash = ((hash << 5) + hash) + c;
        return hash;
}

int main( int argc, char **argv ) {
        char *command=NULL;
        size_t size=0;
        printf("Enter string:");
        getline(&command, &size, stdin);
        printf("%ld",hash("exit")); // I want this to evaluate in compile time
        printf("%ld",hash(command)); // and this not
        return 0;
}

Solution

Is it even possible?

GCC cannot (for C - it can for C++, see below), but Clang/LLVM (version 3.9.1) can. Use the -O2 switch to enable level 2 optimisations (or higher).

Proof. See the disassembly - there is no call to a hash function, no loop; the compiler has calculated the hash at compile time. This reduced form of your test case:

#include <stdio.h>

static unsigned long hash(const char *str)
{
        unsigned long hash = 5381;
        int c;

        while ((c = *str++))
                hash = ((hash << 5) + hash) + c;
        return hash;
}

int main( int argc, char **argv ) {
        size_t size=0;
        printf("%ld",hash("exit")); // I want this to evaluate in compile time
        return 0;
}

Compiles to:

main:                                   # @main
# BB#0:
        push    rax
        #DEBUG_VALUE: main:argc <- %EDI
        #DEBUG_VALUE: main:argv <- %RSI
        #DEBUG_VALUE: main:size <- 0
        movabs  rsi, 6385204799
        mov     edi, .L.str
        xor     eax, eax
        call    printf
        xor     eax, eax
        pop     rcx
        ret

.L.str:
        .asciz  "%ld"

The line movabs rsi, 6385204799 directly loads the pre-calculated hash value into the rsi register.

However, the value will not be considered a compile-time constant for purposes of use in a case label in a switch statement. You need to use if ... else for comparison rather than switch.

In case you are interested, with modern C++ you can achieve this type of optimisation using GCC as well as Clang/LLVM, and you can even use a switch statement:

#include <cstdio>

static constexpr unsigned long hash(const char *str)
{
        unsigned long hash = 5381;
        int c = *str;

        while ((c = *str++))
                hash = ((hash << 5) + hash) + c;
        return hash;
}

int main( int argc, char **argv ) {
        size_t size=0;
        printf("%ld",hash("exit")); // I want this to evaluate in compile time

        switch((unsigned long)5 /* i.e. some number */) {
        case hash("exit"):
            // etc
            ;
        }

        return 0;
}

This is C++14 code, you will need to use -std=c++14 to compile it (or use GCC 6+ for which this is the default). (Of course, the code is not idiomatic C++ - it is intended to be as close as possible to the previous example).



Answered By - davmac
Answer Checked By - Dawn Plyler (WPSolving Volunteer)