Monday, February 5, 2024

[SOLVED] STM32 gcc optimization produces different result

February 05, 2024 gcc, optimization, stm32

Issue

I am trying to decode an RGB buffer into rows suitable for an RGB LED matrix using an STM32F4 (gcc in the SW4STM32 IDE). The code below works perfectly when compiled with -O0.

The code produces a different result when compiled with -O1 (and also with -O2 and -O3). With -O1, it also produces a different result when __attribute__((packed)) is added to the color_t struct definition.

Optimization is enabled for this .c file only; the other files are still built with -O0.

Could someone spot why the optimization changed the code's behavior?

uint8_t badr[24576] = { 0xff , 0x2a, .... };

struct s_color_t
{
    uint8_t r;
    uint8_t g;
    uint8_t b;
} ;
//__attribute__((packed))

typedef struct s_color_t color_t;

#define WIDTH       128
#define HEIGHT      64
#define COLOR_SIZE  sizeof(color_t)

#define R0_POS      0x01
#define G0_POS      0x02
#define B0_POS      0x04
#define R1_POS      0x08
#define G1_POS      0x10
#define B1_POS      0x20

#define enc_color(c, s, r)  (((c) & (s)) ? (r) : 0)

color_t dispBuf[ WIDTH * HEIGHT];
uint8_t dispLine[WIDTH];

void fillBuf()
{
    uint8_t *dbuf = (uint8_t *) dispBuf;
    memcpy(dbuf, badr, WIDTH * HEIGHT * COLOR_SIZE);
}

void enc_row(uint8_t slice, uint16_t row, uint8_t *rBuf)
{
    uint8_t reg;

    color_t *upPtr = dispBuf + row * WIDTH * COLOR_SIZE;
    color_t *dnPtr = dispBuf + (row + (HEIGHT / 2)) * WIDTH * COLOR_SIZE;
    uint8_t *destPtr = rBuf;

    uint8_t sn = (1 << slice);
    for (int i=0; i < WIDTH; i++) {
        reg = 0;
        reg |= enc_color((*upPtr).r, sn, R0_POS);
        reg |= enc_color((*upPtr).g, sn, G0_POS);
        reg |= enc_color((*upPtr).b, sn, B0_POS);
        reg |= enc_color((*dnPtr).r, sn, R1_POS);
        reg |= enc_color((*dnPtr).g, sn, G1_POS);
        reg |= enc_color((*dnPtr).b, sn, B1_POS);

        *destPtr = reg;
        upPtr ++;
        dnPtr ++;
        destPtr ++;
    }
}

void do_func() 
{
    uint8_t *ptrLine = dispLine;

    fillBuf();
    for (int s=0; s < 8; s++) {
        for (int i=0; i< (HEIGHT/2); i++) {
            enc_row(s, i, ptrLine);
            printf("line %3i ", i);

            for(int c=0; c<WIDTH; c++) {
                printf("%02X, ", ptrLine[c] & 0xFF);
            }
            printf("\r\n");
        }
    }
}

Solution

Thanks, Zan Lynx. Indeed, anyone planning to enable optimization later should put checkpoints in place to verify that the code still produces the same result once it is optimized.
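One possible form for such a checkpoint, as a minimal sketch only: hash each encoded row and compare the value between the -O0 and the optimized build. The names checkpoint_hash and checkpoint_ok are hypothetical, and FNV-1a is just one convenient choice of hash, not something the original code used.

```c
#include <stddef.h>
#include <stdint.h>

/* FNV-1a, 32-bit: a small non-cryptographic hash, used here only to
   fingerprint a buffer so two builds can be compared. */
uint32_t checkpoint_hash(const uint8_t *data, size_t len)
{
    uint32_t hash = 2166136261u;      /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        hash ^= data[i];
        hash *= 16777619u;            /* FNV prime */
    }
    return hash;
}

/* Hypothetical checkpoint: returns 1 when the buffer hashes to the
   reference value recorded from a known-good (e.g. -O0) build. */
int checkpoint_ok(const uint8_t *data, size_t len, uint32_t expected)
{
    return checkpoint_hash(data, len) == expected;
}
```

In do_func(), for instance, checkpoint_hash(ptrLine, WIDTH) could be printed after each enc_row() call, so that the logs of the -O0 and -O1 builds can be compared line by line.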

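The erroneous results above were caused by the pointer calculation. dispBuf is an array of color_t, so adding an offset to it already advances in units of sizeof(color_t). Multiplying by COLOR_SIZE again moved both pointers three times too far: dnPtr always ended up beyond the end of dispBuf, and so did upPtr from row 22 onward. Reading through such a pointer is undefined behavior, which would explain why the output changed with the optimization level. I replaced the two statements from the code above: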

color_t *upPtr = dispBuf + row * WIDTH * COLOR_SIZE;
color_t *dnPtr = dispBuf + (row + (HEIGHT / 2)) * WIDTH * COLOR_SIZE;

with:

color_t *upPtr = &dispBuf[row * WIDTH];
color_t *dnPtr = &dispBuf[(row + (HEIGHT / 2)) * WIDTH];
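To make the difference between the two calculations concrete, here is a minimal sketch that compares the element index each version produces. The helper names (row_index_buggy, row_index_fixed, index_in_bounds, row_ptr) are hypothetical and exist only for illustration; the struct and buffer sizes follow the question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint8_t r, g, b; } color_t;

#define WIDTH  128
#define HEIGHT 64

color_t dispBuf[WIDTH * HEIGHT];

/* Index as computed by the original statements: the extra
   sizeof(color_t) factor scales the offset a second time. */
size_t row_index_buggy(size_t row)
{
    return row * WIDTH * sizeof(color_t);
}

/* Index as computed by the corrected statements: pointer and array
   arithmetic already count in elements, not bytes. */
size_t row_index_fixed(size_t row)
{
    return row * WIDTH;
}

/* Returns 1 when index refers to an element inside dispBuf. */
int index_in_bounds(size_t index)
{
    return index < (size_t)WIDTH * HEIGHT;
}

/* Hypothetical helper: pointer to the first pixel of a row, with a
   debug-time bounds check as a guard against this kind of mistake. */
color_t *row_ptr(size_t row)
{
    size_t index = row_index_fixed(row);
    assert(index_in_bounds(index));
    return &dispBuf[index];
}
```

With these definitions, row_index_buggy(HEIGHT / 2) already lies outside the buffer, while row_index_fixed(HEIGHT - 1) is still inside it.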

While debugging with optimization turned on, I noticed that several functions had disappeared, i.e. the compiler had presumably inlined them or done their work itself.

In embedded work, keep in mind that optimization can also change the timing of interface signals to other devices: software-timed signals may run faster than intended or produce shorter pulses.

The code execution time was reduced by 60% with -O1.

Answered By - alsaleem

Answer Checked By - Gilberto Lyons (WPSolving Admin)

This answer was collected from Stack Overflow and tested by PythonFixing community admins. It is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
