Issue
I am trying to decode an RGB buffer into rows suitable for an RGB LED matrix on an STM32F4 (gcc in the SW4STM32 IDE). The code below works perfectly when compiled with -O0.
The code produces a different result when compiled with -O1 (also with -O2 and -O3). It also produces a different result at -O1 when __attribute__((packed)) is added to the struct s_color_t definition.
The optimization level is changed for this .c file only; the other files are still compiled with -O0.
Can someone spot why the optimization changes the code's logic/behavior?
uint8_t badr[24576] = { 0xff , 0x2a, .... };
struct s_color_t
{
    uint8_t r;
    uint8_t g;
    uint8_t b;
};
//__attribute__((packed))
typedef struct s_color_t color_t;
#define WIDTH 128
#define HEIGHT 64
#define COLOR_SIZE sizeof(color_t)
#define R0_POS 0x01
#define G0_POS 0x02
#define B0_POS 0x04
#define R1_POS 0x08
#define G1_POS 0x10
#define B1_POS 0x20
#define enc_color(c, s, r) (c & s)? r : 0
color_t dispBuf[ WIDTH * HEIGHT];
uint8_t dispLine[WIDTH];
void fillBuf()
{
    uint8_t *dbuf = (uint8_t *) dispBuf;
    memcpy(dbuf, badr, WIDTH * HEIGHT * COLOR_SIZE);
}
void enc_row(uint8_t slice, uint16_t row, uint8_t *rBuf)
{
    uint8_t reg;
    color_t *upPtr = dispBuf + row * WIDTH * COLOR_SIZE;
    color_t *dnPtr = dispBuf + (row + (HEIGHT / 2)) * WIDTH * COLOR_SIZE;
    uint8_t *destPtr = rBuf;
    uint8_t sn = (1 << slice);
    for (int i = 0; i < WIDTH; i++) {
        reg = 0;
        reg |= enc_color((*upPtr).r, sn, R0_POS);
        reg |= enc_color((*upPtr).g, sn, G0_POS);
        reg |= enc_color((*upPtr).b, sn, B0_POS);
        reg |= enc_color((*dnPtr).r, sn, R1_POS);
        reg |= enc_color((*dnPtr).g, sn, G1_POS);
        reg |= enc_color((*dnPtr).b, sn, B1_POS);
        *destPtr = reg;
        upPtr++;
        dnPtr++;
        destPtr++;
    }
}
void do_func()
{
    uint8_t *ptrLine = dispLine;
    fillBuf();
    for (int s = 0; s < 8; s++) {
        for (int i = 0; i < (HEIGHT / 2); i++) {
            enc_row(s, i, ptrLine);
            printf("line %3i ", i);
            for (int c = 0; c < WIDTH; c++) {
                printf("%02X, ", ptrLine[c] & 0xFF);
            }
            printf("\r\n");
        }
    }
}
Solution
Thanks, Zan Lynx. Indeed, anyone planning to enable optimization later should add checkpoints that verify the code produces the same result with optimization turned on.
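One way such a checkpoint could look is a checksum over each encoded row, recorded once from a build whose output is known to be good and compared again in the optimized build. The following is only a minimal sketch under that assumption; fletcher16, checkpoint_ok and verify_row are hypothetical helper names, not part of the original code, and Fletcher-16 is just one convenient checksum choice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fletcher-16 checksum over a byte buffer. */
static uint16_t fletcher16(const uint8_t *data, size_t len)
{
    uint16_t sum1 = 0, sum2 = 0;
    for (size_t i = 0; i < len; i++) {
        sum1 = (uint16_t)((sum1 + data[i]) % 255);
        sum2 = (uint16_t)((sum2 + sum1) % 255);
    }
    return (uint16_t)((sum2 << 8) | sum1);
}

/* Returns 1 when the buffer matches the reference checksum, else 0. */
static int checkpoint_ok(const uint8_t *buf, size_t len, uint16_t expected)
{
    return fletcher16(buf, len) == expected;
}

/* Usage sketch: fail loudly when a row differs from the reference build.
   Note that assert() is compiled out when NDEBUG is defined. */
static void verify_row(const uint8_t *row, size_t len, uint16_t expected)
{
    assert(checkpoint_ok(row, len, expected));
}
```

In the code above, something like verify_row(dispLine, WIDTH, expected) could be called after each enc_row, where expected is a value captured from a verified run.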
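The erroneous results above were caused by the pointer calculation. Arithmetic on a color_t * already advances in units of sizeof(color_t), so multiplying the offset by COLOR_SIZE again moves the pointer three times too far and reads outside dispBuf. That is undefined behavior, which is why the output depended on the optimization level. I replaced these two statements from the code above: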
color_t *upPtr = dispBuf + row * WIDTH * COLOR_SIZE;
color_t *dnPtr = dispBuf + (row + (HEIGHT / 2)) * WIDTH * COLOR_SIZE;
with:
color_t *upPtr = &dispBuf[row * WIDTH];
color_t *dnPtr = &dispBuf[(row + (HEIGHT / 2)) * WIDTH];
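The difference between the two forms can be checked on a host machine without ever forming an out-of-range pointer, by comparing the element offsets each expression asks for. This is a minimal sketch that reuses the dimensions from the question; buf_count, buggy_index, fixed_index, row_ptr, step_bytes and demo_checks are hypothetical helpers introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WIDTH  128
#define HEIGHT 64

typedef struct { uint8_t r, g, b; } color_t;

static color_t dispBuf[WIDTH * HEIGHT];

/* Number of elements in dispBuf. */
static size_t buf_count(void) { return sizeof dispBuf / sizeof dispBuf[0]; }

/* Element offset requested by the original expression. */
static size_t buggy_index(size_t row) { return row * WIDTH * sizeof(color_t); }

/* Element offset requested by the corrected expression. */
static size_t fixed_index(size_t row) { return row * WIDTH; }

/* Bounds-checked row lookup: returns NULL for rows outside the buffer. */
static color_t *row_ptr(size_t row)
{
    return (row < HEIGHT) ? &dispBuf[fixed_index(row)] : NULL;
}

/* Byte distance covered by one pointer increment. */
static size_t step_bytes(void)
{
    return (size_t)((char *)(dispBuf + 1) - (char *)dispBuf);
}

/* Usage sketch: these checks hold for the dimensions above. */
static void demo_checks(void)
{
    /* One pointer step already covers a whole color_t. */
    assert(step_bytes() == sizeof(color_t));
    /* The corrected offset for the first lower-half row is in bounds... */
    assert(fixed_index(HEIGHT / 2) < buf_count());
    /* ...while the original offset lands past the end of the array. */
    assert(buggy_index(HEIGHT / 2) >= buf_count());
}
```

With these dimensions the original dnPtr expression is already out of range for row 0, because its offset starts at the first row of the lower half.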
While debugging with optimization turned on, I noticed that several functions had disappeared, i.e. the compiler had performed their work itself (presumably by inlining them).
On embedded targets, the timing of interface signals to other devices may change as well: the faster code can effectively overclock a bus or produce shorter pulses.
The code execution time was reduced by 60% with -O1.
Answered by alsaleem. Answer checked by Gilberto Lyons (WPSolving Admin).