Issue
I'm getting what looks like a race condition when using the C standard library's wide-character functions. Here's a minimal reproducible example in C, where I list the compilers I've tried, the library version, and the expected and actual outputs:
// temp.cpp
#include <wchar.h>
#include <assert.h>
#include <locale.h>
int main() {
    setlocale(LC_ALL, "");
    char utf8[] = {'a'};
    wchar_t output[1];
    mbstate_t ps;
    size_t ret = mbrtowc(output, utf8, 1, &ps);
    assert(ret == 1);
    assert(output[0] == L'a');
    return 0;
// let me define BAD output as:
// most of the time (~9/10), I get the following error:
// a.out: ../iconv/loop.c:457: utf8_internal_loop_single: Assertion `inptr - bytebuf > (state->__count & 7)' failed.
// Aborted (core dumped)
// the other (~1/10 times) the program ends correctly (exit code 0)
// and let me define GOOD output as:
// the program ends correctly (exit code 0), always
// compiler output
//
// gcc-7 BAD
// gcc-9 BAD
// gcc-10 BAD
// clang-10 GOOD
// clang-12 GOOD
// compile command: gcc temp.cpp
// clang temp.cpp
// gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
// glibc version: 2.31
}
My questions are as follows:
- Can anyone else reproduce this, or is it just me?
- What is the meaning of the assertion being tripped?
- Why is this happening?
Edit: ah, C++ default ctors have spoiled me. (Also, in my original source I was working with std::mbrtowc, which threw me off.)
Introspection: I will indeed use valgrind more in these cases.
Solution
You forgot to initialize ps.
In all of the above cases, if ps [fourth argument] is NULL, a static anonymous state known only to the mbrtowc() function is used instead. Otherwise, *ps must be a valid mbstate_t object. An mbstate_t object a can be initialized to the initial state by zeroing it, for example using
memset(&a, 0, sizeof(a));
Your object ps is uninitialized and therefore cannot in general be a valid mbstate_t object.
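As a minimal sketch of the fix, the conversion state can be zeroed before the first call to mbrtowc(). The helper name decode_first is hypothetical and only used here for illustration; the relevant part is the memset on ps, which is what the question's program was missing.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper (not from the original post): converts the first
   multibyte character of s (at most n bytes) into *out, starting from the
   initial conversion state. Returns whatever mbrtowc() returns. */
size_t decode_first(const char *s, size_t n, wchar_t *out) {
    assert(s != NULL && out != NULL);

    mbstate_t ps;
    memset(&ps, 0, sizeof ps);  /* zeroing puts ps in the initial state */
    /* Writing "mbstate_t ps = {0};" at the declaration should work as well. */

    return mbrtowc(out, s, n, &ps);
}
```

In the question's main(), the same effect is obtained by adding memset(&ps, 0, sizeof ps); right after the declaration of ps, after which both assertions should hold on every run.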
valgrind detects this bug.
Answered By - Nate Eldredge