Issue
I am rewriting some C++ code (originally written as a MEX function for Matlab) in Code::Blocks so that I can use debugging and profiling tools designed for C++. The code uses Eigen and SIMD intrinsics, so I need to compile with the -march=native flag. I was getting a memory access violation when running my main project. Here is a slimmed-down version of the code that causes the issue:
#include <iostream>
#include <fstream>
#include <string>
#include <sys/stat.h>
#include <immintrin.h>
#include <Eigen/Dense>
#include "Parameters.h"
using namespace std;
int main()
{
    Parameters p;
    p.na = 16;
    p.TXangle = Eigen::VectorXd::LinSpaced(p.na, 0, p.na - 1);
    cout << p.TXangle << endl;
    cout << "Hello world!" << endl;
    return 0;
}
where Parameters is a custom class defined with the following two files:
Parameters.h
#ifndef PARAMETERS_H_INCLUDED
#define PARAMETERS_H_INCLUDED
#include <Eigen/Dense> // needed for Eigen::VectorXd, keeps the header self-contained
class Parameters
{
public:
    int na;
    Eigen::VectorXd TXangle;
    Parameters();
};
#endif // PARAMETERS_H_INCLUDED
Parameters.cpp
#include <string>
#include <Eigen/Dense>
#include "Parameters.h"
Parameters::Parameters()
{
    // ctor
}
The line that breaks is the one that initializes p.TXangle. At that point, the program throws an access violation (0xC0000005). If I don't compile with -march=native, the error doesn't happen and the program runs fine. When building with -march=native I also get several alignment warnings. My computer supports up to AVX2 instructions and I'm compiling with MinGW GCC, version 8.1.0.
Update: Is this the value @Sedenion was asking about in the comments?
This is the exact line the debugger stops at:
Update: Based on the discussion in the comments, the disassembler shows that the program is at this assembly instruction when it fails:
I'm struggling to interpret this, reading assembly is still a bit new to me. Here are the registers at that same point:
Solution
I managed to reproduce this problem using Eigen 3.4.0 and mingw (gcc 8.1.0 with -mavx -m64 -std=c++17 -g) on Windows using AVX (-mavx, also enabled by -march=native for the OP). As already suspected by people in the comments, the cause is certainly that mingw-gcc fails to align stack variables to 32 bytes, which AVX requires (compare the bug issue for gcc, also see e.g. this post).
The crash is not related to the use of VectorXd (which should not suffer from it since it uses dynamic memory allocation). Rather, the Eigen::VectorXd::LinSpaced() call is the issue. In Eigen, this eventually calls the following function involving AVX instructions in PacketMath.h:
template<> EIGEN_STRONG_INLINE Packet4d plset<Packet4d>(const double& a) {
    return _mm256_add_pd(_mm256_set1_pd(a), _mm256_set_pd(3.0,2.0,1.0,0.0));
}
This call involves temporary stack variables that mingw does not align to 32 bytes. At one point, an aligned move (vmovapd) is attempted to such a non-aligned address:
mov rax,QWORD PTR [rbp+0x10]
vmovapd YMMWORD PTR [rax],ymm0
For example, in one run, I got rax=0x67f890, which is 16-byte aligned but not 32-byte aligned. A minimal reproducible example that captures the behavior is the following (https://godbolt.org/z/qbE6z1nb8; note that mingw is not supported on godbolt, so the problem does not appear there):
#include <iostream>
#include <immintrin.h>
__m256d Set(const double&) {
    __m256d temp = _mm256_setzero_pd(); // Crashes on mingw
    return temp;
}
int main() {
    Set(2);
    std::cerr << "End" << std::endl;
}
The unused parameter to Set() is just there to get an offset on the stack to trigger the issue.
It crashes using mingw-gcc (with -mavx -m64) but runs fine when compiled with clang or MSVC on Windows. It also runs fine on all Linux compilers I have tried.
So, in short, your code is correct and the crash occurs in Eigen. There is no "magic switch" for gcc to fix this. Hence, I guess you have 3 options:
- Wait for the mingw problem to get fixed. According to the posts, it still persists in gcc 11.2.0. But considering the long history, I doubt it will get fixed soon.
- Do not compile with AVX or higher (or -march=native) and stick to <=SSE4.2 instead. Of course, this might impact performance. I'd advise profiling to see whether this is the case for you.
- Use another compiler such as clang or MSVC.
Answered By - Sedenion Answer Checked By - Clifford M. (WPSolving Volunteer)