Wednesday, March 16, 2022

[SOLVED] C++ in containerized Linux environment: why does attempting to allocate a large vector cause SIGABRT or a never-ending loop instead of bad_alloc?

Issue

I am writing in C++ on a Windows machine in three environments:

  1. Docker Linux container with Ubuntu - g++ compiler (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
  2. WSL Ubuntu environment on Windows - g++ compiler (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
  3. Windows - gcc compiler (i686-posix-dwarf-rev0, Built by MinGW-W64 project) 8.1.0

I am working with enormous datasets and need to break them down into as-large-as-possible chunks to manipulate in memory. To find a workable block size, I imagined something like the following:

#include <iostream>
#include <string>
#include <vector>

// Repeatedly shrink the requested size until a vector of that size can be allocated.
size_t findOptimalSize(long long beginningSize) {
  long long currentSize = beginningSize;
  while (currentSize > 0) {
    try {
      std::vector<double> v(currentSize);
      return currentSize;
    } catch (std::bad_alloc &ex) {
      currentSize /= 10;
    }
  }
  return 0;
}

int main() {
  long long size(50000000000000);
  try {
    std::vector<double> v(size);
    std::cout << "success" << std::endl;
  } catch (std::bad_alloc &ex){
    std::cout << "badAlloc" << std::endl;
  }
  size_t optimal = findOptimalSize(size);
  std::cout << "optimal size: " + std::to_string(optimal);
  return 0;
}

The above code behaves exactly as expected in the Windows environment. In the two Linux environments, however, while it always throws the first bad_alloc exception, it then does one of two things:

  1. Throws a SIGABRT with the following message:

new cannot satisfy memory request. This does not necessarily mean you have run out of virtual memory. It could be due to a stack violation caused by e.g. bad use of pointers or an out-of-date shared library

  2. After a few iterations, it seems to fall into an endless loop at the line std::vector<double> v(currentSize); (my best guess is that it gets close to the amount of memory the Linux environment has available and gets stuck waiting for that last bit of memory to be freed to satisfy the request)

Is there some way to accomplish what I'm attempting and sidestep these issues in the Linux environments? I'm guessing that the containers complicate matters and muddy up my simple allocation checks with their complex memory management logic.

How can I check whether I can allocate the memory in these circumstances?


Solution

Containers don't have complex memory management logic. What you're seeing is a result of a surprising Linux policy known as memory overcommit.

In Linux, large allocations do not fail; malloc() always succeeds. The memory isn't actually allocated until you attempt to use it. If the OS can't satisfy the demand at that point, it invokes the OOM killer, killing processes until enough memory is freed.
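A minimal sketch of overcommit in action is below. The 64 GiB figure is an assumption; pick something larger than your machine's physical RAM but not so absurd that the kernel refuses it up front (very large requests, like the 400 TB implied by the question's first allocation, can still fail immediately):

#include <cstdlib>
#include <cstring>
#include <iostream>

int main() {
  // Assumed size: larger than physical RAM on the test machine, but small
  // enough that the kernel is still willing to hand out the address space.
  const std::size_t bytes = 64ull * 1024 * 1024 * 1024;  // 64 GiB

  void *p = std::malloc(bytes);
  std::cout << (p ? "malloc succeeded" : "malloc failed") << std::endl;

  // Only address space was reserved above. Physical pages are assigned when
  // they are first written, so the line below is where the OOM killer (or
  // heavy swapping) would strike; it is deliberately left commented out.
  // std::memset(p, 0, bytes);

  std::free(p);
  return 0;
}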

Why does this exist?

A Linux computer typically has a lot of heterogeneous processes running in different stages of their lifetimes. Statistically, at any point in time, they do not collectively need a mapping for every virtual page they have been assigned (or will be assigned later in the program run).

A strictly non-overcommitting scheme would create a static mapping from virtual address pages to physical RAM page frames at the moment the virtual pages are allocated. This would result in a system that can run far fewer programs concurrently, because a lot of RAM page frames would be reserved for nothing.

(source)

You might find this ridiculous. You wouldn't be alone. It's a highly controversial system. If your initial reaction is "this is stupid," I encourage you to read up on it and suspend judgment for a bit. Ultimately, whether you like overcommit or not, it's a fact of life that all Linux developers have to accept and deal with.
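If you still want a rough number to base your chunk size on, one option is to ask the kernel what it currently considers available. This is only an advisory snapshot (another process can consume that memory a moment later), but under overcommit it is about as honest an answer as you will get. A sketch that reads the MemAvailable line from /proc/meminfo:

#include <fstream>
#include <iostream>
#include <sstream>
#include <string>

// Returns the kernel's MemAvailable estimate in bytes, or 0 if it can't be
// read. This is a snapshot, not a reservation.
long long availableMemoryBytes() {
  std::ifstream meminfo("/proc/meminfo");
  std::string line;
  while (std::getline(meminfo, line)) {
    if (line.rfind("MemAvailable:", 0) == 0) {
      std::istringstream fields(line.substr(13));
      long long kib = 0;
      fields >> kib;  // the value is reported in kiB
      return kib * 1024;
    }
  }
  return 0;
}

int main() {
  std::cout << "roughly available: " << availableMemoryBytes() << " bytes" << std::endl;
  return 0;
}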



Answered By - John Kugelman
Answer Checked By - Mildred Charles (WPSolving Admin)