Tuesday, February 22, 2022

[SOLVED] Kernel Build Caching/Nondeterminism

Issue

I run a CI server which I use to build a custom linux kernel. The CI server is not powerful and has a time limit of 3h per build. To work within this limit, I had the idea to cache kernel builds using ccache. My hope was that I could create a cache once every minor version release and reuse it for the patch releases e.g. I have a cache I made for 4.18 which I want to use for all 4.18.x kernels.

After removing the build timestamps, this works great for the exact kernel version I am building for. For the 4.18 kernel referenced above, building that on the CI gives the following statistics:

$ ccache -s
cache directory                     
primary config                      
secondary config      (readonly)    /etc/ccache.conf
stats zero time                     Thu Aug 16 14:36:22 2018
cache hit (direct)                 17812
cache hit (preprocessed)              38
cache miss                             0
cache hit rate                    100.00 %
called for link                        3
called for preprocessing           29039
unsupported code directive             4
no input file                       2207
cleanups performed                     0
files in cache                     53652
cache size                           1.4 GB
max cache size                       5.0 GB

Cache hit rate of 100% and an hour to complete the build, fantastic stats and as expected.

Unfortunately, when I try to build 4.18.1, I get

cache directory                     
primary config                      
secondary config      (readonly)    /etc/ccache.conf
stats zero time                     Thu Aug 16 10:36:22 2018
cache hit (direct)                     0
cache hit (preprocessed)             233
cache miss                         17658
cache hit rate                      1.30 %
called for link                        3
called for preprocessing           29039
unsupported code directive             4
no input file                       2207
cleanups performed                     0
files in cache                     90418
cache size                           2.4 GB
max cache size                       5.0 GB

That's a 1.30% hit rate and the build time reflects this poor performance. That from only a single patch version change.

I would have expected the caching performance to degrade over time but not to this extent, so my only thought is that there is more non-determinism than simply the timestamp. For example, are most/all of the source files including the full kernel version string? My understanding is that something like that would break the caching completely. Is there a way to make the caching work as I'd like it to or is it impossible?


Solution

There is include/generated/uapi/linux/version.h header (generated in the top Makefile https://elixir.bootlin.com/linux/v4.16.18/source/Makefile)

which includes exact kernel version as macro:

version_h := include/generated/uapi/linux/version.h
old_version_h := include/linux/version.h

define filechk_version.h
    (echo \#define LINUX_VERSION_CODE $(shell                         \
    expr $(VERSION) \* 65536 + 0$(PATCHLEVEL) \* 256 + 0$(SUBLEVEL)); \
    echo '#define KERNEL_VERSION(a,b,c) (((a) << 16) + ((b) << 8) + (c))';)
endef

$(version_h): $(srctree)/Makefile FORCE
    $(call filechk,version.h)
    $(Q)rm -f $(old_version_h)

So, version.h for linux 4.16.18 will be generated like (266258 is (4 << 16) + (16 << 8) + 18 = 0x41012)

#define LINUX_VERSION_CODE 266258
#define KERNEL_VERSION(a,b,c) (((a) << 16) + ((b) << 8) + (c))

Later, for example in module building there should be way to read LINUX_VERSION_CODE macro value https://www.tldp.org/LDP/lkmpg/2.4/html/lkmpg.html (4.1.6. Writing Modules for Multiple Kernel Versions)

The way to do this to compare the macro LINUX_VERSION_CODE to the macro KERNEL_VERSION. In version a.b.c of the kernel, the value of this macro would be 2^{16}a+2^{8}b+c. Be aware that this macro is not defined for kernel 2.0.35 and earlier, so if you want to write modules that support really old kernels

How version.h is included? The sample module includes <linux/kernel.h> <linux/module.h> and <linux/modversions.h>, and one of these files probably indirectly includes global version.h. And most or even all kernel sources will include version.h.

When your build timestamps were compared, version.h may be regenerated and disables ccache. When timestamps are ignored, LINUX_VERSION_CODE is same only for exactly same linux kernel version, and it is changed for next patchlevel.

Update: Check gcc -H output of some kernel object compilation, there will be another header with full kernel version macro definition. For example: include/generated/utsrelease.h (UTS_RELEASE macro), include/generated/autoconf.h (CONFIG_VERSION_SIGNATURE).

Or even do gcc -E preprocessing of same kernel object compilation between two patchlevels and compare the generated text. With simplest linux module I have -include ./include/linux/kconfig.h directly in gcc command line, and its includes include/generated/autoconf.h (but this is not visible in -H output, is it bug or feature of gcc?).

https://patchwork.kernel.org/patch/9326051/

... because the top Makefile forces to include it with:

-include $(srctree)/include/linux/kconfig.h

It actually does: https://elixir.bootlin.com/linux/v4.16.18/source/Makefile

# Use USERINCLUDE when you must reference the UAPI directories only.
USERINCLUDE    := \
        -I$(srctree)/arch/$(SRCARCH)/include/uapi \
        -I$(objtree)/arch/$(SRCARCH)/include/generated/uapi \
        -I$(srctree)/include/uapi \
        -I$(objtree)/include/generated/uapi \
                -include $(srctree)/include/linux/kconfig.h

# Use LINUXINCLUDE when you must reference the include/ directory.
# Needed to be compatible with the O= option
LINUXINCLUDE    := \
        -I$(srctree)/arch/$(SRCARCH)/include \
        -I$(objtree)/arch/$(SRCARCH)/include/generated \
        $(if $(KBUILD_SRC), -I$(srctree)/include) \
        -I$(objtree)/include \
        $(USERINCLUDE)

LINUXINCLUDE is exported to env and used in source/scripts/Makefile.lib to define compiler flags https://elixir.bootlin.com/linux/v4.16.18/source/scripts/Makefile.lib

  c_flags        = -Wp,-MD,$(depfile) $(NOSTDINC_FLAGS) $(LINUXINCLUDE)    


Answered By - osgx
Answer Checked By - Katrina (WPSolving Volunteer)