泰晓科技 -- 聚焦 Linux - 追本溯源,见微知著!


LWN 382559: `NO_BOOTMEM` 补丁

Wang Chen 创作于 2018/09/05

原文:The NO_BOOTMEM patches 原创:By corbet @ April 7, 2010 翻译:By unicornx 校对:By Wu Zhangjin

Every kernel development cycle seems to involve one set of patches which turn out to be more trouble than had been expected. With 2.6.34, that award should probably go to the patches found under the somewhat confusing CONFIG_NO_BOOTMEM option.

在每个内核开发周期中似乎都会碰到一些会给大家带来额外 “惊喜” 的补丁(译者注:此处为反语)。在 2.6.34 版本开发期间,该 “荣誉” 属于引入 CONFIG_NO_BOOTMEM 选项的补丁,它给大家带来了不小的麻烦 。

“Bootmem” is a simple, low-level memory allocator used by the kernel during the early parts of the bootstrap process. One might think that the kernel does not need yet another allocator, but the memory management code used during operation requires that much of the kernel already be functional before it can be called. Getting to that point involves a chain of increasingly complicated memory allocation mechanisms; on the x86 architecture, those begin the “early_res” mechanism which takes over from the BIOS “e820” facility. Once things get a little farther, the architecture-independent bootmem allocator takes over, followed, eventually, by the full buddy allocator.

“Bootmem” 是内核在初始化过程的早期阶段所使用的一种简单的偏底层的内存分配器(译者注,相对于内核正常运行时所使用的虚拟内存管理子系统)。有些人可能认为内核并不需要这样的分配器,但是实际情况是内核正常运行期间所使用的内存管理代码过于复杂,在其能够正常工作之前还必须依赖于内核的其他子模块的运行。从系统上电到内核虚拟内存管理开始运行涉及一系列越来越复杂的内存分配机制;在 x86 架构上,内核首先基于 “e820” 机制(译者注,具体参考 维基百科的 e820 词条)从 BIOS 获取物理内存信息,并按照 “early_res” 方式对其进行管理(译者注:具体参考内核代码的 arch/x86/kernel/e820.c)。随着系统的进一步初始化,内存管理由独立于体系架构的 bootmem 分配器所临时接管,最后正式过渡给伙伴(buddy)分配器。

Yinghai Lu came to the conclusion that things could be simplified considerably if the bootmem stage were taken out of the picture. The result was a series of patches which extends the use of the early_res mechanism for long enough to bootstrap the buddy allocator. These changes were merged for 2.6.34, but the old bootmem-based code was left behind. The CONFIG_NO_BOOTMEM option controls which allocator is used, with the default being to short out bootmem.

Yinghai Lu 研究后认为,如果将 bootmem 阶段从整个初始化过程中删除,事情可以大大简化。基于该思路他提交了一套补丁,其基本思路是将 “early_res” 阶段尽可能地延长直到足以引导伙伴(buddy)分配器。该更改合入了内核版本 2.6.34,原有的基于 bootmem 的代码被保留。用户可以通过新增的 CONFIG_NO_BOOTMEM 选项控制是否启用 bootmem 分配器,缺省设置为 y,即不启用。

This is a significant change to the crucial and tricky early bootstrap code, so few people were surprised when some regressions were reported against 2.6.34-rc1. When the reports continued to arrive after -rc3, though, the level of irritation began to grow, to the point that Linus started talking about reverting the whole thing. Nobody seemed to dislike the objectives of the patches, but system-killer regressions after -rc3, along with the twisted mess of #ifdefs created by the patch and the fact that it was on by default led to some grumpiness.

这是一个对关键且微妙的内核初始化早期阶段部分代码的重大改变,所以当第一阶段 2.6.34-rc1 的测试中爆出问题时,大家并没有觉得有什么不正常。但是当集成测试进展到第三阶段 -rc3 时,问题还是没有解决,终于开始有人对此提出抱怨,并且反对的声音逐渐高涨,一度导致 Linus 也开始谈论是否需要回退该补丁。其实大家一开始还是支持该补丁的改动的,但由于一直持续到 -rc3 之后该问题仍然存在,以及补丁中大量条件编译 #ifdef 的随意使用,特别地由于该补丁默认不启用 bootmem 导致了一些系统启动失败,引发了一定程度的混乱,让大家对它实在忍无可忍。

Normally, new features are expected to be configured out by default; to the greatest extent possible, a new kernel should behave as much like its predecessors as possible when the default options are taken. In this case, the default led to significant changes and problems. The purpose of this option was twofold: to allow the new code to be configured out when it proved to be problematic, and to ensure that it was well tested in the mean time. Certainly it was successful on both fronts, even if some of the testers proved to be not entirely willing.


As of this writing, it would appear that the worst problems have been fixed; talk of removing the no-bootmem code has subsided. Eventually, perhaps, all architectures will make similar changes and the bootmem code can be removed entirely. Meanwhile, Yinghai has a new set of changes on the horizon for 2.6.35: replacing the early_res code with the “logical memory block” allocator currently used by some other architectures. That change looks even more disruptive than the bootmem elimination was.

截至本稿发布,这个严重的问题看上去已经得到了解决;至少已经很少听到那些要求删除该补丁的抱怨声了。最终,也许所有架构都会进行类似的改进,并最终完全剔除 bootmem 的代码。与此同时,Yinghai 针对 2.6.35 又提交了一系列新的更改:这次他期望将 “early_res” 的代码替换为其他一些架构当前使用的 “logical memory block” 分配器(译者注:即最新内核中使用的 memblock 分配器)。这个改进看起来比本文所介绍的去除 bootmem 的改动更加激进。

Read Album:

Read Related:

Read Latest: