泰晓科技 -- 聚焦 Linux - 追本溯源,见微知著!

泰晓 Linux 实验盘,不用安装,即插即跑

LWN 228143: 可延迟定时器

Wang Chen 创作于 2019/12/31

请点击 LWN 中文翻译计划,了解更多详情。

原文:Deferrable timers 原创:By Jonathan Corbet @ Mar. 28, 2007 翻译:By unicornx 校对:By Shaolin Deng

The dynamic tick code featured in the upcoming 2.6.21 kernel seeks to avoid processor wakeups by turning off the period timer tick when nothing is happening. Before stopping the clock, the kernel must decide when it should wake up again; this decision involves looking at the timer queue to see when the next timer expires. In the absence of other events (hardware interrupts, for example), the system will sleep until the nearest timer is due.

即将随 2.6.21 版本内核发布的 “动态时钟节拍(dynamic tick)” 功能在系统空闲时通过关闭周期定时器中断来避免唤醒处理器。在停止时钟中断前,内核必须确定何时应该再次重启中断。为此内核需要遍历定时器队列来检查最近一次到期的定时器的时间。在没有其他(诸如硬件中断)事件的影响下,系统将一直睡眠,直到最近的定时器事件到期才会被唤醒。

Many of these timers should, in fact, run as soon as the requested period has expired. Others, however, are less important - to the point that they are not worth waking up the processor. These non-critical timeouts can run some fraction of a second later (when the processor wakes up for other reasons) and nobody will notice the difference. So it would be nice if there were a way to tell the kernel that a specific timer does not require immediate action on expiration and that the processor should not wake up for the sole purpose of handling it.


Venki Pallipadi has created such a way with the deferrable timers patch. There is just one new function added to the internal kernel API:

出于该目的,Venki Pallipadi 提交了一份 可延迟定时器(deferrable timers)补丁(译者注,该补丁随 2.6.22 合入内核,具体的最终代码提交请参考 这里)。内核内部仅添加了一个新的 API:

void init_timer_deferrable(struct timer_list *timer);

Timers which are initialized in this fashion will be recognized as deferrable by the kernel. They will not be considered when the kernel makes its “when should the next timer interrupt be?” decision. When the system is busy these timers will fire at the scheduled time. When things are idle, instead, they will simply wait until something more important wakes up the processor.

用该方法初始化的定时器将被内核标识为为 “可延迟” 的。当内核查找 “下一个会发生的定时器中断” 时,将会忽略这类定时器。系统繁忙时,这些定时器仍然将按时超时并被触发执行。但当系统空闲时,它们的到期事件将会被推迟,直到有更重要的事件唤醒处理器时才会一并处理它们。

Venki appears to have gone to great length to minimize the changes required by this patch. So, in particular, the timer_list structure does not change at all. Instead, the low-order bit on an internal pointer (which is known to always be zero) is repurposed as a “deferrable” flag. The result is that the timer_list structure does not grow to support this new functionality, at the cost of requiring all code using the internal base pointer to mask out the “deferrable” bit.

看上去 Venki 已竭尽全力避免该补丁可能引入的改动。其实现的效果是,timer_list 这个结构体的定义完全不变。唯一的修改只是复用了该结构体内部的一个指针成员(译者注,即 base)上最低的那个比特位(正常情况下该比特位应该始终为零)作为标识 “可延迟” 的标志。所以,在支持该新特性的同时,timer_list 结构体的大小不会变大,但代价是要求所有会访问该 base 指针的代码不要覆盖新定义的 “可延迟” 比特位。

The patch, as presented, only affects timers used within the kernel; no code has been changed to actually use deferrable timers yet. There could be potential in extending this interface somehow to user space. Our user space remains full of applications which feel the need to wake up frequently to check the state of the world; these applications are a real problem for power-limited systems. If those applications truly cannot be fixed, perhaps they could at least indicate a willingness to wait when nothing important is going on.

当前该补丁仅会影响内核中对定时器的使用。其他部分的代码还没有真正开始实际使用 “可延迟定时器” 这项功能。或许可以将该项特性暴露给用户空间。用户空间的进程经常需要设置超时来唤醒自己检查某些状态。但这么做对于功耗敏感的系统来说是一个大问题。如果实在无法修改这些应用自身的逻辑的话,或许可以尝试一下让应用在启动定时器时通知一下内核是否可以推迟触发,这样在系统空闲期间,即使定时器到期,内核也不会立刻触发这些事件。

请点击 LWN 中文翻译计划,了解更多详情。

Read Album:

Read Related:

Read Latest: