TinyLab (泰晓科技): Focused on Linux, tracing things back to their source and seeing the whole from small details!
Website: https://tinylab.org

RISC-V Linux Kernel and Related Technology News, Issue 111

Written by 呀呀呀 on 2024/09/30

Date: 20240930
Editor: 晓瑜
Repository: RISC-V Linux Kernel Technology Research Activity
Sponsor: PLCT Lab, ISCAS

Kernel News

RISC-V Architecture Support

v1: riscv: mm: check the SV39 rule

SV39 rule: address bits[63..39] must all be the same as bit[38]. The rule is easy to violate if PAGE_OFFSET is configured too small.
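
The rule quoted above can be sketched as a small check. This is only an illustration of the bit rule, not code from the patch; the function name `is_sv39_canonical` and the sample addresses are assumptions made for this example.

```python
SV39_VA_BITS = 39  # Sv39 uses 39-bit virtual addresses (bits 38..0)


def is_sv39_canonical(addr: int) -> bool:
    """Return True if bits [63..39] of a 64-bit address all equal bit 38.

    Hypothetical helper for illustration; it is not a kernel function.
    """
    addr &= (1 << 64) - 1                 # treat the value as 64 bits wide
    upper = addr >> (SV39_VA_BITS - 1)    # bits [63..38], i.e. 26 bits
    all_ones = (1 << 26) - 1
    # Bits 63..39 match bit 38 exactly when bits 63..38 are all 0 or all 1.
    return upper == 0 or upper == all_ones


# Example values chosen for illustration only.
print(is_sv39_canonical(0xFFFFFFC000000000))
print(is_sv39_canonical(0xFFFFFF8000000000))
```

An address whose bit 38 differs from any of the bits above it fails the check, which is the situation a too-small PAGE_OFFSET can produce.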

v2: riscv: add Svukte extension

The Svukte extension will be added to the RISC-V privileged spec.

v2: Introduce support for T-head TH1520 Mailbox

The T-head TH1520 SoC supports a hardware mailbox that enables two cores within the SoC to communicate and coordinate.

v2: Add the dwmac driver support for T-HEAD TH1520 SoC

v2: Add initial support for Canaan Kendryte K230 pinctrl

This patch series introduces support for the pinctrl driver of the Canaan K230 SoC.

v1: irqchip/sifive-plic: Unmask interrupt in plic_irq_enable()

An example where interrupt is both disabled and masked is when handle_fasteoi_irq() is the handler, and IRQS_ONESHOT is set.

v2: Support SSTC while PM operations

When a CPU is about to be hotplugged, stop stimecmp to prevent a pending interrupt.

v3: riscv: Idle thread using Zawrs extension

This patch series introduces a new implementation of the idle thread using the Zawrs extension.

v3: Add Framework FRANME0000 dts

This is a developer-focused product, aimed at making tinkering with RISC-V more accessible.

v1: mmc: core: Only set maximum DMA segment size if DMA is supported

Since upstream commit 334304ac2bac (“dma-mapping: don’t return errors from dma_set_max_seg_size”) calling dma_set_max_seg_size() on a device not supporting DMA results in a warning traceback. This is seen when booting the sifive_u machine from SD.

GIT PULL: RISC-V Patches for the 6.12 Merge Window, Part 1

There are two conflicts here. The IRQ one seems pretty straight-forward, just two features colliding.

v1: Add initial support for Canaan Kendryte K230 reset controller

This patch series adds reset controller support for the Canaan Kendryte K230 SoC.

v2: CAST Controller Area Network driver support

This patchset adds support for the CAST Controller Area Network Bus Controller (version fd-7x10N00S00) which is used in StarFive JH7110 SoC.

LoongArch Architecture Support

v6: Consolidate IO memcpy functions

Thank you Catalin for the feedback. It’s not a nitpick. I have addressed it, and added the architecture before the message for the 3 commits that modify arch code.

v5: Add EDAC driver for loongson memory controller

Add a simple EDAC driver that reports only single-bit errors (CE) on the Loongson platform.

v2: ASoC: Some issues about loongson i2s

This patch set is mainly about Loongson i2s related issues.

v1: compiler.h: Specify correct attribute for .rodata..c_jump_table

Currently, there is an assembler message when generating kernel/bpf/core.o under CONFIG_OBJTOOL with the LoongArch compiler toolchain.

ARM Architecture Support

v1: ARM: topology: Allow missing CPU clock-frequency device-tree property

Allow the fallback mechanism to continue by assuming the same nominal frequency for all CPU cores, while still benefiting from the static coefficient provided by the compatible-driven table entries.

v1: KVM: arm64: Another reviewer reshuffle

It has been a while since James had any significant bandwidth to review KVM/arm64 patches.

v2: Add minimal boot support for IPQ5424

This series adds minimal board boot support for ipq5424-rdp466 board.

v1: PCI: add enabe(disable)_device() hook for bridge

On some systems, the IOMMU stream (master) ID has fewer bits (for example, 6 bits) than the pci_device_id (16 bits).

v3: Add initial support for QCS615 SoC and QCS615 RIDE board

Introduces the Device Tree for the QCS615 platform.

v1: Add I2C mux on BUS 14 for yosemite4

  • Add i2c-mux for ADC monitor on Spider Board.
  • Revise adc128d818 adc mode on Fan Boards.
  • Change the address of Fan IC on fan boards.
  • Remove led gpio pca9552 on fan boards.
  • Add i2c mux for two fan boards.

v2: soc: imx8m: Probe the SoC driver as platform driver

With driver_async_probe=* on the kernel command line, the following trace is produced on i.MX8M Plus hardware because the soc-imx8m.c driver calls of_clk_get_by_name(), which returns -EPROBE_DEFER because the clock driver is not yet probed. This was not detected during regular testing without driver_async_probe.

v5: Add support for new IMX8MP based board

This series originally included the dt-binding for that Type-C port controller but I finally removed it based on a good comment from Krzysztof.

v2: Add initial support for QCS8300 SoC and QCS8300 RIDE board

Introduce the Device Tree for the QCS8300 platform.

v5: Initial device trees for A7-A11 based Apple devices

This series adds device trees for all A7-A11 SoC based iPhones, iPads, iPod touches and Apple TVs.

v1: Revise Meta (Facebook) Minerva BMC (AST2600)

Revise linux device tree entry related to Meta (Facebook) Minerva specific devices connected to BMC (AST2600) SoC.

v5: Do not shatter hugezeropage on wp-fault

It was observed at [1] and [2] that the current kernel behaviour of shattering a hugezeropage is inconsistent and suboptimal.

v2: Adjust the setting for SPI flash of yosemite4

  • Split the patches for different targets.

v2: ARM: bcm: Support BCMBCA debug UART

The debug UART on the BCMBCA SoCs is in a different place than on the other BCM platforms. Support this with a static map when debugging is explicitly configured.

X86 Architecture Support

v6: platform/x86: introduce asus-armoury driver

The idea for this originates from a conversation with Mario Limonciello https://lore.kernel.org/platform-driver-x86/371d4109-a3bb-4c3b-802f-4ec27a945c99@amd.com/

GIT PULL: locking changes for v6.12

v1: x86/apic: Stop the TSC Deadline timer during lapic timer shutdown

This stops the local APIC timer for one-shot and periodic mode only. In TSC deadline mode, the timer is not properly stopped.

v1: x86/ibt: FineIBT-BHI

The thing I picked was FineIBT-BHI, an alternative mitigation for the native-BHI issue, something that I implemented somewhere late last year while the whole thing was still embargoed.

v1: Handle MMIO during event delivery error on SVM

This patch series eliminates this difference by returning a KVM internal error with suberror = KVM_INTERNAL_ERROR_DELIVERY_EV when guest is performing MMIO during event delivery, for both VMX and SVM.

v3: platform/x86/tuxedo: Add virtual LampArray for TUXEDO NB04 devices

The TUXEDO Sirius 16 Gen1 and TUXEDO Sirius 16 Gen2 devices have a per-key controllable RGB keyboard backlight. The firmware API for it is implemented via WMI.

v1: “custom” ACPI platform profile support

There are two major ways to tune platform performance in Linux:

  • ACPI platform profile
  • Manually tuning APU performance

v7: mm: multi-gen LRU: Walk secondary MMU page tables while aging

This patchset makes it possible for MGLRU to consult secondary MMUs while doing aging, not just during eviction. This allows for more accurate reclaim decisions, which is especially important for proactive reclaim.

v1: x86: Rely on toolchain for relocatable code

The x86_64 port has a number of historical quirks that result in a reliance on toolchain features that are either poorly specified or basically implementation details of the toolchain.

Process Scheduling

v1: sched: Complete Renaming of scheduler_tick() to sched_tick()

scheduler_tick() was already renamed to sched_tick(), but this was missed. The previous commit record can be found at https://lore.kernel.org/all/Zer1o5bhkiq1cxaj@gmail.com/

Memory Management

v1: Introduce ptr_eq() to preserve address dependency

Introduce ptr_eq() to compare two addresses while preserving the address dependencies for later use of the address. It should be used when comparing an address returned by rcu_dereference().

v8: mm: zswap swap-out of large folios

This patch-series enables zswap_store() to accept and store large folios. The most significant contribution in this series is from the earlier RFC submitted by Ryan Roberts [1].

v1: compiler.h: Introduce ptr_eq() to preserve address dependency

Compiler CSE and SSA GVN optimizations can cause the address dependency of addresses returned by rcu_dereference to be lost when comparing those pointers with either constants or previously loaded pointers.

[RFC/PATCH bpf-next 0/3] bpf: Add kmem_cache iterator and kfunc (v2)

I’m proposing a new iterator and a kfunc for the slab memory allocator to get information of each kmem_cache like in /proc/slabinfo or /sys/kernel/slab in a more flexible way.

v1: implement lightweight guard pages

This series takes a different approach - an idea suggested by Vlastimil Babka (and before him David Hildenbrand and Jann Horn - perhaps more - the provenance becomes a little tricky to ascertain after this - please forgive any omissions!) - rather than locating the guard pages at the VMA layer, instead placing them in page tables mapping the required ranges.

v1: zswap: improve memory.zswap.writeback inheritance

Improve the inheritance behavior of the memory.zswap.writeback cgroup attribute introduced during the 6.11 cycle.

v1: mm/huge_memory: check pmd_special() only after pmd_present()

This fixes confusing migration entries as PFN mappings, and not doing what we are supposed to do in the “is_swap_pmd()” case further down in the function – including messing up COW, page table handling and accounting.

v3: mm/madvise: unrestrict process_madvise() for current process

The process_madvise() call was introduced in commit ecb8ac8b1f14 (“mm/madvise: introduce process_madvise() syscall: an external memory hinting API”) as a means of performing madvise() operations on another process.

v2: Support large folios for tmpfs

This RFC patch series attempts to support large folios for tmpfs.

[RFC/PATCH bpf-next 0/3] bpf: Add slab iterator and kfunc (v1)

I’m proposing a new iterator and a kfunc for the slab memory allocator to get information of each kmem_cache like in /proc/slabinfo or /sys/kernel/slab.

v2: Introduce panic function when slub leaks

A method to detect slub leaks by monitoring its usage in real time on the page allocation path of the slub.

v1: memblock: Initialized the memory of memblock.reserve to the MIGRATE_MOVABL

After sparse_init function requests memory for struct page in memblock and adds it to memblock.reserved, this memory area is present in both memblock.memory and memblock.reserved.

v1: mm: migrate LRU_REFS_MASK bits in folio_migrate_flags

The LRU_REFS_MASK bits are not inherited during migration, which causes new_folio to start from tier 0. Fix this by migrating those bits as well.
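
Carrying a bit field from one flags word to another can be sketched as follows. This is a generic masking illustration under stated assumptions, not the kernel's folio_migrate_flags(): the shift and width of `LRU_REFS_MASK` below are made-up values, and `carry_lru_refs` is a hypothetical name.

```python
# Hypothetical bit layout, for illustration only; the real position and
# width of LRU_REFS_MASK depend on the kernel configuration.
LRU_REFS_SHIFT = 8
LRU_REFS_WIDTH = 2
LRU_REFS_MASK = ((1 << LRU_REFS_WIDTH) - 1) << LRU_REFS_SHIFT


def carry_lru_refs(old_flags: int, new_flags: int) -> int:
    """Copy the LRU_REFS_MASK field from old_flags into new_flags.

    All other bits of new_flags are left unchanged.
    """
    cleared = new_flags & ~LRU_REFS_MASK      # drop any stale field value
    inherited = old_flags & LRU_REFS_MASK     # field taken from the old word
    return cleared | inherited


# The field (0x300 here) moves across; the unrelated bit 0x004 is preserved.
print(hex(carry_lru_refs(0x301, 0x004)))
```

Without the copy step the destination would keep a zero field, which corresponds to the "start from tier 0" behaviour described above.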

v1: dcssblk: Mark DAX broken

The dcssblk driver has long needed special case support to enable limited dax operation, so called CONFIG_FS_DAX_LIMITED.

v3: mm: Make SPLIT_PTE_PTLOCKS depend on SMP

This in turn causes the m68k “q800” and “virt” machines to crash in qemu if debugging options are enabled.

v1: exec: add a flag for “reasonable” execveat() comm

This patch adds an AT_ flag to fix up /proc/pid/comm so that it holds the contents of argv[0] instead of the fdno.

v2: unrestrict process_madvise() for current process

This patch series eliminates both limitations. This series also introduces a series of self-tests for this feature asserting that the flag functions as expected.

v1: mm/memory_hotplug: Print the correct pfn in do_migrate_range()

The pfn value needs to be retrieved correctly when PageTransHuge(page) is true. Fix it by replacing the usage of ‘pfn’ with ‘page_to_pfn(page)’ to ensure the correct pfn is printed in warning messages when isolation fails.

v1: mm: do not export const kfree and kstrdup variants

Both kfree_const() and kstrdup_const() use __start_rodata and __end_rodata, which do not work for modules. This is especially important for kfree_const().

v1: Userspace Can Control Memory Failure Recovery

Recently there is an enforcement on the userspace control over how kernel handles memory with corrected memory errors [1]. This RFC wants to extend userspace’s control to how the kernel deals with uncorrectable memory errors, so userspace can now control all aspects of memory failure recovery (MFR).

File Systems

v1: add group restriction bitmap

This patch adds the group restriction bitmap.

v2: vfs: Add a sysctl for automated deletion of dentry

This patch seeks to reintroduce the concept conditionally, where the associated dentry is deleted only when the user explicitly opts for it during file removal.

v3: fuse: folio conversions

v1: fanotify: allow reporting errors on failure to open fd

When working in “fd mode”, fanotify_read() needs to open an fd from a dentry to report event->fd to userspace.

v1: netfs: Advance iterator correctly rather than jumping it

This becomes more problematic when we use a bounce buffer made out of single-page folios to cover a multipage pagecache folio.

v1: pidfs: check for valid pid namespace

The user namespace is fine because it is only released when the last reference to struct task_struct is put and exit_creds() is called.

v1: rust: add PidNamespace wrapper

Ok, so here’s my feeble attempt at getting something going for wrapping struct pid_namespace as struct pid_namespace indirectly came up in the file abstraction thread.

v1: Miscdevices in Rust

A misc device is generally the best place to start with your first Rust driver, so having abstractions for miscdevice in Rust will be important for our ability to teach Rust to kernel developers.

v1: sysctl: Reduce dput(child) calls in proc_sys_fill_cache()

v2: fs: ext4: support relative path for journal_path in mount option.

fs_lookup_param did not consider relative paths for block devices. When ext4 is mounted with the journal_path option using a relative path, param->dirfd was not set, which causes a mount error.

v1: add block size > page size support to ramfs

Add block size > page size to ramfs as we support minimum folio order allocation in the page cache.

GIT PULL: BPF struct_fd changes for 6.12

This pull includes struct_fd BPF changes from Al and Andrii.

v6: per-io hints and FDP

Another spin to incorporate the feedback from LPC and previous iteration. The series adds two capabilities:

  • FDP support at NVMe level (patch #1)
  • Per-io hinting via io_uring (patch #3)

GIT PULL: sysctl changes for v6.12-rc1

  • Bug fix: Avoid evaluating non-mount ctl_tables as a sysctl_mount_point by removing the unlikely (but possible) chance that the permanently empty ctl_table array shares its address with another ctl_table.
  • Update Joel Granados’ contact info in MAINTAINERS.

v1: xarray: rename xa_lock/xa_unlock to xa_enter/xa_leave

Functions such as __xa_store() may temporarily unlock the internal spinlock if allocation is necessary.

git pull: struct fd layout changes

Just the layout change and conversion to accessors (invariable branch in vfs.git#stable-struct_fd).

v1: blk: optimization for classic polling

This removes the dependency on interrupts to wake up the task. Set the task state to TASK_RUNNING if need_resched() returns true while polling for IO completion.

Network Devices

v2: vhost/vsock: specify module version

Add an explicit MODULE_VERSION("0.0.1") specification for the vhost_vsock module.

v1: bpf: Prevent infinite loops with bpf_redirect_peer

It is possible to create cycles using bpf_redirect_peer which lead to an infinite loop inside __netif_receive_skb_core.
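
To see why a redirect cycle never terminates, and how a hop limit bounds it, consider the sketch below. It is a generic illustration only: the summary above does not say how the patch prevents the loop, and `follow_redirects`, the device names and `max_hops` are all hypothetical.

```python
def follow_redirects(start, peer_redirect, max_hops=4):
    """Follow device-to-device redirects until one has no further redirect.

    peer_redirect maps a device name to the device it redirects to.
    Returns the final device, or None if more than max_hops hops would be
    needed (which is always the case for a cycle).
    """
    dev = start
    hops = 0
    while dev in peer_redirect:
        if hops == max_hops:
            return None          # give up instead of looping forever
        dev = peer_redirect[dev]
        hops += 1
    return dev


# A finite chain ends normally; a two-device cycle hits the hop limit.
print(follow_redirects("veth0", {"veth0": "veth1", "veth1": "veth2"}))
print(follow_redirects("veth0", {"veth0": "veth1", "veth1": "veth0"}))
```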

v1: net: bridge: mcast: Fail MDB get request on empty entry

When user space deletes a port from an MDB entry, the port is removed synchronously. If this was the last port in the entry and the entry is not joined by the host itself, then the entry is scheduled for deletion via a timer.

v1: arcnet: com20020-pci: Add check devm_kasprintf() returned value

devm_kasprintf() can return a NULL pointer on failure but this returned value in com20020pci_probe() is not checked.

v6: PCIe TPH and cache direct injection support

This series introduces generic TPH support in Linux, allowing STs to be retrieved and used by PCIe endpoint drivers as needed.

v2: iwl-net: ice: Flush FDB entries before reset

Triggering the reset while in switchdev mode causes errors[1]. Rules are already removed by this time because switch content is flushed in case of the reset.

v1: net: phy: realtek: Check the index value in led_hw_control_get

Just like rtl8211f_led_hw_is_supported() and rtl8211f_led_hw_control_set(), rtl8211f_led_hw_control_get() also needs to check the index value, otherwise the caller is likely to get incorrect rules.

v1: net: ppp: do not assume bh is held in ppp_channel_bridge_input()

Networking receive path is usually handled from BH handler.

v1: net: retain NOCARRIER on protodown interfaces

Make interfaces with protodown enabled retain the NOCARRIER state during transfer of operstate from the lower device.

v2: net-next: net/smc: Introduce a hook to modify syn_smc at runtime

The introduction of IPPROTO_SMC enables eBPF programs to determine whether to use SMC based on the context of socket creation, such as network namespaces, PID and comm name, etc.

v5: net: systemport: Add error pointer checks in bcm_sysport_map_queues() and bcm_sysport_unmap_queues()

Add error pointer checks in bcm_sysport_map_queues() and bcm_sysport_unmap_queues() after calling dsa_port_from_netdev().

GIT PULL: Networking for v6.12-rc1

It looks like most people are still traveling: both the ML volume and the processing capacity are low.

v1: net-next: gve: Link IRQs, queues, and NAPI instances

This RFC uses the netdev-genl API to link IRQs and queues to NAPI IDs so that this information is queryable by user apps.

v1: net-next: idpf: Don’t hardcode napi_struct size

I’m submitting this as an RFC so the Intel folks have time to take a look and request changes, but I plan to submit this next week when net-next reopens.

v2: net-next: e1000/e1000e: Link IRQs, NAPIs, and queues

This RFC v2 follows from an RFC submission I sent [1] for e1000e. The original RFC added netdev-genl support for e1000e, but this new RFC includes a patch to add support for e1000, as well.

v2: net-next: tg3: Link IRQs, NAPIs, and queues

This RFC v2 follows from a PATCH submission which received some feedback from broadcom on shortening the patch.

v2: net/ncsi: Disable the ncsi work before freeing the associated structure

The work function can run after the ncsi device is freed, resulting in use-after-free bugs or kernel panic.

v1: r8169: Potential divizion by zero in rtl_set_coalesce()

Variable ‘scale’, whose possible value set allows a zero value in a check at r8169_main.c:2014, is used as a denominator at r8169_main.c:2040 and r8169_main.c:2042.

Security Hardening

v1: coredump: Do not lock during ‘comm’ reporting

The ‘comm’ member will always be NUL terminated, and this is not fast-path, so we can just perform a direct memcpy during a coredump instead of potentially deadlocking while holding the task struct lock.

v1: hardening: Adjust dependencies in selection of MODVERSIONS

MODVERSIONS recently grew a dependency on !COMPILE_TEST so that Rust could be more easily tested. Add the !COMPILE_TEST dependency to the selections to clear up the warning.

Asynchronous I/O

v2: RESEND: io_uring/fdinfo: add timeout_list to fdinfo

io_uring fdinfo contains most of the runtime information, which is helpful for debugging io_uring applications.

Rust For Linux

v3: rust: add trylock method support for lock backend

Add a non-blocking trylock method to lock backend interface, mutex and spinlock implementations. It includes a C helper for spin_trylock.

v1: rust: kernel: sort Rust modules

Rust modules are intended to be sorted, thus do so. This makes rustfmtcheck pass again.

v1: rust: KASAN+RETHUNK requires rustc 1.83.0

This is caused by the -Zfunction-return=thunk-extern flag in rustc not properly informing LLVM about the mitigation, which means that the KASAN functions asan.module_ctor and asan.module_dtor are generated without the rethunk mitigation.

v5: Extended MODVERSIONS Support

This patch series is intended for use alongside the Implement MODVERSIONS for RUST [1] series as a replacement for the symbol name hashing approach used there to enable RUST and MODVERSIONS at the same time.

v2: Untrusted Data Abstraction

Enable marking certain data as untrusted. For example data coming from userspace, hardware or any other external data source.

GIT PULL: Rust for 6.12

This is the next round of the Rust support.

v3: Implement DWARF modversions

Here’s v3 of the DWARF modversions series [1][2]. The main motivation remains modversions support for Rust, which is important for distributions like Android that are eager to ship Rust kernel modules.

BPF

[RFC/PATCH bpf-next 0/3] bpf: Add kmem_cache iterator and kfunc (v2)

I’m proposing a new iterator and a kfunc for the slab memory allocator to get information of each kmem_cache like in /proc/slabinfo or /sys/kernel/slab in a more flexible way.

v1: cpufreq_ext: Introduce cpufreq ext governor

I am currently working on a patch for a CPU frequency governor based on BPF, which can use BPF to customize and implement various frequency scaling strategies.

v2: uprobes: Improve the usage of xol slots for better scalability

The uprobe handler allocates an xol slot from xol_area and quickly releases it in the single-step handler.

v2: bpf-next: Implement mechanism to signal other threads

This set implements a kfunc called bpf_send_signal_remote() that is similar to sigqueue() as it can send a signal along with a cookie to a thread or thread group.

v1: bpf: Call kfree(obj) only once in free_one()

v2: bpf-next: bpf: Add kfuncs for read-only string operations

The kernel contains highly optimised implementations of traditional string operations. Expose them as kfuncs to allow BPF programs to leverage the kernel implementation instead of needing to reimplement the operations.

v2: Add BPF Kernel Function bpf_ptrace_vprintk

Add a kfunc ‘bpf_ptrace_vprintk’ that prints BPF messages in the format trace_marker requires, so that these messages can be retrieved by Android Perfetto by default and are well represented in the Perfetto UI.

GIT PULL: BPF struct_fd changes for 6.12

This pull includes struct_fd BPF changes from Al and Andrii.

v3: bpf-next: Support eliding map lookup nullness

This patch allows progs to elide a null check on statically known map lookup keys. In other words, if the verifier can statically prove that the lookup will be in-bounds, allow the prog to drop the null check.
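
The condition described above can be sketched as a simple predicate. This is an illustration and not verifier code: it assumes an array-style map whose valid keys are the indices 0 to max_entries - 1, and the name `can_elide_null_check` and its parameters are hypothetical.

```python
def can_elide_null_check(key_is_constant: bool, key: int, max_entries: int) -> bool:
    """Return True if a lookup provably cannot miss.

    Sketch under an assumption: for an array-style map, a lookup with a
    statically known key inside [0, max_entries) always finds an element,
    so the result can never be NULL.
    """
    return key_is_constant and 0 <= key < max_entries


# In-bounds constant key: the null check is provably unnecessary.
print(can_elide_null_check(True, 0, 4))
# Out-of-bounds key, or a key only known at run time: keep the check.
print(can_elide_null_check(True, 4, 4))
print(can_elide_null_check(False, 1, 4))
```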

v1: net-next: virtio-net: support AF_XDP zero copy (tx)

Because the merge window is closed, this is an RFC.

Related Technology News

Qemu

v1: hw/riscv/virt: Comment absence of #msi-cells

Commit 6df664f87c73 (“Revert “hw/riscv/virt.c: imsics DT: add ‘#msi-cells’””) removed #msi-cells. Now that we have a Linux commit to reference, add a comment explaining why it was removed, to avoid it getting added back due to DT validation failures.

v2: target/riscv: Add support for Smdbltrp and Ssdbltrp extensions

This series adds support for Ssdbltrp and Smdbltrp ratified ISA extensions [1]. It is based on the Smrnmi series [6].

v2: riscv-to-apply queue

The following changes since commit 01dc65a3bc262ab1bec8fe89775e9bbfa627becb: Merge tag ‘pull-target-arm-20240919’ of https://git.linaro.org/people/pmaydell/qemu-arm into staging (2024-09-19 14:15:15 +0100)

v1: target/riscv/kvm: add riscv-aia bool props

This series adds 3 new riscv-aia bool options for the KVM accel driver, each one representing the possible values (emul, hwaccel and auto). We’re also deprecating the existing ‘riscv-aia’ string option.

U-Boot

v6: efi: Add a test for EFI bootmeth

This series creates a simple test for this purpose. It includes a few patches to make this work.


