泰晓科技 (TinyLab): focused on Linux, tracing things back to their source and seeing the whole from the smallest detail!
Website: https://tinylab.org

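RISC-V Linux Kernel and Related Technology News, Issue 112

Written by 呀呀呀 on 2024/10/14

Date: 20241006
Editor: 晓瑜
Repository: RISC-V Linux 内核技术调研活动 (RISC-V Linux Kernel Technology Research Activity)
Sponsor: PLCT Lab, ISCAS

Kernel News

RISC-V Architecture Support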

v1: pinctrl: th1520: Improve code quality

Two code quality improvements for the new TH1520 pinctrl driver [1].

v5: Enable serial NOR flash on RZ/G2UL SMARC EVK

This patch series aims to enable serial NOR flash on RZ/G2UL SMARC EVK.

v1: riscv: insn: add RV_EXTRACT_FUNCT3()

Add a helper that extracts the funct3 field, which most instruction formats carry, for use by any code that needs it.
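
A rough model of what such a helper computes, assuming the standard 32-bit instruction formats (R, I, S and B), where funct3 sits in bits 14:12. The Python function name is hypothetical; the kernel's RV_EXTRACT_FUNCT3() is a C macro whose definition is not reproduced in this summary.

```python
# Hypothetical model of funct3 extraction, not the kernel macro itself.
FUNCT3_SHIFT = 12   # funct3 occupies bits 14:12 in the 32-bit R/I/S/B formats
FUNCT3_MASK = 0x7   # the field is three bits wide


def extract_funct3(insn: int) -> int:
    """Return the funct3 field of a 32-bit RISC-V instruction word."""
    return (insn >> FUNCT3_SHIFT) & FUNCT3_MASK


# `lw a0, 0(a1)` encodes as 0x0005a503; word-sized loads use funct3 = 0b010.
LW_A0_A1 = 0x0005A503
print(extract_funct3(LW_A0_A1))  # prints 2
```

Compressed (16-bit) instructions place their funct3 field elsewhere, so this sketch only covers 32-bit encodings.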

v1: riscv: interrupt-controller: Add T-HEAD C900 ACLINT SSWI

Add full support for T-HEAD C900 SSWI device.

v1: mmc: sdhci: Prevent stale command and data interrupt handling

While working with the T-Head 1520 LicheePi4A SoC, certain conditions arose that allowed me to reproduce a race issue in the sdhci code.

v2: irqchip/sifive-plic: Unmask interrupt in plic_irq_enable()

It is possible that an interrupt is disabled and masked at the same time.

v1: Add some validation for vector, vector crypto and fp stuff

Kinda RFC as I want to see what people think of this.

v1: Redo PolarFire SoC’s mailbox/clock devicestrees and related code

Here’s something that I’ve been mulling over for a while, since I started to understand how devicetree stuff was “meant” to be done.

v1: riscv control-flow integrity for usermode

v5 of CPU-assisted RISC-V user-mode control-flow integrity. Zicfiss and Zicfilp [1] are ratified RISC-V CPU extensions.

v9: Tracepoints and static branch in Rust

An important part of a production ready Linux kernel driver is tracepoints. So to write production ready Linux kernel drivers in Rust, we must be able to call tracepoints from Rust code. This patch series adds support for calling tracepoints declared in C from Rust.

v1: RISC-V: disallow gcc + rust builds

During the discussion before supporting rust on riscv, it was decided not to support gcc yet, due to differences in extension handling compared to llvm (only the version of libclang matching the c compiler is supported).

v3: Add the dwmac driver support for T-HEAD TH1520 SoC

This series is based on 6.12-rc1 and depends on this pinctrl series.

v3: pinctrl: Add T-Head TH1520 SoC pin controllers

This adds a pin control driver created by Emil for the T-Head TH1520 RISC-V SoC used on the Lichee Pi 4A and BeagleV Ahead boards and updates the device trees to make use of it.

v1: riscv: traps: make insn fetch common in unknown instruction

Add the insn as the second argument to riscv_v_first_use_handler() from the trap handler, so that when we add more handlers we can fetch the instruction just once.

v10: riscv: Add support for xtheadvector

The devicetree name shown in OpenSBI is the one packed with U-Boot SPL.

v1: i2c: microchip-core: actually use repeated sends

At present, where repeated sends are intended to be used, the i2c-microchip-core driver sends a stop followed by a start.

v1: riscv: mm: check the SV39 rule

SV39 rule: bits [63..39] of an address must all equal bit [38]. This is easy to violate if PAGE_OFFSET is configured too small.
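
The rule can be sketched as a small check. This is an illustrative Python model rather than the patch's C code, and the function name is made up for the example.

```python
# Sketch of the Sv39 canonical-address rule; names are illustrative only.
VA_BITS = 39
MASK64 = (1 << 64) - 1


def is_sv39_canonical(addr: int) -> bool:
    """True if bits 63..39 of a 64-bit address all equal bit 38."""
    addr &= MASK64
    upper = addr >> (VA_BITS - 1)              # bits 63..38, 26 bits in total
    all_ones = (1 << (64 - VA_BITS + 1)) - 1   # 26 one-bits
    return upper == 0 or upper == all_ones


print(is_sv39_canonical(0xFFFFFFC000000000))  # True: bits 63..38 are all set
print(is_sv39_canonical(0xFFFFFF8000000000))  # False: bit 39 set, bit 38 clear
```

Under this rule the lowest valid upper-half address is 0xFFFFFFC000000000, so a PAGE_OFFSET below that value fails the check.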

LoongArch Architecture Support

v7: Consolidate IO memcpy functions

I have also added the full history of the patchset, because it now targets additional architectures.

v2: LoongArch: Set correct size for VDSO code mapping

The current size of VDSO code mapping is hardcoded to PAGE_SIZE.

ARM Architecture Support

v1: kselftest/arm64: Validate that GCS push and write permissions work

Add trivial assembly programs which give themselves the appropriate permissions and then execute GCSPUSHM and GCSSTR; they report errors by generating signals on the non-permitted instructions.

v1: ARM/dma-mapping: Disambiguate ops from iommu_ops in IOMMU core

The architecture dma ops collides with the struct iommu_ops {} defined in /include/linux/iommu.h. This isn’t a major issue, just a nagging annoyance.

v6: Add NSS clock controller support for IPQ9574

Add bindings, driver and devicetree node for networking sub system clock controller on IPQ9574.

v1: arm64: Subscribe Microsoft Azure Cobalt 100 to erratum 3194386

Add the Microsoft Azure Cobalt 100 CPU to the list of CPUs suffering from erratum 3194386 added in commit 75b3c43eab59 (“arm64: errata: Expand speculative SSBS workaround”).

v1: Add support for Zyxel EX3510-B

This pair of patches adds an initial DT for the Zyxel EX3510-B “series” based on BCM4906, encompassing the EX3510-B0 and EX3510-B1.

v3: perf arm-spe: Refactor data source encoding

This patch series is dependent on the metadata version 2 series [1] for retrieving CPU MIDR.

v4: perf arm-spe: Introduce metadata version 2

This patch series enhances Arm SPE metadata in the Perf file to a version 2 format and maintains backward compatibility for metadata v1.

v12: TI K3 M4F support on AM62 and AM64 SoCs

M4F driver[0] and DT bindings[1] are in, so last step is adding the nodes to the base AM64/AM62 DTSI files, plus a couple of our SK boards.

v1: of/kexec: save pa of initial_boot_params for arm64 and use it at kexec

__pa() is only intended to be used for linear map addresses and using it for initial_boot_params which is in fixmap for arm64 will give an incorrect value.

v7: Add support for the LAN966x PCI device using a DT overlay

This series adds support for the LAN966x chip when used as a PCI device.

v3: Add I2C mux devices for yosemite4

v1: irqchip/gic-v4: Don’t allow a VMOVP on a dying VPE

Kunkun Jiang reports that there is a small window of opportunity for userspace to force a change of affinity for a VPE while the VPE has already been unmapped, but the corresponding doorbell interrupt is still visible in /proc/irq/.

v1: perf/arm_pmuv3: Add PMUv3.9 per counter EL0 access control

KVM also configures PMUSERENR_EL1 in order to trap to EL2. UEN does not need to be set for it since only PMUv3.5 is exposed to guests.

X86 Architecture Support

v11: RFT: fork: Support shadow stacks in clone3()

The kernel has recently added support for shadow stacks, currently on x86 only, using its CET feature. Both arm64 and RISC-V have equivalent features (GCS and Zicfiss respectively); I am actively working on GCS [1].

v2: KVM: x86/mmu: Repurpose MMU shrinker into page cache shrinker

This series is extracted out from the NUMA aware page table series[1]. MMU shrinker changes were in patches 1 to 9 in the old series.

v1: seal system mappings

Seal vdso, vvar, sigpage, uprobes and vsyscall. Those mappings are read-only or execute-only; sealing can protect them from ever changing during the lifetime of the process.

v1: KVM: x86: Introduce new ioctl KVM_HYPERV_SET_TLB_FLUSH_INHIBIT

This series introduces a new ioctl KVM_HYPERV_SET_TLB_FLUSH_INHIBIT.

v1: media: cec: seco: add HAS_IOPORT dependency

Add a Kconfig dependency again.

v1: objtool: Detect non-relocated text references

When kernel IBT is enabled, objtool detects all text references in order to determine which functions can be indirectly branched to.

v1: futex: Improve get_inode_sequence_number()

Rewrite FOR loop to a DO-WHILE loop where returns are moved out of the loop. Use atomic64_inc_return() instead of atomic64_add_return().

Process Scheduling

v2: sched+mm: Track lazy active mm existence with hazard pointers

Hazard pointers appear to be a good fit for replacing refcount based lazy active mm tracking.

v1: sched/fair: optimize the PLACE_LAG when se->vlag is zero

So if se->vlag is zero, there is no need to waste cycles to do the calculation.

v2: sched: Improve cache locality of RSEQ concurrency IDs

This series improves the cache locality of RSEQ concurrency IDs. I observed speedups of up to 16.7x compared to plain mm_cid indexing in benchmarks.

Memory Management

v1: mm, kmsan: instrument copy_from_kernel_nofault

syzbot reported that bpf_probe_read_kernel() kernel helper triggered KASAN report via kasan_check_range() which is not the expected behaviour as copy_from_kernel_nofault() is meant to be a non-faulting helper.

v9: Generic Allocator support for Rust

This patch series adds generic kernel allocator support for Rust, which so far is limited to kmalloc allocations.

v1: preempt_rt: increase PERCPU_DYNAMIC_SIZE_SHIFT for slab randomization

The problem is the additional size overhead from local_lock in struct kmem_cache_cpu. Avoid this by preallocating a larger area.

v3: vdso: Use only headers from the vdso/ namespace

The recent implementation of getrandom in the generic vdso library includes headers from outside of the vdso/ namespace.

v5: tmpfs: Add case-insensitive support for tmpfs

This patchset adds support for case-insensitive file name lookups in tmpfs.

v3: mm: swap: Make some count_mthp_stat() call-sites be THP-agnostic.

This patch propagates the benefits of the above change to page_io.c and vmscan.c.

v1: mm/truncate: reset xa_has_values flag on each iteration

Currently mapping_try_invalidate() and invalidate_inode_pages2_range() traverse the xarray in batches and, for each batch, set a flag named xa_has_values if the batch has a shadow entry, so that the entries can be cleared at the end of the iteration.

v10: timekeeping/fs: multigrain timestamp redux

This is a replacement for the v6 series sitting in Christian’s vfs.mgtime branch.

v9: fs: multigrain timestamp redux

This is a replacement for the v6 series sitting in Christian’s vfs.mgtime branch.

v4: bpf-next: bpf: Add kmem_cache iterator and kfunc

I’m proposing a new iterator and a kfunc for the slab memory allocator to get information about each kmem_cache, like in /proc/slabinfo or /sys/kernel/slab, in a more flexible way.

v1: mm: zswap: zswap_store_page() will initialize entry after adding to xarray.

This incorporates Yosry’s suggestions in [1] for further simplifying zswap_store_page().

v1: KSTATE: a mechanism to migrate some part of the kernel state across kexec

This is a very early RFC with a lot of hacks and cut corners with the purpose to demonstrate the concept itself.

v13: arm64/gcs: Provide support for GCS in userspace

The arm64 Guarded Control Stack (GCS) feature provides support for hardware protected stacks of return addresses, intended to provide hardening against return oriented programming (ROP) attacks and to make it easier to gather call stacks for applications such as profiling.

v2: tip/perf/core: uprobes,mm: speculative lockless VMA-to-uprobe lookup

Implement speculative (lockless) resolution of VMA to inode to uprobe, bypassing the need to take mmap_lock for reads, if possible.

v3: SLUB: Add support for per object memory policies

The old SLAB allocator used to support memory policies on a per-allocation basis. In SLUB the memory policies are applied on a per page frame / folio basis. Doing so avoids having to check memory policies in critical code paths for kmalloc and friends.

File Systems

v1: fs: port files to rcuref_long_t

As atomic_inc_not_zero() is implemented with a try_cmpxchg() loop it has O(N^2) behaviour under contention with N concurrent operations. The rcuref infrastructure uses atomic_add_negative_relaxed() for the fast path, which scales better under contention and we get overflow protection for free.

v1: UFS: Final folio conversions

This is the last use of struct page I’ve been able to find in UFS.

v1: btrfs reads through iomap

These patches incorporate btrfs buffered reads using iomap code. The final goal here is to give all folio handling to iomap.

v1: netfs: In readahead, put the folio refs as soon extracted

netfslib currently defers dropping the ref on the folios it obtains during readahead to after it has started I/O on the basis that we can do it whilst we wait for the I/O to complete, but this runs the risk of the I/O collection racing with this in future.

v1: Stash overlay real upper file in backing_file

Al Viro posted a proposal to cleanup overlayfs handling of temporary cloned real file references.

v7: block atomic writes for xfs

This series expands atomic write support to filesystems, specifically XFS.

v1: [PATCHES] struct fderr

There we want not “file reference or nothing” - it’s “file reference or an error”.

v2: fanotify: allow reporting errors on failure to open fd

When working in “fd mode”, fanotify_read() needs to open an fd from a dentry to report event->fd to userspace.

v1: nilfs2: Finish folio conversion

After “nilfs2: Convert nilfs_copy_buffer() to use folios”, there are only a few remaining users of struct page in all of nilfs2, and they’re

v5: rust: add PidNamespace

The lifetime of PidNamespace is bound to Task and struct pid.

Networking Devices

v1: net: sfp: change quirks for Alcatel Lucent G-010S-P

It seems the Alcatel Lucent G-010S-P has the same problem: it uses the TX_FAULT pin for the SoC UART. So apply sfp_fixup_ignore_tx_fault to it.

v2: net-next: tg3: Link IRQs, NAPIs, and queues

This RFC v3 follows from a previous RFC [1] submission which I noticed had an issue in patch 2.

v2: net-next: rust: Add IO polling

Add a Rust version of read_poll_timeout (include/linux/iopoll.h), which polls periodically until a condition is met or a timeout is reached. Using this function, the 6th patch fixes the QT2025 PHY driver to sleep until the hardware becomes ready.
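
The polling pattern itself is easy to sketch. The following is a hypothetical Python model of "read, test, sleep, give up after a timeout"; it is neither the kernel macro nor the Rust abstraction from the series, and its signature is an assumption made for illustration.

```python
import time
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def read_poll_timeout(op: Callable[[], T], cond: Callable[[T], bool],
                      sleep_s: float, timeout_s: float) -> Tuple[bool, T]:
    """Call op() repeatedly until cond(value) holds or timeout_s elapses.

    Returns (ok, last_value). Hypothetical model, not the kernel API.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        val = op()
        if cond(val):
            return True, val
        if time.monotonic() >= deadline:
            # Read once more after the deadline so that a condition which
            # became true during the last sleep is not reported as a timeout.
            val = op()
            return cond(val), val
        time.sleep(sleep_s)


# Simulated status register that reports "ready" (1) on the third read.
status_reads = iter([0, 0, 1])
ok, value = read_poll_timeout(lambda: next(status_reads),
                              lambda v: v == 1,
                              sleep_s=0.001, timeout_s=1.0)
print(ok, value)
```

Sleeping between reads, rather than spinning, is the design choice that lets a driver wait for slow hardware without burning CPU time.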

v2: net: rtnetlink: Handle error of rtnl_register_module().

While converting phonet to per-netns RTNL, I found a weird comment

v3: net-next: rtnetlink: Per-netns RTNL.

rtnl_lock() is a “Big Kernel Lock” in the networking slow path and serialised all rtnetlink requests until 4.13.

v1: net: phy: disable eee due to errata on various KSZ switches

The well-known errata regarding EEE not being functional on various KSZ switches has been refactored a few times. Recently the refactoring has excluded several switches that the errata should also apply to.

v3: net-next: eth: fbnic: Add hardware monitoring support via HWMON interface

This patch adds support for hardware monitoring to the fbnic driver, allowing for temperature and voltage sensor data to be exposed to userspace via the HWMON interface.

v2: net-next: ipv4: Namespacify IPv4 address hash table.

This is a prep of per-net RTNL conversion for RTM_(NEW|DEL|SET)ADDR.

v1: net-next: tcp: add skb->sk to more control packets

Currently, TCP can set skb->sk for a variety of transmit packets.

[net-next PATCH v2] net: phy: Validate PHY LED OPs presence before registering

Validate PHY LED OPs presence before registering and parsing them. Defining LED nodes for a PHY driver that actually doesn’t support them is redundant and useless.

v1: net-next: vmxnet3: support higher link speeds from vmxnet3 v9

This patch adds support for vmxnet3 to report higher link speeds and converts them to Mbps as expected by the Linux stack.

v2: net-next: wireguard: Wire-up big tcp support

Advertise GSO_MAX_SIZE as TSO max size in order to support BIG TCP for wireguard.

v1: net-next: net: Optimize IPv6 path in ip_neigh_for_gw()

This optimization aligns with the trend of IPv6 becoming the default IP version in many deployments, and should benefit modern network configurations.

v2: net-next: Allow isolating PHY devices

This is the V2 of a series to add isolation support for PHY devices.

v1: net-next: net: phy: mxl-gpy: add missing support for TRIGGER_NETDEV_LINK_10

The PHY also supports 10MBit/s links, as well as offloading the corresponding link indication trigger. Add TRIGGER_NETDEV_LINK_10 to the supported triggers.

v1: net-next: net: phy: realtek: make sure paged read is protected by mutex

As we cannot rely on the phy_read_paged function before the PHY is identified, the paged read in rtlgen_supports_2_5gbps needs to be open coded, because it is called by the match_phy_device function, i.e. before .read_page and .write_page have been populated.

v1: net-next: net: phy: realtek: check validity of 10GbE link-partner advertisement

This prevents misinterpreting the stale 2500M link-partner advertisement bit in case a subsequent link partner doesn’t do any NBase-T advertisement at all.

v1: net-next: net: phy: always set polarity_modes if op is supported

One way to achieve this without introducing an additional ‘active-high’ property would be to always call the led_polarity_set operation if it is supported by the phy driver.

v1: net-next: net: skip offload for NETIF_F_IPV6_CSUM if ipv6 header contains extension

This fixes checksumming errors seen with ip6_tunnel and fou6 encapsulation, for example with GRE-in-UDP over IPv6:

  • fou6 adds a UDP header with a partial checksum if the inner packet does not contain a valid checksum.
  • ip6_tunnel adds an IPv6 header with a destination option extension header if encap_limit is non-zero (the default value is 4).
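
To make the failing case concrete: an IPv6 packet carries an extension header when the Next Header byte of its fixed 40-byte header names an extension header type rather than an upper-layer protocol such as UDP. The sketch below assumes that simple test and uses hypothetical helper names; the patch itself works on struct sk_buff in C and may decide differently.

```python
# Illustrative check only; helper names are made up for this example.
IPV6_HEADER_LEN = 40
NEXT_HEADER_OFFSET = 6  # position of the Next Header byte in the fixed header

# IANA protocol numbers of common IPv6 extension headers.
EXTENSION_HEADERS = {
    0,   # Hop-by-Hop Options
    43,  # Routing
    44,  # Fragment
    60,  # Destination Options
}


def has_ipv6_extension_header(packet: bytes) -> bool:
    """True if the fixed IPv6 header is followed by an extension header."""
    if len(packet) < IPV6_HEADER_LEN:
        raise ValueError("packet shorter than the fixed IPv6 header")
    if packet[0] >> 4 != 6:
        raise ValueError("not an IPv6 packet")
    return packet[NEXT_HEADER_OFFSET] in EXTENSION_HEADERS


def make_ipv6_header(next_header: int) -> bytes:
    """Build a minimal fixed IPv6 header with the given Next Header value."""
    header = bytearray(IPV6_HEADER_LEN)
    header[0] = 0x60                       # version 6 in the top four bits
    header[NEXT_HEADER_OFFSET] = next_header
    return bytes(header)


# Destination Options (60) in front of UDP, as in the ip6_tunnel case above.
print(has_ipv6_extension_header(make_ipv6_header(60)))  # True
# Plain UDP (17) directly after the fixed header.
print(has_ipv6_extension_header(make_ipv6_header(17)))  # False
```

A transmit path following the patch title would fall back to software checksumming whenever such a check reports an extension header.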

v2: net-next: net: sparx5: prepare for lan969x switch driver

This series is the first of a multi-part series, that prepares and adds support for the new lan969x switch driver.

v4: net: dsa: lan9303: ensure chip reset and wait for READY status

Accessing device registers seems to be unreliable: the chip revision is sometimes detected incorrectly (0 instead of the expected 1).

v1: treewide: Switch to __pm_runtime_put_autosuspend()

This set will switch the users of pm_runtime_put_autosuspend() to __pm_runtime_put_autosuspend() while the former will soon be re-purposed to include a call to pm_runtime_mark_last_busy(). The two are almost always used together, apart from bugs which are likely common. Going forward, most new users should be using pm_runtime_put_autosuspend().

v1: iproute2-next: rt_names: read rt_addrprotos.d directory

My magic 8-ball predicts we might be grabbing a value or two for use in FRRouting at some point in the future. Let’s make it so we can ship those in a separate file when it’s time.

v1: net-next: Introduce VLAN support in HSR

This series adds VLAN support to HSR framework. This series also adds VLAN support to HSR mode of ICSSG Ethernet driver.

v1: iwl-next: ice: Add in/out PTP pin delays

HW can have different input/output delays for each of the pins. Add a field in ice_ptp_pin_desc structure to reflect that.

v1: net-next: mlxsw: spectrum_acl_flex_keys: Constify struct mlxsw_afk_element_inst

Constifying these structures moves some data to a read-only section, so increases overall security.

Security Hardening

v4: block: partition table OF support

Some background on this. Many OEMs of embedded devices (modems, routers…) are starting to migrate from NOR/NAND flash to eMMC.

v1: coredump: Do not lock during ‘comm’ reporting

The ‘comm’ member will always be NUL terminated, and this is not fast-path, so we can just perform a direct memcpy during a coredump instead of potentially deadlocking while holding the task struct lock.

v1: hardening: Adjust dependencies in selection of MODVERSIONS

Add the !COMPILE_TEST dependency to the selections to clear up the warning.

Asynchronous I/O

v1: liburing: sanitize: add ifdef guard around sanitizer functions

Otherwise there are redefinition errors during compilation if CONFIG_USE_SANITIZER isn’t set.

v1: [PATCHES] xattr stuff and interactions with io_uring

The series below is a small-scale attempt at sanitizing the interplay between io_uring and normal syscalls.

v7: FDP and per-io hints

Another spin to incorporate the feedback from LPC and previous iterations.

Rust For Linux

v1: implement kernel::sync::Refcount and convert users

This series consolidates them into a single Refcount which wraps refcount_t and is used by both.

v3: rust: optimize error type to use nonzero

This reduces the space used by the Result type, as the NonZero* type enables the compiler to apply more efficient memory layout.

v2: rust: device: change the from_raw() function

The function’s new name should be “get_device”, to be consistent with the get_device() function that already exists in the C files.

v1: net-next: add delay abstraction (sleep functions)

Add an abstraction for sleep functions in include/linux/delay.h for dealing with hardware delays. delay.h supports sleep and delay (busy wait). This adds support for sleep functions used by QT2025 PHY driver to sleep until a PHY becomes ready.

v3: rust: device: rename “Device::from_raw()”

The function “Device::from_raw()” increments the refcount by calling “bindings::get_device(prt)”. This can be confusing because Arc::from_raw() from the standard library doesn’t increment the refcount.

BPF

v4: dwarves: Emit global variables in BTF

This is v4 of the series which adds global variables to pahole’s generated BTF.

v3: tracing: Allow system call tracepoints to handle page faults

This series does the initial wire-up allowing tracers to handle page faults, but leaves out the actual handling of said page faults as future work.

v4: bpf-next: Support eliding map lookup nullness

This patch allows progs to elide a null check on statically known map lookup keys. In other words, if the verifier can statically prove that the lookup will be in-bounds, allow the prog to drop the null check.
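
The condition described here can be modelled in a few lines. This sketch assumes an array-style map whose valid keys are the indices 0 to max_entries - 1; the function name and parameters are hypothetical and do not mirror the verifier's actual code.

```python
from typing import Optional


def can_elide_null_check(const_key: Optional[int], max_entries: int) -> bool:
    """Decide whether a lookup result is provably non-NULL.

    const_key is None when the key is not statically known. Hypothetical
    model that assumes array-style maps indexed from 0 to max_entries - 1.
    """
    return const_key is not None and 0 <= const_key < max_entries


print(can_elide_null_check(3, 8))     # constant key inside the map
print(can_elide_null_check(8, 8))     # constant key one past the end
print(can_elide_null_check(None, 8))  # key only known at run time
```

Only the first call returns True; in the other two cases the program would still need its null check.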

v1: net: sfc: Don’t invoke xdp_do_flush() from netpoll.

Yury reported a crash in the sfc driver originated from netpoll_send_udp().

v3: HID: HID: bpf: add a new hook to control hid-generic

This is a slight change from the fundamentals of HID-BPF. In theory, HID-BPF is abstract to the kernel itself, and makes only changes at the HID level (through report descriptors or events emitted to/from the device).

[PATCH RESEND tip/perf/core] uprobes: switch to RCU Tasks Trace flavor for better performance

This patch switches uprobes SRCU usage to RCU Tasks Trace flavor, which is optimized for more lightweight and quick readers (at the expense of slower writers, which for uprobes is a fine trade-off) and has better performance and scalability with number of CPUs.

v2: PCI: add enabe(disable)_device() hook for bridge

On some systems the IOMMU stream (master) ID has fewer bits (such as 6 bits) than the pci_device_id (16 bits). Hardware configuration must be added to enable the conversion from pci_device_id to stream ID.

v1: resend: tracing: Allow system call tracepoints to handle page faults

This series does the initial wire-up allowing tracers to handle page faults, but leaves out the actual handling of said page faults as future work.

Re: v2: Add BPF Kernel Function bpf_ptrace_vprintk

This patch is mainly motivated by Android Perfetto (a powerful trace collection and analysis tool that supports the ftrace data source).

v1: bpf: Prevent infinite loops with bpf_redirect_peer

It is possible to create cycles using bpf_redirect_peer which lead to an infinite loop inside __netif_receive_skb_core.

Related Technology News

Qemu

v9: riscv: QEMU RISC-V IOMMU Support

In this new version we fixed the IOVA == GPA MSI early check in patch 3, in riscv_iommu_spa_fetch(), after discussions with Tomasz and Drew on v8.

v15: riscv support for control flow integrity extensions

I’ve rebased again on https://github.com/alistair23/qemu/blob/riscv-to-apply.next (tag: pull-riscv-to-apply-20241002)

v3: riscv-to-apply queue

v1: target/riscv: Set vtype.vill on CPU reset

This change now makes QEMU consistent with Spike which sets vtype.vill on reset.

v1: hw/riscv/spike: Replace tswap64() by ldq_endian_p()

Hold the target endianness in HTIFState::target_is_bigendian. Pass the target endianness as argument to htif_mm_init(). Replace tswap64() calls by ldq_endian_p() ones.

Buildroot

arch/arm: add support for FDPIC

Linux on ARM supports FDPIC binaries intended for use on no-MMU systems. This patch enables support for building a toolchain that produces FDPIC binaries, only for ARMv7-M platforms, for which FDPIC binaries are relevant.

U-Boot

v1: Support OF_UPSTREAM for StarFive JH7110

This patchset adds OF_UPSTREAM support for StarFive JH7110 based boards. All the JH7110 based boards can use the DT from the upstream Linux kernel. The v1.3b board device tree is set as the default device tree.


