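RISC-V Linux Kernel and Surrounding Technology News, Issue 105

Written by 呀呀呀 on 2024/08/23

Date: 20240818
Editor: 晓瑜
Repository: RISC-V Linux Kernel Technology Research Activity
Sponsor: PLCT Lab, ISCAS

Kernel News

RISC-V Architecture Support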

v5: Zacas/Zabha support and qspinlocks

This implements [cmp]xchgXX() macros using Zacas and Zabha extensions and finally uses those newly introduced macros to add support for qspinlocks.

v2: irqchip/sifive-plic: Probe plic driver early for Allwinner D1 platform

The latest Linux RISC-V no longer boots on the Allwinner D1 platform because the sun4i_timer driver fails to get an interrupt from PLIC.

v1: riscv: hwprobe: export Zicntr and Zihpm extensions

Export Zicntr and Zihpm ISA extensions through the hwprobe syscall.

v3: of/irq: Support #msi-cells=<0> in of_msi_get_domain

An ‘msi-parent’ property with a single entry and no accompanying ‘#msi-cells’ property is considered the legacy definition as opposed to its definition after being expanded with commit 126b16e2ad98.

v11: riscv: sophgo: Add SG2042 external hardware monitor support

Add support for the onboard hardware monitor for SG2042. Can be tested with OpenSBI v1.5.

v7: Tracepoints and static branch in Rust

An important part of a production ready Linux kernel driver is tracepoints.

v1: RISC-V: KVM: Don’t zero-out PMU snapshot area before freeing data

With the latest Linux-6.11-rc3, the below NULL pointer crash is observed when SBI PMU snapshot is enabled for the guest and the guest is forcefully powered-off.

v4: riscv: mm: Add soft-dirty and uffd-wp support

This patchset adds soft dirty and userfaultfd write protect tracking support for RISC-V.

v1: riscv: misaligned: Restrict user access to kernel memory

raw_copy_{to,from}_user() do not call access_ok(), so this code allowed userspace to access any virtual memory address.
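The missing range check can be pictured with a small userspace model. This is only a sketch: `DEMO_TASK_SIZE` and `demo_access_ok()` are hypothetical names, the 47-bit limit is an assumed value, and the real access_ok() is architecture-specific.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical upper bound of the user address range (assumed value). */
#define DEMO_TASK_SIZE ((uint64_t)1 << 47)

/*
 * Model of an access_ok()-style check: the whole range [addr, addr + size)
 * must lie below the user limit. The comparison is arranged so that
 * addr + size is never computed and therefore cannot overflow.
 */
static bool demo_access_ok(uint64_t addr, uint64_t size)
{
	return size <= DEMO_TASK_SIZE && addr <= DEMO_TASK_SIZE - size;
}
```

A copy routine that skips such a check would accept a kernel-range address like 0xffffffff80000000, which this model rejects.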

v1: kasan: RISC-V support for KASAN_SW_TAGS using pointer masking

This series implements support for software tag-based KASAN using the RISC-V pointer masking extension[1], which supports 7 and/or 16-bit tags.
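As a rough illustration of tag-based checking with a 7-bit tag, the sketch below stores the tag in the top 7 bits of a 64-bit pointer value and compares it against a memory tag. All `demo_*` names are hypothetical, the bit placement (bits 63:57) is an assumption, and details such as sign extension of kernel addresses are ignored.

```c
#include <stdint.h>

#define DEMO_TAG_BITS  7
#define DEMO_TAG_SHIFT (64 - DEMO_TAG_BITS)	/* assumed: tag in bits 63:57 */
#define DEMO_TAG_MASK  (((uint64_t)1 << DEMO_TAG_BITS) - 1)

/* Replace the tag bits of a pointer value with the given tag. */
static uint64_t demo_set_tag(uint64_t addr, uint8_t tag)
{
	uint64_t untagged = addr & ~(DEMO_TAG_MASK << DEMO_TAG_SHIFT);

	return untagged | (((uint64_t)tag & DEMO_TAG_MASK) << DEMO_TAG_SHIFT);
}

/* Extract the tag stored in the pointer's upper bits. */
static uint8_t demo_get_tag(uint64_t ptr)
{
	return (uint8_t)((ptr >> DEMO_TAG_SHIFT) & DEMO_TAG_MASK);
}

/* An access is considered valid only if pointer tag and memory tag agree. */
static int demo_tags_match(uint64_t ptr, uint8_t mem_tag)
{
	return demo_get_tag(ptr) == (mem_tag & DEMO_TAG_MASK);
}
```

With pointer masking enabled, the hardware ignores the tag bits during address translation, so the tagged value can still be dereferenced.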

v4: riscv: Per-thread envcfg CSR support

This series (or equivalent) is a prerequisite for both user-mode pointer masking and CFI support, as both of those are per-thread features and are controlled by fields in the envcfg CSR.

v5: PCI: microchip: support using either instance 1 or 2

The current driver and binding for PolarFire SoC’s PCI controller assume that the root port instance in use is instance 1.

v2: cpuidle: riscv-sbi: Allow cpuidle pd used by other devices

This patchset lets devices inside the CPU/cluster power domain use the cpuidle pd to register a genpd notifier, so that they can handle power management when the CPU/cluster is about to enter a deeper sleep state.

v2: riscv: Add perf support to collect KVM guest statistics from host side

Add basic guest support to RISC-V perf, enabling it to distinguish whether PMU interrupts occur in the host or the guest, and then collect some basic guest information from the host side (guest OS callchains are not supported for now).

v4: Add SARADC support on Sophgo CV18XX series

This patchset adds initial ADC support for Sophgo SoC. This driver can work with or without interrupt and in “Active” and “No-Die” domains depending on if a clock is provided.

v1: perf/riscv-sbi: Add platform specific firmware event handling

The SBI v2.0 specification pointed to by the link below reserves the event code 0xffff for platform specific firmware events.

v8: RISC-V: ACPI: Add external interrupt controller support

The series primarily enables irqchip drivers for RISC-V ACPI based platforms.

v11: riscv: sophgo: add dmamux support for Sophgo CV1800/SG2000 SoCs

Add dma multiplexer support for the Sophgo CV1800/SG2000 SoCs.

v1: ACPI: RISCV: Make acpi_numa_get_nid() to be static

acpi_numa_get_nid() is only called in acpi_numa.c for riscv, so there is no need to declare it in a header file. Make it static and remove the related functions from asm/acpi.h.

v9: riscv: Add support for xtheadvector

Regarding the issues with the Allwinner D1 CPU, I tried to track down the problematic commit. I tested with a MangoPi MQ board (Allwinner D1s), and starting from this merge I can’t get beyond “Starting kernel…”, i.e. there is no output at all (and U-Boot keeps restarting).

LoongArch Architecture Support

v2: LoongArch: Implement getrandom() in vDSO

The vDSO getrandom() needs a stack-less ChaCha20 implementation, so we need to add architecture-specific code and wire it up with the generic code.

v10: Loongarch-avec support

This series of patches introduces support for advanced extended interrupt controllers (AVECINTC). This hardware feature will be supported on 3C6000 for the first time.

v6: Add extioi virt extension support

KVM_FEATURE_VIRT_EXTIOI is a paravirt feature defined with the EXTIOI interrupt controller; it can route interrupts to 256 vCPUs and to CPU interrupt pins IP0-IP7.

ARM Architecture Support

v3: drm: sun4i: add Display Engine 3.3 (DE33) support

V3 of this patch series adding support for the Allwinner DE33 display engine variant. V3 is rebased on top of layer init and modesetting changes merged for 6.11. No functional changes from V2, fixes and review from previous V1 and V2 added, and correction to DT bindings.

v3: Support for I/O width within ARM SCMI SHMEM

This patch series adds support for the ‘reg-io-width’ property and allows us to specify the exact access width that the SRAM supports.

v3: KVM: arm64: Make the exposed feature bits in AA64DFR0_EL1 writable from userspace

KVM exposes the OS double lock feature bit to Guests but returns RAZ/WI on Guest OSDLR_EL1 access.

v3: iommu/arm-smmu-v3: Match Stall behaviour for S2

According to the spec (ARM IHI 0070 F.b), in “5.5 Fault configuration (A, R, S bits)”: A STE with stage 2 translation enabled and STE.S2S == 0 is considered ILLEGAL if SMMU_IDR0.STALL_MODEL == 0b10.

v12: Add i2c-mux and eeprom devices for Meta Yosemite 4

v4: power: supply: max77693: Toggle charging/OTG based on extcon status

The extcon listener implementation is inspired by the rt5033 charger driver (commit 8242336dc8a8 (“power: supply: rt5033_charger: Add cable detection and USB OTG supply”)).

v12: Add Tegra241 (Grace) CMDQV Support (part 1/2)

NVIDIA’s Tegra241 (Grace) SoC has a CMDQ-Virtualization (CMDQV) hardware that extends standard ARM SMMUv3 to support multiple command queues with virtualization capabilities.

v1: KVM: arm64: Add support for FEAT_LS64 and co

The ARM architecture has introduced a while back some 64 byte Load/Store operations that are targeting NormalNC or Device memory.

v3: Add initial support for Rockchip RK3528 SoC

This series adds a basic device tree with CPU, interrupts and UART nodes for it and is able to boot into a kernel with only a UART console.

v1: Work around reserved SMMU context bank on msm8998

On qcom msm8998, writing to the last context bank of lpass_q6_smmu (base address 0x05100000) produces a system freeze & reboot.

v1: firmware: arm_ffa: FF-A basic v1.2 support

This series adds basic support for FF-A v1.2.

v1: Cosmetic Work for ARM/Microchip (AT91)

This patch set proposes to:
clean up coding style errors reported by checkpatch.pl
align the nodename and sub nodename according to the devicetree specification even with their binding.

v3: MIPI DSI Controller support for SAM9X75 series

This patch series adds support for the Microchip’s MIPI DSI Controller wrapper driver that uses the Synopsys DesignWare MIPI DSI host controller bridge for SAM9X75 SoC series.

v2: Support Armv8.9/v9.4 FEAT_HAFT

This series adds basic support for FEAT_HAFT introduced in Armv8.9/v9.4 and enables ARCH_HAS_NONLEAF_PMD_YOUNG. The latter will be used in lru-gen aging.

v4: media: aspeed: Allow to capture from SoC display (GFX)

The aim of this series is to add another capture source, SoC Display(GFX), for video.

v1: KVM: arm64: Add EL2 support to FEAT_S1PIE

As this is a parallel series to the one implementing Address Translation, the S1PIE part of AT is not in any of the two.

v1: perf stat: Make default perf stat command work on Arm big.LITTLE

The important patches are 3 and 5, the rest are tidyups and tests.

v5: Add support for Kontron OSM-S i.MX93 SoM and carrier board

v7: Constify tool pointers

This change refactors struct perf_tool to have an init function that provides the default implementation.

v1: TQMa6x / MBa6x DT improvements

This series brings following improvements:
use a more specific compatible for the LM75 temperature sensors on SoM and mainboard
move I2C pinmux entries to variants that use them and prevent doubled declaration
rename node name for onboard USB hub

X86 Architecture Support

v5: Touch Bar support for T2 Macs

The Touch Bars found on x86 Macs support two USB configurations: one where the device presents itself as a HID keyboard and can display predefined sets of keys, and one where the operating system has full control over what is displayed.

v1: 5.15: 5.15.165-rc3 review

This is the start of the stable review cycle for the 5.15.165 release.

v1: 5.10: 5.10.224-rc3 review

This is the start of the stable review cycle for the 5.10.224 release.

v1: x86/resctrl: Support L3 Smart Data Cache Injection Allocation Enforcement (SDCIAE)

This series adds the support for L3 Smart Data Cache Injection Allocation Enforcement (SDCIAE) to resctrl infrastructure.

v4: platform/x86: thinkpad_acpi: Add Thinkpad Edge E531 fan support

Fan control on the E531 is done using the ACPI methods FANG and FANW. The correct parameters and register values were found by analyzing EC firmware as well as DSDT. This has been tested on my Thinkpad Edge E531 (6885CTO, BIOS HEET52WW 1.33).

v6: perf: Support searching local debugging vdso or specify vdso path in cmdline

The vdso dumped from process memory (in buildid-cache) lacks debugging info. To annotate vdso symbols with source lines we need a debugging version.

v3: Add CPU-type to topology

v1: KVM: x86: Optimize local variable in start_sw_tscdeadline()

Change the data type of the local variable this_tsc_khz to u32 because virtual_tsc_khz is also declared as u32.

v1: KVM: x86/mmu: Register MMU shrinker only when necessary

The shrinker is allocated with TDP MMU, which is meaningless except for nested VMs, and ‘count_objects’ is also called each time the reclaim path tries to shrink slab caches. Let’s allocate the shrinker only when necessary.

v2: mm-unstable: mm/hugetlb: alloc/free gigantic folios

Using __GFP_COMP for gigantic folios can greatly reduce not only the amount of code but also the allocation and free time.

v1: um: make personality(PER_LINUX32) work on x86_64

v1: TDX vCPU/VM creation

This series kicks off the actual interaction of KVM with the TDX module.

v1: Add SEV-SNP CipherTextHiding feature support

Ciphertext hiding prevents host accesses from reading the ciphertext of SNP guest private memory. Instead of reading ciphertext, the host will see constant default values (0xff).

Process Scheduling

v1: sched/eevdf: Improve the clarity of the lag-based placement comments

In the original comments, the derivation starts with preserving v_i to calculate V’ and uses the equation v_i = V - vl_i. However, tasks might have migrated from other queues, which means that the relationship with this queue’s V does not necessarily hold.
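For readers unfamiliar with the derivation being discussed, it can be sketched as follows. The symbols W (total weight of the queue before placement) and w_i (weight of the entity being placed) are assumptions taken from the usual presentation of this argument and are not named in the summary above.

```latex
\begin{aligned}
vl_i  &= V - v_i \\
V'    &= \frac{W\,V + w_i\,v_i}{W + w_i}
       = V - \frac{w_i\,vl_i}{W + w_i} \\
vl'_i &= V' - v_i = \frac{W}{W + w_i}\,vl_i
\end{aligned}
```

Under these assumptions the lag observed after placement is smaller than the lag carried in, so preserving a target lag vl'_i requires placing the entity with the inflated value vl_i = (W + w_i) vl'_i / W.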

v1: sched: Prepare for sched_ext

These patches apply on top of the EEVDF series (queue/sched/core), which re-arranges the fair pick_task() functions to make them state invariant such that they can easily be restarted upon picking (and dequeueing) a delayed task.

v1: sched/deadline: nanoseconds clarifications

A couple of clarifications about the time units for the deadline parameters uncovered in the discussion around https://lore.kernel.org/lkml/3c726cf5-0c94-4cc6-aff0-a453d840d452@arm.com/

Memory Management

v1: mm: finish three more folio conversion

Convert to use folios then remove find_subpage(), thp_nr_pages() and PageTransHuge().

v2: resend: mm: memory_hotplug: improve do_migrate_range()

Unify hwpoisoned page handling and isolation of HugeTLB/LRU/non-LRU movable page, also convert to use folios in do_migrate_range().

v3: mm: clarify nofail memory allocation

If we must still fail a nofail allocation, we should trigger a BUG rather than exposing NULL dereferences to callers who do not check the return value.

v5: mm: override mTHP “enabled” defaults at kernel cmdline

Add thp_anon= cmdline parameter to allow specifying the default enablement of each supported anon THP size. The parameter accepts the following format and can be provided multiple times to configure each size

v7: Improve the copy of task comm

As suggested by Linus [0], we can identify all relevant code with the following git grep command

v3: mm: Optimize mseal checks

This series also depends on the powerpc series that removes arch_unmap[2]. This series is already in mm-unstable.

v4: filemap: add trace events for get_pages, map_pages, and fault

To allow precise tracking of page caches accessed, add new tracepoints that trigger when a process actually accesses them.

v11: EDAC: Scrub: introduce generic EDAC RAS control feature driver + CXL/ACPI-RAS2 drivers

Previously known as “ras: scrub: introduce subsystem + CXL/ACPI-RAS2 drivers”.

v5: Rebase v5 patchset to next-20240816

Since the v5 patchset some changes have taken place in the linux-next tree which make it impossible to cleanly apply that patchset.

v1: mm: control mthp per process/cgroup

Currently the large folio control interfaces are system-wide and tend to be on by default: file systems use large folios by default if supported, and mTHP tends to be enabled by default at boot [1].

v2: mm: memory_hotplug: improve do_migrate_range()

Unify hwpoisoned page handling and isolation of HugeTLB/LRU/non-LRU movable page, also convert to use folios in do_migrate_range().

v1: mm: document risk of PF_MEMALLOC_NORECLAIM

Andrew, could you merge the following before PF_MEMALLOC_NORECLAIM can be removed from the tree altogether please? For the full context the email thread starts here: https://lore.kernel.org/all/20240812090525.80299-1-laoar.shao@gmail.com/T/#u

v2: mm: add lazyfree folio to lru tail

With the change in place, workingset_refault_file is reduced by 33% in the continuous startup testing of the applications in the Android system.

v2: mm/slub: Add check for s->flags in the alloc_tagging_slab_free_hook

When CONFIG_MEMCG, CONFIG_KFENCE and CONFIG_KMEMLEAK are all enabled, the following warning always occurs. This is because the following call stack occurred.

v6: Generic Allocator support for Rust

This patch series adds generic kernel allocator support for Rust, which so far is limited to kmalloc allocations.

v1: Virtualizing tagged disaggregated memory capacity (app specific, multi host shared)

v1: Consolidate iommu page table implementations

Currently each of the iommu page table formats duplicates all of the logic to maintain the page table and perform map/unmap/etc operations.

v12: enable bs > ps in XFS

This is the 12th version of the series that enables block size > page size (Large Block Size) experimental support in XFS. Please consider this for the inclusion in 6.12.

v1: codetag: debug: mark codetags for cma pages as empty

To avoid debug warnings while freeing cma pages which were not allocated with usual allocators, mark their codetags as empty before freeing.

v1: memcg: further decouple v1 code from v2

Some of the v1 code is still in the v2 code base due to v1 fields in struct memcg_vmstats_percpu. This series decouples those fields from the v2 struct and moves all the related code into the v1-only code base.

v4: mm,memcg: provide per-cgroup counters for NUMA balancing operations

The ability to observe the demotion and promotion decisions made by the kernel on a per-cgroup basis is important for monitoring and tuning containerized workloads on machines equipped with tiered memory.

v2: memcg: initiate deprecation of v1 features

Let’s start the deprecation process of the memcg v1 features which we discussed during LSFMMBPF 2024 [1]. For now, add warnings to collect information on how current users are using these features.

v2: mm, slub: print CPU id (and its node) on slab OOM

Depending on how remote_node_defrag_ratio is configured, allocations can end up in this path as a result of the local node being OOM, despite the allocation overall being unconstrained (node == -1).

File Systems

v5: block atomic writes for xfs

This series expands atomic write support to filesystems, specifically XFS. Extent alignment is based on new feature forcealign.

v1: -next: doc: correcting the idmapping mount example

In step 2, we obtain the kernel id k1000. So in next step (step 3), we should translate the k1000 not k21000.

v1: more close_range() fun

Note that close_range() call in the second thread does not affect any of the descriptors we work with in the first thread and at no point does thread 1 have descriptor 10 opened without descriptor 1023 also being opened.

v1: vfs: elide smp_mb in iversion handling in the common case

According to bpftrace on these routines most calls result in cmpxchg, which already provides the same guarantee.

v4: fanotify: add pre-content hooks

v2: netfs: Read/write improvements

This set of patches includes a couple of fixes

v6: bpf-next: Harden and extend ELF build ID parsing logic

The goal of this patch set is to extend existing ELF build ID parsing logic, currently mostly used by BPF subsystem, with support for working in sleepable mode in which memory faults are allowed and can be relied upon to fetch relevant parts of ELF file to find and fetch .note.gnu.build-id information.

v1: Merge PG_private_2 and PG_mappedtodisk

I believe these two flags have entirely disjoint uses and there will be no confusion in amalgamating them.

v2: autofs: add per dentry expire timeout

Add ability to set per-dentry mount expire timeout to autofs.

v1: f2fs: new mount API conversion

The series can be applied on top of the current mainline tree, and the work is based on the patches from Lukas Czerner (who did this for ext4 [1]).

v1: bpf-next: BPF follow ups to struct fd refactorings

This patch set extracts all the BPF-related changes done in [0] into a separate series based on top of stable-struct_fd branch ([1]) merged into bpf-next tree.

v4: forcealign for xfs

This series is being spun off the block atomic writes for xfs series at [0].

v3: ext4: simplify the counting and management of delalloc reserved blocks

v2: file: reclaim 24 bytes from f_owner

We currently embed struct fown_struct into struct file, letting it take up 32 bytes in total. We could tweak struct fown_struct to be more compact, but really it shouldn’t even be embedded in struct file in the first place.

v3: fuse: Allow page aligned writes

Write IOs should be page aligned as fuse server might need to copy data to another buffer otherwise in order to fulfill network or device storage requirements.

v1: net-next: Suspend IRQs during preferred busy poll

This is the idea I mentioned at netdev conf, for those who were there. Barring any major issues, we hope to submit this officially shortly after RFC.

Network Devices

v1: net-next: net: hns3: Use ARRAY_SIZE() to improve readability

There is a helper function, ARRAY_SIZE(), that calculates the u32 array size, so we don’t need to do it manually. Let’s use ARRAY_SIZE() to calculate the array size and improve code readability.
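The difference between the manual calculation and the helper can be shown with a userspace stand-in. This is a sketch only: the macro below mimics the kernel's ARRAY_SIZE() without its compile-time pointer check, and `demo_regs` is a hypothetical table that does not come from the hns3 driver.

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel helper (no must-be-array check). */
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))

/* Hypothetical register table, used only for illustration. */
static const uint32_t demo_regs[] = { 0x10, 0x14, 0x18, 0x1c, 0x20 };

/* Manual form: stays correct only while the element type remains u32. */
static size_t demo_count_manual(void)
{
	return sizeof(demo_regs) / sizeof(uint32_t);
}

/* Helper form: follows the array's element type automatically. */
static size_t demo_count_helper(void)
{
	return ARRAY_SIZE(demo_regs);
}
```

Both forms return the same count here; the helper version simply removes the hard-coded element type from the call site.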

v1: net-next: tcp: do not allow to connect with the four-tuple symmetry socket

It can rarely happen on the loopback device when the connect() finds the same port as its remote port while listener is not running. It has the side-effect on other threads. Besides, this solo flow has no merit, no significance at all.

v3: net-next: net: ipv6: ioam6: introduce tunsrc

This series introduces a new feature called “tunsrc” (just like seg6 already does).

v1: net-next: pull-request appletalk 2024-08-17

This is a pull request of 2 patches for net-next/master.

v4: net-next: net: phy: add Applied Micro QT2025 PHY driver

This patchset adds a PHY driver for Applied Micro Circuits Corporation QT2025.

v1: net-next: af_unix: Don’t call skb_get() for OOB skb.

Since its introduction, the OOB skb has held an additional reference count for no special reason, which has caused many issues.

v1: net-next: net: ag71xx: disable GRO by default

ag71xx is usually paired with qca8k or ar9331, both DSA drivers. DSA internally uses GRO cells to speed up transactions.

v1: net: mlxbf_gige: disable port during stop()

The mlxbf_gige_open() routine initializes and enables the Gigabit Ethernet port from a hardware point of view.

v1: iproute2-next: ip: nexthop: Support 16-bit nexthop weights

Two interlinked changes related to the nexthop group management have been recently merged in kernel commit e96f6fd30eec (“Merge branch ‘net-nexthop-increase-weight-to-u16’”).

v1: net-next: tcp: change source port selection at bind() time

This is a follow-up patch to an earlier commit 207184853dbd (“tcp/dccp: change source port selection at connect() time”).

v2: net: igb: cope with large MAX_SKB_FRAGS

Sabrina reports that the igb driver does not cope well with large MAX_SKB_FRAGS values: setting MAX_SKB_FRAGS to 45 causes payload corruption on TX.

v3: net-next: net: mana: Implement get_ringparam/set_ringparam for mana

Currently the values of WQs for RX and TX queues for MANA devices are hardcoded to default sizes.

v6: iwl-next: igb: Add support for AF_XDP zero-copy

This is version 6 of the AF_XDP zero-copy support for igb. Since Sriram’s duties changed I am sending this instead. Additionally, I’ve tested this on real hardware, Intel i210 [1].

v1: cxgb4: add forgotten u64 ivlan cast before shift

It is done everywhere in cxgb4 code, e.g. in is_filter_exact_match(). There is no reason it should not be done here.
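Why the cast matters can be seen in a small sketch. The shift amount and function names below are hypothetical and are not taken from the cxgb4 sources; the point is only that a shift evaluated in 32 bits drops the high bits before the result is widened.

```c
#include <stdint.h>

/* Hypothetical field position, chosen only for illustration. */
#define DEMO_IVLAN_SHIFT 20

/* Without the cast: the shift happens in 32 bits, then the result is widened. */
static uint64_t demo_pack_narrow(uint16_t ivlan)
{
	uint32_t v = ivlan;

	return v << DEMO_IVLAN_SHIFT;
}

/* With the cast: the value is widened first, so no bits are lost. */
static uint64_t demo_pack_wide(uint16_t ivlan)
{
	return (uint64_t)ivlan << DEMO_IVLAN_SHIFT;
}
```

For an input of 0xABCD, the narrow version yields 0xBCD00000 while the wide version yields 0xABCD00000.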

v1: net: dsa: microchip: add KSZ8 change_tag_protocol support

Add support for changing the KSZ8 switches tag protocol.

v2: net-next: Add driver for Motorcomm yt8821 2.5G ethernet phy

yt8521 and yt8531s, as Gigabit transceivers, use bit15:14 (bit9 is reserved and defaults to 0) as the PHY speed mask; yt8821, as a 2.5G transceiver, uses bit9 together with bit15:14 as the PHY speed mask.
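The two mask layouts can be sketched as follows. The `DEMO_*` names are hypothetical, and treating bit9 as the high-order bit of a 3-bit field is an assumption made for illustration; the mapping from field values to link speeds is not shown because the summary does not give it.

```c
#include <stdint.h>

#define DEMO_SPEED_LO_SHIFT 14
#define DEMO_SPEED_LO_MASK  (0x3u << DEMO_SPEED_LO_SHIFT)	/* bit15:14 */
#define DEMO_SPEED_HI_BIT   (1u << 9)				/* bit9 */

/* Gigabit-style decode: only bit15:14 carry speed information. */
static unsigned int demo_speed_field_2bit(uint16_t status)
{
	return (status & DEMO_SPEED_LO_MASK) >> DEMO_SPEED_LO_SHIFT;
}

/* 2.5G-style decode: bit9 extends the field (bit ordering is assumed). */
static unsigned int demo_speed_field_3bit(uint16_t status)
{
	unsigned int lo = (status & DEMO_SPEED_LO_MASK) >> DEMO_SPEED_LO_SHIFT;
	unsigned int hi = (status & DEMO_SPEED_HI_BIT) ? 1u : 0u;

	return (hi << 2) | lo;
}
```

A status word with only bit9 set decodes to 0 under the 2-bit mask but to a distinct value under the 3-bit mask, which is why the wider mask is needed for the 2.5G part.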

v1: net-next: Bonding: support new xfrm state offload functions

I planned to add the new XFRM state offload functions after Jianbo’s patchset [1], but it seems that may take some time.

v1: vhost_vdpa: assign irq bypass producer token correctly

We used to call irq_bypass_unregister_producer() in vhost_vdpa_setup_vq_irq() which is problematic as we don’t know if the token pointer is still valid or not.

v1: net: ethernet: ibm: Simpify code with for_each_child_of_node()

for_each_child_of_node can help to iterate through the device_node, and we don’t need to use while loop. No functional change with this conversion.

v11: Add AP6275P wireless support

These add AP6275P wireless support on Khadas Edge2. Enable 32k clock for Wi-Fi module and extend the hardware IDs table in the brcmfmac driver for it to attach.

v1: Implement performance impact measurement tool

Landlock LSM hooks are executed with many operations on Linux internal objects (files, sockets).

v1: net: kcm: Serialise kcm_sendmsg() for the same socket.

syzkaller reported UAF in kcm_release().

v2: net-next: flow_dissector: Dissect UDP encapsulation protocols

Add support in flow_dissector for dissecting into UDP encapsulations like VXLAN.

v3: Landlock: Signal Scoping Support

This patch series adds scoping mechanism for signals. Closes: https://github.com/landlock-lsm/linux/issues/8

v1: net: tc-testing: don’t access non-existent variable on exception

Since commit 255c1c7279ab (“tc-testing: Allow test cases to be skipped”) the variable test_ordinal doesn’t exist in call_pre_case().

v1: net-next: net/smc: add sysctl for smc_limit_hs

In commit 48b6190a0042 (“net/smc: Limit SMC visits when handshake workqueue congested”), we introduce a mechanism to put constraint on SMC connections visit according to the pressure of SMC handshake process.

v1: net: do not release sk in sk_wait_event

When investigating the kcm socket UAF which is also found by syzbot, I found that the root cause of this problem is actually in sk_wait_event.

Security Hardening

v1: mremap refactor: check src address for vma boundaries first.

mremap doesn’t allow relocating, expanding or shrinking across VMA boundaries. Refactor the code to check the src address range before doing anything on the destination, i.e. the destination won’t be unmapped if the src address fails the boundary check.

v7: Add support for aw96103/aw96105 proximity sensor

Add drivers that support Awinic aw96103/aw96105 proximity sensors.

v1: coccinelle: Add rules to find str_down_up() replacements

As done with str_up_down(), add checks for str_down_up() opportunities. 5 cases currently exist in the tree.

v1: string_choices: Add wrapper for str_down_up()

The string choice functions which are not clearly true/false synonyms also have inverted wrappers. Add this for str_down_up() as well.
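The idea of an inverted wrapper can be sketched in userspace C. The definitions below are modeled on the kernel's string choice helpers but should be read as an assumption about their shape, not as a copy of the kernel header.

```c
#include <stdbool.h>

/* Pick one of two fixed strings based on a boolean. */
static inline const char *str_up_down(bool v)
{
	return v ? "up" : "down";
}

/* Inverted wrapper: same strings, opposite sense of the argument. */
#define str_down_up(v) str_up_down(!(v))
```

With these definitions, str_up_down(true) yields "up" and str_down_up(true) yields "down", so callers can keep their condition as written instead of negating it by hand.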

Asynchronous I/O

v1: liburing: Add io_uring_iowait_toggle()

Add io_uring_iowait_toggle() helper function for the userspace liburing side of IORING_ENTER_NO_IOWAIT flag added in io_uring for 6.12.

v2: io_uring: add IORING_ENTER_NO_IOWAIT to not set in_iowait

This patchset adds an IORING_ENTER_NO_IOWAIT flag that can be set on enter. If set, then current->in_iowait is not set.

v1: io_uring: add option to not set in_iowait

This patchset adds an IORING_ENTER_NO_IOWAIT flag that can be set on enter. If set, then current->in_iowait is not set. By default this flag is not set, to maintain the existing behaviour, i.e. in_iowait is always set.

v1: implement asynchronous BLKDISCARD via io_uring

There is an interest in having asynchronous block operations like discard. The patch set implements that as io_uring commands, which is an io_uring request type allowing to implement custom file specific operations.

v1: io_uring/fdinfo: add timeout_list to fdinfo

io_uring fdinfo contains most of the runtime information, which is helpful for debugging io_uring applications.

v1: abstract napi tracking strategy

The current NAPI tracking strategy induces a non-negligible overhead.

v1: Coalesce provided buffer segments

When selecting provided buffers for a send/recv for bundles, there’s no reason why the number of buffers selected is the same as the mapped segments that will be passed to send/recv.

v1: Add support for incremental buffer consumption

The recommended way to use io_uring for networking workloads is to use ring provided buffers. The application sets up a ring (or several) for buffers, and puts buffers for receiving data into them.

v8: io_uring: releasing CPU resources when polling

This patch adds a new hybrid poll at the io_uring level. It also exposes a flag, “IORING_SETUP_HY_POLL”, to applications, aiming to provide an interface that lets users enable the new hybrid polling flexibly.

v1: io_uring/napi: check napi_enabled in io_napi_add() before proceeding

Doing so avoids the overhead of adding napi ids to all the rings that do not enable napi.

Rust For Linux

v1: rust: use the hidden variant of rust-project.json

Very soon after we requested it [1], rust-analyzer added support for .rust-project.json [2], i.e. the hidden variant of rust-project.json.

v2: Implement DWARF modversions

The main motivation remains modversions support for Rust, which is important for distributions like Android that are eager to ship Rust kernel modules.

v3: kbuild: rust: split up helpers.c

This patch splits up the rust helpers C file.

v1: rust: cpufreq: Add cppc_cpufreq driver implementation

v1: rust: enable bindgen’s --enable-function-attribute-detection flag

bindgen is able to detect certain function attributes and annotate functions correspondingly in its output for the Rust side, when the --enable-function-attribute-detection is passed.

v2: Rust KASAN Support

Right now, if we turn on KASAN, Rust code will cause violations because it’s not enabled properly.

v6: drm/panic: Add a QR code panic screen

This series adds a new panic screen, with the kmsg data embedded in a QR code.

BPF

v2: bpf-next: support bpf_fastcall patterns for calls to kfuncs

Allow bpf_fastcall rewrite for bpf_cast_to_kern_ctx() and bpf_rdonly_cast() in order to conjure selftests for this feature.

v4: bpf-next: Share user memory to BPF program through task storage map.

Some of BPF schedulers (sched_ext) need hints from user programs to do a better job. For example, a scheduler can handle a task in a

v15: Reduce overhead of LSMs with static calls

LSM hooks (callbacks) are currently invoked as indirect function calls. These callbacks are registered into a linked list at boot time as the order of the LSMs can be configured on the kernel command line with the “lsm=” command line parameter.

v2: bpf-next: __jited_x86 test tag to check x86 assembly after jit

Some of the logic in the BPF jits might be non-trivial. It might be useful to allow testing this logic by comparing generated native code with expected code template.

v3: uprobes: Improve scalability by reducing the contention on siglock

The profiling result of BPF selftest on ARM64 platform reveals the significant contention on the current->sighand->siglock is the scalability bottleneck.

v1: bpf-next: allow calling kfuncs in normal tracepoint programs

It is possible to call a kfunc within a raw tp_btf program but not possible within a normal tracepoint program.

v1: bpf: cg_skb add get classid helper

At the cg_skb hook point, the classid can be obtained for cgroup v1 or v2, allowing users to implement more functions such as ACLs.

v1: net: Don’t allow to attach xdp if bond slave device’s upper already has a program

XDP cannot be attached when an upper device already has a program. This restriction applies only to bond slave devices and should not accidentally affect devices like eth0 and vxlan0.

v1: arm64: insn: Simulate nop and push instruction for better uprobe performance

The root cause lies in the arch_probe_analyse_insn(), which excludes ‘nop’ and ‘stp’ from the emulatable instructions list.

v4: bpf-next: Support bpf_kptr_xchg into local kptr

This revision adds substantial changes to patch 2 to support structures with kptr as the only special btf type.

v19: net-next: Device Memory TCP

v3: uprobes: turn trace_uprobe’s nhit counter to be per-CPU one

The trace_uprobe->nhit counter is not incremented atomically, so its value is questionable when a uprobe is hit on multiple CPUs simultaneously.
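The per-CPU approach can be modeled in plain C as one counter slot per CPU that is summed on read. This is a userspace sketch with hypothetical names and a fixed, assumed CPU count; the kernel's per-CPU infrastructure works differently in detail.

```c
#include <stdint.h>

#define DEMO_NR_CPUS 4	/* hypothetical CPU count */

/* One hit counter per CPU instead of a single shared counter. */
struct demo_uprobe_stats {
	uint64_t nhit[DEMO_NR_CPUS];
};

/* Each CPU touches only its own slot, so concurrent hits do not race. */
static void demo_hit(struct demo_uprobe_stats *s, unsigned int cpu)
{
	s->nhit[cpu % DEMO_NR_CPUS]++;
}

/* Readers sum all slots to obtain the total number of hits. */
static uint64_t demo_nhit_total(const struct demo_uprobe_stats *s)
{
	uint64_t sum = 0;

	for (unsigned int cpu = 0; cpu < DEMO_NR_CPUS; cpu++)
		sum += s->nhit[cpu];
	return sum;
}
```

The trade-off is that reads become slightly more expensive, while the hot path avoids both lost updates and atomic operations on a shared cache line.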

v1: bpf-next: bpf: Add gen_epilogue and allow kfunc call in pro/epilogue

This set allows the subsystem to patch codes before BPF_EXIT.

v2: perf/bpf: Don’t call bpf_overflow_handler() for tracing events

The regressing commit is new in 6.10. It assumed that anytime event->prog is set bpf_overflow_handler() should be invoked to execute the attached bpf program.

v5: bpf-next: bpf: enable some functions in cgroup programs

Enable some BPF kfuncs and the helper bpf_current_task_under_cgroup() for program types BPF_CGROUP_*.

v4: bpf-next: bpf: enable generic kfuncs for BPF_CGROUP_* programs

These kfuncs are enabled even in BPF_PROG_TYPE_TRACING, so they should be safe also in BPF_CGROUP_* programs.

v3: uprobes: RCU-protected hot path optimizations

This patch set is heavily inspired by Peter Zijlstra’s uprobe optimization patches ([0]) and continues that work, albeit trying to keep complexity to the minimum and attempting to reuse existing primitives as much as possible.

v1: bpf-next: support nocsr patterns for calls to kfuncs

Mark bpf_cast_to_kern_ctx() and bpf_rdonly_cast() kfuncs as KF_NOCSR in order to conjure selftests for this feature.

Surrounding Technology News

Qemu

v2: bsd-user: Comprehensive RISCV Support

Key Changes Compared to Version 1:

v4: qemu: target/riscv: Add Zilsd and Zclsd extension support

This version no longer separates the implementation of Zilsd and Zclsd extensions.

v4: riscv support for control flow integrity extensions

v4 for riscv zicfilp and zicfiss extensions support in qemu.

v1: RISC-V: support CLIC v0.9 specification

This patch set gives an implementation of “RISC-V Core-Local Interrupt Controller(CLIC) Version 0.9-draft-20210217”.

Buildroot

package/gnu-efi: only supported on MMU-capable platforms

Since we’re anyway not interested in gnu-efi on noMMU platforms, let’s not even spend time on trying to fix this and make MMU support a requirement for gnu-efi.

v1: boot: optee-os: enable RISC-V (64-bit) architecture