泰晓科技 (TinyLab) -- Focused on Linux: trace things to their source, see the whole from the details!
Website: https://tinylab.org


RISC-V Linux Kernel and Related Technology News, Issue 98

Written by 呀呀呀 on 2024/07/01

Date: 20240630
Editor: 晓瑜
Repository: RISC-V Linux Kernel Technology Research Activity
Sponsor: PLCT Lab, ISCAS

Kernel News

RISC-V Architecture Support

v1: arch: riscv: thead: implement basic spi

Implements basic SPI support for the TH1520 SoC: creates a fixed clock and a simple spi0 node, updates the matching binding to include thead,th1520-spi as a compatible, and adds a spidev device to the devicetree that uses the spi0 node.

v4: Tracepoints and static branch in Rust

An important part of a production ready Linux kernel driver is tracepoints.

v1: riscv: signal: abstract header saving for setup_sigcontext

The function save_v_state() served two purposes.

v1: riscv: vector: treat VS_INITIAL as discard

The purpose of riscv_v_vstate_discard() is to invalidate v context at entries of syscalls.

v2: riscv: ftrace: atmoic patching and preempt improvements

This series makes atomic code patching possible in riscv ftrace.

v6: Add Svade and Svadu Extensions Support

Svade and Svadu extensions represent two schemes for managing the PTE A/D bit.

v7: riscv: mm: Add support for Svinval extension

The Svinval extension splits SFENCE.VMA instruction into finer-grained invalidation and ordering operations and is mandatory for RVA23S64 profile.

v2: riscv: add initial support for SpacemiT K1

SpacemiT K1 is currently an ideal chip for evaluating new extensions such as RISC-V Vector 1.0 and Zicond. This series adds initial support for it so that more people can participate in bringing its drivers to mainline.

v2: riscv: entry: always initialize regs->a0 to -ENOSYS

Without this change, when a tracer changes the syscall number to -1, the kernel fails to initialize a0 with -ENOSYS and consequently fails to return the error code of the failed syscall to userspace.
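
As a rough illustration of the failure mode, the following Python sketch models a syscall entry path. It is not the kernel's code: `handle_ecall`, the `regs` dictionary, the `tracer` callback and the `init_a0` switch are hypothetical names chosen for this example, and the real entry code handles more cases.

```python
ENOSYS = 38  # Linux errno value for "function not implemented"

def handle_ecall(regs, table, tracer=None, init_a0=True):
    """Toy model of a syscall entry: a0 holds arg0 on entry and the result on exit."""
    regs["orig_a0"] = regs["a0"]      # keep the first argument for the handler
    if init_a0:
        regs["a0"] = -ENOSYS          # default result if no handler runs
    nr = regs["a7"]                   # syscall number register
    if tracer is not None:
        nr = tracer(nr)               # a tracer may rewrite the number, e.g. to -1
    if 0 <= nr < len(table):
        regs["a0"] = table[nr](regs["orig_a0"])
    return regs["a0"]

table = [lambda arg: arg + 1]         # one hypothetical syscall
skip = lambda nr: -1                  # tracer that cancels the syscall

# With the default initialisation, a cancelled syscall reports -ENOSYS.
print(handle_ecall({"a0": 7, "a7": 0}, table, tracer=skip))
# Without it, the stale first argument leaks back as the "result".
print(handle_ecall({"a0": 7, "a7": 0}, table, tracer=skip, init_a0=False))
```

The point of the model is only that the default must be written before the tracer has a chance to skip the handler.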

v2: Zacas/Zabha support and qspinlocks

This implements [cmp]xchgXX() macros using Zacas and Zabha extensions and finally uses those newly introduced macros to add support for qspinlocks: note that this implementation of qspinlocks satisfies the forward progress guarantee.

v7: Centralize _GNU_SOURCE definition into lib.mk

This is condensed into a single commit to avoid redefinition warnings from partial merges.

v1: riscv: uaccess: optimizations

This series tries to optimize riscv uaccess.

v1: riscv: Randomize lower bits of stack address

Implement arch_align_stack() to randomize the lower bits of the stack address.
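
A minimal sketch of the idea, assuming a random offset of up to one page and 16-byte stack alignment. Both values and the name `align_stack` are illustrative assumptions, not details taken from the patch.

```python
import secrets

PAGE_SIZE = 4096     # assumed randomization range (one 4 KiB page)
STACK_ALIGN = 16     # assumed required stack alignment in bytes

def align_stack(sp: int, randomize: bool = True) -> int:
    """Lower the stack pointer by a random sub-page offset, then re-align it."""
    if randomize:
        sp -= secrets.randbelow(PAGE_SIZE)   # random value in [0, PAGE_SIZE)
    return sp & ~(STACK_ALIGN - 1)           # round down to the alignment

top = 0x7FFFFFF000
print(hex(align_stack(top)))
```

Rounding down after subtracting keeps the result aligned while still varying the low bits above the alignment boundary.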

v3: RISC-V: Detect and report speed of unaligned vector accesses

The vec_misaligned_speed key keeps the same format as the scalar unaligned access speed key.

v2: RISC-V: cmdline: Add support for ‘memmap’ parameter

Add parsing of ‘memmap’ to use or reserve a specific region of memory.

v2: irqchip/sifive-plic: ensure interrupt is enable before EOI

The PLIC signals it has completed executing an interrupt handler by writing the interrupt ID it received from the claim to the claim/complete register.

v1: riscv: Extend sv39 linear mapping max size to 128G

This harmonizes all virtual addressing modes which can now all map (PGDIR_SIZE * PTRS_PER_PGD) / 4 of physical memory.
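
The formula can be checked with a short calculation. The sketch below assumes the standard RISC-V paging layouts (4 KiB pages, 9 index bits per level, so 512 top-level entries); the helper name `linear_map_max` is made up for this example.

```python
PAGE_SHIFT = 12                     # 4 KiB base pages
BITS_PER_LEVEL = 9                  # 512 entries per page-table level
PTRS_PER_PGD = 1 << BITS_PER_LEVEL

def linear_map_max(levels: int) -> int:
    """Return (PGDIR_SIZE * PTRS_PER_PGD) / 4 in bytes for a given depth."""
    pgdir_shift = PAGE_SHIFT + BITS_PER_LEVEL * (levels - 1)
    pgdir_size = 1 << pgdir_shift   # memory covered by one top-level entry
    return pgdir_size * PTRS_PER_PGD // 4

GIB = 1 << 30
for name, levels in (("sv39", 3), ("sv48", 4), ("sv57", 5)):
    print(name, linear_map_max(levels) // GIB, "GiB")
```

For sv39 a top-level entry covers 1 GiB, so a quarter of the 512 GiB address space is the 128 GiB named in the patch title.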

v2: clk: thead: Add support for TH1520 AP_SUBSYS clock controller

This series adds support for the AP sub-system clock controller in the T-Head TH1520.

v2: riscv: enable HAVE_ARCH_STACKLEAK

Add support for the stackleak feature. Whenever the kernel returns to user space the kernel stack is filled with a poison value.

v1: riscv: allwinner: ClockworkPi and DevTerm devicetrees

Here are a couple of patches that were originally sent by Samuel but later dropped because the system LDO regulator bindings had not been merged. The regulator bindings were recently resent and landed, so now is the time to get the rest of the stragglers in.

LoongArch Architecture Support

v1: LoongArch: uprobes: make UPROBE_SWBP_INSN/UPROBE_XOLBP_INSN constant

LoongArch defines UPROBE_SWBP_INSN as a function call and this breaks arch_uprobe_trampoline() which uses it to initialize a static variable.

v4: LoongArch: KVM: Add Binary Translation extension support

Loongson Binary Translation (LBT) is used to accelerate binary translation, which contains 4 scratch registers (scr0 to scr3), x86/ARM eflags (eflags) and x87 fpu stack pointer (ftop).

v1: LoongArch: Automatically disable KASLR for hibernation

Hibernation assumes that the memory layout after resume is the same as before sleep, so it expects the kernel to be loaded at the same position.

Process Scheduling

[PATCH-RT sched v2 0/2] Optimize the RT group scheduling

The first patch optimizes the enqueue and dequeue of rt_se using a bottom-up removal approach. The second patch provides validation for the efficiency improvements made by patch 1.

v1: ARM, sched/topology: Check return value of kcalloc()

Check the return value of kcalloc() and return early if memory allocation fails.

[PATCH-RT sched v1 0/2] Optimize the RT group scheduling

The first patch optimizes the enqueue and dequeue of rt_se using a bottom-up removal approach. The second patch provides validation for the efficiency improvements made by patch 1. The test case counts the number of infinite loop executions for all threads.

v2: sched: Initialize the vruntime of a new task when it is first enqueued

When creating a new task, we initialize the vruntime of the new task in sched_cgroup_fork().

v1: sched/core: defer printk() while rq lock is held

syzbot reports a circular locking dependency inside __bpf_prog_run() when the trace_sched_switch() hook is called from __schedule(), because fault injection calls printk() even though the rq lock is already held.

v5: Introduce --task-name and --fuzzy-name options in perf sched map

This patchset aims to reduce the amount of output printed on the terminal when using perf sched map, allowing users to focus only on the tasks of interest.

v2: sched/fair: Make SCHED_IDLE entity be preempted in strict hierarchy

According to the cgroup hierarchy, A should preempt B. But current check_preempt_wakeup_fair() treats cgroup se and task separately, so B will preempt A unexpectedly.

v1: sched/urgent: sched/fair: set_load_weight() must also call reweight_task() for SCHED_IDLE tasks

set_load_weight() is called with @update_load set.

v1: sched/psi: Optimise psi_group_change a bit

The current code loops over the psi_states only to call a helper which then resolves back to the action needed for each state using a switch statement.

v1: sched/eevdf: Augment comments to account for reality

References to “CFS” are a bit misleading these days since the scheduling principle is EEVDF.

v1: sched/fair: Make SCHED_IDLE se be preempted in strict hierarchy

According to the cgroup hierarchy, A should preempt B.

Memory Management

v4: mm: support mTHP swap-in for zRAM-like swapfile

In an embedded system like Android, more than half of anonymous memory is actually stored in swap devices such as zRAM.

[v3 linus-tree PATCH] mm: gup: stop abusing try_grab_folio

A kernel warning was reported when pinning a folio in CMA memory while launching an SEV virtual machine.

v2: Make core VMA operations internal and testable

There are a number of “core” VMA manipulation functions implemented in mm/mmap.c, notably those concerning VMA merging, splitting, modifying, expanding and shrinking, which logically don’t belong there.

v2: mm: introduce per-order mTHP split counters

Currently, the split counters in THP statistics no longer include PTE-mapped mTHP.

v1: support “THPeligible” semantics for mTHP with anonymous shmem

After the commit 7fb1b252afb5 (“mm: shmem: add mTHP support for anonymous shmem”), we can configure different policies through the multi-size THP sysfs interface for anonymous shmem.

v4: Improve the copy of task comm

Using {memcpy,strncpy,strcpy,kstrdup} to copy the task comm relies on the length of the task comm. Changes to the task comm could cause the destination string to overflow.

v1: mm/zsmalloc: add zpdesc memory descriptor for zswap.zpool

According to Matthew’s plan, the page descriptor is ultimately to be replaced by an 8-byte mem_desc.

v1: New uid & gid mount option parsing helpers

Multiple filesystems take uid and gid as options, and the code to create the ID from an integer and validate it is standard boilerplate that can be moved into common helper functions, so do that for consistency and less cut&paste.

[v2 linus-tree PATCH] mm: gup: do not call try_grab_folio() in slow path

try_grab_folio() is supposed to be used in the fast path, and it elevates the folio refcount using add-ref-unless-zero.

[v2 PATCH] mm: gup: do not call try_grab_folio() in slow path

try_grab_folio() is supposed to be used in the fast path, and it elevates the folio refcount using add-ref-unless-zero.

v1: cachestat: do not flush stats in recency check

This is done in the workingset_test_recent() step (which checks if the folio’s eviction is recent).

**[v6: ioctl()-based API to query VMAs from /proc/<pid>/maps](http://lore.kernel.org/linux-mm/20240627170900.1672542-1-andrii@kernel.org/)**

Implement a binary ioctl()-based interface to the /proc/<pid>/maps file to allow applications to query VMA information more efficiently than reading *all* VMAs nonselectively through the text-based interface of the /proc/<pid>/maps file.

v1: mm-unstable: mm/damon/core: ensure max threshold attempt for max_nr_regions violation

Fix this by having the loop's stop condition compare the last-used threshold rather than the to-be-used threshold, stopping once the last-used threshold is equal to or higher than the maximum possible threshold.

v1: DRM resource management cgroup, try 2.

This series allows setting limits on VRAM similar to system memory, with min/low/max limits. This allows various cgroups to have their own limits for usage.

v7: mm: store zero pages to be swapped out in a bitmap

As shown in the patch series that introduced the zswap same-filled optimization, 10-20% of the pages stored in zswap are same-filled.
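
The approach named in the title can be sketched as follows. This is a toy model under stated assumptions: the `SwapDevice` class, its `zeromap` bitmap and the in-memory `backing` dictionary are hypothetical stand-ins, not the kernel's data structures.

```python
PAGE_SIZE = 4096

class SwapDevice:
    """Toy swap device that records all-zero pages in a bitmap instead of storing them."""

    def __init__(self, nr_slots: int):
        self.zeromap = bytearray((nr_slots + 7) // 8)  # one bit per swap slot
        self.backing = {}                              # slot -> page contents
        self.writes = 0                                # writes that reached the device

    def _set_zero(self, slot: int, is_zero: bool) -> None:
        byte, bit = divmod(slot, 8)
        if is_zero:
            self.zeromap[byte] |= 1 << bit
        else:
            self.zeromap[byte] &= ~(1 << bit) & 0xFF

    def _is_zero(self, slot: int) -> bool:
        byte, bit = divmod(slot, 8)
        return bool(self.zeromap[byte] & (1 << bit))

    def swap_out(self, slot: int, page: bytes) -> None:
        if not any(page):                 # every byte is zero
            self._set_zero(slot, True)
            self.backing.pop(slot, None)  # drop any stale copy
            return                        # no device write needed
        self._set_zero(slot, False)
        self.backing[slot] = bytes(page)
        self.writes += 1

    def swap_in(self, slot: int) -> bytes:
        if self._is_zero(slot):
            return bytes(PAGE_SIZE)       # rebuild the zero page without I/O
        return self.backing[slot]

dev = SwapDevice(nr_slots=16)
dev.swap_out(0, bytes(PAGE_SIZE))             # zero page: bitmap only
dev.swap_out(1, b"\x01" + bytes(PAGE_SIZE - 1))
print(dev.writes)
```

Only the non-zero page costs a write here; the zero page is represented by a single bit.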

v1: fs: multigrain timestamp redux

v1: swapfile: disable swapon for bs > ps devices

Devices that require a block size larger than the page size (bs > ps) cannot be supported for swap, as swap still needs work.

v1: mm-unstable: mm/damon/core: increase regions merge aggressiveness while respecting min_nr_regions

The access frequency threshold avoids merging two adjacent regions that have quite different access frequencies.

v2: mm: vmalloc: Check if a hash-index is in cpu_possible_mask

The problem is that there are systems where cpu_possible_mask has gaps between set CPUs, for example SPARC.

v5: mm: migrate: support poison recover from migrate folio

Folio migration is widely used in the kernel (memory compaction, memory hotplug, soft page offlining, NUMA balancing, memory demotion/promotion, etc.), but if a poisoned source folio is accessed during migration, the kernel will panic.

v1: mm: introduce gen information in /proc/xxx/smaps

This commit introduces per-VMA generation (gen) information for folios, through which userspace can query the activity of a virtual address range before calling madvise.

v1: mm: Prevent derefencing NULL ptr in pfn_section_valid()

Commit 5ec8e8ea8b77 (“mm/sparsemem: fix race in accessing memory_section->usage”) changed pfn_section_valid() to add a READ_ONCE() call around “ms->usage” to fix a race with section_deactivate() where ms->usage can be cleared.

v1: hugetlbfs: add MTE support

MTE can be supported on RAM-based filesystems and is already supported on tmpfs. There are use cases for MTE on hugetlbfs as well, so this adds MTE support for it.

File Systems

v1: vfs: rename parent_ino to d_parent_ino and make it use RCU

The routine is used by procfs through dir_emit_dots.

v1: pidfs: allow retrieval of namespace descriptors

This adds support for deriving a namespace file descriptor from a pidfd for all namespace types.

v3: fs/namespace: defer RCU sync for MNT_DETACH umount

Attached is v3 of the umount optimization. Please take a look at v1 for the original introduction to the problem.

v2: Rosebush, a new hash table

Rosebush is a resizing, scalable, cache-aware, RCU optimised hash table.

v2: fat: add support for directories without . and .. entries

Some FAT filesystems do not have . and .. entries in some directories.

v3: vfs: support statx(…, NULL, AT_EMPTY_PATH, …)

The newly used helper also checks for empty (“”) paths.

v9: arm64/gcs: Provide support for GCS in userspace

The arm64 Guarded Control Stack (GCS) feature provides support for hardware protected stacks of return addresses, intended to provide hardening against return oriented programming (ROP) attacks and to make it easier to gather call stacks for applications such as profiling.

Network Devices

v3: Add AP6275P wireless support

These patches add AP6275P wireless support on Khadas Edge2. They enable the 32k clock for the Wi-Fi module and extend the hardware IDs table in the brcmfmac driver so that it can attach.

v6: bpf-next: netfilter: Add the capability to offload flowtable in XDP layer

Introduce bpf_xdp_flow_lookup kfunc in order to perform the lookup of a given flowtable entry based on the fib tuple of incoming traffic.

v4: Introduce EN7581 ethernet support

Add airoha_eth driver in order to introduce ethernet support for Airoha EN7581 SoC available on EN7581 development board.

v1: net-next: pull-request: can-next 2024-06-29

This is a pull request of 14 patches for net-next/master.

v1: net-next: gve: Add retry logic for recoverable adminq errors

This method keeps track of return codes for each queue and retries the commands for the queues that failed with ETIME.
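
The per-queue retry described above might look roughly like this. It is a sketch, not the driver's code: `run_with_retry`, the `issue_cmd` callback and the `max_retries` limit are hypothetical, and only the "retry the queues that returned ETIME" rule comes from the summary.

```python
import errno

ETIME = getattr(errno, "ETIME", 62)   # 62 is the Linux value, used as a fallback

def run_with_retry(queue_ids, issue_cmd, max_retries=3):
    """Issue a command per queue, then retry only the queues that timed out."""
    results = {q: issue_cmd(q) for q in queue_ids}   # remember each return code
    for _ in range(max_retries):
        pending = [q for q, rc in results.items() if rc == -ETIME]
        if not pending:
            break                                    # nothing recoverable left
        for q in pending:
            results[q] = issue_cmd(q)
    return results

calls = {}

def fake_cmd(q):
    """Hypothetical command: queue 1 times out twice, queue 2 fails permanently."""
    calls[q] = calls.get(q, 0) + 1
    if q == 1 and calls[q] < 3:
        return -ETIME
    if q == 2:
        return -errno.EIO
    return 0

results = run_with_retry([0, 1, 2], fake_cmd)
print(results)
```

Errors other than ETIME are left alone, so a permanently failing queue is not retried.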

v3: net: sunrpc: Remap EPERM in case of connection failure in xs_tcp_setup_socket

When using a BPF program on kernel_connect(), the call can return -EPERM. This causes xs_tcp_setup_socket() to loop forever, filling up the syslog and causing the kernel to potentially freeze up.

v1: gve: Add retry logic for recoverable adminq errors

An adminq command is retried if it fails with an ETIME error code which translates to the deadline exceeded error for the device.

v1: net/socket: clamp negative backlog value to 0 in listen()

If listen() is called with a backlog argument value that is less than 0, the function behaves as if it had been called with a backlog argument value of 0.

[net-next PATCH] octeontx2-af: Sync NIX and NPA contexts from NDC to LLC/DRAM

Octeontx2 hardware uses Near Data Cache(NDC) block to cache contexts in it so that access to LLC/DRAM can be avoided.

v1: net-next: net: tn40xx: add initial ethtool_ops support

Call phylink_ethtool_ksettings_get() for get_link_ksettings method and ethtool_op_get_link() for get_link method.

v2: net-next: net: ethernet: ti: am65-cpsw: Add multi queue RX support

This series adds multi-queue support. The driver starts with 1 RX queue by default. Users can increase the number of RX queues via ethtool.

v5: net-next: net: pse-pd: Add new PSE c33 features

This patch series adds new c33 features to the PSE API.

v2: net: phy: aquantia: add missing include guards

The header is missing the include guards so add them.

v15: net-next: Device Memory TCP

v6: landlock: Add abstract unix socket connect restriction

Abstract unix sockets are used for local inter-process communications without a filesystem.

v2: net-next: tcp_metrics: add netlink protocol spec in YAML

Add a netlink protocol spec for the tcp_metrics generic netlink family. First patch adjusts the uAPI header guards to make it easier to build tools/ with non-system headers.

v2: net: tcp_metrics: validate source addr length

v1: net-next: net: introduce TX shaping H/W offload API

This series introduces new device APIs to configure in a flexible way TX shaping H/W offload.

v4: net-next: enic: add ethtool get_channel support

Add .get_channel to enic_ethtool_ops to enable basic ethtool -l support to get the current channel configuration.

v6: net/mlx5: Reclaim max 50K pages at once

This needs a huge number of cmd mailboxes, which are released once the pages are reclaimed. Releasing that many cmd mailboxes consumes CPU time running into many seconds.

Security Hardening

v2: mfd: omap-usb-tll: use struct_size to allocate tll

In particular, the allocation for the array of pointers was converted into a single-pointer allocation.
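
To make the size difference concrete, here is a small model of what a struct_size()-style calculation yields for a header followed by a flexible array, saturating at SIZE_MAX on overflow. The header size, pointer size and channel count below are made-up values for illustration, not figures from the driver.

```python
SIZE_MAX = (1 << 64) - 1   # assuming a 64-bit size_t

def struct_size(header_size: int, elem_size: int, count: int) -> int:
    """Header plus count trailing elements, saturating instead of wrapping."""
    total = header_size + elem_size * count
    return total if total <= SIZE_MAX else SIZE_MAX

HEADER = 24        # hypothetical size of the fixed part of the struct
POINTER = 8        # pointer size on a 64-bit system
nch = 3            # hypothetical number of channels

single = HEADER + POINTER                   # room for one pointer only
full = struct_size(HEADER, POINTER, nch)    # room for all nch pointers
print(single, full)
```

With these numbers the single-pointer allocation is 16 bytes short, which is the kind of under-allocation the quoted sentence refers to.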

v1: printk: Add a short description string to kmsg_dump()

This patch adds a new parameter “const char *desc” to the kmsg_dumper dump() callback, and updates all drivers that use it.

v5: Add sy7802 flash led driver

This series introduces a driver for the Silergy SY7802 charge pump used in the BQ Aquaris M5 and X5 smartphones.

v3: Add per-core RAPL energy counter support for AMD CPUs

This patchset adds a new “power_per_core” PMU alongside the existing “power” PMU, which will be responsible for collecting the new “energy-per-core” event.

Asynchronous I/O

v5: io_uring/rsrc: coalescing multi-hugepage registered buffers

This patch series enables coalescing registered buffers that span more than one hugepage. It reduces the DMA-mapping time and saves memory for this kind of buffer.

v2: Read/Write with meta/integrity

This adds a new io_uring interface to exchange meta along with read/write.

v1: statx NULL path support

Generated against vfs/vfs.empty.path, uses the new vfs_empty_path helper.

v1: io_uring: signal SQPOLL task_work with TWA_SIGNAL_NO_IPI

Before SQPOLL was transitioned to managing its own task_work, the core used TWA_SIGNAL_NO_IPI to ensure that task_work was processed.

Rust For Linux

BPF

v1: bpf-next: no_caller_saved_registers attribute for helper calls

This RFC seeks to allow using no_caller_saved_registers gcc/clang attribute with some BPF helper functions.

v13: Reduce overhead of LSMs with static calls

With this patch set, some syscalls with many LSM hooks in their path improved by an average of 3%, with I/O and pipe-based system calls benefiting the most.

v6: bpf-next: use network helpers, part 8

v5: Faultable Tracepoints

Wire up the system call tracepoints with Tasks Trace RCU to allow the ftrace, perf, and eBPF tracers to handle page faults.

v1: bpf-next: libbpf: Make btf_name_info.needs_size unsigned

Resolve the issue by making needs_size unsigned.

v1: bpf-next: s390/bpf: Implement arena

This series adds arena support to the s390x JIT.

v1: sched_ext/for-6.11: sched_ext: Disallow loading BPF scheduler if isolcpus= domain isolation is in effect

sched_domains regulate the load balancing for sched_classes.

v2: HID: HID: bpf_struct_ops, part 2

This series is a followup of the struct_ops conversion.

v1: sched_ext/for-6.11: sched_ext: Account for idle policy when setting p->scx.weight in scx_ops_enable_task()

Update it to use WEIGHT_IDLEPRIO as the source weight for SCHED_IDLE tasks.

v14: net-next: Device Memory TCP

v1: bpf: defer printk() inside __bpf_prog_run()

syzbot reports a circular locking dependency inside __bpf_prog_run(), because fault injection calls printk() even though the rq lock is already held.

Related Technology News

Qemu

v2: target/riscv: Support zimop/zcmop/zama16b/zabha

We sent their implementations separately and received few objections apart from some comments on the ordering of ISA extensions. So I have put them together as one patch set to make merging easier.

v1: util: Add cpuinfo support for riscv

Do cpu feature detection in util, like other hosts. Support the OpenBSD ucontext_t. Support the Linux __riscv_hwprobe syscall.

v1: riscv-to-apply queue

The following changes since commit 3f044554b94fc0756d5b3cdbf84501e0eea0e629:

v6: RISC-V: Modularize common match conditions for trigger

This series modularizes the code for checking the privilege levels of type 2/3/6 triggers by implementing the functions trigger_common_match() and trigger_priv_match().

v7: Add RISC-V ISA extension smcntrpmf support

This patch series adds support for the RISC-V ISA extension smcntrpmf (cycle and privilege mode filtering).

v1: disas/riscv: Add decode for Zawrs extension

Add disassembly support for the instructions from Zawrs.

v3: Support RISC-V CSR read/write in Qtest environment

These patches add functionality for unit testing RISC-V-specific registers.

v8: target/riscv/kvm/kvm-cpu.c: kvm_riscv_handle_sbi() fail with vendor-specific SBI

Add a new error path to report a proper error in case qemu_chr_fe_read_all() does not return sizeof(ch).

v2: target/riscv: Add support for machine specific pmu’s events

Adds callbacks for machine-specific PMU events.

v4: riscv: QEMU RISC-V IOMMU Support

This new version contains changes based on suggestions made during the v3 review.

Buildroot

package/xz: explicitly specify all autoconf options

Explicitly specify all autoconf options with their default values, apart from a few special cases.


