TinyLab (泰晓科技) -- Focusing on Linux: trace things to their source, see the whole from the small details!
Website: https://tinylab.org


RISC-V Linux Kernel and Related Technology News, Issue 76

Written by 呀呀呀 on 2024/02/01

Date: 20240201
Editor: 晓怡
Repository: RISC-V Linux Kernel Technology Research Activity
Sponsor: PLCT Lab, ISCAS

Kernel News

RISC-V Architecture Support

v3: riscv: dts: starfive: add Milkv Mars board device tree

The series is organized as follows:

  • patch1 adds the ‘cpus’ label
  • patch2 adds the “milkv,mars” board dt-binding
  • patch3 to patch4 adopt Krzysztof’s suggestions for DT node names
  • patch5 introduces a board common dtsi for visionfive2 and mars
  • patch6 adds the mars board dts file describing the currently supported features, namely PMIC, UART, I2C, GPIO, SD card, QSPI Flash, eMMC and Ethernet

v2: irqchip/sifive-plic: enable interrupt if needed before EOI

RISC-V PLIC cannot “end-of-interrupt” (EOI) disabled interrupts, as explained in the description of Interrupt Completion in the PLIC spec:

“The PLIC signals it has completed executing an interrupt handler by writing the interrupt ID it received from the claim to the claim/complete register. The PLIC does not check whether the completion ID is the same as the last claim ID for that target. If the completion ID does not match an interrupt source that is currently enabled for the target, the completion is silently ignored.”
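The rule quoted above can be pictured with a toy software model. This is only a sketch: the struct, its fields and the helper names are hypothetical and do not reflect the irqchip driver code or the real PLIC register layout. It merely shows why completing a disabled source has no effect, and how briefly enabling the source around the completion write (the idea in the patch title) avoids that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of one PLIC target; the field names are hypothetical. */
struct plic_model {
	uint32_t enabled;	/* bit n set: source n is enabled for this target */
	uint32_t claimed;	/* bit n set: source n was claimed, not yet completed */
};

/* Model of a write to the claim/complete register. */
static void plic_complete(struct plic_model *p, unsigned int id)
{
	assert(id < 32);
	/* Per the quoted text: completing a non-enabled source is silently ignored. */
	if (p->enabled & (UINT32_C(1) << id))
		p->claimed &= ~(UINT32_C(1) << id);
}

/* Sketch of "enable interrupt if needed before EOI". */
static void plic_eoi(struct plic_model *p, unsigned int id)
{
	assert(id < 32);

	uint32_t bit = UINT32_C(1) << id;
	bool was_enabled = (p->enabled & bit) != 0;

	if (!was_enabled)
		p->enabled |= bit;	/* temporarily enable so the completion counts */
	plic_complete(p, id);
	if (!was_enabled)
		p->enabled &= ~bit;	/* restore the disabled state */
}
```

In this model, calling `plic_complete()` on a disabled source leaves its `claimed` bit set, while `plic_eoi()` clears it and leaves the source disabled afterwards.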

v3: riscv: mm: Extend mappable memory up to hint address

On riscv, mmap currently returns an address from the largest address space that can fit entirely inside of the hint address. This makes it such that the hint address is almost never returned. This patch raises the mappable area up to and including the hint address. This allows mmap to often return the hint address, which allows a performance improvement over searching for a valid address as well as making the behavior more similar to other architectures.

v2: bpf-next: Mixing bpf2bpf and tailcalls for RV64

In the current RV64 JIT, if we just don’t initialize the TCC in subprog, the TCC can be propagated from the parent process to the subprocess, but the TCC of the parent process cannot be restored when the subprocess exits. Since the RV64 TCC is initialized before saving the callee saved registers into the stack, we cannot use the callee saved register to pass the TCC, otherwise the original value of the callee saved register will be destroyed.

v3: riscv: sophgo: add reset support for SG2042

This series adds reset controller support for Sophgo SG2042 using reset-simple driver.

v1: Add IAX45 support for RZ/Five SoC

The IAX45 block on RZ/Five SoC is almost identical to the IRQC block found on the RZ/G2L family of SoCs.

IAX45 performs various interrupt controls, including synchronization, for the external interrupts (NMI, IRQ, and GPIOINT) and for the built-in peripheral interrupts output by each module, and it forwards the interrupts to the PLIC.

v3: mm/memory: optimize fork() with PTE-mapped THP

Now that the rmap overhaul[1] is upstream that provides a clean interface for rmap batching, let’s implement PTE batching during fork when processing PTE-mapped THPs.

This series is partially based on Ryan’s previous work[2] to implement cont-pte support on arm64, but it’s a complete rewrite based on [1] to optimize all architectures independent of any such PTE bits, and to use the new rmap batching functions that simplify the code and prepare for further rmap accounting changes.

v1: CAST Controller Area Network driver support

This patchset adds initial, rudimentary support for the CAST Controller Area Network driver, and registers cast in the kernel as well. The driver will first be used on the JH7110 SoC, so the relevant compatibility support is added.

v1: riscv: Implement pte_accessible()

Like other architectures, a pte is accessible if it is present, or if there is a pending TLB flush and the pte is protnone (which could be the case when a pte is downgraded to protnone before a TLB flush is executed).
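The condition described above can be written down as a small userspace model. This is a sketch under stated assumptions: the bit positions, `struct mm_model` and the function name are made up for illustration and are not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit positions, chosen only for this sketch. */
#define PTE_PRESENT	(UINT64_C(1) << 0)
#define PTE_PROT_NONE	(UINT64_C(1) << 1)

/* Stand-in for the per-mm state that records a pending TLB flush. */
struct mm_model {
	int tlb_flush_pending;
};

static bool pte_accessible_model(const struct mm_model *mm, uint64_t pte)
{
	assert(mm != NULL);

	if (pte & PTE_PRESENT)
		return true;
	/* A protnone pte may still be cached in the TLB until the flush runs. */
	if ((pte & PTE_PROT_NONE) && mm->tlb_flush_pending)
		return true;
	return false;
}
```

With no flush pending, a protnone entry is reported as not accessible; once a flush is pending, the same entry is treated as accessible because a stale TLB entry may still allow access.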

v1: riscv: optimize memcpy/memmove/memset

Compared with Matteo’s original series, Jisheng made the following changes:

  1. adopt Emil’s change to fix a boot failure when building with clang
  2. add corresponding changes to purgatory
  3. always build the optimized string.c rather than only building it when optimizing for performance
  4. implement unroll support when src and dst are both aligned, to keep the same performance as the assembly version.

v12: Linux RISC-V AIA Support

The RISC-V AIA specification has been ratified as per the RISC-V International process. The latest ratified AIA specification can be found at: https://github.com/riscv/riscv-aia/releases/download/1.0/riscv-interrupts-1.0.pdf

v8: Change PWM-controlled LED pin active mode and algorithm

According to the circuit diagram of User LEDs - RGB described in the manual hifive-unleashed-a00.pdf[0] and hifive-unmatched-schematics-v3.pdf[1].

v5: pwm: Improve lifetime tracking for pwm_chips

This is v5 of this series. The relevant changes since v4 (https://lore.kernel.org/linux-pwm/cover.1701860672.git.u.kleine-koenig@pengutronix.de):

  • New first patch to reshuffle functions in core.c. This is a preparation for the later changes which brings functions in a better order to not need declarations.
  • Fix kernel docs in several drivers
  • Added a few ack and review tags received for v4
  • non-trivially rebased to current pwm/for-next

v2: riscv: Optimize crc32 with Zbc extension

As suggested by the B-ext spec, the Zbc (carry-less multiplication) instructions can be used to accelerate CRC calculations. Currently, crc32 is the most widely used CRC function inside the kernel, so this patch focuses on optimizing just the crc32 APIs.
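For readers unfamiliar with the two building blocks named above, here is a minimal portable sketch. It does not reproduce the patch's algorithm: `clmul32()` is a software stand-in for what a carry-less multiply instruction computes, and `crc32_bitwise()` is a plain bit-at-a-time reference using the common CRC-32 convention (reflected polynomial 0xEDB88320, initial value and final XOR of 0xFFFFFFFF), which is an assumption of this example and not necessarily how the kernel API seeds its value. Both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Carry-less multiply: XOR shifted copies of a, one per set bit of b. */
static uint64_t clmul32(uint32_t a, uint32_t b)
{
	uint64_t result = 0;

	for (int i = 0; i < 32; i++)
		if (b & (UINT32_C(1) << i))
			result ^= (uint64_t)a << i;
	return result;
}

/* Bit-at-a-time reference CRC-32, i.e. polynomial division over GF(2). */
static uint32_t crc32_bitwise(const void *data, size_t len)
{
	const unsigned char *p = data;
	uint32_t crc = 0xFFFFFFFFu;

	assert(p != NULL || len == 0);
	while (len--) {
		crc ^= *p++;
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0xEDB88320u & -(crc & 1u));
	}
	return crc ^ 0xFFFFFFFFu;
}
```

Carry-less multiplication is polynomial multiplication over GF(2), for example (x + 1)(x + 1) = x^2 + 1, so `clmul32(3, 3)` yields 5. Hardware support for this operation is what lets an optimized CRC process many bits per step instead of one bit per loop iteration as in the reference routine.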

v1: Bump the minimum supported version of LLVM to 13.0.1

This series bumps the minimum supported version of LLVM for building the kernel to 13.0.1. The first patch does the bump and all subsequent patches clean up all the various workarounds and checks for earlier versions.

Process Scheduling

v1: sched/Documentation: Add RT_RUNTIME_SHARE documentation

RT_RUNTIME_SHARE is an important strategy for rt bandwidth, and we should document this sched feature.

v2: sched/fair: Sanity check ‘best’ in pick_eevdf()

Before commit 2227a957e1d5 (“sched/eevdf: Sort the rbtree by virtual deadline”), there was a sanity check to catch unexpected failures in the EEVDF scheduling, which was helpful in identifying problems. It would be better to restore its previous capability.

v1: sched/eevdf: Prevent vlag from exceeding the limit value

There are some scenarios here that will cause vlag to exceed eevdf’s limit.

v1: resend: sched: Can we rename ‘core scheduling’ to ‘smt scheduling’?

On Tue, Dec 19, 2023 at 03:07:43PM +0800, Wang Jinchao wrote:

The term ‘core’ in ‘kernel/sched/’ implies a relation to the kernel of sched, and at the same time, ‘core’ is used in ‘core scheduling’ to represent a CPU core. Both meanings coexist in the ‘core.c’ file and appear numerous times.

v1: Revert “nl80211/cfg80211: Specify band specific min RSSI thresholds with sched scan”

During the review of a new patch [1] it was observed that the functionality being modified was not actually being used by any in-tree driver. Further research determined that the functionality was originally introduced to support a new Android interface, but that interface was subsequently abandoned. Since the functionality has apparently never been used, remove it. However, to maintain the sanctity of the UABI, keep the nl80211.h assignments, but clearly mark them as obsolete.

Memory Management

v1: mm/cma: Don’t treat bad input arguments for cma_alloc() as its failure

Invalid cma_alloc() input scenarios - including excess allocation request should neither be counted as CMA_ALLOC_FAIL nor ‘cma->nr_pages_failed’ be updated when applicable with CONFIG_CMA_SYSFS. This also drops ‘out’ jump label which has become redundant.

v2: mm: swap: async free swap slot cache entries

We discovered that 1% of swap page faults take 100us or more, while 50% of swap faults take under 20us.

Further investigation shows that, in the long tail case, a large portion of the time is spent in the free_swap_slots() function.

v2: kasan: add atomic tests

I’m still unclear on which kinds of atomic accesses we should be testing though. The patch below only covers a subset, and I don’t know if it would be feasible to just manually add all atomics of interest. Which ones would those be exactly? As Andrey pointed out on Bugzilla, if we were to include all of the atomic64_* ones, that would make a lot of function calls.

v1: mm/vmscan: Change the type of file from int to bool

Change the type of file from int to bool, because is_file_lru() returns bool.

v1: mm/slab: Add slabreclaim flag to slabinfo

To enhance slab debugging, this patch adds a slabreclaim flag to slabinfo. The slab type is an important per-slab data point in slabinfo when analyzing problems such as memory leaks or inaccurate memory statistics.

v2: mm/mmap: use SZ_{8K, 128K} helper macro

Use SZ_{8K, 128K} helper macro instead of the number in init_user_reserve and reserve_mem_notifier. This is more readable.

v4: mm/mempolicy: weighted interleave mempolicy and sysfs extension

This is hopefully the final major update to this line. Full version notes are at the end of the initial cover letter chunk. (v4: style, task->il_weight, uninitialized values, docs)

v1: mempolicy: add home_node info to mpol_to_str()

There is currently no userspace interface for obtaining home_node, so home_node has been added to mpol_to_str(). This allows the home_node to be obtained from /proc/pid/numa_maps.

v1: mm/vmscan: Change the calculation of the number of can reclaim anon pages in zone_reclaimable_pages

The spaces of swap devices that can be set by the user are unpredictable values, so we take the minimum value between the anonymous page in the specified zone and the spaces of swap devices.

v1: mm/damon: make DAMON debugfs interface deprecation unignorable

The DAMON debugfs interface was deprecated in February 2023, by the commit that added the interface deprecation notice. Make that fact hard to ignore by removing an example usage from the document (patch 1), renaming the config (patch 2), adding a deprecation notice file to the debugfs directory (patches 3-5), and renaming the debugfs file that is essential for real use of DAMON (patches 6-9).

v2: per-vma locks in userfaultfd

Performing userfaultfd operations (like copy/move etc.) in critical section of mmap_lock (read-mode) causes significant contention on the lock when operations requiring the lock in write-mode are taking place concurrently. We can use per-vma locks instead to significantly reduce the contention issue.

v1: shmem: Properly report quota mount options

Report quota options among the set of mount options. This allows proper user visibility into whether quotas are enabled or not.

v1: selftests/mm: run_vmtests.sh: add hugetlb test category

The usage of run_vmtests.sh does not include hugetlb, which is a valid test category.

v1: rfc: mm: migrate: support poison recover from migrate folio

Folio migration is widely used in the kernel (memory compaction, memory hotplug, soft offline page, NUMA balancing, memory demotion/promotion, etc.), but if a poisoned source folio is accessed during migration, the kernel will panic.

File Systems

v3: filelock: split file leases out of struct file_lock

I’m not sure this is much prettier than the last, but contracting “fl_core” to “c”, as Neil suggested is a bit easier on the eyes.

v2: test_xarray: advanced API multi-index tests

This is a respin of the test_xarray multi-index tests [0] which use and demonstrate the advanced API which is used by the page cache. This should let folks more easily follow how we use multi-index to support for example a min order later in the page cache. It also lets us grow the selftests to mimic more of what we do in the page cache.

v1: Restore data lifetime support

UFS devices are widely used in mobile applications, e.g. in smartphones. UFS vendors need data lifetime information to achieve good performance. Providing data lifetime information to UFS devices can result in up to 40% lower write amplification. Hence this patch series that restores the bi_write_hint member in struct bio. After this patch series has been merged, patches that implement data lifetime support in the SCSI disk (sd) driver will be sent to the Linux kernel SCSI maintainer.

v2: eventfs: Rewrite to simplify the code (aka: crapectomy)

Linus took the time to massively clean up the eventfs logic. I took his code and made tweaks to represent some of the feedback from Al Viro and also fix issues that came up in testing.

v1: jbd2: user-memory-access in jbd2__journal_start

Before reusing the handle, it is necessary to confirm that the transaction is ready.

v1: fs: Use KMEM_CACHE instead of kmem_cache_create

commit 0a31bd5f2bbb (“KMEM_CACHE(): simplify slab cache creation”) introduces a new macro. Use the new KMEM_CACHE() macro instead of direct kmem_cache_create to simplify the creation of SLAB caches.

v1: exfat: ratelimit error msg in exfat_file_mmap()

Ratelimit the error message of zeroing out data between the valid size and the file size in exfat_file_mmap() to not flood dmesg.

v5: Set casefold/fscrypt dentry operations through sb->s_d_op

Sorry for the quick respin. The only difference from v4 is that we change the way we check for relevant dentries during a d_move, as suggested by Eric.

v1: select: Avoid wrap-around instrumentation in do_sys_poll()

The mix of int, unsigned int, and unsigned long used by struct poll_list::len, todo, len, and j meant that the signed overflow sanitizer got worried it needed to instrument several places where arithmetic happens between these variables.

v8: io_uring: add support for ftruncate

This patch adds support for doing truncate through io_uring, eliminating the need for applications to roll their own thread pool or offload mechanism to be able to do non-blocking truncates.

v1: 9p: Further netfslib-related changes

Here are some netfslib-related changes we might want to consider applying to 9p:

(1) Enable large folio support for 9p. This is handled entirely by netfslib and is already supported in afs. I wonder if we should limit the maximum folio size to 1MiB to match the maximum I/O size in the 9p protocol.

Network Devices

v3: treewide: Use clocksource ID for get_device_system_crosststamp()

This patch series changes struct system_counterval_t to identify the clocksource through enum clocksource_ids, rather than through struct clocksource *. The net effect of the patch series is that get_device_system_crosststamp() callers can supply clocksource ids instead of clocksource pointers. The pointers can be problematic to get hold of.

v1: Intel On Demand: Add netlink interface for SPDM attestation

This patch series primarily adds support for a new netlink ABI in the Intel On Demand driver for performing attestation of the hardware state.

[PATCH RESUBMIT net-next] r8169: simplify EEE handling

We don’t have to store the EEE modes to be advertised in the driver, phylib does this for us and stores it in phydev->advertising_eee. phylib also takes care of properly handling the EEE advertisement.

v1: net-next: net: phy: realtek: add support for RTL8126A-integrated 5Gbps PHY

A user reported that the first consumer mainboards with an RTL8126A 5Gbps MAC/PHY are showing up. This adds support for the integrated PHY, which is also available stand-alone. From a PHY driver perspective it’s treated the same as the 2.5Gbps PHYs; we just have to support the new PHY ID.

v1: UIO_MEM_DMA_COHERENT for cnic/bnx2/bnx2x

During bnx2i iSCSI testing we ran into page refcounting issues in the uio mmaps exported from cnic to the iscsiuio process, and bisected back to the removal of the __GFP_COMP flag from dma_alloc_coherent calls.

v1: net-next: Improve GbEth performance on Renesas RZ/G2L and related SoCs

This series aims to improve performance of the GbEth IP in the Renesas RZ/G2L SoC family and the RZ/G3S SoC, which use the ravb driver. Along the way, we do some refactoring and ensure that napi_complete_done() is used in accordance with the NAPI documentation.

v1: net-next: net/sched: report errors with extack

While working on a BPF action, I found that the error handling was limited. Support for extended ack (extack) was only added to some, but not all, actions.

v5: iwl-next: ixgbe: Convert ret val type from s32 to int

Currently a large number of functions returning standard error codes are of type s32. Convert them to regular ints, as typedefs are not necessary here to return standard error codes.

v1: Introduce uts_release

Files like drivers/base/firmware_loader/main.c need to be recompiled whenever the head commit changes, because they include generated/utsrelease.h for the UTS_RELEASE macro and utsrelease.h is regenerated on each such change.

v2: net: dqs: NIC stall detector

This is a patch that was sent by Jakub Kicinski six months ago, and I am reviving it.

v1: net-next: net: dccp: Simplify the allocation of slab caches in dccp_ackvec_init

Use the new KMEM_CACHE() macro instead of direct kmem_cache_create to simplify the creation of SLAB caches.

v1: net-next: nfp: series of minor driver improvements

This short series bundles two unrelated but small updates to the nfp driver.

v1: net-next: sctp: Simplify the allocation of slab caches

commit 0a31bd5f2bbb (“KMEM_CACHE(): simplify slab cache creation”) introduces a new macro. Use the new KMEM_CACHE() macro instead of direct kmem_cache_create to simplify the creation of SLAB caches.

v5: net-next: net: ravb: Prepare for suspend to RAM and runtime PM support (part 1)

This series prepares ravb driver for runtime PM support and adjust the already existing suspend to RAM code to work for RZ/G3S (R9A08G045) SoC.

v2: Dynamically allocate BPIDs for LBK

In the current driver, 64 BPIDs are reserved for LBK interfaces. These BPIDs are mapped 1-to-1 to LBK interface channel numbers. In some use cases one LBK interface requires more than one BPID, and in others it may not require any at all. These use cases can’t be addressed with the current implementation, as it always reserves exactly one BPID per LBK channel.

v1: ice: Add get/set hw address for VF representor ports

Changing the MAC address of the VF representor ports is not available via devlink. Add the function handlers to set and get the HW address for the VF representor ports.

v1: net-next: selftests: openvswitch: Test ICMP related matches work with SNAT

Add a test case for regression in openvswitch nat that was fixed by commit e6345d2824a3 (“netfilter: nf_nat: fix action not being set for all ct states”).

v3: net: octeontx2-af: Initialize maps.

kmalloc_array() without the __GFP_ZERO flag does not initialize memory to zero. This causes issues. Use the __GFP_ZERO flag for maps and bitmap_zalloc() for bitmaps.

v4: net: connector: cn_netlink_has_listeners replaces proc_event_num_listeners

It is inaccurate to judge whether proc_event_num_listeners is cleared by cn_netlink_send_mult returning -ESRCH. In the case of stress-ng netlink-proc, -ESRCH will always be returned, because netlink_broadcast_filtered will return -ESRCH, which may cause stress-ng netlink-proc performance degradation.

v5: net-next: net: dsa: realtek: variants to drivers, interfaces to a common module

The current driver consists of two interface modules (SMI and MDIO) and two family/variant modules (RTL8365MB and RTL8366RB). The SMI and MDIO modules serve as the platform and MDIO drivers, respectively, calling functions from the variant modules. In this setup, one interface module can be loaded independently of the other, but both variants must be loaded (if not disabled at build time) for any type of interface. This approach doesn’t scale well, especially with the addition of more switch variants (e.g., RTL8366B), leading to loaded but unused modules. Additionally, this also seems upside down, as the specific driver code normally depends on the more generic functions and not the other way around.

v8: net-next: netdevsim: link and forward skbs between ports

This patchset adds the ability to link two netdevsim ports together and forward skbs between them, similar to veth. The goal is to use netdevsim for testing features e.g. zero copy Rx using io_uring.

v2: net-next: net: switchdev: Tracepoints

This series starts off (1-2/5) by creating stringifiers for common switchdev objects. This will primarily be used by the tracepoints for decoding switchdev notifications, but drivers could also make use of them to provide richer debug/error messages.

v1: net-next: net: ipa: simplify TX power handling

In order to deliver a packet to the IPA hardware, we must ensure it is powered. We request power by calling pm_runtime_get(), and its return value tells us the power state. We can’t block in ipa_start_xmit(), so if power isn’t enabled we prevent further transmit attempts by calling netif_stop_queue(). Power will eventually become enabled, at which point we call netif_wake_queue() to allow the transmit to be retried. When it does, the power should be enabled, so the packet delivery can proceed.

v1: wifi: ath10k: support board-specific firmware overrides

On WCN3990 platforms the actual firmware, wlanmdsp.mbn, is sideloaded to the modem DSP via TQFTPserv. These MBN files are signed by the device vendor and can only be used with the particular SoC or device.

v1: net: ethernet: mtk_eth_soc: ppe: add support for multiple PPEs

Add the missing pieces to allow multiple PPE units, one for each GMAC. mtk_gdm_config has been modified to work on the targeted MAC ID, and the inner loop has been moved outside of the function to allow unrelated operations like setting the MAC’s PPE index.

v1: net-next: dpll: move xa_erase() call in to match dpll_pin_alloc() error path order

This is cosmetics. Move the call of xa_erase() in dpll_pin_put() so the order of cleanup calls matches the error path of dpll_pin_alloc().

Security Enhancements

v1: Tegra30: add support for LG tegra based phones

Bring up Tegra 3 based LG phones Optimus 4X HD and Optimus Vu based on LG X3 board.

v1: string: Allow 2-argument strscpy()

Using sizeof(dst) for the “size” argument in strscpy() is the overwhelmingly common case. Instead of requiring this everywhere, allow a 2-argument version to be used that will use the sizeof() internally. There are other functions in the kernel with optional arguments[1], so this isn’t unprecedented, and improves readability. Update and relocate the kern-doc for strscpy() too.
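One way such an optional argument can be built in plain C is argument-count dispatch with variadic macros. The following is a userspace sketch, not the kernel's implementation: every name starting with `my_` or `MY_` is hypothetical, and the copy routine only mimics the documented behaviour of always NUL-terminating and returning -E2BIG on truncation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Copy at most size-1 bytes, always NUL-terminate, report truncation. */
static long my_strscpy_impl(char *dst, const char *src, size_t size)
{
	size_t i;

	assert(dst != NULL && src != NULL);
	if (size == 0)
		return -E2BIG;
	for (i = 0; i < size - 1 && src[i] != '\0'; i++)
		dst[i] = src[i];
	dst[i] = '\0';
	return src[i] == '\0' ? (long)i : -E2BIG;
}

/* The 2-argument form is only correct when dst is a real array, not a pointer. */
#define MY_STRSCPY_2(dst, src)		my_strscpy_impl((dst), (src), sizeof(dst))
#define MY_STRSCPY_3(dst, src, size)	my_strscpy_impl((dst), (src), (size))

/* Select the helper by argument count; the trailing 0 keeps "..." non-empty. */
#define MY_STRSCPY_PICK(a1, a2, a3, name, ...) name
#define my_strscpy(...) \
	MY_STRSCPY_PICK(__VA_ARGS__, MY_STRSCPY_3, MY_STRSCPY_2, 0)(__VA_ARGS__)
```

With a `char small[4]` destination, `my_strscpy(small, "hello")` uses `sizeof(small)` implicitly, stores "hel" and returns -E2BIG, while `my_strscpy(big, "abc", 2)` still accepts an explicit size.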

v1: LoongArch: vDSO: Disable UBSAN instrumentation

The vDSO executes in userspace, so the kernel’s UBSAN should not instrument it.

v1: iov_iter: Avoid wrap-around instrumentation in copy_compat_iovec_from_user()

The loop counter “i” in copy_compat_iovec_from_user() is an int, but because the nr_segs argument is unsigned long, the signed overflow sanitizer got worried “i” could wrap around. Instead of making “i” an unsigned long (which may enlarge the type size), switch both nr_segs and i to u32. There is no truncation with nr_segs since it is never larger than UIO_MAXIOV anyway. This keeps sanitizer instrumentation[1] out of a UACCESS path:

v1: overflow: Introduce wrapping helpers

In preparation for gaining instrumentation for signed[1], unsigned[2], and pointer[3] wrap-around, expand the overflow header to include wrap-around helpers that can be used to annotate arithmetic where wrapped calculations are expected (e.g. atomics).
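A wrap-around helper of the kind described above might look roughly like the sketch below. It is an illustration, not the series' code: `my_wrapping_add()` and `next_seq()` are hypothetical names, and the macro relies on the GCC/Clang statement-expression extension and on `__builtin_add_overflow()`, which stores the wrapped result in the destination.

```c
#include <assert.h>
#include <limits.h>

/* Add a and b in the given type, wrapping on overflow instead of trapping. */
#define my_wrapping_add(type, a, b)				\
	({							\
		type sum_;					\
		(void)__builtin_add_overflow((a), (b), &sum_);	\
		sum_;						\
	})

/* Example user: a counter that is expected to wrap, like a sequence number. */
static unsigned char next_seq(unsigned char seq)
{
	return my_wrapping_add(unsigned char, seq, 1);
}

/* Small self-check of the wrapping behaviour. */
static void wrapping_demo(void)
{
	int wrapped = my_wrapping_add(int, INT_MAX, 1);

	assert(wrapped == INT_MIN);
	assert(next_seq(255) == 0);
}
```

Spelling the wrap out this way documents that the overflow is intentional, so a sanitizer can instrument ordinary arithmetic without flagging these call sites.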

v1: ubsan: Introduce wrap-around sanitizers

Lay the ground work for gaining instrumentation for signed[1], unsigned[2], and pointer[3] wrap-around by making all 3 sanitizers available for testing. Additionally gets x86_64 bootable under the unsigned sanitizer for the first time.

v1: dmaengine: pl08x: Use kcalloc() instead of kzalloc()

This is an effort to get rid of all multiplications from allocation functions in order to prevent integer overflows [1].
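The risk being removed can be shown in userspace. In this sketch, `open_coded_bytes()` stands for a multiplication written directly into an allocator's size argument, and `checked_array_zalloc()` is a hypothetical stand-in for an array allocator that checks the multiplication first; neither is kernel code, and `__builtin_mul_overflow()` is a GCC/Clang builtin.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* What an open-coded "n * size" evaluates to: it can wrap silently. */
static size_t open_coded_bytes(size_t n, size_t size)
{
	return n * size;
}

/* Zeroed array allocation that refuses requests whose size overflows. */
static void *checked_array_zalloc(size_t n, size_t size)
{
	size_t bytes;

	if (__builtin_mul_overflow(n, size, &bytes))
		return NULL;	/* fail instead of returning a too-small buffer */
	return calloc(1, bytes);
}

/* Small self-check: the same request wraps to 0 bytes when open-coded. */
static void overflow_demo(void)
{
	size_t huge = SIZE_MAX / 2 + 1;

	assert(open_coded_bytes(huge, 2) == 0);
	assert(checked_array_zalloc(huge, 2) == NULL);
}
```

Passing the element count and element size separately lets the allocator detect the overflow itself, which is the point of moving from an open-coded multiplication to an array-style allocation call.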

v2: bus: mhi: ep: Use kcalloc() instead of kzalloc()

This is an effort to get rid of all multiplications from allocation functions in order to prevent integer overflows [1].

v1: wifi: brcmfmac: Adjust n_channels usage for __counted_by

After commit e3eac9f32ec0 (“wifi: cfg80211: Annotate struct cfg80211_scan_request with __counted_by”), the compiler may enforce dynamic array indexing of req->channels to stay below n_channels. As a result, n_channels needs to be increased before accessing the newly added array index. Increment it first, then use “i” for the prior index. This solves a warning in the coming GCC that has __counted_by support.
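The ordering rule can be sketched with a small flexible-array structure. Everything here is hypothetical (`struct scan_req`, `COUNTED_BY`, the helper names) and is not the cfg80211 or brcmfmac code; the attribute is applied only when the compiler reports support for it and otherwise expands to nothing.

```c
#include <assert.h>
#include <stdlib.h>

/* Use the attribute only where the compiler knows it. */
#if defined(__has_attribute)
# if __has_attribute(counted_by)
#  define COUNTED_BY(member) __attribute__((counted_by(member)))
# endif
#endif
#ifndef COUNTED_BY
# define COUNTED_BY(member)
#endif

/* Hypothetical request type with a counted flexible array. */
struct scan_req {
	unsigned int n_channels;
	int channels[] COUNTED_BY(n_channels);
};

/* n_channels starts at 0 because calloc() zeroes the allocation. */
static struct scan_req *scan_req_alloc(unsigned int max_channels)
{
	return calloc(1, sizeof(struct scan_req) + max_channels * sizeof(int));
}

/* Grow the count first, then write the slot that the new count covers. */
static void scan_req_add(struct scan_req *req, int channel)
{
	assert(req != NULL);

	unsigned int i = req->n_channels++;

	req->channels[i] = channel;
}
```

Writing `req->channels[req->n_channels] = channel` before the increment would index one past the annotated bound, which is the pattern a bounds-checking compiler can flag.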

v5: Add device tree for IBM system1 BMC

This patchset adds device tree for IBM system1 bmc board.


Asynchronous IO

v1: io_uring: Simplify the allocation of slab caches

commit 0a31bd5f2bbb (“KMEM_CACHE(): simplify slab cache creation”) introduces a new macro. Use the new KMEM_CACHE() macro instead of direct kmem_cache_create to simplify the creation of SLAB caches.

v1: io_uring/rw: ensure poll based multishot read retries appropriately

io_read_mshot() always relies on poll triggering retries, and this works fine as long as we do a retry per size of the buffer being read. The buffer size is given by the size of the buffer(s) in the given buffer group ID.

Rust For Linux

v3: rust: kernel: documentation improvements

This patch set aims to make small improvements to the documentation of the kernel crate. It engages in a few different activities:

  • fixing trivial typos (commit #1),
  • updating code examples to better reflect an idiomatic coding style (commits #2,6),
  • increasing the consistency within the crate’s documentation as a whole (commits #3,5,7,8,9,11,12),
  • adding more intra-doc links as well as srctree-relative links to C header files (commits #4,10).

v1: rust: prelude: add bit function

In order to create masks easily, the define BIT() is used in C code. This commit adds the same functionality to the rust kernel.

v1: rust: add reexports for macros

Currently, all macros are reexported with #[macro_export] only, which means that to access new_work! from the workqueue, you need to import it from the path kernel::new_work instead of importing it from the workqueue module like all other items in the workqueue. By adding reexports of the macros, it becomes possible to import the macros from the correct modules.

v6: rust: types: Add try_from_foreign() method

Currently ForeignOwnable::from_foreign() only works for non-null pointers for the existing impls (e.g. Box, Arc). This may lead to some duplicated code.

Add a try_from_foreign() method that returns None if ptr is null, and otherwise returns Some(from_foreign(ptr)).

v2: rust: str: implement Display and Debug for BStr

Currently, BStr is just a type alias of [u8], limiting its representation to a byte list rather than a character list, which is not ideal for printing and debugging.

BPF

v1: bpf-next: libbpf Userspace Runtime-Defined Tracing (URDT)

Adding userspace tracepoints in other languages like Python and Go is very useful for observability. libstapsdt [1] and language bindings like python-stapsdt [2] that rely on it use a clever scheme of emulating static (USDT) userspace tracepoints at runtime.

v1: bpf-next: bpf: Add generic kfunc bpf_ffs64()

This patchset introduces a new generic kfunc bpf_ffs64(). This kfunc allows bpf to reuse kernel’s __ffs64() function to improve ffs performance in bpf.
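To make the performance argument concrete, the sketch below contrasts a bit-by-bit search with a compiler builtin, assuming that "ffs" here means the zero-based index of the lowest set bit of a non-zero 64-bit word. Both function names are hypothetical userspace helpers, not the kfunc itself, and `__builtin_ctzll()` is a GCC/Clang builtin.

```c
#include <assert.h>
#include <stdint.h>

/* Portable loop: index of the least significant set bit (word must be non-zero). */
static unsigned int ffs64_loop(uint64_t word)
{
	unsigned int n = 0;

	assert(word != 0);
	while (!(word & 1)) {
		word >>= 1;
		n++;
	}
	return n;
}

/* Same result via a builtin, which typically compiles to very few instructions. */
static unsigned int ffs64_builtin(uint64_t word)
{
	assert(word != 0);
	return (unsigned int)__builtin_ctzll(word);
}
```

The loop needs up to 63 iterations, whereas the builtin form is roughly what a native helper provides; exposing such a helper to BPF programs avoids open-coding the loop.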

v5: bpf-next: bpf: Add bpf_iter_cpumask

Three new kfuncs, namely bpf_iter_cpumask_{new,next,destroy}, have been added for the new bpf_iter_cpumask functionality. These kfuncs enable the iteration of percpu data, such as runqueues, system_group_pcpu, and more.

v1: bpf: Separate bpf_local_storage_lookup() fast and slow paths

To allow the compiler to inline the bpf_local_storage_lookup() fast-path, factor it out by making bpf_local_storage_lookup() a static inline function and move the slow-path to bpf_local_storage_lookup_slowpath().

[PATCH bpf-next ] selftests/bpf: disable IPv6 for lwt_redirect test

After a recent change in the vmtest runner, this test started failing sporadically.

Investigation showed that this test was subject to race condition which got exacerbated after the vm runner change. The symptoms being that the logic that waited for an ICMPv4 packet is naive and will break if 5 or more non-ICMPv4 packets make it to tap0. When ICMPv6 is enabled, the kernel will generate traffic such as ICMPv6 router solicitation… On a system with good performance, the expected ICMPv4 packet would very likely make it to the network interface promptly, but on a system with poor performance, those “guarantees” do not hold true anymore.

v4: bpf-next: bpftool: add support for split BTF to gen min_core_btf

Enables a user to generate minimized kernel module BTF.

If an eBPF program probes a function within a kernel module or uses types that come from a kernel module, split BTF is required. The split module BTF contains only the BTF types that are unique to the module. It will reference the base/vmlinux BTF types and always starts its type IDs at X+1 where X is the largest type ID in the base BTF.

v1: vhost: virtio: drivers maintain dma info for premapped vq

If virtio is in premapped mode, the driver should manage the DMA info itself, so the virtio core should not store it. This lets us release the memory used to store the DMA info.

v1: bpf-next: bpf: move -Wno-compare-distinct-pointer-types to BPF_CFLAGS

Clang supports enabling/disabling certain conversion diagnostics via the -W[no-]compare-distinct-pointer-types command line options. Disabling this warning is required by some BPF selftests due to -Werror. Until very recently GCC would emit these warnings unconditionally, which was a problem for gcc-bpf, but we added support for the command-line options to GCC upstream [1].

v1: bpf-next: bpf: build type-punning BPF selftests with -fno-strict-aliasing

A few BPF selftests perform type punning and they may break strict aliasing rules, which are exploited by both GCC and clang by default while optimizing. This can lead to broken compiled programs.

v3: bpf-next: Trusted PTR_TO_BTF_ID arg support in global subprogs

This patch set follows recent changes that added btf_decl_tag-based argument annotation support for global subprogs. This time we add ability to pass PTR_TO_BTF_ID (BTF-aware kernel pointers) arguments into global subprograms. We support explicitly trusted arguments only, for now.

v1: 5.10.y: bpf: Convert BPF_DISPATCHER to use static_call() (not ftrace)

[ Upstream commit c86df29d11dfba27c0a1f5039cd6fe387fbf4239 ]

The dispatcher function is currently abusing the ftrace fentry call location for its own purposes – this obviously gives trouble when the dispatcher and ftrace are both in use.

v4: net-next: Enable SGMII and 2500BASEX interface mode switching for Intel platforms

At the start of link initialization, the ‘allow_switch_interface’ flag is set to true. Based on this flag, the interface mode is configured to PHY_INTERFACE_MODE_NA within the ‘phylink_validate_phy’ function. This setting allows all ethtool link modes that are supported and advertised to be published. Interface mode switching then occurs based on the selection of different link modes.

v1: bpf-next: bpftool: Support dumping kfunc prototypes from BTF

This patch enables dumping kfunc prototypes from bpftool. This is useful because, with this patch, end users will no longer have to manually define kfunc prototypes.

v3: dwarves: pahole: Inject kfunc decl tags into BTF

This commit teaches pahole to parse symbols in .BTF_ids section in vmlinux and discover exported kfuncs. Pahole then takes the list of kfuncs and injects a BTF_KIND_DECL_TAG for each kfunc.

v4: bpf-next: Annotate kfuncs in .BTF_ids section

This is a bpf-treewide change that annotates all kfuncs as such inside .BTF_ids. This annotation eventually allows us to automatically generate kfunc prototypes from bpftool.

v6: net-next: add multi-buff support for xdp running in generic mode

Introduce multi-buffer support for xdp running in generic mode, so that the skb is not always linearized in the netif_receive_generic_xdp routine. Introduce page_pool in the softnet_data structure.

v1: bpf: generate const static pointers for kernel helpers

The generated bpf_helper_defs.h file currently contains definitions like this for the kernel helpers, which are static objects:

static void *(*bpf_map_lookup_elem)(void *map, const void *key) = (void *) 1;
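
As a rough illustration of what a `const` qualifier on such a pointer object changes, here is a minimal userspace sketch. The names `lookup_stub`, `lookup_mutable`, `lookup_const` and `pointers_agree` are hypothetical and exist only for this example; in the generated header the pointer is initialized from a helper ID such as `(void *) 1` and is only meaningful inside a BPF program.

```c
#include <stddef.h>

/* Hypothetical stand-in for a kernel helper; not part of any real API. */
static void *lookup_stub(void *map, const void *key)
{
    (void)key;
    return map; /* simply echo the map pointer back */
}

/* Non-const pointer object: it could be reassigned later, so the
 * compiler has to treat it as mutable data. */
static void *(*lookup_mutable)(void *map, const void *key) = lookup_stub;

/* Const pointer object: the pointer itself can never be reassigned;
 * an assignment to lookup_const would be rejected at compile time. */
static void *(*const lookup_const)(void *map, const void *key) = lookup_stub;

/* Call through both pointers; returns 1 when the results agree. */
static int pointers_agree(void *map, const void *key)
{
    return lookup_mutable(map, key) == lookup_const(map, key);
}
```

Both declarations are called in the same way; the difference is only in what the compiler is allowed to assume about the pointer object.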

v3: bpf-next: Improvements for tracking scalars in the BPF verifier

The goal of this series is to extend the verifier’s capabilities of tracking scalars when they are spilled to stack, especially when the spill or fill is narrowing. It also contains a fix by Eduard for infinite loop detection and a state pruning optimization by Eduard that compensates for a verification complexity regression introduced by tracking unbounded scalars. These improvements reduce the surface of false rejections that I saw while working on Cilium codebase.

v1: bpf-next: bpf,token: use BIT_ULL() to convert the bit mask

Replace the ‘(1ULL << *)’ with the macro BIT_ULL(nr).
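
The substitution can be sketched in userspace as follows. The macro is redefined locally as a stand-in so that the example compiles outside the kernel tree, and the two wrapper functions are hypothetical names used only to compare the open-coded shift with the macro form.

```c
#include <stdint.h>

/* Local stand-in for the kernel macro, defined here so the sketch
 * builds in userspace. */
#define BIT_ULL(nr) (1ULL << (nr))

/* Open-coded form of the kind the patch replaces. */
static uint64_t mask_open_coded(unsigned int nr)
{
    return 1ULL << nr;
}

/* The same mask expressed through the macro. */
static uint64_t mask_with_macro(unsigned int nr)
{
    return BIT_ULL(nr);
}
```

Both forms produce the same 64-bit mask for bit numbers 0 through 63; the macro only makes the intent easier to read.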

v1: net-next: net: stmmac: EST conformance support

This patchset enables support for queueMaxSDU and transmission overrun counters which are required for Qbv conformance.

v1: net-next: dma: skip calling no-op sync ops when possible

The series grew from Eric’s idea and patch at [0]. The idea of using the shortcut for direct DMA as well belongs to Chris.

v1: bpf-next: selftests/bpf: Add missing line break in test_verifier

There are no line breaks in the test log for test_verifier #106 ~ #111 if jit is disabled; add the missing line break at the end of printf() to fix it.

v3: libbpf: Add some details for BTF parsing failures

As CONFIG_DEBUG_INFO_BTF is off by default, the existing “failed to find valid kernel BTF” message makes diagnosing the kernel build issue somewhat cryptic. Add a little more detail in the hope of helping users.

Related Technology Updates

Qemu

v3: target/riscv: mcountinhibit, mcounteren, scounteren, hcounteren is 32-bit

mcountinhibit, mcounteren, scounteren and hcounteren must always be 32-bit, per the privileged spec.

v1: target/riscv: Support mxstatus CSR for thead-c906

We first add a framework for vendor CSRs in patch 1. After that we add one thead-c906 CSR mxstatus, which is used for mmu extension xtheadmaee.

v1: target/riscv: FCSR doesn’t contain vxrm and vxsat

vxrm and vxsat have been moved into a special register vcsr since RVV v1.0. So remove them from FCSR for vector 1.0.
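
The register split can be sketched with a few bit masks. The field positions below follow the RVV 1.0 layout of vcsr (vxsat in bit 0, vxrm in bits 2:1) and the standard fcsr layout (fflags in bits 4:0, frm in bits 7:5). The helper names `vcsr_pack` and `fcsr_sanitize` are hypothetical and are not taken from QEMU.

```c
#include <stdint.h>

/* Field positions as laid out by the RVV 1.0 and F-extension specs. */
#define VCSR_VXSAT_MASK  0x1u   /* vcsr[0]   */
#define VCSR_VXRM_SHIFT  1      /* vcsr[2:1] */
#define VCSR_VXRM_MASK   0x3u
#define FCSR_VALID_MASK  0xffu  /* frm in fcsr[7:5], fflags in fcsr[4:0] */

/* Pack the two fixed-point fields into a vcsr value. */
static uint32_t vcsr_pack(uint32_t vxrm, uint32_t vxsat)
{
    return ((vxrm & VCSR_VXRM_MASK) << VCSR_VXRM_SHIFT) |
           (vxsat & VCSR_VXSAT_MASK);
}

/* Hypothetical helper: keep only frm and fflags on an fcsr write, so
 * that no vector fixed-point state can be stored there. */
static uint32_t fcsr_sanitize(uint32_t value)
{
    return value & FCSR_VALID_MASK;
}
```

With this layout, any bits above bit 7 written to fcsr are simply dropped, while vxrm and vxsat travel only through vcsr.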

v1: target/riscv: Use RISCVException as return type for all csr ops

The real return value type has been converted to RISCVException, but some function declarations have not been updated yet. This patch makes all csr operation declarations use RISCVException.

v1: hw/riscv/virt-acpi-build.c: Add SRAT and SLIT ACPI tables

Enable ACPI NUMA support by adding the following 2 ACPI tables:

  • SRAT: provides the association for memory/Harts and Proximity Domains

  • SLIT: provides the relative distance between Proximity Domains
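
To make the role of the SLIT concrete, here is a minimal sketch of a distance matrix between proximity domains. ACPI normalizes the distance from a domain to itself to 10; the remote value of 20, the `MAX_NODES` limit and the function name `slit_fill` are assumptions made for this example and are not taken from the patch.

```c
#include <stdint.h>

#define MAX_NODES       4   /* hypothetical limit for this sketch */
#define LOCAL_DISTANCE  10  /* ACPI: distance from a domain to itself */
#define REMOTE_DISTANCE 20  /* assumed default for any other domain */

/* Fill the top-left nodes x nodes corner of the matrix with relative
 * distances: LOCAL_DISTANCE on the diagonal, REMOTE_DISTANCE elsewhere. */
static void slit_fill(uint8_t m[MAX_NODES][MAX_NODES], int nodes)
{
    for (int i = 0; i < nodes; i++)
        for (int j = 0; j < nodes; j++)
            m[i][j] = (i == j) ? LOCAL_DISTANCE : REMOTE_DISTANCE;
}
```

An operating system can read such a matrix to prefer memory from the domain with the smallest distance to the hart that is running.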

v2: target/riscv: mcountinhibit, mcounteren and scounteren always 32-bit

mcountinhibit, mcounteren and scounteren must always be 32-bit, per the privileged spec.

[RESEND v2 0/2] RISC-V: ACPI: Enable SPCR

This series focuses on enabling the Serial Port Console Redirection (SPCR) table for the RISC-V virt platform. Considering that ARM utilizes the same function, the initial patch involves migrating the build_spcr function to common code. This consolidation ensures that RISC-V avoids duplicating the function.

v2: riscv: named features riscv,isa, ‘svade’ rework

This is a bundle of fixes based on discoveries that were made in the last week or so:

  • what we call “named features” are actually real extensions, which are considered to be ratified by the profile spec that defines them. This means that we need to add riscv,isa strings for them. More info can be found on the commit msg of patch 2;

U-Boot

Pull request efi-2024-04-rc1-4

The following changes since commit 526a865fe4fea59fb2638726c26e39557eb97fdd:

Merge branch ‘master-cleanup’ of https://source.denx.de/u-boot/custodians/u-boot-sh (2024-01-27 20:43:20 -0500)

v2: Add device tree nodes needed by OpenSBI on Visionfive 2

This series adds 2 device tree nodes. These are needed by OpenSBI to perform reset/shutdown of the board.

v4: riscv: sophgo: milkv_duo: add support for Milk-V Duo board

The Milk-V Duo board is built upon Sophgo’s CV1800B SoC, featuring two XuanTie C906 CPUs running at 1.0GHz and 700MHz, respectively.

riscv defconfig starfive visionfive2 env is nowhere, missing axp15060 regulator, misc vendor hacks for Milk-V Mars CM Lite

ENV_IS_NOWHERE=y is set when it should not be after running make starfive_visionfive2_defconfig. I guess that riscv is forgotten in some of the logic there. TL;DR: the env is supposed to be saved in SPI flash, as the defconfig provides, yet ENV_IS_NOWHERE=y also appears, which is strange.

v2: riscv: Support building with Clang

Hello everyone!

This is a minimal patchset for making U-Boot build with Clang on RISC-V, something I stumbled upon while writing U-Boot build scripts for SerenityOS’s RISC-V port. The only change is a (for unclear reasons…)

v2: riscv: separate .data and .text sections of EFI binaries

EFI binaries should not contain sections that are both writable and executable. Separate the RX .text section from the RW .data section.
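
The property being enforced can be sketched as a check on PE/COFF section characteristics. The three flag values are the standard PE/COFF memory-permission bits; the function name `section_is_wx` is hypothetical and is not part of U-Boot.

```c
#include <stdint.h>

/* Standard PE/COFF section memory-permission flags. */
#define IMAGE_SCN_MEM_EXECUTE 0x20000000u
#define IMAGE_SCN_MEM_READ    0x40000000u
#define IMAGE_SCN_MEM_WRITE   0x80000000u

/* Returns 1 when a section is both writable and executable, which is
 * the combination an EFI binary should avoid. */
static int section_is_wx(uint32_t characteristics)
{
    const uint32_t wx = IMAGE_SCN_MEM_WRITE | IMAGE_SCN_MEM_EXECUTE;

    return (characteristics & wx) == wx;
}
```

An RX .text section and an RW .data section both pass this check, while a single merged RWX section does not.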


