[置顶] 泰晓 RISC-V 实验箱,配套 30+ 讲嵌入式 Linux 系统开发公开课
RISC-V Linux 内核及周边技术动态第 72 期
时间:20231231
编辑:晓怡
仓库:RISC-V Linux 内核技术调研活动
赞助:PLCT Lab, ISCAS
内核动态
RISC-V 架构支持
v2: riscv: Add Zicbop & prefetchw support
This patch series adds Zicbop support and then enables the Linux prefetchw feature. It’s based on v6.7-rc7.
GIT PULL: KVM/riscv changes for 6.8 part #1
We have the following KVM RISC-V changes for 6.8: 1) KVM_GET_REG_LIST improvement for vector registers 2) Generate ISA extension reg_list using macros in get-reg-list selftest 3) Steal time account support along with selftest
v1: Enable SPCR table for console output on RISC-V
This patch will enable the SPCR table for RISC-V.
Vendor will enable/disable the SPCR table in the firmware based on the platform design. However, in cases where the SPCR table is not usable, a kernel parameter could be used to specify the preferred console.
v1: riscv: dts: sophgo: add watchdog dt node for CV1800
Add the watchdog device tree node to cv1800 SoC. This patch depends on the clk driver and reset driver. Clk driver link: https://lore.kernel.org/all/IA1PR20MB49539CDAD9A268CBF6CA184BBB9FA@IA1PR20MB4953.namprd20.prod.outlook.com/ Reset driver link: https://lore.kernel.org/all/20231113005503.2423-1-jszhang@kernel.org/
v1: riscv: dts: sophgo: add timer dt node for CV1800
Add the timer device tree node to CV1800 SoC. This patch depends on the clk driver and reset driver. Clk driver link: https://lore.kernel.org/all/IA1PR20MB49539CDAD9A268CBF6CA184BBB9FA@IA1PR20MB4953.namprd20.prod.outlook.com/ Reset driver link: https://lore.kernel.org/all/20231113005503.2423-1-jszhang@kernel.org/
v1: riscv: tlb: avoid tlb flushing on exit & execve
The mmu_gather code sets fullmm=1 when tearing down the entire address space for an mm_struct on exit or execve. So if the underlying platform supports ASID, the tlb flushing can be avoided because the ASID allocator will never re-allocate a dirty ASID.
v1: Add driver for Cadence SD6HC SD/eMMC controller
Starfive JH8100 SoC consists of a Cadence SD/eMMC host controller (Version 6) with Combo PHY which provides DFI interface to SD/eMMC removable or embedded devices. This patch adds initial SD/eMMC support for JH8100 SoC by providing device drivers for Cadence SD/eMMC Version 6 host controller and Combo PHY. This patch series is depending on the JH8100 base patch series in [1], [2], and [3]. The relevant dt-bindings documentation has been updated accordingly.
v2: Unified cross-architecture kernel-mode FPU API
This series unifies the kernel-mode FPU API across several architectures by wrapping the existing functions (where needed) in consistently-named functions placed in a consistent header location, with mostly the same semantics: they can be called from preemptible or non-preemptible task context, and are not assumed to be reentrant. Architectures are also expected to provide CFLAGS adjustments for compiling FPU-dependent code. For the moment, SIMD/vector units are out of scope for this common API.
v1: dt-bindings: riscv: cpus: Clarify mmu-type interpretation
The current description implies that only a single address translation mode is available to the operating system. However, some implementations support multiple address translation modes, and the operating system is free to choose between them.
v14: riscv: Add fine-tuned checksum functions
Each architecture generally implements fine-tuned checksum functions to leverage the instruction set. This patch adds the main checksum functions that are used in networking. Tested on QEMU, this series allows the CHECKSUM_KUNIT tests to complete an average of 50.9% faster.
v5: riscv: sophgo: add clock support for Sophgo CV1800 SoCs
Add clock controller support for the Sophgo CV1800B and CV1812H.
This patch follow this patch series: https://lore.kernel.org/all/IA1PR20MB495399CAF2EEECC206ADA7ABBBD5A@IA1PR20MB4953.namprd20.prod.outlook.com/
v1: irqchip/sifive-plic: One function call less in __plic_init() after error detection
Date: Tue, 26 Dec 2023 21:34:47 +0100
The kfree() function was called in one case by the __plic_init() function during error handling even if the passed data structure member contained a null pointer. This issue was detected by using the Coccinelle software.
v1: Basic clock and reset support for StarFive JH8100 RISC-V SoC
This patch series enabled basic clock & reset support for StarFive JH8100 SoC.
This patch series depends on the Initial device tree support for StarFive JH8100 SoC patch series which can be found at [1].
v6: Support Andes PMU extension
This patch series introduces the Andes PMU extension, which serves the same purpose as Sscofpmf. To use FDT-based probing for hardware support of the PMU extensions, we first convert T-Head’s PMU to CPU feature alternative, then add Andes PMU alternatives.
v4: riscv: enable EFFICIENT_UNALIGNED_ACCESS and DCACHE_WORD_ACCESS
Some riscv implementations such as T-HEAD’s C906, C908, C910 and C920 support efficient unaligned access, for performance reason we want to enable HAVE_EFFICIENT_UNALIGNED_ACCESS on these platforms. To avoid performance regressions on non efficient unaligned access platforms, HAVE_EFFICIENT_UNALIGNED_ACCESS can’t be globally selected.
v1: riscv: Improve exception and system call latency
Many CPUs implement return address branch prediction as a stack. The RISCV architecture refers to this as a return address stack (RAS). If this gets corrupted then the CPU will mispredict at least one but potentally many function returns.
进程调度
v2: net-next: net/sched: cls_api: complement tcf_tfilter_dump_policy
In function
tc_dump_tfilter
, the attributes array is parsed via tcf_tfilter_dump_policy which only describes TCA_DUMP_FLAGS. However, the NLA TCA_CHAIN is also accessed withnla_get_u32
.
v1: drm/sched: Adjustments for drm_sched_init()
Date: Tue, 26 Dec 2023 16:48:48 +0100
A few update suggestions were taken into account from static source code analysis.
v2: sched/fair: Do not scan non-movable tasks several times
If busiest rq is small, nr_running < SCHED_NR_MIGRATE_BREAK and all tasks are not movable, detach_tasks() should not iterate more than tasks available in the busiest rq.
v1: net: net/sched: cls_api: complement tcf_tfilter_dump_policy
In function
tc_dump_tfilter
, the attributes array is parsed via tcf_tfilter_dump_policy which only describes TCA_DUMP_FLAGS. However, the NLA TCA_CHAIN is also accessed withnla_get_u32
. According to the commit 5e2424708da7 (“xfrm: add forgotten nla_policy for XFRMA_MTIMER_THRESH”), such a missing piece could lead to a potential heap data leak.
内存管理
v11: Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support
This patchset is also available at:
https://github.com/amdese/linux/commits/snp-host-v11
and is based on top of the following series:
“v1: Add AMD Secure Nested Paging (SEV-SNP) Initialization Support”https://lore.kernel.org/kvm/20231230161954.569267-1-michael.roth@amd.com/
v1: mm: memory: use nth_page() in clear/copy_subpage()
The clear and copy of huge gigantic page has converted to use nth_page() to handle the possible discontinuous struct page(SPARSEMEM without VMEMMAP), but not change for the non-gigantic part, fix it too.
[mm-stable PATCH] mm/vmstat: move pgdemote_* out of CONFIG_NUMA_BALANCING
Demotion can work well without CONFIG_NUMA_BALANCING. But the commit 23e9f0138963 (“mm/vmstat: move pgdemote_* to per-node stats”) wrongly hid it behind CONFIG_NUMA_BALANCING.
v1: x86 NUMA-aware kernel replication
This patchset implements initial support of kernel text and rodata replication for x86_64 platform. Linux kernel 6.5.5 is used as a baseline.
There was a work previously published for ARM64 platform by Russell King (arm64 kernel text replication). We hope that it will be possible to push this technology forward together.
v1: mm: ratelimit stat flush from workingset shrinker
One of our internal workload regressed on newer upstream kernel and on further investigation, it seems like the cause is the always synchronous rstat flush in the count_shadow_nodes() added by the commit f82e6bf9bb9b (“mm: memcg: use rstat for non-hierarchical stats”). On further inspection it seems like we don’t really need accurate stats in this function as it was already approximating the amount of appropriate shadow entried to keep for maintaining the refault information. Since there is already 2 sec periodic rstat flush, we don’t need exact stats here. Let’s ratelimit the rstat flush in this code path.
v9: mm/gup: Introduce memfd_pin_folios() for pinning memfd folios (v9)
The first two patches were previously reviewed but not yet merged. These ones need to be merged first as the fourth patch depends on the changes introduced in them and they also fix bugs seen in very specific scenarios (running Qemu with hugetlb=on, blob=true and rebooting guest VM).
v1: mm: kasan: stop leaking stack trace handles
Commit 773688a6cb24 (“kasan: use stack_depot_put for Generic mode”) added support for stack trace eviction for Generic KASAN.
However, that commit didn’t evict stack traces when the object is not put into quarantine. As a result, some stack traces are never evicted from the stack depot.
v2: vhost-vdpa: account iommu allocations
iommu allocations should be accounted in order to allow admins to monitor and limit the amount of iommu memory.
v1: mm: xtensa, kasan: define KASAN_SHADOW_END
Common KASAN code might rely on the definitions of the shadow mapping start, end, and size. Define KASAN_SHADOW_END in addition to KASAN_SHADOW_START and KASAN_SHADOW_SIZE.
v1: kernel: Introduce a write lock/unlock wrapper for tasklist_lock
As a rwlock for tasklist_lock, there are multiple scenarios to acquire read lock which write lock needed to be waiting for. In freeze_process/thaw_processes it can take about 200+ms for holding read lock of tasklist_lock by walking and freezing/thawing tasks in commercial devices. And write_lock_irq will have preempt disabled and local irq disabled to spin until the tasklist_lock can be acquired. This leading to a bad responsive performance of current system.
文件系统
v1: virtiofs: Adjustments for two function implementations
Date: Fri, 29 Dec 2023 09:28:09 +0100
A few update suggestions were taken into account from static source code analysis.
v1: fuse: Improve error handling in two functions
Date: Thu, 28 Dec 2023 21:57:00 +0100
The kfree() function was called in two cases during error handling even if the passed variable contained a null pointer. This issue was detected by using the Coccinelle software.
v1: fuse: use page cache pages for writeback io when virtio_fs is in use
This patch just shows the idea, to see if I’m in the right direction 😊 And a quick prototype shows the performance improvement. If there’re no obvious concerns, I’ll try to make a formal patch and run the fstests
v1: fs: extract include/linux/fs_type.h
struct file_system_type is one of the things which could be extracted out of include/linux/fs.h easily.
Drop some useless forward declarations and externs too.
v2: Move fscrypt keyring destruction to after ->put_super
This series moves the fscrypt keyring destruction to after ->put_super, as this will be needed by the btrfs fscrypt support. To make this possible, it also changes f2fs to release its block devices after generic_shutdown_super() rather than before.
v1: sysctl: treewide: constify ctl_table_root::set_ownership
The set_ownership callback is not supposed to modify the ctl_table. Enforce this expectation via the typesystem.
This change also is a step to put “struct ctl_table” into .rodata throughout the kernel.
v1: blk: optimization for classic polling
This removes the dependency on interrupts to wake up task. Set task state as TASK_RUNNING, if need_resched() returns true, while polling for IO completion. Earlier, polling task used to sleep, relying on interrupt to wake it up. This made some IO take very long when interrupt-coalescing is enabled in NVMe.
网络设备
v1: sunrpc: Improve exception handling in krb5_etm_checksum()
Date: Sun, 31 Dec 2023 14:43:05 +0100
The kfree() function was called in one case by the krb5_etm_checksum() function during error handling even if the passed variable contained a null pointer. This issue was detected by using the Coccinelle software.
[net-next: PATCH] net: mvpp2: initialize port fwnode pointer
Update the port’s device structure also with its fwnode pointer with a recommended device_set_node() helper routine.
v1: tipc: Improve exception handling in tipc_bcast_init()
Date: Sun, 31 Dec 2023 12:20:06 +0100
The kfree() function was called in two cases by the tipc_bcast_init() function during error handling even if the passed variable contained a null pointer. This issue was detected by using the Coccinelle software.
v1: wifi: cfg80211: Replace a label in cfg80211_parse_ml_sta_data()
Date: Sun, 31 Dec 2023 11:22:42 +0100
The kfree() function was called in one case by the cfg80211_parse_ml_sta_data() function during error handling even if the passed variable contained a null pointer. This issue was detected by using the Coccinelle software.
v1: bpf: Adjustments for four function implementations
Date: Sat, 30 Dec 2023 20:51:23 +0100
A few update suggestions were taken into account from static source code analysis.
v1: Revert “net: ipv6/addrconf: clamp preferred_lft to the minimum required”
The commit had a bug and might not have been the right approach anyway.
v1: net-next: net/sched: sch_api: conditional netlink notifications
Implement conditional netlink notifications for Qdiscs and classes, which were missing in the initial patches that targeted tc filters and actions. Notifications will only be built after passing a check for ‘rtnl_notify_needed()’.
v1: net-next: net/sched: introduce ACT_P_BOUND return code
Bound actions always return ‘0’ and as of today we rely on ‘0’ being returned in order to properly skip bound actions in tcf_idr_insert_many. In order to further improve maintainability, introduce the ACT_P_BOUND return code.
v2: net-next: selftests/net: change shebang to bash to support “source”
The patch set [1] added a general lib.sh in net selftests, and converted several test scripts to source the lib.sh.
unicast_extensions.sh (converted in [1]) and pmtu.sh (converted in [2]) have a /bin/sh shebang which may point to various shells in different distributions, but “source” is only available in some of them. For example, “source” is a built-it function in bash, but it cannot be used in dash.
v1: net: rtnetlink: allow to set iface down before enslaving it
The below commit adds support for:
ip link set dummy0 down ip link set dummy0 master bond0 up
but breaks the opposite:
ip link set dummy0 up ip link set dummy0 master bond0 down
v1: bpf-next: bpf: add csum/ip_summed fields to __sk_buff
For now, we have to call some helpers when we need to update the csum, such as bpf_l4_csum_replace, bpf_l3_csum_replace, etc. These helpers are not inlined, which causes poor performance.
v3: net-next: virtio-net: support AF_XDP zero copy
AF_XDP
XDP socket(AF_XDP) is an excellent bypass kernel network framework. The zero copy feature of xsk (XDP socket) needs to be supported by the driver. The performance of zero copy is very good. mlx5 and intel ixgbe already support this feature, This patch set allows virtio-net to support xsk’s zerocopy xmit feature.
v4: VMware hypercalls enhancements
VMware hypercalls invocations were all spread out across the kernel implementing same ABI as in-place asm-inline. With encrypted memory and confidential computing it became harder to maintain every changes in these hypercall implementations.
v3: posix-timers: add multi_clock_gettime system call
Some user space applications need to read some clocks. Each read requires moving from user space to kernel space. The syscall overhead causes unpredictable delay between N clocks reads Removing this delay causes better synchronization between N clocks.
v8: GenieZone hypervisor drivers
This series is based on linux-next, tag: next-20231222.
GenieZone hypervisor(gzvm) is a type-1 hypervisor that supports various virtual machine types and provides security features such as TEE-like scenarios and secure boot. It can create guest VMs for security use cases and has virtualization capabilities for both platform and interrupt. Although the hypervisor can be booted independently, it requires the assistance of GenieZone hypervisor kernel driver(gzvm-ko) to leverage the ability of Linux kernel for vCPU scheduling, memory management, inter-VM communication and virtio backend support.
v1: net-next: net: mctp: use deprecated parser in mctp_set_link_af
In mctp set_link_af implementation
mctp_set_link_af
, it uses strict parser nla_parse_nested to parse the nested attribute. This is fine in most cases but not here, as the rtnetlink uses bad magic in setlink code, see code snippet in functiondo_setlink
.
v5: net-next: netdevsim: link and forward skbs between ports
This patchset adds the ability to link two netdevsim ports together and forward skbs between them, similar to veth. The goal is to use netdevsim for testing features e.g. zero copy Rx using io_uring.
v1: iwl-net: idpf: avoid compiler padding in virtchnl2_ptype struct
Config option in arm random config file is causing the compiler to add padding. Avoid it by using “__packed” structure attribute for virtchnl2_ptype struct.
v1: nfc: mei_phy: Adjustments for two function implementations
Date: Wed, 27 Dec 2023 16:53:21 +0100
A few update suggestions were taken into account from static source code analysis.
v1: ss: add option to suppress queue columns
Add a new option
-Q/--no-queues
to ss(8) to suppress the two standard columns Send-Q and Recv-Q. This helps to keep the output steady for monitoring purposes (like listening sockets).
[net-next PATCH 0/3] net: phy: at803x: even more generalization
This is part 3 of at803x required patches to split the PHY driver in more specific PHY Family driver.
While adding support for a new PHY Family qca807x it was notice lots of similarities with the qca808x cdt function. Hence this series is done to make things easier in the future when qca807x PHY will be submitted.
v2: net-next: MT7530 DSA Subdriver Improvements Act I
Hello!
This patch series simplifies the MT7530 DSA subdriver and improves the logic of the support for MT7530, MT7531, and the switch on the MT7988 SoC.
I have done a simple ping test to confirm basic communication on all switch ports on MCM and standalone MT7530, and MT7531 switch with this patch series applied.
v1: iproute2-next: bridge: mdb: Add flush support
Implement MDB flush functionality, allowing user space to flush MDB entries from the kernel according to provided parameters.
v1: net-next: virtio-net: support device stats
As the spec:
https://github.com/oasis-tcs/virtio-spec/commit/42f389989823039724f95bbbd243291ab0064f82
The virtio net supports to get device stats.
v2: net: wwan: t7xx: Add fastboot interface
To support cases such as firmware update or core dump, the t7xx device is capable of signaling the host that a special port needs to be created before the handshake phase.
v1: net-next: sockptr: Change sockptr_t to be a struct
The original commit for sockptr_t tried to use the pointer value to determine whether a pointer was user or kernel. This can’t work on some architectures and was buggy on x86. So the is_kernel discriminator was added after the union of pointers.
安全增强
v6: shrink lib/string.i via IWYU
This patch series changes the include list of string.c to minimize the preprocessing size. The patch series intends to remove REPEAT_BYE from kernel.h and move it into its own header file because word-at-a-time.h has an implicit dependancy on it but it is declared in kernel.h which is bloated.
BPF
v1: bpf-next: bpf: introduce BPF_MAP_TYPE_RELAY
The patch set introduce a new type of map, BPF_MAP_TYPE_RELAY, based on relay interface [0]. It provides a way for persistent and overwritable data transfer.
周边技术动态
Qemu
v2: target/riscv: SMBIOS support for RISC-V virt machine
Generate SMBIOS tables for the RISC-V mach-virt. Add CONFIG_SMBIOS=y to the RISC-V default config.
v1: target/riscv/tcg: do not set defaults for non-generic
riscv_cpu_options[] are exported using qdev and some of them are defined with default values. This is unfortunate since riscv_cpu_add_user_properties() is called after CPU instance init and there is no clear way to disable MMU/PMP for some CPUs.
This series focuses on enabling the Serial Port Console Redirection (SPCR) table for the RISC-V virt platform. Considering that ARM utilizes the same function, the initial patch involves migrating the build_spcr function to common code. This consolidation ensures that RISC-V avoids duplicating the function.
Buildroot
package/gdb: add support for GDB 14.1
commit: https://git.buildroot.net/buildroot/commit/?id=a9a56ab6fd98125ca09078bdeb7c8d55d53aa35e branch: https://git.buildroot.net/buildroot/commit/?id=refs/heads/master
All patches are still relevant, and have been rebased on top of GDB 14.1.
configs/qemu_riscv64_virt_efi: new defconfig
commit: https://git.buildroot.net/buildroot/commit/?id=8219955118fee56ccd3ca8a13a6350d0e15de418 branch: https://git.buildroot.net/buildroot/commit/?id=refs/heads/master
boot/grub2: add RISC-V 64bit EFI support
commit: https://git.buildroot.net/buildroot/commit/?id=f439b47ed6e987306c7de6d9c3be11de04935377 branch: https://git.buildroot.net/buildroot/commit/?id=refs/heads/master
Grub can be built as a RISC-V UEFI application since commit [1]. This commit was first included in grub version 2.04.
U-Boot
v1: rtc: driver for Goldfish RTC
The Goldfish RTC is a virtual device which may be supplied by QEMU. It is enabled by default on QEMU’s RISC-V virt machine.
Provide a driver and enable it by default on RISC-V QEMU.
v1: smbios: riscv: set correct SMBIOS processor family value
Many value of processor type exceed 0xff and have to be stored as u16 value. In the type 4 table set processor_family = 0xfe signaling that field processor_family2 is used and write the actual value into the processor_family2 field.
The following changes since commit 4b151562bb8e54160adedbc6a1c0c749c00a2f84:
bootmeth: pass size to efi_binary_run() (2023-12-22 10:36:50 -0500)
are available in the Git repository at:
https://source.denx.de/u-boot/custodians/u-boot-riscv.git next
猜你喜欢:
- 我要投稿:发表原创技术文章,收获福利、挚友与行业影响力
- 泰晓资讯:汇总一周技术趣闻与文章,查看「Linux 资讯」
- 知识星球:独家 Linux 实战经验与技巧,订阅「Linux知识星球」
- 视频频道:泰晓学院,B 站,发布各类 Linux 视频课
- 开源小店:欢迎光临泰晓科技自营店,购物支持泰晓原创
- 技术交流:Linux 用户技术交流微信群,联系微信号:tinylab
支付宝打赏 ¥9.68元 | 微信打赏 ¥9.68元 | |
请作者喝杯咖啡吧 |
Read Album:
- Stratovirt 的 RISC-V 虚拟化支持(四):内存模型和 CPU 模型
- Stratovirt 的 RISC-V 虚拟化支持(三):KVM 模型
- Stratovirt 的 RISC-V 虚拟化支持(二):库的 RISC-V 适配
- Stratovirt 的 RISC-V 虚拟化支持(一):环境配置
- TinyBPT 和面向 buildroot 的二进制包管理服务(3):服务端说明