[置顶] 泰晓 RISC-V 实验箱,配套 30+ 讲嵌入式 Linux 系统开发公开课
RISC-V Linux 内核及周边技术动态第 95 期
时间:20240609
编辑:晓瑜
仓库:RISC-V Linux 内核技术调研活动
赞助:PLCT Lab, ISCAS
内核动态
RISC-V 架构支持
v2: riscv: Improve exception and system call latency
Many CPUs implement return address branch prediction as a stack. The RISCV architecture refers to this as a return address stack (RAS).
v3: vmalloc: Modify the alloc_vmap_area() error message for better diagnostics
This message is misleading because ‘vmalloc=’ is supported on arm32, x86 platforms and is not a valid kernel parameter on a number of other platforms (in particular its not supported on arm64,alpha,loongarch,arc, csky,hexagon,microblaze,mips,nios2,openrisc,parisc,m64k,powerpc,riscv,sh, um,xtensa,s390,sparc). With the update, the output gets modified to include the function parameters along with the start and end of the virtual memory range allowed.
v16: riscv: sophgo: add clock support for sg2042
This series adds clock controller support for sophgo sg2042.
v1: riscv: Per-thread envcfg CSR support
This series (or equivalent) is a prerequisite for both user-mode pointer masking and CFI support, as those are per-thread features are controlled by fields in the envcfg CSR.
v7: Linux RISC-V IOMMU Support
This patch series introduces support for RISC-V IOMMU architected hardware into the Linux kernel.
v5: Add Svade and Svadu Extensions Support
Svade and Svadu extensions represent two schemes for managing the PTE A/D bit. When the PTE A/D bits need to be set, Svade extension intdicates that a related page fault will be raised.
v4: riscv: Memory Hot(Un)Plug support
Memory Hot(Un)Plug support (and ZONE_DEVICE) for the RISC-V port
v0: RISCV: Report vector unaligned accesses hwprobe
Detected if a system traps into the kernel on an vector unaligned access. Add the result to a new key in hwprobe.
v2: riscv: sophgo: add thermal sensor support for cv180x/sg200x SoCs
This series implements driver for Sophgo cv180x/sg200x on-chip thermal sensor and adds thermal zones for CV1800B SoCs.
v6: Add support for a few Zc* extensions, Zcmop and Zimop
Add support for (yet again) more RVA23U64 missing extensions. Add support for Zimop, Zcmop, Zca, Zcf, Zcd and Zcb extensions ISA string parsing, hwprobe and kvm support. Zce, Zcmt and Zcmp extensions have been left out since they target microcontrollers/embedded CPUs and are not needed by RVA23U64.
v2: Add the core reset for UARTs of StarFive JH7110
The UART of StarFive JH7110 needs two reset signals (apb, core) to initialize. This patch series adds the missing core reset.
v2: riscv: stacktrace: Add USER_STACKTRACE support
Currently, userstacktrace is unsupported for riscv. So use the perf_callchain_user() code as blueprint to implement the arch_stack_walk_user() which add userstacktrace support on riscv. Meanwhile, we can use arch_stack_walk_user() to simplify the implementation of perf_callchain_user().
v6: RISC-V: ACPI: Add external interrupt controller support
This series adds support for the below ECR approved by ASWG. The series primarily enables irqchip drivers for RISC-V ACPI based platforms. The series can be broadly categorized like below.
LoongArch 架构支持
GAS <= 2.41 does not support generating R_LARCH_{32,64}PCREL for “label - .” and it generates R_LARCH{ADD,SUB}{32,64} pairs instead. objtool cannot handle R_LARCH_{ADD,SUB}{32,64} pair in __jump_table (static key implementation) and etc.
v1: LoongArch: KVM: Discard dirty page tracking on readonly memslot
For readonly memslot such as UEFI bios or UEFI var space, guest can not write this memory space directly. So it is not necessary to track dirty pages for readonly memslot. Here there is such optimization in function kvm_arch_commit_memory_region().
进程调度
v1: sched: Initialize the vruntime of a new task when it is first enqueued
When create a new task, we initialize vruntime of the new task at sched_cgroup_fork(). However, the timing of executing this action is too early and may not be accurate.
v1: sched/fair: Prevent cpu_busy_time from exceeding actual_cpu_capacity
Because the effective_cpu_util() would return a util which maybe bigger than the actual_cpu_capacity, this could cause the pd_busy_time calculation errors.
内存管理
v1: mm: sparse: clarify a variable name and its value
Setting ‘limit’ variable to 0 might seem like it means “no limit”. But in the memblock API, 0 actually means the ‘MEMBLOCK_ALLOC_ACCESSIBLE’ enum, which limits the physical address range based on ‘memblock.current_limit’. This can be confusing.
v2: mm: zswap: handle incorrect attempts to load of large folios
Zswap does not support storing or loading large folios. Until proper support is added, attempts to load large folios from zswap are a bug.
v2: mm: introduce pmd/pte_needs_soft_dirty_wp helpers and utilize them
This patchset introduces the pte_need_soft_dirty_wp and pmd_need_soft_dirty_wp helpers to determine if write protection is required for softdirty tracking.
v2: Introduce a store type enum for the Maple tree
This series implements two work items: “aligning mas_store_gfp() with mas_preallocate()” and “enum for store type”.
This is the seventh version of the series that enables block size > page size (Large Block Size) in XFS targetted for inclusion in 6.11.
v1: 6.6.y: mm: ratelimit stat flush from workingset shrinker
One of our workloads (Postgres 14 + sysbench OLTP) regressed on newer upstream kernel and on further investigation, it seems like the cause is the always synchronous rstat flush in the count_shadow_nodes() added by the commit f82e6bf9bb9b (“mm: memcg: use rstat for non-hierarchical stats”).
v1: rust: alloc: add __GFP_HIGHMEM flag
Make it possible to allocate memory that doesn’t need to mapped into the kernel’s address space. This flag is useful together with Page::alloc_page .
v1: mm: zswap: add VM_BUG_ON() if large folio swapin is attempted
With ongoing work to support large folio swapin, it is important to make sure we do not pass large folios to zswap_load() without implementing proper support.
v1: mm: zswap: limit number of zpools based on CPU and RAM
This patch limits the number of zpools used by zswap on smaller systems.
v2: mm/memblock: Add “reserve_mem” to reserved named memory at boot up
Reserve unspecified location of physical memory from kernel command line
v1: support large folio swap-out and swap-in for shmem
Shmem will support large folio allocation to get a better performance, however, the memory reclaim still splits the precious large folios when trying to swap-out shmem, which may lead to the memory fragmentation issue and can not take advantage of the large folio for shmeme.
v1: mm: introduce pmd/pte_need_soft_dirty_wp helpers for softdirty write-protect
This patch introduces the pte_need_soft_dirty_wp and pmd_need_soft_dirty_wp helpers to determine if write protection is required for softdirty tracking. This can enhance code readability and improve its overall appearance.
v3: maple_tree: modified return type of mas_wr_store_entry()
Since the return value of mas_wr_store_entry() is not used, the return type can be changed to void.
v13: mm: report per-page metadata information
This patch adds 2 fields to /proc/vmstat that can used as shown below:
v1: mm/mm_init.c: don’t initialize page->lru again
After init_reserved_page(), we expect __init_single_page() has done its work to the page, which already initialize page->lru properly.
v1: Enable P2PDMA in Userspace RDMA
This patch series enables P2PDMA memory to be used in userspace RDMA transfers.
v1: ML infrastructure in Linux kernel
Initiate a discussion related to an unified infrastructure for ML workloads and user-space drivers.
v2: -next: mm/hugetlb_cgroup: rework on cftypes
This patchset provides an intuitive view of the control files through static templates of cftypes, improve the readability of the code.
**[v1: mm/mm_init.c: simplify logic of deferred_[init | free]_pages](http://lore.kernel.org/linux-mm/20240605010742.11667-1-richard.weiyang@gmail.com/)** |
Function deferred_[init|free]_pages are only used in deferred_init_maxorder(), which makes sure the range to init/free is within MAX_ORDER_NR_PAGES size.
**[v3: ioctl()-based API to query VMAs from /proc/
Implement binary ioctl()-based interface to /proc/
/maps file to allow applications to query VMA information more efficiently than reading *all* VMAs nonselectively through text-based interface of /proc/ /maps file.
文件系统
v1: fs: allow listmount() with reversed ordering
A few smaller cleanups included in this series.
[PATCHES]v1: rework of struct fd handling
Experimental series trying to sanitize the handling of struct fd. Lightly tested, in serious need of review.
v4: Improve readability of copy_tree
This involves renaming the opaque variables (e.g., p, q, r, s) to be more descriptive, aiming to make the code easier to understand.
v5: fs: Improve eventpoll logging to stop indicting timerfd
This change addresses this problem by changing the way eventpoll wakesources are named
v1: vfs: add rcu-based find_inode variants for iget ops
Instantiating a new inode normally takes the global inode hash lock twice:
v2: Employ `copy mount tree from src to dst` concept in copy_tree
Variable names in copy_tree (e.g., p, q, r, s) are opaque; renaming them to be more descriptive would aim to make the code easier to understand.
v1: possible way to deal with dup2() vs. allocated but still not opened descriptors
It’s outside of POSIX scope and any userland code that might run into it is buggy. However, we need to make sure that nothing breaks kernel-side. We used to have interesting bugs in that area and so did *BSD kernels.
v1: fs_parse: add uid & gid option parsing helpers
Multiple filesystems take uid and gid as options, and the code to create the ID from an integer and validate it is standard boilerplate that can be moved into common parsing helper functions, so do that for consistency and less cut&paste.
[HACK PATCH] fs: dodge atomic in putname if ref == 1
The struct used to be refcounted with regular inc/dec ops, atomic usage showed up in commit 03adc61edad4 (“audit,io_uring: io_uring openat triggers audit reference count underflow”).
v1: NFSv4: set sb_flags to second superblock
Added sb_flags parameter to d_automount callback function and fs_context_for_submount(). NFSv4 uses this parameter to set the second superblock.
v2: printk: add threaded printing + the rest
This is v2 of a series to implement threaded console printing as well as some other minor pieces (such as proc and sysfs support). This series is only a subset of the original v1 [0].
v1: iomap: keep on increasing i_size in iomap_write_end()
Commit ‘943bc0882ceb (“iomap: don’t increase i_size if it’s not a write operation”)’ breaks xfs with realtime device on generic/561, the problem is when unaligned truncate down a xfs realtime inode with rtextsize > 1 fs block, xfs only zero out the EOF block but doesn’t zero out the tail blocks that aligned to rtextsize, so if we don’t increase i_size in iomap_write_end(), it could expose stale data after we do an append write beyond the aligned EOF block.
New syscall for mapping generic ringbuffers for arbitary (supported) file descriptors.
This series introduces a proposal to implementing atomic writes in the kernel for torn-write protection.
v1: fs/ntfs3: dealing with situations where dir_search_u may return null
If hdr_find_e() fails to find an entry in the index buffer, dir_search_u() maybe return NULL.
v1: readdir: Add missing quote in macro comment
Add a missing double quote in the unsafe_copy_dirent_name() macro comment.
v1: blk: optimization for classic polling
This removes the dependency on interrupts to wake up task. Set task state as TASK_RUNNING, if need_resched() returns true, while polling for IO completion.
网络设备
v1: can: treewide: decorate flexible array members with __counted_by()
A new __counted_by() attribute was introduced in [1]. It makes the compiler’s sanitizer aware of the actual size of a flexible array member, allowing for additional runtime checks.
v2: net-next: net: flow dissector: allow explicit passing of netns
Change since last version:fix kdoc comment warning reported by kbuild robot, no other changes,thus retaining RvB tags from Eric and Willem.
v4: bpf-next: bpf: Support dumping kfunc prototypes from BTF
This patchset enables both detecting as well as dumping compilable prototypes for kfuncs.
v2: net: bnxt_en: Cap the size of HWRM_PORT_PHY_QCFG forwarded response
Firmware interface 1.10.2.118 has increased the size of HWRM_PORT_PHY_QCFG response beyond the maximum size that can be forwarded. When the VF’s link state is not the default auto state, the PF will need to forward the response back to the VF to indicate the forced state. This regression may cause the VF to fail to initialize.
v6: af_packet: Handle outgoing VLAN packets without hardware offloading
The issue initially stems from libpcap. The ethertype will be overwritten as the VLAN TPID if the network interface lacks hardware VLAN offloading.
v1: net-next: net: dsa: generate port ifname if exists or invalid
In the case where a DSA port (via DTB label) had an interface name that collided with an existing netdev name, register_netdevice failed with -EEXIST, and the port was not usable.
v1: isdn: add missing MODULE_DESCRIPTION() macros
make allmodconfig && make W=1 C=1 reports: Add the missing invocations of the MODULE_DESCRIPTION() macro.
v1: net/sched: initialize noop_qdisc owner
When the noop_qdisc owner isn’t initialized, then it will be 0, so packets will erroneously be regarded as having been subject to recursion as long as only CPU 0 queues them.
v3: net-next: Enable PTP timestamping/PPS for AM65x SR1.0 devices
This patch series enables support for PTP in AM65x SR1.0 devices.
v5: iwl-net: ice: Do not get coalesce settings while in reset
Getting coalesce settings while reset is in progress can cause NULL pointer deference bug.
v4: can: m_can: don’t enable transceiver when probing
The m_can driver sets and clears the CCCR.INIT bit during probe (both when testing the NON-ISO bit, and when configuring the chip).
v4: iwl-next: ice: Add support for devlink local_forwarding param.
Add support for driver-specific devlink local_forwarding param. Supported values are “enabled”, “disabled” and “prioritized”. Default configuration is set to “enabled”.
v3: net-next: net: core: Unify dstats with tstats and lstats, implement generic dstats collection
The struct pcpu_dstats (“dstats”) has a few variations from the other two stats types (struct pcpu_sw_netstats and struct pcpu_lstats), and doesn’t have generic helpers for collecting the per-cpu stats into a struct rtnl_link_stats64.
v5: Series to deliver Ethernet for STM32MP13
Rework dwmac glue to simplify management for next stm32 (integrate RFC from Marek)
v20: net-next: Add Realtek automotive PCIe driver
This series includes adding realtek automotive ethernet driver and adding rtase ethernet driver entry in MAINTAINERS file.
v6: net-next: net: ethernet: mtk_eth_soc: ppe: add support for multiple PPEs
Add the missing pieces to allow multiple PPEs units, one for each GMAC. mtk_gdm_config has been modified to work on targted mac ID, the inner loop moved outside of the function to allow unrelated operations like setting the MAC’s PPE index.
v1: CDC-NCM: add support for Apple’s private interface
This private interface lacks a status endpoint, presumably because there isn’t a physical cable that can be unplugged, nor any speed changes to be notified about.
v2: net-next: net: pse-pd: Add new PSE c33 features
This patch series adds new c33 features to the PSE API.
v13: net-next: Introduce PHY listing and link_topology tracking
This is V13 for the link topology addition, allowing to track all PHYs that are linked to netdevices.
v1: net: bnxt_en: Adjust logging of firmware messages in case of released token in __hwrm_send()
In case of token is released due to token->state == BNXT_HWRM_DEFERRED, released token (set to NULL) is used in log messages. This issue is expected to be prevented by HWRM_ERR_CODE_PF_UNAVAILABLE error code.
v5: net-next: locking: Introduce nested-BH locking.
Disabling bottoms halves acts as per-CPU BKL. On PREEMPT_RT code within local_bh_disable() section remains preemtible. As a result high prior tasks (or threaded interrupts) will be blocked by lower-prio task (or threaded interrupts) which are long running which includes softirq sections.
v2: net: gve: ignore nonrelevant GSO type bits when processing TSO headers
TSO currently fails when the skb’s gso_type field has more than one bit set.
v3: ipsec-next: Add IP-TFS mode to xfrm
This patchset adds a new xfrm mode implementing on-demand IP-TFS. IP-TFS (AggFrag encapsulation) has been standardized in RFC9347.
安全增强
v4: batman-adv: Add flex array to struct batadv_tvlv_tt_data
The “struct batadv_tvlv_tt_data” uses a dynamically sized set of trailing elements. Specifically, it uses an array of structures of type “batadv_tvlv_tt_vlan_data”. So, use the preferred way in the kernel declaring a flexible array .
v1: mm/pstore: Reserve named unspecified memory across boots
Reserve unspecified location of physical memory from kernel command line
This is an effort to get rid of all multiplications from allocation functions in order to prevent integer overflows .
异步 IO
v1: Wait on cancelations at release time
The idea is to ensure that we’ve done any fputs that we need to when a task using a ring exit, so that we don’t leave references that will get put “shortly afterwards”.
v1: io_uring: check for non-NULL file pointer in io_file_can_poll()
In earlier kernels, it was possible to trigger a NULL pointer dereference off the forced async preparation path, if no file had been assigned. The trace leading to that looks as follows:
Rust For Linux
v2: Rust bindings for cpufreq and OPP core + sample driver
This RFC adds initial rust bindings for two subsystems, cpufreq and operating performance points (OPP). The bindings are provided for most of the interface these subsystems expose.
v1: Tracepoints and static branch/call in Rust
An important part of a production ready Linux kernel driver is tracepoints. So to write production ready Linux kernel drivers in Rust, we must be able to call tracepoints from Rust code. This patch series adds support for calling tracepoints declared in C from Rust.
v1: arch: um: rust: Add i386 support for Rust
At present, Rust in the kernel only supports 64-bit x86, so UML has followed suit.
v5: Rust block device driver API and null block driver
This revision includes a check to validate the block size in the abstractions rather than in the driver. Also, the `GenDisk` type state was changed to a builder pattern.
v3: net::phy add unified API for C22 and C45
add unified API for C22 and C45, reading/writing registers and genphy_read_status().
BPF
v3: bpf: Using binary search to improve the performance of btf_find_by_name_kind
Currently, we are only using the linear search method to find the type id by the name, which has a time complexity of O(n). This change involves sorting the names of btf types in ascending order and using binary search, which has a time complexity of O(log(n)).
v1: bpf: don’t call mmap_read_trylock() from IRQ context
syzbot is reporting that the same local lock is held when trying to hold mmap sem from both IRQ enabled context and IRQ context.
v1: bpf-next: bpf: Track delta between “linked” registers.
The “undo” pass was introduced in LLVM https://reviews.llvm.org/D121937 to prevent this optimization, but it cannot cover all cases.
v1: ftrace: Skip __fentry__ location of overridden weak functions
The case is that, based on current compiler behavior
v3: bpftool: Query only cgroup-related attach types
From strace and kernel tracing, I found netkit returned ENXIO and this command failed. I think this AttachType(BPF_NETKIT_PRIMARY) is not relevant to cgroup.
v11: net-next: Device Memory TCP
GIT PULL: Networking for v6.10-rc3
Including fixes from BPF and big collection of fixes for WiFi core and drivers.
v2: bpf-next: Regular expression support for test output matching
This is v2 on the regular expression for test output matching patches.
v1: bpf-next: libbpf: auto-attach skeletons struct_ops
Similarly to `bpf_program`, support `bpf_map` automatic attachment in `bpf_object__attach_skeleton`. Currently only struct_ops maps could be attached.
v2: bpf-next: libbpf: BTF field iterator
Add BTF field (type and string fields, right now) iterator support instead of using existing callback-based approaches, which make it harder to understand and support BTF-processing code.
v1: bpf-next: uprobe, bpf: Add session support
this patchset is adding support for session uprobe attachment and using it through bpf link for bpf programs.
v1: bpf: Support bpf shadow stack
This works for all bpf selftests, but it is expensive. To avoid runtime kmalloc, we could preallocate some spaces, e.g., percpu pages to be used for stack. This should work for non-sleepable programs.
Try to add 3rd argument to bpf program where the 3rd argument is the frame pointer to bpf program stack.
周边技术动态
Qemu
v1: target/riscv: support atomic instruction fetch (Ziccif)
Support 4-byte atomic instruction fetch when instruction is natural aligned.
v4: target/riscv: Support RISC-V privilege 1.13 spec
Based on the change log for the RISC-V privilege 1.13 spec, add the support for ss1p13.
v4: target/riscv/kvm: QEMU support for KVM Guest Debug on RISC-V
This series implements QEMU KVM Guest Debug on RISC-V, with which we could debug RISC-V KVM guest from the host side, using software breakpoints.
v4: target/riscv: raise an exception when CSRRS/CSRRC writes a read-only CSR
Both CSRRS and CSRRC always read the addressed CSR and cause any read side effects regardless of rs1 and rd fields. Note that if rs1 specifies a register holding a zero value other than x0, the instruction will still attempt to write the unmodified value back to the CSR and will cause any attendant side effects.
v5: RISC-V: Modularize common match conditions for trigger
This series modularize the code for checking the privilege levels of type 2/3/6 triggers by implementing functions trigger_common_match() and trigger_priv_match().
The following changes since commit 74abb45dac6979e7ff76172b7f0a24e869405184:
Buildroot
[branch/2024.02.x] package/bpftool: enable on riscv
bpftool supports RISC-V, including rv64 and rv32, so let’s enable the bpftool package on RISC-V.
U-Boot
v1: Make U-Boot memory reservations coherent
The aim of this patch series is to fix the current state of incoherence between modules when it comes to memory usage.
猜你喜欢:
- 我要投稿:发表原创技术文章,收获福利、挚友与行业影响力
- 泰晓资讯:汇总一周技术趣闻与文章,查看「Linux 资讯」
- 知识星球:独家 Linux 实战经验与技巧,订阅「Linux知识星球」
- 视频频道:泰晓学院,B 站,发布各类 Linux 视频课
- 开源小店:欢迎光临泰晓科技自营店,购物支持泰晓原创
- 技术交流:Linux 用户技术交流微信群,联系微信号:tinylab
支付宝打赏 ¥9.68元 | 微信打赏 ¥9.68元 | |
请作者喝杯咖啡吧 |
Read Album:
- TinyBPT 和面向 buildroot 的二进制包管理服务(2):客户端说明
- TinyBPT 和面向 buildroot 的二进制包管理服务(1):设计简介与框架
- RISC-V Linux 内核及周边技术动态第 118 期
- RISC-V Linux 内核及周边技术动态第 117 期
- 实时分析工具 rtla timerlat 介绍(二):延迟测试原理