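RISC-V Linux Kernel and Related Technology News, Issue 81
Date: 20240303
Editor: 晓怡
Repository: RISC-V Linux Kernel Technology Research Activity
Sponsor: PLCT Lab, ISCAS
Kernel News
RISC-V Architecture Support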
v6: riscv: Use Kconfig to set unaligned access speed
If the hardware unaligned access speed is known at compile time, it is possible to avoid running the unaligned access speed probe and so speed up boot time.
v1: riscv: hwprobe: export highest virtual userspace address
Some userspace applications (OpenJDK, for instance) use the free MSBs in pointers to store additional information for their own logic, and need to get this information from somewhere. Currently they rely on parsing the /proc/cpuinfo “mmu=svxx” string to obtain the current number of usable virtual address bits [1]. Since this reflects the raw supported MMU mode, it might differ from the logical one used internally, which is why arch_get_mmap_end() is used.
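As a rough illustration of why the usable address range matters to such applications, the sketch below derives how many most-significant pointer bits stay free for a given highest userspace address. The function name and the sample addresses are assumptions made for illustration; they are not part of the proposed hwprobe interface.

```python
def free_pointer_msbs(highest_user_addr: int, pointer_bits: int = 64) -> int:
    """Return how many top bits of a pointer are never set for user addresses."""
    if highest_user_addr <= 0:
        raise ValueError("highest_user_addr must be positive")
    used_bits = highest_user_addr.bit_length()
    if used_bits > pointer_bits:
        raise ValueError("address does not fit in the pointer width")
    return pointer_bits - used_bits


# Hypothetical sample values: user space limited to the lower half of a
# 39-bit or 48-bit virtual address space.
SV39_USER_TOP = (1 << 38) - 1
SV48_USER_TOP = (1 << 47) - 1

print(free_pointer_msbs(SV39_USER_TOP))  # 26
print(free_pointer_msbs(SV48_USER_TOP))  # 17
```

A runtime that tags pointers would size its tag field from a value like this, which is why a mismatch between the raw MMU mode and the logical address limit matters.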
v5: riscv: ASID-related and UP-related TLB flush enhancements
While reviewing Alexandre Ghiti’s “riscv: tlb flush improvements” series [1], I noticed that most TLB flush functions end up as a call to local_flush_tlb_all() when SMP is disabled. This series resolves that, and also optimizes the scenario where SMP is enabled but only one CPU is present or online. Along the way, I realized that we should be using single-ASID flushes wherever possible, so I implemented that as well.
v4: RISC-V SBI v2.0 PMU improvements and Perf sampling in KVM guest
This series implements the SBI PMU improvements done in SBI v2.0 [1], i.e. PMU snapshot and fw_read_hi() functions.
SBI v2.0 introduced the PMU snapshot feature, which allows the SBI implementation to provide counter information (i.e. values/overflow status) via a shared memory region between the SBI implementation and the supervisor OS. This minimizes the number of traps when perf is used inside a KVM guest, as it relies on SBI PMU plus trap/emulation of the counters.
GIT PULL: RISC-V Sophgo Devicetrees for v6.9
The following changes since commit 41bccc98fb7931d63d03f326a746ac4d429c1dd3:
Linux 6.8-rc2 (2024-01-28 17:01:12 -0800)
are available in the Git repository at:
https://github.com:sophgo/linux.git riscv-sophgo-dt-for-v6.9
v15: Refactoring Microchip PCIe driver and add StarFive PCIe
The final purpose of this patchset is to add a PCIe driver for the StarFive JH7110 SoC. JH7110 uses the PLDA XpressRICH PCIe IP. Microchip PolarFire uses the same IP and has already committed its code, in which PLDA controller code and Microchip platform code are mixed.
Re: v8: Add timer driver for StarFive JH7110 RISC-V SoC
Could you please help to review this patch and give your comments if you have time? Thanks.
v1: arch: mm, vdso: consolidate PAGE_SIZE definition
Naresh noticed that the newly added usage of the PAGE_SIZE macro in include/vdso/datapage.h introduced a build regression. I had an older patch that I revived to have this defined through Kconfig rather than through including asm/page.h, which is not allowed in vdso code.
v1: riscv: deprecate CONFIG_MMU=n
Deprecation of NOMMU support for riscv was discussed during LPC 2023 [1]. Reasons for this include the lack of users as well as the maintenance effort needed to support this mode. The psABI FDPIC specification also never made it upstream, and the last public messages about this development seem to date back to 2020 [2]. Plan the deprecation to be done in 2 years from now. Mark the Kconfig option as deprecated by adding a new dummy option which explicitly displays the deprecation in case of CONFIG_MMU=n.
This is a note to let you know that I’ve just added the patch titled
irqchip/sifive-plic: Enable interrupt if needed before EOI
to the 6.7-stable, 6.6-stable, and 6.1-stable trees, which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary
v1: cpuidle: riscv-sbi: Add cluster_pm_enter()/exit()
When the CPUs in the same cluster are all in the idle state, the kernel might put the cluster into a deeper low power state. Call cluster_pm_enter() before entering the low power state and call cluster_pm_exit() after the cluster has woken up.
The RISC-V AIA specification is ratified as per the RISC-V international process. The latest ratified AIA specification can be found at: https://github.com/riscv/riscv-aia/releases/download/1.0/riscv-interrupts-1.0.pdf
v1: dt-bindings: pwm: opencores: Add compatible for StarFive JH8100
StarFive JH8100 uses the same OpenCores PWM controller as JH7110. Mark JH8100 as compatible to the OpenCores PWM controller.
Process Scheduling
v2: sched: Add trace_sched_waking() tracepoint to sched_ttwu_pending()
Zimuzo reported seeing occasional cases in perfetto traces where tasks went from sleeping directly to trace_sched_wakeup() without always seeing a trace_sched_waking().
v1: sched/eevdf: avoid task starvation in cgroups
When running update_curr, it is checked whether the current task has missed its deadline (update_deadline). If the deadline has been crossed, the task is set to be rescheduled if there are other tasks available on its cfs_rq. This can cause task starvation in some cgroup configurations.
v1: sched/eevdf: sched feature to dismiss lag on wakeup
The previously used CFS scheduler gave tasks that were woken up an enhanced chance to see runtime immediately by deducting a certain value from its vruntime on runqueue placement during wakeup.
This property was used by some, at least vhost, to ensure that certain kworkers are scheduled immediately after being woken up. The EEVDF scheduler does not support this so far. Instead, if such a woken-up entity carries a negative lag from its previous execution, it will have to wait for the current time slice to finish, which hurts the performance of the process expecting immediate execution.
v1: sched/core: split iowait state into two states
iowait is a bogus metric, but it’s helpful in the sense that it allows short waits to not enter sleep states that have a higher exit latency than we would’ve picked for iowait’ing tasks. However, it’s harmful in that lots of applications and monitoring tools assume that iowait is busy time, or otherwise use it as a health metric. Particularly for async IO it’s entirely nonsensical.
Memory Management
v1: memcg_kmem hooks refactoring and kmem_cache_charge()
I have tried to look into Linus’s suggestions to reduce slab memcg accounting overhead [1] [2].
The reorganized hooks are in Patch 1 and it definitely seems like nice cleanup on its own.
This is the second version of the series that enables block size > page size (Large Block Size) in XFS. The context and motivation can be seen in the cover letter of the RFC v1 [1]. We also recorded a talk about this effort at LPC [3], if someone would like more context on this effort.
v7: mm/vmalloc: lock contention optimization under multi-threading
This version has the rearrangement of macros from the previous one.
We are not sure whether we have completely moved these macros and their corresponding helper to the correct position. Could you please help to check whether they are correct?
v1: Merge arm64/riscv hugetlbfs contpte support
This patchset intends to merge the contiguous ptes hugetlbfs implementation of arm64 and riscv.
Both arm64 and riscv support the use of contiguous ptes to map pages that are larger than the default page table size, respectively called contpte and svnapot.
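For a sense of scale, the following sketch computes how many base-page PTEs a contiguous mapping spans. It assumes 4 KiB base pages and a 64 KiB mapping, a size that corresponds to 16 entries both for arm64 contpte with a 4 KiB granule and for the 64 KiB Svnapot size; the helper name is hypothetical and does not come from the patchset.

```python
def contiguous_pte_count(mapping_size: int, base_page_size: int = 4096) -> int:
    """Return the number of base-page PTEs that back one contiguous mapping."""
    if mapping_size <= 0 or mapping_size % base_page_size:
        raise ValueError("mapping size must be a positive multiple of the base page size")
    count = mapping_size // base_page_size
    if count & (count - 1):
        raise ValueError("contiguous mappings span a power-of-two number of PTEs")
    return count


print(contiguous_pte_count(64 * 1024))  # 16
```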
v1: Improved Memory Tier Creation for CPUless NUMA Nodes
The memory tiering component in the kernel is functionally useless for CPUless memory/non-DRAM devices like CXL1.1 type3 memory because the nodes are lumped together in the DRAM tier. https://lore.kernel.org/linux-mm/PH0PR08MB7955E9F08CCB64F23963B5C3A860A@PH0PR08MB7955.namprd08.prod.outlook.com/T/
v1: selftests/mm: Dont fail testsuite due to a lack of hugepages
On systems that have large core counts and large page sizes, but limited memory, the userfaultfd test hugepage requirement is too large.
v1: mm/mempolicy: Use a folio in do_mbind()
We actually add folios to the pagelist already, but then work with them as pages. Removes a call to compound_head() in PageKsm() and removes a reference to page->index.
v2: mm/vmstat: Add order’s information for extfrag_index and unusable_index
The current output of cat /sys/kernel/debug/extfrag/extfrag_index and /sys/kernel/debug/extfrag/unusable_index is not friendly to userspace.
v2: zswap: replace RB tree with xarray
A very deep RB tree requires rebalancing at times, which contributes to zswap fault latencies. An xarray does not need to perform tree rebalancing, so replacing the RB tree with an xarray can yield a small performance gain.
One small difference is that xarray insert might fail with ENOMEM, while RB tree insert does not allocate additional memory.
v3: filemap: avoid unnecessary major faults in filemap_fault()
The major fault occurred when using mlockall(MCL_CURRENT | MCL_FUTURE) in an application, which led to an unexpected issue [1].
This is caused by a temporarily cleared PTE during a read+clear/modify/write update of the PTE, e.g. in do_numa_page()/change_pte_range().
v2: mm: support large folios swap-in
-v2:
- lots of code cleanup according to Chris’s comments, thanks!
- collect Chris’s ack tags, thanks!
- address David’s comment on moving to use folio_add_new_anon_rmap for !folio_test_anon in do_swap_page, thanks!
- remove the MADV_PAGEOUT patch from this series as Ryan will integrate it into the swap-out series
- apply Kairui’s work of “mm/swap: fix race when skipping swapcache” on large folios swap-in as well
- fixed data corruption (zero-filled data) in two races, zswap and the case where some entries are in swapcache while others are not, by checking SWAP_HAS_CACHE while swapping in a large folio
v1: mm: page_alloc: Use div64_ul() instead of do_div()
Fixes Coccinelle/coccicheck warning reported by do_div.cocci.
Compared to do_div(), div64_ul() does not implicitly cast the divisor and does not unnecessarily calculate the remainder.
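The difference can be modelled outside the kernel. The sketch below is a Python model, not kernel code: do_div_model() mimics the 32-bit divisor truncation and the remainder that do_div() computes, while div64_ul_model() mimics div64_ul() as it behaves on a 64-bit kernel. The model function names and the sample values are assumptions for illustration.

```python
U32_MASK = (1 << 32) - 1
U64_MASK = (1 << 64) - 1


def do_div_model(n: int, base: int) -> tuple[int, int]:
    """Model do_div(): the divisor is truncated to 32 bits.

    Returns (quotient, remainder); the kernel macro stores the quotient
    back into n and returns the remainder.
    """
    base32 = base & U32_MASK
    if base32 == 0:
        raise ZeroDivisionError("divisor is zero after truncation to 32 bits")
    n &= U64_MASK
    return n // base32, n % base32


def div64_ul_model(dividend: int, divisor: int) -> int:
    """Model div64_ul() on a 64-bit kernel: full-width divisor, quotient only."""
    return (dividend & U64_MASK) // (divisor & U64_MASK)


dividend = 1 << 40
divisor = (1 << 32) + 2  # does not fit in 32 bits

print(do_div_model(dividend, divisor))    # (549755813888, 0): divisor became 2
print(div64_ul_model(dividend, divisor))  # 255
```

With a divisor that fits in 32 bits both models agree on the quotient; the sample divisor is chosen to show where they diverge.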
v1: mm/kmemleak: Don’t hold kmemleak_lock when calling printk()
When some error conditions happen (like OOM), some kmemleak functions call printk() to dump out some useful debugging information while holding the kmemleak_lock. This may cause deadlock as the printk() function may need to allocate additional memory leading to a create_object() call acquiring kmemleak_lock again.
v1: Is pagecache_isize_extended() compatible with large folios?
I’d appreciate some filesystem people checking my work here (in that pagecache_isize_extended() may already be broken and we didn’t notice).
As far as I can tell (and it’d be nice to explain this in the kernel-doc a little more thoroughly), the reason pagecache_isize_extended() exists is that some filesystems rely on getting page_mkwrite() calls in order to instantiate blocks. So if you have a filesystem using 512 byte blocks and a 256 byte file mmaped, a store anywhere in the page will only result in block 0 of the file being instantiated and the folio will now be marked as dirty.
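The numbers in that example can be worked through with a short sketch. The 4096-byte page size and the store offset of 3000 are assumptions chosen for illustration, and the helper names are hypothetical.

```python
def instantiated_blocks(file_size: int, block_size: int) -> int:
    """Number of blocks needed to back a file of file_size bytes."""
    return -(-file_size // block_size)  # ceiling division


def block_index(offset: int, block_size: int) -> int:
    """Index of the block that contains the given byte offset."""
    return offset // block_size


PAGE_SIZE = 4096   # assumed page size
BLOCK_SIZE = 512   # from the example in the text
FILE_SIZE = 256    # from the example in the text
STORE_OFFSET = 3000  # assumed offset of a store into the mmaped page

print(PAGE_SIZE // BLOCK_SIZE)                # 8 blocks fit in one page
print(instantiated_blocks(FILE_SIZE, BLOCK_SIZE))  # 1, i.e. only block 0
print(block_index(STORE_OFFSET, BLOCK_SIZE))  # 5
```

The store lands in block 5 of the page, yet only block 0 backs the file, which is the gap the question in the cover letter is about.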
v1: mm: Use folio more widely in __split_huge_page
We already have a folio; use it instead of the head page where reasonable. Saves a couple of calls to compound_head() and eliminates a few references to page->mapping.
v1: mm/zsmalloc: move get_zspage_lockless into #ifdef
It’s only used from inside of an #ifdef section, causing a warning otherwise:
mm/zsmalloc.c:735:23: error: unused function ‘get_zspage_lockless’ [-Werror,-Wunused-function]
  735 | static struct zspage *get_zspage_lockless(struct page *page)
      |                       ^
Move it down into that block to avoid adding another #ifdef.
v1: mm/treewide: Replace pXd_large() with pXd_leaf()
[based on latest akpm/mm-unstable, commit 1274e7646240]
These two APIs are mostly always the same. It’s confusing to have both of them. Merge them into one. Here I used pXd_leaf() only because pXd_leaf() is a global API which is always defined, while pXd_large() is not.
v2: mm: add alloc_contig_migrate_range allocation statistics
alloc_contig_migrate_range has all the information needed to understand the latency of big contiguous allocations: for example, how many pages were migrated and how many times they had to be unmapped from page tables.
v2: mm/zsmalloc: don’t need to reserve LSB in handle
We will save allocated tag in the object header to indicate that it’s allocated.
handle |= OBJ_ALLOCATED_TAG;
So the object header needs to reserve LSB for this tag bit.
v1: mm/vmscan: simplify the calculation of fractions for SCAN_FRACT
The current way to calculate fractions for SCAN_FRACT is hard to read and more complicated than it should be. It also performs unnecessary division and adjustment to avoid zero operands. Prune away by multiplying the fractions by ‘anon_cost * file_cost / (3 * total_cost)’:
v1: mm: convert folio_estimated_sharers() to folio_likely_mapped_shared()
Callers of folio_estimated_sharers() only care about “mapped shared vs. mapped exclusively”, not the exact estimate of sharers. Let’s consolidate and unify the condition users are checking. While at it clarify the semantics and extend the discussion on the fuzziness.
v1: mm/cma: convert cma_alloc() to return folio
Change cma_alloc() to return struct folio. This further increases the usage of folios in mm/hugetlb.
v3: Rearrange batched folio freeing
Other than the obvious “remove calls to compound_head” changes, the fundamental belief here is that iterating a linked list is much slower than iterating an array (5-15x slower in my testing). There’s also an associated belief that since we iterate the batch of folios three times, we do better when the array is small (ie 15 entries) than we do with a batch that is hundreds of entries long, which only gives us the opportunity for the first pages to fall out of cache by the time we get to the end.
v1: make the hugetlb migration strategy consistent
As discussed in a previous thread [1], there is an inconsistency when handling hugetlb migration. When handling the migration of freed hugetlb, it prevents fallback to other NUMA nodes in alloc_and_dissolve_hugetlb_folio(). However, when dealing with in-use hugetlb, it allows fallback to other NUMA nodes in alloc_hugetlb_folio_nodemask(), which can break the per-node hugetlb pool and might result in unexpected failures when node-bound workloads don’t get what is assumed to be available.
v2: mm: make folio_pte_batch available outside of mm/memory.c
madvise, mprotect and some others might need folio_pte_batch to check if a range of PTEs are completely mapped to a large folio with contiguous physical addresses. Let’s make it available in mm/internal.h.
v1: mm/zsmalloc: simplify synchronization between zs_page_migrate() and free_zspage()
free_zspage() has to hold the locks of all pages, since the zs_page_migrate() path relies on the page lock to protect against the race between zs_free() and itself, so that it can safely get the zspage from page->private.
v1: mm/zsmalloc: don’t need to save tag bit in handle
We only need to save the position (pfn + obj_idx) in the handle, don’t need to save tag bit in handle. So one more bit can be used as obj_idx.
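The effect of dropping the tag bit can be sketched with plain bit arithmetic. This is an illustrative model only: the 64-bit word, the 40-bit PFN width, and the helper names are assumptions, and the real field widths depend on the architecture and kernel configuration.

```python
BITS_PER_LONG = 64
PFN_BITS = 40  # hypothetical width chosen for illustration


def obj_index_bits(tag_bits: int) -> int:
    """Bits left for obj_idx once the PFN and any tag bits are accounted for."""
    return BITS_PER_LONG - PFN_BITS - tag_bits


def pack(pfn: int, obj_idx: int, tag_bits: int) -> int:
    """Pack (pfn, obj_idx) into one word, leaving tag_bits low bits free."""
    idx_bits = obj_index_bits(tag_bits)
    if not 0 <= pfn < (1 << PFN_BITS):
        raise ValueError("pfn out of range")
    if not 0 <= obj_idx < (1 << idx_bits):
        raise ValueError("obj_idx out of range")
    return ((pfn << idx_bits) | obj_idx) << tag_bits


def unpack(value: int, tag_bits: int) -> tuple[int, int]:
    """Recover (pfn, obj_idx) from a packed word."""
    idx_bits = obj_index_bits(tag_bits)
    value >>= tag_bits
    return value >> idx_bits, value & ((1 << idx_bits) - 1)


print(obj_index_bits(1))  # 23 bits for obj_idx with a reserved tag bit
print(obj_index_bits(0))  # 24 bits for obj_idx without it
print(unpack(pack(0x1234, 7, 0), 0))  # (4660, 7)
```

One extra index bit doubles the number of objects that can be addressed per PFN in this model.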
v1: mm: export folio_pte_batch as a couple of modules might need it
madvise and some others might need folio_pte_batch to check if a range of PTEs are completely mapped to a large folio with contiguous physical addresses. Let’s export it for others to use.
v5: Split a folio to any lower order folios
File folios support any order and multi-size THP is upstreamed [1], so both file and anonymous folios can be >0 order. Currently, split_huge_page() only splits a huge page to order-0 pages, but splitting to orders higher than 0 might better utilize large folios, if done properly. In addition, Large Block Sizes in XFS support would benefit from it during truncate [2]. This patchset adds support for splitting a large folio to any lower order folios. The patchset is on top of mm-everything-2024-02-24-02-40.
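The arithmetic behind such a split is simple: an order-N folio covers 2^N base pages, so splitting it into order-M folios yields 2^(N-M) pieces. The sketch below only illustrates that arithmetic; the helper name is hypothetical and it models none of the kernel's actual splitting logic.

```python
def split_result(old_order: int, new_order: int) -> tuple[int, int]:
    """Return (number of resulting folios, base pages per resulting folio)."""
    if not 0 <= new_order < old_order:
        raise ValueError("new_order must be lower than old_order and non-negative")
    return 1 << (old_order - new_order), 1 << new_order


# An order-9 folio (512 base pages) split down to order-4 folios.
print(split_result(9, 4))  # (32, 16)
# The existing behaviour: split all the way down to order-0 pages.
print(split_result(9, 0))  # (512, 1)
```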
v2: Cover a guard gap corner case
For v2, the notable change is a bug fix to not clobber the MMF_TOPDOWN during fork. In the RFC this resulted in fork() children that didn’t exec getting the map up behavior, which included the stress-ng bigheap test. It turns out much of the 4% improvement seen was due to the bottomup mapping direction. With the fix, the performance benefit was a less surprising
v1: vfio/type1: unpin PageReserved page
We hit a warning as follows:
WARNING: CPU: 99 PID: 1766859 at mm/gup.c:209 try_grab_page.part.0+0xe8/0x1b0
CPU: 99 PID: 1766859 Comm: qemu-kvm Kdump: loaded Tainted: GOE 5.10.134-008.2.x86_64 #1
Hardware name: Foxconn AliServer-Thor-04-12U-v2/Thunder2, BIOS 1.0.PL.FC.P.031.00 05/18/2022
File Systems
v2: qnx6: convert qnx6 to use the new mount api
Convert the qnx6 filesystem to use the new mount API.
Mostly untested, since there is no qnx6 fs image readily available. Testing did include parsing of the mmi_fs option.
GIT PULL: xfs: Code changes for 6.8
Please pull this branch with changes for xfs for 6.8-rc7. The changes are limited to just one patch where we drop experimental warning message when mounting an xfs filesystem on an fsdax device. We now consider xfs on fsdax to be stable.
v1: ext4: Add direct-io atomic write support using fsawu
This RFC series adds support for atomic writes to ext4 direct-io using filesystem atomic write unit. It’s built on top of John’s “block atomic write v5” series which adds RWF_ATOMIC flag interface to pwritev2() and enables atomic write support in underlying device driver and block layer.
v7: tracing: Support to dump instance traces by ftrace_dump_on_oops
Currently ftrace only dumps the global trace buffer on an oops. For debugging a production use case, an instance trace is helpful for checking specific problems, since the global trace buffer may be used for other purposes.
v1: afs: Don’t cache preferred address
In the AFS fileserver rotation algorithm, don’t cache the preferred address for the server as that will override the explicit preference if a non-preferred address responds first.
v1: fanotify: move path permission and security check
In its current state, do_fanotify_mark() does path permission and security checking before doing the event configuration checks. In the case where the user configures mount and sb marks with a kernel-internal pseudo fs, security_path_notify() yields an EACCES and causes an earlier exit. Instead, this particular case should have been handled by fanotify_events_supported() and exited with an EINVAL.
v1: fs_parser: handle parameters that can be empty and don’t have a value
While investigating an ext4/053 fstest failure, I realised that when the flag ‘fs_param_can_be_empty’ is set in a parameter and its value is NULL, that parameter isn’t being handled as a ‘flag’ type, even if its type is set to ‘fs_value_is_flag’. The first patch in this series changes this behaviour.
v2: qnx4: convert qnx4 to use the new mount api
Convert the qnx4 filesystem to use the new mount API.
Tested mount, umount, and remount using a qnx4 boot image.
v1: hugetlbfs: support idmapped mounts
Pass down the idmapped mount information to the different helper functions.
In contrast, hugetlb_file_setup() will continue to not have any mapping, since it is only used from contexts where idmapped mounts are not used.
v1: ext4: Do endio process under irq context for DIO overwrites
Recently we found an ext4 performance regression between 4.18 and 5.10 using the following test command on an x86 physical machine with NVMe:
fio -direct=1 -iodepth=128 -rw=randwrite -ioengine=libaio -bs=4k -size=2G -numjobs=1 -time_based -runtime=60 -group_reporting -filename=/test/test -name=Rand_write_Testing --cpus_allowed=1
v1: buffered write path without inode lock (for bcachefs)
this is going in my for-next branch - it’s tested and I think all the corner cases are handled to my satisfaction (there are some fun ones!)
v1: virtiofs: don’t mark virtio_fs_sysfs_exit as __exit
Calling an __exit function from an __init function is not allowed and will result in undefined behavior when the code is built-in:
v1: fs: use inode_set_ctime_to_ts to set inode ctime to current time
The function inode_set_ctime_current simply retrieves the current time and assigns it to the field __i_ctime without any alterations. Therefore, it is possible to set ctime to now directly using inode_set_ctime_to_ts.
v1: xarray: add guard definitions for xa_lock
Add DEFINE_GUARD definitions so that xa_lock can be used with guard() or scoped_guard().
v1: xfs: stop advertising SB_I_VERSION
The redefinition of how NFS wants inode->i_version to be updated is incompatible with the XFS i_version mechanism. The VFS now wants inode->i_version to only change when ctime changes (i.e. it has become a ctime change counter, not an inode change counter). XFS has
This series introduces a proposal for implementing atomic writes in the kernel for torn-write protection.
This series takes the approach of adding a new “atomic” flag to each of pwritev2() and iocb->ki_flags - RWF_ATOMIC and IOCB_ATOMIC, respectively. When set, these indicate that we want the write issued “atomically”.
v2: fuse: add support for explicit export disabling
open_by_handle_at(2) can fail with -ESTALE with a valid handle returned by a previous name_to_handle_at(2) for evicted fuse inodes, which is especially common when entry_valid_timeout is 0, e.g. when the fuse daemon is in “cache=none” mode.
v1: bcachefs disk accounting rewrite
here it is; the disk accounting rewrite I’ve been talking about since forever.
git link: https://evilpiepirate.org/git/bcachefs.git/log/?h=bcachefs-disk-accounting-rewrite
v1: blk: optimization for classic polling
This removes the dependency on interrupts to wake up task. Set task state as TASK_RUNNING, if need_resched() returns true, while polling for IO completion. Earlier, polling task used to sleep, relying on interrupt to wake it up. This made some IO take very long when interrupt-coalescing is enabled in NVMe.
Network Devices
v9: net-next: net: ethernet: Rework EEE
Most MAC drivers get EEE wrong. The API to the PHY is not very obvious, which is probably why. Rework the API, pushing most of the EEE handling into the phylib core, leaving the MAC drivers to just enable/disable support for EEE in their change_link callback.
v2: net-next: Add en8811h phy driver and devicetree binding doc
This patch series adds the driver and the devicetree binding documentation for the Airoha en8811h PHY.
v1: net-next: net: constify struct class usage
This is a simple and straightforward cleanup series that aims to make the class structures in net constant. This has been possible since 2023 [1].
v2: net-next: ethtool: ignore unused/unreliable fields in set_eee op
This function is used with the set_eee() ethtool operation. Certain fields of struct ethtool_keee() are relevant only for the get_eee() operation. In addition, in case of the ioctl interface, we have no guarantee that userspace sends sane values in struct ethtool_eee. Therefore explicitly ignore all fields not needed for set_eee(). This protects from drivers trying to use unchecked and unreliable data, relying on specific userspace behavior.
v1: net-next: net/nlmon: Cancel setting filelds of statistics to zero.
Since the fields of rtnl_link_stats64 have already been set to zero in the preceding dev_get_stats function, there is no need to set them again in the ndo_get_stats64 function.
v1: net-next: net/smc: reduce rtnl pressure in smc_pnet_create_pnetids_list()
Many syzbot reports show extreme rtnl pressure, and many of them hint that smc acquires rtnl in netns creation for no good reason [1].
v1: net-next: tools: ynl: add --dbg-small-recv for easier kernel testing
When testing netlink dumps I usually hack some user space up to constrain its user space buffer size (iproute2, ethtool or ynl). Netlink will try to fill the messages up, so since these apps use large buffers by default, the dumps are rarely fragmented.
v6: net-next: net: dsa: vsc73xx: Make vsc73xx usable
This patch series focuses on making vsc73xx usable.
The first patch was added in v2; it switches from a poll loop to read_poll_timeout.
v1: net-next: mptcp: userspace pm: ‘dump addrs’ and ‘get addr’
This series from Geliang adds two new Netlink commands to the userspace PM:
- one to dump all addresses of a specific MPTCP connection:
- feature added in patches 3 to 5
- test added in patches 7, 8 and 10
v1: net-next: mptcp: add TCP_NOTSENT_LOWAT sockopt support
Patch 3 does the magic of adding TCP_NOTSENT_LOWAT support, all the other ones are minor cleanup seen along when working on the new feature.
v1: net-next: tools/net/ynl: Add support for nlctrl netlink family
This series adds a new YNL spec for the nlctrl family, plus some fixes and enhancements for ynl.
v2: net-next: net: ipa: simplify device pointer access
This version of this patch series fixes the bugs in the first patch (which were fixed in the second), where ipa_interrupt_config() had two remaining spots that returned a pointer rather than an integer.
v1: net-next: rxrpc: Miscellaneous changes and make use of MSG_SPLICE_PAGES
Here are some changes to AF_RXRPC:
(1) Cache the transmission serial number of ACK and DATA packets in the rxrpc_txbuf struct and log this in the retransmit tracepoint.
v2: iwl-next: XDP Tx Hardware Timestamp for igc driver
Implemented XDP transmit hardware timestamp metadata for igc driver.
This patchset is tested with tools/testing/selftests/bpf/xdp_hw_metadata on Intel ADL-S platform. Below are the test steps and results.
v2: net: netfilter: Add protection for bmp length out of range
UBSAN load reports an exception of BRK#5515 SHIFT_ISSUE: Bitwise shifts that are out of bounds for their data type.
vmlinux get_bitmap(b=75) + 712 <net/netfilter/nf_conntrack_h323_asn1.c:0>
vmlinux decode_seq(bs=0xFFFFFFD008037000, f=0xFFFFFFD008037018, level=134443100) + 1956 <net/netfilter/nf_conntrack_h323_asn1.c:592>
v1: net: hns: Use common error handling code in hns_mac_init()
Date: Fri, 1 Mar 2024 15:48:25 +0100
Add a jump target so that a bit of exception handling can be better reused at the end of this function implementation.
v2: Add minimal XDP support to TI AM65 CPSW Ethernet driver
This patch adds XDP support to TI AM65 CPSW Ethernet driver.
The following features are implemented: NETDEV_XDP_ACT_BASIC, NETDEV_XDP_ACT_REDIRECT, and NETDEV_XDP_ACT_NDO_XMIT.
v4: net: ipv6: fib6_rules: flush route cache when rule is changed
When the rule policy is changed, the ipv6 socket cache is not refreshed. The sock’s skb still uses an outdated route cache and is sent to the wrong interface.
v3: net-next: MT7530 DSA Subdriver Improvements Act III
This is the third patch series with the goal of simplifying the MT7530 DSA subdriver and improving support for MT7530, MT7531, and the switch on the MT7988 SoC.
v1: net-next: ps3_gelic_net: Use napi routines for RX SKB
Convert the PS3 Gelic network driver’s RX SK buffers over to use the napi_alloc_frag_align and napi_build_skb routines.
Security Hardening
v3: string: Convert selftests to KUnit
I realized the string selftests hadn’t been converted to KUnit yet. Do that.
v2: Handle faults in KUnit tests
This patch series teaches KUnit to handle kthread faults as errors, and it brings a few related fixes and improvements.
v5: arm64: qcom: add AIM300 AIoT board support
Add AIM300 AIoT support along with usb, ufs, regulators, serial, PCIe, and PMIC functions. AIM300 Series is a highly optimized family of modules designed to support AIoT applications. It integrates QCS8550 SoC, UFS and PMIC chip etc.
v1: overflow: Allow non-type arg to type_max() and type_min()
A common use of type_max() is to find the max for the type of a variable. Using the pattern type_max(typeof(var)) is needlessly verbose.
v1: compiler.h: Explain how __is_constexpr() works
The __is_constexpr() macro is dark magic. Shed some light on it with a comment to explain how and why it works.
v1: netdev: Use flexible array for trailing private bytes
Introduce a new struct net_device_priv that contains struct net_device but also accounts for the commonly trailing bytes through the “size” and “data” members.
v1: Run KUnit tests late and handle faults
This patch series moves KUnit test execution at the very end of kernel initialization, just before launching the init process. This opens the way to test any kernel code in its normal state (i.e. fully initialized).
v2: scsi: replace deprecated strncpy
This series contains multiple replacements of strncpy throughout the scsi subsystem.
v4: iio: core: New macros and making use of them
Added new macros to overflow.h and reuse it in IIO. For the sake of examples a few more places were updated (requested by Kees). In case maintainers are okay, tags will be appreciated.
v1: lib: stackinit: Adjust target string to 8 bytes for m68k
For reasons I cannot understand, m68k moves the start of the stack frame for consecutive calls to the same function if the function’s test variable is larger than 8 bytes.
v2: x86, relocs: Ignore relocations in .notes section
When building with CONFIG_XEN_PV=y, .text symbols are emitted into the .notes section so that Xen can find the “startup_xen” entry point.
v1: thermal: core: Move initial num_trips assignment before memcpy()
This panic occurs because trips is counted by num_trips but num_trips is assigned after the call to memcpy(), so the fortify checks think the buffer size is zero because tz was allocated with kzalloc().
Asynchronous IO
PATCH: io_uring/net: correct the type of variable
The namelen is of type int. It shouldn’t be made size_t which is unsigned. The signed number is needed for error checking before use.
v1: io_uring: get rid of intermediate aux cqe caches
With defer taskrun we store aux cqes into a cache array and then flush into the CQ, and we also maintain the ordering so aux cqes are flushed before request completions.
v10: io_uring: Statistics of the true utilization of sq threads.
Count the running time and actual IO processing time of the sqpoll thread, and output the statistical data to fdinfo.
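A consumer of these statistics would typically divide the IO processing time by the total running time. The sketch below shows that calculation on a made-up fdinfo excerpt; the field names SqWorkTime and SqTotalTime and the sample values are assumptions for illustration and should be checked against the fdinfo output of the kernel in use.

```python
def parse_fdinfo(text: str) -> dict[str, str]:
    """Parse 'Key: value' lines into a dictionary, ignoring other lines."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def sq_utilization(fields: dict[str, str],
                   work_key: str = "SqWorkTime",
                   total_key: str = "SqTotalTime") -> float:
    """Fraction of the sqpoll thread's running time spent processing IO."""
    work = int(fields[work_key])
    total = int(fields[total_key])
    if total <= 0:
        return 0.0
    return work / total


# Hypothetical fdinfo excerpt; names and numbers are illustrative only.
SAMPLE = "SqThread:\t1234\nSqWorkTime:\t250000\nSqTotalTime:\t1000000\n"

print(sq_utilization(parse_fdinfo(SAMPLE)))  # 0.25
```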
v2: io_uring/net: improve the usercopy for sendmsg/recvmsg
We’re spending a considerable amount of the sendmsg/recvmsg time just copying in the message header. And for provided buffers, the known single entry iovec.
Rust For Linux
v5: kselftest: Add basic test for probing the rust sample modules
Add new basic kselftest that checks if the available rust sample modules can be added and removed correctly.
v2: Arc methods for linked list
This patchset contains two useful methods for the Arc type. They will be used in my Rust linked list implementation, which Rust Binder uses.
v1: Rewrite the VP9 codec library in Rust
This patch ports the VP9 library written by Andrzej into Rust as a proof-of-concept. This is so that we can evaluate the Rust in V4L2 initiative with source code in hand.
v4: rust: locks: Add get_mut method to Lock
Having a mutable reference guarantees that no other threads have access to the lock, so we can take advantage of that to grant callers access to the protected data without the cost of acquiring and releasing the locks.
This allows you to get a raw pointer to THIS_MODULE for use in unsafe code. The Rust Binder RFC uses it when defining fops for the binderfs component [1].
BPF
v1: libbpf: Correct debug message in btf__load_vmlinux_btf
In the function btf__load_vmlinux_btf, the debug message incorrectly refers to ‘path’ instead of ‘sysfs_btf_path’.
v1: bpf-next: selftests/bpf: extend uprobe/uretprobe triggering benchmarks
Settle on three “flavors” of uprobe/uretprobe, installed on different kinds of instruction: nop, push, and ret. All three are testing
v1: dwarves: btf_encoder: dynamically allocate the vars array for percpu variables
Use consistent method across allocating function and per-cpu variable representations, based around (re)allocating the arrays based on demand. This avoids issues where the number of per-CPU variables exceeds the hardcoded limit.
v2: net: raise RCU qs after each threaded NAPI poll
We noticed RCU Tasks grace periods being blocked when threaded NAPIs are very busy under load: detaching any BPF tracing program, i.e. removing an ftrace trampoline, simply blocks for a very long time in rcu_tasks_wait_gp.
v2: net-next: Use per-task storage for XDP-redirects on PREEMPT_RT
In [0] I introduced explicit locking for resources which are otherwise implicitly locked by local_bh_disable(); this protection goes away if the lock in local_bh_disable() is removed on PREEMPT_RT.
v1: tools/testing/selftests/bpf/test_tc_tunnel.sh: Prevent client connect before server bind
On some systems, the netcat server can be slow to start listening. When this happens, the test can randomly fail at various points.
v1: bpf-next: bpftool: Mount bpffs on provided dir instead of parent dir
When pinning programs/objects under PATH (e.g. during “bpftool prog loadall”), the bpffs is mounted on the parent dir of PATH in the following situations:
- the given dir exists but it is not bpffs.
- the given dir doesn’t exist and the parent dir is not bpffs.
v3: vhost: virtio: drivers maintain dma info for premapped vq
As discussed: http://lore.kernel.org/all/CACGkMEvq0No8QGC46U4mGsMtuD44fD_cfLcPaVmJ3rHYqRZxYg@mail.gmail.com
v6: bpf-next: Create shadow types for struct_ops maps in skeletons
This patchset allows skeleton users to change the values of the fields in struct_ops maps at runtime. It will create a shadow type pointer in a skeleton for each struct_ops map, allowing users to access the values of fields through these pointers.
v1: bpf: Choose RCU Tasks based on TASKS_RCU rather than PREEMPTION
The advent of CONFIG_PREEMPT_AUTO, AKA lazy preemption, will mean that even kernels built with CONFIG_PREEMPT_NONE or CONFIG_PREEMPT_VOLUNTARY might see the occasional preemption, and that this preemption just might happen within a trampoline.
v2: net-next: tun: AF_XDP Tx zero-copy support
Now, some drivers support the zero-copy feature of AF_XDP sockets, which can significantly reduce CPU utilization for XDP programs.
RFC v2: bpf-next: bpf: Introduce kprobe multi wrapper attach
Adds support for attaching both entry and return BPF programs on a single kprobe multi link. The first RFC patchset is in [0].
v2: perf lock contention: Account contending locks too
Currently it accounts for contention using the delta between the timestamps of the lock:contention_begin and lock:contention_end tracepoints. This means a lock must see both events during the monitoring period to be counted.
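The difference between the two accounting schemes can be sketched in a few lines. This is a simplified model and not the perf implementation: the Event type, the account function, the window_end parameter and the single-waiter-per-lock simplification are all assumptions made for the example. Charging a still-pending contention up to the end of the monitoring window is one plausible way to count it.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified trace event; real tracepoints carry more fields.
#[derive(Clone, Copy)]
enum Event {
    Begin { lock: u64, ts: u64 },
    End { lock: u64, ts: u64 },
}

/// Returns the total wait time per lock address.
/// Assumes at most one waiter per lock at a time, to keep the sketch short.
/// With `include_pending`, a lock that saw only a begin event is charged
/// up to `window_end`, the timestamp at which monitoring stopped.
fn account(events: &[Event], window_end: u64, include_pending: bool) -> HashMap<u64, u64> {
    let mut started: HashMap<u64, u64> = HashMap::new();
    let mut total: HashMap<u64, u64> = HashMap::new();

    for ev in events {
        match *ev {
            Event::Begin { lock, ts } => {
                started.insert(lock, ts);
            }
            Event::End { lock, ts } => {
                // Only a matched begin/end pair yields a delta.
                if let Some(begin) = started.remove(&lock) {
                    *total.entry(lock).or_insert(0) += ts.saturating_sub(begin);
                }
            }
        }
    }

    if include_pending {
        // Locks still contending when monitoring ended.
        for (lock, begin) in started {
            *total.entry(lock).or_insert(0) += window_end.saturating_sub(begin);
        }
    }
    total
}

fn main() {
    let events = [
        Event::Begin { lock: 0xa, ts: 10 },
        Event::End { lock: 0xa, ts: 25 },
        Event::Begin { lock: 0xb, ts: 40 }, // never ends within the window
    ];
    let delta_only = account(&events, 100, false);
    let with_pending = account(&events, 100, true);
    println!("delta only: {} lock(s) reported", delta_only.len());
    println!("with pending: {} lock(s) reported", with_pending.len());
}
```

In the delta-only mode, lock 0xb is invisible even though it was contended for most of the window, which is the gap the summary above describes.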
v1: bpf-next: Support kCFI + BPF on arm64
On ARM64 with CONFIG_CFI_CLANG, CFI warnings can be triggered by running the bpf selftests. This is because the JIT doesn’t emit proper CFI prologues for BPF programs, callbacks, and struct_ops trampolines.
v12: net-next: Introducing P4TC (series 1)
This is the first patchset of two. In this patchset we are submitting 15 patches, which cover the minimal viable P4 PNA architecture.
Related Technology Updates
QEMU
v1: target/riscv: move ratified/frozen exts to non-experimental
smaia and ssaia were ratified on August 25th, 2023 [1].
zvfh and zvfhmin were ratified on August 2nd, 2023 [2].
What RISC-V tracing tools do you recommend, and how accurate are they for measurements?
Recently, I was planning to measure the performance of my application of interest for potential RISC-V hardware. I started my simulations with Spike to analyze dynamic instruction traces and instruction counts. However, since it does not support multithreading, I started using QEMU to test my app too.
v4: RISC-V: Modularize common match conditions for trigger
According to the ratified RISC-V Debug specification version 0.13 [1] (this also applies to version 1.0 [2], which has not been ratified yet), the enabled privilege levels of a trigger are a common match condition for all trigger types.
Hi, I would like to develop my OS on 128-bit RISC-V. After searching, it appears the support isn't fully operational.
How can I help, and at the same time learn 128-bit RISC-V?