Mirror of https://github.com/torvalds/linux.git
Synced 2024-11-01 04:53:36 +01:00
c4d91e225f
Patch series "introduce VMA merge mode to improve brk() performance".
A ~5% performance regression was discovered in
aim9.brk_test.ops_per_sec by the Linux kernel test bot [0].
In the past to satisfy brk() performance we duplicated VMA expansion code
and special-cased do_brk_flags(). This is however horrid and undoes work
to abstract this logic, so in resolving the issue I have endeavoured to
avoid this.
Investigating further I was able to observe that the use of a
vma_iter_next_range() and vma_prev() pair causes an unnecessary maple
tree walk. In addition there is work that we do that is simply
unnecessary for brk().
Therefore, add a special VMA merge mode VMG_FLAG_JUST_EXPAND to avoid
doing any of this - it assumes the VMA iterator is pointing at the
previous VMA and skips logic that brk() does not require.
This mostly eliminates the performance regression, reducing it to ~2%, which
is in the realm of noise. In addition, the will-it-scale test brk2,
written to be more representative of real-world brk() usage, shows a
modest performance improvement - which gives me confidence that we are not
meaningfully regressing real workloads here.
This series includes a test asserting that the 'just expand' mode works as
expected.
With many thanks to Oliver Sang for helping with performance testing of
candidate patch sets!
[0]:https://lore.kernel.org/linux-mm/202409301043.629bea78-oliver.sang@intel.com
This patch (of 2):
We know in advance that do_brk_flags() wants only to perform a VMA
expansion (if the prior VMA is compatible), and we assume that no
mergeable VMA follows it.
These were the semantics of this function prior to the recent rewrite of
the VMA merging logic; however, we are now doing more work than necessary -
positioning the VMA iterator at the prior VMA and performing tasks that
are not required.
Add a new field to the vmg struct to permit merge flags and add a new
merge flag VMG_FLAG_JUST_EXPAND which implies this behaviour, and have
do_brk_flags() use this.
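
As a rough illustration of the idea only, the sketch below models a merge
descriptor carrying a flags field and a 'just expand' flag in plain userspace
C. It is not the kernel's code: every toy_* and TOY_* name, the array standing
in for the maple tree, and the lookups counter are assumptions made up for this
example. It shows only the shape of the change, namely that a caller which
already knows the iterator sits at the previous VMA can skip the neighbour
lookups.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a VMA: the range [start, end) plus flags. */
struct toy_vma {
	unsigned long start;
	unsigned long end;
	unsigned long flags;
};

/* Hypothetical merge flags, modelled on the description above. */
enum toy_merge_flags {
	TOY_VMG_FLAG_DEFAULT = 0,
	/* Caller promises we are already positioned at the previous VMA. */
	TOY_VMG_FLAG_JUST_EXPAND = 1 << 0,
};

/* Hypothetical merge descriptor with the new flags field. */
struct toy_vmg {
	struct toy_vma *vmas;	/* sorted array standing in for the tree */
	size_t nr;		/* number of entries in vmas */
	size_t prev_idx;	/* index of the VMA before the new range */
	unsigned long start;	/* proposed new range is [start, end) */
	unsigned long end;
	unsigned long flags;	/* flags the new range would have */
	enum toy_merge_flags merge_flags;
	unsigned int lookups;	/* counts simulated neighbour lookups */
};

/* Simulated lookup; each call stands in for repositioning the iterator. */
static struct toy_vma *toy_lookup(struct toy_vmg *vmg, size_t idx)
{
	vmg->lookups++;
	return idx < vmg->nr ? &vmg->vmas[idx] : NULL;
}

/* Try to expand the previous VMA over [start, end); true on success. */
static bool toy_merge_new_range(struct toy_vmg *vmg)
{
	bool just_expand = vmg->merge_flags & TOY_VMG_FLAG_JUST_EXPAND;
	struct toy_vma *prev;

	assert(vmg->prev_idx < vmg->nr);
	prev = &vmg->vmas[vmg->prev_idx];

	if (!just_expand) {
		/*
		 * Default mode: inspect the following VMA, then step back
		 * to the previous one. This toy never merges with next; it
		 * only counts the extra lookups.
		 */
		(void)toy_lookup(vmg, vmg->prev_idx + 1);
		prev = toy_lookup(vmg, vmg->prev_idx);
	}

	/* Expansion needs an adjacent, compatible previous VMA. */
	if (prev->end != vmg->start || prev->flags != vmg->flags)
		return false;

	prev->end = vmg->end;
	return true;
}
```

In this toy, a caller that sets TOY_VMG_FLAG_JUST_EXPAND leaves the lookups
counter untouched, while the default path increments it twice for the same
outcome.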
This fixes a reported performance regression in a brk() benchmarking suite.
Link: https://lkml.kernel.org/r/cover.1729174352.git.lorenzo.stoakes@oracle.com
Link: https://lkml.kernel.org/r/4e65d4395e5841c5acf8470dbcb714016364fd39.1729174352.git.lorenzo.stoakes@oracle.com
Fixes:
Name | ||
---|---|---|
damon | ||
kasan | ||
kfence | ||
kmsan | ||
backing-dev.c | ||
balloon_compaction.c | ||
bootmem_info.c | ||
cma.c | ||
cma.h | ||
cma_debug.c | ||
cma_sysfs.c | ||
compaction.c | ||
debug.c | ||
debug_page_alloc.c | ||
debug_page_ref.c | ||
debug_vm_pgtable.c | ||
dmapool.c | ||
dmapool_test.c | ||
early_ioremap.c | ||
execmem.c | ||
fadvise.c | ||
fail_page_alloc.c | ||
failslab.c | ||
filemap.c | ||
folio-compat.c | ||
gup.c | ||
gup_test.c | ||
gup_test.h | ||
highmem.c | ||
hmm.c | ||
huge_memory.c | ||
hugetlb.c | ||
hugetlb_cgroup.c | ||
hugetlb_vmemmap.c | ||
hugetlb_vmemmap.h | ||
hwpoison-inject.c | ||
init-mm.c | ||
internal.h | ||
interval_tree.c | ||
io-mapping.c | ||
ioremap.c | ||
Kconfig | ||
Kconfig.debug | ||
khugepaged.c | ||
kmemleak.c | ||
ksm.c | ||
list_lru.c | ||
maccess.c | ||
madvise.c | ||
Makefile | ||
mapping_dirty_helpers.c | ||
memblock.c | ||
memcontrol-v1.c | ||
memcontrol-v1.h | ||
memcontrol.c | ||
memfd.c | ||
memory-failure.c | ||
memory-tiers.c | ||
memory.c | ||
memory_hotplug.c | ||
mempolicy.c | ||
mempool.c | ||
memremap.c | ||
memtest.c | ||
migrate.c | ||
migrate_device.c | ||
mincore.c | ||
mlock.c | ||
mm_init.c | ||
mm_slot.h | ||
mmap.c | ||
mmap_lock.c | ||
mmu_gather.c | ||
mmu_notifier.c | ||
mmzone.c | ||
mprotect.c | ||
mremap.c | ||
mseal.c | ||
msync.c | ||
nommu.c | ||
numa.c | ||
numa_emulation.c | ||
numa_memblks.c | ||
oom_kill.c | ||
page-writeback.c | ||
page_alloc.c | ||
page_counter.c | ||
page_ext.c | ||
page_idle.c | ||
page_io.c | ||
page_isolation.c | ||
page_owner.c | ||
page_poison.c | ||
page_reporting.c | ||
page_reporting.h | ||
page_table_check.c | ||
page_vma_mapped.c | ||
pagewalk.c | ||
percpu-internal.h | ||
percpu-km.c | ||
percpu-stats.c | ||
percpu-vm.c | ||
percpu.c | ||
pgalloc-track.h | ||
pgtable-generic.c | ||
process_vm_access.c | ||
ptdump.c | ||
readahead.c | ||
rmap.c | ||
rodata_test.c | ||
secretmem.c | ||
shmem.c | ||
shmem_quota.c | ||
show_mem.c | ||
shrinker.c | ||
shrinker_debug.c | ||
shuffle.c | ||
shuffle.h | ||
slab.h | ||
slab_common.c | ||
slub.c | ||
sparse-vmemmap.c | ||
sparse.c | ||
swap.c | ||
swap.h | ||
swap_cgroup.c | ||
swap_slots.c | ||
swap_state.c | ||
swapfile.c | ||
truncate.c | ||
usercopy.c | ||
userfaultfd.c | ||
util.c | ||
vma.c | ||
vma.h | ||
vma_internal.h | ||
vmalloc.c | ||
vmpressure.c | ||
vmscan.c | ||
vmstat.c | ||
workingset.c | ||
z3fold.c | ||
zbud.c | ||
zpool.c | ||
zsmalloc.c | ||
zswap.c |