I've faced a situation on one of my VPSes where kernel slab memory spontaneously started leaking.

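In `slabtop`, I found the culprit to be `kmalloc-64`. On its own this is fairly meaningless, since it only names a generic 64-byte allocation cache, but after some searching I found you can add `slub_debug=U` to the kernel command line (the `U` flag enables tracking of which code allocated and freed each object). You can then see the source of slab allocations of this type by viewing `/sys/kernel/slab/kmalloc-64/alloc_calls`.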
This pointed me to an issue with KVM's paravirtualised asynchronous page faults:
[20:48:53][root@kyubey][/sys/kernel/slab/kmalloc-64]# cat alloc_calls
27 x86_vector_alloc_irqs+0xf6/0x3b0 age=1061960/1062568/1063054 pid=0-92
15 mp_irqdomain_alloc+0x79/0x290 age=1063054/1063054/1063054 pid=0
138245 kvm_async_pf_task_wake+0x83/0x110 age=0/474464/1062430 pid=0-1517
31 reserve_memtype+0xb3/0x2c0 age=1060763/1062079/1063055 pid=0-273
24 __request_region+0x6e/0x190 age=1060825/1062237/1063050 pid=1-282
...
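With many call sites listed, it can help to sort them by allocation count so the leaking one stands out. The following is a minimal sketch, assuming the line format shown above (a count followed by a `symbol+offset/size` call site); the function names `top_allocators` and `read_alloc_calls` are my own, not part of any kernel tooling, and reading the file requires root and a kernel booted with `slub_debug=U`.

```python
import re

# Matches lines such as:
#   138245 kvm_async_pf_task_wake+0x83/0x110 age=0/474464/1062430 pid=0-1517
# Lines without a leading count (e.g. a literal "...") are skipped.
LINE_RE = re.compile(r"^\s*(\d+)\s+(\S+)")


def top_allocators(text, limit=5):
    """Return (count, call_site) pairs sorted by count, largest first."""
    entries = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            entries.append((int(match.group(1)), match.group(2)))
    entries.sort(key=lambda entry: entry[0], reverse=True)
    return entries[:limit]


def read_alloc_calls(cache="kmalloc-64"):
    """Read the alloc_calls file for one slab cache (hypothetical helper)."""
    with open(f"/sys/kernel/slab/{cache}/alloc_calls") as f:
        return f.read()
```

Calling `top_allocators(read_alloc_calls())` as root should then put the busiest call site first, which in my case would be `kvm_async_pf_task_wake`.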
My host has confirmed it is unlikely to be on their end, so I'm stumped as to where this came from out of the blue.
Anyway, a valid workaround is to add `no-kvmapf` to the kernel command line.
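After rebooting, it is worth confirming the parameter actually reached the kernel. The following is a small sketch that checks the contents of `/proc/cmdline`; the helper names `has_kernel_param` and `read_cmdline` are hypothetical. How you set the parameter depends on your bootloader; on a GRUB-based Debian install it is usually done via `GRUB_CMDLINE_LINUX` in `/etc/default/grub` followed by `update-grub`, but that is an assumption about your setup.

```python
def has_kernel_param(cmdline, name):
    """True if `name` appears as a bare flag or as name=value in cmdline."""
    for token in cmdline.split():
        if token == name or token.startswith(name + "="):
            return True
    return False


def read_cmdline(path="/proc/cmdline"):
    """Return the command line the running kernel was booted with."""
    with open(path) as f:
        return f.read()
```

For example, `has_kernel_param(read_cmdline(), "no-kvmapf")` should be true once the workaround is active, and the same check works for `slub_debug` while debugging.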
Comments
[…] around and I found a post where someone encountered exactly the same problem: https://darkimmortal.com/debian-10-kernel-slab-memory-leak/. This post documents a workaround of adding no-kvmapf to the Linux command line. Not sure if this […]
I ended up filing a kernel bug for this to see if the KVM team can look into it: https://bugzilla.kernel.org/show_bug.cgi?id=208081. If you have any other comments to add to the bug report, feel free to comment on it 🙂
Thank you for this post! I just encountered a very similar issue on one of my VPSes, and also found `kvm_async_pf_task_wake` to be the cause.
Is there any downside to adding `no-kvmapf` to the kernel command line?
Had it in place since this post with no obvious ill effect
I imagine the only impact is marginally higher CPU usage, which is the host’s problem 🙂
Thanks! I think I’ll try it out.
My situation is really strange… I have two VPSes configured identically (same Debian version, same kernel version, all the same software), and only one of them is exhibiting this behaviour. I contacted the host and they said there’s no difference between the nodes – they were physically built at the same time and use all the same software versions. I also posted on ServerFault earlier today (https://serverfault.com/questions/1020241/debugging-kmalloc-64-slab-allocations-memory-leak) before finding your post.