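1. Problem description
When the host runs Kylin V10 as its host OS, performing a soft shutdown of a virtual machine crashes the host. The call stack at the time of the crash is as follows: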
[ 2188.225012] Process CPU 0/KVM (pid: 11384, stack limit = 0x000000005889c77d)
[ 2188.225671] CPU: 13 PID: 11384 Comm: CPU 0/KVM Kdump: loaded Tainted: G OE 4.19.90-52.44.v2207.fortest.ky10.aarch64 #1
[ 2188.226744] Source Version: 9de2a08e7b57c09e80a10421b6798d000d26a533
[ 2188.227412] Hardware name: GreatWall \xe6\x93\x8e\xe5\xa4\xa9DF723/N/A, BIOS KunLun BIOS V4.0 03/17/2021
[ 2188.228157] pstate: 80000085 (Nzcv daIf -PAN -UAO)
[ 2188.228607] pc : queued_spin_lock_slowpath+0x190/0x308
[ 2188.229104] lr : update_lpi_config+0x160/0x168
[ 2188.229503] sp : ffffbeba9ff13790
[ 2188.229821] x29: ffffbeba9ff13790 x28: 0000000000000000
[ 2188.230267] x27: 0000000000000000 x26: ffffbeba9ff139c0
[ 2188.230736] x25: 0000000000000000 x24: 0000000000000000
[ 2188.231309] x23: 0000000000000001 x22: 0000000000000000
[ 2188.231791] x21: ffffc2b986a00000 x20: 00000000e87ab700
[ 2188.232309] x19: ffffc13ce87a9b80 x18: ffff8000c0024390
[ 2188.232834] x17: 0000fffe613ce498 x16: ffff00000811e570
[ 2188.233336] x15: 0000fffd00290007 x14: ffff8000cada6cf8
[ 2188.233862] x13: ffff8000cada6b38 x12: 0000000000000000
[ 2188.234363] x11: 0000000000000040 x10: ffff0000098ef8b0
[ 2188.234879] x9 : 000000000808008c x8 : 0000000000000000
[ 2188.235416] x7 : ffff4cf9f0373b48 x6 : 0000000000380000
[ 2188.235900] x5 : ffffc53d36082600 x4 : 000001a400000004
[ 2188.236359] x3 : ffff4cf9effd2620 x2 : ffffc53d36082600
[ 2188.236836] x1 : ffff4cf9effd2000 x0 : ffffc53d36082608
[ 2188.237311] Call trace:
[ 2188.237558] queued_spin_lock_slowpath+0x190/0x308
[ 2188.237980] update_lpi_config+0x160/0x168
[ 2188.238338] vgic_its_process_commands.part.11+0x898/0x9b0
[ 2188.238819] vgic_mmio_write_its_cwriter+0xa4/0xa8
[ 2188.239242] dispatch_mmio_write+0x94/0x110
[ 2188.239624] __kvm_io_bus_write.isra.27+0xa4/0x158
[ 2188.240062] kvm_io_bus_write+0x68/0x90
[ 2188.240430] io_mem_abort+0xd8/0x350
[ 2188.240758] kvm_handle_guest_abort+0x2a8/0x478
[ 2188.241269] handle_exit+0x184/0x368
[ 2188.241635] kvm_arch_vcpu_ioctl_run+0x250/0x850
[ 2188.242044] kvm_vcpu_ioctl+0x460/0x880
[ 2188.242401] do_vfs_ioctl+0xb0/0x8e8
[ 2188.242733] ksys_ioctl+0x8c/0xa0
[ 2188.243055] sys_ioctl+0x34/0xa0
[ 2188.243373] __sys_trace_return+0x0/0x4
2. Affected packages
Kylin Linux Advanced Server V10 SP3 2303 aarch64
4.19.90-52.44.v2207
Kylin Linux Advanced Server V10 SP3 2403 aarch64
4.19.90-89.18.v2401 to 4.19.90-89.19.v2401
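To check whether a given host falls inside the affected list, the running kernel release can be compared against the versions above. The following is a minimal sketch, not an official tool: the helper name `is_affected` is hypothetical, the range is assumed to cover exactly the two builds 89.18 and 89.19, and the release string is assumed to start with the build version followed by a dot (as in the `4.19.90-52.44.v2207.fortest.ky10.aarch64` string seen in the trace).

```python
import platform

# Kernel builds listed as affected in this notice.
# Assumption: the 89.18 to 89.19 range contains only these two builds.
AFFECTED_BUILDS = (
    "4.19.90-52.44.v2207",
    "4.19.90-89.18.v2401",
    "4.19.90-89.19.v2401",
)


def is_affected(release: str, machine: str = "aarch64") -> bool:
    """Return True if the kernel release string matches an affected build.

    Hypothetical helper: it matches on the version prefix only and
    requires the aarch64 architecture named in the notice.
    """
    if machine != "aarch64":
        return False
    return any(
        release == build or release.startswith(build + ".")
        for build in AFFECTED_BUILDS
    )


if __name__ == "__main__":
    # platform.release() returns the same string as `uname -r`.
    release = platform.release()
    machine = platform.machine()
    state = "affected" if is_affected(release, machine) else "not in the affected list"
    print(f"{release} ({machine}): {state}")
```

A prefix match is used so that suffixes such as `.ky10.aarch64` do not have to be parsed; a host reporting a build outside this list should be checked against the vendor's current notice rather than assumed safe.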
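3. How to reproduce
Start a virtual machine on a host running the Kylin operating system. Then either choose Power -> Shut Down inside the virtual machine, or shut the virtual machine down with the virsh shutdown command. The host crashes.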
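4. Analysis
The problem was introduced by the patch that fixes CVE-2024-26598 in the upstream community, ad362fe07fec ("KVM: arm64: vgic-its: Avoid potential UAF in LPI translation cache"). That patch takes an additional reference on the interrupt variable irq, but the reference is dropped too early, so irq already points to an invalid address when the system later accesses it. The fix releases irq at the correct point, which prevents the reference to an invalid memory address and the resulting host crash.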
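The ordering problem described above can be illustrated with a toy model. This is a sketch under stated assumptions, not the kernel code: `Irq`, `buggy_path`, and `fixed_path` are hypothetical names, the "free" is only a flag, and the owner dropping its reference is simulated at a fixed point instead of happening concurrently.

```python
class UseAfterFree(Exception):
    """Stands in for dereferencing a pointer to freed memory."""


class Irq:
    """Hypothetical reference-counted interrupt descriptor (toy model)."""

    def __init__(self):
        self.refcount = 1   # reference held by the owner
        self.freed = False
        self.priority = 0

    def get(self):
        if self.freed:
            raise UseAfterFree("get() on freed irq")
        self.refcount += 1
        return self

    def put(self):
        if self.freed:
            raise UseAfterFree("put() on freed irq")
        self.refcount -= 1
        if self.refcount == 0:
            self.freed = True   # models the object being freed

    def update_config(self, priority):
        if self.freed:
            raise UseAfterFree("config update on freed irq")
        self.priority = priority


def buggy_path(irq):
    """Drops the extra reference before the last use of irq."""
    ref = irq.get()
    ref.put()               # released too early
    irq.put()               # owner drops its reference: object is freed
    ref.update_config(5)    # touches freed object, raises UseAfterFree
    return ref.priority


def fixed_path(irq):
    """Keeps the extra reference until after the last use of irq."""
    ref = irq.get()
    irq.put()               # owner drops its reference: object stays alive
    ref.update_config(5)    # safe, our reference is still held
    priority = ref.priority
    ref.put()               # released at the correct point
    return priority
```

In `buggy_path` the object is gone by the time `update_config` runs, which corresponds to the invalid access behind the crash; `fixed_path` differs only in where the final `put()` is placed.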
5. Patch and download location
The issue is fixed by updating to a new kernel.
6. Fix and update procedure
Run the following command with root privileges:
yum update kernel