The kernel bug is described here: watchdog: BUG: soft lockup - CPU#6 stuck for 5718s! [wdavdaemon:1134] with 5.15.0-144-generic
The affected versions are 5.15.0-144-generic through 5.15.0-152-generic; the bug is fixed in 5.15.0-153-generic. Linux distributions that use kernel variants such as -aws have their own affected ranges, derived from the corresponding affected -generic versions: Comment 43 for bug 2118407
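As a rough illustration of the range above, the following sketch checks a kernel release string against the affected -generic versions. The function name is_affected_generic is hypothetical and not part of any FortiCNAPP tooling. It deliberately returns None for variant kernels such as -aws, because their affected ranges differ and are not listed in this article.

```python
import platform
import re

# Affected ABI numbers for 5.15.0-*-generic, as stated in this article.
FIRST_AFFECTED_ABI = 144
LAST_AFFECTED_ABI = 152

# Matches release strings such as "5.15.0-144-generic" or "5.15.0-1089-aws".
_RELEASE_RE = re.compile(r"^(\d+\.\d+\.\d+)-(\d+)-(\S+)$")


def is_affected_generic(release):
    """Hypothetical helper: classify a kernel release string.

    Returns True or False for 5.15.0-*-generic kernels, and None when the
    release is a variant (for example -aws), a different kernel series, or
    cannot be parsed, since this article gives no range for those cases.
    """
    match = _RELEASE_RE.match(release)
    if match is None:
        return None
    base, abi, flavour = match.group(1), int(match.group(2)), match.group(3)
    if base != "5.15.0" or flavour != "generic":
        return None
    return FIRST_AFFECTED_ABI <= abi <= LAST_AFFECTED_ABI


if __name__ == "__main__":
    # platform.release() returns the same string as `uname -r`.
    current = platform.release()
    print(current, "->", is_affected_generic(current))
```

A None result means the check is inconclusive and the affected range for that kernel variant should be looked up separately.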
An example of the system log when triggered:
2025-10-03 18:53:40 *redacted* kernel: watchdog: BUG: soft lockup - CPU#5 stuck for 26s! [datacollector:1862]
2025-10-03 18:53:40 *redacted* kernel: Modules linked in: ip6table_filter ip6_tables raw_diag unix_diag af_packet_diag netlink_diag binfmt_misc udp_diag tcp_diag inet_diag xt_conntrack xt_MASQUERADE nf_conntrack_netlink xfrm_user xfrm_algo iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_addrtype iptable_filter bpfilter br_netfilter bridge stp llc aufs nf_tables libcrc32c nfnetlink overlay ppdev crct10dif_pclmul crc32_pclmul ghash_clmulni_intel sha256_ssse3 sha1_ssse3 aesni_intel crypto_simd parport_pc cryptd parport input_leds ena psmouse serio_raw sch_fq_codel drm efi_pstore sunrpc ip_tables x_tables autofs4
2025-10-03 18:53:40 *redacted* kernel: CPU: 5 PID: 1862 Comm: datacollector Not tainted 5.15.0-1089-aws #96~20.04.1-Ubuntu
2025-10-03 18:53:40 *redacted* kernel: Hardware name: Amazon EC2 m5d.4xlarge/, BIOS 1.0 10/16/2017
2025-10-03 18:53:40 *redacted* kernel: RIP: 0010:do_task_stat+0xc1f/0xe40
2025-10-03 18:53:40 *redacted* kernel: Code: ff 31 d2 e9 9e fa ff ff 8b 4d a8 4c 8b a3 10 02 00 00 4c 8b bb 18 02 00 00 4c 8b ab c8 01 00 00 48 89 d0 4c 03 a0 00 0b 00 00 <4c> 03 b8 08 0b 00 00 4c 03 a8 c0 0a 00 00 48 8b 80 70 0a 00 00 48
2025-10-03 18:53:40 *redacted* kernel: RSP: 0018:ffff9fde80e13b48 EFLAGS: 00000216
2025-10-03 18:53:40 *redacted* kernel: RAX: ffff8d0b84dd8000 RBX: ffff8d0b855e1200 RCX: 0000000000000000
2025-10-03 18:53:40 *redacted* kernel: RDX: ffff8d0dbbc64100 RSI: 0000000000000001 RDI: ffff8d0b953feb40
2025-10-03 18:53:40 *redacted* kernel: RBP: ffff9fde80e13c58 R08: 0000000000000000 R09: ffff8d0b81238e10
2025-10-03 18:53:40 *redacted* kernel: R10: 0000000000020000 R11: 000000000000000b R12: 000018d7ec9f6b55
2025-10-03 18:53:40 *redacted* kernel: R13: 0000000000000000 R14: 0000000000000132 R15: 0000000000000000
2025-10-03 18:53:40 *redacted* kernel: FS: 00007f399dbde700(0000) GS:ffff8d1a4db40000(0000) knlGS:0000000000000000
2025-10-03 18:53:40 *redacted* kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
2025-10-03 18:53:40 *redacted* kernel: CR2: 000000c02c31f000 CR3: 0000000125a96006 CR4: 00000000007706e0
2025-10-03 18:53:40 *redacted* kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
2025-10-03 18:53:40 *redacted* kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
2025-10-03 18:53:40 *redacted* kernel: PKRU: 55555554
2025-10-03 18:53:40 *redacted* kernel: Call Trace:
2025-10-03 18:53:40 *redacted* kernel: <TASK>
2025-10-03 18:53:40 *redacted* kernel: ? mod_objcg_state+0x185/0x340
2025-10-03 18:53:40 *redacted* kernel: proc_tgid_stat+0x14/0x20
2025-10-03 18:53:40 *redacted* kernel: proc_single_show+0x52/0xc0
2025-10-03 18:53:40 *redacted* kernel: seq_read_iter+0x124/0x450
2025-10-03 18:53:40 *redacted* kernel: ? free_swap_slot+0x74/0x130
2025-10-03 18:53:40 *redacted* kernel: seq_read+0xfd/0x150
2025-10-03 18:53:40 *redacted* kernel: ? srso_alias_safe_ret+0xfbefc/0xfbefc
2025-10-03 18:53:40 *redacted* kernel: vfs_read+0xa0/0x1a0
2025-10-03 18:53:40 *redacted* kernel: ksys_read+0x67/0xf0
2025-10-03 18:53:40 *redacted* kernel: __x64_sys_read+0x1a/0x20
2025-10-03 18:53:40 *redacted* kernel: x64_sys_call+0x1dba/0x1fa0
2025-10-03 18:53:40 *redacted* kernel: do_syscall_64+0x54/0xb0
2025-10-03 18:53:40 *redacted* kernel: ? handle_mm_fault+0xd8/0x2c0
2025-10-03 18:53:40 *redacted* kernel: ? exit_to_user_mode_prepare+0x3d/0x1c0
2025-10-03 18:53:40 *redacted* kernel: ? do_user_addr_fault+0x1e0/0x610
2025-10-03 18:53:40 *redacted* kernel: ? irqentry_exit_to_user_mode+0xe/0x20
2025-10-03 18:53:40 *redacted* kernel: ? irqentry_exit+0x1d/0x30
2025-10-03 18:53:40 *redacted* kernel: ? exc_page_fault+0x89/0x170
2025-10-03 18:53:40 *redacted* kernel: entry_SYSCALL_64_after_hwframe+0x6c/0xd6
2025-10-03 18:53:40 *redacted* kernel: RIP: 0033:0x55b03130686e
2025-10-03 18:53:40 *redacted* kernel: Code: 24 28 44 8b 44 24 2c e9 70 ff ff ff cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 49 89 f2 48 89 fa 48 89 ce 48 89 df 0f 05 <48> 3d 01 f0 ff ff 76 15 48 f7 d8 48 89 c1 48 c7 c0 ff ff ff ff 48
2025-10-03 18:53:40 *redacted* kernel: RSP: 002b:000000c0530049f0 EFLAGS: 00000212 ORIG_RAX: 0000000000000000
2025-10-03 18:53:40 *redacted* kernel: RAX: ffffffffffffffda RBX: 000000000000007e RCX: 000055b03130686e
2025-10-03 18:53:40 *redacted* kernel: RDX: 0000000000000200 RSI: 000000c02c31f000 RDI: 000000000000007e
2025-10-03 18:53:40 *redacted* kernel: RBP: 000000c053004a30 R08: 0000000000000000 R09: 0000000000000000
2025-10-03 18:53:40 *redacted* kernel: R10: 0000000000000000 R11: 0000000000000212 R12: 000000c053004b60
2025-10-03 18:53:40 *redacted* kernel: R13: 0000000000000010 R14: 000000c0055b8700 R15: 000000c02c320410
2025-10-03 18:53:40 *redacted* kernel: </TASK>
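To find similar entries in a collected system log, a minimal sketch such as the one below could be used. It assumes the log line format shown in the example above; the function name find_soft_lockups and the returned field names are hypothetical. Attributing a do_task_stat RIP line to the most recent soft lockup report is a simplification, since reports from several CPUs may interleave in a real log.

```python
import re

# Matches the watchdog line, e.g.
# "watchdog: BUG: soft lockup - CPU#5 stuck for 26s! [datacollector:1862]"
SOFT_LOCKUP_RE = re.compile(
    r"watchdog: BUG: soft lockup - CPU#(?P<cpu>\d+) "
    r"stuck for (?P<secs>\d+)s! \[(?P<comm>.+?):(?P<pid>\d+)\]"
)

# Matches the kernel RIP line that points into do_task_stat.
DO_TASK_STAT_RE = re.compile(r"RIP: \w+:do_task_stat\+")


def find_soft_lockups(lines):
    """Hypothetical helper: collect soft lockup reports from log lines."""
    hits = []
    for line in lines:
        match = SOFT_LOCKUP_RE.search(line)
        if match:
            hits.append({
                "cpu": int(match.group("cpu")),
                "seconds": int(match.group("secs")),
                "comm": match.group("comm"),
                "pid": int(match.group("pid")),
                "in_do_task_stat": False,
            })
        elif hits and DO_TASK_STAT_RE.search(line):
            # Simplification: assume this RIP line belongs to the latest report.
            hits[-1]["in_do_task_stat"] = True
    return hits
```

Reports where in_do_task_stat is true resemble the trace shown above, in which the stuck task was reading a /proc stat file.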
It is believed that this can affect any version of the FortiCNAPP agent; however, at the time of writing, it has only been observed on v7.7.0 and v7.8.0.
Working around this issue requires an upgrade (or downgrade) to an unaffected kernel version.
FortiCNAPP customers planning to upgrade Linux versions are strongly advised to avoid the affected kernel versions.
It is worth noting that the bug was introduced by the patch for a CVE (linked below), which describes very specific conditions under which accessing /proc/pid/stat could cause a 'CPU hard lockup':
CVE-2024-26686 Detail
It is not known whether the 'CPU soft lockup' documented here has the same requirements for being triggered.