View Issue Details

ID:              0008451
Project:         Rocky-Linux-9
Category:        kernel
View Status:     public
Last Update:     2024-12-03 15:39
Reporter:        Steve Rast
Assigned To:
Priority:        normal
Severity:        crash
Reproducibility: random
Status:          new
Resolution:      open
Platform:        x86_64
OS:              Rocky Linux
OS Version:      9.5
Summary:         0008451: nfs-server hard reboot server
Description: Since upgrading from Rocky Linux 9.4 to Rocky Linux 9.5, several NFS servers are randomly hard rebooting. I can see this:

[251118.198708] perf: interrupt took too long (2524 > 2500), lowering kernel.perf_event_max_sample_rate to 79000
[256533.825978] perf: interrupt took too long (3166 > 3155), lowering kernel.perf_event_max_sample_rate to 63000
[279965.977293] perf: interrupt took too long (3965 > 3957), lowering kernel.perf_event_max_sample_rate to 50000
[326621.176722] ------------[ cut here ]------------
[326621.176728] WARNING: CPU: 18 PID: 3270 at mm/slab_common.c:957 free_large_kmalloc+0x5a/0x80
[326621.176739] Modules linked in: tls binfmt_misc dm_service_time iscsi_tcp libiscsi_tcp libiscsi rpcrdma rdma_cm iw_cm ib_cm ib_core scsi_transport_iscsi nft_objref nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink vfat fat dm_multipath intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel ipmi_ssif dm_mod kvm dell_wmi_descriptor sparse_keymap rfkill video iTCO_wdt rapl intel_cstate mxm_wmi mei_me dcdbas mei intel_uncore iTCO_vendor_support ipmi_si joydev acpi_power_meter ipmi_devintf ipmi_msghandler pcspkr lpc_ich nfsd nfs_acl lockd auth_rpcgss grace sunrpc xfs libcrc32c sr_mod sd_mod cdrom t10_pi sg mgag200 uas usb_storage drm_kms_helper ahci libahci drm_shmem_helper crct10dif_pclmul crc32_pclmul drm ixgbe crc32c_intel libata
[326621.176795] igb ghash_clmulni_intel megaraid_sas i2c_algo_bit mdio dca wmi fuse
[326621.176801] CPU: 18 PID: 3270 Comm: nfsd Kdump: loaded Not tainted 5.14.0-503.14.1.el9_5.x86_64 #1
[326621.176804] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.19.0 12/12/2023
[326621.176806] RIP: 0010:free_large_kmalloc+0x5a/0x80
[326621.176811] Code: da 9c 5b fa be 06 00 00 00 48 89 ef e8 af 25 0a 00 80 e7 02 74 01 fb 48 83 c4 08 44 89 e6 48 89 ef 5b 5d 41 5c e9 d6 28 04 00 <0f> 0b 45 31 e4 80 3d 43 0e fc 01 00 ba 00 f0 ff ff 0f 84 fb 9a 90
[326621.176813] RSP: 0018:ffffb6f9092bb968 EFLAGS: 00010246
[326621.176815] RAX: 0017ffffc0001000 RBX: ffffffff8c31e2e0 RCX: ffff94a6c40f9220
[326621.176816] RDX: fffff42b8cb69608 RSI: ffffffff8b058378 RDI: fffff42b8cb69600
[326621.176818] RBP: fffff42b8cb69600 R08: ffffffff8ca06440 R09: ffff94a9afc744b0
[326621.176819] R10: 00000000000003c8 R11: 0000000000000000 R12: ffffffff8b058378
[326621.176820] R13: 0000000000000000 R14: ffff94a68942ae00 R15: ffff94a9e567c000
[326621.176822] FS: 0000000000000000(0000) GS:ffff94a9afc40000(0000) knlGS:0000000000000000
[326621.176824] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[326621.176825] CR2: 00005626630c5140 CR3: 000000032f410001 CR4: 00000000003706f0
[326621.176827] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[326621.176828] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[326621.176829] Call Trace:
[326621.176831] <TASK>
[326621.176833] ? show_trace_log_lvl+0x1c4/0x2df
[326621.176839] ? show_trace_log_lvl+0x1c4/0x2df
[326621.176842] ? security_release_secctx+0x28/0x40
[326621.176846] ? free_large_kmalloc+0x5a/0x80
[326621.176849] ? __warn+0x7e/0xd0
[326621.176852] ? free_large_kmalloc+0x5a/0x80
[326621.176855] ? report_bug+0x100/0x140
[326621.176859] ? handle_bug+0x3c/0x70
[326621.176862] ? exc_invalid_op+0x14/0x70
[326621.176864] ? asm_exc_invalid_op+0x16/0x20
[326621.176868] ? lookup_dcache+0x18/0x60
[326621.176872] ? lookup_dcache+0x18/0x60
[326621.176875] ? free_large_kmalloc+0x5a/0x80
[326621.176878] ? lookup_dcache+0x18/0x60
[326621.176880] security_release_secctx+0x28/0x40
[326621.176883] nfsd4_encode_fattr4+0x2cc/0x4f0 [nfsd]
[326621.176955] ? avc_has_perm_noaudit+0x94/0x110
[326621.176959] ? selinux_inode_permission+0x10e/0x1d0
[326621.176964] ? __d_lookup+0x73/0xb0
[326621.176967] ? d_lookup+0x29/0x50
[326621.176969] ? lookup_dcache+0x18/0x60
[326621.176972] nfsd4_encode_entry4_fattr+0xcd/0x1e0 [nfsd]
[326621.177019] nfsd4_encode_entry4+0x100/0x290 [nfsd]
[326621.177072] nfsd_buffered_readdir+0x144/0x250 [nfsd]
[326621.177114] ? __pfx_nfsd4_encode_entry4+0x10/0x10 [nfsd]
[326621.177170] ? __pfx_nfsd_buffered_filldir+0x10/0x10 [nfsd]
[326621.177211] ? __pfx_nfsd4_encode_entry4+0x10/0x10 [nfsd]
[326621.177255] nfsd_readdir+0xa9/0xe0 [nfsd]
[326621.177296] nfsd4_encode_readdir+0xf8/0x1d0 [nfsd]
[326621.177341] nfsd4_encode_operation+0xa6/0x2b0 [nfsd]
[326621.177386] nfsd4_proc_compound+0x1d0/0x700 [nfsd]
[326621.177446] nfsd_dispatch+0xe9/0x220 [nfsd]
[326621.177487] svc_process_common+0x2e7/0x650 [sunrpc]
[326621.177583] ? __pfx_nfsd_dispatch+0x10/0x10 [nfsd]
[326621.177623] svc_process+0x12d/0x170 [sunrpc]
[326621.177691] svc_handle_xprt+0x448/0x580 [sunrpc]
[326621.177750] svc_recv+0x17a/0x2c0 [sunrpc]
[326621.177819] ? __pfx_nfsd+0x10/0x10 [nfsd]
[326621.177858] nfsd+0x84/0xb0 [nfsd]
[326621.177896] kthread+0xe0/0x100
[326621.177900] ? __pfx_kthread+0x10/0x10
[326621.177904] ret_from_fork+0x2c/0x50
[326621.177919] </TASK>
[326621.177920] ---[ end trace 0000000000000000 ]---
[326621.177922] object pointer: 0x00000000e53caba2
[326621.179321] BUG: unable to handle page fault for address: ffff94a86da58000
[326621.179324] #PF: supervisor write access in kernel mode
[326621.179327] #PF: error_code(0x0003) - permissions violation
[326621.179330] PGD 330801067 P4D 330801067 PUD 100207063 PMD 800000032da000a1
[326621.179337] Oops: 0003 [#1] PREEMPT SMP PTI
[326621.179341] CPU: 18 PID: 3270 Comm: nfsd Kdump: loaded Tainted: G W ------- --- 5.14.0-503.14.1.el9_5.x86_64 #1
[326621.179345] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.19.0 12/12/2023
[326621.179347] RIP: 0010:svc_process_common+0xe7/0x650 [sunrpc]
[326621.179466] Code: 00 00 48 c7 87 80 02 00 00 00 00 00 00 48 29 d0 48 c1 f8 03 c1 e0 0c 89 87 cc 02 00 00 4c 89 e7 e8 ce a9 00 00 48 85 c0 74 02 <89> 18 be 04 00 00 00 4c 89 e7 e8 ba a9 00 00 48 85 c0 74 06 c7 00
[326621.179468] RSP: 0018:ffffb6f9092bbe28 EFLAGS: 00010286
[326621.179470] RAX: ffff94a86da58000 RBX: 000000000bc19c07 RCX: ffff94a86da58000
[326621.179471] RDX: ffff94a9e567c2e8 RSI: 0000000000000004 RDI: ffff94a9e567c238
[326621.179472] RBP: ffff94a9e567c000 R08: ffff94a9e567c1a0 R09: 0000000000000000
[326621.179473] R10: 0000000000000006 R11: 0000000000001000 R12: ffff94a9e567c238
[326621.179474] R13: ffff94a9c23f8f00 R14: ffff94a9c23f8784 R15: ffff94a9e567c000
[326621.179475] FS: 0000000000000000(0000) GS:ffff94a9afc40000(0000) knlGS:0000000000000000
[326621.179477] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[326621.179478] CR2: ffff94a86da58000 CR3: 000000032f410001 CR4: 00000000003706f0
[326621.179479] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[326621.179480] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[326621.179481] Call Trace:
[326621.179483] <TASK>
[326621.179484] ? show_trace_log_lvl+0x1c4/0x2df
[326621.179488] ? show_trace_log_lvl+0x1c4/0x2df
[326621.179492] ? svc_process+0x12d/0x170 [sunrpc]
[326621.179547] ? __die_body.cold+0x8/0xd
[326621.179551] ? page_fault_oops+0x134/0x170
[326621.179554] ? kernelmode_fixup_or_oops+0x84/0x110
[326621.179557] ? exc_page_fault+0xa8/0x150
[326621.179561] ? asm_exc_page_fault+0x22/0x30
[326621.179565] ? svc_process_common+0xe7/0x650 [sunrpc]
[326621.179621] ? svc_process_common+0xe2/0x650 [sunrpc]
[326621.179678] svc_process+0x12d/0x170 [sunrpc]
[326621.179736] svc_handle_xprt+0x448/0x580 [sunrpc]
[326621.179796] svc_recv+0x17a/0x2c0 [sunrpc]
[326621.179856] ? __pfx_nfsd+0x10/0x10 [nfsd]
[326621.179896] nfsd+0x84/0xb0 [nfsd]
[326621.179936] kthread+0xe0/0x100
[326621.179940] ? __pfx_kthread+0x10/0x10
[326621.179943] ret_from_fork+0x2c/0x50
[326621.179947] </TASK>
[326621.179948] Modules linked in: tls binfmt_misc dm_service_time iscsi_tcp libiscsi_tcp libiscsi rpcrdma rdma_cm iw_cm ib_cm ib_core scsi_transport_iscsi nft_objref nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink vfat fat dm_multipath intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel ipmi_ssif dm_mod kvm dell_wmi_descriptor sparse_keymap rfkill video iTCO_wdt rapl intel_cstate mxm_wmi mei_me dcdbas mei intel_uncore iTCO_vendor_support ipmi_si joydev acpi_power_meter ipmi_devintf ipmi_msghandler pcspkr lpc_ich nfsd nfs_acl lockd auth_rpcgss grace sunrpc xfs libcrc32c sr_mod sd_mod cdrom t10_pi sg mgag200 uas usb_storage drm_kms_helper ahci libahci drm_shmem_helper crct10dif_pclmul crc32_pclmul drm ixgbe crc32c_intel libata
[326621.179989] igb ghash_clmulni_intel megaraid_sas i2c_algo_bit mdio dca wmi fuse
[326621.179994] CR2: ffff94a86da58000

I also saw this sometimes:

kernel:watchdog: BUG: soft lockup - CPU#2 stuck for 160s! [nfsd:5657]

It also happens with the latest kernel: 5.14.0-503.15.1.el9_5.x86_64.

I use NFSv4 with SSSD integrated with Active Directory. The clients mount via autofs. Everything was working fine until the upgrade to Rocky Linux 9.5; since then, some servers have been rebooting several times per day. It doesn't matter whether the machine boots via UEFI or BIOS, or whether it is a VM or a physical server.

I have the impression that the more NFS traffic is generated, the faster the servers crash.
Tags: No tags attached.

Activities

Simon Avery

2024-12-03 14:51

reporter   ~0009011

We also have this issue, or one very much like it.

We have a busy file server that was upgraded from Rocky 9.4 to 9.5 this morning at 05:40.

It came back up, but at 06:10 it stopped responding. The console was unresponsive, so we hard rebooted it via the VM controls.

Within 30 minutes of that, it had stopped again.

We hard rebooted again and this time chose the previous kernel. Since then (5h+), the VM has been 100% stable, as it had been before 9.5.

It serves files through both NFSv4 and SMB, using SSSD and AD for authentication.

I also get the impression that it's related to NFS traffic.

Our logs follow



Dec 3 05:53:41 redacted_hostname: [ 769.252982] ------------[ cut here ]------------
Dec 3 05:53:41 redacted_hostname: [ 769.252987] WARNING: CPU: 0 PID: 1115 at mm/slab_common.c:957 free_large_kmalloc+0x5a/0x80
Dec 3 05:53:41 redacted_hostname: [ 769.253003] Modules linked in: binfmt_misc rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache netfs rpcrdma rdma_cm iw_cm ib_cm ib_core rfkill nft_reject_ipv4 nf_reject_ipv4 nft_reject nft_counter nft_ct nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 vsock_loopback vmw_vsock_virtio_transport_common nf_tables vmw_vsock_vmci_transport vsock nfnetlink vmwgfx intel_rapl_msr vmw_balloon intel_rapl_common drm_ttm_helper ttm vmw_vmci pcspkr drm_kms_helper i2c_piix4 joydev nfsd nfs_acl lockd auth_rpcgss grace sunrpc drm xfs libcrc32c crct10dif_pclmul sd_mod crc32_pclmul ata_generic t10_pi crc32c_intel sg ghash_clmulni_intel ata_piix libata vmxnet3 vmw_pvscsi serio_raw dm_mirror dm_region_hash dm_log dm_mod fuse
Dec 3 05:53:41 redacted_hostname: [ 769.253054] CPU: 0 PID: 1115 Comm: nfsd Not tainted 5.14.0-503.15.1.el9_5.x86_64 #1
Dec 3 05:53:41 redacted_hostname: [ 769.253056] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
Dec 3 05:53:41 redacted_hostname: [ 769.253057] RIP: 0010:free_large_kmalloc+0x5a/0x80
Dec 3 05:53:41 redacted_hostname: [ 769.253060] Code: da 9c 5b fa be 06 00 00 00 48 89 ef e8 af 25 0a 00 80 e7 02 74 01 fb 48 83 c4 08 44 89 e6 48 89 ef 5b 5d 41 5c e9 d6 28 04 00 <0f> 0b 45 31 e4 80 3d d3 0d fc 01 00 ba 00 f0 ff ff 0f 84 8b 9a 90
Dec 3 05:53:41 redacted_hostname: [ 769.253062] RSP: 0018:ffffafef00d57b28 EFLAGS: 00010246
Dec 3 05:53:41 redacted_hostname: [ 769.253064] RAX: 0017ffffc0000000 RBX: ffffffff8cd1e2e0 RCX: ffff9fce52f7ac68
Dec 3 05:53:41 redacted_hostname: [ 769.253065] RDX: dead000000000100 RSI: ffffffffc096b47c RDI: ffffd9fd06c25ac0
Dec 3 05:53:41 redacted_hostname: [ 769.253065] RBP: ffffd9fd06c25ac0 R08: ffffffff8d4075e0 R09: ffff9fcf35e344b0
Dec 3 05:53:41 redacted_hostname: [ 769.253066] R10: 000000b30f218140 R11: 00000000004c82ea R12: ffffffffc096b47c
Dec 3 05:53:41 redacted_hostname: [ 769.253067] R13: 0000000000000000 R14: ffff9fce0a0c4a00 R15: ffff9fce01148000
Dec 3 05:53:41 redacted_hostname: [ 769.253068] FS: 0000000000000000(0000) GS:ffff9fcf35e00000(0000) knlGS:0000000000000000
Dec 3 05:53:41 redacted_hostname: [ 769.253070] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Dec 3 05:53:41 redacted_hostname: [ 769.253071] CR2: 00007efd2f51a654 CR3: 000000010902a002 CR4: 00000000007706f0
Dec 3 05:53:41 redacted_hostname: [ 769.253085] PKRU: 55555554
Dec 3 05:53:41 redacted_hostname: [ 769.253086] Call Trace:
Dec 3 05:53:41 redacted_hostname: [ 769.253087] <TASK>
Dec 3 05:53:41 redacted_hostname: [ 769.253088] ? srso_alias_return_thunk+0x5/0xfbef5
Dec 3 05:53:41 redacted_hostname: [ 769.253094] ? show_trace_log_lvl+0x26e/0x2df
Dec 3 05:53:41 redacted_hostname: [ 769.253101] ? show_trace_log_lvl+0x26e/0x2df
Dec 3 05:53:41 redacted_hostname: [ 769.253106] ? security_release_secctx+0x28/0x40
Dec 3 05:53:41 redacted_hostname: [ 769.253110] ? free_large_kmalloc+0x5a/0x80
Dec 3 05:53:41 redacted_hostname: [ 769.253113] ? __warn+0x7e/0xd0
Dec 3 05:53:41 redacted_hostname: [ 769.253116] ? free_large_kmalloc+0x5a/0x80
Dec 3 05:53:41 redacted_hostname: [ 769.253119] ? report_bug+0x100/0x140
Dec 3 05:53:41 redacted_hostname: [ 769.253124] ? handle_bug+0x3c/0x70
Dec 3 05:53:41 redacted_hostname: [ 769.253127] ? exc_invalid_op+0x14/0x70
Dec 3 05:53:41 redacted_hostname: [ 769.253130] ? asm_exc_invalid_op+0x16/0x20
Dec 3 05:53:41 redacted_hostname: [ 769.253134] ? _fh_update.part.0.isra.0+0x4c/0x90 [nfsd]
Dec 3 05:53:41 redacted_hostname: [ 769.253162] ? _fh_update.part.0.isra.0+0x4c/0x90 [nfsd]
Dec 3 05:53:41 redacted_hostname: [ 769.253184] ? free_large_kmalloc+0x5a/0x80
Dec 3 05:53:41 redacted_hostname: [ 769.253188] ? _fh_update.part.0.isra.0+0x4c/0x90 [nfsd]
Dec 3 05:53:41 redacted_hostname: [ 769.253206] security_release_secctx+0x28/0x40
Dec 3 05:53:41 redacted_hostname: [ 769.253209] nfsd4_encode_fattr4+0x2cc/0x4f0 [nfsd]
Dec 3 05:53:41 redacted_hostname: [ 769.253237] ? srso_alias_return_thunk+0x5/0xfbef5
Dec 3 05:53:41 redacted_hostname: [ 769.253239] ? __kmem_cache_alloc_node+0x18f/0x2e0
Dec 3 05:53:41 redacted_hostname: [ 769.253242] ? security_prepare_creds+0x71/0xa0
Dec 3 05:53:41 redacted_hostname: [ 769.253245] ? security_prepare_creds+0x71/0xa0
Dec 3 05:53:41 redacted_hostname: [ 769.253246] ? srso_alias_return_thunk+0x5/0xfbef5
Dec 3 05:53:41 redacted_hostname: [ 769.253248] ? __kmalloc+0x4b/0x140
Dec 3 05:53:41 redacted_hostname: [ 769.253250] ? srso_alias_return_thunk+0x5/0xfbef5
Dec 3 05:53:41 redacted_hostname: [ 769.253251] ? srso_alias_return_thunk+0x5/0xfbef5
Dec 3 05:53:41 redacted_hostname: [ 769.253253] ? security_prepare_creds+0x47/0xa0
Dec 3 05:53:41 redacted_hostname: [ 769.253255] ? srso_alias_return_thunk+0x5/0xfbef5
Dec 3 05:53:41 redacted_hostname: [ 769.253256] ? prepare_creds+0x180/0x270
Dec 3 05:53:41 redacted_hostname: [ 769.253259] ? srso_alias_return_thunk+0x5/0xfbef5
Dec 3 05:53:41 redacted_hostname: [ 769.253261] ? nfsd_setuser+0x110/0x270 [nfsd]
Dec 3 05:53:41 redacted_hostname: [ 769.253286] ? srso_alias_return_thunk+0x5/0xfbef5
Dec 3 05:53:41 redacted_hostname: [ 769.253288] ? nfsd_setuser_and_check_port+0x4a/0xc0 [nfsd]
Dec 3 05:53:41 redacted_hostname: [ 769.253306] ? _fh_update.part.0.isra.0+0x4c/0x90 [nfsd]
Dec 3 05:53:41 redacted_hostname: [ 769.253323] nfsd4_encode_getattr+0x2b/0x40 [nfsd]
Dec 3 05:53:41 redacted_hostname: [ 769.253341] nfsd4_encode_operation+0xa6/0x2b0 [nfsd]
Dec 3 05:53:41 redacted_hostname: [ 769.253361] nfsd4_proc_compound+0x1d0/0x700 [nfsd]
Dec 3 05:53:41 redacted_hostname: [ 769.253384] nfsd_dispatch+0xe9/0x220 [nfsd]
Dec 3 05:53:41 redacted_hostname: [ 769.253404] svc_process_common+0x2e7/0x650 [sunrpc]
Dec 3 05:53:41 redacted_hostname: [ 769.253435] ? __pfx_nfsd_dispatch+0x10/0x10 [nfsd]
Dec 3 05:53:41 redacted_hostname: [ 769.253463] svc_process+0x12d/0x170 [sunrpc]
Dec 3 05:53:41 redacted_hostname: [ 769.253491] svc_handle_xprt+0x448/0x580 [sunrpc]
Dec 3 05:53:41 redacted_hostname: [ 769.253523] svc_recv+0x17a/0x2c0 [sunrpc]
Dec 3 05:53:41 redacted_hostname: [ 769.253552] ? __pfx_nfsd+0x10/0x10 [nfsd]
Dec 3 05:53:41 redacted_hostname: [ 769.253577] nfsd+0x84/0xb0 [nfsd]
Dec 3 05:53:41 redacted_hostname: [ 769.253600] kthread+0xe0/0x100
Dec 3 05:53:41 redacted_hostname: [ 769.253604] ? __pfx_kthread+0x10/0x10
Dec 3 05:53:41 redacted_hostname: [ 769.253608] ret_from_fork+0x2c/0x50
Dec 3 05:53:41 redacted_hostname: [ 769.253614] </TASK>
Dec 3 05:53:41 redacted_hostname: [ 769.253614] ---[ end trace 0000000000000000 ]---
Dec 3 05:53:41 redacted_hostname: [ 769.253616] object pointer: 0x000000008b742c83
Neil Hanlon

2024-12-03 14:55

administrator   ~0009012

https://nvd.nist.gov/vuln/detail/CVE-2024-46697

This feels related.
Neil Hanlon

2024-12-03 15:20

administrator   ~0009013

Would either of you have access to, or the ability to generate, a kdump? I'd like to poke around and see if my assumptions here are correct, but I believe this crash is being triggered by the lack of the fix for CVE-2024-46697 (https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=f58bab6fd4063913bd8321e99874b8239e9ba726) in the 9.5 kernel.

That is, the condition described in the CVE occurs and args.context is left holding random junk which, when the kernel attempts to free it, sometimes causes a crash. The effect would likely be more pronounced on busy servers.
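
For anyone following along, here is a minimal sketch (plain userspace C, not kernel code) of the bug class being described: an on-stack structure whose pointer member is never initialized and is later freed unconditionally. All names here (fattr_args, encode_attrs_*, release_ctx) are hypothetical illustrations, not the actual nfsd code; the upstream commit appears to fix it simply by ensuring the on-stack args structure starts out zeroed, so releasing the security context is a safe no-op when no label was set.

/* Hypothetical illustration of CVE-2024-46697's bug class -- NOT nfsd code. */
#include <stdio.h>
#include <stdlib.h>

struct fattr_args {
	char *context;              /* security label buffer; may stay unset */
};

static void release_ctx(char *ctx)
{
	free(ctx);                  /* frees whatever the pointer happens to hold */
}

/* Buggy pattern: args is stack garbage, so args.context is undefined
 * whenever want_seclabel is false, and release_ctx() then frees junk --
 * the same shape of failure as the free_large_kmalloc warning and the
 * later oops in the traces above. */
static void encode_attrs_buggy(int want_seclabel)
{
	struct fattr_args args;     /* NOT zeroed */

	if (want_seclabel) {
		args.context = malloc(64);
		if (args.context)
			snprintf(args.context, 64, "system_u:object_r:nfs_t:s0");
	}
	release_ctx(args.context);  /* undefined behaviour if no label was set */
}

/* Fixed pattern: zero the struct up front; free(NULL) is a safe no-op. */
static void encode_attrs_fixed(int want_seclabel)
{
	struct fattr_args args = { 0 };

	if (want_seclabel) {
		args.context = malloc(64);
		if (args.context)
			snprintf(args.context, 64, "system_u:object_r:nfs_t:s0");
	}
	release_ctx(args.context);
}

int main(void)
{
	encode_attrs_fixed(0);      /* safe: frees NULL */
	encode_attrs_fixed(1);      /* safe: frees the allocated label */
	(void)encode_attrs_buggy;   /* not called: would be undefined behaviour */
	return 0;
}
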
Steve Rast

2024-12-03 15:28

reporter   ~0009014

In /var/crash I have a lot of kexec-dmesg.log, vmcore, and vmcore-dmesg.txt files.

Is that what you would need?
Neil Hanlon

2024-12-03 15:39

administrator   ~0009015

https://issues.redhat.com/browse/RHEL-69877

I've opened this for now.

Issue History

Date Modified      Username       Field                 Change
2024-12-03 11:18   Steve Rast     New Issue
2024-12-03 14:51   Simon Avery    Note Added: 0009011
2024-12-03 14:55   Neil Hanlon    Note Added: 0009012
2024-12-03 15:20   Neil Hanlon    Note Added: 0009013
2024-12-03 15:28   Steve Rast     Note Added: 0009014
2024-12-03 15:39   Neil Hanlon    Note Added: 0009015