[LU-17302] sanity-scrub test_19: crash Created: 20/Nov/23  Updated: 20/Nov/23

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Maloo Assignee: WC Triage
Resolution: Unresolved Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 9223372036854775807

Description

This issue was created by maloo for S Buisson <sbuisson@ddn.com>

This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/0718359d-bf15-4f24-943d-4094e70cb881

test_19 failed with the following error:

onyx-57vm6 crashed during sanity-scrub test_19

Test session details:
clients: https://build.whamcloud.com/job/lustre-reviews/100391 - 4.18.0-477.27.1.el8_8.x86_64
servers: https://build.whamcloud.com/job/lustre-reviews/100391 - 4.18.0-477.27.1.el8_lustre.x86_64

[ 4622.739102] Lustre: DEBUG MARKER: == sanity-scrub test 19: LFSCK can fix multiple linked files on OST ========================================================== 14:41:11 (1700491271)
[ 4624.394848] Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true
[ 4624.709167] Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds1
[ 4659.053107] watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [umount:333117]
[ 4659.056504] Modules linked in: dm_flakey osp(OE) mdd(OE) lod(OE) mgs(OE) mdt(OE) lfsck(OE) mgc(OE) osd_ldiskfs(OE) ldiskfs(OE) lquota(OE) lustre(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache intel_rapl_msr intel_rapl_common crct10dif_pclmul crc32_pclmul ghash_clmulni_intel joydev pcspkr virtio_balloon i2c_piix4 sunrpc dm_mod ext4 mbcache jbd2 ata_generic ata_piix libata crc32c_intel serio_raw virtio_net virtio_blk net_failover failover [last unloaded: dm_flakey]
[ 4659.058105] watchdog: BUG: soft lockup - CPU#1 stuck for 23s! [mdt00_003:330597]
[ 4659.065759] CPU: 0 PID: 333117 Comm: umount Kdump: loaded Tainted: G           OE    --------- -  - 4.18.0-477.27.1.el8_lustre.x86_64 #1
[ 4659.067261] Modules linked in:
[ 4659.069459] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[ 4659.069459]  dm_flakey osp(OE)
[ 4659.070068] RIP: 0010:native_safe_halt+0xe/0x20
[ 4659.071244]  mdd(OE)
[ 4659.071858] Code: 00 f0 80 48 02 20 48 8b 00 a8 08 75 c0 e9 79 ff ff ff 90 90 90 90 90 90 90 90 90 90 e9 07 00 00 00 0f 00 2d 46 90 60 00 fb f4 <e9> dd 01 40 00 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 e9 07 00 00
[ 4659.072812]  lod(OE)
[ 4659.073268] RSP: 0018:ffffb43ac795fc50 EFLAGS: 00000246
[ 4659.076935]  mgs(OE)
[ 4659.077399]  ORIG_RAX: ffffffffffffff13
[ 4659.078495]  mdt(OE)
[ 4659.078951] RAX: 0000000000000003 RBX: ffffffffc0a61f10 RCX: 0000000000000008
[ 4659.079770]  lfsck(OE)
[ 4659.080226] RDX: 0000000000000000 RSI: 0000000000000003 RDI: ffffffffc0a61f10
[ 4659.081677]  mgc(OE)
[ 4659.082161] RBP: ffff8a15bfc33d40 R08: 0000000000000008 R09: 00000000000000d4
[ 4659.083616]  osd_ldiskfs(OE)
[ 4659.084070] R10: ffffffffffffffff R11: 0000000000000004 R12: 0000000000000000
[ 4659.085521]  ldiskfs(OE)
[ 4659.086097] R13: 0000000000000001 R14: 0000000000000100 R15: 0000000000040000
[ 4659.087552]  lquota(OE)
[ 4659.088068] FS:  00007f4fb45ba080(0000) GS:ffff8a15bfc00000(0000) knlGS:0000000000000000
[ 4659.089517]  lustre(OE)
[ 4659.090020] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 4659.091645]  mdc(OE)
[ 4659.092163] CR2: 000056074e125038 CR3: 000000002ff92004 CR4: 00000000000606f0
[ 4659.093348]  lov(OE)
[ 4659.093831] Call Trace:
[ 4659.095265]  osc(OE)
[ 4659.095757]  kvm_wait+0x58/0x60
[ 4659.096296]  lmv(OE)
[ 4659.096768]  __pv_queued_spin_lock_slowpath+0x268/0x2a0
[ 4659.097431]  fid(OE)
[ 4659.097915]  _raw_spin_lock+0x1e/0x30
[ 4659.098989]  fld(OE)
[ 4659.099453]  xa_erase+0xd/0x30
[ 4659.100209]  ksocklnd(OE)
[ 4659.100687]  class_unregister_device+0x22/0x50 [obdclass]
[ 4659.101327]  ptlrpc(OE)
[ 4659.101883]  class_detach+0x1fe/0x2d0 [obdclass]
[ 4659.102990]  obdclass(OE)
[ 4659.103518]  class_process_config+0x1685/0x2160 [obdclass]
[ 4659.104470]  lnet(OE)
[ 4659.105005]  ? class_manual_cleanup+0x116/0x770 [obdclass]
[ 4659.106108]  libcfs(OE)
[ 4659.106588]  ? __kmalloc+0x113/0x250
[ 4659.107708]  rpcsec_gss_krb5
[ 4659.108209]  ? lprocfs_counter_add+0x12a/0x1a0 [obdclass]
[ 4659.108992]  auth_rpcgss
[ 4659.109583]  class_manual_cleanup+0x484/0x770 [obdclass]
[ 4659.110706]  nfsv4
[ 4659.111227]  server_put_super+0x7a5/0x1300 [ptlrpc]
[ 4659.112344]  dns_resolver
[ 4659.112832]  ? evict_inodes+0x160/0x1b0
[ 4659.113822]  nfs
[ 4659.114359]  generic_shutdown_super+0x6c/0x110
[ 4659.115168]  lockd
[ 4659.115588]  kill_anon_super+0x14/0x30
[ 4659.116514]  grace
[ 4659.116947]  deactivate_locked_super+0x34/0x70
[ 4659.117747]  fscache
[ 4659.118176]  cleanup_mnt+0x3b/0x70
[ 4659.119113]  intel_rapl_msr
[ 4659.119589]  task_work_run+0x8a/0xb0
[ 4659.120308]  intel_rapl_common
[ 4659.120918]  exit_to_usermode_loop+0xef/0x100
[ 4659.121685]  crct10dif_pclmul
[ 4659.122292]  do_syscall_64+0x19c/0x1b0
[ 4659.123200]  crc32_pclmul
[ 4659.123804]  entry_SYSCALL_64_after_hwframe+0x61/0xc6
[ 4659.124596]  ghash_clmulni_intel
[ 4659.125132] RIP: 0033:0x7f4fb3524e9b
[ 4659.126146]  joydev
[ 4659.126797] Code: ff d0 48 89 c7 b8 3c 00 00 00 0f 05 48 8b 0d e4 4f 38 00 f7 d8 64 89 01 48 83 c8 ff c3 66 90 f3 0f 1e fa b8 a6 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d bd 4f 38 00 f7 d8 64 89 01 48
[ 4659.127549]  pcspkr
[ 4659.127987] RSP: 002b:00007ffe32bf4ac8 EFLAGS: 00000202
[ 4659.131616]  virtio_balloon
[ 4659.132053]  ORIG_RAX: 00000000000000a6
[ 4659.133143]  i2c_piix4
[ 4659.133731] RAX: 0000000000000000 RBX: 0000559656027400 RCX: 00007f4fb3524e9b
[ 4659.134544]  sunrpc
[ 4659.135057] RDX: 0000000000000001 RSI: 0000000000000001 RDI: 000055965602cd10
[ 4659.136504]  dm_mod
[ 4659.136943] RBP: 0000000000000001 R08: 00000000000000e0 R09: 00007f4fb38aabc0
[ 4659.138395]  ext4
[ 4659.138832] R10: 0000000000000007 R11: 0000000000000202 R12: 000055965602cd10
[ 4659.140255]  mbcache
[ 4659.140671] R13: 00007f4fb4396184 R14: 000055965602ce90 R15: 00000000ffffffff
[ 4659.142117]  jbd2
[ 4659.142598] Kernel panic - not syncing: softlockup: hung tasks
[ 4659.144006]  ata_generic
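
Note on reading the log above: two soft lockup reports overlap. The CPU#1 report (mdt00_003) starts its "Modules linked in:" list at 4659.067261, and its module names are printed one per line in between the CPU#0 (umount) register dump and call trace. A minimal sketch for separating the two streams follows. It assumes a module fragment is a line whose payload after the timestamp is only whitespace-led bare names, optionally with a taint suffix such as (OE). The function name split_interleaved and the regexes are illustrative, not part of any Lustre or Maloo tooling, and a stack frame printed without a +0x offset would be misclassified.

```python
import re

# Timestamp prefix as printed by the kernel console, e.g. "[ 4659.069459] ".
TS = re.compile(r"^\[\s*\d+\.\d+\] (.*)$")
# Heuristic (assumption): a module fragment is one or more whitespace-led
# bare names, each optionally followed by a taint suffix such as "(OE)".
MODULES = re.compile(r"(?:\s+\w+(?:\([A-Z]+\))?)+")


def split_interleaved(lines):
    """Separate stray module-list fragments from the remaining log lines."""
    modules, rest = [], []
    for line in lines:
        m = TS.match(line.rstrip())
        if m and MODULES.fullmatch(m.group(1)):
            modules.extend(m.group(1).split())
        else:
            rest.append(line)
    return modules, rest


# A few lines copied from the log above.
sample = [
    "[ 4659.067261] Modules linked in:",
    "[ 4659.069459]  dm_flakey osp(OE)",
    "[ 4659.070068] RIP: 0010:native_safe_halt+0xe/0x20",
    "[ 4659.071244]  mdd(OE)",
    "[ 4659.077399]  ORIG_RAX: ffffffffffffff13",
    "[ 4659.096296]  lmv(OE)",
    "[ 4659.096768]  __pv_queued_spin_lock_slowpath+0x268/0x2a0",
]

modules, rest = split_interleaved(sample)
print("modules:", " ".join(modules))
for line in rest:
    print(line)
```

Run over the full log, the lines left in the second list should give the umount call trace without the module names: kvm_wait, __pv_queued_spin_lock_slowpath, _raw_spin_lock, xa_erase, class_unregister_device, class_detach, and so on down to entry_SYSCALL_64_after_hwframe.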

VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
sanity-scrub test_19 - onyx-57vm6 crashed during sanity-scrub test_19

