Lustre / LU-20095

BUG at /tmp/rpmbuild-lustre-jenkins-Ui9P732t/BUILD/lustre-2.17.51_23_gb36f204/ldiskfs/super.c:1186!


Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Medium

    Description

      This issue was created by maloo for Dongyang Li <dongyangli@ddn.com>

      This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/533d5466-45fa-4864-b7f8-91be02c74b8d

      Test session details:
      clients: https://build.whamcloud.com/job/lustre-reviews/122939 - 4.18.0-553.89.1.el8_10.x86_64
      servers: https://build.whamcloud.com/job/lustre-reviews/122939 - 4.18.0-553.89.1.el8_lustre.x86_64

      After the filesystem remounts read-only due to I/O errors, umount crashes the server by failing the
      J_ASSERT(list_empty(&sbi->s_orphan))
      assertion in ldiskfs_put_super().
       

      [ 2138.284023] LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc
      [ 2138.390362] Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180
      [ 2138.391989] Lustre: Skipped 13 previous similar messages
      [ 2138.396022] LDISKFS-fs error (device dm-3): mb_free_blocks:1962: group 10, block 225360:freeing already freed block (bit 16880); block bitmap corrupt.
      [ 2138.398643] Aborting journal on device dm-3-8.
      [ 2138.399620] LDISKFS-fs (dm-3): Remounting filesystem read-only
      [ 2138.400761] LDISKFS-fs warning (device dm-3): ldiskfs_mb_generate_buddy:1239: group 10: block bitmap and bg descriptor inconsistent: 3466 vs 3468 free clusters 3946 in gd, 1 pa's
      [ 2138.403685] LDISKFS-fs error (device dm-3) in osd_trans_stop:2340: IO failure
      [ 2138.405139] LustreError: 178024:0:(osd_handler.c:2343:osd_trans_stop()) lustre-MDT0000: failed to stop transaction: rc = -5
      [ 2138.543692] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n health_check
      [ 2138.673575] LustreError: 178198:0:(obd_sysfs.c:269:health_check_show()) lustre-MDT0000-osd: device reported unhealthy
      [ 2138.882232] Lustre: DEBUG MARKER: /usr/sbin/lctl mark  mmp test_10: @@@@@@ FAIL: mds1 is in a unhealthy state, got: \'NOT HEALTHY\' 
      [ 2139.045746] Lustre: DEBUG MARKER: mmp test_10: @@@@@@ FAIL: mds1 is in a unhealthy state, got: 'NOT HEALTHY'
      [ 2139.254692] Lustre: DEBUG MARKER: /usr/sbin/lctl dk > /autotest/autotest-2/2026-03-27/lustre-reviews_review-dne-subtest-change_122939_104_1853660e-67b8-4caf-b360-19cdfc75c178//mmp.test_10.debug_log.$(hostname -s).1774591669.log;
      		dmesg > /autotest/autotest-2/2026-03-27/lustre-reviews_rev
      [ 2141.187399] Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true
      [ 2141.447256] Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds1
      [ 2141.700069] LustreError: 178640:0:(osd_handler.c:2455:osd_drop_preallocated_space()) lustre-MDT0000: can't truncate: rc=-30
      [ 2141.709895] LDISKFS-fs (dm-3): Inode 576 (00000000b61feaea): orphan list check failed!
      [ 2142.807525] LDISKFS-fs (dm-3): sb orphan head is 583
      [ 2142.808414] sb_info orphan list:
      [ 2142.808984]   inode dm-3:583 at 0000000086f83954: mode 100644, nlink 1, next 582
      [ 2142.810203]   inode dm-3:582 at 00000000dc887a10: mode 100644, nlink 1, next 581
      [ 2142.811431]   inode dm-3:581 at 000000001f4f448f: mode 100644, nlink 1, next 580
      [ 2142.812653]   inode dm-3:580 at 00000000bc6f781b: mode 100644, nlink 1, next 579
      [ 2142.813868]   inode dm-3:579 at 00000000d7aea715: mode 100644, nlink 1, next 578
      [ 2142.815096]   inode dm-3:578 at 00000000df14a9ba: mode 100644, nlink 1, next 577
      [ 2142.816316]   inode dm-3:577 at 000000000c18ec88: mode 100644, nlink 1, next 576
      [ 2142.817562]   inode dm-3:576 at 000000009cc4214f: mode 100644, nlink 1, next 0
      [ 2142.818786] ------------[ cut here ]------------
      [ 2142.819597] kernel BUG at /tmp/rpmbuild-lustre-jenkins-Ui9P732t/BUILD/lustre-2.17.51_23_gb36f204/ldiskfs/super.c:1186!
      [ 2142.821340] invalid opcode: 0000 [#1] SMP PTI
      [ 2142.822086] CPU: 0 PID: 178640 Comm: umount Kdump: loaded Tainted: G           OE     -------- -  - 4.18.0-553.89.1.el8_lustre.x86_64 #1
      [ 2142.824039] Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014
      [ 2142.825143] RIP: 0010:ldiskfs_put_super+0x3b3/0x3c0 [ldiskfs]
      [ 2142.826121] Code: 85 00 04 00 00 48 8b 40 68 83 60 60 fb 0f b7 85 a0 00 00 00 66 41 89 44 24 3a 41 f6 45 50 01 0f 85 3e fd ff ff e9 2c fd ff ff <0f> 0b 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 8d af
      [ 2142.829076] RSP: 0018:ffffbf09c56279f8 EFLAGS: 00010293
      [ 2142.829957] RAX: ffff9f23aaa01858 RBX: ffff9f23a9c89720 RCX: 0000000000000000
      [ 2142.831128] RDX: 0000000000000000 RSI: ffff9f23fbc1e698 RDI: ffff9f23fbc1e698
      [ 2142.832306] RBP: ffff9f23a9c89000 R08: 0000000000000000 R09: c0000000ffff7fff
      [ 2142.833467] R10: 0000000000000001 R11: ffffbf09c5627810 R12: ffff9f23a9c89720
      [ 2142.834626] R13: ffff9f2395f29800 R14: ffffffffc13fcc70 R15: ffff9f23c23b8520
      [ 2142.835785] FS:  00007f1685a95080(0000) GS:ffff9f23fbc00000(0000) knlGS:0000000000000000
      [ 2142.837094] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 2142.838044] CR2: 000055a1bc589ee8 CR3: 0000000128e6a004 CR4: 0000000000170ef0
      [ 2142.839210] Call Trace:
      [ 2142.839664]  ? __die_body+0x1a/0x60
      [ 2142.840270]  ? die+0x2a/0x50
      [ 2142.840788]  ? do_trap+0xe7/0x110
      [ 2142.841373]  ? ldiskfs_put_super+0x3b3/0x3c0 [ldiskfs]
      [ 2142.842255]  ? do_invalid_op+0x36/0x40
      [ 2142.842905]  ? ldiskfs_put_super+0x3b3/0x3c0 [ldiskfs]
      [ 2142.843786]  ? invalid_op+0x14/0x20
      [ 2142.844396]  ? ldiskfs_put_super+0x3b3/0x3c0 [ldiskfs]
      [ 2142.845273]  generic_shutdown_super+0x6c/0x110
      [ 2142.846034]  kill_block_super+0x21/0x50
      [ 2142.846700]  deactivate_locked_super+0x34/0x70
      [ 2142.847456]  cleanup_mnt+0x3b/0x70
      [ 2142.848050]  osd_umount+0x7d/0x1b0 [osd_ldiskfs]
      [ 2142.848845]  osd_device_fini+0x1b8/0x210 [osd_ldiskfs]
      [ 2142.849723]  obd_precleanup.isra.33+0x8e/0x280 [obdclass]
      [ 2142.850670]  ? class_disconnect_exports+0x2d6/0x2f0 [obdclass]
      [ 2142.851680]  class_cleanup+0x326/0x7d0 [obdclass]
      [ 2142.852536]  class_process_config+0x3bb/0x20b0 [obdclass]
      [ 2142.853485]  ? class_manual_cleanup+0x1c8/0x780 [obdclass]
      [ 2142.854446]  ? __kmalloc+0x113/0x250
      [ 2142.855063]  class_manual_cleanup+0x2a2/0x780 [obdclass]
      [ 2142.855997]  ? __queue_work+0x145/0x3f0
      [ 2142.856661]  osd_obd_disconnect+0x12c/0x140 [osd_ldiskfs]
      [ 2142.857586]  lustre_put_lsi_free+0x10a/0x570 [obdclass]
      [ 2142.858513]  lustre_put_lsi+0x17c/0x1e0 [obdclass]
      [ 2142.859368]  server_put_super+0x263/0x14a0 [ptlrpc]
      [ 2142.860278]  ? fsnotify_sb_delete+0x138/0x1c0
      [ 2142.861020]  ? __dentry_kill+0x121/0x170
      [ 2142.861702]  generic_shutdown_super+0x6c/0x110
      [ 2142.862463]  kill_anon_super+0x14/0x30
      [ 2142.863108]  deactivate_locked_super+0x34/0x70
      [ 2142.863871]  cleanup_mnt+0x3b/0x70
      [ 2142.864466]  task_work_run+0x8a/0xb0
      [ 2142.865083]  exit_to_usermode_loop+0xf4/0x100
      [ 2142.865828]  do_syscall_64+0x1cb/0x1d0
      [ 2142.866481]  entry_SYSCALL_64_after_hwframe+0x66/0xcb
      [ 2142.867322] RIP: 0033:0x7f16849ee8fb
      [ 2142.867956] Code: ff d0 48 89 c7 b8 3c 00 00 00 0f 05 48 8b 0d 84 65 39 00 f7 d8 64 89 01 48 83 c8 ff c3 66 90 f3 0f 1e fa b8 a6 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 5d 65 39 00 f7 d8 64 89 01 48
      [ 2142.870924] RSP: 002b:00007ffff1aa1508 EFLAGS: 00000202 ORIG_RAX: 00000000000000a6
      [ 2142.872154] RAX: 0000000000000000 RBX: 0000564cd4aea400 RCX: 00007f16849ee8fb
      [ 2142.873318] RDX: 0000000000000001 RSI: 0000000000000001 RDI: 0000564cd4aeff50
      [ 2142.874495] RBP: 0000000000000001 R08: 0000564cd4af9c90 R09: 00007f1684d85bc0
      [ 2142.875670] R10: 0000000000000007 R11: 0000000000000202 R12: 0000564cd4aeff50
      [ 2142.876835] R13: 00007f1685871184 R14: 0000564cd4af0100 R15: 00000000ffffffff
      [ 2142.878012] Modules linked in: dm_flakey osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) mgc(OE) osd_ldiskfs(OE) ldiskfs(OE) lquota(OE) lustre(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache intel_rapl_msr intel_rapl_common sb_edac kvm_intel kvm iTCO_wdt iTCO_vendor_support irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl joydev i2c_i801 virtio_balloon pcspkr lpc_ich sunrpc ext4 mbcache jbd2 ahci libahci libata crc32c_intel virtio_net serio_raw net_failover failover virtio_blk [last unloaded: dm_flakey]


          People

            WC Triage (wc-triage)
            Dongyang Li (dongyang)