Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-15703

ext4/jbd2 fast commit breaks luster ldiskfs

Details

    • 3
    • 9223372036854775807

    Description

      Fast commits for ext4 - LWN.net
      https://lwn.net/Articles/842385/#:~:text=The%20Linux%205.10%20release%20included,%2C%20lighter%2Dweight%20journaling%20method.

      The Linux 5.10 release included a change that is expected to significantly increase the performance of the ext4 filesystem; it goes by the name "fast commits" and introduces a new, lighter-weight journaling method.

      Unfortunately this breaks ldiskfs. Disable ext4/jbd2 fast commit

      Error (on MDS):

      [   59.357528] Lustre: DEBUG MARKER: mds8.test: executing set_hostid
      [   64.281515] LDISKFS-fs (vdb): mounted filesystem with ordered data mode. Opts: errors=remount-ro
      [   69.918052] LDISKFS-fs (vdb): mounted filesystem with ordered data mode. Opts: errors=remount-ro
      [   69.960957] RPC: Registered named UNIX socket transport module.
      [   69.960974] RPC: Registered udp transport module.
      [   69.960984] RPC: Registered tcp transport module.
      [   69.960993] RPC: Registered tcp NFSv4.1 backchannel transport module.
      [   69.984978] Lustre: Lustre: Build Version: 2.14.54_522_g3f584cf_dirty
      [   70.021102] LNet: Added LNI 192.168.122.175@tcp [8/256/0/180]
      [   70.021438] LNet: Accept secure, port 988
      [   70.084523] LDISKFS-fs (vdb): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,user_xattr,no_mbcache,nodelalloc
      [   70.179250] Lustre: ctl-lustre-MDT0000: No data found on store. Initialize space: rc = -61
      [   71.214201] Lustre: lustre-MDT0000: new disk, initializing
      [   71.291209] Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180
      [   71.297880] Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x0000000200000400-0x0000000240000400]:0:mdt
      [   71.320210] ------------[ cut here ]------------
      [   71.320249] WARNING: CPU: 0 PID: 1978 at fs/jbd2/transaction.c:1512 jbd2_journal_dirty_metadata+0x201/0x220 [jbd2]
      [   71.320283] Modules linked in: osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) lustre(OE) lmv(OE) mdc(OE) lov(OE) osc(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) sunrpc(E) ldiskfs(OE) mbcache(E) jbd2(E) libcfs(OE) fuse(E) rfkill(E) intel_rapl_msr(E) intel_rapl_common(E) intel_pmc_core_pltdrv(E) intel_pmc_core(E) kvm_intel(E) kvm(E) qxl(E) iTCO_wdt(E) intel_pmc_bxt(E) iTCO_vendor_support(E) drm_ttm_helper(E) ttm(E) irqbypass(E) crct10dif_pclmul(E) crc32_pclmul(E) snd_hda_codec_generic(E) ledtrig_audio(E) drm_kms_helper(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) snd_hda_core(E) snd_hwdep(E) ghash_clmulni_intel(E) rapl(E) snd_pcm(E) drm(E) pcspkr(E) snd_timer(E) snd(E) i2c_i801(E) i2c_smbus(E) soundcore(E) lpc_ich(E) virtio_balloon(E) joydev(E) syscopyarea(E) sysfillrect(E) sysimgblt(E) fb_sys_fops(E) xfs(E) libcrc32c(E) sr_mod(E) cdrom(E) sg(E) ahci(E) libahci(E) libata(E) crc32c_intel(E) serio_raw(E) virtio_scsi(E)
      [   71.320342]  virtio_blk(E) virtio_net(E) virtio_console(E) net_failover(E) failover(E) dm_mirror(E) dm_region_hash(E) dm_log(E) dm_mod(E)
      [   71.320667] CPU: 0 PID: 1978 Comm: lod0000_rec0000 Tainted: G           OE     5.10.9-1.ldiskfs.el8.x86_64 #1
      [   71.320700] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
      [   71.320736] RIP: 0010:jbd2_journal_dirty_metadata+0x201/0x220 [jbd2]
      [   71.320759] Code: 07 00 0f 1f 40 00 e9 50 ff ff ff 49 39 6c 24 30 0f 84 83 fe ff ff 0f 0b 4d 8b 4a 70 4c 39 cd 0f 84 36 ff ff ff e9 45 a0 00 00 <0f> 0b 41 bd e4 ff ff ff e9 24 ff ff ff 0f 0b 41 bd 8b ff ff ff e9
      [   71.320819] RSP: 0018:ffff9ec58152bc28 EFLAGS: 00010246
      [   71.320839] RAX: 0000000000000001 RBX: ffff89e9c47d4c78 RCX: 00000000000003fc
      [   71.320863] RDX: 0000000000000000 RSI: ffff89e9c307e888 RDI: ffff89e9c314a2d8
      [   71.320887] RBP: ffff89e9c47d3500 R08: ffff89e9c307e888 R09: ffff89e9ede0f000
      [   71.320921] R10: ffff89e9c7a3e800 R11: ffff89e9c314a2d0 R12: ffff89e9c314a2d0
      [   71.321977] R13: 0000000000000000 R14: ffff89e9c307e888 R15: ffff89e9c314a2d8
      [   71.323182] FS:  0000000000000000(0000) GS:ffff89ea3bc00000(0000) knlGS:0000000000000000
      [   71.324359] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   71.325367] CR2: 0000555bb1fd9e48 CR3: 0000000108048002 CR4: 0000000000370ef0
      [   71.325943] Call Trace:
      [   71.326543]  __ldiskfs_handle_dirty_metadata+0x4f/0x1a0 [ldiskfs]
      [   71.327120]  ldiskfs_getblk+0x13d/0x1d0 [ldiskfs]
      [   71.327672]  ldiskfs_bread+0xb/0x70 [ldiskfs]
      [   71.328261]  osd_write+0x6ca/0xc70 [osd_ldiskfs]
      [   71.328831]  dt_record_write+0x32/0x110 [obdclass]
      [   71.329402]  llog_osd_put_cat_list+0x799/0x930 [obdclass]
      [   71.329942]  lod_sub_prep_llog+0x6d9/0x81a [lod]
      [   71.330491]  lod_sub_recovery_thread+0xed/0xb60 [lod]
      [   71.331047]  ? __schedule+0x299/0x7a0
      [   71.331586]  ? lod_trans_stop+0x360/0x360 [lod]
      [   71.332128]  kthread+0x116/0x130
      [   71.332664]  ? kthread_park+0x80/0x80
      [   71.333220]  ret_from_fork+0x22/0x30
      [   71.333764] ---[ end trace c7b3deb5174e8df2 ]---
      [   71.334314] ------------[ cut here ]------------
      [   71.334869] WARNING: CPU: 0 PID: 1978 at /home/shaun/lustre-release/ldiskfs/ext4_jbd2.c:333 __ldiskfs_handle_dirty_metadata+0x110/0x1a0 [ldiskfs]
      [   71.335930] Modules linked in: osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) lustre(OE) lmv(OE) mdc(OE) lov(OE) osc(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) sunrpc(E) ldiskfs(OE) mbcache(E) jbd2(E) libcfs(OE) fuse(E) rfkill(E) intel_rapl_msr(E) intel_rapl_common(E) intel_pmc_core_pltdrv(E) intel_pmc_core(E) kvm_intel(E) kvm(E) qxl(E) iTCO_wdt(E) intel_pmc_bxt(E) iTCO_vendor_support(E) drm_ttm_helper(E) ttm(E) irqbypass(E) crct10dif_pclmul(E) crc32_pclmul(E) snd_hda_codec_generic(E) ledtrig_audio(E) drm_kms_helper(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) snd_hda_core(E) snd_hwdep(E) ghash_clmulni_intel(E) rapl(E) snd_pcm(E) drm(E) pcspkr(E) snd_timer(E) snd(E) i2c_i801(E) i2c_smbus(E) soundcore(E) lpc_ich(E) virtio_balloon(E) joydev(E) syscopyarea(E) sysfillrect(E) sysimgblt(E) fb_sys_fops(E) xfs(E) libcrc32c(E) sr_mod(E) cdrom(E) sg(E) ahci(E) libahci(E) libata(E) crc32c_intel(E) serio_raw(E) virtio_scsi(E)
      [   71.335966]  virtio_blk(E) virtio_net(E) virtio_console(E) net_failover(E) failover(E) dm_mirror(E) dm_region_hash(E) dm_log(E) dm_mod(E)
      [   71.342010] CPU: 0 PID: 1978 Comm: lod0000_rec0000 Tainted: G        W  OE     5.10.9-1.ldiskfs.el8.x86_64 #1
      [   71.342642] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
      [   71.343263] RIP: 0010:__ldiskfs_handle_dirty_metadata+0x110/0x1a0 [ldiskfs]
      [   71.343825] Code: 01 40 e9 4e ff ff ff f0 80 4b 01 20 e9 38 ff ff ff f0 80 0b 01 4d 85 ed 75 8f 48 89 df 45 31 e4 e8 55 94 de f6 e9 65 ff ff ff <0f> 0b 48 c7 c2 c0 56 9d c0 45 89 e0 48 89 e9 44 89 fe 4c 89 f7 e8
      [   71.344917] RSP: 0018:ffff9ec58152bc68 EFLAGS: 00010286
      [   71.345502] RAX: ffff89e9c7a3e800 RBX: ffff89e9c307e888 RCX: 00000000000003fc
      [   71.346048] RDX: 0000000000000000 RSI: ffff89e9c307e888 RDI: ffff89e9c314a2d8
      [   71.346603] RBP: ffff89e9c47d4c78 R08: ffff89e9c307e888 R09: ffff89e9ede0f000
      [   71.347144] R10: ffff89e9c7a3e800 R11: ffff89e9c314a2d0 R12: 00000000ffffffe4
      [   71.347685] R13: ffff89e9c313cf80 R14: ffffffffc09d7030 R15: 0000000000000376
      [   71.348215] FS:  0000000000000000(0000) GS:ffff89ea3bc00000(0000) knlGS:0000000000000000
      [   71.348925] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   71.349757] CR2: 0000555bb1fd9e48 CR3: 0000000108048002 CR4: 0000000000370ef0
      [   71.350602] Call Trace:
      [   71.351183]  ldiskfs_getblk+0x13d/0x1d0 [ldiskfs]
      [   71.351799]  ldiskfs_bread+0xb/0x70 [ldiskfs]
      [   71.352356]  osd_write+0x6ca/0xc70 [osd_ldiskfs]
      [   71.353261]  dt_record_write+0x32/0x110 [obdclass]
      [   71.354052]  llog_osd_put_cat_list+0x799/0x930 [obdclass]
      [   71.354867]  lod_sub_prep_llog+0x6d9/0x81a [lod]
      [   71.355578]  lod_sub_recovery_thread+0xed/0xb60 [lod]
      [   71.356183]  ? __schedule+0x299/0x7a0
      [   71.356790]  ? lod_trans_stop+0x360/0x360 [lod]
      [   71.357318]  kthread+0x116/0x130
      [   71.357872]  ? kthread_park+0x80/0x80
      [   71.358576]  ret_from_fork+0x22/0x30
      [   71.359418] ---[ end trace c7b3deb5174e8df3 ]---
      [   71.360306] LDISKFS-fs: ldiskfs_getblk:886: aborting transaction: error 28 in __ldiskfs_handle_dirty_metadata
      [   71.361223] LDISKFS-fs error (device vdb): ldiskfs_getblk:886: inode #107: block 136305: comm lod0000_rec0000: journal_dirty_metadata failed: handle type 0 started at line 1979, credits 7/0, errcode -28
      [   71.369909] Aborting journal on device vdb-8.
      [   71.374257] LDISKFS-fs (vdb): Remounting filesystem read-only
      [   71.375334] LustreError: 1978:0:(osd_io.c:2128:osd_ldiskfs_write_record()) lustre-MDT0000: error reading offset 0 (block 0, size 32, offs 0), credits 7/1: rc = -28
      [   71.379816] LustreError: 1927:0:(osd_handler.c:1787:osd_trans_commit_cb()) transaction @0x000000001e4cbe3c commit error: 2
      [   71.382743] LDISKFS-fs error (device vdb) in osd_trans_stop:2089: error 28
      [   71.385258] LustreError: 1978:0:(osd_handler.c:2092:osd_trans_stop()) lustre-MDT0000: failed to stop transaction: rc = -28
      [   71.386192] LustreError: 1978:0:(lod_dev.c:424:lod_sub_recovery_thread()) lustre-MDT0000-osd get update log failed: rc = -28
      [   71.742336] Lustre: DEBUG MARKER: llmount : @@@@@@ FAIL: mds1 is in a unhealthy state
      [   72.433338] Lustre: DEBUG MARKER: mds8.test: executing check_logdir /tmp/test_logs/1648563277
      

      Warning at fs/jbd2/transaction.c:1512 jbd2_journal_dirty_metadata+0x201/0x220 [jbd2]
      https://elixir.bootlin.com/linux/v5.10.9/source/fs/jbd2/transaction.c#L1512

      Warning in patched ext4, ldiskfs/ext2_jbd.c:333 :

      		err = jbd2_journal_dirty_metadata(handle, bh);
      		/* Errors can only happen due to aborted journal or a nasty bug */
      		if (!is_handle_aborted(handle) && WARN_ON_ONCE(err)) { <---- WARNING HERE
      			ldiskfs_journal_abort_handle(where, line, __func__, bh,
      						  handle, err);
      			if (inode == NULL) {
      				pr_err("LDISKFS: jbd2_journal_dirty_metadata "
      				       "failed: handle type %u started at "
      				       "line %u, credits %u/%u, errcode %d",
      				       handle->h_type,
      				       handle->h_line_no,
      				       handle->h_requested_credits,
      				       jbd2_handle_buffer_credits(handle), err);
      				return err;
      			}
      

      Attachments

        Issue Links

          Activity

            [LU-15703] ext4/jbd2 fast commit breaks luster ldiskfs

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/47757/
            Subject: LU-15703 ldiskfs: Disable unused fast commit buffer
            Project: fs/lustre-release
            Branch: b2_15
            Current Patch Set:
            Commit: 2fd033c6848f54a06ca9379789f37f73de80dce9

            gerrit Gerrit Updater added a comment - "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/47757/ Subject: LU-15703 ldiskfs: Disable unused fast commit buffer Project: fs/lustre-release Branch: b2_15 Current Patch Set: Commit: 2fd033c6848f54a06ca9379789f37f73de80dce9

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/46943/
            Subject: LU-15703 ldiskfs: Disable unused fast commit buffer
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: ed80c3f64317a87ed10919cedbf2d65b36abcdcb

            gerrit Gerrit Updater added a comment - "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/46943/ Subject: LU-15703 ldiskfs: Disable unused fast commit buffer Project: fs/lustre-release Branch: master Current Patch Set: Commit: ed80c3f64317a87ed10919cedbf2d65b36abcdcb

            "Jian Yu <yujian@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/47757
            Subject: LU-15703 ldiskfs: Disable unused fast commit buffer
            Project: fs/lustre-release
            Branch: b2_15
            Current Patch Set: 1
            Commit: ce764be32c0d8cdda604eb6cff1b8fce643ddf57

            gerrit Gerrit Updater added a comment - "Jian Yu <yujian@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/47757 Subject: LU-15703 ldiskfs: Disable unused fast commit buffer Project: fs/lustre-release Branch: b2_15 Current Patch Set: 1 Commit: ce764be32c0d8cdda604eb6cff1b8fce643ddf57

            It seems there is also changes in jbd2 journal credits that break lustre. Someone can probably come up with something more reasonable than just padding the credit calculation for new inodes / xattrs.
            As of now I can mount / test on a small cluster running 5.10.9, 5.14.21, and 5.16.9.

            stancheff Shaun Tancheff added a comment - It seems there is also changes in jbd2 journal credits that break lustre. Someone can probably come up with something more reasonable than just padding the credit calculation for new inodes / xattrs. As of now I can mount / test on a small cluster running 5.10.9, 5.14.21, and 5.16.9.

            Can you say more about why this breaks with ldiskfs?

            paf0186 Patrick Farrell added a comment - Can you say more about why this breaks with ldiskfs?

            "Shaun Tancheff <shaun.tancheff@hpe.com>" uploaded a new patch: https://review.whamcloud.com/46943
            Subject: LU-15703 ldiskfs: disable ext4/jbd2 fast commit
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 7eac56305da26da4bcf7ac59d2090e703586acef

            gerrit Gerrit Updater added a comment - "Shaun Tancheff <shaun.tancheff@hpe.com>" uploaded a new patch: https://review.whamcloud.com/46943 Subject: LU-15703 ldiskfs: disable ext4/jbd2 fast commit Project: fs/lustre-release Branch: master Current Patch Set: 1 Commit: 7eac56305da26da4bcf7ac59d2090e703586acef

            People

              stancheff Shaun Tancheff
              stancheff Shaun Tancheff
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: