[LU-6683] OSS crash when starting lfsck layout check Created: 03/Jun/15  Updated: 09/Nov/15  Resolved: 21/Jul/15

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.7.0
Fix Version/s: Lustre 2.8.0

Type: Bug Priority: Critical
Reporter: Frederik Ferner (Inactive) Assignee: nasf (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Environment:

File system with 1 MDT, 6 OSTs, 2 OSSes; installed as 1.6, upgraded to 1.8, then 2.5, now 2.7


Issue Links:
Related
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

When starting the lfsck layout check on our test file system, both OSS servers immediately crash with something like the following on the console (or in vmcore-dmesg.txt). I also discovered that I can't stop the lfsck at this stage, after recovering the OSTs (lctl lfsck_stop just hangs). When failing over the MDT in this state, the lfsck is restarted when the MDT is mounted on the other MDS, crashing the OSS nodes again. The output below was collected after the crash triggered by mounting the MDT during failover.

------------[ cut here ]------------
kernel BUG at fs/jbd2/transaction.c:1030!
Lustre: play01-OST0001: deleting orphan objects from 0x0:51613818 to 0x0:5161388
Lustre: play01-OST0003: deleting orphan objects from 0x0:77539134 to 0x0:7753920
Lustre: play01-OST0005: deleting orphan objects from 0x0:44598982 to 0x0:4459905
invalid opcode: 0000 [#1] SMP 
last sysfs file: /sys/devices/pci0000:00/0000:00:07.0/0000:0c:00.0/host7/target7
CPU 2 
Modules linked in: osp(U) ofd(U) lfsck(U) ipmi_si ost(U) mgc(U) osd_ldiskfs(U) a

Pid: 25013, comm: lfsck Not tainted 2.6.32-504.8.1.el6_lustre.x86_64 #1 Dell Inc
RIP: 0010:[<ffffffffa039179d>]  [<ffffffffa039179d>] jbd2_journal_dirty_metadata
RSP: 0018:ffff8801fa26da00  EFLAGS: 00010246
RAX: ffff88043b4aa680 RBX: ffff880202e1f498 RCX: ffff880226a866e0
RDX: 0000000000000000 RSI: ffff880226a866e0 RDI: 0000000000000000
RBP: ffff8801fa26da20 R08: ffff880226a866e0 R09: 0000000000000018
R10: 0000000000480403 R11: 0000000000000001 R12: ffff880202e386d8
R13: ffff880226a866e0 R14: ffff880239208800 R15: 0000000000000000
FS:  00007fdff3fff700(0000) GS:ffff880028240000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00007feb2ce760a0 CR3: 000000043b4d1000 CR4: 00000000000007e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process lfsck (pid: 25013, threadinfo ffff8801fa26c000, task ffff8801f78b3540)
Stack:
 ffff880202e1f498 ffffffffa0fd7710 ffff880226a866e0 0000000000000000
<d> ffff8801fa26da60 ffffffffa0f9600b ffff8801fa26daa0 ffffffffa0fd2af3
<d> ffff8802159f3000 ffff8803f12396e0 ffff8803f1239610 ffff8801fa26db28
Call Trace:
 [<ffffffffa0f9600b>] __ldiskfs_handle_dirty_metadata+0x7b/0x100 [ldiskfs]
 [<ffffffffa0fd2af3>] ? ldiskfs_xattr_set_entry+0x4e3/0x4f0 [ldiskfs]
 [<ffffffffa0fa1d9a>] ldiskfs_mark_iloc_dirty+0x52a/0x630 [ldiskfs]
 [<ffffffffa0fd4abc>] ldiskfs_xattr_set_handle+0x33c/0x560 [ldiskfs]
 [<ffffffffa0fd4ddc>] ldiskfs_xattr_set+0xfc/0x1a0 [ldiskfs]
 [<ffffffffa0fd500e>] ldiskfs_xattr_trusted_set+0x2e/0x30 [ldiskfs]
 [<ffffffff811b4722>] generic_setxattr+0xa2/0xb0
 [<ffffffffa0d4690d>] __osd_xattr_set+0x8d/0xe0 [osd_ldiskfs]
 [<ffffffffa0d4e005>] osd_xattr_set+0x3a5/0x4b0 [osd_ldiskfs]
 [<ffffffffa0a3f446>] lfsck_master_oit_engine+0x14c6/0x1ef0 [lfsck]
 [<ffffffffa0a4094e>] lfsck_master_engine+0xade/0x13e0 [lfsck]
 [<ffffffff81064b90>] ? default_wake_function+0x0/0x20
 [<ffffffffa0a3fe70>] ? lfsck_master_engine+0x0/0x13e0 [lfsck]
 [<ffffffff8109e66e>] kthread+0x9e/0xc0
 [<ffffffff8100c20a>] child_rip+0xa/0x20
 [<ffffffff8109e5d0>] ? kthread+0x0/0xc0
 [<ffffffff8100c200>] ? child_rip+0x0/0x20
Code: c6 9c 03 00 00 4c 89 f7 e8 91 bf 19 e1 48 8b 33 ba 01 00 00 00 4c 89 e7 e 
RIP  [<ffffffffa039179d>] jbd2_journal_dirty_metadata+0x10d/0x150 [jbd2]
 RSP <ffff8801fa26da00>

We've got a vmcore file on one of the servers, which we can upload if this is required.

After failing over the MDT and recovering the OSTs, I can stop the lfsck layout check.



 Comments   
Comment by Peter Jones [ 03/Jun/15 ]

Fan Yong

Could you please advise on this one?

Thanks

Peter

Comment by nasf (Inactive) [ 04/Jun/15 ]

The reason is that osd_declare_xattr_set() does not reserve enough journal credits for the subsequent osd_xattr_set() that LFSCK triggers when upgrading an object's FID-in-LMA.

static int osd_declare_xattr_set(const struct lu_env *env,
                                 struct dt_object *dt,
                                 const struct lu_buf *buf, const char *name,
                                 int fl, struct thandle *handle)
{
...
        /* optimistic optimization: LMA is set first and usually fit inode */
        if (strcmp(name, XATTR_NAME_LMA) == 0) {
                if (dt_object_exists(dt))
                        credits = 0;
                else
                        credits = 1;
        } else if (strcmp(name, XATTR_NAME_VERSION) == 0) {
                credits = 1;
...
}

The optimisation above does not consider the upgrade case and should be improved.

Comment by Gerrit Updater [ 04/Jun/15 ]

Fan Yong (fan.yong@intel.com) uploaded a new patch: http://review.whamcloud.com/15132
Subject: LU-6683 osd: declare enough credits for generating LMA
Project: fs/lustre-release
Branch: b2_7
Current Patch Set: 1
Commit: b22c236872dbd5e585ff1b3ff7cf08e00967f6b6

Comment by nasf (Inactive) [ 04/Jun/15 ]

Frederik, would you please try the above patch? Thanks!

Comment by Frederik Ferner (Inactive) [ 04/Jun/15 ]

I noticed on the review page that the builds are marked as failed, but this seems to affect RHEL7 only. I'll certainly try the patch ASAP.

Comment by nasf (Inactive) [ 04/Jun/15 ]

The failure is related to the build system, not the patch. So please go ahead with the patch. Thanks!

Comment by Frederik Ferner (Inactive) [ 04/Jun/15 ]

Thanks for confirming regarding the build failure.

I have now updated our test file system to include the patch and can confirm that this fixed the crash for us.

Comment by nasf (Inactive) [ 04/Jun/15 ]

Thanks, Frederik, for the update. The patch for b2_7 has been replaced by the patch for b2_7_fe: http://review.whamcloud.com/#/c/15133/

Comment by Andreas Dilger [ 19/Jun/15 ]

Is this patch needed for master?

Comment by nasf (Inactive) [ 19/Jun/15 ]

Yes, master needs the patch also.

Comment by Gerrit Updater [ 19/Jun/15 ]

Fan Yong (fan.yong@intel.com) uploaded a new patch: http://review.whamcloud.com/15361
Subject: LU-6683 osd: declare enough credits for generating LMA
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 71d360f5a2201aa666382a0b7da1b89860596777

Comment by Gerrit Updater [ 21/Jul/15 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/15361/
Subject: LU-6683 osd: declare enough credits for generating LMA
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 3675d14de7ffcd761eca1448aab950f80773412a

Comment by Peter Jones [ 21/Jul/15 ]

Landed for 2.8
