Details
-
Bug
-
Resolution: Fixed
-
Blocker
-
Lustre 2.4.2, Lustre 2.5.3
-
2
-
14213
Description
After upgrading our servers from Lustre 2.4.0-28chaos to Lustre 2.4.2-11chaos (see github.com/chaos/lustre), we are seeing many client crashes with a NULL pointer dereference in posix_acl_valid() under mdc_get_lustre_md(). Note that both 2.4.0-19chaos client nodes and 2.4.2-11chaos client nodes are exhibiting this behavior.
The backtrace looks like:
PID: 3690 TASK: ffff880338d69540 CPU: 7 COMMAND: "ll_sa_3689" #0 [ffff8802ddf51800] machine_kexec+0x18b at ffffffff810391ab #1 [ffff8802ddf51860] crash_kexec+0x72 at ffffffff810c5d52 #2 [ffff8802ddf51930] oops_end+0xc0 at ffffffff8152e630 #3 [ffff8802ddf51960] no_context+0xfb at ffffffff8104a00b #4 [ffff8802ddf519b0] __bad_area_nosemaphore+0x125 at ffffffff8104a295 #5 [ffff8802ddf51a00] bad_area_nosemaphore+0x13 at ffffffff8104a363 #6 [ffff8802ddf51a10] __do_page_fault+0x32f at ffffffff8104aacf #7 [ffff8802ddf51b30] do_page_fault+0x3e at ffffffff8153057e #8 [ffff8802ddf51b60] page_fault+0x25 at ffffffff8152d935 [exception RIP: posix_acl_valid+9] RIP: ffffffff811ea9b9 RSP: ffff8802ddf51c10 RFLAGS: 00010207 RAX: 0000000000000000 RBX: ffff8805ec607000 RCX: ffff8805ebddda00 RDX: 0000000000000004 RSI: 0000000000000004 RDI: 0000000000000000 RBP: ffff8802ddf51c10 R8: 0000000000000000 R9: ffff8805ec46dc40 R10: 0000000000000000 R11: 0000000000000000 R12: ffff88063987bc00 R13: ffff8802ddf51cf0 R14: 0000000000000000 R15: 0000000000000050 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #9 [ffff8802ddf51c18] mdc_get_lustre_md+0xc5a at ffffffffa0af4faa [mdc] #10 [ffff8802ddf51c78] lmv_get_lustre_md+0x153 at ffffffffa0d668d3 [lmv] #11 [ffff8802ddf51cc8] ll_prep_inode+0x3f7 at ffffffffa0c7e217 [lustre] #12 [ffff8802ddf51da8] ll_post_statahead+0x2f7 at ffffffffa0ca0577 [lustre] #13 [ffff8802ddf51e18] ll_statahead_thread+0xd38 at ffffffffa0ca4ff8 [lustre] #14 [ffff8802ddf51f48] child_rip+0xa at ffffffff8100c24a
The crash is on this line in posix_acl_valid():
crash> gdb list *(posix_acl_valid+9) 0xffffffff811ea9b9 is in posix_acl_valid (fs/posix_acl.c:88). 83 const struct posix_acl_entry *pa, *pe; 84 int state = ACL_USER_OBJ; 85 unsigned int id = 0; /* keep gcc happy */ 86 int needs_mask = 0; 87 88 FOREACH_ACL_ENTRY(pa, acl, pe) { 89 if (pa->e_perm & ~(ACL_READ|ACL_WRITE|ACL_EXECUTE)) 90 return -EINVAL; 91 switch (pa->e_tag) { 92 case ACL_USER_OBJ:
The problem is not particular to the statahead thread. That was just one example. Here is another where I ran getfattr on the problem file:
2014-06-05 14:51:29 BUG: unable to handle kernel NULL pointer dereference at 0000000000000004 2014-06-05 14:51:29 IP: [<ffffffff811ea9b9>] posix_acl_valid+0x9/0x120 2014-06-05 14:51:29 PGD 638f93067 PUD 5f1873067 PMD 0 2014-06-05 14:51:29 Oops: 0000 [#1] SMP 2014-06-05 14:51:29 last sysfs file: /sys/devices/system/edac/pci/pci_parity_count 2014-06-05 14:51:29 CPU 10 2014-06-05 14:51:29 Modules linked in: lmv(U) mgc(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) spl(U) zlib_deflate lustre(U) lov(U) osc(U) mdc(U 2014-06-05 14:51:29 2014-06-05 14:51:29 Pid: 6114, comm: getfattr Tainted: P --------------- 2.6.32-431.17.2.1chaos.ch5.2.x86_64 #1 Dell XS23-TY35 /0GW08P 2014-06-05 14:51:29 RIP: 0010:[<ffffffff811ea9b9>] [<ffffffff811ea9b9>] posix_acl_valid+0x9/0x120 2014-06-05 14:51:29 RSP: 0018:ffff8805f0e698d8 EFLAGS: 00010207 2014-06-05 14:51:29 RAX: 0000000000000000 RBX: ffff880639212000 RCX: ffff8805f1870a00 2014-06-05 14:51:29 RDX: 0000000000000004 RSI: 0000000000000004 RDI: 0000000000000000 2014-06-05 14:51:29 RBP: ffff8805f0e698d8 R08: 0000000000000000 R09: 0000000000000040 2014-06-05 14:51:29 R10: 0000000000000000 R11: 0000000000000000 R12: ffff8805f3b22c00 2014-06-05 14:51:29 R13: ffff8805f0e699b8 R14: 0000000000000000 R15: 0000000000000050 2014-06-05 14:51:29 FS: 00002aaaab266fa0(0000) GS:ffff88034ac80000(0000) knlGS:0000000000000000 2014-06-05 14:51:29 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b 2014-06-05 14:51:29 CR2: 0000000000000004 CR3: 0000000638355000 CR4: 00000000000007e0 2014-06-05 14:51:29 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2014-06-05 14:51:29 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2014-06-05 14:51:29 Process getfattr (pid: 6114, threadinfo ffff8805f0e68000, task ffff88063a007500) 2014-06-05 14:51:29 Stack: 2014-06-05 14:51:29 ffff8805f0e69938 ffffffffa091dfaa ffff8805f1870a00 ffff8806389acc00 2014-06-05 14:51:29 <d> 0000000000000000 0000000000000000 6c0db3630c468718 ffff8806389acc00 2014-06-05 14:51:29 <d> ffff880639212000 ffff880639212400 ffff8805f0e699b8 ffff8805f3b22c00 2014-06-05 14:51:29 Call Trace: 2014-06-05 14:51:29 [<ffffffffa091dfaa>] mdc_get_lustre_md+0xc5a/0x1420 [mdc] 2014-06-05 14:51:29 [<ffffffffa0d6c8d3>] lmv_get_lustre_md+0x153/0x3d0 [lmv] 2014-06-05 14:51:29 [<ffffffffa0a99217>] ll_prep_inode+0x3f7/0xf60 [lustre] 2014-06-05 14:51:29 [<ffffffffa079faa8>] ? req_capsule_server_get+0x18/0x20 [ptlrpc] 2014-06-05 14:51:29 [<ffffffffa0d85e4a>] ? lmv_intent_lookup+0x25a/0x770 [lmv] 2014-06-05 14:51:29 [<ffffffffa0aa53b0>] ? ll_md_blocking_ast+0x0/0x740 [lustre] 2014-06-05 14:51:29 [<ffffffffa0aa89aa>] ll_lookup_it_finish+0x1da/0xe80 [lustre] 2014-06-05 14:51:29 [<ffffffffa0d86fca>] ? lmv_intent_lock+0x32a/0x380 [lmv] 2014-06-05 14:51:29 [<ffffffffa0aa53b0>] ? ll_md_blocking_ast+0x0/0x740 [lustre] 2014-06-05 14:51:29 [<ffffffffa0aa9a3d>] ll_lookup_it+0x3ed/0xbd0 [lustre] 2014-06-05 14:51:29 [<ffffffffa0aa53b0>] ? ll_md_blocking_ast+0x0/0x740 [lustre] 2014-06-05 14:51:29 [<ffffffffa0aaa2ac>] ll_lookup_nd+0x8c/0x430 [lustre] 2014-06-05 14:51:29 [<ffffffff811a457e>] ? d_alloc+0x13e/0x1b0 2014-06-05 14:51:29 [<ffffffff811998a5>] do_lookup+0x1a5/0x230 2014-06-05 14:51:29 [<ffffffff81199fb7>] __link_path_walk+0x587/0x850 2014-06-05 14:51:29 [<ffffffff811680ea>] ? alloc_pages_current+0xaa/0x110 2014-06-05 14:51:29 [<ffffffff8119a97a>] path_walk+0x6a/0xe0 2014-06-05 14:51:29 [<ffffffff8119ab8b>] filename_lookup+0x6b/0xc0 2014-06-05 14:51:29 [<ffffffff8119bcb7>] user_path_at+0x57/0xa0 2014-06-05 14:51:29 [<ffffffff8104a9a4>] ? __do_page_fault+0x204/0x490 2014-06-05 14:51:29 [<ffffffff8128ae05>] ? rb_insert_color+0x125/0x160 2014-06-05 14:51:29 [<ffffffff8114f020>] ? __vma_link_rb+0x30/0x40 2014-06-05 14:51:29 [<ffffffff8118f7a0>] vfs_fstatat+0x50/0xa0 2014-06-05 14:51:29 [<ffffffff8118f85e>] vfs_lstat+0x1e/0x20 2014-06-05 14:51:29 [<ffffffff8118f884>] sys_newlstat+0x24/0x50 2014-06-05 14:51:29 [<ffffffff8153057e>] ? do_page_fault+0x3e/0xa0 2014-06-05 14:51:29 [<ffffffff8152d935>] ? page_fault+0x25/0x30 2014-06-05 14:51:29 [<ffffffff8100b0b2>] system_call_fastpath+0x16/0x1b
Thanks! That patch should be a good work-around for us. I'll give it a try.
As I mentioned, the the xattr posix_acl_access has the contents
which is, in octal, just the little-endian form of:
The version is the first, and only, contents of the acl xattr's header. So we know that this is a case of a xattr being written out for an empty acl list.
Why did the Lustre server start doing that? If at all possible, we don't want to to happen because it is going to have negative performance implications.