Details
-
Bug
-
Resolution: Duplicate
-
Critical
-
None
-
Lustre 2.7.0
-
None
-
RHEL6 server
-
3
-
9223372036854775807
Description
We have recently seen frequent occurrences of the LBUG below.
The affected machines are all exporting our Lustre file system via NFS to other Linux machines.
May 2 06:59:03 i05-storage1 kernel: LustreError: 3023:0:(dcache.c:236:ll_d_init()) ASSERTION( de->d_op == &ll_d_ops ) failed:
May 2 06:59:03 i05-storage1 kernel: LustreError: 3023:0:(dcache.c:236:ll_d_init()) LBUG
May 2 06:59:03 i05-storage1 kernel: Pid: 3023, comm: nfsd
May 2 06:59:03 i05-storage1 kernel:
May 2 06:59:03 i05-storage1 kernel: Call Trace:
May 2 06:59:03 i05-storage1 kernel: [<ffffffffa0383895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
May 2 06:59:03 i05-storage1 kernel: [<ffffffffa0383e97>] lbug_with_loc+0x47/0xb0 [libcfs]
May 2 06:59:03 i05-storage1 kernel: [<ffffffffa097e69f>] ll_d_init+0x2ff/0x540 [lustre]
May 2 06:59:03 i05-storage1 kernel: [<ffffffffa09c1b5b>] ll_iget_for_nfs+0x20b/0x300 [lustre]
May 2 06:59:03 i05-storage1 kernel: [<ffffffffa09c1d89>] ll_fh_to_dentry+0x99/0xa0 [lustre]
May 2 06:59:03 i05-storage1 kernel: [<ffffffffa0b3871c>] exportfs_decode_fh+0x5c/0x2bc [exportfs]
May 2 06:59:03 i05-storage1 kernel: [<ffffffffa0bcc8e0>] ? nfsd_acceptable+0x0/0x120 [nfsd]
May 2 06:59:03 i05-storage1 kernel: [<ffffffffa0b56da0>] ? cache_check+0x60/0x370 [sunrpc]
May 2 06:59:03 i05-storage1 kernel: [<ffffffff8117f76b>] ? cache_alloc_refill+0x15b/0x240
May 2 06:59:03 i05-storage1 kernel: [<ffffffffa0bccdda>] fh_verify+0x32a/0x640 [nfsd]
May 2 06:59:03 i05-storage1 kernel: [<ffffffffa0bcfda1>] nfsd_open+0x31/0x240 [nfsd]
May 2 06:59:03 i05-storage1 kernel: [<ffffffffa0bd022b>] nfsd_commit+0x3b/0xa0 [nfsd]
May 2 06:59:03 i05-storage1 kernel: [<ffffffff810aff24>] ? groups_free+0x54/0x60
May 2 06:59:03 i05-storage1 kernel: [<ffffffffa0bd769d>] nfsd3_proc_commit+0x9d/0x100 [nfsd]
May 2 06:59:03 i05-storage1 kernel: [<ffffffffa0bc9405>] nfsd_dispatch+0xe5/0x230 [nfsd]
May 2 06:59:03 i05-storage1 kernel: [<ffffffffa0b4ccf4>] svc_process_common+0x344/0x640 [sunrpc]
May 2 06:59:03 i05-storage1 kernel: [<ffffffff8106c500>] ? default_wake_function+0x0/0x20
May 2 06:59:03 i05-storage1 kernel: [<ffffffffa0b4d390>] svc_process+0x110/0x160 [sunrpc]
May 2 06:59:03 i05-storage1 kernel: [<ffffffffa0bc9c82>] nfsd+0xc2/0x160 [nfsd]
May 2 06:59:03 i05-storage1 kernel: [<ffffffffa0bc9bc0>] ? nfsd+0x0/0x160 [nfsd]
May 2 06:59:03 i05-storage1 kernel: [<ffffffff810a640e>] kthread+0x9e/0xc0
May 2 06:59:03 i05-storage1 kernel: [<ffffffff8100c28a>] child_rip+0xa/0x20
May 2 06:59:03 i05-storage1 kernel: [<ffffffff810a6370>] ? kthread+0x0/0xc0
May 2 06:59:03 i05-storage1 kernel: [<ffffffff8100c280>] ? child_rip+0x0/0x20
May 2 06:59:03 i05-storage1 kernel:
This looks similar to LU-9241, but the stack trace is not quite the same. Also, the patch for that issue is against master, while we are running b2_7_fe, so we would need a fix for that branch.
We are still investigating the events leading up to the crash and hope to find a reproducer.
Issue Links
- duplicates
-
LU-9421 minor improvement on the implementation of libcfs crypto framework
-
- Closed
-