Lustre / LU-3727

LBUG (llite_nfs.c:281:ll_get_parent()) ASSERTION(body->valid & OBD_MD_FLID) failed

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.7.0
    • Affects Version/s: Lustre 2.1.5, Lustre 1.8.9, Lustre 2.4.1
    • 3
    • 9597

    Description

      At GE Global Research, we ran into an LBUG on a Lustre 1.8.9 client that re-exports a Lustre 2.1.5 filesystem over NFS:

      Jul 31 10:26:46 scinfra3 kernel: Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
      Jul 31 10:26:46 scinfra3 kernel: NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
      Jul 31 10:26:46 scinfra3 kernel: NFSD: starting 90-second grace period
      Jul 31 10:26:53 scinfra3 ntpd[8318]: synchronized to 3.40.208.30, stratum 2
      Jul 31 10:29:46 scinfra3 kernel: LustreError: 27396:0:(llite_nfs.c:281:ll_get_parent()) ASSERTION(body->valid & OBD_MD_FLID) failed
      Jul 31 10:29:46 scinfra3 kernel: LustreError: 27396:0:(llite_nfs.c:281:ll_get_parent()) LBUG
      Jul 31 10:29:46 scinfra3 kernel: Pid: 27396, comm: nfsd
      Jul 31 10:29:46 scinfra3 kernel:
      Jul 31 10:29:46 scinfra3 kernel: Call Trace:
      Jul 31 10:29:46 scinfra3 kernel: [ ] libcfs_debug_dumpstack+0x51/0x60 [libcfs]
      Jul 31 10:29:46 scinfra3 kernel: [ ] lbug_with_loc+0x7a/0xd0 [libcfs]
      Jul 31 10:29:46 scinfra3 kernel: [ ] tracefile_init+0x0/0x110 [libcfs]
      Jul 31 10:29:46 scinfra3 kernel: [ ] ll_get_parent+0x1e3/0x2b0 [lustre]
      Jul 31 10:29:46 scinfra3 kernel: [ ] ll_get_dentry+0x6b/0xe0 [lustre]
      Jul 31 10:29:46 scinfra3 kernel: [ ] mutex_lock+0xd/0x1d
      Jul 31 10:29:46 scinfra3 kernel: [ ] find_exported_dentry+0x241/0x486 [exportfs]
      Jul 31 10:29:46 scinfra3 kernel: [ ] nfsd_acceptable+0x0/0xdc [nfsd]
      Jul 31 10:29:46 scinfra3 kernel: [ ] autoremove_wake_function+0x0/0x2e
      Jul 31 10:29:46 scinfra3 kernel: [ ] sunrpc_cache_lookup+0x4b/0x128 [sunrpc]
      Jul 31 10:29:46 scinfra3 kernel: [ ] exp_get_by_name+0x5b/0x71 [nfsd]
      Jul 31 10:29:46 scinfra3 kernel: [ ] exp_find_key+0x89/0x9c [nfsd]
      Jul 31 10:29:46 scinfra3 kernel: [ ] nfsd_acceptable+0x0/0xdc [nfsd]
      Jul 31 10:29:46 scinfra3 kernel: [ ] ll_decode_fh+0x197/0x240 [lustre]
      Jul 31 10:29:46 scinfra3 kernel: [ ] set_current_groups+0x116/0x164
      Jul 31 10:29:46 scinfra3 kernel: [ ] fh_verify+0x29c/0x4cf [nfsd]
      Jul 31 10:29:46 scinfra3 kernel: [ ] nfsd3_proc_getattr+0x8a/0xbe [nfsd]
      Jul 31 10:29:46 scinfra3 kernel: [ ] nfsd_dispatch+0xd8/0x1d6 [nfsd]
      Jul 31 10:29:46 scinfra3 kernel: [ ] svc_process+0x3f8/0x6bf [sunrpc]
      Jul 31 10:29:46 scinfra3 kernel: [ ] __down_read+0x12/0x92
      Jul 31 10:29:46 scinfra3 kernel: [ ] nfsd+0x0/0x2cb [nfsd]
      Jul 31 10:29:46 scinfra3 kernel: [ ] nfsd+0x1a5/0x2cb [nfsd]
      Jul 31 10:29:46 scinfra3 kernel: [ ] child_rip+0xa/0x11
      Jul 31 10:29:46 scinfra3 kernel: [ ] nfsd+0x0/0x2cb [nfsd]
      Jul 31 10:29:46 scinfra3 kernel: [ ] nfsd+0x0/0x2cb [nfsd]
      Jul 31 10:29:46 scinfra3 kernel: [ ] child_rip+0x0/0x11
      Jul 31 10:29:46 scinfra3 kernel:
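
The failed check, `body->valid & OBD_MD_FLID`, tests whether one flag bit is set in a bitmask that records which fields of a reply were filled in. The sketch below only illustrates that kind of check, with an error return in place of a fatal assertion. It is not Lustre code: `SKETCH_MD_FLID`, `struct sketch_body`, and `sketch_get_parent` are hypothetical names, the flag value is illustrative, and returning `-ESTALE` is an assumption about one way to fail a single request, not the fix applied for this ticket.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the OBD_MD_FLID bit; the value is illustrative. */
#define SKETCH_MD_FLID 0x1ULL

/* Hypothetical, simplified reply body: a bitmask of valid fields plus an ID. */
struct sketch_body {
    uint64_t valid;     /* one bit per field the server filled in */
    uint64_t parent_id; /* meaningful only if SKETCH_MD_FLID is set */
};

/*
 * Return 0 and store the parent ID if the reply carries one.
 * Return -ESTALE if the ID bit is missing, so the caller can fail
 * this one request instead of halting the calling thread.
 */
static int sketch_get_parent(const struct sketch_body *body,
                             uint64_t *parent_id)
{
    if (body == NULL || parent_id == NULL)
        return -EINVAL;

    /* Same shape of test as the assertion, but handled as an error. */
    if (!(body->valid & SKETCH_MD_FLID)) {
        fprintf(stderr, "reply lacks ID field (valid=%#llx)\n",
                (unsigned long long)body->valid);
        return -ESTALE;
    }

    *parent_id = body->parent_id;
    return 0;
}
```

A caller would pass a reply whose `valid` mask may or may not include the ID bit. With the bit clear, the function reports the problem and returns a negative errno value, leaving `*parent_id` untouched.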

      It appears to be easily reproducible. We are going to try to get a core dump, but I was wondering whether anything obvious stands out from this trace, or whether there are other Jira tickets I might have missed. Also, is there any other information that would be useful?

      Thanks.

      Attachments

        1. unlink08.c
          10 kB
        2. lustre.log
          3.60 MB
        3. log.unlink08.lctl.dk.out.gz
          3.52 MB
        4. log.txt
          44 kB

            People

              Assignee: Emoly Liu (emoly.liu)
              Reporter: Oz Rentas (orentas, inactive)