  1. Lustre
  2. LU-3564

LBUG exporting lustre 2.1.5 via NFS on RHEL

Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major
    • None
    • None
    • None
    • Environment: RHEL 6.2 (2.6.32-279.19.1.el6.x86_64) patchless client exporting Lustre 2.1.5 via NFS
    • 3
    • 8972

    Description

      We have a RHEL client that exports Lustre via NFS, and it has been crashing with the following LBUG:

      Jul 7 14:27:55 lstgwbal823 kernel: LustreError: 2037:0:(mdc_locks.c:789:mdc_finish_intent_lock()) ASSERTION( it->d.lustre.it_status != 0 ) failed:
      Jul 7 14:27:55 lstgwbal823 kernel: LustreError: 2037:0:(mdc_locks.c:789:mdc_finish_intent_lock()) LBUG
      Jul 7 14:27:55 lstgwbal823 kernel: Pid: 2037, comm: nfsd
      Jul 7 14:27:55 lstgwbal823 kernel:
      Jul 7 14:27:55 lstgwbal823 kernel: Call Trace:
      Jul 7 14:27:55 lstgwbal823 kernel: [<ffffffffa0373785>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
      Jul 7 14:27:55 lstgwbal823 kernel: [<ffffffffa0373d97>] lbug_with_loc+0x47/0xb0 [libcfs]
      Jul 7 14:27:55 lstgwbal823 kernel: [<ffffffffa084f8fa>] mdc_finish_intent_lock+0x77a/0x830 [mdc]
      Jul 7 14:27:55 lstgwbal823 kernel: [<ffffffffa0852bc8>] mdc_intent_lock+0x248/0x630 [mdc]
      Jul 7 14:27:55 lstgwbal823 kernel: [<ffffffffa08097c0>] ? fld_client_lookup+0x60/0x4a0 [fld]
      Jul 7 14:27:55 lstgwbal823 kernel: [<ffffffff81464515>] ? ip_local_out+0x25/0x30
      Jul 7 14:27:55 lstgwbal823 kernel: [<ffffffffa0a349f0>] ? ll_md_blocking_ast+0x0/0x710 [lustre]
      Jul 7 14:27:55 lstgwbal823 kernel: [<ffffffffa06598c0>] ? ldlm_completion_ast+0x0/0x720 [ptlrpc]
      Jul 7 14:27:55 lstgwbal823 kernel: [<ffffffffa0b32ae2>] ? lmv_fld_lookup+0x82/0x320 [lmv]
      Jul 7 14:27:55 lstgwbal823 kernel: [<ffffffffa0b31930>] lmv_intent_open+0x2d0/0x10a0 [lmv]
      Jul 7 14:27:55 lstgwbal823 kernel: Kernel panic - not syncing: LBUG
      Jul 7 14:27:55 lstgwbal823 kernel: Pid: 2039, comm: nfsd Not tainted 2.6.32-279.19.1.el6.x86_64 #1
      Jul 7 14:27:55 lstgwbal823 kernel: [<ffffffffa0a349f0>] ? ll_md_blocking_ast+0x0/0x710 [lustre]
      Jul 7 14:27:55 lstgwbal823 kernel: [<ffffffff8147967e>] ? tcp_transmit_skb+0x3fe/0x7b0
      Jul 7 14:27:55 lstgwbal823 kernel: [<ffffffffa0b32990>] lmv_intent_lock+0x290/0x360 [lmv]
      Jul 7 14:27:55 lstgwbal823 kernel: [<ffffffffa0a349f0>] ? ll_md_blocking_ast+0x0/0x710 [lustre]
      Jul 7 14:27:55 lstgwbal823 kernel: [<ffffffffa0a3494d>] ? ll_i2gids+0x3d/0xe0 [lustre]
      Jul 7 14:27:55 lstgwbal823 kernel: [<ffffffffa0a186cf>] ? ll_prep_md_op_data+0x14f/0x400 [lustre]
      Jul 7 14:27:55 lstgwbal823 kernel: [<ffffffffa0a0a49c>] ll_intent_file_open+0x18c/0xb80 [lustre]
      Jul 7 14:27:55 lstgwbal823 kernel: [<ffffffffa0a349f0>] ? ll_md_blocking_ast+0x0/0x710 [lustre]
      Jul 7 14:27:55 lstgwbal823 kernel: [<ffffffffa0a0b99d>] ll_file_open+0x22d/0xcf0 [lustre]
      Jul 7 14:27:55 lstgwbal823 kernel: [<ffffffffa0a0b770>] ? ll_file_open+0x0/0xcf0 [lustre]
      Jul 7 14:27:55 lstgwbal823 kernel: [<ffffffff81173c8a>] __dentry_open+0x10a/0x360
      Jul 7 14:27:55 lstgwbal823 kernel: [<ffffffff81173f32>] dentry_open+0x52/0xc0
      Jul 7 14:27:55 lstgwbal823 kernel: [<ffffffffa032093e>] nfsd_open+0x11e/0x230 [nfsd]
      Jul 7 14:27:55 lstgwbal823 kernel: [<ffffffff814ec646>] ? _spin_lock_bh+0x16/0x40
      Jul 7 14:27:55 lstgwbal823 kernel: [<ffffffffa0320e6f>] nfsd_read+0x3f/0x2e0 [nfsd]
      Jul 7 14:27:55 lstgwbal823 kernel: [<ffffffffa03286b5>] nfsd3_proc_read+0xd5/0x180 [nfsd]
      Jul 7 14:27:55 lstgwbal823 kernel: [<ffffffffa031943e>] nfsd_dispatch+0xfe/0x240 [nfsd]
      Jul 7 14:27:55 lstgwbal823 kernel: [<ffffffffa01f6794>] svc_process_common+0x344/0x640 [sunrpc]
      Jul 7 14:27:55 lstgwbal823 kernel: [<ffffffff8105fa40>] ? default_wake_function+0x0/0x20
      Jul 7 14:27:55 lstgwbal823 kernel: [<ffffffffa01f6dd0>] svc_process+0x110/0x160 [sunrpc]
      Jul 7 14:27:55 lstgwbal823 kernel: [<ffffffffa0319b62>] nfsd+0xc2/0x160 [nfsd]
      Jul 7 14:27:55 lstgwbal823 kernel: [<ffffffffa0319aa0>] ? nfsd+0x0/0x160 [nfsd]
      Jul 7 14:27:55 lstgwbal823 kernel: [<ffffffff81090626>] kthread+0x96/0xa0
      Jul 7 14:27:55 lstgwbal823 kernel: [<ffffffff8100c0ca>] child_rip+0xa/0x20
      Jul 7 14:27:55 lstgwbal823 kernel: [<ffffffff81090590>] ? kthread+0x0/0xa0
      Jul 7 14:27:55 lstgwbal823 kernel: [<ffffffff8100c0c0>] ? child_rip+0x0/0x20

      This is a RHEL 6.2 node running the following Lustre version:

      lustre: 2.1.5
      kernel: patchless_client
      build: RC1--PRISTINE-2.6.32-279.19.1.el6.x86_64

      We have been hitting this bug frequently, about once every 48 hours, but have not been able to easily reproduce the scenario that causes this client to crash.
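
      Because the crash is intermittent, it may help to pull every LBUG line out of the collected syslog files and look at the spacing between occurrences. The following is only a minimal sketch in Python, not part of Lustre: the names find_lbugs and hours_between are hypothetical helpers, the regular expression assumes the syslog layout shown in the trace above, and the year has to be supplied by the caller because syslog timestamps omit it.

```python
import re
from datetime import datetime

# Assumed layout, taken from the trace above:
# "Jul 7 14:27:55 host kernel: LustreError: 2037:0:(mdc_locks.c:789:func()) LBUG"
LBUG_RE = re.compile(
    r"^\s*(?P<ts>\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+kernel:\s+"
    r"LustreError:\s+(?P<pid>\d+):\d+:"
    r"\((?P<file>[^:]+):(?P<line>\d+):(?P<func>\w+)\(\)\)\s+LBUG\s*$"
)


def find_lbugs(lines, year):
    """Return one dict per LBUG line found in an iterable of syslog lines."""
    events = []
    for text in lines:
        m = LBUG_RE.match(text)
        if m is None:
            continue  # assertion text, stack frames, unrelated messages
        when = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
        events.append({
            "when": when,
            "host": m["host"],
            "pid": int(m["pid"]),
            "file": m["file"],
            "line": int(m["line"]),
            "func": m["func"],
        })
    return events


def hours_between(events):
    """Return the gaps, in hours, between consecutive LBUG events."""
    times = sorted(e["when"] for e in events)
    return [(b - a).total_seconds() / 3600 for a, b in zip(times, times[1:])]
```

      A possible use, assuming the messages file has been copied off the gateway: open it, pass the file object and the year of the log to find_lbugs, and print hours_between on the result to see how regular the crash interval is, along with the pid and function of each hit.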

            People

              Assignee: WC Triage (wc-triage)
              Reporter: Sefa Ampofo (sampofo)