Lustre / LU-4958

do not crash accessing LOV object with FID {0, 0}

Details

    • Bug
    • Resolution: Fixed
    • Critical
    • Lustre 2.7.0, Lustre 2.5.4
    • Lustre 2.1.6, Lustre 2.5.1, Lustre 2.4.3
    • 3
    • 13718

    Description

      If an orphan object with stripe_index != 0 is linked to a recreated MDS inode in http://review.whamcloud.com/7810, but not all of the file's objects are present (e.g. some stripes of the file were lost, but an orphan with a non-zero stripe_index remained), the client will crash when the file is accessed (e.g. "ls -l"):

      LustreError: 19393:0:(ldlm_resource.c:1077:ldlm_resource_get()) ASSERTION( name->name[0] != 0 ) failed: 
      LustreError: 19393:0:(ldlm_resource.c:1077:ldlm_resource_get()) LBUG
      Pid: 19393, comm: ls
      
      Call Trace:
       [<ffffffffa0ef9895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
       [<ffffffffa0ef9e97>] lbug_with_loc+0x47/0xb0 [libcfs]
       [<ffffffffa07c4f20>] ldlm_resource_get+0x700/0x900 [ptlrpc]
       [<ffffffffa07bf1b9>] ldlm_lock_create+0x59/0xcc0 [ptlrpc]
       [<ffffffffa07d8314>] ldlm_cli_enqueue+0xa4/0x790 [ptlrpc]
       [<ffffffffa09ebd44>] osc_enqueue_base+0x1e4/0x5b0 [osc]
       [<ffffffffa0a082fd>] osc_lock_enqueue+0x1ed/0x8c0 [osc]
       [<ffffffffa105be7c>] cl_enqueue_try+0xfc/0x300 [obdclass]
       [<ffffffffa0a5d42a>] lov_lock_enqueue+0x22a/0x850 [lov]
       [<ffffffffa105be7c>] cl_enqueue_try+0xfc/0x300 [obdclass]
       [<ffffffffa105d0cf>] cl_enqueue_locked+0x6f/0x1f0 [obdclass]
       [<ffffffffa105dd1e>] cl_lock_request+0x7e/0x270 [obdclass]
       [<ffffffffa123dba0>] cl_glimpse_lock+0x180/0x490 [lustre]
       [<ffffffffa123e415>] cl_glimpse_size0+0x1a5/0x1d0 [lustre]
       [<ffffffffa11eb55d>] ll_inode_revalidate_it+0x1cd/0x660 [lustre]
       [<ffffffffa11eba3a>] ll_getattr_it+0x4a/0x1b0 [lustre]
       [<ffffffffa11ebbd7>] ll_getattr+0x37/0x40 [lustre]
       [<ffffffff81186db1>] vfs_getattr+0x51/0x80
       [<ffffffff81186e40>] vfs_fstatat+0x60/0x80
       [<ffffffff81186ece>] vfs_lstat+0x1e/0x20
       [<ffffffff81186ef4>] sys_newlstat+0x24/0x50
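
The failing assertion can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions, not the actual Lustre code or the fix that was landed: the types `model_fid` and `model_res_name`, the helper names, the assumed mapping of the object id into name[0], and the choice of -ENOENT are all hypothetical. It shows why a stripe whose FID is {0, 0} produces a resource name with name[0] == 0, and how a caller could reject such a stripe before a lock is requested instead of hitting the assertion.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Simplified stand-ins for illustration; not the real Lustre structures. */
struct model_fid {
	uint64_t seq;	/* sequence number */
	uint32_t oid;	/* object id within the sequence */
};

struct model_res_name {
	uint64_t name[4];
};

/*
 * Assumption: the lock resource name is derived from the object
 * identifier, with the object id in name[0] and the sequence in name[1].
 */
static void model_build_res_name(const struct model_fid *fid,
				 struct model_res_name *res)
{
	res->name[0] = fid->oid;
	res->name[1] = fid->seq;
	res->name[2] = 0;
	res->name[3] = 0;
}

/*
 * Build the resource name and refuse to continue when name[0] is zero,
 * which is the condition the assertion in the trace checks.
 * Returns 0 on success, or -ENOENT (a hypothetical choice) for a hole.
 */
static int model_enqueue_checked(const struct model_fid *fid,
				 struct model_res_name *res)
{
	model_build_res_name(fid, res);
	if (res->name[0] == 0)
		return -ENOENT;
	return 0;
}
```

With this model, a stripe with FID {0, 0} is reported as an error to the caller, while a stripe with a non-zero object id proceeds normally.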
      

      I'm not sure what the right way to handle this is. The problem affects all old clients trying to access files in .lustre/lost+found, so fixing just the 2.6 client is not enough. Either we need to backport the fix to the 2.5.2, 2.4.3, and 2.1.7 clients (not very good, since we can't be sure whether a given client has the fix), or use some other lmm_magic or lmm_pattern to ensure that unpatched clients will not understand the layout.

      In the second case (using a different lmm_magic or lmm_pattern, maybe LOV_PATTERN_F_SPARSE?) the lfsck_layout_extend_lovea() code would need to decide, as stripes are added, whether the layout is sparse (set the flag, so old clients cannot access the file) or full (clear the flag, so old clients can access it).
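
The set-or-clear decision described for the second case could look roughly like the following. This is a hedged sketch, not the code in lfsck_layout_extend_lovea(): the flag value, the `sketch_fid` type, and the function names are assumptions made for illustration, and the description only proposes LOV_PATTERN_F_SPARSE as a possible name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit value; the description only suggests the flag's name. */
#define SKETCH_PATTERN_F_SPARSE 0x40000000u

/* Simplified stand-in for a per-stripe object identifier. */
struct sketch_fid {
	uint64_t seq;
	uint32_t oid;
};

/* A stripe slot with FID {0, 0} is treated as a missing object (a hole). */
static bool sketch_fid_is_zero(const struct sketch_fid *fid)
{
	return fid->seq == 0 && fid->oid == 0;
}

/*
 * Recompute the sparse flag after the stripe array has changed:
 * set it if any slot is still a hole, clear it once every slot is filled.
 * All other bits of the pattern are left untouched.
 */
static uint32_t sketch_update_pattern(uint32_t pattern,
				      const struct sketch_fid *stripes,
				      size_t count)
{
	bool sparse = false;

	for (size_t i = 0; i < count; i++) {
		if (sketch_fid_is_zero(&stripes[i])) {
			sparse = true;
			break;
		}
	}

	if (sparse)
		pattern |= SKETCH_PATTERN_F_SPARSE;
	else
		pattern &= ~SKETCH_PATTERN_F_SPARSE;

	return pattern;
}
```

Calling sketch_update_pattern() each time a stripe is added keeps the flag in step with the layout: it stays set while any hole remains and is cleared as soon as the last missing stripe is filled in.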

            People

              ys Yang Sheng
              yong.fan nasf (Inactive)
