  1. Lustre
  2. LU-4363

(llite_lib.c:1683:ll_update_inode()) ASSERTION( lu_fid_eq(&lli->lli_fid, &body->fid1) ) failed

Details

    • Bug
    • Resolution: Cannot Reproduce
    • Major
    • None
    • Lustre 2.1.6
    • 3
    • 11946

    Description

      Hi,

      At the IFERC customer site, 7 compute nodes crashed with the following message on the console:

      2013-11-21 00:57:45 LustreError: 92325:0:(llite_lib.c:1683:ll_update_inode()) ASSERTION( lu_fid_eq(&lli->lli_fid, &body->fid1) ) failed: Trying to change FID [0x217294ce4:0x107f0:0x0] to the [0x217294ce4:0x107f1:0x0], inode 150634522759727089/35072332(ffff8807dcbf85f8)
      2013-11-21 00:57:45 LustreError: 92325:0:(llite_lib.c:1683:ll_update_inode()) LBUG
      2013-11-21 00:57:45 Pid: 92325, comm: writer_v131
      2013-11-21 00:57:45
      2013-11-21 00:57:45 Call Trace:
      2013-11-21 00:57:45  [<ffffffffa046f7f5>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
      2013-11-21 00:57:45  [<ffffffffa046fe07>] lbug_with_loc+0x47/0xb0 [libcfs]
      2013-11-21 00:57:45  [<ffffffffa0a91ca0>] ll_update_inode+0x4a0/0xf60 [lustre]
      2013-11-21 00:57:45  [<ffffffffa0a928ea>] ll_prep_inode+0x18a/0xae0 [lustre]
      2013-11-21 00:57:45  [<ffffffffa0a7c8c3>] ll_intent_file_open+0x563/0xb80 [lustre]
      2013-11-21 00:57:45  [<ffffffffa0aa6a90>] ? ll_md_blocking_ast+0x0/0x700 [lustre]
      2013-11-21 00:57:45  [<ffffffff8108163e>] ? down+0x2e/0x50
      2013-11-21 00:57:45  [<ffffffffa0a7cf67>] ll_lov_setstripe_ea_info+0x87/0x2b0 [lustre]
      2013-11-21 00:57:45  [<ffffffffa0a831a5>] ll_lov_setstripe+0x85/0x5a0 [lustre]
      2013-11-21 00:57:45  [<ffffffffa0aa3e8b>] ? ll_stats_ops_tally+0x6b/0xd0 [lustre]
      2013-11-21 00:57:45  [<ffffffffa0a84ac6>] ll_file_ioctl+0x826/0xe00 [lustre]
      2013-11-21 00:57:45  [<ffffffff81179ff2>] vfs_ioctl+0x22/0xa0
      2013-11-21 00:57:45  [<ffffffff8117a4ba>] do_vfs_ioctl+0x3aa/0x580
      2013-11-21 00:57:45  [<ffffffff8117a711>] sys_ioctl+0x81/0xa0
      2013-11-21 00:57:45  [<ffffffff8149970e>] ? do_device_not_available+0xe/0x10
      2013-11-21 00:57:45  [<ffffffff810030f2>] system_call_fastpath+0x16/0x1b
      2013-11-21 00:57:45
      2013-11-21 00:57:45 Kernel panic - not syncing: LBUG
      2013-11-21 00:57:45 Pid: 92325, comm: writer_v131 Tainted: G        W  ---------------    2.6.32-279.5.2.bl6.Bull.36.x86_64 #1
      2013-11-21 00:57:45 Call Trace:
      2013-11-21 00:57:45  [<ffffffff81495fe3>] ? panic+0xa0/0x168
      2013-11-21 00:57:45  [<ffffffffa046fe5b>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
      2013-11-21 00:57:45  [<ffffffffa0a91ca0>] ? ll_update_inode+0x4a0/0xf60 [lustre]
      2013-11-21 00:57:45  [<ffffffffa0a928ea>] ? ll_prep_inode+0x18a/0xae0 [lustre]
      2013-11-21 00:57:45  [<ffffffffa0a7c8c3>] ? ll_intent_file_open+0x563/0xb80 [lustre]
      2013-11-21 00:57:45  [<ffffffffa0aa6a90>] ? ll_md_blocking_ast+0x0/0x700 [lustre]
      2013-11-21 00:57:45  [<ffffffff8108163e>] ? down+0x2e/0x50
      2013-11-21 00:57:45  [<ffffffffa0a7cf67>] ? ll_lov_setstripe_ea_info+0x87/0x2b0 [lustre]
      2013-11-21 00:57:45  [<ffffffffa0a831a5>] ? ll_lov_setstripe+0x85/0x5a0 [lustre]
      2013-11-21 00:57:45  [<ffffffffa0aa3e8b>] ? ll_stats_ops_tally+0x6b/0xd0 [lustre]
      2013-11-21 00:57:45  [<ffffffffa0a84ac6>] ? ll_file_ioctl+0x826/0xe00 [lustre]
      2013-11-21 00:57:45  [<ffffffff81179ff2>] ? vfs_ioctl+0x22/0xa0
      2013-11-21 00:57:45  [<ffffffff8117a4ba>] ? do_vfs_ioctl+0x3aa/0x580
      2013-11-21 00:57:45  [<ffffffff8117a711>] ? sys_ioctl+0x81/0xa0
      2013-11-21 00:57:45  [<ffffffff8149970e>] ? do_device_not_available+0xe/0x10
      2013-11-21 00:57:45  [<ffffffff810030f2>] ? system_call_fastpath+0x16/0x1b
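
      To make the assertion message easier to read, here is a minimal sketch that parses the two FIDs quoted in the log and compares them field by field. `struct fid`, `fid_eq`, `fid_parse` and `fid_demo` are hypothetical stand-ins written for illustration. They are modelled on the sequence:object-id:version layout printed in the message and are not the actual llite code.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative stand-in for a Lustre FID: sequence, object id, version. */
struct fid {
    uint64_t seq;
    uint32_t oid;
    uint32_t ver;
};

/* Two FIDs are equal only if all three components match. */
int fid_eq(const struct fid *a, const struct fid *b)
{
    return a->seq == b->seq && a->oid == b->oid && a->ver == b->ver;
}

/* Parse a FID printed as "[0xSEQ:0xOID:0xVER]"; returns 0 on success. */
int fid_parse(const char *s, struct fid *f)
{
    int n = sscanf(s, "[%" SCNx64 ":%" SCNx32 ":%" SCNx32 "]",
                   &f->seq, &f->oid, &f->ver);
    return n == 3 ? 0 : -1;
}

/* Compare the two FIDs from the console message and print the result. */
void fid_demo(void)
{
    struct fid cached, reply;

    assert(fid_parse("[0x217294ce4:0x107f0:0x0]", &cached) == 0);
    assert(fid_parse("[0x217294ce4:0x107f1:0x0]", &reply) == 0);

    printf("same sequence: %s\n", cached.seq == reply.seq ? "yes" : "no");
    printf("oid delta: %" PRIu32 "\n", reply.oid - cached.oid);
    printf("fid_eq: %d\n", fid_eq(&cached, &reply));
}
```

      Calling `fid_demo()` shows that the two FIDs share the same sequence and differ by one in the object id. That would be consistent with the reply describing a different file, allocated right after the one the cached inode refers to.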
      

      This issue looks like LU-2523 and LU-3311, but the patch for b2_1 has not made any progress since July.

      I have tested with the following reproducer, given in LU-2523:

      llmount.sh
      cd /mnt/lustre
      touch file1
      
      In a single process do:
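        int fd2;
        struct lov_user_md_v3 *lum;
        /* Initialize lum */
        /* Create file2 with delayed object creation, so it has no layout yet */
        fd2 = open("file2", O_RDWR|O_CREAT|O_LOV_DELAY_CREATE, 0666);
        /* Rename file1 over file2 while fd2 is still open on the old file2 */
        rename("file1", "file2");
        /* Set the stripe through the old descriptor; this hits the assertion */
        ioctl(fd2, LL_IOC_LOV_SETSTRIPE, lum);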
      

      With stock 2.1.6 I can easily reproduce the issue. Unfortunately, with the patch at http://review.whamcloud.com/6775 applied, I am still able to hit the bug.

      Thanks,
      Sebastien.

          People

            laisiyao Lai Siyao
            sebastien.buisson Sebastien Buisson (Inactive)