Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-2557

osc_page_delete()) Trying to teardown failed: -16 (EBUSY)

Details

    • Bug
    • Resolution: Duplicate
    • Blocker
    • None
    • Lustre 2.3.0, Lustre 2.4.0
    • Lustre client 2.3 rc6, SLES11 2.6.32.59 or SLES11 3.0.38
      Lustre server 2.3 rc6, SLES11 2.6.32.59(direct-attached) or centos 2.6.32(external server)
    • 2
    • 5983

    Description

      The "teardown failure" is handled with an assertion failure.

      The stack trace from the assertion failure call:
      [<ffffffffa06a1e61>] osc_page_delete+0x2d1/0x2e0 [osc]
      [<ffffffffa02d33fd>] cl_page_delete0+0xcd/0x4f0 [obdclass]
      [<ffffffffa02d3862>] cl_page_delete+0x42/0x120 [obdclass]
      [<ffffffffa0826e9d>] ll_invalidatepage+0x8d/0x170 [lustre]
      [<ffffffffa081e264>] ll_page_mkwrite+0x7c4/0x840 [lustre]
      [<ffffffff810ef21b>] __do_fault+0xbb/0x4c0
      [<ffffffff810f2c1b>] handle_mm_fault+0x1db/0xe50
      [<ffffffff810289f7>] do_page_fault+0x147/0x2c0
      [<ffffffff812d90df>] page_fault+0x1f/0x30

      And extract from pid 617 Lustre traces:

      00000008:00020000:21.0:1353473247.507160:0:617:0:(osc_page.c:411:osc_page_delete()) page@ffff88080aa2f180[1 ffff8801ff1e1508:51 ^ffff88080aa2f240_(null) 4 0 1 (null) (null) 0x0]
      00000008:00020000:21.0:1353473247.520125:0:617:0:(osc_page.c:411:osc_page_delete()) vvp-page@ffff880809658be0(0:0:0) vm@ffffea001bfd54d8 e000000000000e3 8:0 0 51 lru
      00000008:00020000:21.0:1353473247.532044:0:617:0:(osc_page.c:411:osc_page_delete()) lov-page@ffff88080b1c4b58
      00000008:00020000:21.0:1353473247.539101:0:617:0:(osc_page.c:411:osc_page_delete()) osc-page@ffff8808176944b0: 1< 0x845fed 258 0 + - > 2< 208896 0 4096 0x0 0x520 | (null) ffff88041c1a6778 ffff8802072be888 > 3< + ffff8801fd445100 0 0 0 > 4< 0 0 8 31752192 - | - - + - > 5< - - + - | 0 - | 17 - ->
      00000008:00020000:21.0:1353473247.562323:0:617:0:(osc_page.c:411:osc_page_delete()) end page@ffff88080aa2f240
      00000008:00020000:21.0:1353473247.569375:0:617:0:(osc_page.c:411:osc_page_delete()) Trying to teardown failed: -16
      00000008:00040000:21.0:1353473247.576859:0:617:0:(osc_page.c:412:osc_page_delete()) ASSERTION( 0 ) failed:
      00000008:00040000:21.0:1353473247.583736:0:617:0:(osc_page.c:412:osc_page_delete()) LBUG

      Customer commentary:
      > The symptom is similar to LU-1442/LU-1680 but we have the patch and apparently the problem still exists.

      A dump is in:
      ftp.cray.com:/outbound/791555-osc_cache_writeback_range-assert.tar.bz2
      More are available on request.

      Reference: LU-1030 – apparent source of problem

      Attachments

        Issue Links

          Activity

            [LU-2557] osc_page_delete()) Trying to teardown failed: -16 (EBUSY)
            pjones Peter Jones added a comment -

            duplicate of LU-2720

            pjones Peter Jones added a comment - duplicate of LU-2720

            CF: LU-2720

            Vitaly Fertman added a comment - 31/Jan/13 7:53 AM

            http://review.whamcloud.com/5222

            bkorb Bruce Korb (Inactive) added a comment - CF: LU-2720 Vitaly Fertman added a comment - 31/Jan/13 7:53 AM http://review.whamcloud.com/5222

            Hi, is the previouse patch(removing CILR_PEEK) still applied with removing the write_one_page?

            hongchao.zhang Hongchao Zhang added a comment - Hi, is the previouse patch(removing CILR_PEEK) still applied with removing the write_one_page?

            Sorry – should have read code before commenting. I knew Linus strongly favored locking and unlocking in the same function, ...

            Anyway, according to Cray, removing the "write_one_page" call still hangs:

            Wally Wang updated LELUS-103:
            -----------------------------
            
                Attachment: lelus103-sp1-nowrite-one-page.tar.bz2
            
            After removing the write_one_page(), it still hangs. See attachment
            bkorb Bruce Korb (Inactive) added a comment - Sorry – should have read code before commenting. I knew Linus strongly favored locking and unlocking in the same function, ... Anyway, according to Cray, removing the "write_one_page" call still hangs: Wally Wang updated LELUS-103: ----------------------------- Attachment: lelus103-sp1-nowrite-one-page.tar.bz2 After removing the write_one_page(), it still hangs. See attachment
            hongchao.zhang Hongchao Zhang added a comment - - edited

            the page is locked before calling "write_one_lock", and it will unlock the page after it exits, the modified codes like

            longer snippet from ll_page_mkwrite0
                    if (result == 0 || result == -ENODATA) {
                            lock_page(vmpage);
                            if (vmpage->mapping == NULL) {
                                    unlock_page(vmpage);
                                    if (result == 0)
                                            result = -ENODATA;
                            } else if (result == -ENODATA) {
                                    if (vmpage->mapping != NULL) {
                                            ll_invalidate_page(vmpage);
                                            LASSERT(vmpage->mapping == NULL);
                                    }
                                    unlock_page(vmpage);
            hongchao.zhang Hongchao Zhang added a comment - - edited the page is locked before calling "write_one_lock", and it will unlock the page after it exits, the modified codes like longer snippet from ll_page_mkwrite0 if (result == 0 || result == -ENODATA) { lock_page(vmpage); if (vmpage->mapping == NULL) { unlock_page(vmpage); if (result == 0) result = -ENODATA; } else if (result == -ENODATA) { if (vmpage->mapping != NULL) { ll_invalidate_page(vmpage); LASSERT(vmpage->mapping == NULL); } unlock_page(vmpage);

            do not lock a page that is already locked

            bkorb Bruce Korb (Inactive) added a comment - do not lock a page that is already locked
            longer snippet from ll_page_mkwrite0
                    if (result == 0 || result == -ENODATA) {
                            lock_page(vmpage);
                            if (vmpage->mapping == NULL) {
                                    unlock_page(vmpage);
                                    if (result == 0)
                                            result = -ENODATA;
                            } else if (result == -ENODATA) {
                                    write_one_page(vmpage, 1);
                                    lock_page(vmpage);
                                    if (vmpage->mapping != NULL) {
                                            ll_invalidate_page(vmpage);
                                            LASSERT(vmpage->mapping == NULL);
                                    }
                                    unlock_page(vmpage);

            OK, I'm not understanding how the write_one_page is not within a "lock_page".
            In fact, it looks to me like the lock_page() call that follows it is wrong.
            I'm guessing that the thread is self-deadlocked.

            bkorb Bruce Korb (Inactive) added a comment - longer snippet from ll_page_mkwrite0 if (result == 0 || result == -ENODATA) { lock_page(vmpage); if (vmpage->mapping == NULL) { unlock_page(vmpage); if (result == 0) result = -ENODATA; } else if (result == -ENODATA) { write_one_page(vmpage, 1); lock_page(vmpage); if (vmpage->mapping != NULL) { ll_invalidate_page(vmpage); LASSERT(vmpage->mapping == NULL); } unlock_page(vmpage); OK, I'm not understanding how the write_one_page is not within a "lock_page". In fact, it looks to me like the lock_page() call that follows it is wrong. I'm guessing that the thread is self-deadlocked.
            bkorb Bruce Korb (Inactive) added a comment - - edited

            Meaning that the call to "write_one_page(vmpage, 1)" did nothing useful, so just eliminate it entirely? OK...

            BTW, "noformat" inside of curly braces before and after a code snippet makes the code much more readable.

            code snippet from ll_page_mkwrite0
            if (result == 0 || result == -ENODATA) { 
               ... 
               if (vmpage-&gt;mapping == NULL) {
                  ...
               } else if (result == -ENODATA) { 
                  write_one_page(vmpage, 1); 
                  lock_page(vmpage); 
                  if (vmpage-&gt;mapping != NULL) {
                     ll_invalidate_page(vmpage);
                     LASSERT(vmpage-&gt;mapping == NULL);
                  } 
                  unlock_page(vmpage); 
               } 
               ... 
            }
            bkorb Bruce Korb (Inactive) added a comment - - edited Meaning that the call to "write_one_page(vmpage, 1)" did nothing useful, so just eliminate it entirely? OK... BTW, "noformat" inside of curly braces before and after a code snippet makes the code much more readable. code snippet from ll_page_mkwrite0 if (result == 0 || result == -ENODATA) { ... if (vmpage-&gt;mapping == NULL) { ... } else if (result == -ENODATA) { write_one_page(vmpage, 1); lock_page(vmpage); if (vmpage-&gt;mapping != NULL) { ll_invalidate_page(vmpage); LASSERT(vmpage-&gt;mapping == NULL); } unlock_page(vmpage); } ... }

            this ticket could be related to the "write_one_page" call in ll_page_mkwrite0 if there is no lock to protect the write.
            ...
            if (result == 0 || result == -ENODATA) {
            ...
            if (vmpage->mapping == NULL)

            { ... }

            else if (result == -ENODATA) {
            write_one_page(vmpage, 1);
            lock_page(vmpage);
            if (vmpage->mapping != NULL)

            { ll_invalidate_page(vmpage); LASSERT(vmpage->mapping == NULL); }

            unlock_page(vmpage);
            }
            ...
            }
            ...

            if there is an extent covering this page is waiting to be flushed back to OST, "write_one_page" will put the page into the pending list.

            Hi Bruce, could you please try to test it without the "write_one_page" call? Thanks.

            hongchao.zhang Hongchao Zhang added a comment - this ticket could be related to the "write_one_page" call in ll_page_mkwrite0 if there is no lock to protect the write. ... if (result == 0 || result == -ENODATA) { ... if (vmpage->mapping == NULL) { ... } else if (result == -ENODATA) { write_one_page(vmpage, 1); lock_page(vmpage); if (vmpage->mapping != NULL) { ll_invalidate_page(vmpage); LASSERT(vmpage->mapping == NULL); } unlock_page(vmpage); } ... } ... if there is an extent covering this page is waiting to be flushed back to OST, "write_one_page" will put the page into the pending list. Hi Bruce, could you please try to test it without the "write_one_page" call? Thanks.

            The attached "clio disable CILR_PEEK optimization" patch does not seem to work. Vitaly's analysis of the issues falling out from LU-1030 led him to believe that the peek optimization was the source of the trouble. The result is a different symptom.

            The new symptom is a hang. Two threads have the following stack trace and the data for one of the stack traces appears to have been corrupted. Upon request, I'll add those details, but meanwhile any help at all in understanding the root cause of the LU-1030 patch issues would be MOST welcome.

            Thank you.

            Stack Trace of Hung Process
            PID: 18325  TASK: ffff8803fe1381c0  CPU: 1   COMMAND: "fsx-linux-aio"
             #0 [ffff8803c51dd538] schedule at ffffffff8141e405
             #1 [ffff8803c51dd610] schedule_timeout at ffffffff8141ee14
             #2 [ffff8803c51dd690] cfs_waitq_timedwait at ffffffffa01a66c1 [libcfs]
             #3 [ffff8803c51dd6a0] osc_extent_wait at ffffffffa0725694 [osc]
             #4 [ffff8803c51dd810] osc_cache_wait_range at ffffffffa0726167 [osc]
             #5 [ffff8803c51dd9a0] osc_io_fsync_end at ffffffffa0717a79 [osc]
             #6 [ffff8803c51dd9e0] cl_io_end at ffffffffa03048b2 [obdclass]
             #7 [ffff8803c51dda10] lov_io_end_wrapper at ffffffffa07c26f9 [lov]
             #8 [ffff8803c51dda30] lov_io_fsync_end at ffffffffa07c407b [lov]
             #9 [ffff8803c51dda60] cl_io_end at ffffffffa03048b2 [obdclass]
            #10 [ffff8803c51dda90] cl_io_loop at ffffffffa0308c4a [obdclass]
            #11 [ffff8803c51ddac0] cl_sync_file_range at ffffffffa08aacbb [lustre]
            #12 [ffff8803c51ddb30] ll_writepages at ffffffffa08d0164 [lustre]
            #13 [ffff8803c51ddb70] do_writepages at ffffffff810cd631
            #14 [ffff8803c51ddb80] __filemap_fdatawrite_range at ffffffff810c5b65
            #15 [ffff8803c51ddbe0] do_sync_mapping_range at ffffffff8112ea01
            #16 [ffff8803c51ddc10] __generic_file_aio_write at ffffffff810c6d6c
            #17 [ffff8803c51ddcc0] generic_file_aio_write at ffffffff810c6e42
            #18 [ffff8803c51ddd00] vvp_io_write_start at ffffffffa08fd4f6 [lustre]
            #19 [ffff8803c51ddd40] cl_io_start at ffffffffa0304a1a [obdclass]
            #20 [ffff8803c51ddd70] cl_io_loop at ffffffffa0308c3c [obdclass]
            #21 [ffff8803c51ddda0] ll_file_io_generic at ffffffffa08a9cc5 [lustre]
            #22 [ffff8803c51dde10] ll_file_aio_write at ffffffffa08aa01b [lustre]
            #23 [ffff8803c51dde60] aio_rw_vect_retry at ffffffff811426d4
            #24 [ffff8803c51dde90] aio_run_iocb at ffffffff81143ffa
            #25 [ffff8803c51ddec0] sys_io_submit at ffffffff81144e38
            #26 [ffff8803c51ddf80] system_call_fastpath at ffffffff8100305b
                RIP: 00007fa64d7d06f7  RSP: 00007fff6d99f318  RFLAGS: 00010206
                RAX: 00000000000000d1  RBX: ffffffff8100305b  RCX: 000000000000002d
                RDX: 00007fff6d99f2c0  RSI: 0000000000000001  RDI: 00007fa64dbe6000
                RBP: 0000000000006000   R8: 0000000000039000   R9: 0000000000000003
                R10: 262b0c2b4d2b962b  R11: 0000000000000202  R12: 00007fff6d99f770
                R13: 0000000000404658  R14: 000000000003f000  R15: 000000000003f000
                ORIG_RAX: 00000000000000d1  CS: 0033  SS: 002b
            bkorb Bruce Korb (Inactive) added a comment - The attached "clio disable CILR_PEEK optimization" patch does not seem to work. Vitaly's analysis of the issues falling out from LU-1030 led him to believe that the peek optimization was the source of the trouble. The result is a different symptom. The new symptom is a hang. Two threads have the following stack trace and the data for one of the stack traces appears to have been corrupted. Upon request, I'll add those details, but meanwhile any help at all in understanding the root cause of the LU-1030 patch issues would be MOST welcome. Thank you. Stack Trace of Hung Process PID: 18325 TASK: ffff8803fe1381c0 CPU: 1 COMMAND: "fsx-linux-aio" #0 [ffff8803c51dd538] schedule at ffffffff8141e405 #1 [ffff8803c51dd610] schedule_timeout at ffffffff8141ee14 #2 [ffff8803c51dd690] cfs_waitq_timedwait at ffffffffa01a66c1 [libcfs] #3 [ffff8803c51dd6a0] osc_extent_wait at ffffffffa0725694 [osc] #4 [ffff8803c51dd810] osc_cache_wait_range at ffffffffa0726167 [osc] #5 [ffff8803c51dd9a0] osc_io_fsync_end at ffffffffa0717a79 [osc] #6 [ffff8803c51dd9e0] cl_io_end at ffffffffa03048b2 [obdclass] #7 [ffff8803c51dda10] lov_io_end_wrapper at ffffffffa07c26f9 [lov] #8 [ffff8803c51dda30] lov_io_fsync_end at ffffffffa07c407b [lov] #9 [ffff8803c51dda60] cl_io_end at ffffffffa03048b2 [obdclass] #10 [ffff8803c51dda90] cl_io_loop at ffffffffa0308c4a [obdclass] #11 [ffff8803c51ddac0] cl_sync_file_range at ffffffffa08aacbb [lustre] #12 [ffff8803c51ddb30] ll_writepages at ffffffffa08d0164 [lustre] #13 [ffff8803c51ddb70] do_writepages at ffffffff810cd631 #14 [ffff8803c51ddb80] __filemap_fdatawrite_range at ffffffff810c5b65 #15 [ffff8803c51ddbe0] do_sync_mapping_range at ffffffff8112ea01 #16 [ffff8803c51ddc10] __generic_file_aio_write at ffffffff810c6d6c #17 [ffff8803c51ddcc0] generic_file_aio_write at ffffffff810c6e42 #18 [ffff8803c51ddd00] vvp_io_write_start at ffffffffa08fd4f6 [lustre] #19 [ffff8803c51ddd40] cl_io_start at ffffffffa0304a1a [obdclass] #20 [ffff8803c51ddd70] cl_io_loop at ffffffffa0308c3c [obdclass] #21 [ffff8803c51ddda0] ll_file_io_generic at ffffffffa08a9cc5 [lustre] #22 [ffff8803c51dde10] ll_file_aio_write at ffffffffa08aa01b [lustre] #23 [ffff8803c51dde60] aio_rw_vect_retry at ffffffff811426d4 #24 [ffff8803c51dde90] aio_run_iocb at ffffffff81143ffa #25 [ffff8803c51ddec0] sys_io_submit at ffffffff81144e38 #26 [ffff8803c51ddf80] system_call_fastpath at ffffffff8100305b RIP: 00007fa64d7d06f7 RSP: 00007fff6d99f318 RFLAGS: 00010206 RAX: 00000000000000d1 RBX: ffffffff8100305b RCX: 000000000000002d RDX: 00007fff6d99f2c0 RSI: 0000000000000001 RDI: 00007fa64dbe6000 RBP: 0000000000006000 R8: 0000000000039000 R9: 0000000000000003 R10: 262b0c2b4d2b962b R11: 0000000000000202 R12: 00007fff6d99f770 R13: 0000000000404658 R14: 000000000003f000 R15: 000000000003f000 ORIG_RAX: 00000000000000d1 CS: 0033 SS: 002b

            People

              hongchao.zhang Hongchao Zhang
              bkorb Bruce Korb (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: