
"ASSERTION( i == pv->ldp_count ) failed" is ll_direct_rw_pages

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Blocker
    • Fix Version/s: Lustre 2.16.0

    Description

      Something landed recently (directio related, I am sure) that causes regular though not very frequent failures in Maloo.

       [16534.089981] Lustre: DEBUG MARKER: == sanity test 119h: basic tests of memory unaligned dio ========================================================== 11:35:07 (1719315307)
      [16535.909936] LustreError: 443178:0:(rw26.c:426:ll_direct_rw_pages()) ASSERTION( i == pv->ldp_count ) failed: 
      [16535.911077] LustreError: 443178:0:(rw26.c:426:ll_direct_rw_pages()) LBUG
      [16535.911822] CPU: 0 PID: 443178 Comm: multiop Kdump: loaded Tainted: G           OE    --------  ---  5.14.0-284.30.1.el9_2.x86_64 #1
      [16535.912992] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
      [16535.913597] Call Trace:
      [16535.913905]  <TASK>
      [16535.914173]  dump_stack_lvl+0x34/0x48
      [16535.914628]  lbug_with_loc.cold+0x5/0x43 [libcfs]
      [16535.915184]  ll_direct_rw_pages+0x46e/0x470 [lustre]
      [16535.915886]  ll_direct_IO_impl+0x462/0xc90 [lustre]
      [16535.916432]  ? bpf_lsm_inode_removexattr+0x10/0x10
      [16535.916970]  generic_file_direct_write+0x9c/0x1e0
      [16535.917519]  __generic_file_write_iter+0x9e/0x1a0
      [16535.918029]  vvp_io_write_start+0xb25/0xcb0 [lustre]
      [16535.918595]  cl_io_start+0x61/0x130 [obdclass]
      [16535.919327]  cl_io_loop+0x99/0x220 [obdclass]
      [16535.919846]  ll_file_io_generic+0x53d/0xef0 [lustre]
      [16535.920415]  do_file_write_iter+0x4e3/0x770 [lustre]
      [16535.920991]  ? __alloc_pages+0xe6/0x230
      [16535.921430]  new_sync_write+0xff/0x190
      [16535.921875]  vfs_write+0x1ef/0x280
      [16535.922270]  ksys_write+0x5f/0xe0
      [16535.922657]  do_syscall_64+0x5c/0x90
      [16535.923119]  ? handle_mm_fault+0xc5/0x2a0
      [16535.923582]  ? do_user_addr_fault+0x1dd/0x6b0
      [16535.924073]  ? exc_page_fault+0x62/0x150
      [16535.924504]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
      [16535.925059] RIP: 0033:0x7f415893eb97
      [16535.925477] Code: 0b 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
      [16535.927263] RSP: 002b:00007ffc09d90ef8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
      [16535.928037] RAX: ffffffffffffffda RBX: 0000000000406a48 RCX: 00007f415893eb97
      [16535.928773] RDX: 0000000000000fff RSI: 00007f41585cb010 RDI: 0000000000000003
      [16535.929501] RBP: 0000000000000fff R08: 0000000000000003 R09: 0000000000000000
      [16535.930232] R10: 00007f41588070e0 R11: 0000000000000246 R12: 0000000000000fff
      [16535.930969] R13: 0000000000000003 R14: 00000000004044fc R15: 0000000000000122
      [16535.931695]  </TASK>
      [16535.932007] Kernel panic - not syncing: LBUG
      [16535.932480] CPU: 0 PID: 443178 Comm: multiop Kdump: loaded Tainted: G           OE    --------  ---  5.14.0-284.30.1.el9_2.x86_64 #1
      [16535.933665] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
      [16535.934264] Call Trace:
      [16535.934571]  <TASK>
      [16535.934840]  dump_stack_lvl+0x34/0x48
      [16535.935257]  panic+0xf4/0x2c6
      [16535.935637]  lbug_with_loc.cold+0x1a/0x43 [libcfs]
      [16535.936168]  ll_direct_rw_pages+0x46e/0x470 [lustre]
      [16535.936750]  ll_direct_IO_impl+0x462/0xc90 [lustre]
      [16535.937312]  ? bpf_lsm_inode_removexattr+0x10/0x10
      [16535.937842]  generic_file_direct_write+0x9c/0x1e0
      [16535.938395]  __generic_file_write_iter+0x9e/0x1a0
      [16535.938918]  vvp_io_write_start+0xb25/0xcb0 [lustre]
      [16535.939479]  cl_io_start+0x61/0x130 [obdclass]
      [16535.940016]  cl_io_loop+0x99/0x220 [obdclass]
      [16535.940542]  ll_file_io_generic+0x53d/0xef0 [lustre]
      [16535.941098]  do_file_write_iter+0x4e3/0x770 [lustre]
      [16535.941657]  ? __alloc_pages+0xe6/0x230
      [16535.942081]  new_sync_write+0xff/0x190
      [16535.942501]  vfs_write+0x1ef/0x280
      [16535.942892]  ksys_write+0x5f/0xe0
      [16535.943279]  do_syscall_64+0x5c/0x90
      [16535.943693]  ? handle_mm_fault+0xc5/0x2a0
      [16535.944132]  ? do_user_addr_fault+0x1dd/0x6b0
      [16535.944617]  ? exc_page_fault+0x62/0x150
      [16535.945050]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
      [16535.945589] RIP: 0033:0x7f415893eb97
      [16535.945992] Code: 0b 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
      [16535.947776] RSP: 002b:00007ffc09d90ef8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
      [16535.948557] RAX: ffffffffffffffda RBX: 0000000000406a48 RCX: 00007f415893eb97
      [16535.949289] RDX: 0000000000000fff RSI: 00007f41585cb010 RDI: 0000000000000003
      [16535.950032] RBP: 0000000000000fff R08: 0000000000000003 R09: 0000000000000000
      [16535.950765] R10: 00007f41588070e0 R11: 0000000000000246 R12: 0000000000000fff
      [16535.951491] R13: 0000000000000003 R14: 00000000004044fc R15: 0000000000000122
      [16535.952228]  </TASK>

      First time it hit in a seemingly unrelated review on June 11th: https://testing.whamcloud.com/test_sets/69350cd5-31d1-4654-bb37-1986fda71422

      and then shortly after hit in master proper on June 25th: https://testing.whamcloud.com/test_sets/00f3ebed-3676-4495-9fd6-308afa7df77b

      another failure from that session: https://testing.whamcloud.com/test_sets/79ab5267-752e-41f6-a27a-940ee3406679

       

      It always hits in sanity test 119h.

       

      Activity

            bobijam Zhenyu Xu added a comment -

            alright, ll_get_user_pages() called iov_iter_revert()

            ys Yang Sheng added a comment -

            iov_iter_get_pages_alloc() doesn't advance the iter, but alloc2 does. So we have already reverted it in our own define of alloc2.

            bobijam Zhenyu Xu added a comment -

            I found that master and es60 have different behavior in ll_get_user_pages(): in master, after iov_iter_get_pages_alloc2() it advances the iter, while in es60 it does not. And in master, iov_iter_get_pages_alloc2() advances the iter again, while in es60 iov_iter_get_pages_alloc2() is redefined as iov_iter_get_pages_alloc(), and in the kernel I don't see iov_iter_get_pages_alloc() advance the iter either. In both branches, ll_direct_IO_impl() advances the iter after ll_get_user_pages().

            Does the master branch advance the iter too much?

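            To make the advancing vs non-advancing semantics concrete, here is a minimal userspace model (hypothetical code, not the kernel's or Lustre's): the "alloc" variant leaves the iterator cursor alone, the "alloc2" variant moves it, and a caller that advances again after alloc2 ends up past the bytes it just pinned, which is exactly the over-advance suspected above.

            #include <stdio.h>
            #include <stddef.h>

            struct iter { size_t pos; size_t len; };

            /* non-advancing, modelled on iov_iter_get_pages_alloc() */
            static size_t get_pages_alloc(struct iter *it, size_t max)
            {
                    size_t n = it->len - it->pos;
                    return n < max ? n : max;
            }

            /* advancing, modelled on iov_iter_get_pages_alloc2() */
            static size_t get_pages_alloc2(struct iter *it, size_t max)
            {
                    size_t n = get_pages_alloc(it, max);
                    it->pos += n;   /* the *2 variant advances the iter */
                    return n;
            }

            int main(void)
            {
                    struct iter it = { .pos = 0, .len = 0xfff };
                    size_t n = get_pages_alloc2(&it, 0x1000);

                    it.pos += n;    /* caller advances again: the suspected double advance */
                    printf("pinned=%#zx cursor=%#zx (expected %#zx)\n",
                           n, it.pos, n);
                    return 0;
            }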

            hongchao.zhang Hongchao Zhang added a comment -

            Yes, for aligned IO there are two offsets (file_offset and the starting address in the iovec) used for calculating the page count.

            Could there be some issue in this aligned-IO check of the user buffers?

                    /* Check that all user buffers are aligned as well */
                    if (ll_iov_iter_alignment(iter) & ~PAGE_MASK)
                            unaligned = true;
            
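            For reference, the upstream iov_iter_alignment() ORs together the base address and the length of every segment, so any sub-page offset or sub-page length leaves bits set under ~PAGE_MASK. A minimal userspace sketch of that idea, assuming ll_iov_iter_alignment() behaves the same way; with the buffer from the crash (address ending in 0x010, length 0xfff) it does report the I/O as unaligned:

            #include <stdbool.h>
            #include <stdio.h>
            #include <sys/uio.h>

            #define PAGE_SIZE 4096UL
            #define PAGE_MASK (~(PAGE_SIZE - 1))

            /* hypothetical stand-in for the ll_iov_iter_alignment() check */
            static bool dio_is_unaligned(const struct iovec *iov, int nr_segs)
            {
                    unsigned long align = 0;

                    for (int i = 0; i < nr_segs; i++)
                            align |= (unsigned long)iov[i].iov_base |
                                     iov[i].iov_len;

                    return (align & ~PAGE_MASK) != 0;
            }

            int main(void)
            {
                    /* buffer seen in the crash: ends in ...010, len 0xfff */
                    struct iovec iov = {
                            .iov_base = (void *)0x7fef2f2b8010UL,
                            .iov_len  = 0xfff,
                    };

                    printf("unaligned = %d\n", dio_is_unaligned(&iov, 1));
                    /* prints: unaligned = 1 */
                    return 0;
            }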
            ys Yang Sheng added a comment -

            I think the key part is the user buffer address. All of the crash cases for test_119h have an extra 0x10 in the last byte of the user buffer address. That is what caused the crash, and it is why this bug cannot be reproduced stably.

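            Worked through with the values from the crash dumps, that extra 0x10 makes the pinned buffer span two pages even though the 0xfff-byte write itself fits in one (standalone arithmetic check, with DIV_ROUND_UP defined as in the kernel):

            #include <stdio.h>

            #define PAGE_SIZE 4096UL
            #define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

            int main(void)
            {
                    unsigned long start = 0x7fef2f2b8010UL % PAGE_SIZE; /* 0x10 */
                    unsigned long result = 0xfff;   /* bytes pinned */

                    /* buffer-offset-aware page count, as stored in ldp_count */
                    printf("ldp_count = %lu\n",
                           DIV_ROUND_UP(result + start, PAGE_SIZE)); /* 2 */

                    /* size-only page count at file offset 0 */
                    printf("size-only = %lu\n",
                           DIV_ROUND_UP(result, PAGE_SIZE));         /* 1 */
                    return 0;
            }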
            ys Yang Sheng added a comment - - edited

            So the problem is: why do we assert (i == ldp_count)? The kernel really returned 2 pages, while we only counted 1 page based on size.

              csd_dio_pages = {
                ldp_pages = 0xffff9b72203149a0,
                ldp_count = 0x2,
                ldp_file_offset = 0x0
              },
            ...
              ci_queue = {
                c2_qin = {
                  pl_nr = 0x1,
                  pl_pages = {
                    next = 0xffff9b71c2adada8,
                    prev = 0xffff9b71c2adada8
                  }
                },
            

            The code is:

            static ssize_t ll_get_user_pages(int rw, struct iov_iter *iter,
                                            struct ll_dio_pages *pvec,
                                            size_t maxsize)
            {
            #if defined(HAVE_DIO_ITER)
                    size_t start;
                    size_t result;
            
                    result = iov_iter_get_pages_alloc2(iter, &pvec->ldp_pages, maxsize,
                                                      &start);
                    if (result > 0) {
                   pvec->ldp_count = DIV_ROUND_UP(result + start, PAGE_SIZE);
                                                 ^^^^^-- (0xfff+16+4096-1)/4096=2
            ......
            static inline ssize_t iov_iter_get_pages_alloc2(struct iov_iter *i,
                                                               struct page ***pages,
                                                               size_t maxsize,
                                                               size_t *start)
            {
                    ssize_t result = 0;
            
                    /* iov_iter_get_pages_alloc is non advancing version of alloc2 */
                    result = iov_iter_get_pages_alloc(i, pages, maxsize, start);  
                                         ^^^^- start: 0x7fef2f2b8010 % 4096=0x10       
                    if (result > 0 && user_backed_iter(i))
                            iov_iter_advance(i, result);
            
                    return result;     <-------- result=0xfff, start=16
            }
            
            
            

            Looks like it is still the same case after moving to iov_iter_get_pages_alloc2().

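            To put the two counts side by side, a hypothetical reduction (not the actual rw26.c code) of the disagreement described above: ldp_count is derived from result + start, so it includes the sub-page offset of the user buffer, while a submission loop keyed on file_offset and size alone visits one page fewer when the file offset is page-aligned but the buffer is not:

            #include <stdio.h>

            #define PAGE_SIZE 4096UL
            #define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

            int main(void)
            {
                    unsigned long file_offset = 0; /* page-aligned file offset */
                    unsigned long start = 0x10;    /* sub-page buffer offset */
                    unsigned long size = 0xfff;    /* bytes pinned */

                    unsigned long ldp_count =
                            DIV_ROUND_UP(size + start, PAGE_SIZE); /* 2 */

                    /* walk the I/O in page-sized chunks from file_offset */
                    unsigned long i = 0, pos = file_offset;
                    while (pos < file_offset + size) {
                            unsigned long in_page = PAGE_SIZE - pos % PAGE_SIZE;
                            unsigned long left = file_offset + size - pos;

                            pos += in_page < left ? in_page : left;
                            i++;    /* one page per chunk */
                    }

                    /* i == 1 but ldp_count == 2, so
                     * ASSERTION( i == pv->ldp_count ) fires
                     */
                    printf("i = %lu, ldp_count = %lu\n", i, ldp_count);
                    return 0;
            }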
            lixi_wc Li Xi added a comment -

            Hongchao is investigating the issue.


            "Zhenyu Xu <bobijam@hotmail.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/55918
            Subject: LU-17993 tests: sanity/119h interop version checking
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 472e4444ac9747657db88c5c5dc0b9d56f3527cb


            adilger Andreas Dilger added a comment -

            There are also a number of non-interop x86_64 failures of this subtest, so it looks like there is still a real issue here.


            adilger Andreas Dilger added a comment -

            There are also quite a number of identical crashes from aarch64 clients when running patch https://review.whamcloud.com/55647 "LU-17525 test: aarch64/x86_74 interop". It might be that the support for the new "x86_74" processor isn't ready yet?


            adilger Andreas Dilger added a comment -

            There was an assertion failure in sanity test_119h with an x86 client running 2.15.64.91, which includes the patch from LU-17525 (landed in 2.15.64), against a 2.14 server:
            https://testing.whamcloud.com/test_sets/3572f3b3-9bc7-444e-bcbc-3e6e9dfbdb13

            Lustre: DEBUG MARKER: == sanity test 119h: basic tests of memory unaligned dio === 09:30:20 (1722159020)
            LustreError: 453619:0:(rw26.c:426:ll_direct_rw_pages()) ASSERTION( i == pv->ldp_count ) failed: 
            LustreError: 453619:0:(rw26.c:426:ll_direct_rw_pages()) LBUG
            CPU: 0 PID: 453619 Comm: multiop 5.14.0-362.24.1.el9_3.x86_64 #1
            Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
            Call Trace:
            lbug_with_loc.cold+0x5/0x58 [libcfs]
            ll_direct_rw_pages+0x46e/0x470 [lustre]
            ll_direct_IO_impl+0x462/0xc90 [lustre]
            generic_file_direct_write+0xc6/0x1f0
            __generic_file_write_iter+0x9e/0x1b0
            vvp_io_write_start+0xb24/0xcb0 [lustre]
            cl_io_start+0x61/0x130 [obdclass]
            cl_io_loop+0x99/0x220 [obdclass]
            ll_file_io_generic+0x53d/0xef0 [lustre]
            do_file_write_iter+0x4e3/0x770 [lustre]
            new_sync_write+0xff/0x190
            vfs_write+0x1ef/0x280
            

            People

              Assignee: hongchao.zhang Hongchao Zhang
              Reporter: green Oleg Drokin
              Votes: 0
              Watchers: 13