Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-12525

sanity-flr test 200 and others asertion in osc_page_delete

    XMLWordPrintable

Details

    • Bug
    • Resolution: Cannot Reproduce
    • Critical
    • None
    • Lustre 2.13.0
    • None
    • 3
    • 9223372036854775807

    Description

      There's a flurry of failures. In maloo it mostly fails in sanity-flr test 200, but I also saw it in racer at least once with a partially similar backtrace.

      Lustre: DEBUG MARKER: mkdir -p /mnt/lustre2
      Lustre: DEBUG MARKER: mount -t lustre -o user_xattr,flock trevis-49vm6@tcp:/lustre /mnt/lustre2
      Lustre: Mounted lustre-client
      Lustre: DEBUG MARKER: mkdir -p /mnt/lustre3
      Lustre: DEBUG MARKER: mount -t lustre -o user_xattr,flock trevis-49vm6@tcp:/lustre /mnt/lustre3
      LustreError: 16418:0:(osc_cache.c:2553:osc_teardown_async_page()) extent ffff89076293af20@{[768 -> 1023/1023], [2|0|-|cache|wi|ffff890756aca8c0], [1703936|123|+|-|ffff890762821d40|256| (null)]} trunc at 848.
      LustreError: 16418:0:(osc_cache.c:2553:osc_teardown_async_page()) ### extent: ffff89076293af20 ns: lustre-OST0001-osc-ffff890767ec5800 lock: ffff890762821d40/0x10751cafee0649b3 lrc: 2/0,0 mode: PW/PW res: [0x175d:0x0:0x0].0x0 rrc: 2 type: EXT [0->18446744073709551615] (req 3473408->3977215) flags: 0x20000020000 nid: local remote: 0x70d1a9e0ebb537b3 expref: -99 pid: 16413 timeout: 0 lvb_type: 1
      LustreError: 16418:0:(osc_page.c:192:osc_page_delete()) page@ffff89075dd84a00[2 ffff890736741ab0 4 1 (null)]
      LustreError: 16418:0:(osc_page.c:192:osc_page_delete()) vvp-page@ffff89075dd84a50(0:0) vm@ffffe55201ca5c80 1fffff0000083d 3:0 ffff89075dd84a00 1872 lru
      LustreError: 16418:0:(osc_page.c:192:osc_page_delete()) lov-page@ffff89075dd84a90, comp index: 70001, gen: 8
      LustreError: 16418:0:(osc_page.c:192:osc_page_delete()) osc-page@ffff89075dd84ac8 848: 1< 0x845fed 258 0 + - > 2< 3473408 0 4096 0x0 0x520 | (null) ffff89076b15c6c0 ffff890756aca8c0 > 3< 0 0 0 > 4< 0 0 8 120979456 - | - - + - > 5< - - + - | 0 - | 123 - ->
      LustreError: 16418:0:(osc_page.c:192:osc_page_delete()) end page@ffff89075dd84a00
      LustreError: 16418:0:(osc_page.c:192:osc_page_delete()) Trying to teardown failed: -16
      LustreError: 16418:0:(osc_page.c:193:osc_page_delete()) ASSERTION( 0 ) failed:
      LustreError: 16418:0:(osc_page.c:193:osc_page_delete()) LBUG
      Pid: 16418, comm: cat 3.10.0-957.12.2.el7.x86_64 #1 SMP Tue May 14 21:24:32 UTC 2019
      Call Trace:
      [<ffffffffc05b78bc>] libcfs_call_trace+0x8c/0xc0 [libcfs]
      [<ffffffffc05b796c>] lbug_with_loc+0x4c/0xa0 [libcfs]
      [<ffffffffc0b8086f>] osc_page_delete+0x48f/0x500 [osc]
      [<ffffffffc073cdf0>] cl_page_delete0+0x80/0x220 [obdclass]
      [<ffffffffc073cfc3>] cl_page_delete+0x33/0x110 [obdclass]
      [<ffffffffc0c8c98f>] ll_invalidatepage+0x7f/0x170 [lustre]
      [<ffffffff8ffc6c8d>] do_invalidatepage_range+0x7d/0x90
      [<ffffffff8ffc6d37>] truncate_inode_page+0x77/0x80
      [<ffffffff8ffc6f6a>] truncate_inode_pages_range+0x1ea/0x700
      [<ffffffff8ffc7495>] truncate_inode_pages+0x15/0x20
      [<ffffffffc0ca06db>] vvp_prune+0x7b/0x2b0 [lustre]
      [<ffffffffc073a638>] cl_object_prune+0x58/0x110 [obdclass]
      [<ffffffffc0beecc9>] lov_conf_set+0x509/0xac0 [lov]
      [<ffffffffc073a753>] cl_conf_set+0x63/0x120 [obdclass]
      [<ffffffffc0c598f1>] ll_layout_conf+0x81/0x3f0 [lustre]
      [<ffffffffc0c5a1ad>] ll_layout_refresh+0x54d/0x900 [lustre]
      [<ffffffffc0c9fc97>] vvp_io_init+0x347/0x460 [lustre]
      [<ffffffffc07426ab>] cl_io_init0.isra.15+0x8b/0x160 [obdclass]
      [<ffffffffc0742843>] cl_io_init+0x43/0x80 [obdclass]
      [<ffffffffc07428f7>] cl_io_rw_init+0x77/0x210 [obdclass]
      [<ffffffffc0c502be>] ll_file_io_generic+0x17e/0xaf0 [lustre]
      [<ffffffffc0c51a52>] ll_file_aio_read+0x3a2/0x470 [lustre]
      [<ffffffffc0c51c20>] ll_file_read+0x100/0x1c0 [lustre]
      [<ffffffff9004167f>] vfs_read+0x9f/0x170
      [<ffffffff9004253f>] SyS_read+0x7f/0xf0
      [<ffffffff90575ddb>] system_call_fastpath+0x22/0x27
      [<ffffffffffffffff>] 0xffffffffffffffff
      

      example maloo report: https://testing.whamcloud.com/test_sets/4bc4f606-a1e3-11e9-8dbe-52540065bddc

      racer:

      LustreError: 12527:0:(osc_cache.c:2553:osc_teardown_async_page()) extent ffff8800a297e230@{[672 -> 1023/1023], [2|0|-|cache|wi|ffff8800a0bcee60], [1466368|352|+|-|ffff8800b81c0d80|1024| (null)]} trunc at 672.
      LustreError: 12527:0:(osc_cache.c:2553:osc_teardown_async_page()) ### extent: ffff8800a297e230 ns: lustre-OST0001-osc-ffff88010a173800 lock: ffff8800b81c0d80/0xfcc9cb8b08e86ca9 lrc: 2/0,0 mode: PW/PW res: [0x1e9:0x0:0x0].0x0 rrc: 2 type: EXT [0->18446744073709551615] (req 2752512->18446744073709551615) flags: 0x20000000000 nid: local remote: 0x4b2ddd8a9f2ac3e expref: -99 pid: 12139 timeout: 0 lvb_type: 1
      LustreError: 12527:0:(osc_page.c:192:osc_page_delete()) page@ffff8800a3ce2e00[2 ffff8800a231ac58 4 1 (null)]
      LustreError: 12527:0:(osc_page.c:192:osc_page_delete()) vvp-page@ffff8800a3ce2e50(0:0) vm@ffffea00028cfb48 1fffff0000083d 3:0 ffff8800a3ce2e00 1696 lru
      LustreError: 12527:0:(osc_page.c:192:osc_page_delete()) lov-page@ffff8800a3ce2e90, comp index: 10001, gen: 5
      LustreError: 12527:0:(osc_page.c:192:osc_page_delete()) osc-page@ffff8800a3ce2ec8 672: 1< 0x845fed 258 0 + - > 2< 2752512 0 4096 0x0 0x520 | (null) ffff8800c8118798 ffff8800a0bcee60 > 3< 0 0 0 > 4< 0 0 8 424652800 - | - - + - > 5< - - + - | 0 - | 352 - ->
      LustreError: 12527:0:(osc_page.c:192:osc_page_delete()) end page@ffff8800a3ce2e00
      LustreError: 12527:0:(osc_page.c:192:osc_page_delete()) Trying to teardown failed: -16
      LustreError: 12527:0:(osc_page.c:193:osc_page_delete()) ASSERTION( 0 ) failed:
      LustreError: 12527:0:(osc_page.c:193:osc_page_delete()) LBUG
      Pid: 12527, comm: touch 3.10.0-7.6-debug #5 SMP Tue Jun 11 21:26:20 EDT 2019
      Call Trace:
      [<ffffffffa017e8cc>] libcfs_call_trace+0x8c/0xc0 [libcfs]
      [<ffffffffa017e97c>] lbug_with_loc+0x4c/0xa0 [libcfs]
      [<ffffffffa08133d9>] osc_page_delete+0x489/0x4f0 [osc]
      [<ffffffffa032dcad>] cl_page_delete0+0x7d/0x210 [obdclass]
      [<ffffffffa032de73>] cl_page_delete+0x33/0x110 [obdclass]
      [<ffffffffa0d6551f>] ll_invalidatepage+0x7f/0x170 [lustre]
      [<ffffffff811bc961>] do_invalidatepage_range+0x71/0x80
      [<ffffffff811bca07>] truncate_inode_page+0x77/0x80
      [<ffffffff811bcc42>] truncate_inode_pages_range+0x1f2/0x770
      [<ffffffff811bd1d5>] truncate_inode_pages+0x15/0x20
      [<ffffffffa0d78abb>] vvp_prune+0x7b/0x2b0 [lustre]
      [<ffffffffa032b595>] cl_object_prune+0x55/0x110 [obdclass]
      [<ffffffffa08d5121>] lov_conf_set+0x501/0xa90 [lov]
      [<ffffffffa032b6b0>] cl_conf_set+0x60/0x120 [obdclass]
      [<ffffffffa0d6f0cb>] cl_file_inode_init+0x12b/0x390 [lustre]
      [<ffffffffa0d44f35>] ll_update_inode+0x2e5/0x610 [lustre]
      [<ffffffffa0d56363>] ll_iget+0x253/0x350 [lustre]
      [<ffffffffa0d4955d>] ll_prep_inode+0x20d/0x9a0 [lustre]
      [<ffffffffa0d5849b>] ll_lookup_it+0xc0b/0x1e90 [lustre]
      [<ffffffffa0d5ad25>] ll_lookup_nd+0x85/0x190 [lustre]
      [<ffffffff8124147d>] lookup_real+0x1d/0x50
      [<ffffffff81241f62>] __lookup_hash+0x42/0x60
      [<ffffffff817b31f4>] lookup_slow+0x42/0xa7
      [<ffffffff8124677f>] link_path_walk+0x81f/0x8c0
      [<ffffffff812478fe>] path_openat+0xae/0x650
      [<ffffffff812492cd>] do_filp_open+0x4d/0xb0
      [<ffffffff81235167>] do_sys_open+0x137/0x240
      [<ffffffff8123528e>] SyS_open+0x1e/0x20
      [<ffffffff817cae15>] system_call_fastpath+0x1c/0x21
      [<ffffffffffffffff>] 0xffffffffffffffff
      

      sample report: http://testing.linuxhacker.ru:3333/lustre-reports/1051/testresults/racer-ldiskfs-DNE-centos7_x86_64-centos7_x86_64 and in maloo: https://testing.whamcloud.com/test_sets/58251f94-a11a-11e9-8dbe-52540065bddc

      This seems to be a very recent crash scenario too appearing just two das ago.

      Coincidentally we just landed this patch n that timeframe: https://review.whamcloud.com/35111
      So I guess it's the prime suspect now.

      Attachments

        Issue Links

          Activity

            People

              wc-triage WC Triage
              green Oleg Drokin
              Votes:
              1 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: