Lustre / LU-77

cl_page.c::cl_page_own0() assertion in echoclient

Details

    • Bug
    • Resolution: Fixed
    • Blocker
    • Lustre 2.1.0
    • Lustre 2.1.0
    • None
    • 3
    • 24,361
    • 5052

    Description

      Oracle reports this assertion failure when running obdfilter-survey:
      obdfilter-survey test 2a hung and hit the following LBUG on one of the client nodes:

      Lustre: DEBUG MARKER: == obdfilter-survey test 2a: Stripe F/S over the Network ============================================= 08:40:49 (1292686849)
      Lustre: 8086:0:(sec.c:1474:sptlrpc_import_sec_adapt()) import lustre-OST0000_osc->host_0_UUID netid 50000: select flavor null
      Lustre: 8086:0:(sec.c:1474:sptlrpc_import_sec_adapt()) Skipped 5 previous similar messages
      LustreError: 8309:0:(osc_request.c:773:osc_announce_cached()) dirty 11296 - 11297 > system dirty_max 589824
      LustreError: 8290:0:(osc_request.c:773:osc_announce_cached()) dirty 12006 - 12007 > system dirty_max 589824
      LustreError: 8302:0:(osc_request.c:773:osc_announce_cached()) dirty 12051 - 12052 > system dirty_max 589824
      LustreError: 8306:0:(osc_request.c:773:osc_announce_cached()) dirty 5853 - 5854 > system dirty_max 589824
      LustreError: 8400:0:(osc_request.c:773:osc_announce_cached()) dirty 10889 - 10890 > system dirty_max 589824
      LustreError: 8388:0:(osc_request.c:773:osc_announce_cached()) dirty 9779 - 9780 > system dirty_max 589824
      LustreError: 8387:0:(osc_request.c:773:osc_announce_cached()) dirty 4950 - 4951 > system dirty_max 589824
      LustreError: 8387:0:(osc_request.c:773:osc_announce_cached()) Skipped 1 previous similar message
      LustreError: 8517:0:(osc_request.c:773:osc_announce_cached()) dirty 10796 - 10797 > system dirty_max 589824
      LustreError: 8517:0:(osc_request.c:773:osc_announce_cached()) Skipped 1 previous similar message
      LustreError: 8756:0:(cl_page.c:986:cl_page_own0()) page@ffff81010a07ccc0[2 ffff810063a3ecd0:0 ^0000000000000000_0000000000000000 1 0 2 ffff81010b699610 0000000000000000 0x0]
      LustreError: 8756:0:(cl_page.c:986:cl_page_own0()) echo_client-page@ffff81010a24bf78 vm@ffff810101e03cc8
      LustreError: 8756:0:(cl_page.c:986:cl_page_own0()) osc-page@ffff810109a339b8: 1< 0x845fed 258 0 - - - > 2< 0 0 0x0 0x308 | 0000000000000000 ffff8100614308e8 ffff810066e8f600 ffffffff889451c0 ffff810109a339b8 > 3< - ffff81011ff8e040 0 0 1 > 4< 0 7 8 39845888 - | + - + - > 5< + - + - | 0 - - 512 + +>
      LustreError: 8756:0:(cl_page.c:986:cl_page_own0()) end page@ffff81010a07ccc0
      LustreError: 8756:0:(cl_page.c:986:cl_page_own0()) pg->cp_owner == NULL
      LustreError: 8756:0:(cl_page.c:986:cl_page_own0()) ASSERTION(0) failed
      LustreError: 8756:0:(cl_page.c:986:cl_page_own0()) LBUG
      Pid: 8756, comm: lctl

      Call Trace:
      [<ffffffff885b85f1>] libcfs_debug_dumpstack+0x51/0x60 [libcfs]
      [<ffffffff885b8b2a>] lbug_with_loc+0x7a/0xd0 [libcfs]
      [<ffffffff885c3960>] cfs_tracefile_init+0x0/0x10a [libcfs]
      [<ffffffff886ab720>] cl_page_own0+0x1a0/0x2f0 [obdclass]
      [<ffffffff88ac7801>] echo_client_brw_ioctl+0x1531/0x1cd0 [obdecho]
      [<ffffffff8000d47a>] dput+0x2c/0x114
      [<ffffffff88066381>] nfs_lookup_revalidate+0x2be/0x443 [nfs]
      [<ffffffff88acaf50>] echo_client_iocontrol+0x1360/0x1b00 [obdecho]
      [<ffffffff800cc354>] zone_statistics+0x3e/0x6d
      [<ffffffff800d1707>] __vmalloc_area_node+0x12e/0x156
      [<ffffffff88654e17>] obd_ioctl_getdata+0x5b7/0xeb0 [obdclass]
      [<ffffffff8002c9bc>] mntput_no_expire+0x19/0x89
      [<ffffffff8866965c>] class_handle_ioctl+0x1dcc/0x2160 [obdclass]
      [<ffffffff8000cd72>] do_path_lookup+0x275/0x2f1
      [<ffffffff8000d9e4>] permission+0x8d/0xc8
      [<ffffffff801aaaeb>] misc_open+0x16c/0x260
      [<ffffffff8865457a>] obd_class_ioctl+0x19a/0x230 [obdclass]
      [<ffffffff80064c7d>] lock_kernel+0x1b/0x32
      [<ffffffff8004217f>] do_ioctl+0x55/0x6b
      [<ffffffff800301de>] vfs_ioctl+0x457/0x4b9
      [<ffffffff800b76a3>] audit_syscall_entry+0x180/0x1b3
      [<ffffffff8004c607>] sys_ioctl+0x59/0x78
      [<ffffffff8005d28d>] tracesys+0xd5/0xe0

      Kernel panic - not syncing: LBUG

      Eric Mei comments that obdecho threads apparently share pages that they are not supposed to share.
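
The failed check in the log above is `pg->cp_owner == NULL`: a page was being claimed while it already had an owner. As a rough illustration of that invariant only, here is a minimal sketch in C. It is not the Lustre code; `demo_io`, `demo_page`, `demo_page_init`, `demo_page_own` and `demo_page_disown` are hypothetical names invented for this example, the `owner` field merely plays the role of `cp_owner`, and the sketch returns an error where the real code asserts.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for an I/O context; not a Lustre structure. */
struct demo_io {
    int id;
};

/* Hypothetical stand-in for a page; "owner" plays the role of cp_owner. */
struct demo_page {
    _Atomic(struct demo_io *) owner; /* NULL while nobody owns the page */
};

/* Put the page into the unowned state. */
static void demo_page_init(struct demo_page *pg)
{
    atomic_init(&pg->owner, NULL);
}

/*
 * Try to take ownership of the page for "io".
 * Returns 0 on success, or -EBUSY if some I/O already owns the page.
 * The compare-and-swap makes the "is it unowned?" check and the
 * assignment a single atomic step, so two threads cannot both win.
 */
static int demo_page_own(struct demo_page *pg, struct demo_io *io)
{
    struct demo_io *expected = NULL;

    if (!atomic_compare_exchange_strong(&pg->owner, &expected, io))
        return -EBUSY;
    return 0;
}

/*
 * Release ownership. Only the current owner may do so.
 * Returns 0 on success, or -EPERM if "io" is not the owner.
 */
static int demo_page_disown(struct demo_page *pg, struct demo_io *io)
{
    struct demo_io *expected = io;

    if (!atomic_compare_exchange_strong(&pg->owner, &expected, NULL))
        return -EPERM;
    return 0;
}
```

In this model, a second thread that calls `demo_page_own()` on a page another thread still owns gets `-EBUSY` back. That is the situation the assertion in the log reports as fatal, and it is what one would expect to see if two obdecho threads operate on the same page.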

          People

            jay Jinshan Xiong (Inactive)
            green Oleg Drokin