  Lustre / LU-3067

ASSERTION(!(aa->aa_oa->o_valid & OBD_MD_FLHANDLE))

Details

    • Bug
    • Resolution: Won't Fix
    • Major
    • None
    • Lustre 1.8.9
    • 3
    • 7466

    Description

      One of our OSSs had problems writing to disk (due to a RAID card problem).

      Several clients hit an LBUG and have not recovered since the OSS was rebooted.
      The error is:

      Mar 29 06:20:10 cn492 kernel: LustreError: 3004:0:(osc_request.c:2357:brw_interpret()) ASSERTION(!(aa->aa_oa->o_valid & OBD_MD_FLHANDLE)) failed
      Mar 29 06:20:10 cn492 kernel: LustreError: 3004:0:(osc_request.c:2357:brw_interpret()) LBUG
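
When many clients report the same failure, it can help to pull the source location out of each message mechanically. The following is a minimal sketch that assumes the `pid:number:(file:line:function())` layout seen in the two lines above. The pattern is derived only from these samples, and `parse_lustre_message` is a hypothetical helper name, not part of any Lustre tool.

```python
import re

# Pattern inferred from the sample messages in this ticket (an assumption,
# not an official format definition). The second number after the pid is
# captured as-is; its meaning is not interpreted here.
LUSTRE_MSG = re.compile(
    r"(?P<level>LustreError|Lustre): "
    r"(?P<pid>\d+):(?P<second>\d+):"
    r"\((?P<file>[\w.]+):(?P<line>\d+):(?P<func>\w+)\(\)\) "
    r"(?P<text>.*)"
)


def parse_lustre_message(line):
    """Return the message fields as a dict, or None if the line does not match."""
    match = LUSTRE_MSG.search(line)
    return match.groupdict() if match else None


sample = (
    "Mar 29 06:20:10 cn492 kernel: LustreError: "
    "3004:0:(osc_request.c:2357:brw_interpret()) "
    "ASSERTION(!(aa->aa_oa->o_valid & OBD_MD_FLHANDLE)) failed"
)
info = parse_lustre_message(sample)
print(info["file"], info["line"], info["func"])  # osc_request.c 2357 brw_interpret
```

Lines without this prefix, such as the call-trace entries, return `None` and can be skipped when filtering `/var/log/messages`.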

      I have attached the associated log file; some lines of context from /var/log/messages are reproduced below:

      Mar 29 05:57:03 cn492 kernel: Lustre: lustre_0-OST0027-osc-ffff81021c041800: Connection restored to service lustre_0-OST0027 using nid 10.1.4.120@tcp.
      Mar 29 05:57:03 cn492 kernel: Lustre: Skipped 1 previous similar message
      Mar 29 06:09:39 cn492 kernel: Lustre: 3004:0:(client.c:1529:ptlrpc_expire_one_request()) @@@ Request x1430341259304767 sent from lustre_0-OST0027-osc-ffff81021c041800 to NID 10.1.4.120@tcp 756s ago has timed out (756s prior to deadline).
      Mar 29 06:09:39 cn492 kernel: req@ffff8101145e6800 x1430341259304767/t0 o3->lustre_0-OST0027_UUID@10.1.4.120@tcp:6/4 lens 448/592 e 1 to 1 dl 1364537379 ref 2 fl Rpc:/2/0 rc 0/0
      Mar 29 06:09:39 cn492 kernel: Lustre: 3004:0:(client.c:1529:ptlrpc_expire_one_request()) Skipped 1 previous similar message
      Mar 29 06:09:39 cn492 kernel: Lustre: lustre_0-OST0027-osc-ffff81021c041800: Connection to service lustre_0-OST0027 via nid 10.1.4.120@tcp was lost; in progress operations using this service will wait for recovery to complete.
      Mar 29 06:09:39 cn492 kernel: Lustre: Skipped 1 previous similar message
      Mar 29 06:09:39 cn492 kernel: Lustre: lustre_0-OST0027-osc-ffff81021c041800: Connection restored to service lustre_0-OST0027 using nid 10.1.4.120@tcp.
      Mar 29 06:09:39 cn492 kernel: Lustre: Skipped 1 previous similar message
      Mar 29 06:20:10 cn492 kernel: LustreError: 3004:0:(osc_request.c:2357:brw_interpret()) ASSERTION(!(aa->aa_oa->o_valid & OBD_MD_FLHANDLE)) failed
      Mar 29 06:20:10 cn492 kernel: LustreError: 3004:0:(osc_request.c:2357:brw_interpret()) LBUG
      Mar 29 06:20:10 cn492 kernel: Pid: 3004, comm: ptlrpcd
      Mar 29 06:20:10 cn492 kernel:
      Mar 29 06:20:10 cn492 kernel: Call Trace:
      Mar 29 06:20:10 cn492 kernel: [<ffffffff885786a1>] libcfs_debug_dumpstack+0x51/0x60 [libcfs]
      Mar 29 06:20:10 cn492 kernel: [<ffffffff88578bda>] lbug_with_loc+0x7a/0xd0 [libcfs]
      Mar 29 06:20:10 cn492 kernel: [<ffffffff88580fc0>] tracefile_init+0x0/0x110 [libcfs]
      Mar 29 06:20:10 cn492 kernel: [<ffffffff8879c7e8>] brw_interpret+0x8e8/0xdb0 [osc]
      Mar 29 06:20:10 cn492 kernel: [<ffffffff886d36ac>] after_reply+0xcac/0xe30 [ptlrpc]
      Mar 29 06:20:10 cn492 kernel: [<ffffffff886d4b0b>] ptlrpc_check_set+0x12db/0x15a0 [ptlrpc]
      Mar 29 06:20:10 cn492 kernel: [<ffffffff8004b396>] try_to_del_timer_sync+0x7f/0x88
      Mar 29 06:20:10 cn492 kernel: [<ffffffff887095ad>] ptlrpcd_check+0xdd/0x1f0 [ptlrpc]
      Mar 29 06:20:10 cn492 kernel: [<ffffffff8009a98c>] process_timeout+0x0/0x5
      Mar 29 06:20:10 cn492 kernel: [<ffffffff88709ef1>] ptlrpcd+0x1b1/0x259 [ptlrpc]
      Mar 29 06:20:10 cn492 kernel: [<ffffffff8008f3ad>] default_wake_function+0x0/0xe
      Mar 29 06:20:10 cn492 kernel: [<ffffffff8005dfc1>] child_rip+0xa/0x11
      Mar 29 06:20:10 cn492 kernel: [<ffffffff88709d40>] ptlrpcd+0x0/0x259 [ptlrpc]
      Mar 29 06:20:10 cn492 kernel: [<ffffffff8005dfb7>] child_rip+0x0/0x11
      Mar 29 06:20:10 cn492 kernel:
      Mar 29 06:20:10 cn492 kernel: LustreError: dumping log to /tmp/lustre-log.1364538010.3004
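
The syslog timestamps above carry no year, but two numbers in the log look like Unix epoch seconds: the suffix of the dump file name and the `dl` (deadline) field of the timed-out request. The sketch below decodes both, assuming they are epoch seconds and that the node logs in UTC; `epoch_to_utc` is a hypothetical helper name.

```python
from datetime import datetime, timezone


def epoch_to_utc(seconds):
    """Format Unix epoch seconds as a UTC timestamp string."""
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


# Dump file name from the last log line; the layout
# "<prefix>.<epoch seconds>.<pid>" is an assumption based on this sample.
dump_path = "/tmp/lustre-log.1364538010.3004"
_, dump_epoch, dump_pid = dump_path.rsplit(".", 2)

# "dl" value copied from the req@ line of the timed-out request.
request_deadline = 1364537379

print(epoch_to_utc(int(dump_epoch)))    # 2013-03-29 06:20:10
print(epoch_to_utc(request_deadline))   # 2013-03-29 06:09:39
print(dump_pid)                         # 3004
```

Under those assumptions, the dump time matches the LBUG message at 06:20:10 and the deadline matches the timeout message at 06:09:39, which places the incident on 29 March 2013. The pid suffix matches the `ptlrpcd` thread named in the call trace.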

            People

              Assignee: Hongchao Zhang (hongchao.zhang)
              Reporter: Christopher J. Walker (walker) (Inactive)