  1. Lustre
  2. LU-3274

osc_cache.c:1774:osc_dec_unstable_pages()) ASSERTION( atomic_read(&cli->cl_cache->ccc_unstable_nr) >= 0 ) failed

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.6.0
    • Affects Version/s: Lustre 2.4.0
    • Labels: None
    • Severity: 3
    • Rank: 8111

    Description

      Hit this running replay-single in a loop

      [123247.989106] Lustre: DEBUG MARKER: == replay-single test 87b: write replay with changed data (checksum resend) == 00:41:34 (1367728894)
      [123248.537768] Turning device loop1 (0x700001) read-only
      [123248.562834] Lustre: DEBUG MARKER: ost1 REPLAY BARRIER on lustre-OST0000
      [123248.580423] Lustre: DEBUG MARKER: local REPLAY BARRIER on lustre-OST0000
      [123250.719913] Lustre: DEBUG MARKER: cancel_lru_locks osc start
      [123250.880677] Lustre: DEBUG MARKER: cancel_lru_locks osc stop
      [123251.389028] Removing read-only on unknown block (0x700001)
      [123263.314546] LDISKFS-fs (loop1): recovery complete
      [123263.357806] LDISKFS-fs (loop1): mounted filesystem with ordered data mode. quota=on. Opts: 
      [123268.399242] LustreError: 168-f: BAD WRITE CHECKSUM: lustre-OST0000 from 12345-0@lo inode [0x2000061c0:0x5:0x0] object 0x0:7458 extent [0-1048575]: client csum 7945acf6, server csum b9e6e441
      [123268.477228] LustreError: 17404:0:(osc_cache.c:1774:osc_dec_unstable_pages()) ASSERTION( atomic_read(&cli->cl_cache->ccc_unstable_nr) >= 0 ) failed: 
      [123268.477757] LustreError: 17404:0:(osc_cache.c:1774:osc_dec_unstable_pages()) LBUG
      [123268.478168] Pid: 17404, comm: ptlrpcd_rcv
      [123268.478381] 
      [123268.478381] Call Trace:
      [123268.478752]  [<ffffffffa0e018a5>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
      [123268.479019]  [<ffffffffa0e01ea7>] lbug_with_loc+0x47/0xb0 [libcfs]
      [123268.479981]  [<ffffffffa0458bcc>] osc_dec_unstable_pages+0x12c/0x190 [osc]
      [123268.480270]  [<ffffffffa114d76b>] ptlrpc_free_committed+0x14b/0x620 [ptlrpc]
      [123268.480593]  [<ffffffffa114f4e3>] after_reply+0x7a3/0xd90 [ptlrpc]
      [123268.480871]  [<ffffffffa1154493>] ptlrpc_check_set+0x1093/0x1da0 [ptlrpc]
      [123268.481150]  [<ffffffffa1180e2b>] ptlrpcd_check+0x55b/0x590 [ptlrpc]
      [123268.481420]  [<ffffffffa1181373>] ptlrpcd+0x233/0x390 [ptlrpc]
      [123268.481665]  [<ffffffff8105ad10>] ? default_wake_function+0x0/0x20
      [123268.481963]  [<ffffffffa1181140>] ? ptlrpcd+0x0/0x390 [ptlrpc]
      [123268.482220]  [<ffffffff8100c10a>] child_rip+0xa/0x20
      [123268.482465]  [<ffffffffa1181140>] ? ptlrpcd+0x0/0x390 [ptlrpc]
      [123268.482733]  [<ffffffffa1181140>] ? ptlrpcd+0x0/0x390 [ptlrpc]
      [123268.482978]  [<ffffffff8100c100>] ? child_rip+0x0/0x20
      [123268.483209] 
      [123268.543751] Kernel panic - not syncing: LBUG
      

      This is somewhat current master plus 3 more LU-2139 patches applied on top.
      I have a crashdump in /exports/crashdumps/192.168.10.221-2013-05-05-00\:41\:57/
      code tag: master-20130505
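
The failed assertion checks that a shared atomic page counter never drops below zero. The following is a minimal user-space sketch of that invariant, not the Lustre code: `demo_cache`, `demo_inc_unstable`, `demo_dec_unstable` and `demo_double_release` are hypothetical names, and the 4 KiB page size is an assumption. It shows one generic way such a counter can go negative, namely releasing the same pages twice; the log above does not establish that this is what happened here.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for a per-client cache; NOT the Lustre structure. */
struct demo_cache {
    atomic_long unstable_nr;   /* pages written but not yet committed */
};

/* Account npages as unstable when a write is sent. */
static void demo_inc_unstable(struct demo_cache *cache, long npages)
{
    atomic_fetch_add(&cache->unstable_nr, npages);
}

/* Release npages and return the new counter value.
 * The invariant the assertion expresses is: result >= 0. */
static long demo_dec_unstable(struct demo_cache *cache, long npages)
{
    /* atomic_fetch_sub returns the value held before the subtraction */
    return atomic_fetch_sub(&cache->unstable_nr, npages) - npages;
}

/* Assumed scenario: one 1 MiB write (256 pages at an assumed 4 KiB page
 * size) is accounted once but released twice. Returns the final value. */
static long demo_double_release(void)
{
    struct demo_cache cache;
    long npages = 1048576 / 4096;

    atomic_init(&cache.unstable_nr, 0);
    demo_inc_unstable(&cache, npages);
    (void)demo_dec_unstable(&cache, npages);   /* balanced: back to 0 */
    return demo_dec_unstable(&cache, npages);  /* unbalanced: goes negative */
}
```

Calling `demo_double_release()` returns a negative value, which is the condition an assertion of the form `counter >= 0` would trip on. Any real diagnosis would need the crashdump to see which path decremented twice, or decremented without a matching increment.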

            People

              Assignee: Jinshan Xiong (jay) (Inactive)
              Reporter: Oleg Drokin (green)
              Votes: 0
              Watchers: 8
