Lustre / LU-9579

LBUG: (osc_page.c:433:osc_page_init()) ASSERTION( result == 0 )

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: Lustre 2.10.0
    • Labels: None
    • Severity: 3

      Description

      The Lustre client hits an LBUG in osc_page_init() when job processes are killed because their memory cgroup runs out of memory. The LBUG occurred on 14 nodes during a recent relrun.

      > 2017-05-22T16:41:07.320315-05:00 c0-0c1s6n0 LustreError: 15485:0:(osc_page.c:433:osc_page_init()) ASSERTION( result == 0 ) failed:
      > 2017-05-22T16:41:07.320393-05:00 c0-0c1s6n0 Killed process 15246 (namu.exe.6GB_pe) apid 471027 total-vm:8968944kB, anon-rss:5203772kB, file-rss:12kB, shmem-rss:1828kB
      > 2017-05-22T16:41:07.320398-05:00 c0-0c1s6n0 Memory cgroup out of memory: Killed 15 processes sharing cpu group with pid 15246.
      > 2017-05-22T16:41:07.320404-05:00 c0-0c1s6n0 LustreError: 15485:0:(osc_page.c:433:osc_page_init()) LBUG
      > 2017-05-22T16:41:07.320409-05:00 c0-0c1s6n0 Pid: 15485, comm: namu.exe.6GB_pe
      
      > PID: 15485  TASK: ffff8816cf566980  CPU: 55  COMMAND: "namu.exe.6GB_pe"
      >  #0 [ffff8816cf56b908] panic at ffffffff8114670e
      >  #1 [ffff8816cf56b980] lbug_with_loc at ffffffffa026aead [libcfs]
      >  #2 [ffff8816cf56b9a0] osc_page_init at ffffffffa09f9e12 [osc]
      >  #3 [ffff8816cf56b9e0] lov_page_init_raid0 at ffffffffa084199b [lov]
      >  #4 [ffff8816cf56ba38] lov_page_init at ffffffffa083a34c [lov]
      >  #5 [ffff8816cf56ba48] cl_page_alloc at ffffffffa0559bf2 [obdclass]
      >  #6 [ffff8816cf56ba88] cl_page_find at ffffffffa0559e1f [obdclass]
      >  #7 [ffff8816cf56bad8] ll_readpage at ffffffffa09031c9 [lustre]
      >  #8 [ffff8816cf56bbe8] filemap_fault at ffffffff8114b5db
      >  #9 [ffff8816cf56bc58] vvp_io_fault_start at ffffffffa093258e [lustre]
      > #10 [ffff8816cf56bcc8] cl_io_start at ffffffffa055cfae [obdclass]
      > #11 [ffff8816cf56bcf0] cl_io_loop at ffffffffa056036e [obdclass]
      > #12 [ffff8816cf56bd20] ll_fault at ffffffffa09137e4 [lustre]
      > #13 [ffff8816cf56bd98] __do_fault at ffffffff81175abe
      > #14 [ffff8816cf56be00] handle_mm_fault at ffffffff81179528
      > #15 [ffff8816cf56bee0] __do_page_fault at ffffffff81048de9
      > #16 [ffff8816cf56bf40] do_page_fault at ffffffff8104904c
      > #17 [ffff8816cf56bf50] page_fault at ffffffff81506a62
      >     RIP: 0000000000415702  RSP: 00002aab20a00480  RFLAGS: 00010202
      >     RAX: 0000000000000280  RBX: 000000000000104b  RCX: 0000000000005008
      >     RDX: 0000000000000f02  RSI: 000000000517f258  RDI: 0000000000000280
      >     RBP: 00002aab20a00670   R8: 0000000106ec9118   R9: 0000000000000781
      >     R10: 0000000000000280  R11: 0000000101d49ec0  R12: 000000000cdabab8
      >     R13: 0000000000000280  R14: 00000000000013c2  R15: 0000000007c2c860
      >     ORIG_RAX: ffffffffffffffff  CS: 0033  SS: 002b
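
      For triage across several nodes, it can help to reduce a backtrace like the one above to its function/module pairs. The following is a minimal sketch, not part of the original report: the names `FRAME_RE`, `parse_frames` and `SAMPLE` are hypothetical, the regular expression only assumes the frame layout visible in this dump (`#N [stack] function at address [module]`), and frames without a module tag are labelled "kernel" purely as a convention of this sketch.

      ```python
      import re
      from collections import Counter

      # Matches frames in the layout shown above, for example:
      #   "#2 [ffff8816cf56b9a0] osc_page_init at ffffffffa09f9e12 [osc]"
      # The trailing "[module]" is optional (core kernel frames have none).
      FRAME_RE = re.compile(
          r"#(?P<idx>\d+)\s+\[(?P<sp>[0-9a-f]+)\]\s+(?P<func>\S+)\s+at\s+"
          r"(?P<addr>[0-9a-f]+)(?:\s+\[(?P<module>[^\]]+)\])?"
      )


      def parse_frames(text):
          """Return (index, function, module) tuples for each frame line."""
          frames = []
          for line in text.splitlines():
              m = FRAME_RE.search(line)
              if m:
                  # "kernel" is this sketch's label for frames with no module.
                  frames.append((int(m["idx"]), m["func"], m["module"] or "kernel"))
          return frames


      # First four frames copied from the backtrace in this report.
      SAMPLE = """\
      > #0 [ffff8816cf56b908] panic at ffffffff8114670e
      > #1 [ffff8816cf56b980] lbug_with_loc at ffffffffa026aead [libcfs]
      > #2 [ffff8816cf56b9a0] osc_page_init at ffffffffa09f9e12 [osc]
      > #3 [ffff8816cf56b9e0] lov_page_init_raid0 at ffffffffa084199b [lov]
      """

      frames = parse_frames(SAMPLE)
      modules = Counter(mod for _, _, mod in frames)

      for idx, func, mod in frames:
          print(f"#{idx:<2} {func} ({mod})")
      ```

      Comparing the resulting tuples between nodes is a quick way to check whether all 14 crashes share the same path into osc_page_init(); register lines such as RIP/RSP are ignored because they do not match the frame pattern.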
      

            People

            • Assignee: WC Triage (wc-triage)
            • Reporter: Alexander Boyko (aboyko)