[LU-7721] lfsck: slab 'size-1048576' exhaust memory (oom-killer) Created: 28/Jan/16  Updated: 29/Jan/16  Resolved: 29/Jan/16

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.8.0

Type: Bug Priority: Blocker
Reporter: Frank Heckes (Inactive) Assignee: nasf (Inactive)
Resolution: Duplicate Votes: 0
Labels: soak
Environment:

lola
build: master branch, 2.7.65-38-g607f691 ; 607f6919ea67b101796630d4b55649a12ea0e859


Attachments: File console-lola-5.log.bz2     File lfsck-proc-list.bz2     File lola-5-oom-killer-2.tar.bz2     File lustre-log.1453949036.15397.bz2     File mds-lfsck-status-nslayout.log.bz2     File mds-lfsck-status-oi.log.bz2     File messages-lola-5.log.bz2     File oss-lfsck-status.log.bz2    
Issue Links:
Duplicate
duplicates LU-6923 writing process hung at txg_wait_open Open
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

The error happened during soak testing of build '20160126' (see https://wiki.hpdd.intel.com/display/Releases/Soak+Testing+on+Lola#SoakTestingonLola-20160126). DNE is enabled.
MDTs had been formatted with ldiskfs, OSTs with zfs.
No faults were injected during the soak test; only application load and the execution of lfsck were imposed on the test cluster.
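
A minimal sketch of how such a mixed-backend configuration is typically formatted (device names, pool/dataset names, the OST index, and the MGS NID below are placeholders, not taken from the lola setup):

# MDT backed by ldiskfs
mkfs.lustre --mdt --fsname=soaked --index=0 --mgsnode=<mgs-nid> \
    --backfstype=ldiskfs /dev/<mdt-device>

# OST backed by zfs (mkfs.lustre creates the pool/dataset from the vdev)
mkfs.lustre --ost --fsname=soaked --index=<n> --mgsnode=<mgs-nid> \
    --backfstype=zfs <pool>/<ost-dataset> /dev/<ost-vdev>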

Sequence of events:

  • Jan 27 05:44:56 - Started the lfsck command on the primary MDS (lola-8):
    lctl lfsck_start -M soaked-MDT0000 -s 1000 -t all -A 
    
  • Jan 27 05:49 - OSS node lola-5 hit LBUG (see LU-7720)
  • Jan 27 08:46 - Rebooted lola-5, remounted the OSTs, enabled lfsck debugging and increased the debug buffer to 512 MB (see the command sketch after this list);
    the number of blocked ost_* threads kept increasing.
    A huge number of debug logs were printed before the oom-killer started:
    Call Trace:
     [<ffffffff8106cc43>] ? dequeue_entity+0x113/0x2e0
     [<ffffffff8152bd26>] __mutex_lock_slowpath+0x96/0x210
     [<ffffffffa0fcbe7b>] ? ofd_seq_load+0xbb/0xa90 [ofd]
     [<ffffffff8152b84b>] mutex_lock+0x2b/0x50
     [<ffffffffa0fbff18>] ofd_create_hdl+0xc28/0x2640 [ofd]
     [<ffffffffa093a66b>] ? lustre_pack_reply_v2+0x1eb/0x280 [ptlrpc]
     [<ffffffffa093a7a6>] ? lustre_pack_reply_flags+0xa6/0x1e0 [ptlrpc]
     [<ffffffffa093a8f1>] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
     [<ffffffffa09a4f9c>] tgt_request_handle+0x8ec/0x1470 [ptlrpc]
     [<ffffffffa094c201>] ptlrpc_main+0xe41/0x1910 [ptlrpc]
     [<ffffffff8152a39e>] ? thread_return+0x4e/0x7d0
     [<ffffffffa094b3c0>] ? ptlrpc_main+0x0/0x1910 [ptlrpc]
     [<ffffffff8109e78e>] kthread+0x9e/0xc0
     [<ffffffff8100c28a>] child_rip+0xa/0x20
     [<ffffffff8109e6f0>] ? kthread+0x0/0xc0
     [<ffffffff8100c280>] ? child_rip+0x0/0x20
    
    LustreError: dumping log to /tmp/lustre-log.1453949036.15397
    Pid: 15443, comm: ll_ost00_065
    

    --> this debug log file (/tmp/lustre-log.1453949036.15397) is attached

  • Jan 27 18:45 - oom-killer started on OSS node lola-5; the node crashed 3 minutes later
  • Memory was exhausted by the slab cache 'size-1048576', which had grown to ~27 GB
    (see archive lola-5-oom-killer-2.tar.bz2 and the slab check in the sketch after this list)
  • Jan 28 03:59 - the lfsck run had still not finished (see mds-lfsck-status-nslayout.log.bz2, mds-lfsck-status-oi.log.bz2, oss-lfsck-status.log.bz2)
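
For reference, a minimal sketch of the commands behind the steps above. The lfsck_start line is the one quoted in the list; the debug settings, the status queries, and the debug-log conversion are assumptions about how this was done (the parameter names are standard for this Lustre version, but the exact invocations used on lola are not recorded here):

# On the primary MDS (lola-8): start a full LFSCK on all MDTs,
# rate-limited to ~1000 objects per second.
lctl lfsck_start -M soaked-MDT0000 -s 1000 -t all -A

# Enlarge the kernel debug buffer to 512 MB and enable LFSCK tracing
# (the 'lfsck' debug mask name is assumed).
lctl set_param debug_mb=512
lctl set_param debug=+lfsck

# Query progress; the attached *-lfsck-status logs were presumably
# collected from parameters of this kind.
lctl get_param -n mdd.soaked-MDT0000.lfsck_namespace     # on the MDS
lctl get_param -n mdd.soaked-MDT0000.lfsck_layout        # on the MDS
lctl get_param -n osd-ldiskfs.soaked-MDT0000.oi_scrub    # OI scrub status, on the MDS
lctl get_param -n obdfilter.soaked-OST*.lfsck_layout     # on each OSS

# Convert the binary debug dump referenced above into readable text.
lctl debug_file /tmp/lustre-log.1453949036.15397 /tmp/lustre-log.1453949036.15397.txt

The slab growth itself can be watched directly on the OSS; 'size-1048576' is the generic 1 MiB slab cache on this RHEL6-based kernel, so ~27 GB corresponds to roughly 27,000 live 1 MiB objects:

# Active/total object counts for the 1 MiB generic cache
grep '^size-1048576 ' /proc/slabinfo

# Interactive view of all caches, sorted by cache size
slabtop -s c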


 Comments   
Comment by nasf (Inactive) [ 28/Jan/16 ]

According to the log file messages-lola-5.log.bz2, there are four kinds of hung threads:

1) OUT handlers for attr set. These RPC handlers hit the ASSERT() described in LU-6656/LU-7720

Jan 27 08:46:18 lola-5 kernel: LustreError: 7807:0:(osd_object.c:925:osd_attr_set()) ASSERTION( dt_object_exists(dt) ) failed: 
Jan 27 08:46:18 lola-5 kernel: LustreError: 7807:0:(osd_object.c:925:osd_attr_set()) LBUG
Jan 27 08:46:18 lola-5 kernel: Pid: 7807, comm: ll_ost_out00_00
Jan 27 08:46:18 lola-5 kernel: 
Jan 27 08:46:18 lola-5 kernel: Call Trace:
Jan 27 08:46:18 lola-5 kernel: [<ffffffffa05c5875>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
Jan 27 08:46:18 lola-5 kernel: [<ffffffffa05c5e77>] lbug_with_loc+0x47/0xb0 [libcfs]
Jan 27 08:46:18 lola-5 kernel: [<ffffffffa0b25af5>] osd_attr_set+0xdd5/0xe40 [osd_zfs]
Jan 27 08:46:18 lola-5 kernel: [<ffffffffa070e795>] ? keys_fill+0xd5/0x1b0 [obdclass]
Jan 27 08:46:18 lola-5 kernel: [<ffffffffa02da916>] ? spl_kmem_alloc+0x96/0x1a0 [spl]
Jan 27 08:46:18 lola-5 kernel: [<ffffffffa09b2033>] out_tx_attr_set_exec+0xa3/0x480 [ptlrpc]
Jan 27 08:46:18 lola-5 kernel: [<ffffffffa09a849a>] out_tx_end+0xda/0x5c0 [ptlrpc]
Jan 27 08:46:18 lola-5 kernel: [<ffffffffa09ae364>] out_handle+0x11c4/0x19a0 [ptlrpc]
Jan 27 08:46:18 lola-5 kernel: [<ffffffff8152b83e>] ? mutex_lock+0x1e/0x50
Jan 27 08:46:18 lola-5 kernel: [<ffffffffa099d6fa>] ? req_can_reconstruct+0x6a/0x120 [ptlrpc]
Jan 27 08:46:18 lola-5 kernel: [<ffffffffa09a4f9c>] tgt_request_handle+0x8ec/0x1470 [ptlrpc]
Jan 27 08:46:18 lola-5 kernel: [<ffffffffa094c201>] ptlrpc_main+0xe41/0x1910 [ptlrpc]
Jan 27 08:46:18 lola-5 kernel: [<ffffffff8152a39e>] ? thread_return+0x4e/0x7d0
Jan 27 08:46:18 lola-5 kernel: [<ffffffffa094b3c0>] ? ptlrpc_main+0x0/0x1910 [ptlrpc]
Jan 27 08:46:18 lola-5 kernel: [<ffffffff8109e78e>] kthread+0x9e/0xc0
Jan 27 08:46:18 lola-5 kernel: [<ffffffff8100c28a>] child_rip+0xa/0x20
Jan 27 08:46:18 lola-5 kernel: [<ffffffff8109e6f0>] ? kthread+0x0/0xc0
Jan 27 08:46:18 lola-5 kernel: [<ffffffff8100c280>] ? child_rip+0x0/0x2

2) IO server threads. They are blocked when starting a transaction on ZFS.

Jan 27 08:56:22 lola-5 kernel: LNet: Service thread pid 7776 was inactive for 200.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
Jan 27 08:56:22 lola-5 kernel: Pid: 7776, comm: ll_ost_io01_003
Jan 27 08:56:22 lola-5 kernel: 
Jan 27 08:56:22 lola-5 kernel: Call Trace:
Jan 27 08:56:22 lola-5 kernel: [<ffffffff8117591d>] ? kmem_cache_alloc_node_trace+0x1cd/0x200
Jan 27 08:56:22 lola-5 kernel: [<ffffffff8109ee6e>] ? prepare_to_wait_exclusive+0x4e/0x80
Jan 27 08:56:22 lola-5 kernel: [<ffffffffa02e178d>] cv_wait_common+0x11d/0x130 [spl]
Jan 27 08:56:22 lola-5 kernel: [<ffffffff8109ec20>] ? autoremove_wake_function+0x0/0x40
Jan 27 08:56:22 lola-5 kernel: [<ffffffff8100bc0e>] ? apic_timer_interrupt+0xe/0x20
Jan 27 08:56:22 lola-5 kernel: [<ffffffffa02e17f5>] __cv_wait+0x15/0x20 [spl]
Jan 27 08:56:22 lola-5 kernel: [<ffffffffa0356d4d>] dmu_tx_wait+0x21d/0x400 [zfs]
Jan 27 08:56:22 lola-5 kernel: [<ffffffff8152b844>] ? mutex_lock+0x24/0x50
Jan 27 08:56:22 lola-5 kernel: [<ffffffffa0357121>] dmu_tx_assign+0xa1/0x570 [zfs]
Jan 27 08:56:22 lola-5 kernel: [<ffffffffa0b18f5d>] osd_trans_start+0xed/0x430 [osd_zfs]
Jan 27 08:56:22 lola-5 kernel: [<ffffffffa0fccc4c>] ofd_trans_start+0x7c/0x100 [ofd]
Jan 27 08:56:22 lola-5 kernel: [<ffffffffa0fd4993>] ofd_commitrw_write+0x583/0x10f0 [ofd]
Jan 27 08:56:22 lola-5 kernel: [<ffffffffa0fd5abf>] ofd_commitrw+0x5bf/0xbf0 [ofd]
Jan 27 08:56:22 lola-5 kernel: [<ffffffffa06ec921>] ? lprocfs_counter_add+0x151/0x1c0 [obdclass]
Jan 27 08:56:22 lola-5 kernel: [<ffffffffa099cfd4>] obd_commitrw+0x114/0x380 [ptlrpc]
Jan 27 08:56:22 lola-5 kernel: [<ffffffffa09a6790>] tgt_brw_write+0xc70/0x1540 [ptlrpc]
Jan 27 08:56:22 lola-5 kernel: [<ffffffff8105c256>] ? enqueue_task+0x66/0x80
Jan 27 08:56:22 lola-5 kernel: [<ffffffff8105872d>] ? check_preempt_curr+0x6d/0x90
Jan 27 08:56:22 lola-5 kernel: [<ffffffff81064a6e>] ? try_to_wake_up+0x24e/0x3e0
Jan 27 08:56:22 lola-5 kernel: [<ffffffffa093e070>] ? lustre_swab_niobuf_remote+0x0/0x30 [ptlrpc]
Jan 27 08:56:22 lola-5 kernel: [<ffffffffa08f35d0>] ? target_bulk_timeout+0x0/0xc0 [ptlrpc]
Jan 27 08:56:22 lola-5 kernel: [<ffffffffa09a4f9c>] tgt_request_handle+0x8ec/0x1470 [ptlrpc]
Jan 27 08:56:22 lola-5 kernel: [<ffffffffa094c201>] ptlrpc_main+0xe41/0x1910 [ptlrpc]
Jan 27 08:56:22 lola-5 kernel: [<ffffffff8152a39e>] ? thread_return+0x4e/0x7d0
Jan 27 08:56:22 lola-5 kernel: [<ffffffffa094b3c0>] ? ptlrpc_main+0x0/0x1910 [ptlrpc]
Jan 27 08:56:22 lola-5 kernel: [<ffffffff8109e78e>] kthread+0x9e/0xc0
Jan 27 08:56:22 lola-5 kernel: [<ffffffff8100c28a>] child_rip+0xa/0x20
Jan 27 08:56:22 lola-5 kernel: [<ffffffff8109e6f0>] ? kthread+0x0/0xc0
Jan 27 08:56:22 lola-5 kernel: [<ffffffff8100c280>] ? child_rip+0x0/0x20

3) OUT handlers for orphan cleanup. They are also blocked when starting a transaction on ZFS (see the kstat sketch after the trace below).

Jan 27 17:51:32 lola-5 kernel: Pid: 8262, comm: ll_ost02_051
Jan 27 17:51:32 lola-5 kernel: 
Jan 27 17:51:32 lola-5 kernel: Call Trace:
Jan 27 17:51:32 lola-5 kernel: [<ffffffffa02e178d>] cv_wait_common+0x11d/0x130 [spl]
Jan 27 17:51:32 lola-5 kernel: [<ffffffff8109ec20>] ? autoremove_wake_function+0x0/0x40
Jan 27 17:51:32 lola-5 kernel: [<ffffffffa02e17f5>] __cv_wait+0x15/0x20 [spl]
Jan 27 17:51:32 lola-5 kernel: [<ffffffffa039877b>] txg_wait_open+0x8b/0xd0 [zfs]
Jan 27 17:51:32 lola-5 kernel: [<ffffffffa0356f27>] dmu_tx_wait+0x3f7/0x400 [zfs]
Jan 27 17:51:32 lola-5 kernel: [<ffffffffa036b5da>] ? dsl_dir_tempreserve_space+0xca/0x190 [zfs]
Jan 27 17:51:32 lola-5 kernel: [<ffffffff8152b83e>] ? mutex_lock+0x1e/0x50
Jan 27 17:51:32 lola-5 kernel: [<ffffffffa0357121>] dmu_tx_assign+0xa1/0x570 [zfs]
Jan 27 17:51:32 lola-5 kernel: [<ffffffffa0b18f5d>] osd_trans_start+0xed/0x430 [osd_zfs]
Jan 27 17:51:32 lola-5 kernel: [<ffffffffa0fcef10>] ofd_object_destroy+0x280/0x8e0 [ofd]
Jan 27 17:51:32 lola-5 kernel: [<ffffffffa0fc8f0d>] ofd_destroy_by_fid+0x35d/0x620 [ofd]
Jan 27 17:51:32 lola-5 kernel: [<ffffffffa090abc0>] ? ldlm_blocking_ast+0x0/0x180 [ptlrpc]
Jan 27 17:51:32 lola-5 kernel: [<ffffffffa090c530>] ? ldlm_completion_ast+0x0/0x9b0 [ptlrpc]
Jan 27 17:51:32 lola-5 kernel: [<ffffffffa05d0d01>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
Jan 27 17:51:32 lola-5 kernel: [<ffffffffa0fc0d13>] ofd_create_hdl+0x1a23/0x2640 [ofd]
Jan 27 17:51:32 lola-5 kernel: [<ffffffffa093a66b>] ? lustre_pack_reply_v2+0x1eb/0x280 [ptlrpc]
Jan 27 17:51:32 lola-5 kernel: [<ffffffffa09a4f9c>] tgt_request_handle+0x8ec/0x1470 [ptlrpc]
Jan 27 17:51:32 lola-5 kernel: [<ffffffffa094c201>] ptlrpc_main+0xe41/0x1910 [ptlrpc]
Jan 27 17:51:32 lola-5 kernel: [<ffffffff8152a39e>] ? thread_return+0x4e/0x7d0
Jan 27 17:51:32 lola-5 kernel: [<ffffffffa094b3c0>] ? ptlrpc_main+0x0/0x1910 [ptlrpc]
Jan 27 17:51:32 lola-5 kernel: [<ffffffff8109e78e>] kthread+0x9e/0xc0
Jan 27 17:51:32 lola-5 kernel: [<ffffffff8100c28a>] child_rip+0xa/0x20
Jan 27 17:51:32 lola-5 kernel: [<ffffffff8109e6f0>] ? kthread+0x0/0xc0
Jan 27 17:51:32 lola-5 kernel: [<ffffffff8100c280>] ? child_rip+0x0/0x20
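
For cases 2) and 3), one way to check from userspace whether the ZFS transaction engine is making any progress at all is to watch the SPL kstats (a sketch; the kstat paths are the usual ones for ZFS-on-Linux 0.6.x, the pool name is a placeholder):

# Per-pool txg history: if the open txg never advances, writers sit in
# dmu_tx_wait()/txg_wait_open() exactly as in the traces above.
cat /proc/spl/kstat/zfs/<pool>/txgs

# Global dmu_tx counters (assigned / delayed / suspended, etc.)
cat /proc/spl/kstat/zfs/dmu_tx

# Check whether the pool has suspended I/O because of an error
zpool status -v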

4) OUT handlers for create. They are blocked at the mutex_lock().

Jan 27 17:59:44 lola-5 kernel: [<ffffffff8152bd26>] __mutex_lock_slowpath+0x96/0x210
Jan 27 17:59:44 lola-5 kernel: [<ffffffffa0fcbe7b>] ? ofd_seq_load+0xbb/0xa90 [ofd]
Jan 27 17:59:44 lola-5 kernel: [<ffffffff8152b84b>] mutex_lock+0x2b/0x50
Jan 27 17:59:44 lola-5 kernel: [<ffffffffa0fbff18>] ofd_create_hdl+0xc28/0x2640 [ofd]
Jan 27 17:59:44 lola-5 kernel: [<ffffffffa093a66b>] ? lustre_pack_reply_v2+0x1eb/0x280 [ptlrpc]
Jan 27 17:59:44 lola-5 kernel: [<ffffffffa093a7a6>] ? lustre_pack_reply_flags+0xa6/0x1e0 [ptlrpc]
Jan 27 17:59:44 lola-5 kernel: [<ffffffffa093a8f1>] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
Jan 27 17:59:44 lola-5 kernel: [<ffffffffa09a4f9c>] tgt_request_handle+0x8ec/0x1470 [ptlrpc]
Jan 27 17:59:44 lola-5 kernel: [<ffffffffa094c201>] ptlrpc_main+0xe41/0x1910 [ptlrpc]
Jan 27 17:59:44 lola-5 kernel: [<ffffffff8106c560>] ? pick_next_task_fair+0xd0/0x130
Jan 27 17:59:44 lola-5 kernel: [<ffffffff8152a126>] ? schedule+0x176/0x3a0
Jan 27 17:59:44 lola-5 kernel: [<ffffffffa094b3c0>] ? ptlrpc_main+0x0/0x1910 [ptlrpc]
Jan 27 17:59:44 lola-5 kernel: [<ffffffff8109e78e>] kthread+0x9e/0xc0
Jan 27 17:59:44 lola-5 kernel: [<ffffffff8100c28a>] child_rip+0xa/0x20
Jan 27 17:59:44 lola-5 kernel: [<ffffffff8109e6f0>] ? kthread+0x0/0xc0
Jan 27 17:59:44 lola-5 kernel: [<ffffffff8100c280>] ? child_rip+0x0/0x20

As more and more RPC service threads become blocked, more and more RAM is occupied, eventually leading to the OOM. Since the threads in cases 1)/2)/3) may hold a mutex while blocked, it is not surprising that case 4) happened. So the key question is not why the mutex is held, but why the ZFS transactions hang.
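
A quick way to confirm this kind of pile-up on a live node (a sketch, not taken from the lola logs):

# Count threads stuck in uninterruptible sleep, grouped by name; a steadily
# growing number of ll_ost*/ll_ost_io*/ll_ost_out* entries matches the
# description above.
ps -eo state,comm | awk '$1 == "D"' | sort | uniq -c | sort -rn

# Dump all kernel stacks to the console/dmesg for the full picture.
echo t > /proc/sysrq-trigger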

According to the stack info, it looks similar to LU-6923.

Comment by Joseph Gmitter (Inactive) [ 28/Jan/16 ]

Hi Fan Yong,
Can you take care of this one?
Thanks.
Joe

Comment by nasf (Inactive) [ 29/Jan/16 ]

As explained above, the key issue is that the ZFS transactions were blocked for some unknown reason. According to the kernel stack trace, this should be another failure instance of LU-6923, which has no solution yet.

Comment by nasf (Inactive) [ 29/Jan/16 ]

It is another failure instance of LU-6923.
