[LU-3680] OOM crash: null_alloc_rs()) ASSERTION( rs->rs_size >= rs_size ) failed Created: 01/Aug/13 Updated: 16/Mar/16 Resolved: 19/Jan/14 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.5.0 |
| Fix Version/s: | Lustre 2.6.0, Lustre 2.5.1, Lustre 2.4.3, Lustre 2.8.0 |
| Type: | Bug | Priority: | Major |
| Reporter: | Oleg Drokin | Assignee: | Oleg Drokin |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Attachments: | |
| Issue Links: | |
| Severity: | 3 |
| Rank (Obsolete): | 9502 |
| Description |
|
Hit this running sanity in a loop: <4>[80900.195000] ldlm_cn01_000: page allocation failure. order:1, mode:0x40 <4>[80900.195002] Pid: 17587, comm: ldlm_cn01_000 Not tainted 2.6.32-rhe6.4-debug #2 <4>[80900.195003] Call Trace: <4>[80900.195008] [<ffffffff8112a666>] ? __alloc_pages_nodemask+0x7c6/0x980 <4>[80900.195011] [<ffffffff811658f2>] ? kmem_getpages+0x62/0x170 <4>[80900.195013] [<ffffffff8116834a>] ? fallback_alloc+0x1ba/0x270 <4>[80900.195015] [<ffffffff81167bf7>] ? cache_grow+0x4d7/0x520 <4>[80900.195017] [<ffffffff81168038>] ? ____cache_alloc_node+0xa8/0x200 <4>[80900.195018] [<ffffffff81168943>] ? kmem_cache_alloc_trace+0x1c3/0x250 <4>[80900.195029] [<ffffffffa099cbc5>] ? osd_key_init+0x25/0x4e0 [osd_ldiskfs] <4>[80900.195035] [<ffffffffa099cbc5>] ? osd_key_init+0x25/0x4e0 [osd_ldiskfs] <4>[80900.195060] [<ffffffffa0bdd27f>] ? keys_fill+0x6f/0x190 [obdclass] <4>[80900.195090] [<ffffffffa0be132e>] ? lu_context_init+0x4e/0x240 [obdclass] <4>[80900.195109] [<ffffffffa0be1383>] ? lu_context_init+0xa3/0x240 [obdclass] <4>[80900.195111] [<ffffffff811665be>] ? cache_free_debugcheck+0x2ae/0x360 <4>[80900.195130] [<ffffffffa0be153e>] ? lu_env_init+0x1e/0x30 [obdclass] <4>[80900.195140] [<ffffffffa0e3d69a>] ? ofd_lvbo_update+0x7a/0xea8 [ofd] <4>[80900.195164] [<ffffffffa04ac434>] ? ldlm_resource_putref+0x1d4/0x280 [ptlrpc] <4>[80900.195186] [<ffffffffa04c97b7>] ? ldlm_request_cancel+0x247/0x410 [ptlrpc] <4>[80900.195206] [<ffffffffa04c9abd>] ? ldlm_handle_cancel+0x13d/0x240 [ptlrpc] <4>[80900.195226] [<ffffffffa04cefb9>] ? ldlm_cancel_handler+0x1e9/0x500 [ptlrpc] <4>[80900.195250] [<ffffffffa04ffad1>] ? ptlrpc_server_handle_request+0x3b1/0xc70 [ptlrpc] <4>[80900.195260] [<ffffffffa0a2355e>] ? cfs_timer_arm+0xe/0x10 [libcfs] <4>[80900.195270] [<ffffffffa0a34b6f>] ? lc_watchdog_touch+0x6f/0x170 [libcfs] <4>[80900.195340] [<ffffffffa04f6bb1>] ? ptlrpc_wait_event+0xb1/0x2a0 [ptlrpc] <4>[80900.195345] [<ffffffff81054613>] ? 
__wake_up+0x53/0x70 <4>[80900.195367] [<ffffffffa0500db2>] ? ptlrpc_main+0xa22/0x1650 [ptlrpc] <4>[80900.195437] [<ffffffffa0500390>] ? ptlrpc_main+0x0/0x1650 [ptlrpc] <4>[80900.195441] [<ffffffff81094606>] ? kthread+0x96/0xa0 <4>[80900.195444] [<ffffffff8100c10a>] ? child_rip+0xa/0x20 <4>[80900.195447] [<ffffffff81094570>] ? kthread+0x0/0xa0 <4>[80900.195448] [<ffffffff8100c100>] ? child_rip+0x0/0x20 <6>[80900.195449] Mem-Info: <4>[80900.195450] Node 0 DMA per-cpu: <4>[80900.195451] CPU 0: hi: 0, btch: 1 usd: 0 <4>[80900.195452] CPU 1: hi: 0, btch: 1 usd: 0 <4>[80900.195453] CPU 2: hi: 0, btch: 1 usd: 0 <4>[80900.195454] CPU 3: hi: 0, btch: 1 usd: 0 <4>[80900.195455] CPU 4: hi: 0, btch: 1 usd: 0 <4>[80900.195456] CPU 5: hi: 0, btch: 1 usd: 0 <4>[80900.195458] CPU 6: hi: 0, btch: 1 usd: 0 <4>[80900.195459] CPU 7: hi: 0, btch: 1 usd: 0 <4>[80900.195459] Node 0 DMA32 per-cpu: <4>[80900.195460] CPU 0: hi: 186, btch: 31 usd: 51 <4>[80900.195461] CPU 1: hi: 186, btch: 31 usd: 26 <4>[80900.195462] CPU 2: hi: 186, btch: 31 usd: 0 <4>[80900.195463] CPU 3: hi: 186, btch: 31 usd: 0 <4>[80900.195464] CPU 4: hi: 186, btch: 31 usd: 57 <4>[80900.195465] CPU 5: hi: 186, btch: 31 usd: 174 <4>[80900.195466] CPU 6: hi: 186, btch: 31 usd: 162 <4>[80900.195467] CPU 7: hi: 186, btch: 31 usd: 32 <4>[80900.195470] active_anon:61548 inactive_anon:61459 isolated_anon:0 <4>[80900.195470] active_file:94797 inactive_file:74222 isolated_file:0 <4>[80900.195471] unevictable:0 dirty:20 writeback:0 unstable:0 <4>[80900.195471] free:43025 slab_reclaimable:75111 slab_unreclaimable:271092 <4>[80900.195472] mapped:577 shmem:119300 pagetables:383 bounce:0 <4>[80900.195473] Node 0 DMA free:9692kB min:136kB low:168kB high:204kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:9296kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB 
unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes <4>[80900.195478] lowmem_reserve[]: 0 2967 2967 2967 <4>[80900.195479] Node 0 DMA32 free:162408kB min:44916kB low:56144kB high:67372kB active_anon:246192kB inactive_anon:245836kB active_file:379188kB inactive_file:296888kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3039076kB mlocked:0kB dirty:80kB writeback:0kB mapped:2308kB shmem:477200kB slab_reclaimable:300444kB slab_unreclaimable:1084368kB kernel_stack:3296kB pagetables:1532kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no <4>[80900.195485] lowmem_reserve[]: 0 0 0 0 <4>[80900.195486] Node 0 DMA: 3*4kB 0*8kB 3*16kB 1*32kB 2*64kB 0*128kB 1*256kB 0*512kB 1*1024kB 0*2048kB 2*4096kB = 9692kB <4>[80900.195490] Node 0 DMA32: 37378*4kB 1032*8kB 32*16kB 1*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 1*4096kB = 162408kB <4>[80900.195495] 129723 total pagecache pages <4>[80900.195496] 925 pages in swap cache <4>[80900.195497] Swap cache stats: add 376365, delete 375440, find 1323456/1328202 <4>[80900.195498] Free swap = 1869104kB <4>[80900.195499] Total swap = 2097144kB <6>[80900.198861] 774396 pages RAM <6>[80900.198861] 38583 pages reserved <6>[80900.198861] 11942 pages shared <6>[80900.198861] 675636 pages non-shared <4>[80900.226747] 129650 total pagecache pages <4>[80900.226747] 1136 pages in swap cache <4>[80900.226747] Swap cache stats: add 376650, delete 375514, find 1323456/1328202 <4>[80900.226747] Free swap = 1867964kB <4>[80900.226747] Total swap = 2097144kB <6>[80900.226747] 774396 pages RAM <6>[80900.226747] 38583 pages reserved <6>[80900.226747] 11963 pages shared <6>[80900.226747] 668761 pages non-shared <0>[80900.502883] LustreError: 17604:0:(sec_null.c:318:null_alloc_rs()) ASSERTION( rs->rs_size >= rs_size ) failed: <0>[80900.504111] LustreError: 17604:0:(sec_null.c:318:null_alloc_rs()) LBUG <4>[80900.504782] Pid: 17604, comm: mdt01_002 <4>[80900.505352] 
<4>[80900.505353] Call Trace: <4>[80900.506312] [<ffffffffa0a228a5>] libcfs_debug_dumpstack+0x55/0x80 [libcfs] <4>[80900.507011] [<ffffffffa0a22ea7>] lbug_with_loc+0x47/0xb0 [libcfs] <4>[80900.507716] [<ffffffffa052a382>] null_alloc_rs+0x272/0x390 [ptlrpc] <4>[80900.508419] [<ffffffffa0518f19>] sptlrpc_svc_alloc_rs+0x1d9/0x2a0 [ptlrpc] <4>[80900.509166] [<ffffffffa04ef218>] lustre_pack_reply_v2+0x98/0x2a0 [ptlrpc] <4>[80900.509906] [<ffffffffa04ef4ce>] lustre_pack_reply_flags+0xae/0x1f0 [ptlrpc] <4>[80900.510935] [<ffffffffa04ef621>] lustre_pack_reply+0x11/0x20 [ptlrpc] <4>[80900.511642] [<ffffffffa0516603>] req_capsule_server_pack+0x53/0x100 [ptlrpc] <4>[80900.513248] [<ffffffffa0d472e5>] mdt_getxattr+0x585/0x13c0 [mdt] <4>[80900.514017] [<ffffffffa0d2570e>] mdt_intent_getxattr+0x9e/0x160 [mdt] <4>[80900.514572] [<ffffffffa0d2265e>] mdt_intent_policy+0x3ae/0x770 [mdt] <4>[80900.515391] [<ffffffffa04a735a>] ldlm_lock_enqueue+0x2ea/0x860 [ptlrpc] <4>[80900.516099] [<ffffffffa04cfc7f>] ldlm_handle_enqueue0+0x4ef/0x10c0 [ptlrpc] <4>[80900.518542] [<ffffffffa0d22b26>] mdt_enqueue+0x46/0xe0 [mdt] <4>[80900.519098] [<ffffffffa0d28ca7>] mdt_handle_common+0x647/0x16d0 [mdt] <4>[80900.519529] [<ffffffffa0d63335>] mds_regular_handle+0x15/0x20 [mdt] <4>[80900.519959] [<ffffffffa04ffad1>] ptlrpc_server_handle_request+0x3b1/0xc70 [ptlrpc] <4>[80900.520740] [<ffffffffa0a2355e>] ? cfs_timer_arm+0xe/0x10 [libcfs] <4>[80900.521190] [<ffffffffa0a34b6f>] ? lc_watchdog_touch+0x6f/0x170 [libcfs] <4>[80900.521644] [<ffffffffa04f6bb1>] ? ptlrpc_wait_event+0xb1/0x2a0 [ptlrpc] <4>[80900.522146] [<ffffffff81054613>] ? __wake_up+0x53/0x70 <4>[80900.522670] [<ffffffffa0500db2>] ptlrpc_main+0xa22/0x1650 [ptlrpc] <4>[80900.523097] [<ffffffffa0500390>] ? ptlrpc_main+0x0/0x1650 [ptlrpc] <4>[80900.523532] [<ffffffff81094606>] kthread+0x96/0xa0 <4>[80900.526690] [<ffffffff8100c10a>] child_rip+0xa/0x20 <4>[80900.527224] [<ffffffff81094570>] ? kthread+0x0/0xa0 <4>[80900.527603] [<ffffffff8100c100>] ? 
child_rip+0x0/0x20 <4>[80900.528010] <0>[80900.528950] Kernel panic - not syncing: LBUG |
| Comments |
| Comment by Oleg Drokin [ 08/Aug/13 ] |
|
At the very minimum we need to fix the error handling for this allocation. |
| Comment by Keith Mannthey (Inactive) [ 12/Aug/13 ] |
|
Part of the issue is that we never check the return code of the handler call; we just pass it along. This svc->srv_ops.so_req_handler(request) should be failing with -ENOMEM, but we don't check the result and call ptlrpc_rqphase_move() as if everything is fine: rc = svc->srv_ops.so_req_handler(request);
ptlrpc_rqphase_move(request, RQ_PHASE_COMPLETE);
put_conn:
The only mention of rc from there to the end of the function is a log message. I will make a patch to spur conversation; it seems we only return 0 or 1 out of this function. |
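[A minimal userspace sketch of the check Keith describes. This is not Lustre code: the struct, the handler typedef, and the function names here are hypothetical stand-ins that only model "look at the handler's return code before advancing the request phase".]

```c
#include <assert.h>
#include <errno.h>

enum rq_phase { RQ_PHASE_INTERPRET, RQ_PHASE_COMPLETE };

struct request {
        enum rq_phase phase;
};

/* stand-in for svc->srv_ops.so_req_handler(request) */
typedef int (*req_handler_t)(struct request *);

/* models the dispatch path: do not move the request to COMPLETE
 * when the handler failed (e.g. with -ENOMEM) */
int dispatch(struct request *req, req_handler_t handler)
{
        int rc = handler(req);

        if (rc != 0)
                return rc;      /* propagate the failure instead of ignoring it */

        req->phase = RQ_PHASE_COMPLETE;   /* models ptlrpc_rqphase_move() */
        return 0;
}

static int handler_ok(struct request *req)  { (void)req; return 0; }
static int handler_oom(struct request *req) { (void)req; return -ENOMEM; }
```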
| Comment by Keith Mannthey (Inactive) [ 12/Aug/13 ] |
|
Hmm, I am not sure these two backtraces are 100% related, as they are from different threads, but there is an odd thing on the LBUG error call chain that confuses me a bit. In: int lustre_pack_reply_v2(struct ptlrpc_request *req, int count,
__u32 *lens, char **bufs, int flags)
{
struct ptlrpc_reply_state *rs;
int msg_len, rc;
ENTRY;
LASSERT(req->rq_reply_state == NULL); <==== LASSERT rq_reply_state is NULL
if ((flags & LPRFL_EARLY_REPLY) == 0) {
spin_lock(&req->rq_lock);
req->rq_packed_final = 1;
spin_unlock(&req->rq_lock);
}
msg_len = lustre_msg_size_v2(count, lens);
rc = sptlrpc_svc_alloc_rs(req, msg_len); <==== pass that same req on.
sptlrpc_svc_alloc_rs() does a little safety check and calls this: static
int null_alloc_rs(struct ptlrpc_request *req, int msgsize)
{
struct ptlrpc_reply_state *rs;
int rs_size = sizeof(*rs) + msgsize;
LASSERT(msgsize % 8 == 0);
rs = req->rq_reply_state; <=== LASSERTed as NULL just a little while ago.
if (rs) {
/* pre-allocated */
LASSERT(rs->rs_size >= rs_size); <==== we hit this, and I am confused as to why.
Is some separate thread acting on this request at the same time? You don't happen to have a Lustre debug log hanging around, do you, Oleg? |
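[A hedged userspace model of the size check discussed above. It is not the actual null_alloc_rs() implementation: struct layout and names are simplified stand-ins, and the point is only that a too-small pre-allocated reply state could be rejected and replaced rather than tripping an LBUG.]

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct reply_state {
        int  rs_size;       /* total size this buffer can hold */
        int  rs_prealloc;   /* non-zero if it came from an emergency pool */
        char rs_msg[];      /* reply message payload */
};

/* models a fallback instead of LASSERT(rs->rs_size >= rs_size):
 * if the pre-allocated buffer is too small, allocate a fresh one */
struct reply_state *alloc_rs(struct reply_state *prealloc, int msgsize)
{
        int rs_size = (int)sizeof(struct reply_state) + msgsize;
        struct reply_state *rs = prealloc;

        if (rs != NULL && rs->rs_size < rs_size)
                rs = NULL;                      /* too small: do not reuse */

        if (rs == NULL) {
                rs = calloc(1, (size_t)rs_size);
                if (rs == NULL)
                        return NULL;            /* caller would see -ENOMEM */
                rs->rs_size = rs_size;
        }
        return rs;
}
```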
| Comment by Andreas Dilger [ 20/Sep/13 ] |
|
I'm more concerned about why running sanity.sh in a loop is running out of memory? |
| Comment by Li Xi (Inactive) [ 29/Sep/13 ] |
|
We hit the same LBUG on an MDS too. Maybe service->srv_max_reply_size is not big enough to hold the reply message? When memory is under pressure and the buffer is returned by lustre_get_emerg_rs(), this LBUG happens. |
| Comment by Oleg Drokin [ 05/Oct/13 ] |
|
here's a debug log from recent instance. The crash info is: <4>[19273.725502] mdt01_000: page allocation failure. order:2, mode:0x40 <4>[19273.726180] Pid: 14141, comm: mdt01_000 Not tainted 2.6.32-rhe6.4-debug #2 <4>[19273.726873] Call Trace: <4>[19273.727380] [<ffffffff8112a666>] ? __alloc_pages_nodemask+0x7c6/0x980 <4>[19273.728063] [<ffffffff8127f456>] ? vsnprintf+0x336/0x5e0 <4>[19273.728697] [<ffffffff811658f2>] ? kmem_getpages+0x62/0x170 <4>[19273.729334] [<ffffffff8116834a>] ? fallback_alloc+0x1ba/0x270 <4>[19273.729334] [<ffffffff81167bf7>] ? cache_grow+0x4d7/0x520 <4>[19273.729334] [<ffffffff81168038>] ? ____cache_alloc_node+0xa8/0x200 <4>[19273.729334] [<ffffffff811686e8>] ? __kmalloc+0x208/0x2a0 <4>[19273.729334] [<ffffffffa118d515>] ? null_alloc_rs+0xc5/0x390 [ptlrpc] <4>[19273.729334] [<ffffffffa118d515>] ? null_alloc_rs+0xc5/0x390 [ptlrpc] <4>[19273.729334] [<ffffffffa117c484>] ? sptlrpc_svc_alloc_rs+0x74/0x2a0 [ptlrpc] <4>[19273.729334] [<ffffffffa1153678>] ? lustre_pack_reply_v2+0x98/0x2a0 [ptlrpc] <4>[19273.729334] [<ffffffffa115392e>] ? lustre_pack_reply_flags+0xae/0x1f0 [ptlrpc] <4>[19273.738776] [<ffffffffa1153a81>] ? lustre_pack_reply+0x11/0x20 [ptlrpc] <4>[19273.738776] [<ffffffffa1179c93>] ? req_capsule_server_pack+0x53/0x100 [ptlrpc] <4>[19273.738776] [<ffffffffa05a6765>] ? mdt_getxattr+0x545/0x1490 [mdt] <4>[19273.738776] [<ffffffffa05847be>] ? mdt_intent_getxattr+0x9e/0x160 [mdt] <4>[19273.738776] [<ffffffffa0581e2e>] ? mdt_intent_policy+0x3ae/0x770 [mdt] <4>[19273.738776] [<ffffffffa110b34a>] ? ldlm_lock_enqueue+0x2ea/0x860 [ptlrpc] <4>[19273.738776] [<ffffffffa1133c6f>] ? ldlm_handle_enqueue0+0x4ef/0x10b0 [ptlrpc] <4>[19273.738776] [<ffffffffa05822f6>] ? mdt_enqueue+0x46/0xe0 [mdt] <4>[19273.738776] [<ffffffffa058897a>] ? mdt_handle_common+0x52a/0x1470 [mdt] <4>[19273.738776] [<ffffffffa05c2005>] ? mds_regular_handle+0x15/0x20 [mdt] <4>[19273.738776] [<ffffffffa11633f5>] ? 
ptlrpc_server_handle_request+0x395/0xc20 [ptlrpc] <4>[19273.738776] [<ffffffffa0eca35f>] ? lc_watchdog_touch+0x6f/0x170 [libcfs] <4>[19273.738776] [<ffffffffa115ad61>] ? ptlrpc_wait_event+0xc1/0x2e0 [ptlrpc] <4>[19273.738776] [<ffffffffa11646da>] ? ptlrpc_main+0xa5a/0x1690 [ptlrpc] <4>[19273.738776] [<ffffffffa1163c80>] ? ptlrpc_main+0x0/0x1690 [ptlrpc] <4>[19273.738776] [<ffffffff81094606>] ? kthread+0x96/0xa0 <4>[19273.738776] [<ffffffff8100c10a>] ? child_rip+0xa/0x20 <4>[19273.738776] [<ffffffff81094570>] ? kthread+0x0/0xa0 <4>[19273.738776] [<ffffffff8100c100>] ? child_rip+0x0/0x20 <6>[19273.738776] Mem-Info: <4>[19273.738776] Node 0 DMA per-cpu: <4>[19273.738776] CPU 0: hi: 0, btch: 1 usd: 0 <4>[19273.738776] CPU 1: hi: 0, btch: 1 usd: 0 <4>[19273.738776] CPU 2: hi: 0, btch: 1 usd: 0 <4>[19273.738776] CPU 3: hi: 0, btch: 1 usd: 0 <4>[19273.738776] CPU 4: hi: 0, btch: 1 usd: 0 <4>[19273.738776] CPU 5: hi: 0, btch: 1 usd: 0 <4>[19273.738776] CPU 6: hi: 0, btch: 1 usd: 0 <4>[19273.738776] CPU 7: hi: 0, btch: 1 usd: 0 <4>[19273.738776] Node 0 DMA32 per-cpu: <4>[19273.738776] CPU 0: hi: 186, btch: 31 usd: 33 <4>[19273.738776] CPU 1: hi: 186, btch: 31 usd: 172 <4>[19273.738776] CPU 2: hi: 186, btch: 31 usd: 147 <4>[19273.738776] CPU 3: hi: 186, btch: 31 usd: 154 <4>[19273.738776] CPU 4: hi: 186, btch: 31 usd: 126 <4>[19273.738776] CPU 5: hi: 186, btch: 31 usd: 33 <4>[19273.738776] CPU 6: hi: 186, btch: 31 usd: 155 <4>[19273.738776] CPU 7: hi: 186, btch: 31 usd: 123 <4>[19273.738776] active_anon:96976 inactive_anon:80133 isolated_anon:0 <4>[19273.738776] active_file:49462 inactive_file:38723 isolated_file:0 <4>[19273.738776] unevictable:0 dirty:1065 writeback:0 unstable:0 <4>[19273.738776] free:25017 slab_reclaimable:77546 slab_unreclaimable:312838 <4>[19273.738776] mapped:625 shmem:174432 pagetables:389 bounce:0 <4>[19273.738776] Node 0 DMA free:9692kB min:136kB low:168kB high:204kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB 
isolated(anon):0kB isolated(file):0kB present:9296kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes <4>[19273.738776] lowmem_reserve[]: 0 2967 2967 2967 <4>[19273.738776] Node 0 DMA32 free:91616kB min:44916kB low:56144kB high:67372kB active_anon:387904kB inactive_anon:320532kB active_file:196476kB inactive_file:154892kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3039076kB mlocked:0kB dirty:4260kB writeback:0kB mapped:2500kB shmem:697728kB slab_reclaimable:310184kB slab_unreclaimable:1251744kB kernel_stack:3304kB pagetables:1556kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no <4>[19273.738776] lowmem_reserve[]: 0 0 0 0 <4>[19273.738776] Node 0 DMA: 3*4kB 0*8kB 3*16kB 1*32kB 2*64kB 0*128kB 1*256kB 0*512kB 1*1024kB 0*2048kB 2*4096kB = 9692kB <4>[19273.738776] Node 0 DMA32: 10341*4kB 4432*8kB 23*16kB 19*32kB 10*64kB 2*128kB 6*256kB 2*512kB 2*1024kB 2*2048kB 1*4096kB = 91492kB <4>[19273.738776] 232691 total pagecache pages <4>[19273.738776] 53 pages in swap cache <4>[19273.738776] Swap cache stats: add 1502, delete 1449, find 18/20 <4>[19273.738776] Free swap = 2091200kB <4>[19273.738776] Total swap = 2097144kB <6>[19273.738776] 774396 pages RAM <6>[19273.738776] 38583 pages reserved <6>[19273.738776] 15526 pages shared <6>[19273.738776] 689850 pages non-shared <0>[19273.940246] LustreError: 14141:0:(sec_null.c:318:null_alloc_rs()) ASSERTION( rs->rs_size >= rs_size ) failed: <0>[19273.941315] LustreError: 14141:0:(sec_null.c:318:null_alloc_rs()) LBUG <4>[19273.941915] Pid: 14141, comm: mdt01_000 <4>[19273.942396] <4>[19273.942396] Call Trace: <4>[19273.945170] [<ffffffffa0eb88a5>] libcfs_debug_dumpstack+0x55/0x80 [libcfs] <4>[19273.945793] [<ffffffffa0eb8ea7>] lbug_with_loc+0x47/0xb0 [libcfs] <4>[19273.950476] [<ffffffffa118d6c2>] 
null_alloc_rs+0x272/0x390 [ptlrpc] <4>[19273.951103] [<ffffffffa117c5e9>] sptlrpc_svc_alloc_rs+0x1d9/0x2a0 [ptlrpc] <4>[19273.955634] [<ffffffffa1153678>] lustre_pack_reply_v2+0x98/0x2a0 [ptlrpc] <4>[19273.956346] [<ffffffffa115392e>] lustre_pack_reply_flags+0xae/0x1f0 [ptlrpc] <4>[19273.957072] [<ffffffffa1153a81>] lustre_pack_reply+0x11/0x20 [ptlrpc] <4>[19273.957780] [<ffffffffa1179c93>] req_capsule_server_pack+0x53/0x100 [ptlrpc] <4>[19273.958495] [<ffffffffa05a6765>] mdt_getxattr+0x545/0x1490 [mdt] <4>[19273.970919] [<ffffffffa05847be>] mdt_intent_getxattr+0x9e/0x160 [mdt] <4>[19273.971619] [<ffffffffa0581e2e>] mdt_intent_policy+0x3ae/0x770 [mdt] <4>[19273.972322] [<ffffffffa110b34a>] ldlm_lock_enqueue+0x2ea/0x860 [ptlrpc] <4>[19273.973035] [<ffffffffa1133c6f>] ldlm_handle_enqueue0+0x4ef/0x10b0 [ptlrpc] <4>[19273.973881] [<ffffffffa05822f6>] mdt_enqueue+0x46/0xe0 [mdt] <4>[19273.995450] [<ffffffffa058897a>] mdt_handle_common+0x52a/0x1470 [mdt] <4>[19273.996181] [<ffffffffa05c2005>] mds_regular_handle+0x15/0x20 [mdt] <4>[19274.008857] [<ffffffffa11633f5>] ptlrpc_server_handle_request+0x395/0xc20 [ptlrpc] <4>[19274.011971] [<ffffffffa0eca35f>] ? lc_watchdog_touch+0x6f/0x170 [libcfs] <4>[19274.012692] [<ffffffffa115ad61>] ? ptlrpc_wait_event+0xc1/0x2e0 [ptlrpc] <4>[19274.018425] [<ffffffffa11646da>] ptlrpc_main+0xa5a/0x1690 [ptlrpc] <4>[19274.019129] [<ffffffffa1163c80>] ? ptlrpc_main+0x0/0x1690 [ptlrpc] <4>[19274.019808] [<ffffffff81094606>] kthread+0x96/0xa0 <4>[19274.020415] [<ffffffff8100c10a>] child_rip+0xa/0x20 <4>[19274.021025] [<ffffffff81094570>] ? kthread+0x0/0xa0 <4>[19274.021639] [<ffffffff8100c100>] ? child_rip+0x0/0x20 <4>[19274.022265] <0>[19274.095286] Kernel panic - not syncing: LBUG I also have a crashdump. |
| Comment by Patrick Farrell (Inactive) [ 05/Nov/13 ] |
|
Hit this twice in the last few days on an MDS running master from Nov. 1st during testing for another bug. I've got dumps of both events if they'd be of any use. (Logging was set to default levels) The stack traces look the same for both of them, here's one: |
| Comment by Patrick Farrell (Inactive) [ 06/Nov/13 ] |
|
Just occurred to me I should mention: I can reproduce this fairly consistently. If requested, I can try to do this with debugging turned up. |
| Comment by Keith Mannthey (Inactive) [ 06/Nov/13 ] |
|
Patrick, what is the memory usage like when you hit this error? |
| Comment by Patrick Farrell (Inactive) [ 06/Nov/13 ] |
|
Keith, very high - memory pressure, definitely. Not actually OOM, but as low as it can go (at the point where the kernel starts freeing memory, I see free memory bounce around in the tens of MB). I've been working around it (still testing my other, unrelated bug) by adding more RAM to the virtual machine running the MDS and also dropping caches periodically to keep from going OOM. That's worked for me. |
| Comment by Keith Mannthey (Inactive) [ 06/Nov/13 ] |
|
Low memory is a tricky space in Lustre. Are you seeing other errors or just this one? |
| Comment by Patrick Farrell (Inactive) [ 07/Nov/13 ] |
|
This is the only one I've seen on the MDS. However, I've seen this one at least three times, and can reproduce it readily, so it may be obscuring others. On the client, we're chasing a recurrence of |
| Comment by Li Xi (Inactive) [ 07/Nov/13 ] |
|
Hi Patrick, would you please apply the following patch and try to reproduce the problem again? I am not able to reproduce this consistently. Thank you very much! |
| Comment by Patrick Farrell (Inactive) [ 07/Nov/13 ] |
|
Li, Sure. I'm doing that now - I've just tested to confirm I can still hit the problem, now I'm going to test the patch in the same situation. |
| Comment by Patrick Farrell (Inactive) [ 07/Nov/13 ] |
|
Li, I still hit the same crash with that patch in place. The stack trace is the same as before. I can make the dump available if you'd like. (Does Intel have an FTP site I could use? Otherwise I can find another way to make it available.) [Edit] The file is at: You cannot ls in outbound or |
| Comment by Li Xi (Inactive) [ 08/Nov/13 ] |
|
Hi Patrick, thank you very much for your test. It seems that the problem is more complex than I thought. I will investigate more. We can share the debug files by uploading them to ftp://ftp.whamcloud.com/uploads/LU-*. |
| Comment by Li Xi (Inactive) [ 08/Nov/13 ] |
|
Hi Patrick, it seems that there is a problem in lustre_get_emerg_rs(). The size of the buffer is set to zero, which will cause this LBUG when memory is under high pressure. I've updated the patch. Would you please test whether it works? Thank you very much! |
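[A minimal sketch of the class of fix Li Xi describes, in userspace C. This is a hypothetical model, not the real lustre_get_emerg_rs(): the struct, the EMERG_MSG_MAX constant, and init_emerg_rs() are invented names illustrating that an emergency buffer must record its real size, since leaving rs_size at zero makes the later rs->rs_size >= rs_size check fail unconditionally.]

```c
#include <assert.h>
#include <stddef.h>

#define EMERG_MSG_MAX 4096      /* assumed payload capacity of a pool buffer */

struct reply_state {
        int rs_size;            /* the bug left this at 0 */
        int rs_prealloc;        /* marks the buffer as pool-owned */
};

/* models handing out a buffer from the emergency pool with
 * its true capacity recorded */
void init_emerg_rs(struct reply_state *rs)
{
        rs->rs_size     = (int)sizeof(*rs) + EMERG_MSG_MAX;
        rs->rs_prealloc = 1;
}
```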
| Comment by Patrick Farrell (Inactive) [ 08/Nov/13 ] |
|
Li, sure. I'll probably be testing this tomorrow. I expect you've noticed this, but you've got a small mistake in the current patch that's causing it not to build on RHEL6: /* Just return failure if the size is too big */ CERROR("size of message is too big (%lu), %d allowed", msglen + sizeof(struct ptlrpc_reply_state), You're printing with %lu, but msglen + sizeof([...]) is an int. I'd just cast to an unsigned long, I suppose: (unsigned long)(msglen + sizeof(struct ptlrpc_reply_state)),
should do it.
 |
| Comment by Patrick Farrell (Inactive) [ 08/Nov/13 ] |
|
Li, Just testing the FTP site... Is there something I've missed here?
|
| Comment by Li Xi (Inactive) [ 08/Nov/13 ] |
|
Hi Patrick, I've just tested the FTP site. It works well. Following are the commands: lixitekiMacBook-Pro:~ lixi$ ftp ftp.whamcloud.com |
| Comment by Patrick Farrell (Inactive) [ 08/Nov/13 ] |
|
Li, You're right - I was using a Windows FTP command line client, and now that I've switched to a Unix one, everything is fine. Thanks! |
| Comment by Patrick Farrell (Inactive) [ 08/Nov/13 ] |
|
Li, Different LBUG, same function: Dump is going up on the WC FTP site momentarily. Dump is in / |
| Comment by Li Xi (Inactive) [ 14/Nov/13 ] |
|
Patrick, thanks for your test. I am sorry that this LBUG happened again. You said that this is a different LBUG, but the trace looks the same to me. Is there anything I missed? Thanks! |
| Comment by Patrick Farrell (Inactive) [ 14/Nov/13 ] |
|
Li, I'm sorry, you're right - I misread my own dump. I'm fairly sure I had the patch in place, but I'm happy to try again; it's always possible I made a mistake in setting things up. I should be able to test that tomorrow. By the way, if you're interested in trying to reproduce this, I'm just running this script from a client to generate activity:

for idx in $(seq 0 10000); do
    time ls -laR > /dev/null
    touch somefile
    rm -f somefile
    echo $idx: $(date +%T) $(grep MemFree /proc/meminfo)
done

Then running this code on the MDS to create memory pressure (you have to hold down enter; it consumes real memory much more slowly than the rate at which it's allocating). Once memory gets low, just keep running this code and you should see the bug within a minute or so:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
        int i;
        char *junk;

        for (;;) {
                for (i = 0; i < 50; i++) {
                        printf("Malloc!\n");
                        junk = malloc(1024UL * 1024 * 1024);
                        if (junk != NULL)
                                junk[0] = i;  /* fault in one page */
                }
                printf("Mallocced 50 GB. Press enter to malloc another 50.\n");
                printf("Note: This seems to use roughly 10 MB of real memory each time.\n");
                getchar();
        }
        return 0;
} |
| Comment by Patrick Farrell (Inactive) [ 14/Nov/13 ] |
|
Li, It looks like you're right - I must have made a mistake in my build/install process before. With your patch, I am no longer able to hit this bug.
|
| Comment by Li Xi (Inactive) [ 15/Nov/13 ] |
|
Patrick, Thank you very much for confirming this! I am glad that we've made some progress finally. Li Xi |
| Comment by Patrick Farrell (Inactive) [ 22/Nov/13 ] |
|
Re-pushed patch to get Maloo to test again. If tests pass, I'm planning to invite Oleg to review. |
| Comment by Oleg Drokin [ 19/Jan/14 ] |
|
Patch landed to master for 2.6.0, and to b2_4 for 2.4.3 (should it ever happen) and b2_5 for 2.5.1. |
| Comment by Andreas Dilger [ 13/Feb/14 ] |
|
Shows mode:0x40 == __GFP_IO, but missing __GFP_WAIT from |
| Comment by Alex Zhuravlev [ 19/Mar/14 ] |
|
http://review.whamcloud.com/#/c/9726/ - osd_thread_info fits 4K |
| Comment by Gerrit Updater [ 03/Mar/15 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/9726/ |