Details
Bug
Resolution: Fixed
Blocker
Lustre 2.10.0
None
3
Description
On mount:
[248586.498989] Lustre: Lustre: Build Version: 2.9.56_82_g0eff453
[248586.559319] LNet: Added LNI 192.168.122.121@tcp [8/256/0/180]
[248586.559412] LNet: Accept secure, port 988
[248586.741706] Lustre: Echo OBD driver; http://www.lustre.org/
[248587.766363] LDISKFS-fs (loop0): mounted filesystem with ordered data mode. Opts: errors=remount-ro
[248588.072575] LDISKFS-fs (loop0): file extents enabled, maximum tree depth=5
[248588.074580] LDISKFS-fs (loop0): mounted filesystem with ordered data mode. Opts: errors=remount-ro
[248588.446477] LDISKFS-fs (loop0): file extents enabled, maximum tree depth=5
[248588.448298] LDISKFS-fs (loop0): mounted filesystem with ordered data mode. Opts: errors=remount-ro
[248589.577659] LDISKFS-fs (loop0): mounted filesystem with ordered data mode. Opts: errors=remount-ro
[248589.597799] LDISKFS-fs (loop0): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc
[248589.620201] Lustre: MGS: Connection restored to 79c760f4-6c2d-891b-1d76-5801427fe973 (at 0@lo)
[248589.641150] LustreError: 12764:0:(sec_bulk.c:188:enc_pools_release_free_pages()) ASSERTION( npages <= page_pools.epp_free_pages ) failed:
[248589.654676] LustreError: 12764:0:(sec_bulk.c:188:enc_pools_release_free_pages()) LBUG
[248589.663682] Pid: 12764, comm: mount.lustre
[248589.663684] Call Trace:
[248589.663709] [<ffffffffa05a47ee>] libcfs_call_trace+0x4e/0x60 [libcfs]
[248589.663716] [<ffffffffa05a487c>] lbug_with_loc+0x4c/0xb0 [libcfs]
[248589.663780] [<ffffffffa092a9b7>] enc_pools_shrink+0x5e7/0x680 [ptlrpc]
[248589.663787] [<ffffffff811942b3>] shrink_slab+0x163/0x330
[248589.663791] [<ffffffff810c1b25>] ? check_preempt_curr+0x85/0xa0
[248589.663794] [<ffffffff810c1b59>] ? ttwu_do_wakeup+0x19/0xd0
[248589.663798] [<ffffffff811975b2>] do_try_to_free_pages+0x3c2/0x4e0
[248589.663801] [<ffffffff811977cc>] try_to_free_pages+0xfc/0x180
[248589.663807] [<ffffffff81682074>] __alloc_pages_slowpath+0x458/0x725
[248589.663810] [<ffffffff8118b155>] __alloc_pages_nodemask+0x405/0x420
[248589.663814] [<ffffffff81683262>] kmalloc_large_node+0x60/0x8d
[248589.663819] [<ffffffff811dd5e7>] __kmalloc_node+0x247/0x2b0
[248589.663856] [<ffffffffa06b0599>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[248589.663897] [<ffffffffa0902c15>] ptlrpc_alloc_rqbd+0xd5/0x580 [ptlrpc]
[248589.663936] [<ffffffffa090319d>] ptlrpc_grow_req_bufs+0xdd/0x280 [ptlrpc]
[248589.663979] [<ffffffffa0903654>] ptlrpc_service_part_init+0x314/0x680 [ptlrpc]
[248589.664037] [<ffffffffa0908907>] ptlrpc_register_service+0x337/0xe60 [ptlrpc]
[248589.664063] [<ffffffffa0f221dc>] mds_start_ptlrpc_service+0x41c/0xbb0 [mdt]
[248589.664077] [<ffffffffa0f22a84>] mds_device_alloc+0x114/0x290 [mdt]
[248589.664106] [<ffffffffa06b9944>] obd_setup+0x114/0x2a0 [obdclass]
[248589.664130] [<ffffffffa06bcdc4>] class_setup+0x2f4/0x8d0 [obdclass]
[248589.664153] [<ffffffffa06c0f92>] class_process_config+0x1d12/0x2b80 [obdclass]
[248589.664176] [<ffffffffa06b0599>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[248589.664199] [<ffffffffa06ca1d9>] do_lcfg+0x159/0x5d0 [obdclass]
[248589.664221] [<ffffffffa06caf98>] lustre_start_simple+0x88/0x210 [obdclass]
[248589.664250] [<ffffffffa06f3542>] server_start_targets+0xac2/0x2a70 [obdclass]
[248589.664291] [<ffffffffa090def8>] ? ptlrpc_pinger_wake_up+0x28/0x30 [ptlrpc]
[248589.664330] [<ffffffffa090e0a7>] ? ptlrpc_pinger_add_import+0x1a7/0x1e0 [ptlrpc]
[248589.664353] [<ffffffffa06b06c1>] ? lprocfs_counter_sub+0xc1/0x130 [obdclass]
[248589.664375] [<ffffffffa06cb32d>] ? lustre_start_mgc+0x20d/0x2780 [obdclass]
[248589.664399] [<ffffffffa06f657d>] server_fill_super+0x108d/0x184a [obdclass]
[248589.664421] [<ffffffffa06ce348>] lustre_fill_super+0x328/0x950 [obdclass]
[248589.664442] [<ffffffffa06ce020>] ? lustre_fill_super+0x0/0x950 [obdclass]
[248589.664447] [<ffffffff81201b1d>] mount_nodev+0x4d/0xb0
[248589.664488] [<ffffffffa06c5af8>] lustre_mount+0x38/0x60 [obdclass]
[248589.664492] [<ffffffff812024c9>] mount_fs+0x39/0x1b0
[248589.664497] [<ffffffff8121e25f>] vfs_kern_mount+0x5f/0xf0
[248589.664500] [<ffffffff812207be>] do_mount+0x24e/0xaa0
[248589.664504] [<ffffffff81185a7e>] ? __get_free_pages+0xe/0x50
[248589.664507] [<ffffffff812210a6>] SyS_mount+0x96/0xf0
[248589.664512] [<ffffffff81696d49>] system_call_fastpath+0x16/0x1b
[248589.664514]
[248589.664516] Kernel panic - not syncing: LBUG
I am suspicious of the following change from LU-3308:
@@ -242,7 +242,7 @@ static unsigned long enc_pools_shrink_count(struct shrinker *s,
 	}
 	LASSERT(page_pools.epp_idle_idx <= IDLE_IDX_MAX);
-	return max((int)page_pools.epp_free_pages - PTLRPC_MAX_BRW_PAGES, 0) *
+	return max(page_pools.epp_free_pages - PTLRPC_MAX_BRW_PAGES, 0UL) *
 		(IDLE_IDX_MAX - page_pools.epp_idle_idx) / IDLE_IDX_MAX;
 }
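The expression page_pools.epp_free_pages - PTLRPC_MAX_BRW_PAGES has type unsigned long, and for any unsigned long val, max(val, 0UL) is always equal to val. So when epp_free_pages is smaller than PTLRPC_MAX_BRW_PAGES, the subtraction wraps around to a very large value instead of being clamped to 0, and the count reported to the shrinker can exceed the number of free pages.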
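
To illustrate the suspected problem outside the kernel, here is a minimal user-space sketch. It is not the Lustre code and not necessarily the fix that was applied: MAX_BRW_PAGES is a hypothetical stand-in for PTLRPC_MAX_BRW_PAGES (the value 256 is chosen only for the example), and unclamped_excess(), clamped_excess() and check_excess() are illustrative names. The first function mirrors the unsigned subtraction followed by a max against 0UL; the second compares before subtracting so the result cannot wrap.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for PTLRPC_MAX_BRW_PAGES; the real value depends on the build. */
#define MAX_BRW_PAGES 256UL

/* Mirrors the suspected pattern: subtract in unsigned long, then take max(val, 0UL). */
static unsigned long unclamped_excess(unsigned long free_pages)
{
	/* Wraps modulo ULONG_MAX + 1 when free_pages < MAX_BRW_PAGES. */
	unsigned long excess = free_pages - MAX_BRW_PAGES;

	/* For an unsigned value this comparison never changes the result. */
	return excess > 0UL ? excess : 0UL;
}

/* One possible way to clamp: compare first, subtract only when it is safe. */
static unsigned long clamped_excess(unsigned long free_pages)
{
	return free_pages > MAX_BRW_PAGES ? free_pages - MAX_BRW_PAGES : 0UL;
}

/* Small self-check showing where the two variants agree and where they differ. */
static void check_excess(void)
{
	/* Enough free pages: both variants agree. */
	assert(unclamped_excess(300UL) == 44UL);
	assert(clamped_excess(300UL) == 44UL);

	/* Fewer free pages than MAX_BRW_PAGES: only the clamped variant yields 0. */
	assert(clamped_excess(10UL) == 0UL);
	assert(unclamped_excess(10UL) > 10UL);
}
```

Calling check_excess() from a small test program exercises both cases. With 10 free pages the unclamped variant returns a value far larger than the number of pages available, which is consistent with the ASSERTION( npages <= page_pools.epp_free_pages ) failure in the trace above.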