Details
Type: Bug
Resolution: Won't Fix
Priority: Minor
Affects Version: Lustre 2.12.5
Severity: 3
Description
This filesystem has two servers, and both hit an LBUG in tgt_grant_sanity_check().
First server:
[4308300.501701] LustreError: 17867:0:(tgt_grant.c:463:tgt_grant_space_left()) nbp12-OST0000: cli 55a893a2-37fc-a91e-0edd-c13e39c0bcd7/ffff93b5024b4000 left 51980938248192 < tot_grant 4611686279107159457 unstable 85434368 pending 90030080 dirty 390602752
[4308300.568648] LustreError: 17867:0:(tgt_grant.c:463:tgt_grant_space_left()) Skipped 69 previous similar messages
[4308305.988484] LustreError: 80109:0:(tgt_grant.c:463:tgt_grant_space_left()) nbp12-OST0000: cli ba1ae12e-b75a-7684-bce5-34097cf4072c/ffff93b488caa400 left 51980913885184 < tot_grant 4611686277031540129 unstable 0 pending 0 dirty 344813568
[4308306.051765] LustreError: 80109:0:(tgt_grant.c:463:tgt_grant_space_left()) Skipped 44 previous similar messages
[4308314.057174] LustreError: 86494:0:(tgt_grant.c:463:tgt_grant_space_left()) nbp12-OST0000: cli b5875278-4309-10d6-2c2d-557864167838/ffff93a576975c00 left 51980868866048 < tot_grant 4611686246882669986 unstable 27496448 pending 31715328 dirty 373313536
[4308314.124119] LustreError: 86494:0:(tgt_grant.c:463:tgt_grant_space_left()) Skipped 38 previous similar messages
[4308330.021959] LustreError: 17949:0:(tgt_grant.c:463:tgt_grant_space_left()) nbp12-OST0000: cli fcefe8b5-bf4c-3d12-56f9-fb1b93134dfc/ffff939c3f41a800 left 51980592369664 < tot_grant 4611686224816031140 unstable 25616384 pending 25669632 dirty 388505600
[4308330.088908] LustreError: 17949:0:(tgt_grant.c:463:tgt_grant_space_left()) Skipped 183 previous similar messages
[4308337.687240] LustreError: 19209:0:(tgt_grant.c:151:tgt_check_export_grants()) nbp12-OST0000: cli 5fc11b40-37df-3838-564b-bfebf0abf25a/ffff93ad4aeb9000 ted_grant(4611686037896666112) + ted_pending(0) > maxsize(77690215612416)
[4308337.747390] LustreError: 19209:0:(tgt_grant.c:223:tgt_grant_sanity_check()) LBUG
[4308337.749201] LustreError: 25433:0:(tgt_grant.c:223:tgt_grant_sanity_check()) LBUG
[4308337.749203] Pid: 25433, comm: kworker/11:0 3.10.0-1127.19.1.el7_lustre2125.x86_64 #1 SMP Mon Nov 2 14:50:02 PST 2020
[4308337.749203] Call Trace:
[4308337.749221] [<ffffffffc09f67cc>] libcfs_call_trace+0x8c/0xc0 [libcfs]
[4308337.749225] [<ffffffffc09f687c>] lbug_with_loc+0x4c/0xa0 [libcfs]
[4308337.749275] [<ffffffffc135aa20>] tgt_grant_sanity_check+0x520/0x560 [ptlrpc]
[4308337.749281] [<ffffffffc18dccd8>] ofd_destroy_export+0x88/0x110 [ofd]
[4308337.749304] [<ffffffffc0e2459e>] class_export_destroy+0xee/0x490 [obdclass]
[4308337.749316] [<ffffffffc0e24955>] obd_zombie_exp_cull+0x15/0x20 [obdclass]
[4308337.749319] [<ffffffffa28be6bf>] process_one_work+0x17f/0x440
[4308337.749321] [<ffffffffa28bf7d6>] worker_thread+0x126/0x3c0
[4308337.749323] [<ffffffffa28c6691>] kthread+0xd1/0xe0
[4308337.749325] [<ffffffffa2f92d1d>] ret_from_fork_nospec_begin+0x7/0x21
[4308337.749340] [<ffffffffffffffff>] 0xffffffffffffffff
[4308337.749341] Kernel panic - not syncing: LBUG
[4308337.749343] CPU: 11 PID: 25433 Comm: kworker/11:0 Kdump: loaded Tainted: G OE ------------ 3.10.0-1127.19.1.el7_lustre2125.x86_64 #1
[4308337.749343] Hardware name: HPE ProLiant DL380 Gen10/ProLiant DL380 Gen10, BIOS U30 06/15/2018
[4308337.749356] Workqueue: obd_zombid obd_zombie_exp_cull [obdclass]
[4308337.749356] Call Trace:
[4308337.749360] [<ffffffffa2f7ffa5>] dump_stack+0x19/0x1b
[4308337.749362] [<ffffffffa2f79541>] panic+0xe8/0x21f
[4308337.749367] [<ffffffffc09f68cb>] lbug_with_loc+0x9b/0xa0 [libcfs]
[4308337.749399] [<ffffffffc135aa20>] tgt_grant_sanity_check+0x520/0x560 [ptlrpc]
[4308337.749403] [<ffffffffc18dccd8>] ofd_destroy_export+0x88/0x110 [ofd]
[4308337.749416] [<ffffffffc0e2459e>] class_export_destroy+0xee/0x490 [obdclass]
[4308337.749428] [<ffffffffc0e24955>] obd_zombie_exp_cull+0x15/0x20 [obdclass]
[4308337.749430] [<ffffffffa28be6bf>] process_one_work+0x17f/0x440
[4308337.749431] [<ffffffffa28bf7d6>] worker_thread+0x126/0x3c0
[4308337.749433] [<ffffffffa28bf6b0>] ? manage_workers.isra.26+0x2a0/0x2a0
[4308337.749434] [<ffffffffa28c6691>] kthread+0xd1/0xe0
[4308337.749436] [<ffffffffa28c65c0>] ? insert_kthread_work+0x40/0x40
[4308337.749437] [<ffffffffa2f92d1d>] ret_from_fork_nospec_begin+0x7/0x21
[4308337.749438] [<ffffffffa28c65c0>] ? insert_kthread_work+0x40/0x40
Second server:
[3879290.271375] LustreError: 16716:0:(tgt_grant.c:463:tgt_grant_space_left()) nbp12-OST000b: cli ad8e8405-8dcc-dcec-4983-af2b3509ed40/ffff9a21dd688400 left 51431169011712 < tot_grant 4611686156105684284 unstable 0 pending 0 dirty 55066624
[3879290.334406] LustreError: 16716:0:(tgt_grant.c:463:tgt_grant_space_left()) Skipped 6 previous similar messages
[3879348.214404] LustreError: 87087:0:(tgt_grant.c:151:tgt_check_export_grants()) nbp12-OST000b: cli 5fc11b40-37df-3838-564b-bfebf0abf25a/ffff9a32405cac00 ted_grant(4611686037969102848) + ted_pending(0) > maxsize(77690215612416)
[3879348.274549] LustreError: 87087:0:(tgt_grant.c:223:tgt_grant_sanity_check()) LBUG
[3879348.297309] Pid: 87087, comm: kworker/17:3 3.10.0-1127.19.1.el7_lustre2125.x86_64 #1 SMP Mon Nov 2 14:50:02 PST 2020
[3879348.297314] Call Trace:
[3879348.297326] [<ffffffffc0b447cc>] libcfs_call_trace+0x8c/0xc0 [libcfs]
[3879348.302057]
[3879348.302061] [<ffffffffc0b4487c>] lbug_with_loc+0x4c/0xa0 [libcfs]
[3879348.302061]
[3879348.302107] [<ffffffffc15d6a20>] tgt_grant_sanity_check+0x520/0x560 [ptlrpc]
[3879348.302113] [<ffffffffc125ecd8>] ofd_destroy_export+0x88/0x110 [ofd]
[3879348.302133] [<ffffffffc0f8f59e>] class_export_destroy+0xee/0x490 [obdclass]
[3879348.302145] [<ffffffffc0f8f955>] obd_zombie_exp_cull+0x15/0x20 [obdclass]
[3879348.302149] [<ffffffffaccbe6bf>] process_one_work+0x17f/0x440
[3879348.302150] [<ffffffffaccbf7d6>] worker_thread+0x126/0x3c0
[3879348.302152] [<ffffffffaccc6691>] kthread+0xd1/0xe0
[3879348.302154] [<ffffffffad392d1d>] ret_from_fork_nospec_begin+0x7/0x21
[3879348.302169] [<ffffffffffffffff>] 0xffffffffffffffff
[3879348.302170] Kernel panic - not syncing: LBUG
[3879348.302172] CPU: 17 PID: 87087 Comm: kworker/17:3 Kdump: loaded Tainted: G OE ------------ 3.10.0-1127.19.1.el7_lustre2125.x86_64 #1
[3879348.302172] Hardware name: HPE ProLiant DL380 Gen10/ProLiant DL380 Gen10, BIOS U30 06/15/2018
[3879348.302185] Workqueue: obd_zombid obd_zombie_exp_cull [obdclass]
[3879348.302185] Call Trace:
[3879348.302188] [<ffffffffad37ffa5>] dump_stack+0x19/0x1b
[3879348.302190] [<ffffffffad379541>] panic+0xe8/0x21f
[3879348.302196] [<ffffffffc0b448cb>] lbug_with_loc+0x9b/0xa0 [libcfs]
[3879348.302227] [<ffffffffc15d6a20>] tgt_grant_sanity_check+0x520/0x560 [ptlrpc]
[3879348.302231] [<ffffffffc125ecd8>] ofd_destroy_export+0x88/0x110 [ofd]
[3879348.302243] [<ffffffffc0f8f59e>] class_export_destroy+0xee/0x490 [obdclass]
[3879348.302255] [<ffffffffc0f8f955>] obd_zombie_exp_cull+0x15/0x20 [obdclass]
[3879348.302257] [<ffffffffaccbe6bf>] process_one_work+0x17f/0x440
[3879348.302258] [<ffffffffaccbf7d6>] worker_thread+0x126/0x3c0
[3879348.302260] [<ffffffffaccbf6b0>] ? manage_workers.isra.26+0x2a0/0x2a0
[3879348.302261] [<ffffffffaccc6691>] kthread+0xd1/0xe0
[3879348.302263] [<ffffffffaccc65c0>] ? insert_kthread_work+0x40/0x40
[3879348.302264] [<ffffffffad392d1d>] ret_from_fork_nospec_begin+0x7/0x21
[3879348.302265] [<ffffffffaccc65c0>] ? insert_kthread_work+0x40/0x40
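For triage, the tgt_check_export_grants() messages on both servers can be checked with a short script. The sketch below is not Lustre code: the helper name check_grant_line and the regular expression are hypothetical, written only against the message format quoted in this report. It re-evaluates the logged comparison and also reports how far ted_grant sits above 2^62 (4611686018427387904), since both logged values are just above that power of two. That is an observation about the numbers, not a diagnosis of the root cause.

```python
import re

# Hypothetical helper, written against the tgt_check_export_grants()
# message format quoted in this report (not taken from Lustre itself).
GRANT_RE = re.compile(
    r"ted_grant\((\d+)\) \+ ted_pending\((\d+)\) > maxsize\((\d+)\)"
)


def check_grant_line(line):
    """Parse one log line and re-evaluate the logged grant comparison."""
    m = GRANT_RE.search(line)
    if m is None:
        return None
    ted_grant, ted_pending, maxsize = (int(g) for g in m.groups())
    return {
        "ted_grant": ted_grant,
        "ted_pending": ted_pending,
        "maxsize": maxsize,
        # The condition reported in the log message.
        "exceeds_maxsize": ted_grant + ted_pending > maxsize,
        # Distance of ted_grant above 2**62 (an observation only).
        "above_2_pow_62": ted_grant - 2**62,
    }


first = check_grant_line(
    "nbp12-OST0000: cli 5fc11b40-37df-3838-564b-bfebf0abf25a/ffff93ad4aeb9000 "
    "ted_grant(4611686037896666112) + ted_pending(0) > maxsize(77690215612416)"
)
second = check_grant_line(
    "nbp12-OST000b: cli 5fc11b40-37df-3838-564b-bfebf0abf25a/ffff9a32405cac00 "
    "ted_grant(4611686037969102848) + ted_pending(0) > maxsize(77690215612416)"
)
print(first["above_2_pow_62"], second["above_2_pow_62"])
```

Both messages name the same client UUID (5fc11b40-37df-3838-564b-bfebf0abf25a), so feeding the matching lines from each OSS through the same check makes it easy to compare the two values side by side.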