Details
- Bug
- Resolution: Duplicate
- Blocker
- None
- Lustre 2.3.0
- None
- LLNL Hyperion, running the OST failover portion of recovery-scale. The failure occurs close to the start of the test.
- 3
- 6015
Description
Client LBUGs. This appears to be the same issue as ORI-726, but I am reporting it separately just in case, since the ORI ticket is a PPC issue.
------------
Jul 25 17:27:40 ehyperion790 kernel: Lustre: DEBUG MARKER: == recovery-mds-scale test failover_ost: failover OST == 17:27:36 (1343262456)
Jul 25 17:27:43 ehyperion790 kernel: LustreError: 13860:0:(cl_lock.c:1949:discard_cb()) ASSERTION( (!(page->cp_type == CPT_CACHEABLE) || (!PageDirty(cl_page_vmpage(env, page)))) ) failed:
Jul 25 17:27:43 ehyperion790 kernel: LustreError: 13860:0:(cl_lock.c:1949:discard_cb()) ASSERTION( (!(page->cp_type == CPT_CACHEABLE) || (!PageDirty(cl_page_vmpage(env, page)))) ) failed:
Jul 25 17:27:48 ehyperion790 kernel: LustreError: 13860:0:(cl_lock.c:1949:discard_cb()) LBUG
Jul 25 17:27:48 ehyperion790 kernel: LustreError: 13860:0:(cl_lock.c:1949:discard_cb()) LBUG
Jul 25 17:27:48 ehyperion790 kernel: Call Trace:
Jul 25 17:27:51 ehyperion790 kernel: [<ffffffffa034c905>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
Jul 25 17:27:54 ehyperion790 kernel: [<ffffffffa034cf17>] lbug_with_loc+0x47/0xb0 [libcfs]
Jul 25 17:27:55 ehyperion790 kernel: [<ffffffffa0519148>] discard_cb+0x138/0x1f0 [obdclass]
Jul 25 17:27:59 ehyperion790 kernel: [<ffffffffa0516404>] cl_page_gang_lookup+0x1f4/0x400 [obdclass]
Jul 25 17:28:02 ehyperion790 kernel: [<ffffffffa0519010>] ? discard_cb+0x0/0x1f0 [obdclass]
Jul 25 17:28:02 ehyperion790 kernel: [<ffffffffa0519010>] ? discard_cb+0x0/0x1f0 [obdclass]
Jul 25 17:28:06 ehyperion790 kernel: [<ffffffffa0518ece>] cl_lock_discard_pages+0x11e/0x1f0 [obdclass]
Jul 25 17:28:06 ehyperion790 kernel: [<ffffffffa0832430>] osc_lock_flush+0x110/0x200 [osc]
Jul 25 17:28:10 ehyperion790 kernel: [<ffffffffa0832579>] osc_lock_cancel+0x59/0x1a0 [osc]
Jul 25 17:28:12 ehyperion790 kernel: [<ffffffffa0516c45>] cl_lock_cancel0+0x75/0x160 [obdclass]
Jul 25 17:28:14 ehyperion790 kernel: [<ffffffffa05178ab>] cl_lock_cancel+0x13b/0x140 [obdclass]
Jul 25 17:28:17 ehyperion790 kernel: [<ffffffffa083384a>] osc_ldlm_blocking_ast+0x13a/0x380 [osc]
Jul 25 17:28:19 ehyperion790 kernel: [<ffffffffa062f480>] ldlm_cancel_callback+0x60/0x100 [ptlrpc]
Jul 25 17:28:23 ehyperion790 kernel: [<ffffffffa063dc1b>] ldlm_cli_cancel_local+0x7b/0x380 [ptlrpc]
Jul 25 17:28:25 ehyperion790 kernel: [<ffffffffa06404ff>] ldlm_cli_cancel_list_local+0xef/0x1f0 [ptlrpc]
Jul 25 17:28:26 ehyperion790 kernel: [<ffffffffa064078a>] ldlm_cancel_resource_local+0x18a/0x2a0 [ptlrpc]
Jul 25 17:28:30 ehyperion790 kernel: [<ffffffffa081a5cb>] osc_destroy+0x12b/0x760 [osc]
Jul 25 17:28:30 ehyperion790 kernel: [<ffffffffa08bf63d>] ? lov_set_add_req+0x2d/0x50 [lov]
Jul 25 17:28:35 ehyperion790 kernel: [<ffffffffa08b0235>] lov_destroy+0x375/0xc40 [lov]
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffff8115dd5c>] ? transfer_objects+0x5c/0x80
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffffa0691396>] ? __req_capsule_get+0x176/0x750 [ptlrpc]
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffffa09a8e8a>] ll_objects_destroy+0x53a/0x1810 [lustre]
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffffa0978811>] ll_close_inode_openhandle+0x351/0x1050 [lustre]
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffffa09796ba>] ll_md_real_close+0x1aa/0x220 [lustre]
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffffa097998b>] ll_md_close+0x25b/0x760 [lustre]
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffff814eec86>] ? down_read+0x16/0x30
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffffa0979fab>] ll_file_release+0x11b/0x3e0 [lustre]
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffff811781c5>] __fput+0xf5/0x210
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffff81178305>] fput+0x25/0x30
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffff81173d4d>] filp_close+0x5d/0x90
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffff8106c7bf>] put_files_struct+0x7f/0xf0
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffff8106c883>] exit_files+0x53/0x70
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffff8106e8f5>] do_exit+0x185/0x870
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffff8106f038>] do_group_exit+0x58/0xd0
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffff8106f0c7>] sys_exit_group+0x17/0x20
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
Jul 25 17:28:36 ehyperion790 kernel:
Jul 25 17:28:36 ehyperion790 kernel: Kernel panic - not syncing: LBUG
-----------
Jul 25 17:28:36 ehyperion790 kernel: Kernel panic - not syncing: LBUG
Jul 25 17:28:36 ehyperion790 kernel: Call Trace:
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffff814ec98a>] ? panic+0x78/0x143
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffffa034cf6b>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffffa0519148>] ? discard_cb+0x138/0x1f0 [obdclass]
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffffa0516404>] ? cl_page_gang_lookup+0x1f4/0x400 [obdclass]
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffffa0519010>] ? discard_cb+0x0/0x1f0 [obdclass]
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffffa0519010>] ? discard_cb+0x0/0x1f0 [obdclass]
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffffa0518ece>] ? cl_lock_discard_pages+0x11e/0x1f0 [obdclass]
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffffa0832430>] ? osc_lock_flush+0x110/0x200 [osc]
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffffa0832579>] ? osc_lock_cancel+0x59/0x1a0 [osc]
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffffa0516c45>] ? cl_lock_cancel0+0x75/0x160 [obdclass]
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffffa05178ab>] ? cl_lock_cancel+0x13b/0x140 [obdclass]
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffffa083384a>] ? osc_ldlm_blocking_ast+0x13a/0x380 [osc]
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffffa062f480>] ? ldlm_cancel_callback+0x60/0x100 [ptlrpc]
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffffa063dc1b>] ? ldlm_cli_cancel_local+0x7b/0x380 [ptlrpc]
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffffa06404ff>] ? ldlm_cli_cancel_list_local+0xef/0x1f0 [ptlrpc]
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffffa064078a>] ? ldlm_cancel_resource_local+0x18a/0x2a0 [ptlrpc]
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffffa081a5cb>] ? osc_destroy+0x12b/0x760 [osc]
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffffa08bf63d>] ? lov_set_add_req+0x2d/0x50 [lov]
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffffa08b0235>] ? lov_destroy+0x375/0xc40 [lov]
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffff8115dd5c>] ? transfer_objects+0x5c/0x80
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffffa0691396>] ? __req_capsule_get+0x176/0x750 [ptlrpc]
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffffa09a8e8a>] ? ll_objects_destroy+0x53a/0x1810 [lustre]
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffffa0978811>] ? ll_close_inode_openhandle+0x351/0x1050 [lustre]
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffffa09796ba>] ? ll_md_real_close+0x1aa/0x220 [lustre]
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffffa097998b>] ? ll_md_close+0x25b/0x760 [lustre]
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffff814eec86>] ? down_read+0x16/0x30
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffffa0979fab>] ? ll_file_release+0x11b/0x3e0 [lustre]
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffff811781c5>] ? __fput+0xf5/0x210
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffff81178305>] ? fput+0x25/0x30
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffff81173d4d>] ? filp_close+0x5d/0x90
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffff8106c7bf>] ? put_files_struct+0x7f/0xf0
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffff8106c883>] ? exit_files+0x53/0x70
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffff8106e8f5>] ? do_exit+0x185/0x870
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffff8106f038>] ? do_group_exit+0x58/0xd0
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffff8106f0c7>] ? sys_exit_group+0x17/0x20
Jul 25 17:28:36 ehyperion790 kernel: [<ffffffff8100b0f2>] ? system_call_fastpath+0x16/0x1b
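For readers triaging the trace: the failed assertion in discard_cb() has the form `!A || !B`, which reads as an implication (if the page is cacheable, it must not be dirty at discard time). The sketch below only illustrates that logic. The enum, struct, and function names are hypothetical stand-ins, not Lustre's actual definitions, and it says nothing about why the page was dirty in this run.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; not Lustre's real types. */
enum sketch_page_type { SKETCH_CACHEABLE, SKETCH_TRANSIENT };

struct sketch_page {
    enum sketch_page_type type;
    bool dirty; /* stands in for PageDirty() on the backing VM page */
};

/*
 * Same shape as the assertion in the log:
 *   !(type == cacheable) || !dirty
 * i.e. "cacheable implies not dirty". Non-cacheable pages always pass.
 */
static inline bool discard_invariant_holds(const struct sketch_page *page)
{
    return !(page->type == SKETCH_CACHEABLE) || !page->dirty;
}

/* Assertion-style wrapper: aborts when the invariant is violated. */
static inline void discard_check(const struct sketch_page *page)
{
    assert(discard_invariant_holds(page));
}
```

Under this model, the only combination that trips the check is a cacheable page that is still dirty, which is the state the client reported when it hit the LBUG.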
Issue Links
- duplicates LU-1442 File corrupt with 1MiB-aligned 4k regions of zeros (Closed)