[LU-3061] (cl_lock.c:1964:discard_cb()) ASSERTION( (!(page->cp_type == CPT_CACHEABLE) || (!PageDirty(cl_page_vmpage(env, page)))) ) failed Created: 29/Mar/13  Updated: 04/Apr/13  Resolved: 04/Apr/13

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.0
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Wally Wang (Inactive) Assignee: WC Triage
Resolution: Not a Bug Votes: 0
Labels: None
Environment:

Cray XT system, SLES11 SP1, SP2, Lustre client 2.3, 2.4


Severity: 2
Rank (Obsolete): 7460

 Description   

This assertion was hit while running Cray stress tests on a compute client running Lustre tag 2.3.63 on SLES11 SP1 or SP2:

LustreError: 3661:0:(cl_lock.c:1964:discard_cb()) ASSERTION( (!(page->cp_type == CPT_CACHEABLE) || (!PageDirty(cl_page_vmpage(env, page)))) ) failed:
LustreError: 3661:0:(cl_lock.c:1964:discard_cb()) LBUG
Pid: 3661, comm: ldlm_bl_00
Call Trace:
[<ffffffff81007e59>] try_stack_unwind+0x1a9/0x200
[<ffffffff81006625>] dump_trace+0x95/0x300
[<ffffffffa01698d7>] libcfs_debug_dumpstack+0x57/0x80 [libcfs]
[<ffffffffa0169e27>] lbug_with_loc+0x47/0xb0 [libcfs]
[<ffffffffa02d48f0>] discard_cb+0x160/0x1d0 [obdclass]
[<ffffffffa02d1c2f>] cl_page_gang_lookup+0x1cf/0x3c0 [obdclass]
[<ffffffffa02d4652>] cl_lock_discard_pages+0x112/0x1f0 [obdclass]
[<ffffffffa0695415>] osc_lock_flush+0xf5/0x260 [osc]
[<ffffffffa0695661>] osc_lock_cancel+0xe1/0x1c0 [osc]
[<ffffffffa02d246d>] cl_lock_cancel0+0x6d/0x160 [obdclass]
[<ffffffffa02d31ab>] cl_lock_cancel+0x13b/0x140 [obdclass]
[<ffffffffa06969cc>] osc_ldlm_blocking_ast+0x20c/0x330 [osc]
[<ffffffffa03dea1b>] ldlm_cancel_callback+0x6b/0x190 [ptlrpc]
[<ffffffffa03eca8a>] ldlm_cli_cancel_local+0x8a/0x470 [ptlrpc]
[<ffffffffa03efd0c>] ldlm_cli_cancel_list_local+0xec/0x280 [ptlrpc]
[<ffffffffa03f504c>] ldlm_bl_thread_main+0x10c/0x420 [ptlrpc]
[<ffffffff81003efa>] child_rip+0xa/0x20
Kernel panic - not syncing: LBUG
Pid: 3661, comm: ldlm_bl_00 Tainted: P 2.6.32.59-0.7.1_1.0000.6993-cray_gem_c #1

The symptom is similar to LU-1442/LU-1680, but it still seems to be occurring.



 Comments   
Comment by Cory Spitz [ 04/Apr/13 ]

Cray now believes that this bug is ours and that this ticket can be closed. Our high-speed-network driver was leaving read-only pages marked as dirty, thus tripping the assertion. Sorry for the report for a non-Lustre issue.
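
For readers following the failure mode, the assertion text in the log has the form `!(cacheable) || !(dirty)`: a cacheable page must not be dirty by the time it is discarded. The sketch below is a minimal user-space model of that predicate only. The type and function names (`model_page`, `discard_invariant_holds`, `model_discard`) are hypothetical stand-ins and are not the Lustre or kernel definitions; the real check uses `page->cp_type`, `CPT_CACHEABLE`, and `PageDirty(cl_page_vmpage(env, page))` as shown in the log.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel types; not the real Lustre structs. */
enum model_page_type { MODEL_CACHEABLE, MODEL_TRANSIENT };

struct model_page {
    enum model_page_type type; /* models page->cp_type */
    bool dirty;                /* models PageDirty() on the backing VM page */
};

/* Same logical shape as the failed assertion: !(cacheable) || !(dirty).
 * Read as an implication: if the page is cacheable, it must not be dirty. */
static inline bool discard_invariant_holds(const struct model_page *page)
{
    return !(page->type == MODEL_CACHEABLE) || !page->dirty;
}

/* Models the discard step: the invariant is checked before the page is
 * dropped. In this model a violation aborts the program via assert();
 * in the log above the corresponding failure is an LBUG. */
static inline void model_discard(const struct model_page *page)
{
    assert(discard_invariant_holds(page));
}
```

In this model, a cacheable page whose dirty flag was left set by some other component (as described above for the high-speed-network driver) is the one case that fails the check; a clean cacheable page and any non-cacheable page pass.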

Comment by Peter Jones [ 04/Apr/13 ]

ok thanks Cory!
