[LU-7637] recovery-small test_131 test failed: [ 592.536615] LustreError: 9581:0:(ofd_dev.c:2276:ofd_prolong_extent_locks()) ASSERTION( lock->l_export == exp ) failed: LBUG Created: 07/Jan/16  Updated: 25/May/16  Resolved: 25/May/16

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.8.0
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: parinay v kondekar (Inactive) Assignee: WC Triage
Resolution: Duplicate Votes: 0
Labels: None
Environment:

Interop: Server 2.7.64 <-> Client 2.5.x
4-node setup: 1 MDS / 1 OSS / 2 clients


Attachments: File 131.lctl.tgz     Text File vmcore-dmesg.txt    
Issue Links:
Duplicate
is duplicated by LU-7702 ASSERTION( lock->l_export == opd->opd... Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

The following crash is seen on the OSS node.

[  588.472315] Lustre: DEBUG MARKER: == recovery-small test 131: IO vs evict results to IO under staled lock == 02:14:53 (1451960093)
[  589.841214] Lustre: 17281:0:(genops.c:1526:obd_export_evict_by_uuid()) lustre-OST0000: evicting 3294d2b3-5a9b-38d2-8dde-402c94cdc7aa at adminstrative request
[  589.852964] LustreError: 9568:0:(fail.c:133:__cfs_fail_timeout_set()) cfs_fail_timeout id 31e sleeping for 4000ms
[  589.876372] Lustre: lustre-OST0000: Connection restored to 3294d2b3-5a9b-38d2-8dde-402c94cdc7aa (at 192.168.112.11@tcp)
[  592.536615] LustreError: 9581:0:(ofd_dev.c:2276:ofd_prolong_extent_locks()) ASSERTION( lock->l_export == exp ) failed:
[  592.546230] LustreError: 9581:0:(ofd_dev.c:2276:ofd_prolong_extent_locks()) LBUG
[  592.549732] Pid: 9581, comm: ll_ost_io00_002
[  592.552399]
Call Trace:
[  592.556486]  [<ffffffffa04827d3>] libcfs_debug_dumpstack+0x53/0x80 [libcfs]
[  592.559786]  [<ffffffffa0482d75>] lbug_with_loc+0x45/0xc0 [libcfs]
[  592.562853]  [<ffffffffa0b61575>] ofd_prolong_extent_locks+0x395/0x3a0 [ofd]
[  592.566323]  [<ffffffffa0806420>] ? lustre_swab_niobuf_remote+0x0/0x30 [ptlrpc]
[  592.569584]  [<ffffffffa0b61944>] ofd_rw_hpreq_check+0xd4/0x340 [ofd]
[  592.572612]  [<ffffffffa08157bf>] ptlrpc_main+0x17bf/0x1e90 [ptlrpc]
[  592.575732]  [<ffffffffa0814000>] ? ptlrpc_main+0x0/0x1e90 [ptlrpc]
[  592.578826]  [<ffffffff8109727f>] kthread+0xcf/0xe0
[  592.581423]  [<ffffffff810971b0>] ? kthread+0x0/0xe0
[  592.584417]  [<ffffffff81614598>] ret_from_fork+0x58/0x90
[  592.586647]  [<ffffffff810971b0>] ? kthread+0x0/0xe0
[  592.589421]
[  592.591377] Kernel panic - not syncing: LBUG
[  592.592298] CPU: 1 PID: 9581 Comm: ll_ost_io00_002 Tainted: GF          O--------------   3.10.0-229.20.1.el7_lustremaster_master__81.x86_64 #1
[  592.592298] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
[  592.592298]  ffffffffa049feaf 00000000fb6158ba ffff8800bc92fc70 ffffffff816047e6
[  592.592298]  ffff8800bc92fcf0 ffffffff815fe08a ffffffff00000008 ffff8800bc92fd00
[  592.592298]  ffff8800bc92fca0 00000000fb6158ba ffffffffa0b8b340 0000000000000246
[  592.592298] Call Trace:
[  592.592298]  [<ffffffff816047e6>] dump_stack+0x19/0x1b
[  592.592298]  [<ffffffff815fe08a>] panic+0xd8/0x1e7
[  592.592298]  [<ffffffffa0482ddb>] lbug_with_loc+0xab/0xc0 [libcfs]
[  592.592298]  [<ffffffffa0b61575>] ofd_prolong_extent_locks+0x395/0x3a0 [ofd]
[  592.592298]  [<ffffffffa0806420>] ? lustre_swab_obd_ioobj+0x30/0x30 [ptlrpc]
[  592.592298]  [<ffffffffa0b61944>] ofd_rw_hpreq_check+0xd4/0x340 [ofd]
[  592.592298]  [<ffffffffa08157bf>] ptlrpc_main+0x17bf/0x1e90 [ptlrpc]
[  592.592298]  [<ffffffffa0814000>] ? ptlrpc_register_service+0xfc0/0xfc0 [ptlrpc]
[  592.592298]  [<ffffffff8109727f>] kthread+0xcf/0xe0
[  592.592298]  [<ffffffff810971b0>] ? kthread_create_on_node+0x140/0x140
[  592.592298]  [<ffffffff81614598>] ret_from_fork+0x58/0x90
[  592.592298]  [<ffffffff810971b0>] ? kthread_create_on_node+0x140/0x140

Master already contains the LU-5522 change (CommitDate: Wed Sep 24 01:28:39 2014 +0000), yet the assertion still triggers.
Similar issues: LU-5522, LU-6236.



 Comments   
Comment by Noopur Maheshwari (Inactive) [ 05/Mar/16 ]

The patch for LU-7702 solves the issue - http://review.whamcloud.com/#/c/18120/

Comment by Andreas Dilger [ 25/May/16 ]

There is no recovery-small test_131 in master, but it appears that this test and the fix for it are in patch http://review.whamcloud.com/18120 "LU-7702 ldlm: skip lock if export failed".
