[LU-6183] replay-dual test_16: test_16 failed with 2 Created: 30/Jan/15 Updated: 11/Aug/15 Resolved: 11/Aug/15 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.7.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Maloo | Assignee: | WC Triage |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None | ||
| Environment: |
client and server: lustre-master build # 2835 RHEL6 |
||
| Issue Links: |
|
||||||||
| Severity: | 3 | ||||||||
| Rank (Obsolete): | 17296 | ||||||||
| Description |
|
This issue was created by maloo for sarah <sarah@whamcloud.com>

This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/36384a0a-a7aa-11e4-93dd-5254006e85c2.

The sub-test test_16 failed with the following error:
test_16 failed with 2

MDT
20:03:12:LDISKFS-fs (dm-0): recovery complete
20:03:12:LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. quota=on. Opts:
20:03:12:Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u
20:03:12:Lustre: DEBUG MARKER: e2label /dev/lvm-Role_MDS/P1 2>/dev/null
20:03:12:Lustre: lustre-MDT0000: Will be in recovery for at least 1:00, or until 4 clients reconnect
20:03:12:Lustre: Skipped 9 previous similar messages
20:03:12:Lustre: DEBUG MARKER: grep -c /mnt/mds1' ' /proc/mounts
20:03:12:Lustre: DEBUG MARKER: umount -d /mnt/mds1
20:03:12:LustreError: 26830:0:(ldlm_resource.c:777:ldlm_resource_complain()) lustre-MDT0000-lwp-MDT0000: namespace resource [0x200000006:0x1010000:0x0].0 (ffff88005c382340) refcount nonzero (1) after lock cleanup; forcing cleanup.
20:03:12:LustreError: 26830:0:(ldlm_resource.c:777:ldlm_resource_complain()) Skipped 1 previous similar message
20:03:12:LustreError: 26830:0:(ldlm_resource.c:1374:ldlm_resource_dump()) --- Resource: [0x200000006:0x1010000:0x0].0 (ffff88005c382340) refcount = 2
20:03:12:LustreError: 26830:0:(ldlm_resource.c:1377:ldlm_resource_dump()) Granted locks (in reverse order):
20:03:12:LustreError: 26830:0:(ldlm_resource.c:1380:ldlm_resource_dump()) ### ### ns: lustre-MDT0000-lwp-MDT0000 lock: ffff88007a7bd940/0x359e3bb9e884474 lrc: 2/1,0 mode: CR/CR res: [0x200000006:0x1010000:0x0].0 rrc: 2 type: PLN flags: 0x1106400000000 nid: local remote: 0x359e3bb9e8844ac expref: -99 pid: 26538 timeout: 0 lvb_type: 2
20:03:12:LustreError: 26830:0:(ldlm_resource.c:1380:ldlm_resource_dump()) Skipped 1 previous similar message
20:03:12:LustreError: 26830:0:(ldlm_resource.c:1374:ldlm_resource_dump()) --- Resource: [0x200000006:0x10000:0x0].0 (ffff88006b050300) refcount = 2
20:03:13:LustreError: 26830:0:(ldlm_resource.c:1377:ldlm_resource_dump()) Granted locks (in reverse order):
20:03:13:LustreError: 26830:0:(ldlm_lib.c:2106:target_stop_recovery_thread()) lustre-MDT0000: Aborting recovery
20:03:13:Lustre: 26540:0:(ldlm_lib.c:1773:target_recovery_overseer()) recovery is aborted, evict exports in recovery
20:03:13:Lustre: 26540:0:(ldlm_lib.c:1773:target_recovery_overseer()) Skipped 2 previous similar messages
20:03:13:Lustre: 26540:0:(ldlm_lib.c:1415:abort_req_replay_queue()) @@@ aborted: req@ffff88006883d680 x1491563361930380/t0(528280977442) o101->7aa76843-6c40-e2de-f30c-243abe33bee1@10.2.4.157@tcp:381/0 lens 592/0 e 0 to 0 dl 1422475496 ref 1 fl Complete:/4/ffffffff rc 0/-1
20:03:13:Lustre: lustre-MDT0000: Not available for connect from 10.2.4.157@tcp (stopping)
20:03:13:Lustre: Skipped 2 previous similar messages
20:03:13:LustreError: 26520:0:(osp_precreate.c:899:osp_precreate_cleanup_orphans()) lustre-OST0000-osc-MDT0000: cannot cleanup orphans: rc = -5

OST
20:07:11:LustreError: 18618:0:(qsd_reint.c:54:qsd_reint_completion()) lustre-OST0001: failed to enqueue global quota lock, glb fid:[0x200000006:0x20000:0x0], rc:-5

client:
20:07:45:Lustre: DEBUG MARKER: lctl get_param -n at_max
20:07:45:LustreError: 11-0: lustre-MDT0000-mdc-ffff88007b5f8800: operation ldlm_enqueue to node 10.2.4.162@tcp failed: rc = -107
20:07:45:LustreError: Skipped 16 previous similar messages
20:07:45:LustreError: 167-0: lustre-MDT0000-mdc-ffff88007b5f8800: This client was evicted by lustre-MDT0000; in progress operations using this service will fail. |
| Comments |
| Comment by Andreas Dilger [ 11/Aug/15 ] |
|
Haven't hit this since February. |