[LU-2828] conf-sanity test_64 test_59: MDS dt_object.h dt_declare_record_write() ASSERTION( dt != NULL ) Created: 18/Feb/13 Updated: 03/Apr/16 Resolved: 03/Mar/15 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.0 |
| Fix Version/s: | Lustre 2.4.0, Lustre 2.8.0 |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Keith Mannthey (Inactive) | Assignee: | Zhenyu Xu |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | HB | ||
| Environment: |
A patch pushed via git. |
||
| Severity: | 3 |
| Rank (Obsolete): | 6849 |
| Description |
|
From this test run: The patch being tested is not involved in this area of the code.
conf-sanity test_64 Error: 'test failed to respond and timed out'
In the MDS the following is seen:
09:37:51:Lustre: DEBUG MARKER: == conf-sanity test 64: check lfs df --lazy == 09:37:45 (1361122665)
09:37:51:Lustre: DEBUG MARKER: mkdir -p /mnt/mds1
09:37:51:Lustre: DEBUG MARKER: test -b /dev/lvm-MDS/P1
09:37:51:Lustre: DEBUG MARKER: mkdir -p /mnt/mds1; mount -t lustre -o user_xattr,acl /dev/lvm-MDS/P1 /mnt/mds1
09:37:51:LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. quota=on. Opts:
09:37:51:Lustre: lustre-MDT0000: used disk, loading
09:37:51:Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/usr/lib64/lustre/tests//usr/lib64/lustre/tests:/usr/lib64/lustre/tests:/usr/lib64/lustre/tests/../utils:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lust
09:37:51:Lustre: DEBUG MARKER: e2label /dev/lvm-MDS/P1 2>/dev/null
09:38:02:Lustre: lustre-OST0000-osc-MDT0000: Connection to lustre-OST0000 (at 10.10.17.34@tcp) was lost; in progress operations using this service will wait for recovery to complete
09:38:14:Lustre: DEBUG MARKER: grep -c /mnt/mds1' ' /proc/mounts
09:38:14:Lustre: DEBUG MARKER: umount -d -f /mnt/mds1
09:38:14:LustreError: 7883:0:(client.c:1048:ptlrpc_import_delay_req()) @@@ IMP_CLOSED req@ffff88006c4a2c00 x1427239946161272/t0(0) o13->lustre-OST0000-osc-MDT0000@10.10.17.34@tcp:7/4 lens 224/368 e 0 to 0 dl 0 ref 1 fl Rpc:/0/ffffffff rc 0/-1
09:38:14:LustreError: 24649:0:(dt_object.h:979:dt_declare_record_write()) ASSERTION( dt != NULL ) failed: dt is NULL when we want to write record
09:38:14:LustreError: 24649:0:(dt_object.h:979:dt_declare_record_write()) LBUG
09:38:14:Pid: 24649, comm: osp-pre-1
09:38:14:
09:38:14:Call Trace:
09:38:14: [<ffffffffa0ee7895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
09:38:14: [<ffffffffa0ee7e97>] lbug_with_loc+0x47/0xb0 [libcfs]
09:38:14: [<ffffffffa0704ca5>] osp_write_last_oid_seq_files+0x595/0x6a0 [osp]
09:38:14: [<ffffffffa070918d>] osp_precreate_thread+0x80d/0x1460 [osp]
09:38:14: [<ffffffffa0708980>] ? osp_precreate_thread+0x0/0x1460 [osp]
09:38:14: [<ffffffff8100c0ca>] child_rip+0xa/0x20
09:38:14: [<ffffffffa0708980>] ? osp_precreate_thread+0x0/0x1460 [osp]
09:38:14: [<ffffffffa0708980>] ? osp_precreate_thread+0x0/0x1460 [osp]
09:38:14: [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
Looks like the MDS panicked on unmount. |
| Comments |
| Comment by Li Wei (Inactive) [ 19/Feb/13 ] |
|
https://maloo.whamcloud.com/test_sets/172a3bae-7adf-11e2-b916-52540035b04c |
| Comment by Zhenyu Xu [ 20/Feb/13 ] |
|
another hit at https://maloo.whamcloud.com/test_sets/b7840b4c-7b67-11e2-8242-52540035b04c |
| Comment by Nathaniel Clark [ 20/Feb/13 ] |
|
ZFS having same issue on test 59 of conf-sanity https://maloo.whamcloud.com/test_sets/a79faa76-7b51-11e2-8242-52540035b04c |
| Comment by nasf (Inactive) [ 20/Feb/13 ] |
|
another failure instance: https://maloo.whamcloud.com/test_sets/35e5a63c-7ada-11e2-b916-52540035b04c |
| Comment by Minh Diep [ 21/Feb/13 ] |
|
another hit: https://maloo.whamcloud.com/test_sets/a7a08a3a-79d1-11e2-ad0e-52540035b04c |
| Comment by Jodi Levi (Inactive) [ 21/Feb/13 ] |
|
Alex, |
| Comment by Nathaniel Clark [ 21/Feb/13 ] |
|
I've seen it several times in ZFS testing. maloo says:
for failures in test_59 |
| Comment by Sarah Liu [ 25/Feb/13 ] |
|
another instance seen in ldiskfs: |
| Comment by Zhenyu Xu [ 25/Feb/13 ] |
|
Patch tracking at http://review.whamcloud.com/5528
Commit message:
LU-2828 osp: correct osp device finialize order
Should stop osp precreate thread before releasing its last used
oid/seq files.
|
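The ordering rule in the commit message above (stop the precreate thread before releasing the files it writes to) can be illustrated with a small user-space sketch. This is not the Lustre code or the actual patch: `PrecreateWorker`, `RecordStore`, and `finalize` are hypothetical names invented for illustration, and Python threads stand in for the kernel thread. The assumption is only that a background worker keeps writing records through a shared resource until it is told to stop.

```python
import threading
import time


class RecordStore:
    """Hypothetical shared resource, loosely analogous to the last used oid/seq files."""

    def __init__(self):
        self._lock = threading.Lock()
        self._open = True
        self.writes = 0

    def write_record(self, name):
        with self._lock:
            if not self._open:
                # Analogous in spirit to the "dt != NULL" assertion in the ticket.
                raise RuntimeError("store released when we want to write record")
            self.writes += 1

    def release(self):
        with self._lock:
            self._open = False


class PrecreateWorker:
    """Hypothetical background thread that keeps writing records until stopped."""

    def __init__(self, store):
        self.store = store
        self.errors = []
        self.first_write = threading.Event()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run)

    def start(self):
        self._thread.start()

    def _run(self):
        while not self._stop.is_set():
            try:
                self.store.write_record("last_id")
            except RuntimeError as exc:
                self.errors.append(exc)
                return
            self.first_write.set()
            time.sleep(0.001)

    def stop(self):
        # Signal the loop to exit, then wait until the thread has really finished.
        self._stop.set()
        self._thread.join()


def finalize(worker, store):
    # Order matters: stop the only writer first, then release what it writes to.
    worker.stop()
    store.release()


def run_demo():
    store = RecordStore()
    worker = PrecreateWorker(store)
    worker.start()
    worker.first_write.wait(timeout=5)  # make sure the worker is actively writing
    finalize(worker, store)
    return worker.errors, store.writes
```

Calling `store.release()` before `worker.stop()` in `finalize` would leave a window in which the worker can still call `write_record` on a released store, which is the same class of race the stack trace in the description shows.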
| Comment by Peter Jones [ 26/Feb/13 ] |
|
Landed for 2.4 |
| Comment by Andreas Dilger [ 01/Oct/14 ] |
|
conf-sanity.sh test_59 and test_64 are still being skipped due to this bug. |
| Comment by Gerrit Updater [ 13/Feb/15 ] |
|
James Nunez (james.a.nunez@intel.com) uploaded a new patch: http://review.whamcloud.com/13757 |
| Comment by James Nunez (Inactive) [ 13/Feb/15 ] |
|
Patch to remove tests 59 and 64 from the ALWAYS_EXCEPT list at http://review.whamcloud.com/13757 |
| Comment by Gerrit Updater [ 03/Mar/15 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/13757/ |
| Comment by James Nunez (Inactive) [ 03/Mar/15 ] |
|
Patch removing tests 59 and 64 from ALWAYS_EXCEPT list landed to master (pre-2.8). |