[LU-8872] sanity-lfsck: no tests run Created: 29/Nov/16 Updated: 16/May/17 Resolved: 26/Mar/17 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.9.0 |
| Fix Version/s: | Lustre 2.10.0 |
| Type: | Bug | Priority: | Major |
| Reporter: | Maloo | Assignee: | Niu Yawei (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Environment: |
Full - EL7.3 Server/EL7.3 Client - ZFS |
||
| Issue Links: |
|
||||
| Severity: | 3 | ||||
| Rank (Obsolete): | 9223372036854775807 | ||||
| Description |
|
This issue was created by maloo for Saurabh Tandan <saurabh.tandan@intel.com>

Please provide additional information about the failure here.

This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/414e1f26-b443-11e6-b287-5254006e85c2.

Error message:
/usr/bin/lfs setquota -g quota_2usr -b 7997952 -B 8397849 -i 109015 -I 114465 /mnt/lustre FAILED!

suite_log:
Total allocated inode limit: 0, total allocated block limit: 0
Setting up quota on onyx-32vm1.onyx.hpdd.intel.com:/mnt/lustre for quota_2usr...
+ /usr/bin/lfs setquota -u quota_2usr -b 7997952 -B 8397849 -i 109015 -I 114465 /mnt/lustre
+ /usr/bin/lfs setquota -g quota_2usr -b 7997952 -B 8397849 -i 109015 -I 114465 /mnt/lustre
setquota failed: Transport endpoint is not connected
sanity-lfsck : @@@@@@ FAIL: /usr/bin/lfs setquota -g quota_2usr -b 7997952 -B 8397849 -i 109015 -I 114465 /mnt/lustre FAILED!

Might be related to |
| Comments |
| Comment by James Nunez (Inactive) [ 29/Nov/16 ] |
|
In the test_complete log for vm8, we see:

17:33:59:[ 606.188774] Lustre: lustre-OST0003: Connection restored to lustre-MDT0000-mdtlov_UUID (at 10.2.4.117@tcp)
17:33:59:[ 606.191655] Lustre: Skipped 6 previous similar messages
17:33:59:[ 610.365899] Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-ost4/ost4 2>/dev/null
17:33:59:[ 614.462466] Lustre: DEBUG MARKER: /usr/sbin/lctl mark Using TIMEOUT=20
17:33:59:[ 614.746365] Lustre: DEBUG MARKER: Using TIMEOUT=20
17:33:59:[ 615.646830] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n osd-zfs.lustre-OST0000.quota_slave.enabled
17:33:59:[ 621.474849] LustreError: 15111:0:(qsd_writeback.c:124:qsd_add_deferred()) ASSERTION( tmp->qur_lqe ) failed:
17:33:59:[ 621.477708] LustreError: 15111:0:(qsd_writeback.c:124:qsd_add_deferred()) LBUG
17:33:59:[ 621.480296] Pid: 15111, comm: ldlm_cb00_000
17:33:59:[ 621.482716]
17:33:59:[ 621.482716] Call Trace:
17:33:59:[ 621.486796] [<ffffffffa09c57d3>] libcfs_debug_dumpstack+0x53/0x80 [libcfs]
17:33:59:[ 621.489228] [<ffffffffa09c5841>] lbug_with_loc+0x41/0xb0 [libcfs]
17:33:59:[ 621.491543] [<ffffffffa0f0b6af>] qsd_upd_schedule+0x6ef/0x760 [lquota]
17:33:59:[ 621.493834] [<ffffffffa0f04178>] qsd_glb_glimpse_ast+0x228/0x3a0 [lquota]
17:33:59:[ 621.496292] [<ffffffffa0d1fa3d>] ldlm_callback_handler.part.24+0x13bd/0x2110 [ptlrpc]
17:33:59:[ 621.498685] [<ffffffffa09d0537>] ? libcfs_debug_msg+0x57/0x80 [libcfs]
17:33:59:[ 621.500993] [<ffffffffa0d207c7>] ldlm_callback_handler+0x37/0xd0 [ptlrpc]
17:33:59:[ 621.503350] [<ffffffffa0d4d1fb>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
17:33:59:[ 621.505735] [<ffffffffa0d4adb8>] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
17:33:59:[ 621.508047] [<ffffffffa0d512b0>] ptlrpc_main+0xaa0/0x1de0 [ptlrpc]
17:33:59:[ 621.510292] [<ffffffffa0d50810>] ? ptlrpc_main+0x0/0x1de0 [ptlrpc]
17:33:59:[ 621.512503] [<ffffffff810b052f>] kthread+0xcf/0xe0
17:33:59:[ 621.514565] [<ffffffff810b0460>] ? kthread+0x0/0xe0
17:33:59:[ 621.516605] [<ffffffff81696658>] ret_from_fork+0x58/0x90
17:33:59:[ 621.518642] [<ffffffff810b0460>] ? kthread+0x0/0xe0
17:33:59:[ 621.520596]
17:33:59:[ 621.522237] Kernel panic - not syncing: LBUG
17:33:59:[ 621.523228] CPU: 1 PID: 15111 Comm: ldlm_cb00_000 Tainted: P OE ------------ 3.10.0-514.el7_lustre.x86_64 #1
17:33:59:[ 621.523228] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007 |
| Comment by Joseph Gmitter (Inactive) [ 29/Nov/16 ] |
|
Hi Niu, There seems to be some relation to quota-related LBUGs. Can you please have a look? Thanks. |
| Comment by Niu Yawei (Inactive) [ 30/Nov/16 ] |
|
That LASSERT is simply inappropriate: the 'lqe' can be NULL for the global list. I'm going to cook a patch to remove it. |
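
The point about the assertion can be illustrated with a minimal userspace sketch. This is not the code from qsd_writeback.c and not the patch itself: `struct quota_entry`, `struct upd_record`, and `format_conflict()` are hypothetical stand-ins, and the only assumption taken from the ticket is that a record on the global list may legitimately carry a NULL 'lqe'. The sketch asserts only on the caller's contract and handles the NULL entry as a normal case instead of treating it as a fatal invariant violation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for illustration; not the real Lustre structures. */
struct quota_entry {
    unsigned int id;
};

struct upd_record {
    struct quota_entry *lqe;  /* assumed NULL for records on the global list */
    unsigned long long ver;   /* version carried by the update */
};

/*
 * Format a message about a record whose version conflicts with another update.
 * The buffer arguments are a caller contract, so asserting on them is fine.
 * rec->lqe is NOT asserted: NULL is a valid state and is handled explicitly.
 * Returns the value of snprintf().
 */
static int format_conflict(const struct upd_record *rec, char *buf, size_t len)
{
    assert(rec != NULL && buf != NULL && len > 0);

    if (rec->lqe != NULL)
        return snprintf(buf, len, "conflict: id %u, ver %llu",
                        rec->lqe->id, rec->ver);

    /* No per-ID entry attached: report the version only. */
    return snprintf(buf, len, "conflict: global record, ver %llu", rec->ver);
}
```

In this model, asserting `rec->lqe != NULL` would abort on every global-list record, which mirrors the LBUG in the trace above; branching on the pointer keeps the diagnostic message without the crash.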
| Comment by Gerrit Updater [ 30/Nov/16 ] |
|
Niu Yawei (yawei.niu@intel.com) uploaded a new patch: http://review.whamcloud.com/24024 |
| Comment by Gerrit Updater [ 26/Mar/17 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/24024/ |
| Comment by Peter Jones [ 26/Mar/17 ] |
|
Landed for 2.10 |