[LU-2547] test: recovery-small test_24a, test_24b: multiop didn't fail fsync: rc 0 Created: 28/Dec/12 Updated: 22/Dec/17 Resolved: 02/Sep/16 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.3.0, Lustre 2.4.0 |
| Fix Version/s: | Lustre 2.9.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Jay Lan (Inactive) | Assignee: | Niu Yawei (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | revzfs | ||
| Environment: |
Server: 2.1.3-1nasS, centos 6.3, 2.6.32_279.2.1.el6 |
||
| Attachments: |
|
| Severity: | 3 |
| Rank (Obsolete): | 5971 |
| Description |
|
== recovery-small test 24a: fsync error (should return error) ======================================== 23:15:59 (1356678959) test_logs tarball is attached: recovery-small.24a.tgz |
| Comments |
| Comment by Peter Jones [ 29/Dec/12 ] |
|
Lai, could you please look into this one? Thanks, Peter |
| Comment by Lai Siyao [ 31/Dec/12 ] |
|
Jay, I tested the b2_3 branch and it always succeeds. Could you point me to the exact version that fails, or to a place where I can download it? BTW, does this test always fail in your test environment? |
| Comment by Jay Lan (Inactive) [ 31/Dec/12 ] |
|
I tested it three times and all failed in my test environment. The source is at https://github.com/jlan/lustre-nas, |
| Comment by Lai Siyao [ 05/Jan/13 ] |
|
Jay, it looks like the code can't compile against 3.0 kernel yet, could you list the patches you've applied? |
| Comment by Jay Lan (Inactive) [ 06/Jan/13 ] |
|
Attached are nas-config and nas-make scripts. You need to modify the nas-config script to specify
|
| Comment by Nathaniel Clark [ 04/Feb/13 ] |
|
https://maloo.whamcloud.com/test_sets/f17e7f9c-6c8c-11e2-91d6-52540035b04c |
| Comment by Jay Lan (Inactive) [ 01/Apr/13 ] |
|
Nathaniel, is the above link for me to read? I can not access that link. |
| Comment by Nathaniel Clark [ 03/Apr/13 ] |
|
Jay, Sorry, that's a link to a failing autotest run. |
| Comment by Nathaniel Clark [ 22/Apr/13 ] |
|
This hasn't failed with ldiskfs in 4 weeks, but it is failing over 50% of the time with zfs. |
| Comment by Nathaniel Clark [ 22/Apr/13 ] |
|
EXCEPT this test for zfs |
| Comment by Keith Mannthey (Inactive) [ 13/May/13 ] |
|
It looked like a patch was landed but I saw a zfs fail today that looked exactly like this. https://maloo.whamcloud.com/test_sets/f425e1a6-bc12-11e2-b013-52540035b04c |
| Comment by Nathaniel Clark [ 07/Jun/13 ] |
|
Patch to EXCEPT 24b also for ZFS |
| Comment by Bruno Faccini (Inactive) [ 04/Jul/13 ] |
|
Got an occurrence with recovery-small/test_24b during https://maloo.whamcloud.com/test_sets/f43c1ffc-e4ad-11e2-a950-52540035b04c. |
| Comment by Peter Jones [ 22/Aug/13 ] |
|
Landed for 2.5 |
| Comment by Nathaniel Clark [ 22/Aug/13 ] |
|
Patch for b2_4 http://review.whamcloud.com/7424 |
| Comment by Andreas Dilger [ 01/Oct/14 ] |
|
recovery-small test_24a and test_24b are being skipped; the problem was not actually fixed. |
| Comment by Jay Lan (Inactive) [ 19/Oct/15 ] |
|
Please close this ticket since the test was marked "always_except". |
| Comment by Peter Jones [ 19/Oct/15 ] |
|
ok Jay |
| Comment by Andreas Dilger [ 20/Oct/15 ] |
|
The whole point of the always_except label is that it means this test is being skipped, but the original bug has not actually been fixed. This ticket shouldn't be closed until the original problem is fixed (lack of error return to userspace on fsync) and the test is removed from the ALWAYS_EXCEPT list in recovery-small.sh. I verified that this test is still being skipped, but only for ZFS MDT. |
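For illustration, the userspace check at issue ("lack of error return to userspace on fsync") can be sketched as follows. This is a minimal Python sketch, not the multiop code that recovery-small.sh actually runs; the helper name `write_then_fsync` and the path in the usage line are hypothetical. It writes data, calls fsync, and returns the errno of a failing call so that a caller can tell rc 0 apart from a reported error.

```python
import os


def write_then_fsync(path, data):
    """Write data to path and fsync it.

    Returns 0 if both write and fsync succeed, otherwise the errno
    reported by the failing call. (Hypothetical helper for illustration.)
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        os.write(fd, data)
        # fsync is where a lost write should surface as an error.
        os.fsync(fd)
        return 0
    except OSError as exc:
        return exc.errno
    finally:
        os.close(fd)
```

A caller might run `rc = write_then_fsync("/mnt/lustre/f24a", b"x" * 4096)` (path assumed). In the scenario this ticket describes, the test expects a nonzero rc after the induced failure; the reported bug is that rc was 0.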
| Comment by Peter Jones [ 12/Aug/16 ] |
|
Niu, can you please check to see what needs to happen to get this test re-enabled? Thanks, Peter |
| Comment by Niu Yawei (Inactive) [ 19/Aug/16 ] |
|
Lustre fsync was semantically wrong before the fix of
00000080:00000001:4.0:1356678959.907382:0:21907:0:(obd_class.h:2061:md_sync()) Process leaving (rc=0 : 0 : 0)
00000100:00000001:4.0:1356678959.907382:0:21907:0:(client.c:2323:__ptlrpc_req_finished()) Process entered
00000100:00000040:4.0:1356678959.907383:0:21907:0:(client.c:2335:__ptlrpc_req_finished()) @@@ refcount now 0 req@ffff8805bdf93800 x1422578972363063/t0(0) o44->lustre-MDT0000-mdc-ffff8806f504e800@10.151.25.187@o2ib:12/10 lens 448/408 e 0 to 0 dl 1356678997 ref 1 fl Complete:R/0/0 rc 0/0
00000100:00000001:4.0:1356678959.907386:0:21907:0:(client.c:2245:__ptlrpc_free_req()) Process entered
02000000:00000001:4.0:1356678959.907388:0:21907:0:(sec.c:1697:sptlrpc_cli_free_repbuf()) Process entered
02000000:00000010:4.0:1356678959.907388:0:21907:0:(sec_null.c:231:null_free_repbuf()) kfreed 'req->rq_repbuf': 1024 at ffff8805d7a85000.
02000000:00000001:4.0:1356678959.907389:0:21907:0:(sec.c:1711:sptlrpc_cli_free_repbuf()) Process leaving
00000020:00000001:4.0:1356678959.907390:0:21907:0:(genops.c:963:class_import_put()) Process entered
00000020:00000040:4.0:1356678959.907390:0:21907:0:(genops.c:970:class_import_put()) import ffff880379630800 refcount=10 obd=lustre-MDT0000-mdc-ffff8806f504e800
00000020:00000001:4.0:1356678959.907391:0:21907:0:(genops.c:979:class_import_put()) Process leaving
02000000:00000010:4.0:1356678959.907392:0:21907:0:(sec_null.c:201:null_free_reqbuf()) kfreed 'req->rq_reqbuf': 512 at ffff8805d8599600.
02000000:00000001:4.0:1356678959.907394:0:21907:0:(sec.c:437:sptlrpc_req_put_ctx()) Process entered
02000000:00000001:4.0:1356678959.907394:0:21907:0:(sec.c:453:sptlrpc_req_put_ctx()) Process leaving
00000100:00000010:4.0:1356678959.907395:0:21907:0:(client.c:2299:__ptlrpc_free_req()) kfreed 'request': 928 at ffff8805bdf93800.
00000100:00000001:4.0:1356678959.907396:0:21907:0:(client.c:2300:__ptlrpc_free_req()) Process leaving
00000100:00000001:4.0:1356678959.907396:0:21907:0:(client.c:2339:__ptlrpc_req_finished()) Process leaving (rc=1 : 1 : 1)
00020000:00000002:4.0:1356678959.907397:0:21907:0:(lov_object.c:787:lov_lsm_addref()) lsm ffff8805b90ed640 addref 2 by ffff880382188280.
00020000:00000002:4.0:1356678959.907398:0:21907:0:(lov_object.c:799:lov_lsm_decref()) lsm ffff8805b90ed640 decref 2 by ffff880382188280.
00000080:00000001:4.0:1356678959.907399:0:21907:0:(file.c:2121:ll_fsync()) Process leaving (rc=0 : 0 : 0)
Above log shows fsync synced meta data only. Given fsync has been fixed since 2.6 (by |
| Comment by Gerrit Updater [ 19/Aug/16 ] |
|
Niu Yawei (yawei.niu@intel.com) uploaded a new patch: http://review.whamcloud.com/22020 |
| Comment by Gerrit Updater [ 02/Sep/16 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/22020/ |
| Comment by Peter Jones [ 02/Sep/16 ] |
|
Landed for 2.9 |