Details
- Bug
- Resolution: Unresolved
- Minor
- None
- Lustre 2.12.0, Lustre 2.10.3, Lustre 2.10.5, Lustre 2.13.0, Lustre 2.14.0, Lustre 2.16.0
- None
- 3
- 9223372036854775807
Description
replay-single test_87a fails when it compares a file's checksum before and after an OST failover. In the client test_log for https://testing.whamcloud.com/test_sets/193c29e2-d05c-11e8-82f2-52540065bddc, the OSTs do not show up in the output of 'lfs df':
== replay-single test 87a: write replay ============================================================== 00:30:41 (1539588641)
CMD: onyx-39vm10 lctl set_param -n obdfilter.lustre-OST0000.sync_journal 0
onyx-39vm10: error: set_param: param_path 'obdfilter/lustre-OST0000/sync_journal': No such file or directory
CMD: onyx-39vm10 sync; sync; sync
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 5825660 47548 5255280 1% /mnt/lustre[MDT:0]
filesystem_summary: 0 0 0 0% /mnt/lustre
When this test passes, we see the OSTs listed in 'lfs df'. More than likely a previous test did not clean up after itself. Note: The sync_journal parameter will be taken care of in LU-11561.
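The missing-OST condition can be checked mechanically on captured 'lfs df' output. The following is a minimal sketch, not part of the test suite: the helper name `ost_lines` is hypothetical, and it assumes that OST entries carry an `[OST:n]` suffix on the mount point, analogous to the `[MDT:0]` suffix visible in the log above.

```python
def ost_lines(lfs_df_output: str) -> list[str]:
    """Return the lines of captured 'lfs df' output that describe an OST.

    Assumption: OST entries end with a mount point such as /mnt/lustre[OST:0],
    mirroring the [MDT:0] suffix seen in the failing log.
    """
    return [line for line in lfs_df_output.splitlines() if "[OST:" in line]


# Output copied from the failing run: only the MDT is listed.
failing_output = """\
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 5825660 47548 5255280 1% /mnt/lustre[MDT:0]
filesystem_summary: 0 0 0 0% /mnt/lustre
"""

if not ost_lines(failing_output):
    print("no OSTs listed in 'lfs df' output")
```

A check like this, run before the write, would distinguish a broken setup left by an earlier test from a genuine replay failure.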
In the client log, the write to the file appears to succeed. However, after the OSS has failed over, when the test reads the file back to calculate its checksum, dd transfers no data:
8+0 records in
8+0 records out
8388608 bytes (8.4 MB, 8.0 MiB) copied, 0.531744 s, 15.8 MB/s
...
0+0 records in
0+0 records out
0 bytes copied, 0.00189394 s, 0.0 kB/s
replay-single test_87a: @@@@@@ FAIL: New checksum d41d8cd98f00b204e9800998ecf8427e does not match original 51109c0cd52a9fa425c47b018ed0708e
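Whenever replay-single test 87a fails with the checksum mismatch, the "new checksum" is always the same: d41d8cd98f00b204e9800998ecf8427e. This value is the MD5 digest of empty input, which is consistent with dd reading 0 bytes after the failover. The original checksum differs from failure to failure.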
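The constant "new checksum" can be compared against the digest of zero bytes of input. This is a small sketch that assumes the test's checksums are MD5 digests, as the 32-hex-digit values suggest; the helper name `is_empty_read_checksum` is hypothetical and only illustrates the comparison.

```python
import hashlib

# MD5 digest of zero bytes of input.
EMPTY_MD5 = hashlib.md5(b"").hexdigest()


def is_empty_read_checksum(checksum: str) -> bool:
    """Return True if the checksum equals the MD5 of empty input."""
    return checksum.strip().lower() == EMPTY_MD5


print(EMPTY_MD5)  # d41d8cd98f00b204e9800998ecf8427e
print(is_empty_read_checksum("d41d8cd98f00b204e9800998ecf8427e"))  # True
print(is_empty_read_checksum("51109c0cd52a9fa425c47b018ed0708e"))  # False
```

A match means the checksum was computed over an empty read, so the failure points at the read returning no data rather than at corrupted file contents.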
We see this error at least as far back as May 2018.
More replay-single test_87a failures are at:
https://testing.whamcloud.com/test_sets/f3e40540-d13f-11e8-ad90-52540065bddc
https://testing.whamcloud.com/test_sets/7ad86b32-cae1-11e8-b589-52540065bddc
https://testing.whamcloud.com/test_sets/94e8844a-c8a1-11e8-82f2-52540065bddc
https://testing.whamcloud.com/test_sets/fc93fec0-c8a2-11e8-82f2-52540065bddc
https://testing.whamcloud.com/test_sets/1bedfd14-b068-11e8-bbd1-52540065bddc
Issue Links
- is related to
- LU-10702 replay-single test_87a: checksum doesn't match (Resolved)
- LU-11561 Change syncjournal back to sync_journal (Resolved)