Details
- Bug
- Resolution: Fixed
- Minor
- Lustre 2.5.0
- OpenSFS cluster with a combined MGS/MDS, a single OSS with two OSTs, and four clients: one agent (c07), one running robinhood/db (c08), and two plain Lustre clients (c09, c10)
- 3
- 10610
Description
Test results are at https://maloo.whamcloud.com/test_sets/28e49004-2171-11e3-b1f0-52540035b04c
This may just be an error in the test. From John Hammond:
I think 9a is a bug in the test. 0x00000001 should be 0x00000009.
From the test log, we see:
== sanity-hsm test 9a: Multiple remote agents == 10:42:31 (1379526151)
pdsh@c10: c07: ssh exited with exit code 1
pdsh@c10: c07: ssh exited with exit code 1
Purging archive on c07
Starting copytool agt1 on c07
2+0 records in
2+0 records out
2097152 bytes (2.1 MB) copied, 0.322563 s, 6.5 MB/s
Changed after 0s: from '' to 'STARTED'
Waiting 100 secs for update
sanity-hsm test_9a: @@@@@@ FAIL: hsm flags on /lustre/scratch/d0.sanity-hsm/d9/f.sanity-hsm.9a.1 are 0x00000009 != 0x00000001
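For context, 0x00000009 decodes as HS_EXISTS | HS_ARCHIVED, while 0x00000001 is HS_EXISTS alone, so after a successful archive the larger value is the one we should expect. The sketch below only illustrates that bit arithmetic; the hsm_states values are assumed from the Lustre 2.5-era lustre_user.h header, not quoted from this ticket.

/* Minimal sketch: decode the flag value 0x00000009 reported by test_9a,
 * assuming the hsm_states bits from lustre/include/lustre/lustre_user.h. */
#include <stdio.h>

enum hsm_states {
	HS_NONE     = 0x00000000,
	HS_EXISTS   = 0x00000001,
	HS_DIRTY    = 0x00000002,
	HS_RELEASED = 0x00000004,
	HS_ARCHIVED = 0x00000008,
};

int main(void)
{
	unsigned int flags = 0x00000009;	/* value reported by the test */

	/* 0x9 == HS_EXISTS | HS_ARCHIVED: the file exists and has been
	 * archived, which is the state expected after a successful archive. */
	printf("exists=%d archived=%d\n",
	       !!(flags & HS_EXISTS), !!(flags & HS_ARCHIVED));
	return 0;
}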
I think these messages are harmless, but dmesg on the MDS shows the following:
Lustre: DEBUG MARKER: == sanity-hsm test 9: Use of explict archive number, with dedicated copytool == 10:42:28 (1379526148)
LustreError: 30192:0:(mdt_coordinator.c:917:mdt_hsm_cdt_start()) scratch-MDT0000: Coordinator already started
LustreError: 30192:0:(obd_config.c:1346:class_process_proc_param()) writing proc entry hsm_control err -114
Lustre: DEBUG MARKER: == sanity-hsm test 9a: Multiple remote agents == 10:42:31 (1379526151)
Lustre: DEBUG MARKER: sanity-hsm test_9a: @@@@@@ FAIL: hsm flags on /lustre/scratch/d0.sanity-hsm/d9/f.sanity-hsm.9a.1 are 0x00000009 != 0x00000001
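Regarding the err -114: on Linux that errno is EALREADY, which lines up with the "Coordinator already started" message just above it, i.e. re-enabling hsm_control on a coordinator that is already running just reports the operation as already in progress. A minimal check of the errno value (plain libc, nothing Lustre-specific):

/* Sketch: confirm that the -114 reported by class_process_proc_param()
 * corresponds to EALREADY on Linux. */
#include <errno.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
	/* Prints "EALREADY = 114 (Operation already in progress)" on Linux. */
	printf("EALREADY = %d (%s)\n", EALREADY, strerror(EALREADY));
	return 0;
}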