Details
Bug
Resolution: Fixed
Minor
Lustre 2.13.0, Lustre 2.12.3, Lustre 2.12.4, Lustre 2.12.5
Description
sanity test_160f fails with ‘mds1: User cl6 not registered’. So far this year, there have been 52 sanity test 160f failures with this error; 36 of those failures are for ARM clients.
Looking at the suite_log for a recent failure, https://testing.whamcloud.com/test_sets/8bedad40-ebd5-11e9-b62b-52540065bddc, we see that user cl6 is registered on all four MDTs and that the changelog is successfully cleared for that user before the error occurs:
== sanity test 160f: changelog garbage collect (timestamped users) =================================== 20:27:47 (1570566467)
CMD: trevis-49vm2 /usr/sbin/lctl get_param mdd.lustre-MDT0000.changelog_mask -n
CMD: trevis-49vm2 /usr/sbin/lctl set_param mdd.lustre-MDT0000.changelog_mask=+hsm
mdd.lustre-MDT0000.changelog_mask=+hsm
CMD: trevis-49vm2 /usr/sbin/lctl --device lustre-MDT0000 changelog_register -n
CMD: trevis-49vm3 /usr/sbin/lctl get_param mdd.lustre-MDT0001.changelog_mask -n
CMD: trevis-49vm3 /usr/sbin/lctl set_param mdd.lustre-MDT0001.changelog_mask=+hsm
mdd.lustre-MDT0001.changelog_mask=+hsm
CMD: trevis-49vm3 /usr/sbin/lctl --device lustre-MDT0001 changelog_register -n
CMD: trevis-49vm2 /usr/sbin/lctl get_param mdd.lustre-MDT0002.changelog_mask -n
CMD: trevis-49vm2 /usr/sbin/lctl set_param mdd.lustre-MDT0002.changelog_mask=+hsm
mdd.lustre-MDT0002.changelog_mask=+hsm
CMD: trevis-49vm2 /usr/sbin/lctl --device lustre-MDT0002 changelog_register -n
CMD: trevis-49vm3 /usr/sbin/lctl get_param mdd.lustre-MDT0003.changelog_mask -n
CMD: trevis-49vm3 /usr/sbin/lctl set_param mdd.lustre-MDT0003.changelog_mask=+hsm
mdd.lustre-MDT0003.changelog_mask=+hsm
CMD: trevis-49vm3 /usr/sbin/lctl --device lustre-MDT0003 changelog_register -n
Registered 4 changelog users: 'cl6 cl6 cl6 cl6'
…
mds1: verifying user cl6 clear: 19 + 2 == 21
CMD: trevis-49vm2 /usr/sbin/lctl get_param -n mdd.lustre-MDT0000.changelog_users
CMD: trevis-49vm3 /usr/sbin/lctl get_param -n mdd.lustre-MDT0001.changelog_users
CMD: trevis-49vm3 /usr/sbin/lctl get_param -n mdd.lustre-MDT0001.changelog_users
lustre-MDT0001: clear the changelog for cl6 to record #10
CMD: trevis-49vm3 /usr/sbin/lctl get_param -n mdd.lustre-MDT0001.changelog_users
mds2: verifying user cl6 clear: 8 + 2 == 10
CMD: trevis-49vm3 /usr/sbin/lctl get_param -n mdd.lustre-MDT0001.changelog_users
CMD: trevis-49vm2 /usr/sbin/lctl get_param -n mdd.lustre-MDT0002.changelog_users
CMD: trevis-49vm2 /usr/sbin/lctl get_param -n mdd.lustre-MDT0002.changelog_users
lustre-MDT0002: clear the changelog for cl6 to record #2
CMD: trevis-49vm2 /usr/sbin/lctl get_param -n mdd.lustre-MDT0002.changelog_users
mds3: verifying user cl6 clear: 0 + 2 == 2
CMD: trevis-49vm2 /usr/sbin/lctl get_param -n mdd.lustre-MDT0002.changelog_users
CMD: trevis-49vm3 /usr/sbin/lctl get_param -n mdd.lustre-MDT0003.changelog_users
CMD: trevis-49vm3 /usr/sbin/lctl get_param -n mdd.lustre-MDT0003.changelog_users
lustre-MDT0003: clear the changelog for cl6 to record #2
CMD: trevis-49vm3 /usr/sbin/lctl get_param -n mdd.lustre-MDT0003.changelog_users
mds4: verifying user cl6 clear: 0 + 2 == 2
CMD: trevis-49vm3 /usr/sbin/lctl get_param -n mdd.lustre-MDT0003.changelog_users
total: 8 create in 0.02 seconds: 453.39 ops/second
CMD: trevis-49vm2 ps -e -o comm= | grep chlg_gc_thread
pdsh@trevis-79vm17: trevis-49vm2: ssh exited with exit code 1
CMD: trevis-49vm2 ps -e -o comm= | grep chlg_gc_thread
pdsh@trevis-79vm17: trevis-49vm2: ssh exited with exit code 1
CMD: trevis-49vm3 ps -e -o comm= | grep chlg_gc_thread
pdsh@trevis-79vm17: trevis-49vm3: ssh exited with exit code 1
CMD: trevis-49vm3 ps -e -o comm= | grep chlg_gc_thread
pdsh@trevis-79vm17: trevis-49vm3: ssh exited with exit code 1
CMD: trevis-49vm2 /usr/sbin/lctl get_param -n mdd.lustre-MDT0000.changelog_users
sanity test_160f: @@@@@@ FAIL: mds1: User cl6 not registered
There is no indication of a problem in any of the console logs.
Logs for other failures are at:
https://testing.whamcloud.com/test_sets/ecc72250-ccfd-11e9-a25b-52540065bddc
https://testing.whamcloud.com/test_sets/8da05bea-cff3-11e9-9fc9-52540065bddc
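For anyone trying to reproduce this outside the test framework, the failing check amounts to looking for the user ID in the output of lctl get_param -n mdd.lustre-MDT0000.changelog_users. The following is only a rough sketch of that lookup, not the code sanity.sh actually uses. The helper names are made up, and the sample text is a hypothetical illustration of the changelog_users layout (a "current index" line, a header line, then one row per user), which is an assumption and was not captured from the failing run.

```python
import re


def registered_users(changelog_users_output):
    """Return {user_id: index} parsed from changelog_users text.

    Assumption: each user row starts with an ID such as "cl6"
    followed by whitespace and a numeric index.
    """
    users = {}
    for line in changelog_users_output.splitlines():
        match = re.match(r"^(cl\d+)\s+(\d+)", line.strip())
        if match:
            users[match.group(1)] = int(match.group(2))
    return users


def is_registered(changelog_users_output, user):
    """True if the given changelog user ID has a row in the output."""
    return user in registered_users(changelog_users_output)


# Hypothetical sample output, for illustration only.
SAMPLE = """current index: 21
ID    index (idle seconds)
cl6   21 (3)
"""

# Same layout with the user row missing, as would be seen if the
# user had been deregistered before the check ran.
SAMPLE_WITHOUT_USER = """current index: 21
ID    index (idle seconds)
"""

if __name__ == "__main__":
    print(is_registered(SAMPLE, "cl6"))
    print(is_registered(SAMPLE_WITHOUT_USER, "cl6"))
```

Running a check like this against each MDT immediately before and after the chlg_gc_thread probes could help narrow down when cl6 disappears from mds1.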