[LU-8728] Fix conf-sanity:88 for the multiple MDS case Created: 19/Oct/16 Updated: 08/Dec/17 Resolved: 08/Dec/17 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Arshad Hussain | Assignee: | WC Triage |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None |
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
The start_mds call starts all MDSs. lctl clear_conf then fails, because it expects only the single MDS that is combined with the MGS to be running, started with the nosvc option. Only the start_mdt call should be used, so that just the needed MDS is started. |
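For illustration only, a minimal sketch of the intended sequence, assuming a combined MGS/MDT on facet mds1 and the usual test-framework.sh helpers (stopall, start_mdt, stop_mdt, do_facet, error); the exact options and checks in the actual patch (http://review.whamcloud.com/23246) may differ:

  # stop everything brought up by the earlier part of the test
  stopall

  # start only mds1 (the MGS/MDT combo) with the MDT service disabled,
  # so that clear_conf sees a lone MGS instead of every MDS
  start_mdt 1 "-o nosvc" || error "start mds1 with nosvc failed"

  # drop the configuration llogs for the filesystem on the MGS
  do_facet mgs "$LCTL clear_conf $FSNAME" || error "clear_conf failed"

  stop_mdt 1 || error "stop mds1 failed"

  # bring the target back normally; only the needed MDT is started,
  # unlike start_mds, which starts all MDSs in a multi-MDS setup
  start_mdt 1 || error "start mds1 failed"

The key point is that start_mdt operates on one MDT index at a time, whereas start_mds loops over every configured MDS, which is what breaks clear_conf in the multiple-MDS case.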
| Comments |
| Comment by Gerrit Updater [ 19/Oct/16 ] |
|
Arshad Hussain (arshad.hussain@seagate.com) uploaded a new patch: http://review.whamcloud.com/23246 |
| Comment by Arshad Hussain [ 19/Oct/16 ] |
|
Test result on local: 88a
== conf-sanity test 88a: test lctl clear_conf fsname == 22:03:06 (1475944386)
Stopping clients: node1.domain /mnt/lustre (opts:)
Stopping clients: node1.domain /mnt/lustre2 (opts:)
Loading modules from /root/hpdd/lustre-wc/lustre/tests/..
detected 1 online CPUs by sysfs
libcfs will create CPU partition based on online CPUs
debug=-1
subsystem_debug=all
../lnet/lnet/lnet options: 'networks=tcp0(eth1) accept=all'
gss/krb5 is not supported
quota/lquota options: 'hash_lqs_cur_bits=3'
Formatting mgs, mds, osts
Format mds1: /tmp/lustre-mdt1
Format ost1: /tmp/lustre-ost1
Format ost2: /tmp/lustre-ost2
start mds service on node1.domain
Starting mds1: -o loop /tmp/lustre-mdt1 /mnt/lustre-mds1
Commit the device label on /tmp/lustre-mdt1
Started lustre-MDT0000
start ost1 service on node1.domain
Starting ost1: -o loop /tmp/lustre-ost1 /mnt/lustre-ost1
Commit the device label on /tmp/lustre-ost1
Started lustre-OST0000
mount lustre on /mnt/lustre.....
Starting client: node1.domain: -o user_xattr,flock node1.domain@tcp:/lustre /mnt/lustre
Setting lustre-MDT0000.mdd.atime_diff from 60 to 62
Waiting 90 secs for update
Updated after 2s: wanted '62' got '62'
Setting lustre-MDT0000.mdd.atime_diff from 62 to 63
Waiting 90 secs for update
Updated after 5s: wanted '63' got '63'
Setting lustre.llite.max_read_ahead_mb from 27.13 to 32
Waiting 90 secs for update
Updated after 8s: wanted '32' got '32'
Setting lustre.llite.max_read_ahead_mb from 32 to 64
Waiting 90 secs for update
Updated after 6s: wanted '64' got '64'
Pool lustre.pool1 created
OST lustre-OST0000_UUID added to pool lustre.pool1
OST lustre-OST0000_UUID removed from pool lustre.pool1
OST lustre-OST0000_UUID added to pool lustre.pool1
umount lustre on /mnt/lustre.....
Stopping client node1.domain /mnt/lustre (opts:)
stop ost1 service on node1.domain
Stopping /mnt/lustre-ost1 (opts:-f) on node1.domain
stop mds service on node1.domain
Stopping /mnt/lustre-mds1 (opts:-f) on node1.domain
start mds service on node1.domain
Starting mds1: -o nosvc,loop /tmp/lustre-mdt1 /mnt/lustre-mds1
Start /tmp/lustre-mdt1 without service
Started lustre-MDT0000
debugfs 1.42.13.wc3 (28-Aug-2015)
/tmp/lustre-mdt1: catastrophic mode - not reading inode or group bitmaps
stop mds service on node1.domain
Stopping /mnt/lustre-mds1 (opts:-f) on node1.domain
debugfs 1.42.13.wc3 (28-Aug-2015)
/tmp/lustre-mdt1: catastrophic mode - not reading inode or group bitmaps
start mds service on node1.domain
Starting mds1: -o loop /tmp/lustre-mdt1 /mnt/lustre-mds1
Started lustre-MDT0000
start ost1 service on node1.domain
Starting ost1: -o loop /tmp/lustre-ost1 /mnt/lustre-ost1
Started lustre-OST0000
mount lustre on /mnt/lustre.....
Starting client: node1.domain: -o user_xattr,flock node1.domain@tcp:/lustre /mnt/lustre
umount lustre on /mnt/lustre.....
Stopping client node1.domain /mnt/lustre (opts:)
stop ost1 service on node1.domain
Stopping /mnt/lustre-ost1 (opts:-f) on node1.domain
stop mds service on node1.domain
Stopping /mnt/lustre-mds1 (opts:-f) on node1.domain
modules unloaded.
Stopping clients: node1.domain /mnt/lustre (opts:)
Stopping clients: node1.domain /mnt/lustre2 (opts:)
Loading modules from /root/hpdd/lustre-wc/lustre/tests/..
detected 1 online CPUs by sysfs
libcfs will create CPU partition based on online CPUs
debug=-1
subsystem_debug=all
../lnet/lnet/lnet options: 'networks=tcp0(eth1) accept=all'
gss/krb5 is not supported
quota/lquota options: 'hash_lqs_cur_bits=3'
Formatting mgs, mds, osts
Format mds1: /tmp/lustre-mdt1
Format ost1: /tmp/lustre-ost1
Format ost2: /tmp/lustre-ost2
Resetting fail_loc on all nodes...done.
22:05:36 (1475944536) waiting for node1.domain network 5 secs ...
22:05:36 (1475944536) network interface is UP
PASS 88a (150s)
Stopping clients: node1.domain /mnt/lustre (opts:)
Stopping clients: node1.domain /mnt/lustre2 (opts:)
Loading modules from /root/hpdd/lustre-wc/lustre/tests/..
detected 1 online CPUs by sysfs
libcfs will create CPU partition based on online CPUs
debug=-1
subsystem_debug=all
gss/krb5 is not supported
Formatting mgs, mds, osts
Format mds1: /tmp/lustre-mdt1
Format ost1: /tmp/lustre-ost1
Format ost2: /tmp/lustre-ost2
== conf-sanity test complete, duration 198 sec == 22:05:39 (1475944539)

Test result on local: 88b
== conf-sanity test 88b: test lctl clear_conf one config == 22:07:00 (1475944620)
Stopping clients: node1.domain /mnt/lustre (opts:)
Stopping clients: node1.domain /mnt/lustre2 (opts:)
Loading modules from /root/hpdd/lustre-wc/lustre/tests/..
detected 1 online CPUs by sysfs
libcfs will create CPU partition based on online CPUs
debug=-1
subsystem_debug=all
../lnet/lnet/lnet options: 'networks=tcp0(eth1) accept=all'
gss/krb5 is not supported
quota/lquota options: 'hash_lqs_cur_bits=3'
Formatting mgs, mds, osts
Format mds1: /tmp/lustre-mdt1
Format ost1: /tmp/lustre-ost1
Format ost2: /tmp/lustre-ost2
start mds service on node1.domain
Starting mds1: -o loop /tmp/lustre-mdt1 /mnt/lustre-mds1
Commit the device label on /tmp/lustre-mdt1
Started lustre-MDT0000
start ost1 service on node1.domain
Starting ost1: -o loop /tmp/lustre-ost1 /mnt/lustre-ost1
Commit the device label on /tmp/lustre-ost1
Started lustre-OST0000
mount lustre on /mnt/lustre.....
Starting client: node1.domain: -o user_xattr,flock node1.domain@tcp:/lustre /mnt/lustre
Setting lustre-MDT0000.mdd.atime_diff from 60 to 62
Waiting 90 secs for update
Updated after 6s: wanted '62' got '62'
Setting lustre-MDT0000.mdd.atime_diff from 62 to 63
Waiting 90 secs for update
Updated after 7s: wanted '63' got '63'
Setting lustre.llite.max_read_ahead_mb from 27.13 to 32
Waiting 90 secs for update
Updated after 7s: wanted '32' got '32'
Setting lustre.llite.max_read_ahead_mb from 32 to 64
Waiting 90 secs for update
Updated after 6s: wanted '64' got '64'
Pool lustre.pool1 created
OST lustre-OST0000_UUID added to pool lustre.pool1
OST lustre-OST0000_UUID removed from pool lustre.pool1
OST lustre-OST0000_UUID added to pool lustre.pool1
umount lustre on /mnt/lustre.....
Stopping client node1.domain /mnt/lustre (opts:)
stop ost1 service on node1.domain
Stopping /mnt/lustre-ost1 (opts:-f) on node1.domain
stop mds service on node1.domain
Stopping /mnt/lustre-mds1 (opts:-f) on node1.domain
start mds service on node1.domain
Starting mds1: -o nosvc,loop /tmp/lustre-mdt1 /mnt/lustre-mds1
Start /tmp/lustre-mdt1 without service
Started lustre-MDT0000
debugfs 1.42.13.wc3 (28-Aug-2015)
/tmp/lustre-mdt1: catastrophic mode - not reading inode or group bitmaps
stop mds service on node1.domain
Stopping /mnt/lustre-mds1 (opts:-f) on node1.domain
debugfs 1.42.13.wc3 (28-Aug-2015)
/tmp/lustre-mdt1: catastrophic mode - not reading inode or group bitmaps
start mds service on node1.domain
Starting mds1: -o loop /tmp/lustre-mdt1 /mnt/lustre-mds1
Started lustre-MDT0000
start ost1 service on node1.domain
Starting ost1: -o loop /tmp/lustre-ost1 /mnt/lustre-ost1
Started lustre-OST0000
mount lustre on /mnt/lustre.....
Starting client: node1.domain: -o user_xattr,flock node1.domain@tcp:/lustre /mnt/lustre
umount lustre on /mnt/lustre.....
Stopping client node1.domain /mnt/lustre (opts:)
stop ost1 service on node1.domain
Stopping /mnt/lustre-ost1 (opts:-f) on node1.domain
stop mds service on node1.domain
Stopping /mnt/lustre-mds1 (opts:-f) on node1.domain
modules unloaded.
Stopping clients: node1.domain /mnt/lustre (opts:)
Stopping clients: node1.domain /mnt/lustre2 (opts:)
Loading modules from /root/hpdd/lustre-wc/lustre/tests/..
detected 1 online CPUs by sysfs
libcfs will create CPU partition based on online CPUs
debug=-1
subsystem_debug=all
../lnet/lnet/lnet options: 'networks=tcp0(eth1) accept=all'
gss/krb5 is not supported
quota/lquota options: 'hash_lqs_cur_bits=3'
Formatting mgs, mds, osts
Format mds1: /tmp/lustre-mdt1
Format ost1: /tmp/lustre-ost1
Format ost2: /tmp/lustre-ost2
Resetting fail_loc on all nodes...done.
22:09:30 (1475944770) waiting for node1.domain network 5 secs ...
22:09:30 (1475944770) network interface is UP
PASS 88b (151s)
Stopping clients: node1.domain /mnt/lustre (opts:)
Stopping clients: node1.domain /mnt/lustre2 (opts:)
Loading modules from /root/hpdd/lustre-wc/lustre/tests/..
detected 1 online CPUs by sysfs
libcfs will create CPU partition based on online CPUs
debug=-1
subsystem_debug=all
gss/krb5 is not supported
Formatting mgs, mds, osts
Format mds1: /tmp/lustre-mdt1
Format ost1: /tmp/lustre-ost1
Format ost2: /tmp/lustre-ost2
== conf-sanity test complete, duration 190 sec == 22:09:33 (1475944773)
| Comment by Andreas Dilger [ 08/Dec/17 ] |
|
Closing this as a duplicate of |