[LU-10760] MGS startup is happening asynchronously and taking a long time Created: 02/Mar/18  Updated: 17/Mar/18  Resolved: 17/Mar/18

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.10.4
Fix Version/s: None

Type: Bug Priority: Critical
Reporter: Brian Murrell (Inactive) Assignee: Nathaniel Clark
Resolution: Not a Bug Votes: 0
Labels: None

Attachments: File debug-lctl-0-1520607668.log.xz     File debug-lctl-0-1521033813.44.log.xz     File debug-lctl-1-1520607677.log     File debug-lctl-1519949571.log.xz     File lctl-0-1521033811.41.strace    
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

I have noticed that from 2.10.3 to b2_10 MGS startup is happening asynchronously and taking much longer.

In 2.10.3, immediately after (or within a very small number of seconds at most) the mount for the MGS returns it reports as being up:

# mount -t lustre zfs_pool_scsi0QEMU_QEMU_HARDDISK_disk13/MGS /mnt/MGS
# lctl dl:
  0 UP osd-zfs MGS-osd MGS-osd_UUID 4
  1 UP mgs MGS MGS 4
  2 UP mgc MGC10.14.83.84@tcp f2845b74-6eb5-07a5-1aaa-d7c54f76df5f 4



However the same operation on b2_10 is taking on the order of 10-11 seconds after the mount returns before lctl dl shows the MGS as being up.

I have attached debug-lctl-1519949571.log.xz which is a lustre debug log of the mount an lctl conf_param where it takes 10-11 seconds before the lctl conf_param returns with success instead of a -19.



 Comments   
Comment by Peter Jones [ 06/Mar/18 ]

Nathaniel

Can you please investigate?

Thanks

Peter

Comment by Nathaniel Clark [ 06/Mar/18 ]

Looking at the debug-lctl-1519949571.log.xz:

Thu Mar  1 19:07:59 2018 [17187] (mgs_handler.c:1010:mgs_iocontrol()) Process entered
Thu Mar  1 19:07:59 2018 [17187] (mgs_handler.c:1011:mgs_iocontrol()) handling ioctl cmd 0x400866bb
...
Thu Mar  1 19:07:59 2018 [17187] (mgs_llog.c:4832:mgs_set_conf_param()) Process entered
...
Thu Mar  1 19:07:59 2018 [17187] (mgs_llog.c:3609:mgs_write_log_param()) next param 'llite.max_cached_mb=16'
...
Thu Mar  1 19:07:59 2018 [17187] (mgs_llog.c:4941:mgs_set_conf_param()) Process leaving (rc=0 : 0 : 0)
Thu Mar  1 19:07:59 2018 [17187] (mgs_handler.c:1132:mgs_iocontrol()) Process leaving (rc=0 : 0 : 0)
Thu Mar  1 19:07:59 2018 [17187] (obd_class.h:1168:obd_iocontrol()) Process leaving (rc=0 : 0 : 0)

There are a couple other call chains like that, none take longer than a second to complete. Likewise MGS startup seems to start and end within a second

Thu Mar  1 19:06:14 2018 [12364] (obd_mount.c:1514:lustre_fill_super()) Process entered
Thu Mar  1 19:06:14 2018 [12364] (obd_mount.c:1516:lustre_fill_super()) VFS Op: sb ffff880078e06000
Thu Mar  1 19:06:14 2018 [12364] (obd_mount.c:574:lustre_init_lsi()) Process entered
Thu Mar  1 19:06:14 2018 [12364] (obd_mount.c:576:lustre_init_lsi()) kmalloced 'lsi': 984 at ffff880079cd7800.
Thu Mar  1 19:06:14 2018 [12364] (obd_mount.c:579:lustre_init_lsi()) kmalloced 'lsi->lsi_lmd': 104 at ffff880077d2b700.
Thu Mar  1 19:06:14 2018 [12364] (obd_mount.c:597:lustre_init_lsi()) Process leaving (rc=18446612134357727232 : -131939351824384 : ffff880079cd7800)
Thu Mar  1 19:06:14 2018 [12364] (obd_mount.c:1247:lmd_parse()) Process entered
Thu Mar  1 19:06:14 2018 [12364] (obd_mount.c:1265:lmd_parse()) kmalloced 'lmd->lmd_params': 4096 at ffff880049e40000.
Thu Mar  1 19:06:14 2018 [12364] (obd_mount.c:1107:lmd_parse_string()) kmalloced '*handle': 8 at ffff880078338d08.
Thu Mar  1 19:06:14 2018 [12364] (obd_mount.c:1107:lmd_parse_string()) kmalloced '*handle': 4 at ffff880078338cc8.
Thu Mar  1 19:06:14 2018 [12364] (obd_mount.c:1471:lmd_parse()) kmalloced 'lmd->lmd_dev': 43 at ffff880078dd4f40.
Thu Mar  1 19:06:14 2018 [12364] (obd_mount.c:931:lmd_print())   mount data:
Thu Mar  1 19:06:14 2018 [12364] (obd_mount.c:934:lmd_print()) device:  zfs_pool_scsi0QEMU_QEMU_HARDDISK_disk1/MGS
Thu Mar  1 19:06:14 2018 [12364] (obd_mount.c:935:lmd_print()) flags:   2200
...
Thu Mar  1 19:06:14 2018 [12364] (obd_mount.c:1578:lustre_fill_super()) Process leaving via out (rc=0 : 0 : 0x0)
Thu Mar  1 19:06:14 2018 [12364] (obd_mount.c:1585:lustre_fill_super()) Mount zfs_pool_scsi0QEMU_QEMU_HARDDISK_disk1/MGS complete

What conf_param are you setting that should return error?

Comment by Brian Murrell (Inactive) [ 09/Mar/18 ]

Please find attached a lustre debug log where I got:

# lctl conf_param testfs.llite.max_cached_mb=16
No device found for name MGS: Invalid argument
This command must be run on the MGS.
error: conf_param: No such device
# echo $?
19

Right after starting the MGS and where only a moment later the same command succeeded in this lustre debug log.

Comment by Nathaniel Clark [ 12/Mar/18 ]

Looking at debug-lctl-0-1520607668.log.xz:
Modules loaded:

Fri Mar  9 09:44:49 2018 [ 1153] (api-ni.c:1462:lnet_startup_lndni()) Added LNI 10.14.82.96@tcp [8/256/0/180]

MGS Mounts:

Fri Mar  9 09:54:25 2018 [12151] (obd_mount.c:1514:lustre_fill_super()) Process entered
Fri Mar  9 09:54:25 2018 [12151] (obd_mount.c:1516:lustre_fill_super()) VFS Op: sb ffff88007a2ec000
...
Fri Mar  9 09:54:25 2018 [12151] (obd_mount.c:1578:lustre_fill_super()) Process leaving via out (rc=0 : 0 : 0x0)
Fri Mar  9 09:54:25 2018 [12151] (obd_mount.c:1585:lustre_fill_super()) Mount zfs_pool_scsi0QEMU_QEMU_HARDDISK_disk5/MGS complete

MDS Mounts:

Fri Mar  9 09:54:29 2018 [12419] (obd_mount.c:1514:lustre_fill_super()) Process entered
Fri Mar  9 09:54:29 2018 [12419] (obd_mount.c:1516:lustre_fill_super()) VFS Op: sb ffff88007acf7800
...
Fri Mar  9 09:54:31 2018 [12419] (obd_mount.c:1578:lustre_fill_super()) Process leaving via out (rc=0 : 0 : 0x0)
Fri Mar  9 09:54:31 2018 [12419] (obd_mount.c:1585:lustre_fill_super()) Mount zfs_pool_scsi0QEMU_QEMU_HARDDISK_disk3/testfs-MDT0000 complete

OST Mounts:

Fri Mar  9 09:55:15 2018 [14657] (obd_mount.c:1514:lustre_fill_super()) Process entered
Fri Mar  9 09:55:15 2018 [14657] (obd_mount.c:1516:lustre_fill_super()) VFS Op: sb ffff88007b29e000
...
Fri Mar  9 09:55:15 2018 [14657] (obd_mount_server.c:1757:osd_start()) Attempting to start testfs-OST0000, type=osd-ldiskfs, lsifl=200062, mountfl=0
...
Fri Mar  9 09:55:15 2018 [14657] (obd_mount.c:1578:lustre_fill_super()) Process leaving via out (rc=0 : 0 : 0x0)
Fri Mar  9 09:55:15 2018 [14657] (obd_mount.c:1585:lustre_fill_super()) Mount /dev/sde complete

# lctl conf_param testfs.llite.max_cached_mb=16 runs successfully

Fri Mar  9 09:55:58 2018 [16654] (mgs_handler.c:1010:mgs_iocontrol()) Process entered
Fri Mar  9 09:55:58 2018 [16654] (mgs_handler.c:1011:mgs_iocontrol()) handling ioctl cmd 0x400866bb
...
Fri Mar  9 09:55:58 2018 [16654] (mgs_llog.c:4911:mgs_set_conf_param()) set_conf_param fs='testfs' device='testfs' param='llite.max_cached_mb=16'
...
Fri Mar  9 09:55:58 2018 [16654] (linux-module.c:235:obd_class_release()) Process entered
Fri Mar  9 09:55:58 2018 [16654] (linux-module.c:238:obd_class_release()) Process leaving (rc=0 : 0 : 0)

The failing lctl doesn't seem to show up in the debug log. Can you run the initial (failing) lctl with strace -fytt?

Comment by Brian Murrell (Inactive) [ 14/Mar/18 ]

Can you run the initial (failing) lctl with strace -fytt?

Please find attached the strace output with the corresponding debug log.

FWIW, the output of lctl dl immediately after the failed:

# lctl conf_param testfs.llite.max_cached_mb=16

No device found for name MGS: Invalid argument
This command must be run on the MGS.
error: conf_param: No such device

looked like this:

# lctl dl
# echo $?
# 0
Comment by Nathaniel Clark [ 14/Mar/18 ]

Here is the ioctl call via strace:

24654 06:23:31.438314 ioctl(3</dev/obd>, _IOC(_IOC_READ|_IOC_WRITE, 0x66, 0x7f, 0x08), 0x7fffef9492a0) = -1 EINVAL (Invalid argument)

Here is the corresponding debug log:

Wed Mar 14 09:23:31 2018 [24654] (linux-module.c:226:obd_class_open()) Process entered
Wed Mar 14 09:23:31 2018 [24654] (linux-module.c:229:obd_class_open()) Process leaving (rc=0 : 0 : 0)
Wed Mar 14 09:23:31 2018 [24654] (linux-module.c:246:obd_class_ioctl()) Process entered
Wed Mar 14 09:23:31 2018 [24654] (class_obd.c:222:class_handle_ioctl()) Process entered
Wed Mar 14 09:23:31 2018 [24654] (class_obd.c:232:class_handle_ioctl()) cmd = c008667f
Wed Mar 14 09:23:31 2018 [24654] (linux-module.c:155:obd_ioctl_getdata()) Process entered
Wed Mar 14 09:23:31 2018 [24654] (linux-module.c:181:obd_ioctl_getdata()) kmalloced '*buf': 584 at ffff880043251400.
Wed Mar 14 09:23:31 2018 [24654] (linux-module.c:219:obd_ioctl_getdata()) Process leaving (rc=0 : 0 : 0)
Wed Mar 14 09:23:31 2018 [24654] (class_obd.c:191:class_resolve_dev_name()) Process entered
Wed Mar 14 09:23:31 2018 [24654] (class_obd.c:201:class_resolve_dev_name()) device name MGS
Wed Mar 14 09:23:31 2018 [24654] (class_obd.c:204:class_resolve_dev_name()) No device for name MGS!
Wed Mar 14 09:23:31 2018 [24654] (class_obd.c:205:class_resolve_dev_name()) Process leaving via out (rc=18446744073709551594 : -22 : 0xffffffffffffffea)
Wed Mar 14 09:23:31 2018 [24654] (class_obd.c:212:class_resolve_dev_name()) Process leaving (rc=18446744073709551594 : -22 : ffffffffffffffea)
Wed Mar 14 09:23:31 2018 [24654] (class_obd.c:299:class_handle_ioctl()) Process leaving via out (rc=18446744073709551594 : -22 : 0xffffffffffffffea)
Wed Mar 14 09:23:31 2018 [24654] (class_obd.c:426:class_handle_ioctl()) kfreed 'buf': 584 at ffff880043251400.
Wed Mar 14 09:23:31 2018 [24654] (class_obd.c:427:class_handle_ioctl()) Process leaving (rc=18446744073709551594 : -22 : ffffffffffffffea)
Wed Mar 14 09:23:31 2018 [24654] (linux-module.c:256:obd_class_ioctl()) Process leaving (rc=18446744073709551594 : -22 : ffffffffffffffea)
Comment by Nathaniel Clark [ 15/Mar/18 ]

Now that I know what to look for I've found it in previous debug-lctl-0-1520607668.log.xz :

Fri Mar  9 09:54:25 2018 [12151] (obd_config.c:372:class_attach()) Process entered
Fri Mar  9 09:54:25 2018 [12151] (genops.c:309:class_newdev()) Process entered
...
Fri Mar  9 09:54:25 2018 [12151] (genops.c:140:class_get_type()) Loaded module 'mgs'
Fri Mar  9 09:54:25 2018 [12151] (genops.c:373:class_newdev()) Allocate new device MGS (ffff880049e88f78)
...
Fri Mar  9 09:54:25 2018 [12151] (genops.c:505:class_register_device()) Process leaving (rc=0 : 0 : 0)
Fri Mar  9 09:54:25 2018 [12151] (obd_config.c:430:class_attach()) OBD: dev 1 attached type mgs with refcount 1
Fri Mar  9 09:54:25 2018 [12151] (obd_config.c:432:class_attach()) Process leaving (rc=0 : 0 : 0)

But then ioctl can't find it:

Fri Mar  9 10:01:08 2018 [25094] (class_obd.c:191:class_resolve_dev_name()) Process entered
Fri Mar  9 10:01:08 2018 [25094] (class_obd.c:201:class_resolve_dev_name()) device name MGS
Fri Mar  9 10:01:08 2018 [25094] (class_obd.c:204:class_resolve_dev_name()) No device for name MGS!
Fri Mar  9 10:01:08 2018 [25094] (class_obd.c:205:class_resolve_dev_name()) Process leaving via out (rc=18446744073709551594 : -22 : 0xffffffffffffffea)
Fri Mar  9 10:01:08 2018 [25094] (class_obd.c:212:class_resolve_dev_name()) Process leaving (rc=18446744073709551594 : -22 : ffffffffffffffea)
Comment by Nathaniel Clark [ 15/Mar/18 ]

Re-looking at debug-lctl-0-1520607668.log.xz :

Fri Mar  9 10:00:35 2018 [24022] (obd_mount_server.c:1618:server_put_super()) server umount MGS complete

The MGS is unmounting prior to you running conf param.

Same with debug-lctl-0-1521033813.44.log.xz:

Wed Mar 14 09:16:44 2018 [12129] (obd_mount_server.c:236:server_start_mgs()) Start MGS service MGS
...
Wed Mar 14 09:23:02 2018 [23609] (obd_mount_server.c:1618:server_put_super()) server umount MGS complete
...
Wed Mar 14 09:23:31 2018 [24654] (class_obd.c:204:class_resolve_dev_name()) No device for name MGS!
Comment by Brian Murrell (Inactive) [ 15/Mar/18 ]

utopiabound: Just want to say be careful.  You might have realised this already, but just in case you haven't, there could be multiple MGS start/stop operations before the one where the subsequent lctl conf_parm ioctl can't find it.  As in the case above, the almost 7 minutes between the MGS start and the No device for name MGS seems much too long.  The delay I am seeing seems to be on the order of about 10 seconds, not many minutes.

Comment by Brian Murrell (Inactive) [ 17/Mar/18 ]

This turned out to be a subtle difference in the way Pacemaker reports resource status in RHEL 7.5 which was causing IML to report a device as started before it actually was.

Generated at Sat Feb 10 02:37:57 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.