[LU-12387] l_tunedisk does not properly handle multipath devices Created: 04/Jun/19 Updated: 14/Jul/20 Resolved: 16/Jun/19 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.13.0, Lustre 2.12.3 |
| Fix Version/s: | Lustre 2.13.0, Lustre 2.12.3 |
| Type: | Bug | Priority: | Major |
| Reporter: | Chris Horn | Assignee: | Chris Horn |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Issue Links: |
|
||||||||||||||||||||
| Severity: | 3 | ||||||||||||||||||||
| Rank (Obsolete): | 9223372036854775807 | ||||||||||||||||||||
| Description |
|
I've identified two bugs or deficiencies in the l_tunedisk utility in Lustre 2.12/2.13 The first is a regression introduced by " The l_tunedisk path does provide sufficient information to the tune_block_dev() function for it to ascertain whether the device being tuned is in-fact an OST. The result is that tuning is not performed for any device. I believe the solution is, after we've determined that the device has been formatted for Lustre, we should read the "lustre_disk_data" (ldd) from the device. This information is stored in the "struct mount_opts" that is passed through to the underlying tuning functions. diff --git a/lustre/utils/l_tunedisk.c b/lustre/utils/l_tunedisk.c
index 3321fd1f0f..feecdeea77 100644
--- a/lustre/utils/l_tunedisk.c
+++ b/lustre/utils/l_tunedisk.c
@@ -45,6 +45,8 @@ int main(int argc, char *const argv[])
struct mount_opts mop = {
.mo_max_sectors_kb = -1
};
+ struct lustre_disk_data *ldd = &mop.mo_ldd;
+
char real_path[PATH_MAX] = {'\0'};
unsigned int mount_type;
static struct option long_opts[] = {
@@ -95,6 +97,13 @@ int main(int argc, char *const argv[])
if (ret == 0)
goto out;
+ ret = osd_read_ldd(mop.mo_source, ldd);
+ if (ret != 0) {
+ fprintf(stderr, "Failed to read previous Lustre data from %s "
+ "(%d)\n", mop.mo_source, ret);
+ goto out;
+ }
+
ret = osd_tune_lustre(mop.mo_source, &mop);
out:
The second issue has to do with how tunings are emplaced for slave devices when the device that was initially passed to l_tunedisk is a multipath device. static int tune_block_dev(const char *src, struct mount_opts *mop)
{
<snip>
/* If device is multipath device then tune its slave
* devices. */
rc = tune_block_dev_slaves(real_sys_path, mop);
Here "real_sys_path" is something like /sys/devices/virtual/block/dm-2 tune_block_dev_slaves() then looks in the slaves subdirectory. static int tune_block_dev_slaves(const char *sys_path, struct mount_opts *mop)
{
<snip>
snprintf(slaves_path, sizeof(slaves_path), "%s/slaves", sys_path);
slaves_dir = opendir(slaves_path);
e.g. # ls /sys/devices/virtual/block/dm-2/slaves sdc sdh sdm sdr # And it recursively calls tune_block_dev() on each symlink in that directory: while ((d = readdir(slaves_dir)) != NULL) {
char path[PATH_MAX];
int rc2;
if (d->d_type != DT_LNK)
continue;
snprintf(path, sizeof(path), "%s/%s", slaves_path, d->d_name);
rc2 = tune_block_dev(path, mop);
The problem is that tune_block_dev() performs a stat on its "src" argument which is supposed to be the device path that we are tuning. And if the stat does not indicate that the file is a block device then we return without performing any tuning: static int tune_block_dev(const char *src, struct mount_opts *mop)
{
<snip>
rc = stat(src, &st);
if (rc < 0) {
if (verbose)
fprintf(stderr, "warning: cannot stat '%s': %s\n",
src, strerror(errno));
return errno;
}
if (!S_ISBLK(st.st_mode))
return 0;
stat will resolve the symbolic link, however these links point at directories. Directories are not block devices to tune_block_dev() will not tune them. I'm not sure what is going to come out of LU-12029. |
| Comments |
| Comment by Chris Horn [ 05/Jun/19 ] |
|
It's not totally clear to me the best way to handle the symlink issue. I wonder if a simple fix would be to change the path for the slave device to: while ((d = readdir(slaves_dir)) != NULL) {
char path[PATH_MAX];
int rc2;
if (d->d_type != DT_LNK)
continue;
snprintf(path, sizeof(path), "/dev/%s", d->d_name);
rc2 = tune_block_dev(path, mop);
|
| Comment by Chris Horn [ 05/Jun/19 ] |
|
The change proposed above seems to do the trick for my testbed, but it'd be nice to have someone with expertise in this area weigh in. oss1:~ # for i in /sys/block/sd{m,c,r,h} /sys/block/dm-2; do echo 1280 > $i/queue/max_sectors_kb; done
oss1:~ # for i in /sys/block/sd{m,c,r,h} /sys/block/dm-2; do echo "$i $(cat $i/queue/max_sectors_kb)"; done
/sys/block/sdm 1280
/sys/block/sdc 1280
/sys/block/sdr 1280
/sys/block/sdh 1280
/sys/block/dm-2 1280
oss1:~ # /usr/sbin/l_tunedisk /dev/mapper/mpathc
l_tunedisk: increased '/sys/devices/virtual/block/dm-2/queue/max_sectors_kb' from 1280 to 16383
Tuning slaves for /sys/devices/virtual/block/dm-2
Checking for slaves at /sys/devices/virtual/block/dm-2/slaves
tuning slave /dev/sdm
l_tunedisk: increased '/sys/devices/pci0000:00/0000:00:03.2/0000:03:00.0/host0/port-0:0/expander-0:0/port-0:0:2/end_device-0:0:2/target0:0:2/0:0:2:2/block/sdm/queue/max_sectors_kb' from 1280 to 16383
Tuning slaves for /sys/devices/pci0000:00/0000:00:03.2/0000:03:00.0/host0/port-0:0/expander-0:0/port-0:0:2/end_device-0:0:2/target0:0:2/0:0:2:2/block/sdm
Checking for slaves at /sys/devices/pci0000:00/0000:00:03.2/0000:03:00.0/host0/port-0:0/expander-0:0/port-0:0:2/end_device-0:0:2/target0:0:2/0:0:2:2/block/sdm/slaves
tuning slave /dev/sdc
l_tunedisk: increased '/sys/devices/pci0000:00/0000:00:03.2/0000:03:00.0/host0/port-0:0/expander-0:0/port-0:0:0/end_device-0:0:0/target0:0:0/0:0:0:2/block/sdc/queue/max_sectors_kb' from 1280 to 16383
Tuning slaves for /sys/devices/pci0000:00/0000:00:03.2/0000:03:00.0/host0/port-0:0/expander-0:0/port-0:0:0/end_device-0:0:0/target0:0:0/0:0:0:2/block/sdc
Checking for slaves at /sys/devices/pci0000:00/0000:00:03.2/0000:03:00.0/host0/port-0:0/expander-0:0/port-0:0:0/end_device-0:0:0/target0:0:0/0:0:0:2/block/sdc/slaves
tuning slave /dev/sdr
l_tunedisk: increased '/sys/devices/pci0000:00/0000:00:03.2/0000:03:00.0/host0/port-0:0/expander-0:0/port-0:0:3/end_device-0:0:3/target0:0:3/0:0:3:2/block/sdr/queue/max_sectors_kb' from 1280 to 16383
Tuning slaves for /sys/devices/pci0000:00/0000:00:03.2/0000:03:00.0/host0/port-0:0/expander-0:0/port-0:0:3/end_device-0:0:3/target0:0:3/0:0:3:2/block/sdr
Checking for slaves at /sys/devices/pci0000:00/0000:00:03.2/0000:03:00.0/host0/port-0:0/expander-0:0/port-0:0:3/end_device-0:0:3/target0:0:3/0:0:3:2/block/sdr/slaves
tuning slave /dev/sdh
l_tunedisk: increased '/sys/devices/pci0000:00/0000:00:03.2/0000:03:00.0/host0/port-0:0/expander-0:0/port-0:0:1/end_device-0:0:1/target0:0:1/0:0:1:2/block/sdh/queue/max_sectors_kb' from 1280 to 16383
Tuning slaves for /sys/devices/pci0000:00/0000:00:03.2/0000:03:00.0/host0/port-0:0/expander-0:0/port-0:0:1/end_device-0:0:1/target0:0:1/0:0:1:2/block/sdh
Checking for slaves at /sys/devices/pci0000:00/0000:00:03.2/0000:03:00.0/host0/port-0:0/expander-0:0/port-0:0:1/end_device-0:0:1/target0:0:1/0:0:1:2/block/sdh/slaves
oss1:~ # for i in /sys/block/sd{m,c,r,h} /sys/block/dm-2; do echo "$i $(cat $i/queue/max_sectors_kb)"; done
/sys/block/sdm 16383
/sys/block/sdc 16383
/sys/block/sdr 16383
/sys/block/sdh 16383
/sys/block/dm-2 16383
oss1:~ #
|
| Comment by Gerrit Updater [ 05/Jun/19 ] |
|
Chris Horn (hornc@cray.com) uploaded a new patch: https://review.whamcloud.com/35065 |
| Comment by Gerrit Updater [ 05/Jun/19 ] |
|
Chris Horn (hornc@cray.com) uploaded a new patch: https://review.whamcloud.com/35066 |
| Comment by Gerrit Updater [ 06/Jun/19 ] |
|
Chris Horn (hornc@cray.com) uploaded a new patch: https://review.whamcloud.com/35081 |
| Comment by Gerrit Updater [ 16/Jun/19 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/35066/ |
| Comment by Gerrit Updater [ 16/Jun/19 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/35065/ |
| Comment by Gerrit Updater [ 16/Jun/19 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/35081/ |
| Comment by Peter Jones [ 16/Jun/19 ] |
|
Landed for 2.13 |
| Comment by Gerrit Updater [ 28/Jun/19 ] |
|
Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/35370 |
| Comment by Gerrit Updater [ 28/Jun/19 ] |
|
Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/35371 |
| Comment by Gerrit Updater [ 28/Jun/19 ] |
|
Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/35372 |
| Comment by Gerrit Updater [ 12/Jul/19 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/35370/ |
| Comment by Gerrit Updater [ 12/Jul/19 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/35371/ |
| Comment by Gerrit Updater [ 05/Sep/19 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/35372/ |