Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor

    Description

      The Lustre Manual has a section, "Tuning Linux Storage Devices", with suggested tuning for testing. All of the settings have suggested values except /sys/block/sdN/queue/scheduler. It would be nice to have a suggestion there.

      I think we are likely still using the Linux default (probably CFQ) everywhere at LLNL, and that may be a problem. I remember a recent discussion at LUG that suggested that was bad. ZFS certainly attempts to change the scheduler off of CFQ to noop (if ZFS believes that it owns the entire disk).

      For ldiskfs, perhaps the deadline scheduler is what we should recommend?
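For reference, the sysfs knob in question can be read and written as sketched below. This is an illustration only: sdN is a placeholder device name, the helper function is made up, and writing the file requires root on a real system.

```shell
# Sketch: reading the active scheduler out of the sysfs format.
# Reading /sys/block/sdN/queue/scheduler lists every available scheduler
# with the active one in square brackets, e.g. "noop deadline [cfq]".
active_scheduler() {
    # Pull out the bracketed token.
    echo "$1" | sed 's/.*\[\(.*\)\].*/\1/'
}

active_scheduler "noop deadline [cfq]"   # prints: cfq

# Switching is a plain write of the bare scheduler name (root required):
#   echo deadline > /sys/block/sdN/queue/scheduler
```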

      Attachments

        Activity

          [LUDOC-109] Missing block scheduler tuning suggestion

          linda Linda Bebernes (Inactive) added a comment - Change has been approved and merged. Resolving ticket.
          linda Linda Bebernes (Inactive) added a comment - edited

          Added note about scheduler default (deadline) and recommendations (deadline, noop). Patch is ready for review at http://review.whamcloud.com/#change,6486.


          adilger Andreas Dilger added a comment - Note that LU-2498 has a patch (http://review.whamcloud.com/4853) to automatically change the default scheduler for Lustre block devices from CFQ to deadline, unless it is already set to noop. This behavior should also be documented.
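The policy that patch describes could be sketched roughly as follows. This is a hedged illustration only, not the actual LU-2498 code; choose_scheduler is a made-up name, and its input is assumed to be the contents of /sys/block/<dev>/queue/scheduler with the active scheduler in brackets.

```shell
# Rough sketch of the policy described above (not the actual LU-2498 code).
# Given the current scheduler line, decide which scheduler should be active:
# keep noop if it is already selected, otherwise switch to deadline.
choose_scheduler() {
    case "$1" in
        *"[noop]"*)     echo noop ;;      # already noop: leave it alone
        *"[deadline]"*) echo deadline ;;  # already deadline: nothing to change
        *)              echo deadline ;;  # anything else (e.g. CFQ): switch to deadline
    esac
}

choose_scheduler "noop [cfq] deadline"   # prints: deadline
```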

          jlevi Jodi Levi (Inactive) added a comment - Brett, would you mind taking a look at this and seeing whether it is something you could work on with Linda as part of the Lustre Manual project?

          adilger Andreas Dilger added a comment - IIRC, while ZFS allocates the IO in order, there is some jitter in the processing times of the IO requests between threads, and this causes slightly out-of-order IO submission to the queue. At least I recall Brian (or someone) commenting about the slightly non-linear IO ordering from ZFS at the disk level. That's why I suggest deadline over noop, since it isn't guaranteed that only front/back merging is enough.

          morrone Christopher Morrone (Inactive) added a comment -

          > I thought it was old and established knowledge that CFQ sucks for high-performance workloads

          Well, that common knowledge appears to have been missed both in the documentation and at LLNL as a whole.

          > but I think it needs to be done internally by ZFS for its constituent block devices

          That was the intent with ZFS, but apparently Brian was worried about setting the device's scheduler unilaterally, since the drive might be shared with other filesystems in other partitions. But they are talking that out right now in the hallway.

          Brian tells me that even the noop scheduler does front/back merging. He might have said that the merging happens at a layer before the scheduling, or something along those lines. That isn't to say that deadline might not help too, but at least we should get merging even with noop. And in theory ZFS's scheduler will make things easy to merge. We need to verify that theory with block traces, though.

          People

            cliffw Cliff White (Inactive)
            morrone Christopher Morrone (Inactive)
            Votes: 0
            Watchers: 5
