[LU-5888] mount.lustre: set max_sectors_kb to 2147483647 Created: 09/Nov/14 Updated: 27/Apr/17 Resolved: 03/Aug/15 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.7.0 |
| Fix Version/s: | Lustre 2.8.0 |
| Type: | Bug | Priority: | Major |
| Reporter: | Maloo | Assignee: | Dmitry Eremin (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | mq115 |
| Severity: | 3 |
| Rank (Obsolete): | 16462 |
| Description |
|
This issue was created by maloo for Andreas Dilger <andreas.dilger@intel.com>.
I noticed that the automatic tuning of max_sectors_kb by mount.lustre is going badly at mount time:
mount.lustre: set /sys/block/dm-0/queue/max_sectors_kb to 2147483647
This looks like an overflow, but even if it is not, we don't need to set this higher than PTLRPC_MAX_BRW_SIZE, or maybe 16MB or 32MB if that #define is not available in userspace.
This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/169fa0cc-6627-11e4-8c86-5254006e85c2. |
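The cap suggested in the description can be sketched in C roughly as follows. This is only an illustration and not the code from the patches on this ticket: the function name `pick_max_sectors_kb`, the macro `MAX_SECTORS_KB_CAP`, the choice of 32MB (one of the fallback values mentioned in the description), and the policy of never lowering an existing setting are all assumptions. The hardware limit is assumed to come from the `max_hw_sectors_kb` queue attribute.

```c
/* Hypothetical cap: 32MB expressed in KB. The description suggests
 * PTLRPC_MAX_BRW_SIZE where available, else 16MB or 32MB. */
#define MAX_SECTORS_KB_CAP (32ULL * 1024ULL)

/*
 * Choose the value to write to max_sectors_kb.
 *   cur_kb: value currently in max_sectors_kb
 *   hw_kb:  limit reported by the device (max_hw_sectors_kb)
 * Returns min(hw_kb, cap), but never less than cur_kb, so an
 * existing setting is only ever increased (an assumed policy).
 */
static unsigned long long pick_max_sectors_kb(unsigned long long cur_kb,
                                              unsigned long long hw_kb)
{
    unsigned long long target = hw_kb;

    if (target > MAX_SECTORS_KB_CAP)
        target = MAX_SECTORS_KB_CAP;
    if (target < cur_kb)
        target = cur_kb;
    return target;
}
```

With the value reported in this ticket, `pick_max_sectors_kb(1024, 2147483647ULL)` returns 32768 rather than passing the huge limit through unchanged.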
| Comments |
| Comment by Andreas Dilger [ 14/Nov/14 ] |
| Comment by Gerrit Updater [ 26/Nov/14 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/12723/ |
| Comment by Gerrit Updater [ 04/Dec/14 ] |
|
James Simmons (uja.ornl@gmail.com) uploaded a new patch: http://review.whamcloud.com/12940 |
| Comment by Gerrit Updater [ 05/Jan/15 ] |
|
Andreas Dilger (andreas.dilger@intel.com) uploaded a new patch: http://review.whamcloud.com/13240 |
| Comment by Andreas Dilger [ 05/Jan/15 ] |
|
Seems my first patch didn't completely fix the problem for newer kernels that allow a huge max_sectors_kb. I've pushed a second patch to resolve the remaining problem. |
| Comment by Andreas Dilger [ 03/Feb/15 ] |
|
Strangely, the new patch fails with a BUG in the DM layer now that it is actually limiting the max request size to 32MB. For example, in the autotest and MDS console log of https://testing.hpdd.intel.com/test_sets/05b9a226-a6c8-11e4-ad11-5254006e85c2:
07:19:32:CMD: onyx-33vm3 mkdir -p /mnt/mds1; mount -t lustre /dev/lvm-Role_MDS/P1 /mnt/mds1
07:19:33:onyx-33vm3: mount.lustre: increased /sys/block/dm-0/queue/max_sectors_kb from 1024 to 32768
07:19:33:onyx-33vm3: mount.lustre: increased /sys/block/vda/queue/max_sectors_kb from 1024 to 32768
07:19:14:LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. quota=on. Opts:
07:19:14:BUG: unable to handle kernel NULL pointer dereference at (null)
07:19:14:IP: [<ffffffff811ca1b0>] mpage_end_io_read+0x30/0x90
07:19:14:Oops: 0002 [#1] SMP
07:19:14:Pid: 2832, comm: modprobe Not tainted 2.6.32-431.29.2.el6_lustre.gffd1fc2.x86_64 #1 Red Hat KVM
07:19:14:RIP: 0010:[<ffffffff811ca1b0>] [<ffffffff811ca1b0>] mpage_end_io_read+0x30/0x90
07:19:15:Process modprobe (pid: 2832, threadinfo ffff88005916a000, task ffff880079ae554
07:19:16:Call Trace:
07:19:16: <IRQ>
07:19:16: [<ffffffff811c314d>] bio_endio+0x1d/0x40
07:19:16: [<ffffffff812661fb>] req_bio_endio+0x9b/0xe0
07:19:16: [<ffffffff81267777>] blk_update_request+0x117/0x490
07:19:16: [<ffffffff81267b17>] blk_update_bidi_request+0x27/0xa0
07:19:16: [<ffffffff812698be>] __blk_end_request_all+0x2e/0x60
07:19:16: [<ffffffffa005a22a>] blk_done+0x4a/0x110 [virtio_blk]
07:19:16: [<ffffffffa004e2ac>] vring_interrupt+0x3c/0xd0 [virtio_ring]
07:19:16: [<ffffffff810e7090>] handle_IRQ_event+0x60/0x170
07:19:16: [<ffffffff810e99ee>] handle_edge_irq+0xde/0x180
07:19:16: [<ffffffff8100faf9>] handle_irq+0x49/0xa0
07:19:16: [<ffffffff81532dbc>] do_IRQ+0x6c/0xf0
07:19:16: [<ffffffff8100b9d3>] ret_from_intr+0x0/0x11
07:19:16: <EOI>
07:19:16: [<ffffffff811c4f3e>] bio_alloc_bioset+0x3e/0xf0
07:19:17: [<ffffffff811c5095>] bio_alloc+0x15/0x30
07:19:17: [<ffffffff811ca065>] mpage_alloc+0x35/0xa0
07:19:17: [<ffffffff811ca5de>] do_mpage_readpage+0x35e/0x5f0
07:19:17: [<ffffffff811ca9c9>] mpage_readpages+0xe9/0x130
07:19:18: [<ffffffffa0089ccd>] ext3_readpages+0x1d/0x20 [ext3]
07:19:18: [<ffffffff81135795>] __do_page_cache_readahead+0x185/0x210
07:19:18: [<ffffffff81135841>] ra_submit+0x21/0x30
07:19:18: [<ffffffff81135bb5>] ondemand_readahead+0x115/0x240
07:19:18: [<ffffffff81135dd3>] page_cache_sync_readahead+0x33/0x50
07:19:18: [<ffffffff81121868>] generic_file_aio_read+0x558/0x700
07:19:19: [<ffffffff811890ba>] do_sync_read+0xfa/0x140
07:19:19: [<ffffffff81189a75>] vfs_read+0xb5/0x1a0
07:19:19: [<ffffffff81189bb1>] sys_read+0x51/0x90 |
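The "increased ... from 1024 to 32768" messages in the log refer to per-device sysfs attributes of the form /sys/block/<dev>/queue/max_sectors_kb. As a rough sketch of how a userspace tool might build such a path and parse the value it reads back while rejecting out-of-range input, consider the helpers below. Both function names (`queue_attr_path`, `parse_kb`) are hypothetical and this is not taken from mount.lustre itself.

```c
#include <ctype.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Build "/sys/block/<dev>/queue/max_sectors_kb" into buf.
 * Returns 0 on success, -1 if the path did not fit. */
static int queue_attr_path(char *buf, size_t len, const char *dev)
{
    int n = snprintf(buf, len, "/sys/block/%s/queue/max_sectors_kb", dev);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Parse a decimal KB value as sysfs prints it: digits, then an
 * optional trailing newline. Returns 0 on success, -1 on bad or
 * out-of-range input (so a wrapped or negative value is rejected). */
static int parse_kb(const char *text, unsigned long long *out)
{
    char *end;
    unsigned long long value;

    if (!isdigit((unsigned char)text[0]))
        return -1;              /* rejects "", "-1", leading spaces */

    errno = 0;
    value = strtoull(text, &end, 10);
    if (errno == ERANGE)
        return -1;              /* does not fit in unsigned long long */
    if (*end != '\0' && *end != '\n')
        return -1;              /* trailing garbage */

    *out = value;
    return 0;
}
```

Parsing into an unsigned 64-bit type and checking `ERANGE` means a very large limit reported by a newer kernel is either represented exactly or rejected, instead of being silently truncated.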
| Comment by Peter Jones [ 09/Jul/15 ] |
|
Dmitry, Andreas is too busy to complete the remaining work on this. Could you please take this to completion? Thanks, Peter |
| Comment by Gerrit Updater [ 03/Aug/15 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/13240/ |
| Comment by Dmitry Eremin (Inactive) [ 03/Aug/15 ] |
|
Landed to master. |