Details
Question/Request
Resolution: Unresolved
Minor
None
Lustre 2.4.3
13557
Description
After observing substantial read I/Os on our systems during an OST umount, I took a look at exactly what was causing them. The root cause turns out to be that ldlm_cancel_locks_for_export() calls ofd_lvbo_update() to update the LVB from disk for every lock as it's canceled. When there are millions of locks on the server, this translates into a huge amount of I/O.
After reading through the code, it's not at all clear to me why this is done. How could the LVB be out of date? Why is this update required before the lock can be canceled?
[<ffffffffa031c9dc>] cv_wait_common+0x8c/0x100 [spl]
[<ffffffffa031ca68>] __cv_wait_io+0x18/0x20 [spl]
[<ffffffffa046353b>] zio_wait+0xfb/0x1b0 [zfs]
[<ffffffffa03d16bd>] dbuf_read+0x3fd/0x740 [zfs]
[<ffffffffa03d1b89>] __dbuf_hold_impl+0x189/0x480 [zfs]
[<ffffffffa03d1f06>] dbuf_hold_impl+0x86/0xc0 [zfs]
[<ffffffffa03d2f80>] dbuf_hold+0x20/0x30 [zfs]
[<ffffffffa03d9767>] dmu_buf_hold+0x97/0x1d0 [zfs]
[<ffffffffa042de8f>] zap_get_leaf_byblk+0x4f/0x2a0 [zfs]
[<ffffffffa042e14a>] zap_deref_leaf+0x6a/0x80 [zfs]
[<ffffffffa042e510>] fzap_lookup+0x60/0x120 [zfs]
[<ffffffffa0433f11>] zap_lookup_norm+0xe1/0x190 [zfs]
[<ffffffffa0434053>] zap_lookup+0x33/0x40 [zfs]
[<ffffffffa0cf0710>] osd_fid_lookup+0xb0/0x2e0 [osd_zfs]
[<ffffffffa0cea311>] osd_object_init+0x1a1/0x6d0 [osd_zfs]
[<ffffffffa06efc9d>] lu_object_alloc+0xcd/0x300 [obdclass]
[<ffffffffa06f0805>] lu_object_find_at+0x205/0x360 [obdclass]
[<ffffffffa06f0976>] lu_object_find+0x16/0x20 [obdclass]
[<ffffffffa0d80575>] ofd_object_find+0x35/0xf0 [ofd]
[<ffffffffa0d90486>] ofd_lvbo_update+0x366/0xdac [ofd]
[<ffffffffa0831828>] ldlm_cancel_locks_for_export_cb+0x88/0x200 [ptlrpc]
[<ffffffffa059178f>] cfs_hash_for_each_relax+0x17f/0x360 [libcfs]
[<ffffffffa0592fde>] cfs_hash_for_each_empty+0xfe/0x1e0 [libcfs]
[<ffffffffa082c05f>] ldlm_cancel_locks_for_export+0x2f/0x40 [ptlrpc]
[<ffffffffa083b804>] server_disconnect_export+0x64/0x1a0 [ptlrpc]
[<ffffffffa0d717fa>] ofd_obd_disconnect+0x6a/0x1f0 [ofd]
[<ffffffffa06b5d77>] class_disconnect_export_list+0x337/0x660 [obdclass]
[<ffffffffa06b6496>] class_disconnect_exports+0x116/0x2f0 [obdclass]
[<ffffffffa06de9cf>] class_cleanup+0x16f/0xda0 [obdclass]
[<ffffffffa06e06bc>] class_process_config+0x10bc/0x1c80 [obdclass]
[<ffffffffa06e13f9>] class_manual_cleanup+0x179/0x6f0 [obdclass]
[<ffffffffa071615c>] server_put_super+0x5bc/0xf00 [obdclass]
[<ffffffff8118461b>] generic_shutdown_super+0x5b/0xe0
[<ffffffff81184706>] kill_anon_super+0x16/0x60
[<ffffffffa06e3256>] lustre_kill_super+0x36/0x60 [obdclass]
[<ffffffff81184ea7>] deactivate_super+0x57/0x80
[<ffffffff811a2d2f>] mntput_no_expire+0xbf/0x110
[<ffffffff811a379b>] sys_umount+0x7b/0x3a0
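To make the scaling concern concrete, here is a toy model of the call pattern described above. It is not Lustre code: toy_lock, toy_lvb_update() and toy_cancel_all() are hypothetical stand-ins, and the model assumes that every per-lock update misses any cache and costs one disk read, as the trace suggests for the object lookup.

```c
#include <stddef.h>

/* Hypothetical stand-in for a server-side lock; not a Lustre structure. */
struct toy_lock {
    unsigned long resource_id;
};

/* Counts simulated disk reads issued by the model. */
static unsigned long toy_disk_reads;

/*
 * Models the per-lock LVB refresh. Assumption: the object backing the
 * lock is not cached, so each call costs one disk read.
 */
static void toy_lvb_update(const struct toy_lock *lock)
{
    (void)lock;
    toy_disk_reads++;
}

/*
 * Models canceling every lock held by an export: the update runs once
 * per lock, so the number of reads grows linearly with the lock count.
 */
static unsigned long toy_cancel_all(const struct toy_lock *locks, size_t n)
{
    toy_disk_reads = 0;
    for (size_t i = 0; i < n; i++)
        toy_lvb_update(&locks[i]);
    return toy_disk_reads;
}
```

Under these assumptions the read count equals the lock count, which is why millions of locks would turn an umount into millions of reads.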