Details
-
Bug
-
Resolution: Duplicate
-
Medium
-
None
-
Lustre 2.17.0, Lustre 2.18.0
-
None
-
3
-
9223372036854775807
Description
I was testing out a workload that was reported in GitHub:
# python Python 3.9.17 (main, Aug 9 2023, 00:00:00) [GCC 11.4.1 20230605 (Red Hat 11.4.1-2)] on linux Type "help", "copyright", "credits" or "license" for more information. >>> import numpy as np >>> data = np.random.random((6000, 500, 40)) >>> np.save("/mnt/test.npy", data)
on a recent build (2.17.50-85-gc606081, Ubuntu 6.8.0-88-generic x86_64 kernel).
This is trying to do a single 3GiB write() system call. In the GitHub issue, strace is showing a fault returned to the caller:
write(4, "\360\313"..., 3072000000) = -1 EFAULT (Bad address)
In my testing (possibly affected by kernel or Lustre version), rather than returning an error my python3 process hung in IO (unkillable idle sleep) after having written out about 1GiB of memory:
# pgrep python 627533 # cat /proc/627533/stack osc_extent_wait+0x44f/0x640 [osc] osc_cache_wait_range+0x12f/0xa70 [osc] osc_io_fsync_end+0x7c/0xd0 [osc] cl_io_end+0x7b/0x190 [obdclass] lov_io_end_wrapper+0xe5/0xf0 [lov] lov_io_fsync_end+0x81/0x1e0 [lov] cl_io_end+0x7b/0x190 [obdclass] cl_io_loop+0xf2/0x240 [obdclass] cl_sync_file_range+0x2b0/0x370 [lustre] ll_writepages+0x1e3/0x2a0 [lustre] do_writepages+0xd0/0x1f0 filemap_fdatawrite_wbc+0x75/0xb0 __filemap_fdatawrite_range+0x58/0x80 filemap_write_and_wait_range+0x59/0xc0 direct_write_fallback+0x4c/0xe0 __generic_file_write_iter+0xa4/0xd0 vvp_io_write_start+0x22d/0x10b0 [lustre] cl_io_start+0x87/0x190 [obdclass] cl_io_loop+0xe5/0x240 [obdclass] ll_file_io_generic+0x666/0x1300 [lustre] do_file_write_iter+0x7f1/0xe10 [lustre] ll_file_write_iter+0x4b/0x110 [lustre] vfs_write+0x2a8/0x480
GDB shows this to be at wait_event_idle() in osc_extent_wait():
# gdb lustre/osc/osc.ko (gdb) list *(osc_extent_wait+0x44f) 0x3019f is in osc_extent_wait (/usr/src/lustre-head/lustre/osc/osc_cache.c:994). 989 if (rc == 0) { 990 OSC_EXTENT_DUMP(D_ERROR, ext, 991 "%s: wait ext to %u timedout, recovery in progress?\n", 992 cli_name(osc_cli(obj)), state); 993 994 wait_event_idle(ext->oe_waitq, 995 smp_load_acquire(&ext->oe_state) == state); 996 } 997 if (ext->oe_rc < 0) 998 rc = ext->oe_rc;
and indeed this error message was printed on the client console:
LustreError: 627533:0:(osc_cache.c:990:osc_extent_wait()) extent 000000000fdab39e@{[228352 -> 229375/229375],
[3|1|-|active|wiuY|00000000d732b228], [4218880|1024|+|-|00000000ab2def22|1024|0000000000000000]}
myth-OST0002-osc-ffff89deea9bf000: wait ext to 0 timedout, recovery in progress?
LustreError: 627533:0:(osc_cache.c:990:osc_extent_wait()) ### extent: 000000000fdab39e ns: myth-OST0002-osc-ffff89deea9bf000
lock: 00000000ab2def22/0xd01e87e953e26bd2 lrc: 4/0,1 mode: PW/PW res: [0x2c0000400:0x3d757:0x0].0x0
rrc: 2 type: EXT [0->18446744073709551615] (req 0->4095) gid 0 flags: 0x800020000020000 nid: local
remote: 0x198a66a0ec5e0d26 expref: -99 pid: 627533 timeout: 0 lvb_type: 1 lru_score: 4 lru_type: 1
It isn't clear why the process is waiting, since the client and server are both responsive to all other commands (ls -l on the file, etc).