brd: module loaded
loop: module loaded
virtio_blk virtio2: [vda] 4762904 512-byte logical blocks (2.44 GB/2.27 GiB)
virtio_blk virtio3: [vdb] 57768 512-byte logical blocks (29.6 MB/28.2 MiB)
i8042: PNP: PS/2 Controller [PNP0303:KBD,PNP0f13:MOU] at 0x60,0x64 irq 1,12
serio: i8042 KBD port at 0x60,0x64 irq 1
serio: i8042 AUX port at 0x60,0x64 irq 12
mousedev: PS/2 mouse device common for all mice
device-mapper: uevent: version 1.0.3
input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input0
device-mapper: ioctl: 4.39.0-ioctl (2018-04-03) initialised: dm-devel@redhat.com
NET: Registered protocol family 17
Key type dns_resolver registered
sched_clock: Marking stable (791432008, 0)->(1335881712, -544449704)
registered taskstats version 1
EXT4-fs (vda): mounted filesystem without journal. Opts: (null)
VFS: Mounted root (ext4 filesystem) readonly on device 254:0.
devtmpfs: mounted
debug: unmapping init [mem 0xffffffff82559000-0xffffffff8284dfff]
Write protecting the kernel read-only data: 16384k
debug: unmapping init [mem 0xffff880001a0d000-0xffff880001bfffff]
debug: unmapping init [mem 0xffff880001edd000-0xffff880001ffffff]
rodata_test: all tests were successful
random: fast init done
systemd[1]: systemd v241-14.git18dd3fb.fc30 running in system mode. (+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=hybrid)
systemd[1]: Detected virtualization kvm.
systemd[1]: Detected architecture x86-64.
Welcome to Fedora 30 (Thirty)!
systemd[1]: Set hostname to .
systemd[1]: File /usr/lib/systemd/system/systemd-journald.service:12 configures an IP firewall (IPAddressDeny=any), but the local system does not support BPF/cgroup based firewalling.
systemd[1]: Proceeding WITHOUT firewalling in effect! (This warning is only shown for the first loaded unit using IP firewalling.)
systemd[1]: Condition check resulted in Journal Audit Socket being skipped.
random: systemd: uninitialized urandom read (16 bytes read)
systemd[1]: Created slice system-sshd\x2dkeygen.slice.
[ OK ] Created slice system-sshd\x2dkeygen.slice.
random: systemd: uninitialized urandom read (16 bytes read)
systemd[1]: Created slice system-serial\x2dgetty.slice.
[ OK ] Created slice system-serial\x2dgetty.slice.
random: systemd: uninitialized urandom read (16 bytes read)
systemd[1]: Listening on Journal Socket.
[ OK ] Listening on Journal Socket.
Starting Remount Root and Kernel File Systems...
[ OK ] Set up automount Arbitrary…s File System Automount Point.
[ OK ] Listening on udev Control Socket.
Mounting Kernel Debug File System...
Mounting Huge Pages File System...
[ OK ] Created slice User and Session Slice.
[ OK ] Listening on Journal Socket (/dev/log).
Starting Journal Service...
Starting Load Kernel Modules...
[ OK ] Listening on udev Kernel Socket.
Starting udev Coldplug all Devices...
[ OK ] Started Dispatch Password …ts to Console Directory Watch.
EXT4-fs (vda): re-mounted. Opts: (null)
[ OK ] Listening on Process Core Dump Socket.
[ OK ] Listening on initctl Compatibility Named Pipe.
[ OK ] Started Forward Password R…uests to Wall Directory Watch.
[ OK ] Reached target Local Encrypted Volumes.
[ OK ] Reached target Paths.
[ OK ] Created slice system-getty.slice.
[ OK ] Reached target Slices.
[ OK ] Reached target Swap.
[ OK ] Started Remount Root and Kernel File Systems.
[ OK ] Mounted Kernel Debug File System.
[ OK ] Mounted Huge Pages File System.
[ OK ] Started Load Kernel Modules.
Starting Apply Kernel Variables...
Starting Configure read-only root support...
Starting Create Static Device Nodes in /dev...
[ OK ] Started Apply Kernel Variables.
[ OK ] Started Create Static Device Nodes in /dev.
[ OK ] Reached target Local File Systems (Pre).
Mounting /tmp...
Starting udev Kernel Device Manager...
[ OK ] Mounted /tmp.
[ OK ] Started Configure read-only root support.
[ OK ] Reached target Local File Systems.
Starting Restore /run/initramfs on shutdown...
Starting Load/Save Random Seed...
[ OK ] Started Restore /run/initramfs on shutdown.
[ OK ] Started Load/Save Random Seed.
[ OK ] Started udev Coldplug all Devices.
[ OK ] Started Journal Service.
Starting Flush Journal to Persistent Storage...
systemd-journald[1213]: Received request to flush runtime journal from PID 1
[ OK ] Started Flush Journal to Persistent Storage.
Starting Create Volatile Files and Directories...
[ OK ] Started udev Kernel Device Manager.
[ OK ] Started Create Volatile Files and Directories.
Starting Update UTMP about System Boot/Shutdown...
[ OK ] Started Update UTMP about System Boot/Shutdown.
[ OK ] Reached target System Initialization.
[ OK ] Listening on D-Bus System Message Bus Socket.
[ OK ] Reached target Sockets.
[ OK ] Started Daily Cleanup of Temporary Directories.
[ OK ] Reached target Timers.
[ OK ] Reached target Basic System.
[ OK ] Started Entropy Daemon based on the HAVEGE algorithm.
[ OK ] Reached target sshd-keygen.target.
Starting Network Manager...
Starting Login Service...
virtio_net virtio0 enp0s2: renamed from eth0
random: crng init done
random: 7 urandom warning(s) missed due to ratelimiting
[ OK ] Found device /dev/hvc0.
[* ] (1 of 2) A start job is running for Login Service (9s / 1min 31s)
[** ] (1 of 2) A start job is running for Login Service (9s / 1min 31s)
Starting D-Bus System Message Bus...
[*** ] (1 of 3) A start job is running for Login Service (10s / 1min 31s)
[ OK ] Started D-Bus System Message Bus.
[ OK ] Started Login Service.
[ OK ] Started Network Manager.
[ OK ] Reached target Network.
Starting Crash recovery kernel arming...
Starting OpenSSH server daemon...
Starting Permit User Sessions...
[FAILED] Failed to start Crash recovery kernel arming.
See 'systemctl status kdump.service' for details.
[ OK ] Started Permit User Sessions.
[ OK ] Started Getty on tty1.
[ OK ] Started Serial Getty on hvc0.
[ OK ] Reached target Login Prompts.
Starting Hostname Service...
[ OK ] Started OpenSSH server daemon.
[ OK ] Reached target Multi-User System.
Starting Update UTMP about System Runlevel Changes...
[ OK ] Started Update UTMP about System Runlevel Changes.
[ OK ] Started Hostname Service.

Fedora 30 (Thirty)
Kernel 4.18.0 on an x86_64 (hvc0)

localhost login:
libcfs: loading out-of-tree module taints kernel.
LNet: HW NUMA nodes: 1, HW CPU cores: 2, npartitions: 1
Lustre: DEBUG MARKER: tmp.hFmNeMUZQn: executing check_logdir /tmp/ltest-logs
Lustre: DEBUG MARKER: tmp.hFmNeMUZQn: executing yml_node
Lustre: DEBUG MARKER: Client: 2.13.56.165
Lustre: DEBUG MARKER: MDS: 2.13.56.165
Lustre: DEBUG MARKER: OSS: 2.13.56.165
Lustre: DEBUG MARKER: excepting tests:
Lustre: DEBUG MARKER: tmp.hFmNeMUZQn: executing set_hostid
Lustre: Lustre: Build Version: 2.13.56_165_g560c025
LNet: Added LNI 192.168.122.4@tcp [8/256/0/180]
LNet: Accept secure, port 988
Lustre: Echo OBD driver; http://www.lustre.org/
LDISKFS-fs (loop0): mounted filesystem with ordered data mode. Opts: errors=remount-ro
LDISKFS-fs (loop0): mounted filesystem with ordered data mode. Opts: errors=remount-ro
LDISKFS-fs (loop0): mounted filesystem with ordered data mode. Opts: errors=remount-ro
LDISKFS-fs (loop0): mounted filesystem with ordered data mode. Opts: errors=remount-ro
LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. Opts: errors=remount-ro
------------[ cut here ]------------
DEBUG_LOCKS_WARN_ON(current->lockdep_recursion)
WARNING: CPU: 0 PID: 7741 at kernel/locking/lockdep.c:3229 __lockdep_init_map+0x16a/0x180
Modules linked in: lustre(O) ofd(O) osp(O) lod(O) ost(O) mdt(O) mdd(O) mgs(O) osd_ldiskfs(O) ldiskfs(O) lquota(O) lfsck(O) obdecho(O) mgc(O) mdc(O) lov(O) osc(O) lmv(O) fid(O) fld(O) ptlrpc(O) obdclass(O) ksocklnd(O) lnet(O) libcfs(O)
CPU: 0 PID: 7741 Comm: mount.lustre Tainted: G O --------- --- 4.18.0 #34
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
RIP: 0010:__lockdep_init_map+0x16a/0x180
Code: e8 bb 76 2f 00 85 c0 74 80 8b 05 51 2d 58 02 85 c0 0f 85 72 ff ff ff 48 c7 c6 56 e5 e0 81 48 c7 c7 b1 d8 df 81 e8 e5 ab fb ff <0f> 0b e9 58 ff ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00
RSP: 0018:ffff88015fce3738 EFLAGS: 00010292
RAX: 000000000000002f RBX: 0000000000000000 RCX: 0000000000000006
RDX: 0000000000000007 RSI: 0000000000000001 RDI: ffff88016afd52d0
RBP: ffff88015982d898 R08: 0000000000000001 R09: 0000000000000001
R10: 0000000000000001 R11: ffffffff83483d53 R12: ffffffffa0ab3d80
R13: 0000000000000002 R14: 0000000000000001 R15: ffff88015982d898
FS: 00007f0fc434d880(0000) GS:ffff88016ae00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffc89983000 CR3: 000000015d578000 CR4: 00000000000006b0
Call Trace:
 ? ldiskfs_enable_quotas+0x107/0x200 [ldiskfs]
 ? ldiskfs_fill_super+0x338d/0x3540 [ldiskfs]
 ? mount_bdev+0x176/0x1a0
 ? ldiskfs_calculate_overhead+0x470/0x470 [ldiskfs]
 ? mount_fs+0x2d/0x15a
 ? vfs_kern_mount.part.0+0x47/0x140
 ? osd_mount+0x4d2/0xbb0 [osd_ldiskfs]
 ? osd_device_alloc+0x361/0x8e0 [osd_ldiskfs]
 ? class_setup+0x645/0xa50 [obdclass]
 ? class_process_config+0x15e4/0x37d0 [obdclass]
 ? do_lcfg+0x15a/0x4c0 [obdclass]
 ? do_lcfg+0x15a/0x4c0 [obdclass]
 ? cache_alloc_debugcheck_after+0x138/0x150
 ? __kmalloc+0x20c/0x2e0
 ? do_lcfg+0x22d/0x4c0 [obdclass]
 ? lustre_start_simple+0x73/0x1d0 [obdclass]
 ? osd_start+0x59d/0x760 [obdclass]
 ? simple_strtoull+0x2b/0x50
 ? target_name2index+0x88/0xb0 [obdclass]
 ? server_fill_super+0x22c/0xf80 [obdclass]
 ? lustre_fill_super+0xe7c/0x2a20 [obdclass]
 ? sget_userns+0x47a/0x4f0
 ? lustre_start_mgc+0x2360/0x2360 [obdclass]
 ? mount_nodev+0x3c/0x90
 ? mount_fs+0x2d/0x15a
 ? vfs_kern_mount.part.0+0x47/0x140
 ? do_mount+0x1d6/0xd70
 ? kmem_cache_alloc_trace+0x25f/0x2c0
 ? ksys_mount+0x79/0xc0
 ? __x64_sys_mount+0x1c/0x20
 ? do_syscall_64+0x4b/0x1a0
 ? entry_SYSCALL_64_after_hwframe+0x6a/0xdf
irq event stamp: 12739
hardirqs last enabled at (12739): [] kmem_cache_alloc_trace+0xba/0x2c0
hardirqs last disabled at (12738): [] kmem_cache_alloc_trace+0x76/0x2c0
softirqs last enabled at (12170): [] sync_inodes_sb+0xc5/0x430
softirqs last disabled at (12166): [] wb_queue_work+0x44/0x1b0
---[ end trace 835ee698df71c057 ]---
LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc
Lustre: osd-ldiskfs create tunables for lustre-MDT0000
------------[ cut here ]------------
do not call blocking ops when !TASK_RUNNING; state=402 set at [<00000000cea1daba>] prepare_to_wait_event+0x4f/0x110
WARNING: CPU: 0 PID: 4946 at kernel/sched/core.c:6127 __might_sleep+0x67/0x70
Modules linked in: lustre(O) ofd(O) osp(O) lod(O) ost(O) mdt(O) mdd(O) mgs(O) osd_ldiskfs(O) ldiskfs(O) lquota(O) lfsck(O) obdecho(O) mgc(O) mdc(O) lov(O) osc(O) lmv(O) fid(O) fld(O) ptlrpc(O) obdclass(O) ksocklnd(O) lnet(O) libcfs(O)
CPU: 0 PID: 4946 Comm: ptlrpcd_rcv Tainted: G W O --------- --- 4.18.0 #34
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
RIP: 0010:__might_sleep+0x67/0x70
Code: 41 5c 41 5d e9 3a fe ff ff 48 8b 90 d0 1c 00 00 48 c7 c7 88 db e0 81 c6 05 12 7b 19 01 01 48 8b 70 10 48 89 d1 e8 f8 7e fd ff <0f> 0b eb c8 0f 1f 44 00 00 85 ff 75 0a 65 48 8b 04 25 40 4e 01 00
RSP: 0018:ffff88015ac87d98 EFLAGS: 00010292
RAX: 0000000000000073 RBX: ffffffffa022c060 RCX: 0000000000000006
RDX: 0000000000000007 RSI: 0000000000000001 RDI: ffff88016afd52d0
RBP: ffffffff81e0e2f9 R08: 0000000000000000 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000017
R13: 0000000000000000 R14: 0000000000000001 R15: ffff880163edff10
FS: 0000000000000000(0000) GS:ffff88016ae00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564b7b0dbfb8 CR3: 000000015d578000 CR4: 00000000000006b0
Call Trace:
 down_read+0x18/0xa0
 keys_fill+0x15/0x100 [obdclass]
 lu_env_refill+0x32/0x70 [obdclass]
 ptlrpcd_check+0x43a/0x540 [ptlrpc]
 ptlrpcd+0x45c/0x4c0 [ptlrpc]
 ? wait_woken+0x90/0x90
 kthread+0x100/0x140
 ? ptlrpcd_check+0x540/0x540 [ptlrpc]
 ? kthread_flush_work_fn+0x10/0x10
 ret_from_fork+0x24/0x30
irq event stamp: 50
hardirqs last enabled at (49): [] _raw_spin_unlock_irqrestore+0x46/0x60
hardirqs last disabled at (50): [] __schedule+0xb0/0xb00
softirqs last enabled at (0): [] copy_process.part.0+0x326/0x1c90
softirqs last disabled at (0): [<0000000000000000>] (null)
---[ end trace 835ee698df71c058 ]---
Lustre: Setting parameter lustre-MDT0000.mdt.identity_upcall in log lustre-MDT0000
Lustre: ctl-lustre-MDT0000: No data found on store. Initialize space: rc = -61
Lustre: lustre-MDT0000: new disk, initializing
Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180
Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x0000000200000400-0x0000000240000400]:0:mdt
LDISKFS-fs (dm-1): mounted filesystem with ordered data mode. Opts: errors=remount-ro
LDISKFS-fs (dm-1): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc
Lustre: osd-ldiskfs create tunables for lustre-MDT0001
Lustre: Setting parameter lustre-MDT0001.mdt.identity_upcall in log lustre-MDT0001
Lustre: srv-lustre-MDT0001: No data found on store. Initialize space: rc = -61
Lustre: Skipped 1 previous similar message
Lustre: lustre-MDT0001: new disk, initializing
Lustre: lustre-MDT0001: Imperative Recovery not enabled, recovery window 60-180
Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x0000000240000400-0x0000000280000400]:1:mdt
Lustre: cli-ctl-lustre-MDT0001: Allocated super-sequence [0x0000000240000400-0x0000000280000400]:1:mdt]
LDISKFS-fs (dm-2): mounted filesystem with ordered data mode. Opts: errors=remount-ro
LDISKFS-fs (dm-2): mounted filesystem with ordered data mode. Opts: errors=remount-ro,no_mbcache,nodelalloc
Lustre: osd-ldiskfs create tunables for lustre-OST0000
Lustre: lustre-OST0000: new disk, initializing
Lustre: srv-lustre-OST0000: No data found on store. Initialize space: rc = -61
Lustre: lustre-OST0000: Imperative Recovery not enabled, recovery window 60-180
Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x0000000280000400-0x00000002c0000400]:0:ost
Lustre: cli-lustre-OST0000-super: Allocated super-sequence [0x0000000280000400-0x00000002c0000400]:0:ost]
LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. Opts: errors=remount-ro
LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. Opts: errors=remount-ro,no_mbcache,nodelalloc
Lustre: osd-ldiskfs create tunables for lustre-OST0001
Lustre: lustre-OST0001: new disk, initializing
Lustre: srv-lustre-OST0001: No data found on store. Initialize space: rc = -61
Lustre: lustre-OST0001: Imperative Recovery not enabled, recovery window 60-180
Lustre: Mounted lustre-client
Lustre: Mounted lustre-client
Lustre: DEBUG MARKER: Using TIMEOUT=20
Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x00000002c0000400-0x0000000300000400]:1:ost
Lustre: cli-lustre-OST0001-super: Allocated super-sequence [0x00000002c0000400-0x0000000300000400]:1:ost]
Lustre: DEBUG MARKER: == racer test 1: racer on clients: tmp.hFmNeMUZQn DURATION=2700 ====================================== 04:28:18 (1607297298)
Lustre: lfs: using old ioctl(LL_IOC_LOV_GETSTRIPE) on [0x200000403:0x33:0x0], use llapi_layout_get_by_path()
LustreError: 7773:0:(out_handler.c:906:out_tx_end()) error during execution of #6 from /home/lustre/master-mine/lustre/ptlrpc/../../lustre/target/out_handler.c:565: rc = -17
LustreError: 7773:0:(out_handler.c:915:out_tx_end()) lustre-MDT0001-osd: undo for /home/lustre/master-mine/lustre/ptlrpc/../../lustre/target/out_handler.c:450: rc = -524
LustreError: 7798:0:(llog_cat.c:755:llog_cat_cancel_arr_rec()) lustre-MDT0001-osp-MDT0000: fail to cancel 1 llog-records: rc = -2
LustreError: 7798:0:(llog_cat.c:792:llog_cat_cancel_records()) lustre-MDT0001-osp-MDT0000: fail to cancel 1 of 1 llog-records: rc = -2
LustreError: 8222:0:(out_handler.c:906:out_tx_end()) error during execution of #6 from /home/lustre/master-mine/lustre/ptlrpc/../../lustre/target/out_handler.c:565: rc = -17
LustreError: 8222:0:(out_handler.c:915:out_tx_end()) lustre-MDT0000-osd: undo for /home/lustre/master-mine/lustre/ptlrpc/../../lustre/target/out_handler.c:450: rc = -524
LustreError: 8222:0:(out_handler.c:915:out_tx_end()) Skipped 1 previous similar message
LustreError: 7995:0:(llog_cat.c:755:llog_cat_cancel_arr_rec()) lustre-MDT0000-osp-MDT0001: fail to cancel 1 llog-records: rc = -116
LustreError: 7995:0:(llog_cat.c:792:llog_cat_cancel_records()) lustre-MDT0000-osp-MDT0001: fail to cancel 1 of 1 llog-records: rc = -116
Lustre: dir [0x240000402:0x449:0x0] stripe 0 readdir failed: -2, directory is partially accessed!
Lustre: dir [0x240000403:0xa93:0x0] stripe 0 readdir failed: -2, directory is partially accessed!
Lustre: Skipped 1 previous similar message
LustreError: 31733:0:(out_handler.c:906:out_tx_end()) error during execution of #6 from /home/lustre/master-mine/lustre/ptlrpc/../../lustre/target/out_handler.c:565: rc = -17
LustreError: 31733:0:(out_handler.c:915:out_tx_end()) lustre-MDT0000-osd: undo for /home/lustre/master-mine/lustre/ptlrpc/../../lustre/target/out_handler.c:450: rc = -524
LustreError: 31733:0:(out_handler.c:915:out_tx_end()) Skipped 1 previous similar message
LustreError: 7995:0:(llog_cat.c:755:llog_cat_cancel_arr_rec()) lustre-MDT0000-osp-MDT0001: fail to cancel 1 llog-records: rc = -116
LustreError: 7995:0:(llog_cat.c:755:llog_cat_cancel_arr_rec()) Skipped 2 previous similar messages
LustreError: 7995:0:(llog_cat.c:792:llog_cat_cancel_records()) lustre-MDT0000-osp-MDT0001: fail to cancel 1 of 1 llog-records: rc = -116
LustreError: 7995:0:(llog_cat.c:792:llog_cat_cancel_records()) Skipped 2 previous similar messages
LustreError: 8222:0:(out_handler.c:906:out_tx_end()) error during execution of #0 from /home/lustre/master-mine/lustre/ptlrpc/../../lustre/target/out_handler.c:565: rc = -17
LustreError: 7995:0:(llog_cat.c:755:llog_cat_cancel_arr_rec()) lustre-MDT0000-osp-MDT0001: fail to cancel 1 llog-records: rc = -116
LustreError: 7995:0:(llog_cat.c:755:llog_cat_cancel_arr_rec()) Skipped 1 previous similar message
LustreError: 7995:0:(llog_cat.c:792:llog_cat_cancel_records()) lustre-MDT0000-osp-MDT0001: fail to cancel 1 of 1 llog-records: rc = -116
LustreError: 7995:0:(llog_cat.c:792:llog_cat_cancel_records()) Skipped 1 previous similar message
LustreError: 4948:0:(integrity.c:63:obd_page_dif_generate_buffer()) lustre-OST0000-osc-ffff8801381a4000: unexpected used guard number of DIF 1/1, data length 4096, sector size 512: rc = -7
LustreError: 4948:0:(osc_request.c:2511:osc_build_rpc()) prep_req failed: -7
LustreError: 4948:0:(osc_cache.c:2182:osc_check_rpcs()) Write request failed with -7
LustreError: 8186:0:(out_handler.c:906:out_tx_end()) error during execution of #6 from /home/lustre/master-mine/lustre/ptlrpc/../../lustre/target/out_handler.c:565: rc = -17
LustreError: 8186:0:(out_handler.c:915:out_tx_end()) lustre-MDT0001-osd: undo for /home/lustre/master-mine/lustre/ptlrpc/../../lustre/target/out_handler.c:450: rc = -524
LustreError: 8186:0:(out_handler.c:915:out_tx_end()) Skipped 1 previous similar message
LustreError: 7798:0:(llog_cat.c:755:llog_cat_cancel_arr_rec()) lustre-MDT0001-osp-MDT0000: fail to cancel 1 llog-records: rc = -116
LustreError: 7798:0:(llog_cat.c:792:llog_cat_cancel_records()) lustre-MDT0001-osp-MDT0000: fail to cancel 1 of 1 llog-records: rc = -116
LustreError: 8222:0:(out_handler.c:906:out_tx_end()) error during execution of #6 from /home/lustre/master-mine/lustre/ptlrpc/../../lustre/target/out_handler.c:565: rc = -17
LustreError: 8222:0:(out_handler.c:915:out_tx_end()) lustre-MDT0000-osd: undo for /home/lustre/master-mine/lustre/ptlrpc/../../lustre/target/out_handler.c:450: rc = -524
LustreError: 8222:0:(out_handler.c:915:out_tx_end()) Skipped 1 previous similar message
LustreError: 7995:0:(llog_cat.c:755:llog_cat_cancel_arr_rec()) lustre-MDT0000-osp-MDT0001: fail to cancel 1 llog-records: rc = -116
LustreError: 7995:0:(llog_cat.c:792:llog_cat_cancel_records()) lustre-MDT0000-osp-MDT0001: fail to cancel 1 of 1 llog-records: rc = -116
4[8040]: segfault at 8 ip 00007f6210645370 sp 00007ffeaac2f680 error 4 in ld-2.29.so[7f621063a000+20000]
Code: c0 0f 85 13 15 00 00 49 8b 83 f0 00 00 00 48 89 85 18 ff ff ff 48 85 c0 0f 85 91 13 00 00 49 8b 43 68 49 83 bb f8 00 00 00 00 <48> 8b 40 08 48 89 85 40 ff ff ff 0f 84 3f 0c 00 00 45 85 ed 74 5a
LustreError: 5525:0:(out_handler.c:906:out_tx_end()) error during execution of #6 from /home/lustre/master-mine/lustre/ptlrpc/../../lustre/target/out_handler.c:565: rc = -17
LustreError: 5525:0:(out_handler.c:915:out_tx_end()) lustre-MDT0000-osd: undo for /home/lustre/master-mine/lustre/ptlrpc/../../lustre/target/out_handler.c:450: rc = -524
LustreError: 5525:0:(out_handler.c:915:out_tx_end()) Skipped 1 previous similar message
LustreError: 7995:0:(llog_cat.c:755:llog_cat_cancel_arr_rec()) lustre-MDT0000-osp-MDT0001: fail to cancel 1 llog-records: rc = -116
LustreError: 7995:0:(llog_cat.c:755:llog_cat_cancel_arr_rec()) Skipped 1 previous similar message
LustreError: 7995:0:(llog_cat.c:792:llog_cat_cancel_records()) lustre-MDT0000-osp-MDT0001: fail to cancel 1 of 1 llog-records: rc = -116
LustreError: 7995:0:(llog_cat.c:792:llog_cat_cancel_records()) Skipped 1 previous similar message
Lustre: mdt00_029: service thread pid 19626 was inactive for 63.173 seconds. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
Lustre: mdt00_026: service thread pid 16945 was inactive for 63.175 seconds. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
Pid: 19626, comm: mdt00_029 4.18.0 #34 SMP Tue May 5 10:50:27 MSK 2020
Call Trace:
 ldlm_completion_ast+0x77b/0x8d0 [ptlrpc]
 ldlm_cli_enqueue_local+0x27b/0x7e0 [ptlrpc]
 mdt_object_local_lock+0x55b/0xb00 [mdt]
 mdt_object_lock_internal+0x55/0x3f0 [mdt]
 mdt_getattr_name_lock+0x3c9/0x1fe0 [mdt]
 mdt_intent_getattr+0x263/0x430 [mdt]
 mdt_intent_policy+0x3c4/0xef0 [mdt]
 ldlm_lock_enqueue+0x3e8/0x9c0 [ptlrpc]
 ldlm_handle_enqueue0+0x403/0x1670 [ptlrpc]
 tgt_enqueue+0x55/0x220 [ptlrpc]
 tgt_request_handle+0x40f/0x18f0 [ptlrpc]
 ptlrpc_main+0x1222/0x3530 [ptlrpc]
 kthread+0x100/0x140
 ret_from_fork+0x24/0x30
 0xffffffffffffffff
Pid: 9538, comm: mdt00_007 4.18.0 #34 SMP Tue May 5 10:50:27 MSK 2020
Call Trace:
 ldlm_completion_ast+0x77b/0x8d0 [ptlrpc]
 ldlm_cli_enqueue_local+0x27b/0x7e0 [ptlrpc]
 mdt_object_local_lock+0x55b/0xb00 [mdt]
 mdt_object_lock_internal+0x55/0x3f0 [mdt]
 mdt_getattr_name_lock+0x3c9/0x1fe0 [mdt]
 mdt_intent_getattr+0x263/0x430 [mdt]
 mdt_intent_policy+0x3c4/0xef0 [mdt]
 ldlm_lock_enqueue+0x3e8/0x9c0 [ptlrpc]
 ldlm_handle_enqueue0+0x403/0x1670 [ptlrpc]
 tgt_enqueue+0x55/0x220 [ptlrpc]
 tgt_request_handle+0x40f/0x18f0 [ptlrpc]
 ptlrpc_main+0x1222/0x3530 [ptlrpc]
 kthread+0x100/0x140
 ret_from_fork+0x24/0x30
 0xffffffffffffffff
Pid: 9749, comm: mdt00_020 4.18.0 #34 SMP Tue May 5 10:50:27 MSK 2020
Call Trace:
 ldlm_completion_ast+0x77b/0x8d0 [ptlrpc]
 ldlm_cli_enqueue_local+0x27b/0x7e0 [ptlrpc]
 mdt_object_local_lock+0x55b/0xb00 [mdt]
 mdt_object_lock_internal+0x55/0x3f0 [mdt]
 mdt_getattr_name_lock+0x3c9/0x1fe0 [mdt]
 mdt_intent_getattr+0x263/0x430 [mdt]
 mdt_intent_policy+0x3c4/0xef0 [mdt]
 ldlm_lock_enqueue+0x3e8/0x9c0 [ptlrpc]
 ldlm_handle_enqueue0+0x403/0x1670 [ptlrpc]
 tgt_enqueue+0x55/0x220 [ptlrpc]
 tgt_request_handle+0x40f/0x18f0 [ptlrpc]
 ptlrpc_main+0x1222/0x3530 [ptlrpc]
 kthread+0x100/0x140
 ret_from_fork+0x24/0x30
 0xffffffffffffffff
LustreError: 7755:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 100s: evicting client at 0@lo ns: mdt-lustre-MDT0000_UUID lock: 000000007bd55381/0x9e82e7233ca8b25a lrc: 3/0,0 mode: PR/PR res: [0x200000402:0x1918:0x0].0x0 bits 0x13/0x0 rrc: 3 type: IBT gid 0 flags: 0x60200400000020 nid: 0@lo remote: 0x9e82e7233ca89b85 expref: 688 pid: 9632 timeout: 628 lvb_type: 0
LustreError: 10838:0:(ldlm_lockd.c:1422:ldlm_handle_enqueue0()) ### lock on destroyed export 00000000f71000d9 ns: mdt-lustre-MDT0000_UUID lock: 0000000016a20626/0x9e82e7233ca8b32c lrc: 3/0,0 mode: PR/PR res: [0x200000402:0x1:0x0].0x0 bits 0x13/0x0 rrc: 19 type: IBT gid 0 flags: 0x50200000000000 nid: 0@lo remote: 0x9e82e7233ca8b0cb expref: 449 pid: 10838 timeout: 0 lvb_type: 0
LustreError: 11-0: lustre-MDT0000-mdc-ffff8801381a4000: operation ldlm_enqueue to node 0@lo failed: rc = -107
Lustre: lustre-MDT0000-mdc-ffff8801381a4000: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
LustreError: Skipped 1 previous similar message
LustreError: 167-0: lustre-MDT0000-mdc-ffff8801381a4000: This client was evicted by lustre-MDT0000; in progress operations using this service will fail.
LustreError: 18722:0:(file.c:4742:ll_inode_revalidate_fini()) lustre: revalidate FID [0x200000402:0x1:0x0] error: rc = -5
LustreError: 18722:0:(file.c:4742:ll_inode_revalidate_fini()) Skipped 1 previous similar message
LustreError: 18541:0:(llite_lib.c:2787:ll_prep_inode()) new_inode -fatal: rc -108
LustreError: 18480:0:(file.c:233:ll_close_inode_openhandle()) lustre-clilmv-ffff8801381a4000: inode [0x200000403:0x1a11:0x0] mdc close failed: rc = -108
LustreError: 18740:0:(mdc_request.c:1427:mdc_read_page()) lustre-MDT0000-mdc-ffff8801381a4000: [0x200000402:0x1:0x0] lock enqueue fails: rc = -108
LustreError: 19526:0:(file.c:4742:ll_inode_revalidate_fini()) lustre: revalidate FID [0x200000007:0x1:0x0] error: rc = -108
LustreError: 19526:0:(file.c:4742:ll_inode_revalidate_fini()) Skipped 458 previous similar messages
Lustre: lustre-MDT0000-mdc-ffff8801381a4000: Connection restored to 192.168.122.4@tcp (at 0@lo)
LustreError: 5525:0:(out_handler.c:906:out_tx_end()) error during execution of #0 from /home/lustre/master-mine/lustre/ptlrpc/../../lustre/target/out_handler.c:565: rc = -17
LustreError: 7798:0:(llog_cat.c:755:llog_cat_cancel_arr_rec()) lustre-MDT0001-osp-MDT0000: fail to cancel 1 llog-records: rc = -2
LustreError: 7798:0:(llog_cat.c:792:llog_cat_cancel_records()) lustre-MDT0001-osp-MDT0000: fail to cancel 1 of 1 llog-records: rc = -2
Lustre: ll_ost_io00_008: service thread pid 727 was inactive for 40.343 seconds. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
Lustre: ll_ost00_002: service thread pid 8178 was inactive for 40.420 seconds. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
Lustre: Skipped 13 previous similar messages
Lustre: Skipped 13 previous similar messages
Lustre: ll_ost_io00_004: service thread pid 9922 was inactive for 40.125 seconds. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
Lustre: Skipped 12 previous similar messages
Lustre: ll_ost_io00_017: service thread pid 31533 was inactive for 65.035 seconds. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
Lustre: Skipped 14 previous similar messages
Lustre: ll_ost00_021: service thread pid 31312 was inactive for 71.038 seconds. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
Lustre: Skipped 1 previous similar message
Lustre: ll_ost_io00_013: service thread pid 31424 was inactive for 65.357 seconds. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
Lustre: Skipped 4 previous similar messages
LustreError: 7755:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 104s: evicting client at 0@lo ns: filter-lustre-OST0000_UUID lock: 00000000ee533e5f/0x9e82e7233cd53a12 lrc: 3/0,0 mode: PW/PW res: [0x280000400:0x4e6:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) gid 0 flags: 0x60000480000020 nid: 0@lo remote: 0x9e82e7233cd53a0b expref: 80 pid: 9896 timeout: 843 lvb_type: 0
LustreError: 18132:0:(client.c:1265:ptlrpc_import_delay_req()) @@@ IMP_CLOSED req@0000000095f47ba1 x1685373429472512/t0(0) o105->lustre-OST0000@0@lo:15/16 lens 392/224 e 0 to 0 dl 0 ref 1 fl Rpc:QU/0/ffffffff rc 0/-1 job:''
Lustre: lustre-OST0000-osc-ffff8801381a4000: Connection to lustre-OST0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
Lustre: 4948:0:(client.c:2276:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1607297970/real 1607297970] req@000000004d918fc3 x1685373429026112/t0(0) o2->lustre-OST0000-osc-ffff8801381a4000@0@lo:28/4 lens 440/432 e 4 to 1 dl 1607298081 ref 1 fl Rpc:XQr/0/ffffffff rc 0/-1 job:'chmod.0'
Lustre: 4948:0:(client.c:2276:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1607297971/real 1607297971] req@00000000dd558536 x1685373429198528/t0(0) o2->lustre-OST0000-osc-ffff8801381a4000@0@lo:28/4 lens 440/432 e 4 to 1 dl 1607298082 ref 1 fl Rpc:XQr/0/ffffffff rc 0/-1 job:'chown.0'
Lustre: 4947:0:(client.c:2276:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1607297972/real 1607297972] req@000000000f718fa6 x1685373429267328/t0(0) o2->lustre-OST0000-osc-ffff8801381a4000@0@lo:28/4 lens 440/432 e 4 to 1 dl 1607298083 ref 1 fl Rpc:XQr/0/ffffffff rc 0/-1 job:'touch.0'
Lustre: 4947:0:(client.c:2276:ptlrpc_expire_one_request()) Skipped 2 previous similar messages
Lustre: lustre-OST0000: Export 000000006f3a2bf8 already connecting from 0@lo
Lustre: lustre-OST0000: Export 000000006f3a2bf8 already connecting from 0@lo
Lustre: lustre-OST0000: Export 000000006f3a2bf8 already connecting from 0@lo
Lustre: lustre-OST0000: Export 000000006f3a2bf8 already connecting from 0@lo
Lustre: 4947:0:(client.c:2276:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1607297970/real 1607297970] req@0000000027ce45b2 x1685373429012544/t0(0) o4->lustre-OST0000-osc-ffff8801381a4000@0@lo:6/4 lens 488/448 e 5 to 1 dl 1607298104 ref 2 fl Rpc:XQr/0/ffffffff rc 0/-1 job:'dd.0'
Lustre: 4947:0:(client.c:2276:ptlrpc_expire_one_request()) Skipped 2 previous similar messages
Lustre: lustre-OST0000: Export 000000006f3a2bf8 already connecting from 0@lo
Lustre: lustre-OST0000: Export 000000006f3a2bf8 already connecting from 0@lo
Lustre: Skipped 1 previous similar message
Lustre: lustre-OST0000: haven't heard from client 3f7f36a6-d2ac-4e37-bd3a-3248bf1c95a2 (at ) in 50 seconds. I think it's dead, and I am evicting it. exp 000000006f3a2bf8, cur 1607298130 expire 1607298100 last 1607298080
Lustre: lustre-OST0000: Export 000000007f1abfad already connecting from 0@lo
Lustre: Skipped 2 previous similar messages
Lustre: lustre-OST0000: Export 000000007f1abfad already connecting from 0@lo
Lustre: Skipped 6 previous similar messages
ptlrpc_watchdog_fire: 52 callbacks suppressed
Lustre: ll_ost_io00_027: service thread pid 31589 was inactive for 168.959 seconds. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
Lustre: Skipped 2 previous similar messages
Pid: 31589, comm: ll_ost_io00_027 4.18.0 #34 SMP Tue May 5 10:50:27 MSK 2020
Call Trace:
 jbd2_log_wait_commit+0x10f/0x160
 ldiskfs_sync_fs+0x1eb/0x2a0 [ldiskfs]
 osd_sync+0xd6/0x170 [osd_ldiskfs]
 tgt_grant_prepare_write+0x20f/0xeb0 [ptlrpc]
 ofd_preprw+0x96b/0x28e0 [ofd]
 tgt_brw_write+0xc8e/0x2530 [ptlrpc]
 tgt_request_handle+0x40f/0x18f0 [ptlrpc]
 ptlrpc_main+0x1222/0x3530 [ptlrpc]
 kthread+0x100/0x140
 ret_from_fork+0x24/0x30
 0xffffffffffffffff
Pid: 31587, comm: ll_ost_io00_025 4.18.0 #34 SMP Tue May 5 10:50:27 MSK 2020
Call Trace:
 jbd2_log_wait_commit+0x10f/0x160
 ldiskfs_sync_fs+0x1eb/0x2a0 [ldiskfs]
 osd_sync+0xd6/0x170 [osd_ldiskfs]
 tgt_grant_prepare_write+0x20f/0xeb0 [ptlrpc]
 ofd_preprw+0x96b/0x28e0 [ofd]
 tgt_brw_write+0xc8e/0x2530 [ptlrpc]
 tgt_request_handle+0x40f/0x18f0 [ptlrpc]
 ptlrpc_main+0x1222/0x3530 [ptlrpc]
 kthread+0x100/0x140
 ret_from_fork+0x24/0x30
 0xffffffffffffffff
INFO: task ldlm_elt:7755 blocked for more than 120 seconds.
      Tainted: G W O --------- --- 4.18.0 #34
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
ldlm_elt        D    0  7755      2 0x80000000
Call Trace:
 ? __schedule+0x2ad/0xb00
 schedule+0x34/0x80
 wait_transaction_locked+0xb8/0xe0
 ? wait_woken+0x90/0x90
 add_transaction_credits+0x150/0x350
 start_this_handle+0xec/0x420
 ? kmem_cache_alloc+0x1e9/0x2c0
 ? jbd2__journal_start+0xd7/0x2a0
 ? osd_trans_start+0x132/0x530 [osd_ldiskfs]
 ? tgt_server_data_update+0x1f6/0x580 [ptlrpc]
 ? tgt_client_del+0x1d4/0x6b0 [ptlrpc]
 ? ofd_obd_disconnect+0x1ba/0x1d0 [ofd]
 ? class_fail_export+0x1ce/0x4e0 [obdclass]
 ? expired_lock_main+0x1cb/0xaa0 [ptlrpc]
 ? wait_woken+0x90/0x90
 ? kthread+0x100/0x140
 ? ldlm_add_waiting_lock+0x2b0/0x2b0 [ptlrpc]
 ? kthread_flush_work_fn+0x10/0x10
 ? ret_from_fork+0x24/0x30
INFO: task jbd2/dm-2-8:8173 blocked for more than 120 seconds.
      Tainted: G W O --------- --- 4.18.0 #34
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
jbd2/dm-2-8     D    0  8173      2 0x80000000
Call Trace:
 ? __schedule+0x2ad/0xb00
 schedule+0x34/0x80
 jbd2_journal_commit_transaction+0x281/0x1f2c
 ? set_next_entity+0x74/0x380
 ? update_curr+0x88/0x3a0
 ? lock_timer_base+0x5c/0x80
 ? wait_woken+0x90/0x90
 ? try_to_del_timer_sync+0x3a/0x50
 ? kjournald2+0x99/0x1f0
 ? wait_woken+0x90/0x90
 ? kthread+0x100/0x140
 ? commit_timeout+0x10/0x10
 ? kthread_flush_work_fn+0x10/0x10
 ? ret_from_fork+0x24/0x30
INFO: task ll_ost00_000:8176 blocked for more than 120 seconds.
      Tainted: G W O --------- --- 4.18.0 #34
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
ll_ost00_000    D    0  8176      2 0x80000000
Call Trace:
 ? __schedule+0x2ad/0xb00
 schedule+0x34/0x80
 wait_transaction_locked+0xb8/0xe0
 ? wait_woken+0x90/0x90
 add_transaction_credits+0x150/0x350
 start_this_handle+0xec/0x420
 ? kmem_cache_alloc+0x1e9/0x2c0
 ? jbd2__journal_start+0xd7/0x2a0
 ? osd_trans_start+0x132/0x530 [osd_ldiskfs]
 ? ofd_attr_set+0x375/0x1010 [ofd]
 ? ofd_setattr_hdl+0x3b0/0x910 [ofd]
 ? tgt_request_handle+0x40f/0x18f0 [ptlrpc]
 ? ptlrpc_main+0x1222/0x3530 [ptlrpc]
 ? kthread+0x100/0x140
 ? ptlrpc_register_service+0x15a0/0x15a0 [ptlrpc]
 ? kthread_flush_work_fn+0x10/0x10
 ? ret_from_fork+0x24/0x30
INFO: task ll_ost00_001:8177 blocked for more than 120 seconds.
      Tainted: G W O --------- --- 4.18.0 #34
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
ll_ost00_001    D    0  8177      2 0x80000000
Call Trace:
 ? __schedule+0x2ad/0xb00
 schedule+0x34/0x80
 wait_transaction_locked+0xb8/0xe0
 ? wait_woken+0x90/0x90
 add_transaction_credits+0x150/0x350
 start_this_handle+0xec/0x420
 ? kmem_cache_alloc+0x1e9/0x2c0
 ? jbd2__journal_start+0xd7/0x2a0
 ? osd_trans_start+0x132/0x530 [osd_ldiskfs]
 ? ofd_attr_set+0x375/0x1010 [ofd]
 ? ofd_setattr_hdl+0x3b0/0x910 [ofd]
 ? tgt_request_handle+0x40f/0x18f0 [ptlrpc]
 ? ptlrpc_main+0x1222/0x3530 [ptlrpc]
 ? kthread+0x100/0x140
 ? ptlrpc_register_service+0x15a0/0x15a0 [ptlrpc]
 ? kthread_flush_work_fn+0x10/0x10
 ? ret_from_fork+0x24/0x30
INFO: task ll_ost00_002:8178 blocked for more than 120 seconds.
      Tainted: G W O --------- --- 4.18.0 #34
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
ll_ost00_002    D    0  8178      2 0x80000000
Call Trace:
 ? __schedule+0x2ad/0xb00
 schedule+0x34/0x80
 wait_transaction_locked+0xb8/0xe0
 ? wait_woken+0x90/0x90
 add_transaction_credits+0x150/0x350
 start_this_handle+0xec/0x420
 ? kmem_cache_alloc+0x1e9/0x2c0
 ? jbd2__journal_start+0xd7/0x2a0
 ? osd_trans_start+0x132/0x530 [osd_ldiskfs]
 ? ofd_attr_set+0x375/0x1010 [ofd]
 ? ofd_setattr_hdl+0x3b0/0x910 [ofd]
 ? tgt_request_handle+0x40f/0x18f0 [ptlrpc]
 ? ptlrpc_main+0x1222/0x3530 [ptlrpc]
 ? kthread+0x100/0x140
 ? ptlrpc_register_service+0x15a0/0x15a0 [ptlrpc]
 ? kthread_flush_work_fn+0x10/0x10
 ? ret_from_fork+0x24/0x30
INFO: task ll_ost_io00_000:8181 blocked for more than 120 seconds.
      Tainted: G W O --------- --- 4.18.0 #34
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
ll_ost_io00_000 D    0  8181      2 0x80000000
Call Trace:
 ? __schedule+0x2ad/0xb00
 schedule+0x34/0x80
 wait_transaction_locked+0xb8/0xe0
 ? wait_woken+0x90/0x90
 add_transaction_credits+0x150/0x350
 start_this_handle+0xec/0x420
 ? kmem_cache_alloc+0x1e9/0x2c0
 ? jbd2__journal_start+0xd7/0x2a0
 ? osd_trans_start+0x132/0x530 [osd_ldiskfs]
 ? ofd_commitrw+0x5eb/0x2b10 [ofd]
 ? tgt_checksum_niobuf_rw+0xda7/0x12f0 [ptlrpc]
 ? tgt_brw_write+0xf4c/0x2530 [ptlrpc]
 ?
tgt_request_handle+0x40f/0x18f0 [ptlrpc] ? ptlrpc_main+0x1222/0x3530 [ptlrpc] ? kthread+0x100/0x140 ? ptlrpc_register_service+0x15a0/0x15a0 [ptlrpc] ? kthread_flush_work_fn+0x10/0x10 ? ret_from_fork+0x24/0x30 INFO: task ll_ost_io00_001:8182 blocked for more than 120 seconds. Tainted: G W O --------- --- 4.18.0 #34 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. ll_ost_io00_001 D 0 8182 2 0x80000000 Call Trace: ? __schedule+0x2ad/0xb00 schedule+0x34/0x80 wait_transaction_locked+0xb8/0xe0 ? wait_woken+0x90/0x90 add_transaction_credits+0x150/0x350 start_this_handle+0xec/0x420 ? kmem_cache_alloc+0x1e9/0x2c0 ? jbd2__journal_start+0xd7/0x2a0 ? osd_trans_start+0x132/0x530 [osd_ldiskfs] ? ofd_commitrw+0x5eb/0x2b10 [ofd] ? tgt_checksum_niobuf_rw+0xda7/0x12f0 [ptlrpc] ? tgt_brw_write+0xf4c/0x2530 [ptlrpc] ? __mutex_unlock_slowpath+0x38/0x280 ? tgt_request_handle+0x40f/0x18f0 [ptlrpc] ? ptlrpc_main+0x1222/0x3530 [ptlrpc] ? kthread+0x100/0x140 ? ptlrpc_register_service+0x15a0/0x15a0 [ptlrpc] ? kthread_flush_work_fn+0x10/0x10 ? ret_from_fork+0x24/0x30 INFO: task ll_ost_io00_002:8183 blocked for more than 120 seconds. Tainted: G W O --------- --- 4.18.0 #34 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. ll_ost_io00_002 D 0 8183 2 0x80000000 Call Trace: ? __schedule+0x2ad/0xb00 schedule+0x34/0x80 wait_transaction_locked+0xb8/0xe0 ? wait_woken+0x90/0x90 add_transaction_credits+0x150/0x350 start_this_handle+0xec/0x420 ? jbd2__journal_restart+0xff/0x180 ? osd_write_commit+0x4f0/0x9c0 [osd_ldiskfs] ? ofd_commitrw+0xb4f/0x2b10 [ofd] ? tgt_brw_write+0xf4c/0x2530 [ptlrpc] ? __mutex_unlock_slowpath+0x38/0x280 ? tgt_request_handle+0x40f/0x18f0 [ptlrpc] ? ptlrpc_main+0x1222/0x3530 [ptlrpc] ? kthread+0x100/0x140 ? ptlrpc_register_service+0x15a0/0x15a0 [ptlrpc] ? kthread_flush_work_fn+0x10/0x10 ? ret_from_fork+0x24/0x30 INFO: task ll_ost00_003:8757 blocked for more than 120 seconds. 
Tainted: G W O --------- --- 4.18.0 #34 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. ll_ost00_003 D 0 8757 2 0x80000000 Call Trace: ? __schedule+0x2ad/0xb00 schedule+0x34/0x80 wait_transaction_locked+0xb8/0xe0 ? wait_woken+0x90/0x90 add_transaction_credits+0x150/0x350 start_this_handle+0xec/0x420 ? kmem_cache_alloc+0x1e9/0x2c0 ? jbd2__journal_start+0xd7/0x2a0 ? osd_trans_start+0x132/0x530 [osd_ldiskfs] ? ofd_attr_set+0x375/0x1010 [ofd] ? ofd_setattr_hdl+0x3b0/0x910 [ofd] ? tgt_request_handle+0x40f/0x18f0 [ptlrpc] ? ptlrpc_main+0x1222/0x3530 [ptlrpc] ? kthread+0x100/0x140 ? ptlrpc_register_service+0x15a0/0x15a0 [ptlrpc] ? kthread_flush_work_fn+0x10/0x10 ? ret_from_fork+0x24/0x30 INFO: task dir_create.sh:9031 blocked for more than 120 seconds. Tainted: G W O --------- --- 4.18.0 #34 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. dir_create.sh D 0 9031 9017 0x00000000 Call Trace: ? __schedule+0x2ad/0xb00 ? __wake_up_common_lock+0x4f/0x90 ? autoremove_wake_function+0x9/0x30 schedule+0x34/0x80 schedule_timeout+0x323/0x500 ? wait_for_common+0x3b/0x160 wait_for_common+0xc9/0x160 ? wake_up_q+0x60/0x60 osc_io_setattr_end+0xa4/0x290 [osc] ? osc_io_setattr_start+0x29b/0x520 [osc] cl_io_end+0x4e/0x130 [obdclass] lov_io_end_wrapper+0xc0/0xd0 [lov] ? lov_io_fini+0x3d0/0x3d0 [lov] lov_io_call.isra.0+0x76/0x130 [lov] lov_io_end+0x2d/0xd0 [lov] cl_io_end+0x4e/0x130 [obdclass] cl_io_loop+0x9d/0x1e0 [obdclass] cl_setattr_ost+0x224/0x2d0 [lustre] ll_setattr_raw+0x1022/0x11c0 [lustre] notify_change+0x293/0x420 do_truncate+0x61/0xa0 path_openat+0x3f8/0xa60 do_filp_open+0x79/0xd0 ? expand_files+0x27/0x290 ? _raw_spin_unlock+0x1f/0x30 do_sys_open+0x158/0x1e0 do_syscall_64+0x4b/0x1a0 entry_SYSCALL_64_after_hwframe+0x6a/0xdf RIP: 0033:0x7ff114b5fe9c Code: Bad RIP value. 
RSP: 002b:00007fffe7041440 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
RAX: ffffffffffffffda RBX: 0000564378ff8a30 RCX: 00007ff114b5fe9c
RDX: 0000000000000241 RSI: 0000564378ffa5b0 RDI: 00000000ffffff9c
RBP: 0000564378ffa5b0 R08: 0000000000000000 R09: 0000000000000020
R10: 00000000000001b6 R11: 0000000000000246 R12: 0000000000000241
R13: 0000000000000000 R14: 0000000000000001 R15: 0000564378ffa5b0
INFO: lockdep is turned off.
Lustre: mdt00_018: service thread pid 9741 was inactive for 240.156 seconds. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
Lustre: mdt00_006: service thread pid 9491 was inactive for 240.152 seconds. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
Lustre: Skipped 1 previous similar message
Pid: 9741, comm: mdt00_018 4.18.0 #34 SMP Tue May 5 10:50:27 MSK 2020
Call Trace:
 ldlm_completion_ast+0x77b/0x8d0 [ptlrpc]
 ldlm_cli_enqueue_local+0x27b/0x7e0 [ptlrpc]
 mdt_object_local_lock+0x55b/0xb00 [mdt]
 mdt_object_lock_internal+0x55/0x3f0 [mdt]
 mdt_getattr_name_lock+0x3c9/0x1fe0 [mdt]
 mdt_intent_getattr+0x263/0x430 [mdt]
 mdt_intent_policy+0x3c4/0xef0 [mdt]
 ldlm_lock_enqueue+0x3e8/0x9c0 [ptlrpc]
 ldlm_handle_enqueue0+0x403/0x1670 [ptlrpc]
 tgt_enqueue+0x55/0x220 [ptlrpc]
 tgt_request_handle+0x40f/0x18f0 [ptlrpc]
 ptlrpc_main+0x1222/0x3530 [ptlrpc]
 kthread+0x100/0x140
 ret_from_fork+0x24/0x30
 0xffffffffffffffff
Lustre: lustre-OST0000: Export 000000007f1abfad already connecting from 0@lo
Lustre: Skipped 12 previous similar messages
LustreError: 9741:0:(ldlm_request.c:121:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1607297973, 302s ago); not entering recovery in server code, just going back to sleep ns: mdt-lustre-MDT0000_UUID lock: 0000000024b9c40f/0x9e82e7233cd60902 lrc: 3/1,0 mode: --/PR res: [0x200000402:0x1:0x0].0x0 bits 0x13/0x0 rrc: 13 type: IBT gid 0 flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 9741 timeout: 0 lvb_type: 0
LustreError: dumping log to /tmp/lustre-log.1607298275.9741
Lustre: ll_ost_io00_022: service thread pid 31584 was inactive for 266.224 seconds. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
Lustre: Skipped 17 previous similar messages
Lustre: lustre-OST0000: Export 000000007f1abfad already connecting from 0@lo
Lustre: Skipped 25 previous similar messages
ptlrpc_watchdog_fire: 20 callbacks suppressed
Lustre: ll_ost00_024: service thread pid 31582 was inactive for 377.759 seconds. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
Pid: 31582, comm: ll_ost00_024 4.18.0 #34 SMP Tue May 5 10:50:27 MSK 2020
Call Trace:
 wait_transaction_locked+0xb8/0xe0
 add_transaction_credits+0x150/0x350
 start_this_handle+0xec/0x420
 0xffffffffffffffff
Lustre: 31581:0:(service.c:1437:ptlrpc_at_send_early_reply()) @@@ Could not add any time (5/5), not sending early reply req@00000000e7e63f29 x1685373429016640/t0(0) o2->1879007f-1371-4e81-bd8f-434f89eab36f@0@lo:210/0 lens 440/432 e 24 to 0 dl 1607298570 ref 2 fl Interpret:/0/0 rc 0/0 job:'chmod.0'
Lustre: 31588:0:(service.c:1437:ptlrpc_at_send_early_reply()) @@@ Could not add any time (5/5), not sending early reply req@00000000b962dc6e x1685373429165376/t0(0) o10->1879007f-1371-4e81-bd8f-434f89eab36f@0@lo:211/0 lens 440/432 e 24 to 0 dl 1607298571 ref 2 fl Interpret:/0/0 rc 0/0 job:'truncate.0'
Lustre: 31588:0:(service.c:1437:ptlrpc_at_send_early_reply()) Skipped 4 previous similar messages
Lustre: 31588:0:(service.c:1437:ptlrpc_at_send_early_reply()) @@@ Could not add any time (5/5), not sending early reply req@000000003f6ce084 x1685373429419328/t0(0) o10->lustre-MDT0000-mdtlov_UUID@0@lo:212/0 lens 440/432 e 24 to 0 dl 1607298572 ref 2 fl Interpret:/0/0 rc 0/0 job:'mdt00_031.0'
Lustre: 31588:0:(service.c:1437:ptlrpc_at_send_early_reply()) Skipped 17 previous similar messages
Lustre: 31588:0:(service.c:1437:ptlrpc_at_send_early_reply()) @@@ Could not add any time (5/5), not sending early reply req@00000000f5a80a42 x1685373429435008/t0(0) o4->1879007f-1371-4e81-bd8f-434f89eab36f@0@lo:214/0 lens 504/448 e 24 to 0 dl 1607298574 ref 2 fl Interpret:/0/0 rc 0/0 job:'dd.0'
Lustre: 31588:0:(service.c:1437:ptlrpc_at_send_early_reply()) Skipped 9 previous similar messages
Lustre: 4947:0:(client.c:2276:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1607297970/real 1607297970] req@000000005808236a x1685373429015552/t0(0) o2->lustre-OST0000-osc-ffff880139961000@0@lo:28/4 lens 440/432 e 24 to 1 dl 1607298571 ref 1 fl Rpc:XQr/0/ffffffff rc 0/-1 job:'cp.0'
Lustre: 4947:0:(client.c:2276:ptlrpc_expire_one_request()) Skipped 14 previous similar messages
Lustre: lustre-OST0000-osc-ffff880139961000: Connection to lustre-OST0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
Lustre: lustre-OST0000: Client lustre-MDT0000-mdtlov_UUID (at 0@lo) reconnecting
Lustre: lustre-OST0000-osc-MDT0000: Connection restored to 192.168.122.4@tcp (at 0@lo)
Lustre: lustre-OST0000: deleting orphan objects from 0x0:1281 to 0x0:1313
Lustre: 18647:0:(service.c:2319:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (600/1s); client may timeout req@0000000025654d75 x1685373429035520/t0(0) o101->1879007f-1371-4e81-bd8f-434f89eab36f@0@lo:211/0 lens 992/600 e 18 to 0 dl 1607298571 ref 1 fl Complete:/0/0 rc 301/301 job:'lfs.0'
Lustre: lustre-OST0000: deleting orphan objects from 0x280000400:1256 to 0x280000400:1281
LustreError: 20365:0:(osp_precreate.c:1681:osp_object_truncate()) can't punch object: -107
Lustre: lustre-MDT0000: Client 3f7f36a6-d2ac-4e37-bd3a-3248bf1c95a2 (at 0@lo) reconnecting
Lustre: Skipped 4 previous similar messages
Lustre: lustre-MDT0000-mdc-ffff8801381a4000: Connection restored to 192.168.122.4@tcp (at 0@lo)
Lustre: Skipped 4 previous similar messages
Lustre: 20365:0:(service.c:2319:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (600/1s); client may timeout req@00000000322b470c x1685373429418880/t0(0) o101->3f7f36a6-d2ac-4e37-bd3a-3248bf1c95a2@0@lo:212/0 lens 992/600 e 18 to 0 dl 1607298572 ref 1 fl Complete:/0/0 rc -107/-107 job:'file_concat.sh.0'
Lustre: 20365:0:(service.c:2319:ptlrpc_server_handle_request()) Skipped 8 previous similar messages
Lustre: 31602:0:(service.c:1437:ptlrpc_at_send_early_reply()) @@@ Could not add any time (5/5), not sending early reply req@00000000edcaedfa x1685373429444672/t0(0) o4->1879007f-1371-4e81-bd8f-434f89eab36f@0@lo:229/0 lens 496/448 e 24 to 0 dl 1607298589 ref 2 fl Interpret:/0/0 rc 0/0 job:'dir_create.sh.0'
Lustre: 31602:0:(service.c:1437:ptlrpc_at_send_early_reply()) Skipped 1 previous similar message
LustreError: 9907:0:(osc_cache.c:919:osc_extent_wait()) extent 00000000b2026455@{[0 -> 0/1023], [3|0|+|rpc|wihY|00000000628c426f], [28672|1|+|-|000000003e714948|1024|00000000495558f3]} lustre-OST0000-osc-ffff8801381a4000: wait ext to 0 timedout, recovery in progress?
LustreError: 9907:0:(osc_cache.c:919:osc_extent_wait()) ### extent: 00000000b2026455 ns: lustre-OST0000-osc-ffff8801381a4000 lock: 000000003e714948/0x9e82e7233cd4b217 lrc: 4/0,0 mode: PW/PW res: [0x4ef:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->4095) gid 0 flags: 0x800429400000000 nid: local remote: 0x9e82e7233cd4b248 expref: -99 pid: 9110 timeout: 0 lvb_type: 1
Lustre: lustre-OST0000: Export 000000007f1abfad already connecting from 0@lo
Lustre: Skipped 52 previous similar messages
LustreError: 31252:0:(llite_nfs.c:342:ll_dir_get_parent_fid()) lustre: failure inode [0x200000402:0x205a:0x0] get parent: rc = -116
Lustre: ost_io: This server is not able to keep up with request traffic (cpu-bound).
Lustre: 2087:0:(service.c:1612:ptlrpc_at_check_timed()) earlyQ=1 reqQ=0 recA=36, svcEst=600, delay=0ms
Lustre: 2087:0:(service.c:1379:ptlrpc_at_send_early_reply()) @@@ Already past deadline (-3s), not sending early reply. Consider increasing at_early_margin (5)? req@00000000a24faa56 x1685373429472448/t0(0) o10->1879007f-1371-4e81-bd8f-434f89eab36f@0@lo:721/0 lens 440/432 e 4 to 0 dl 1607299081 ref 2 fl Interpret:H/0/0 rc 0/0 job:'dd.0'
Lustre: lustre-OST0000: Export 000000007f1abfad already connecting from 0@lo
Lustre: Skipped 101 previous similar messages
Lustre: 31602:0:(service.c:1437:ptlrpc_at_send_early_reply()) @@@ Could not add any time (5/5), not sending early reply req@00000000f82b72d9 x1685373429446464/t0(0) o4->1879007f-1371-4e81-bd8f-434f89eab36f@0@lo:56/0 lens 488/448 e 1 to 0 dl 1607299171 ref 2 fl Interpret:/2/0 rc 0/0 job:'dd.0'
Lustre: 31602:0:(service.c:1437:ptlrpc_at_send_early_reply()) Skipped 4 previous similar messages
LustreError: 31722:0:(osc_cache.c:919:osc_extent_wait()) extent 00000000f630df20@{[0 -> 0/1023], [3|0|+|rpc|wiY|00000000fb861c51], [28672|1|+|-|000000003e574fc9|1024|0000000023036cdd]} lustre-OST0000-osc-ffff8801381a4000: wait ext to 0 timedout, recovery in progress?
LustreError: 31440:0:(osc_cache.c:919:osc_extent_wait()) ### extent: 00000000b3146bc4 ns: lustre-OST0000-osc-ffff880139961000 lock: 0000000065e3850d/0x9e82e7233cd33864 lrc: 3/0,0 mode: PW/PW res: [0x4cc:0x0:0x0].0x0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) gid 0 flags: 0x800020000020000 nid: local remote: 0x9e82e7233cd33959 expref: -99 pid: 28660 timeout: 0 lvb_type: 1
LustreError: 31722:0:(osc_cache.c:919:osc_extent_wait()) Skipped 5 previous similar messages
LustreError: 31440:0:(osc_cache.c:919:osc_extent_wait()) Skipped 4 previous similar messages
Lustre: 31602:0:(service.c:1437:ptlrpc_at_send_early_reply()) @@@ Could not add any time (3/3), not sending early reply req@0000000095d96e7b x1685373430111360/t0(0) o10->1879007f-1371-4e81-bd8f-434f89eab36f@0@lo:120/0 lens 440/432 e 1 to 0 dl 1607299235 ref 2 fl Interpret:/0/0 rc 0/0 job:'dir_create.sh.0'
Lustre: 31602:0:(service.c:1437:ptlrpc_at_send_early_reply()) Skipped 9 previous similar messages
Lustre: 4947:0:(client.c:2276:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1607298635/real 1607298635] req@00000000e2a051e1 x1685373430111360/t0(0) o10->lustre-OST0000-osc-ffff880139961000@0@lo:6/4 lens 440/432 e 1 to 1 dl 1607299236 ref 1 fl Rpc:XQr/0/ffffffff rc 0/-1 job:'dir_create.sh.0'
Lustre: 4947:0:(client.c:2276:ptlrpc_expire_one_request()) Skipped 12 previous similar messages
Lustre: lustre-OST0000-osc-ffff880139961000: Connection to lustre-OST0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
Lustre: Skipped 5 previous similar messages
Lustre: lustre-OST0000: Client lustre-MDT0000-mdtlov_UUID (at 0@lo) reconnecting
Lustre: Skipped 1 previous similar message
Lustre: lustre-OST0000-osc-MDT0000: Connection restored to 192.168.122.4@tcp (at 0@lo)
LustreError: 31720:0:(osc_cache.c:919:osc_extent_wait()) extent 00000000c6c24bfc@{[3264 -> 4031/4095], [3|0|+|rpc|wiuY|00000000fa44c7a5], [3170304|768|+|-|0000000092f74a3e|1024|00000000150dbb8e]} lustre-OST0000-osc-ffff880139961000: wait ext to 0 timedout, recovery in progress?
LustreError: 31720:0:(osc_cache.c:919:osc_extent_wait()) ### extent: 00000000c6c24bfc ns: lustre-OST0000-osc-ffff880139961000 lock: 0000000092f74a3e/0x9e82e7233cd28cc1 lrc: 2/0,0 mode: PW/PW res: [0x4ce:0x0:0x0].0x0 rrc: 2 type: EXT [13369344->18446744073709551615] (req 13369344->13434879) gid 0 flags: 0x800020000020000 nid: local remote: 0x9e82e7233cd28cd6 expref: -99 pid: 27138 timeout: 0 lvb_type: 1
Lustre: 32429:0:(service.c:1437:ptlrpc_at_send_early_reply()) @@@ Could not add any time (3/-152), not sending early reply req@0000000011854977 x1685373429644928/t0(0) o5->lustre-MDT0001-mdtlov_UUID@0@lo:212/0 lens 432/432 e 0 to 0 dl 1607299327 ref 2 fl Interpret:/0/0 rc 0/0 job:'osp-pre-0-1.0'
Lustre: 8207:0:(client.c:2276:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1607298572/real 1607298572] req@0000000019030736 x1685373429644928/t0(0) o5->lustre-OST0000-osc-MDT0001@0@lo:28/4 lens 432/432 e 0 to 1 dl 1607299328 ref 2 fl Rpc:XNQr/0/ffffffff rc 0/-1 job:'osp-pre-0-1.0'
Lustre: 8207:0:(client.c:2276:ptlrpc_expire_one_request()) Skipped 9 previous similar messages
Lustre: lustre-OST0000-osc-MDT0001: Connection to lustre-OST0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
Lustre: Skipped 1 previous similar message
LustreError: 8230:0:(osp_precreate.c:958:osp_precreate_cleanup_orphans()) lustre-OST0000-osc-MDT0000: cannot cleanup orphans: rc = -107
Lustre: lustre-OST0000: Client lustre-MDT0001-mdtlov_UUID (at 0@lo) reconnecting
Lustre: lustre-OST0000-osc-MDT0001: Connection restored to 192.168.122.4@tcp (at 0@lo)
Lustre: Skipped 1 previous similar message
Lustre: lustre-OST0000: Export 000000007f1abfad already connecting from 0@lo
Lustre: Skipped 120 previous similar messages
Lustre: ll_ost00_020: service thread pid 31256 was inactive for 1208.314 seconds. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
Pid: 31256, comm: ll_ost00_020 4.18.0 #34 SMP Tue May 5 10:50:27 MSK 2020
Call Trace:
 wait_transaction_locked+0xb8/0xe0
 add_transaction_credits+0x150/0x350
 start_this_handle+0xec/0x420
 0xffffffffffffffff
Pid: 31601, comm: ll_ost_io00_035 4.18.0 #34 SMP Tue May 5 10:50:27 MSK 2020
Call Trace:
 io_schedule+0xd/0x40
 __lock_page+0xe6/0x120
 pagecache_get_page+0x1b1/0x240
 osd_bufs_get+0x563/0xb30 [osd_ldiskfs]
 ofd_preprw+0xb60/0x28e0 [ofd]
 tgt_brw_write+0xc8e/0x2530 [ptlrpc]
 tgt_request_handle+0x40f/0x18f0 [ptlrpc]
 ptlrpc_main+0x1222/0x3530 [ptlrpc]
 kthread+0x100/0x140
 ret_from_fork+0x24/0x30
 0xffffffffffffffff
Pid: 31600, comm: ll_ost_io00_034 4.18.0 #34 SMP Tue May 5 10:50:27 MSK 2020
Call Trace:
 io_schedule+0xd/0x40
 __lock_page+0xe6/0x120
 pagecache_get_page+0x1b1/0x240
 osd_bufs_get+0x563/0xb30 [osd_ldiskfs]
 ofd_preprw+0xb60/0x28e0 [ofd]
 tgt_brw_write+0xc8e/0x2530 [ptlrpc]
 tgt_request_handle+0x40f/0x18f0 [ptlrpc]
 ptlrpc_main+0x1222/0x3530 [ptlrpc]
 kthread+0x100/0x140
 ret_from_fork+0x24/0x30
 0xffffffffffffffff
Lustre: ll_ost_io00_033: service thread pid 31599 was inactive for 1208.402 seconds. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
Lustre: Skipped 1 previous similar message
Lustre: dir [0x200000404:0x890:0x0] stripe 0 readdir failed: -2, directory is partially accessed!
Lustre: Skipped 1 previous similar message
Lustre: 9376:0:(service.c:1437:ptlrpc_at_send_early_reply()) @@@ Could not add any time (5/5), not sending early reply req@000000006a0fe7fa x1685373429043840/t0(0) o4->1879007f-1371-4e81-bd8f-434f89eab36f@0@lo:727/0 lens 488/448 e 0 to 0 dl 1607299842 ref 2 fl Interpret:H/2/0 rc 0/0 job:'cat.0'
Lustre: 9376:0:(service.c:1437:ptlrpc_at_send_early_reply()) Skipped 1 previous similar message
Lustre: ll_ost_io00_038: service thread pid 31716 was inactive for 1226.836 seconds. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
Lustre: Skipped 7 previous similar messages
Lustre: 4948:0:(client.c:2276:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1607299242/real 1607299242] req@0000000082c2a9a2 x1685373429472448/t0(0) o10->lustre-OST0000-osc-ffff880139961000@0@lo:6/4 lens 440/432 e 4 to 1 dl 1607299753 ref 1 fl Rpc:XQr/2/ffffffff rc -11/-1 job:'dd.0'
Lustre: 4948:0:(client.c:2276:ptlrpc_expire_one_request()) Skipped 1 previous similar message
Lustre: lustre-OST0000-osc-ffff880139961000: Connection to lustre-OST0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
Lustre: lustre-OST0000: Client 1879007f-1371-4e81-bd8f-434f89eab36f (at 0@lo) reconnecting
Lustre: lustre-OST0000: Client lustre-MDT0000-mdtlov_UUID (at 0@lo) reconnecting
Lustre: lustre-OST0000-osc-MDT0000: Connection restored to 192.168.122.4@tcp (at 0@lo)
Lustre: ll_ost00_023: service thread pid 31581 was inactive for 1205.974 seconds. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
LustreError: 30762:0:(vvp_io.c:1756:vvp_io_init()) lustre: refresh file layout [0x200000402:0x2145:0x0] error -4.
LustreError: 13300:0:(file.c:4742:ll_inode_revalidate_fini()) lustre: revalidate FID [0x200000402:0x1:0x0] error: rc = -4
LustreError: 13300:0:(file.c:4742:ll_inode_revalidate_fini()) Skipped 73 previous similar messages
LustreError: 13311:0:(file.c:4742:ll_inode_revalidate_fini()) lustre: revalidate FID [0x200000402:0x1:0x0] error: rc = -4
LustreError: 13311:0:(file.c:4742:ll_inode_revalidate_fini()) Skipped 46 previous similar messages
Lustre: 4948:0:(client.c:2276:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1607299329/real 1607299329] req@000000002b2bfddb x1685373429435968/t0(0) o2->lustre-OST0000-osc-MDT0001@0@lo:28/4 lens 440/432 e 24 to 1 dl 1607299930 ref 1 fl Rpc:XQr/2/ffffffff rc -11/-1 job:'osp-syn-0-1.0'
Lustre: 4948:0:(client.c:2276:ptlrpc_expire_one_request()) Skipped 10 previous similar messages
Lustre: lustre-OST0000-osc-MDT0001: Connection to lustre-OST0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
Lustre: Skipped 1 previous similar message
Lustre: lustre-OST0000: Client lustre-MDT0001-mdtlov_UUID (at 0@lo) reconnecting
Lustre: lustre-OST0000-osc-MDT0001: Connection restored to 192.168.122.4@tcp (at 0@lo)
Lustre: Skipped 1 previous similar message
Lustre: 2076:0:(service.c:1437:ptlrpc_at_send_early_reply()) @@@ Could not add any time (5/-150), not sending early reply req@00000000e982fee8 x1685373441214272/t0(0) o5->lustre-MDT0000-mdtlov_UUID@0@lo:215/0 lens 432/432 e 0 to 0 dl 1607300085 ref 2 fl Interpret:/0/0 rc 0/0 job:'osp-pre-0-0.0'
Lustre: 2076:0:(service.c:1437:ptlrpc_at_send_early_reply()) Skipped 8 previous similar messages
Lustre: 8230:0:(client.c:2276:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1607299330/real 1607299330] req@000000005a7d4e7c x1685373441214272/t0(0) o5->lustre-OST0000-osc-MDT0000@0@lo:28/4 lens 432/432 e 0 to 1 dl 1607300086 ref 2 fl Rpc:XNQr/0/ffffffff rc 0/-1 job:'osp-pre-0-0.0'
LustreError: 8230:0:(osp_precreate.c:958:osp_precreate_cleanup_orphans()) lustre-OST0000-osc-MDT0000: cannot cleanup orphans: rc = -107
LustreError: 8230:0:(osp_precreate.c:958:osp_precreate_cleanup_orphans()) Skipped 1 previous similar message
Lustre: lustre-OST0000: Export 000000007f1abfad already connecting from 0@lo
Lustre: Skipped 120 previous similar messages
Lustre: 4948:0:(client.c:2276:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1607299900/real 1607299900] req@0000000082c2a9a2 x1685373429472448/t0(0) o10->lustre-OST0000-osc-ffff880139961000@0@lo:6/4 lens 440/432 e 4 to 1 dl 1607300411 ref 1 fl Rpc:XQr/2/ffffffff rc -11/-1 job:'dd.0'
Lustre: 4948:0:(client.c:2276:ptlrpc_expire_one_request()) Skipped 1 previous similar message
Lustre: lustre-OST0000-osc-ffff880139961000: Connection to lustre-OST0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
Lustre: lustre-OST0000: Client 1879007f-1371-4e81-bd8f-434f89eab36f (at 0@lo) reconnecting
Lustre: lustre-OST0000-osc-ffff880139961000: Connection restored to 192.168.122.4@tcp (at 0@lo)
Lustre: ost_io: This server is not able to keep up with request traffic (cpu-bound).
Lustre: 14208:0:(service.c:1612:ptlrpc_at_check_timed()) earlyQ=0 reqQ=0 recA=63, svcEst=600, delay=0ms
ptlrpc_watchdog_fire: 10 callbacks suppressed
Lustre: ll_ost_io00_044: service thread pid 8405 was inactive for 1233.892 seconds. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
Pid: 8406, comm: ll_ost_io00_045 4.18.0 #34 SMP Tue May 5 10:50:27 MSK 2020
Lustre: Skipped 3 previous similar messages
Call Trace:
 io_schedule+0xd/0x40
 __lock_page+0xe6/0x120
 pagecache_get_page+0x1b1/0x240
 osd_bufs_get+0x563/0xb30 [osd_ldiskfs]
 ofd_preprw+0xb60/0x28e0 [ofd]
 tgt_brw_write+0xc8e/0x2530 [ptlrpc]
 tgt_request_handle+0x40f/0x18f0 [ptlrpc]
 ptlrpc_main+0x1222/0x3530 [ptlrpc]
 kthread+0x100/0x140
 ret_from_fork+0x24/0x30
 0xffffffffffffffff
Pid: 8403, comm: ll_ost_io00_043 4.18.0 #34 SMP Tue May 5 10:50:27 MSK 2020
Call Trace:
 io_schedule+0xd/0x40
 __lock_page+0xe6/0x120
 pagecache_get_page+0x1b1/0x240
 osd_bufs_get+0x563/0xb30 [osd_ldiskfs]
 ofd_preprw+0xb60/0x28e0 [ofd]
 tgt_brw_write+0xc8e/0x2530 [ptlrpc]
 tgt_request_handle+0x40f/0x18f0 [ptlrpc]
 ptlrpc_main+0x1222/0x3530 [ptlrpc]
 kthread+0x100/0x140
 ret_from_fork+0x24/0x30
 0xffffffffffffffff
Lustre: ll_ost_io00_041: service thread pid 8401 was inactive for 1233.903 seconds. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
Pid: 8405, comm: ll_ost_io00_044 4.18.0 #34 SMP Tue May 5 10:50:27 MSK 2020
Call Trace:
 io_schedule+0xd/0x40
 __lock_page+0xe6/0x120
 pagecache_get_page+0x1b1/0x240
 osd_bufs_get+0x563/0xb30 [osd_ldiskfs]
 ofd_preprw+0xb60/0x28e0 [ofd]
 tgt_brw_write+0xc8e/0x2530 [ptlrpc]
 tgt_request_handle+0x40f/0x18f0 [ptlrpc]
 ptlrpc_main+0x1222/0x3530 [ptlrpc]
 kthread+0x100/0x140
 ret_from_fork+0x24/0x30
 0xffffffffffffffff
Lustre: 14208:0:(service.c:1437:ptlrpc_at_send_early_reply()) @@@ Could not add any time (3/3), not sending early reply req@000000000f4c758a x1685373429043840/t0(0) o4->1879007f-1371-4e81-bd8f-434f89eab36f@0@lo:630/0 lens 488/448 e 0 to 0 dl 1607300500 ref 2 fl Interpret:H/2/0 rc 0/0 job:'cat.0'
Lustre: 14208:0:(service.c:1437:ptlrpc_at_send_early_reply()) Skipped 1 previous similar message
Lustre: 4947:0:(client.c:2276:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1607299900/real 1607299900] req@0000000070f25dbd x1685373429002432/t0(0) o2->lustre-OST0000-osc-MDT0000@0@lo:28/4 lens 440/432 e 24 to 1 dl 1607300501 ref 1 fl Rpc:XQr/2/ffffffff rc -11/-1 job:'osp-syn-0-0.0'
Lustre: lustre-OST0000-osc-MDT0000: Connection to lustre-OST0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
Lustre: lustre-OST0000: Client lustre-MDT0000-mdtlov_UUID (at 0@lo) reconnecting
Lustre: lustre-OST0000-osc-MDT0000: Connection restored to 192.168.122.4@tcp (at 0@lo)
Lustre: ll_ost00_025: service thread pid 31606 was inactive for 1227.759 seconds. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
Lustre: Skipped 5 previous similar messages
Lustre: 4948:0:(client.c:2276:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1607300002/real 1607300002] req@000000002b2bfddb x1685373429435968/t0(0) o2->lustre-OST0000-osc-MDT0001@0@lo:28/4 lens 440/432 e 24 to 1 dl 1607300603 ref 1 fl Rpc:XQr/2/ffffffff rc -11/-1 job:'osp-syn-0-1.0'
Lustre: lustre-OST0000-osc-MDT0001: Connection to lustre-OST0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
Lustre: lustre-OST0000: Client lustre-MDT0001-mdtlov_UUID (at 0@lo) reconnecting
Lustre: lustre-OST0000-osc-MDT0001: Connection restored to 192.168.122.4@tcp (at 0@lo)
LustreError: 9110:0:(osc_cache.c:919:osc_extent_wait()) extent 00000000b2026455@{[0 -> 0/1023], [4|0|+|rpc|wihY|00000000628c426f], [28672|1|+|+|000000003e714948|1024|00000000495558f3]} lustre-OST0000-osc-ffff8801381a4000: wait ext to 0 timedout, recovery in progress?
LustreError: 9110:0:(osc_cache.c:919:osc_extent_wait()) ### extent: 00000000b2026455 ns: lustre-OST0000-osc-ffff8801381a4000 lock: 000000003e714948/0x9e82e7233cd4b217 lrc: 4/0,0 mode: PW/PW res: [0x4ef:0x0:0x0].0x0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->4095) gid 0 flags: 0x800429400000000 nid: local remote: 0x9e82e7233cd4b248 expref: -99 pid: 9110 timeout: 0 lvb_type: 1