Details
- Bug
- Resolution: Fixed
- Blocker
- Lustre 2.4.3
- 3
- 14392
Description
When mounting the MDT, we start to see the errors below; the system then becomes unresponsive.
nbp6-mds login: LDISKFS-fs (dm-1): recovery complete
LDISKFS-fs (dm-1): mounted filesystem with ordered data mode. quota=on. Opts:
LDISKFS-fs (dm-2): recovery complete
LDISKFS-fs (dm-2): mounted filesystem with ordered data mode. quota=on. Opts:
Lustre: nbp6-MDT0000: Not available for connect from 10.153.1.57@o2ib233 (not set up)
Lustre: nbp6-MDT0000: used disk, loading
Lustre: nbp6-MDT0000: Not available for connect from 10.153.0.76@o2ib233 (not set up)
Lustre: 2967:0:(mdt_handler.c:4969:mdt_process_config()) For interoperability, skip this mdt.group_upcall. It is obsolete.
Lustre: 2967:0:(mdt_handler.c:4969:mdt_process_config()) For interoperability, skip this mdt.quota_type. It is obsolete.
LustreError: 3069:0:(osp_sync.c:487:osp_sync_new_setattr_job()) nbp6-OST002f-osc-MDT0000: invalid setattr record, lsr_valid:0
LustreError: 3069:0:(osp_sync.c:797:osp_sync_process_queues()) can't send: -22
LustreError: 3069:0:(osp_sync.c:487:osp_sync_new_setattr_job()) nbp6-OST002f-osc-MDT0000: invalid setattr record, lsr_valid:0
LustreError: 3069:0:(osp_sync.c:797:osp_sync_process_queues()) can't send: -22
LustreError: 11-0: nbp6-MDT0000-lwp-MDT0000: Communicating with 0@lo, operation mds_connect failed with -11.
Lustre: nbp6-MDT0000: Will be in recovery for at least 5:00, or until 1083 clients reconnect
LustreError: 3069:0:(osp_sync.c:797:osp_sync_process_queues()) can't send: -22
LustreError: 3081:0:(osp_sync.c:487:osp_sync_new_setattr_job()) nbp6-OST0032-osc-MDT0000: invalid setattr record, lsr_valid:0
LustreError: 3081:0:(osp_sync.c:487:osp_sync_new_setattr_job()) Skipped 212864 previous similar messages
LustreError: 3069:0:(osp_sync.c:797:osp_sync_process_queues()) Skipped 229931 previous similar messages
LustreError: 3081:0:(osp_sync.c:797:osp_sync_process_queues()) can't send: -22
LustreError: 3069:0:(osp_sync.c:487:osp_sync_new_setattr_job()) nbp6-OST002f-osc-MDT0000: invalid setattr record, lsr_valid:0
LustreError: 3069:0:(osp_sync.c:487:osp_sync_new_setattr_job()) Skipped 450953 previous similar messages
LustreError: 3081:0:(osp_sync.c:797:osp_sync_process_queues()) Skipped 445853 previous similar messages
LustreError: 3081:0:(osp_sync.c:797:osp_sync_process_queues()) can't send: -22
LustreError: 3069:0:(osp_sync.c:487:osp_sync_new_setattr_job()) nbp6-OST002f-osc-MDT0000: invalid setattr record, lsr_valid:0
LustreError: 3069:0:(osp_sync.c:487:osp_sync_new_setattr_job()) Skipped 930108 previous similar messages
LustreError: 3081:0:(osp_sync.c:797:osp_sync_process_queues()) Skipped 925668 previous similar messages
LustreError: 3081:0:(osp_sync.c:797:osp_sync_process_queues()) can't send: -22
LustreError: 3069:0:(osp_sync.c:487:osp_sync_new_setattr_job()) nbp6-OST002f-osc-MDT0000: invalid setattr record, lsr_valid:0
LustreError: 3069:0:(osp_sync.c:487:osp_sync_new_setattr_job()) Skipped 1897185 previous similar messages
LustreError: 3081:0:(osp_sync.c:797:osp_sync_process_queues()) Skipped 1898934 previous similar messages
LustreError: 3081:0:(osp_sync.c:487:osp_sync_new_setattr_job()) nbp6-OST0032-osc-MDT0000: invalid setattr record, lsr_valid:0
LustreError: 3069:0:(osp_sync.c:797:osp_sync_process_queues()) can't send: -22
LustreError: 3069:0:(osp_sync.c:797:osp_sync_process_queues()) Skipped 3829999 previous similar messages
LustreError: 3081:0:(osp_sync.c:487:osp_sync_new_setattr_job()) Skipped 3851279 previous similar messages
LustreError: 3081:0:(osp_sync.c:797:osp_sync_process_queues()) can't send: -22
LustreError: 3069:0:(osp_sync.c:487:osp_sync_new_setattr_job()) nbp6-OST002f-osc-MDT0000: invalid setattr record, lsr_valid:0
LustreError: 3069:0:(osp_sync.c:487:osp_sync_new_setattr_job()) Skipped 7683067 previous similar messages
LustreError: 3081:0:(osp_sync.c:797:osp_sync_process_queues()) Skipped 7758692 previous similar messages
LustreError: 3081:0:(osp_sync.c:487:osp_sync_new_setattr_job()) nbp6-OST0032-osc-MDT0000: invalid setattr record, lsr_valid:0
LustreError: 3069:0:(osp_sync.c:797:osp_sync_process_queues()) can't send: -22
LustreError: 3069:0:(osp_sync.c:797:osp_sync_process_queues()) Skipped 15494392 previous similar messages
LustreError: 3081:0:(osp_sync.c:487:osp_sync_new_setattr_job()) Skipped 15461799 previous similar messages
LustreError: 3069:0:(osp_sync.c:487:osp_sync_new_setattr_job()) nbp6-OST002f-osc-MDT0000: invalid setattr record, lsr_valid:0
LustreError: 3081:0:(osp_sync.c:797:osp_sync_process_queues()) can't send: -22
LustreError: 3081:0:(osp_sync.c:797:osp_sync_process_queues()) Skipped 31047376 previous similar messages
LustreError: 3069:0:(osp_sync.c:487:osp_sync_new_setattr_job()) Skipped 30898154 previous similar messages
INFO: task crond:3224 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
crond D 0000000000000003 0 3224 2649 0x00000080
ffff8805c5579d38 0000000000000086 ffff8805c5579d00 ffff8805c5579cfc
0000000000000000 ffff88063fc24700 ffff880028296780 0000000000000500
ffff880625ed85f8 ffff8805c5579fd8 000000000000fc40 ffff880625ed85f8
Call Trace:
[<ffffffff81540545>] schedule_timeout+0x215/0x2e0
[<ffffffff81346555>] ? extract_entropy+0xe5/0x140
[<ffffffff815401c3>] wait_for_common+0x123/0x180
[<ffffffff81063be0>] ? default_wake_function+0x0/0x20
[<ffffffff815402dd>] wait_for_completion+0x1d/0x20
[<ffffffff810921c8>] synchronize_sched+0x58/0x60
[<ffffffff81092150>] ? wakeme_after_rcu+0x0/0x20
[<ffffffff8122205c>] install_session_keyring_to_cred+0x6c/0xd0
[<ffffffff812221f3>] join_session_keyring+0x133/0x160
[<ffffffff810dbff7>] ? audit_syscall_entry+0x1d7/0x200
[<ffffffff81220df8>] keyctl_join_session_keyring+0x38/0x70
[<ffffffff81221a20>] sys_keyctl+0x170/0x190
[<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
LustreError: 3081:0:(osp_sync.c:797:osp_sync_process_queues()) can't send: -22
LustreError: 3069:0:(osp_sync.c:487:osp_sync_new_setattr_job()) nbp6-OST002f-osc-MDT0000: invalid setattr record, lsr_valid:0
LustreError: 3069:0:(osp_sync.c:487:osp_sync_new_setattr_job()) Skipped 61763336 previous similar messages
LustreError: 3081:0:(osp_sync.c:797:osp_sync_process_queues()) Skipped 62092386 previous similar messages
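
For triage, the volume of repeated messages in a console log like the one above can be tallied per source location. The following is a minimal sketch, not part of the original report: it assumes the log has been saved with one message per line, and the names `LINE_RE`, `SKIPPED_RE`, and `summarize` are hypothetical helpers introduced here for illustration. It only recognizes lines of the form `Lustre[Error]: pid:0:(file:line:func()) message`; other lines (for example the `11-0:` message or the kernel call trace) are ignored.

```python
import re
from collections import defaultdict

# Matches Lustre console lines of the form seen in this ticket, e.g.
#   LustreError: 3069:0:(osp_sync.c:797:osp_sync_process_queues()) can't send: -22
# (Pattern is an assumption based on the lines in this report only.)
LINE_RE = re.compile(
    r"^(?P<level>Lustre(?:Error)?): (?P<pid>\d+):\d+:"
    r"\((?P<file>[\w.]+):(?P<line>\d+):(?P<func>\w+)\(\)\) (?P<msg>.*)$"
)
# Rate-limiting notice printed in place of suppressed duplicates.
SKIPPED_RE = re.compile(r"^Skipped (\d+) previous similar messages$")


def summarize(log_text):
    """Return {(file, line, func): {"printed": n, "skipped": m}}.

    "printed" counts messages that appear in the log; "skipped" sums the
    counts reported by "Skipped N previous similar messages" lines.
    """
    totals = defaultdict(lambda: {"printed": 0, "skipped": 0})
    for raw in log_text.splitlines():
        m = LINE_RE.match(raw.strip())
        if not m:
            continue  # not a pid-tagged Lustre message; ignore
        key = (m["file"], int(m["line"]), m["func"])
        skipped = SKIPPED_RE.match(m["msg"])
        if skipped:
            totals[key]["skipped"] += int(skipped.group(1))
        else:
            totals[key]["printed"] += 1
    return dict(totals)


if __name__ == "__main__":
    # A few lines copied from the log in this ticket.
    sample = (
        "LustreError: 3069:0:(osp_sync.c:797:osp_sync_process_queues()) can't send: -22\n"
        "LustreError: 3069:0:(osp_sync.c:797:osp_sync_process_queues()) Skipped 229931 previous similar messages\n"
        "LustreError: 3081:0:(osp_sync.c:487:osp_sync_new_setattr_job()) nbp6-OST0032-osc-MDT0000: invalid setattr record, lsr_valid:0\n"
        "LustreError: 11-0: nbp6-MDT0000-lwp-MDT0000: Communicating with 0@lo, operation mds_connect failed with -11.\n"
    )
    for (src, line, func), counts in sorted(summarize(sample).items()):
        print(f"{src}:{line} {func}: printed={counts['printed']} skipped={counts['skipped']}")
```

Run against the full log, the per-location skipped totals give a rough measure of how quickly the two `osp_sync.c` messages are being generated.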