Lustre / LU-7888

kernel: INFO: task mount.lustre:22219 blocked for more than 120 seconds

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.9.0
    • Affects Version/s: Lustre 2.7.0, Lustre 2.9.0
    • Labels: None
    • Severity: 3

    Description

      During a test, the following caused the test to fail:

      Mar  2 15:12:36 lotus-42vm5 kernel: LDISKFS-fs (sdc): mounted filesystem with ordered data mode. quota=on. Opts: 
      Mar  2 15:12:36 lotus-42vm5 kernel: 
      Mar  2 15:12:38 lotus-42vm5 kernel: LDISKFS-fs (sda): mounted filesystem with ordered data mode. quota=on. Opts: 
      Mar  2 15:12:38 lotus-42vm5 kernel: 
      Mar  2 15:12:38 lotus-42vm5 kernel: LDISKFS-fs (sde): mounted filesystem with ordered data mode. quota=on. Opts: 
      Mar  2 15:12:38 lotus-42vm5 kernel: 
      Mar  2 15:12:38 lotus-42vm5 kernel: LDISKFS-fs (sdd): mounted filesystem with ordered data mode. quota=on. Opts: 
      Mar  2 15:12:38 lotus-42vm5 kernel: 
      Mar  2 15:13:03 lotus-42vm5 kernel: LDISKFS-fs (sdc): mounted filesystem with ordered data mode. quota=on. Opts: 
      Mar  2 15:13:03 lotus-42vm5 kernel: 
      Mar  2 15:13:03 lotus-42vm5 kernel: LDISKFS-fs (sdc): mounted filesystem with ordered data mode. quota=on. Opts: 
      Mar  2 15:13:03 lotus-42vm5 kernel: 
      Mar  2 15:13:09 lotus-42vm5 kernel: LDISKFS-fs (sdd): mounted filesystem with ordered data mode. quota=on. Opts: 
      Mar  2 15:13:09 lotus-42vm5 kernel: 
      Mar  2 15:13:09 lotus-42vm5 kernel: LDISKFS-fs (sdd): mounted filesystem with ordered data mode. quota=on. Opts: 
      Mar  2 15:13:09 lotus-42vm5 kernel: 
      Mar  2 15:13:09 lotus-42vm5 kernel: Lustre: ctl-testfs-MDT0000: No data found on store. Initialize space
      Mar  2 15:13:09 lotus-42vm5 kernel: Lustre: testfs-MDT0000: new disk, initializing
      Mar  2 15:13:09 lotus-42vm5 kernel: Lustre: Failing over testfs-MDT0000
      Mar  2 15:13:10 lotus-42vm5 kernel: Lustre: server umount testfs-MDT0000 complete
      Mar  2 15:13:10 lotus-42vm5 kernel: LDISKFS-fs (sda): mounted filesystem with ordered data mode. quota=on. Opts: 
      Mar  2 15:13:10 lotus-42vm5 kernel: 
      Mar  2 15:13:10 lotus-42vm5 kernel: LDISKFS-fs (sda): mounted filesystem with ordered data mode. quota=on. Opts: 
      Mar  2 15:13:10 lotus-42vm5 kernel: 
      Mar  2 15:13:10 lotus-42vm5 kernel: Lustre: srv-testfs-MDT0001: No data found on store. Initialize space
      Mar  2 15:13:10 lotus-42vm5 kernel: Lustre: Skipped 1 previous similar message
      Mar  2 15:13:10 lotus-42vm5 kernel: Lustre: testfs-MDT0001: new disk, initializing
      Mar  2 15:13:10 lotus-42vm5 kernel: LustreError: 137-5: testfs-MDT0000_UUID: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server.
      Mar  2 15:13:41 lotus-42vm5 kernel: Lustre: 7592:0:(client.c:2048:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1456960415/real 1456960415]  req@ffff880061d19c80 x1527730997821688/t0(0) o38->testfs-MDT0000-osp-MDT0001@10.14.82.129@tcp:24/4 lens 520/544 e 0 to 1 dl 1456960421 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
      Mar  2 15:13:42 lotus-42vm5 kernel: LDISKFS-fs (sdd): mounted filesystem with ordered data mode. quota=on. Opts: 
      Mar  2 15:13:42 lotus-42vm5 kernel: 
      Mar  2 15:14:00 lotus-42vm5 kernel: LustreError: 137-5: testfs-MDT0000_UUID: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server.
      Mar  2 15:14:00 lotus-42vm5 kernel: LustreError: Skipped 1 previous similar message
      Mar  2 15:14:36 lotus-42vm5 kernel: Lustre: 7592:0:(client.c:2048:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1456960465/real 1456960465]  req@ffff88005e254980 x1527730997821712/t0(0) o38->testfs-MDT0000-osp-MDT0001@10.14.82.129@tcp:24/4 lens 520/544 e 0 to 1 dl 1456960476 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
      Mar  2 15:14:36 lotus-42vm5 kernel: Lustre: 7592:0:(client.c:2048:ptlrpc_expire_one_request()) Skipped 1 previous similar message
      Mar  2 15:14:50 lotus-42vm5 kernel: LustreError: 137-5: testfs-MDT0000_UUID: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server.
      Mar  2 15:14:50 lotus-42vm5 kernel: LustreError: Skipped 1 previous similar message
      Mar  2 15:15:31 lotus-42vm5 kernel: Lustre: 7592:0:(client.c:2048:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1456960515/real 1456960515]  req@ffff88005e254c80 x1527730997821736/t0(0) o38->testfs-MDT0000-osp-MDT0001@10.14.82.129@tcp:24/4 lens 520/544 e 0 to 1 dl 1456960531 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
      Mar  2 15:15:31 lotus-42vm5 kernel: Lustre: 7592:0:(client.c:2048:ptlrpc_expire_one_request()) Skipped 1 previous similar message
      Mar  2 15:15:40 lotus-42vm5 kernel: LustreError: 137-5: testfs-MDT0000_UUID: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server.
      Mar  2 15:15:40 lotus-42vm5 kernel: LustreError: Skipped 1 previous similar message
      Mar  2 15:15:44 lotus-42vm5 kernel: INFO: task mount.lustre:22219 blocked for more than 120 seconds.
      Mar  2 15:15:44 lotus-42vm5 kernel:      Not tainted 2.6.32-573.18.1.el6_lustre.x86_64 #1
      Mar  2 15:15:44 lotus-42vm5 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      Mar  2 15:15:44 lotus-42vm5 kernel: mount.lustre  D 0000000000000000     0 22219  22217 0x00000080
      Mar  2 15:15:44 lotus-42vm5 kernel: ffff88007840f808 0000000000000082 ffff880002215a00 ffff88007bb26f4b
      Mar  2 15:15:44 lotus-42vm5 kernel: ffffffffa12a9e1b ffffffffa12a9e19 ffff88007840f828 ffff88007bb26f78
      Mar  2 15:15:44 lotus-42vm5 kernel: ffff88007840f818 ffffffff8129caf8 ffff88007c5c45f8 ffff88007840ffd8
      Mar  2 15:15:44 lotus-42vm5 kernel: Call Trace:
      Mar  2 15:15:44 lotus-42vm5 kernel: [<ffffffff8129caf8>] ? vsnprintf+0x218/0x5e0
      Mar  2 15:15:44 lotus-42vm5 kernel: [<ffffffff8153bc06>] __mutex_lock_slowpath+0x96/0x210
      Mar  2 15:15:44 lotus-42vm5 kernel: [<ffffffff8129cf64>] ? snprintf+0x34/0x40
      Mar  2 15:15:44 lotus-42vm5 kernel: [<ffffffffa125404f>] ? server_name2fsname+0x6f/0x90 [obdclass]
      Mar  2 15:15:44 lotus-42vm5 kernel: [<ffffffffa040a1f0>] ? qsd_conn_callback+0x0/0x180 [lquota]
      Mar  2 15:15:44 lotus-42vm5 kernel: [<ffffffff8153b72b>] mutex_lock+0x2b/0x50
      Mar  2 15:15:44 lotus-42vm5 kernel: [<ffffffffa128ac3f>] lustre_register_lwp_item+0xdf/0xab0 [obdclass]
      Mar  2 15:15:44 lotus-42vm5 kernel: [<ffffffffa040bd09>] qsd_prepare+0x949/0x11b0 [lquota]
      Mar  2 15:15:44 lotus-42vm5 kernel: [<ffffffffa0464c82>] osd_prepare+0x132/0x720 [osd_ldiskfs]
      Mar  2 15:15:44 lotus-42vm5 kernel: [<ffffffffa08e5640>] lod_prepare+0xf0/0x1e0 [lod]
      Mar  2 15:15:44 lotus-42vm5 kernel: [<ffffffffa095d0af>] mdd_prepare+0xef/0x1280 [mdd]
      Mar  2 15:15:44 lotus-42vm5 kernel: [<ffffffffa149f81b>] ? tgt_ses_key_init+0x6b/0x190 [ptlrpc]
      Mar  2 15:15:44 lotus-42vm5 kernel: [<ffffffffa125ff2d>] ? keys_fill+0xdd/0x1c0 [obdclass]
      Mar  2 15:15:44 lotus-42vm5 kernel: [<ffffffffa07faf76>] mdt_prepare+0x56/0x3b0 [mdt]
      Mar  2 15:15:44 lotus-42vm5 kernel: [<ffffffffa1264403>] ? lu_context_init+0xa3/0x240 [obdclass]
      Mar  2 15:15:44 lotus-42vm5 kernel: [<ffffffffa128a51f>] server_start_targets+0x176f/0x1db0 [obdclass]
      Mar  2 15:15:44 lotus-42vm5 kernel: [<ffffffffa1252550>] ? class_config_llog_handler+0x0/0x17b0 [obdclass]
      Mar  2 15:15:44 lotus-42vm5 kernel: [<ffffffffa1291595>] server_fill_super+0xbe5/0x1a7c [obdclass]
      Mar  2 15:15:44 lotus-42vm5 kernel: [<ffffffffa125cc82>] lustre_fill_super+0xa82/0x2150 [obdclass]
      Mar  2 15:15:44 lotus-42vm5 kernel: [<ffffffffa125c200>] ? lustre_fill_super+0x0/0x2150 [obdclass]
      Mar  2 15:15:44 lotus-42vm5 kernel: [<ffffffff8119567f>] get_sb_nodev+0x5f/0xa0
      Mar  2 15:15:44 lotus-42vm5 kernel: [<ffffffffa1253de5>] lustre_get_sb+0x25/0x30 [obdclass]
      Mar  2 15:15:44 lotus-42vm5 kernel: [<ffffffff81194cbb>] vfs_kern_mount+0x7b/0x1b0
      Mar  2 15:15:44 lotus-42vm5 kernel: [<ffffffff81194e62>] do_kern_mount+0x52/0x130
      Mar  2 15:15:44 lotus-42vm5 kernel: [<ffffffff811b6e1b>] do_mount+0x2fb/0x930
      Mar  2 15:15:44 lotus-42vm5 kernel: [<ffffffff811b74e0>] sys_mount+0x90/0xe0
      Mar  2 15:15:44 lotus-42vm5 kernel: [<ffffffff8100b0d2>] system_call_fastpath+0x16/0x1b
      Mar  2 15:15:44 lotus-42vm5 kernel: INFO: task mount.lustre:22496 blocked for more than 120 seconds.
      Mar  2 15:15:44 lotus-42vm5 kernel:      Not tainted 2.6.32-573.18.1.el6_lustre.x86_64 #1
      Mar  2 15:15:44 lotus-42vm5 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      Mar  2 15:15:44 lotus-42vm5 kernel: mount.lustre  D 0000000000000000     0 22496  22495 0x00000080
      Mar  2 15:15:44 lotus-42vm5 kernel: ffff88006083f908 0000000000000082 ffffffffffffffff 0000000000000000
      Mar  2 15:15:44 lotus-42vm5 kernel: 0000000000000001 77fa41c5810ac58a 0000000000000e3b ffff880077fa41c5
      Mar  2 15:15:44 lotus-42vm5 kernel: 0000000000000000 0000000affffffff ffff880077e805f8 ffff88006083ffd8
      Mar  2 15:15:44 lotus-42vm5 kernel: Call Trace:
      Mar  2 15:15:44 lotus-42vm5 kernel: [<ffffffffa036db53>] ? libcfs_debug_vmsg2+0x5e3/0xbe0 [libcfs]
      Mar  2 15:15:44 lotus-42vm5 kernel: [<ffffffff8153bc06>] __mutex_lock_slowpath+0x96/0x210
      Mar  2 15:15:44 lotus-42vm5 kernel: [<ffffffff8153b72b>] mutex_lock+0x2b/0x50
      Mar  2 15:15:44 lotus-42vm5 kernel: [<ffffffffa0341d10>] mgc_set_info_async+0x450/0x1a50 [mgc]
      Mar  2 15:15:44 lotus-42vm5 kernel: [<ffffffffa036e191>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
      Mar  2 15:15:44 lotus-42vm5 kernel: [<ffffffffa1284745>] server_mgc_set_fs+0x115/0x4e0 [obdclass]
      Mar  2 15:15:44 lotus-42vm5 kernel: [<ffffffffa1288e5f>] server_start_targets+0xaf/0x1db0 [obdclass]
      Mar  2 15:15:44 lotus-42vm5 kernel: [<ffffffffa125ab48>] ? lustre_start_mgc+0xac8/0x2180 [obdclass]
      Mar  2 15:15:44 lotus-42vm5 kernel: [<ffffffffa1291595>] server_fill_super+0xbe5/0x1a7c [obdclass]
      Mar  2 15:15:44 lotus-42vm5 kernel: [<ffffffffa125cc82>] lustre_fill_super+0xa82/0x2150 [obdclass]
      Mar  2 15:15:44 lotus-42vm5 kernel: [<ffffffffa125c200>] ? lustre_fill_super+0x0/0x2150 [obdclass]
      Mar  2 15:15:44 lotus-42vm5 kernel: [<ffffffff8119567f>] get_sb_nodev+0x5f/0xa0
      Mar  2 15:15:44 lotus-42vm5 kernel: [<ffffffffa1253de5>] lustre_get_sb+0x25/0x30 [obdclass]
      Mar  2 15:15:44 lotus-42vm5 kernel: [<ffffffff81194cbb>] vfs_kern_mount+0x7b/0x1b0
      Mar  2 15:15:44 lotus-42vm5 kernel: [<ffffffff81194e62>] do_kern_mount+0x52/0x130
      Mar  2 15:15:44 lotus-42vm5 kernel: [<ffffffff811b6e1b>] do_mount+0x2fb/0x930
      Mar  2 15:15:44 lotus-42vm5 kernel: [<ffffffff811b74e0>] sys_mount+0x90/0xe0
      Mar  2 15:15:44 lotus-42vm5 kernel: [<ffffffff8100b0d2>] system_call_fastpath+0x16/0x1b
      Mar  2 15:16:26 lotus-42vm5 kernel: Lustre: 7592:0:(client.c:2048:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1456960565/real 1456960565]  req@ffff88005e254980 x1527730997821760/t0(0) o38->testfs-MDT0000-osp-MDT0001@10.14.82.129@tcp:24/4 lens 520/544 e 0 to 1 dl 1456960586 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
      Mar  2 15:16:26 lotus-42vm5 kernel: Lustre: 7592:0:(client.c:2048:ptlrpc_expire_one_request()) Skipped 1 previous similar message
      Mar  2 15:16:30 lotus-42vm5 kernel: LustreError: 137-5: testfs-MDT0000_UUID: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server.
      Mar  2 15:16:30 lotus-42vm5 kernel: LustreError: Skipped 1 previous similar message
      Mar  2 15:17:21 lotus-42vm5 kernel: Lustre: 7592:0:(client.c:2048:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1456960615/real 1456960615]  req@ffff88005e254c80 x1527730997821784/t0(0) o38->testfs-MDT0000-osp-MDT0001@10.14.82.129@tcp:24/4 lens 520/544 e 0 to 1 dl 1456960641 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
      class_config_llog_handler+0x0/0x17b0 [obdclass]
      Mar  2 15:19:00 lotus-42vm5 kernel: LustreError: 137-5: testfs-MDT0000_UUID: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server.
      
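      Both traces show mount.lustre stuck in uninterruptible sleep inside __mutex_lock_slowpath(): one mount is blocked in lustre_register_lwp_item() called from qsd_prepare(), the other in mgc_set_info_async() called from server_mgc_set_fs(). The kernel's hung-task watchdog reports any task that stays in that state longer than kernel.hung_task_timeout_secs (120 seconds here). As the message itself notes, the report can be silenced or retuned through /proc/sys/kernel/hung_task_timeout_secs; below is a minimal sketch for reading and adjusting that value while reproducing the hang (the 300-second value is only an example, and writing requires root):

      /*
       * Minimal sketch (assumes the stock procfs sysctl path): read the
       * current hung-task watchdog timeout and, optionally, raise it while
       * debugging a slow mount.  Writing 0 disables the report entirely,
       * exactly as the kernel message above says.
       */
      #include <stdio.h>
      #include <stdlib.h>

      #define HUNG_TASK_SYSCTL "/proc/sys/kernel/hung_task_timeout_secs"

      int main(void)
      {
              unsigned long secs = 0;
              FILE *f = fopen(HUNG_TASK_SYSCTL, "r");

              if (f == NULL || fscanf(f, "%lu", &secs) != 1) {
                      perror(HUNG_TASK_SYSCTL);
                      if (f != NULL)
                              fclose(f);
                      return EXIT_FAILURE;
              }
              fclose(f);
              printf("hung_task_timeout_secs = %lu\n", secs);

              /* Example only: raise the timeout to 300 s (needs root). */
              f = fopen(HUNG_TASK_SYSCTL, "w");
              if (f == NULL) {
                      perror(HUNG_TASK_SYSCTL);
                      return EXIT_FAILURE;
              }
              fprintf(f, "300\n");
              fclose(f);
              return EXIT_SUCCESS;
      }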

      Attachments

        Activity

          People

            yong.fan nasf (Inactive)
            adilger Andreas Dilger
            Votes: 0
            Watchers: 3

            Dates

              Created:
              Updated:
              Resolved: