Lustre / LU-1739

parallel-scale-nfsv4: compilebench: IOError: [Errno 5] Input/output error

Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Minor
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.1.3, Lustre 2.8.0
    • Labels: None
    • Environment:
      Lustre Tag: v2_1_3_RC1
      Lustre Build: http://build.whamcloud.com/job/lustre-b2_1/113/
      Distro/Arch: RHEL6.3/x86_64 + RHEL6.3/i686 (Server + Client)
      Network: TCP (1GigE)
    • Severity: 3
    • Rank (Obsolete): 5784

    Description

      The parallel-scale-nfsv4 compilebench test failed as follows:

      == parallel-scale-nfsv4 test compilebench: compilebench == 12:20:15 (1344799215)
      OPTIONS:
      cbench_DIR=/usr/bin
      cbench_IDIRS=4
      cbench_RUNS=4
      client-26vm1
      client-26vm2.lab.whamcloud.com
      ./compilebench -D /mnt/lustre/d0.compilebench -i 4         -r 4 --makej
      using working directory /mnt/lustre/d0.compilebench, 4 intial dirs 4 runs
      native unpatched native-0 222MB in 337.20 seconds (0.66 MB/s)
      native patched native-0 109MB in 64.12 seconds (1.71 MB/s)
      native patched compiled native-0 691MB in 77.99 seconds (8.87 MB/s)
      create dir kernel-0 222MB in 220.91 seconds (1.01 MB/s)
      create dir kernel-1 222MB in 221.99 seconds (1.00 MB/s)
      create dir kernel-2 222MB in 223.53 seconds (0.99 MB/s)
      create dir kernel-3 222MB in 260.04 seconds (0.86 MB/s)
      Traceback (most recent call last):
        File "./compilebench", line 594, in <module>
          if not compile_one_dir(dset, rnd):
        File "./compilebench", line 368, in compile_one_dir
          mbs = run_directory(ch[0], dir, "compile dir")
        File "./compilebench", line 245, in run_directory
          fp.close()
      IOError: [Errno 5] Input/output error
       parallel-scale-nfsv4 test_compilebench: @@@@@@ FAIL: compilebench failed: 1 
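
The traceback ends in fp.close() rather than in a write call. On NFS, dirty data is commonly flushed to the server when the file is closed, so a server-side write failure can first become visible at close(). The sketch below is not compilebench code; it is a minimal Python 3 illustration (where IOError is an alias of OSError) of how a test could record which stage reported the error. The helper name write_durably is hypothetical.

```python
import errno
import os


def write_durably(path, data):
    """Write data and force it out before closing the file.

    Returns None on success, or (stage, errno_name) for the first failure,
    where stage is "write", "fsync", or "close".
    """
    fp = open(path, "wb")
    stage = "write"
    try:
        fp.write(data)
        fp.flush()              # push Python's buffer to the kernel
        stage = "fsync"
        os.fsync(fp.fileno())   # ask the kernel to commit the data
        stage = "close"
        fp.close()
    except OSError as exc:
        if not fp.closed:
            try:
                fp.close()      # best-effort cleanup; ignore a second error
            except OSError:
                pass
        return stage, errno.errorcode.get(exc.errno, str(exc.errno))
    return None
```

With a helper like this, an EIO that would otherwise appear only at close() would more likely be reported at the "fsync" stage, which helps narrow down when the server rejected the data.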
      

      https://maloo.whamcloud.com/test_sets/8e490bd6-e4bf-11e1-af05-52540035b04c

      The console log on the MDS/Lustre client/NFS server node showed:

      12:20:18:Lustre: DEBUG MARKER: == parallel-scale-nfsv4 test compilebench: compilebench == 12:20:15 (1344799215)
      12:20:18:LustreError: 138-a: lustre-MDT0000: A client on nid 0@lo was evicted due to a lock blocking callback time out: rc -107
      12:20:18:Lustre: DEBUG MARKER: /usr/sbin/lctl mark .\/compilebench -D \/mnt\/lustre\/d0.compilebench -i 4         -r 4 --makej
      12:20:18:Lustre: DEBUG MARKER: ./compilebench -D /mnt/lustre/d0.compilebench -i 4 -r 4 --makej
      12:47:05:------------[ cut here ]------------
      12:47:05:WARNING: at arch/x86/kernel/smp.c:117 native_smp_send_reschedule+0x5c/0x60() (Not tainted)
      12:47:05:Hardware name: KVM
      12:47:06:Modules linked in: lmv(U) nfs fscache cmm(U) osd_ldiskfs(U) mdt(U) mdd(U) mds(U) fsfilt_ldiskfs(U) mgs(U) mgc(U) lustre(U) lov(U) osc(U) lquota(U) mdc(U) fid(U) fld(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) ldiskfs(U) jbd2 nfsd lockd nfs_acl auth_rpcgss exportfs autofs4 sunrpc ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 ib_sa ib_mad ib_core microcode virtio_balloon 8139too 8139cp mii i2c_piix4 i2c_core ext3 jbd mbcache virtio_blk virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: lnet_selftest]
      12:47:06:Pid: 327, comm: nfsd Not tainted 2.6.32-279.2.1.el6_lustre.x86_64 #1
      12:47:06:Call Trace:
      12:47:06: <IRQ>  [<ffffffff8106b747>] ? warn_slowpath_common+0x87/0xc0
      12:47:06: [<ffffffff8106b79a>] ? warn_slowpath_null+0x1a/0x20
      12:47:06: [<ffffffff8102a8cc>] ? native_smp_send_reschedule+0x5c/0x60
      12:47:06: [<ffffffff8104f218>] ? resched_task+0x68/0x80
      12:47:06: [<ffffffff8105484c>] ? task_tick_fair+0x14c/0x160
      12:47:06: [<ffffffff81057fa1>] ? scheduler_tick+0xc1/0x260
      12:47:06: [<ffffffff810a21d0>] ? tick_sched_timer+0x0/0xc0
      12:47:07: [<ffffffff8107e25e>] ? update_process_times+0x6e/0x90
      12:47:07: [<ffffffff810a2236>] ? tick_sched_timer+0x66/0xc0
      12:47:07: [<ffffffff8109684e>] ? __run_hrtimer+0x8e/0x1a0
      12:47:07: [<ffffffff81038779>] ? kvm_clock_get_cycles+0x9/0x10
      12:47:07: [<ffffffff81096bf6>] ? hrtimer_interrupt+0xe6/0x250
      12:47:07: [<ffffffff8150603b>] ? smp_apic_timer_interrupt+0x6b/0x9b
      12:47:07: [<ffffffff8100bc13>] ? apic_timer_interrupt+0x13/0x20
      12:47:07: <EOI>  [<ffffffff8112df11>] ? shrink_inactive_list+0x2f1/0x7d0
      12:47:07: [<ffffffffa04540b7>] ? cfs_hash_bd_lookup_intent+0x37/0x120 [libcfs]
      12:47:07: [<ffffffffa04534f4>] ? cfs_hash_dual_bd_unlock+0x34/0x60 [libcfs]
      12:47:07: [<ffffffffa0456d0d>] ? cfs_hash_find_or_add+0x9d/0x190 [libcfs]
      12:47:07: [<ffffffff8112ecbf>] ? shrink_zone+0x38f/0x520
      12:47:07: [<ffffffff8109cd49>] ? ktime_get_ts+0xa9/0xe0
      12:47:07: [<ffffffff8112ef4e>] ? do_try_to_free_pages+0xfe/0x520
      12:47:07: [<ffffffff8111835f>] ? zone_watermark_ok+0x1f/0x30
      12:47:07: [<ffffffff8112f55d>] ? try_to_free_pages+0x9d/0x130
      12:47:07: [<ffffffff81135f26>] ? next_online_pgdat+0x26/0x50
      12:47:07: [<ffffffff811306b0>] ? isolate_pages_global+0x0/0x350
      12:47:08: [<ffffffff8112735d>] ? __alloc_pages_nodemask+0x40d/0x940
      12:47:08: [<ffffffff81161e92>] ? kmem_getpages+0x62/0x170
      12:47:08: [<ffffffff81162aaa>] ? fallback_alloc+0x1ba/0x270
      12:47:08: [<ffffffff811624ff>] ? cache_grow+0x2cf/0x320
      12:47:08: [<ffffffff81162829>] ? ____cache_alloc_node+0x99/0x160
      12:47:08: [<ffffffffa0445a13>] ? cfs_alloc+0x63/0x90 [libcfs]
      12:47:08: [<ffffffff81163459>] ? __kmalloc+0x189/0x220
      12:47:08: [<ffffffffa0445a13>] ? cfs_alloc+0x63/0x90 [libcfs]
      12:47:08: [<ffffffffa063ed7a>] ? ptlrpc_prep_bulk_imp+0x7a/0x350 [ptlrpc]
      12:47:08: [<ffffffffa064b36c>] ? lustre_msg_set_timeout+0x9c/0x110 [ptlrpc]
      12:47:08: [<ffffffffa07173df>] ? osc_brw_prep_request+0x7cf/0x1030 [osc]
      12:47:08: [<ffffffffa072c88b>] ? osc_req_attr_set+0xfb/0x2a0 [osc]
      12:47:08: [<ffffffffa088b8c8>] ? ccc_req_attr_set+0x78/0x150 [lustre]
      12:47:08: [<ffffffffa056b6cc>] ? cl_req_prep+0x8c/0x190 [obdclass]
      12:47:08: [<ffffffffa0718dd5>] ? osc_send_oap_rpc+0x1195/0x1c20 [osc]
      12:47:08: [<ffffffff8127a45c>] ? put_dec+0x10c/0x110
      12:47:08: [<ffffffffa070aff1>] ? osc_consume_write_grant+0x81/0x160 [osc]
      12:47:08: [<ffffffffa07295d0>] ? osc_page_init+0x190/0x330 [osc]
      12:47:09: [<ffffffffa0719b3e>] ? osc_check_rpcs+0x2de/0x470 [osc]
      12:47:09: [<ffffffffa0710143>] ? on_list+0x43/0x50 [osc]
      12:47:09: [<ffffffffa071a6e3>] ? osc_queue_async_io+0x3c3/0x8f0 [osc]
      12:47:09: [<ffffffff81039632>] ? pvclock_clocksource_read+0x12/0xd0
      12:47:09: [<ffffffff8127ce96>] ? vsnprintf+0x2b6/0x5f0
      12:47:09: [<ffffffff8109caaa>] ? do_gettimeofday+0x1a/0x50
      12:47:09: [<ffffffffa072863f>] ? osc_page_cache_add+0xcf/0x200 [osc]
      12:47:09: [<ffffffffa044f6a1>] ? libcfs_debug_vmsg2+0x4d1/0xb50 [libcfs]
      12:47:09: [<ffffffffa055f908>] ? cl_page_invoke+0xb8/0x160 [obdclass]
      12:47:09: [<ffffffffa0802afe>] ? lov_page_stripe+0x3e/0x150 [lov]
      12:47:09: [<ffffffffa0560918>] ? cl_page_cache_add+0x58/0x240 [obdclass]
      12:47:09: [<ffffffffa045415d>] ? cfs_hash_bd_lookup_intent+0xdd/0x120 [libcfs]
      12:47:09: [<ffffffffa0892cc5>] ? vvp_io_commit_write+0x325/0x580 [lustre]
      12:47:09: [<ffffffffa0455402>] ? cfs_hash_lookup+0x82/0xa0 [libcfs]
      12:47:09: [<ffffffffa056e83f>] ? cl_io_commit_write+0xaf/0x1e0 [obdclass]
      12:47:09: [<ffffffffa055ebb9>] ? cl_env_get+0x29/0x350 [obdclass]
      12:47:09: [<ffffffffa086a0ad>] ? ll_commit_write+0xed/0x300 [lustre]
      12:47:10: [<ffffffffa0881990>] ? ll_write_end+0x30/0x60 [lustre]
      12:47:10: [<ffffffff81114c4a>] ? generic_file_buffered_write+0x18a/0x2e0
      12:47:10: [<ffffffff810724c7>] ? current_fs_time+0x27/0x30
      12:47:10: [<ffffffff81116580>] ? __generic_file_aio_write+0x250/0x480
      12:47:10: [<ffffffff8111681f>] ? generic_file_aio_write+0x6f/0xe0
      12:47:10: [<ffffffffa08935d1>] ? vvp_io_write_start+0xa1/0x270 [lustre]
      12:47:10: [<ffffffffa056b018>] ? cl_io_start+0x68/0x170 [obdclass]
      12:47:10: [<ffffffffa056fb90>] ? cl_io_loop+0x110/0x1c0 [obdclass]
      12:47:10: [<ffffffffa083a98b>] ? ll_file_io_generic+0x44b/0x580 [lustre]
      12:47:10: [<ffffffff8100ba4e>] ? common_interrupt+0xe/0x13
      12:47:10: [<ffffffffa083aac0>] ? ll_file_aio_write+0x0/0x310 [lustre]
      12:47:10: [<ffffffffa083abff>] ? ll_file_aio_write+0x13f/0x310 [lustre]
      12:47:10: [<ffffffffa083aac0>] ? ll_file_aio_write+0x0/0x310 [lustre]
      12:47:10: [<ffffffff8117ad5b>] ? do_sync_readv_writev+0xfb/0x140
      12:47:10: [<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40
      12:47:10: [<ffffffff81213286>] ? security_file_permission+0x16/0x20
      12:47:10: [<ffffffff8117be06>] ? do_readv_writev+0xd6/0x1f0
      12:47:11: [<ffffffffa02bcdc0>] ? cache_check+0x60/0x360 [sunrpc]
      12:47:11: [<ffffffffa030b759>] ? exportfs_decode_fh+0x99/0x2bc [exportfs]
      12:47:11: [<ffffffff812135e6>] ? security_task_setgroups+0x16/0x20
      12:47:11: [<ffffffff8109ae85>] ? set_groups+0x25/0x190
      12:47:11: [<ffffffff8117bf66>] ? vfs_writev+0x46/0x60
      12:47:11: [<ffffffffa034d3f5>] ? nfsd_vfs_write+0x105/0x430 [nfsd]
      12:47:11: [<ffffffffa034b872>] ? nfsd_setuser_and_check_port+0x62/0xb0 [nfsd]
      12:47:11: [<ffffffffa034fed9>] ? nfsd_write+0x99/0x100 [nfsd]
      12:47:11: [<ffffffffa035a5f0>] ? nfsd4_write+0x100/0x130 [nfsd]
      12:47:11: [<ffffffffa035af68>] ? nfsd4_proc_compound+0x3d8/0x490 [nfsd]
      12:47:11: [<ffffffffa034843e>] ? nfsd_dispatch+0xfe/0x240 [nfsd]
      12:47:11: [<ffffffffa02b25d4>] ? svc_process_common+0x344/0x640 [sunrpc]
      12:47:11: [<ffffffff81060250>] ? default_wake_function+0x0/0x20
      12:47:11: [<ffffffffa02b2c10>] ? svc_process+0x110/0x160 [sunrpc]
      12:47:11: [<ffffffffa0348b62>] ? nfsd+0xc2/0x160 [nfsd]
      12:47:11: [<ffffffffa0348aa0>] ? nfsd+0x0/0x160 [nfsd]
      12:47:11: [<ffffffff81091d66>] ? kthread+0x96/0xa0
      12:47:11: [<ffffffff8100c14a>] ? child_rip+0xa/0x20
      12:47:11: [<ffffffff81091cd0>] ? kthread+0x0/0xa0
      12:47:12: [<ffffffff8100c140>] ? child_rip+0x0/0x20
      12:47:12:---[ end trace a40d7ac865e9e05f ]---
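
The eviction message at 12:20:18 is the first error in the console log. When comparing many console logs for the same signature, a small parser can pull out the target, NID, and return code. The sketch below is an assumption-laden illustration: the regular expression is derived only from the single message quoted above, other Lustre versions may word the message differently, and the names EVICT_RE and parse_eviction are hypothetical. On Linux, rc -107 corresponds to ENOTCONN.

```python
import re

# Pattern based solely on the one eviction line shown in this report.
EVICT_RE = re.compile(
    r"LustreError: (?P<code>\S+): (?P<target>\S+): "
    r"A client on nid (?P<nid>\S+) was evicted due to "
    r"(?P<reason>.+?): rc (?P<rc>-?\d+)"
)


def parse_eviction(line):
    """Return a dict describing an eviction message, or None if no match."""
    match = EVICT_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    info["rc"] = int(info["rc"])
    return info
```

A NID of 0@lo refers to the loopback interface, so a match with that NID means the evicted client was running on the server node itself.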
      

          Activity

            sarah Sarah Liu added a comment -

            dup of LU-2661

            standan Saurabh Tandan (Inactive) added a comment -

            Another instance found for interop - EL7 Server/2.7.1 Client, tag 2.7.90.
            https://testing.hpdd.intel.com/test_sessions/495aabae-d306-11e5-be5c-5254006e85c2

            standan Saurabh Tandan (Inactive) added a comment - edited

            Another instance found for Full tag 2.7.66 - EL7.1 Server/SLES11 SP3 Client, build# 3314
            https://testing.hpdd.intel.com/test_sets/b154fcf2-ca7b-11e5-9609-5254006e85c2

            Another instance found for Full tag 2.7.66 - EL6.7 Server/SLES11 SP3 Client, build# 3316
            https://testing.hpdd.intel.com/test_sets/fdcf7f92-cce9-11e5-8b0e-5254006e85c2

            standan Saurabh Tandan (Inactive) added a comment -

            Another failure for master: Tag 2.7.66 FULL - EL7.1 Server/SLES11 SP3 Client, build# 3314
            https://testing.hpdd.intel.com/test_sets/b154fcf2-ca7b-11e5-9609-5254006e85c2

            standan Saurabh Tandan (Inactive) added a comment - edited

            master, build# 3264, 2.7.64 tag
            Regression: EL7.1 Server/EL7.1 Client
            https://testing.hpdd.intel.com/test_sets/79b1a120-9f37-11e5-ba94-5254006e85c2
            https://testing.hpdd.intel.com/test_sets/7a96d3d0-9f37-11e5-ba94-5254006e85c2

            standan Saurabh Tandan (Inactive) added a comment -

            Encountered the same issue for tag 2.7.61
            server: 2.5.5, b2_5_fe/62
            client: build# 3203, RHEL 7

            https://testing.hpdd.intel.com/test_sets/01c3aa70-6b05-11e5-9272-5254006e85c2

            IOError: [Errno 5] Input/output error
             parallel-scale-nfsv4 test_compilebench: @@@@@@ FAIL: compilebench failed: 1

            pjones Peter Jones added a comment -

            Oleg believes this to be a duplicate of LU-969. Please reopen if this problem reoccurs with that fix in place.

            People

              Assignee: wc-triage WC Triage
              Reporter: yujian Jian Yu