
[LU-12832] soft lockup in ldlm_bl_xx threads at read for a single shared strided file

Details

    • Type: Bug
    • Resolution: Not a Bug
    • Priority: Major
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.13.0
    • Components: None
    • Environment: master (commit fcf219d) on both server and client
    • Severity: 3

    Description

      Many clients hit a soft lockup in the ldlm_bl_xx threads while reading a single shared strided file, the same workload as the io500 ior-hard phase. In the end, ior failed because the clients were evicted.

      # mkdir /scratch0/io500.out/ior_hard -p
      # lfs setstripe -C 240 -S 16M /scratch0/io500.out/ior_hard
      
      # salloc --nodes=10 --ntasks-per-node=24 mpirun --allow-run-as-root /scratch0/ior -w -s 132000 -a POSIX -i 1 -C -Q 1 -g -G 27 -k -e -t 47008 -b 47008 -o /scratch0/io500.out/ior_hard/IOR_file -O stoneWallingStatusFile=/scratch0/io500.out/ior_hard/stonewall -O stoneWallingWearOut=1 -D 300 -W
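
      As a quick sanity check (not part of the original report), the directory's default layout can be confirmed before the run; a minimal sketch, assuming the same path, where lfs getstripe -d should report the 240 overstriped instances and the 16M stripe size requested above:

      # lfs getstripe -d /scratch0/io500.out/ior_hard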
      

      Log from one of the servers:

      Oct  6 21:27:46 es400nv-vm1 kernel: LustreError: 26369:0:(ldlm_lockd.c:261:expired_lock_main()) ### lock callback timer expired after 99s: evicting client at 10.0.13.151@o2ib10  ns: filter-ai400-OST0000_UUID lock: ffff995029f90480/0xe6d707f7408ef34e lrc: 3/0,0 mode: PW/PW res: [0x19:0x0:0x0].0x0 rrc: 30741 type: EXT [5842382848->5842853887] (req 5842382848->5842431999) flags: 0x60000400000020 nid: 10.0.13.151@o2ib10 remote: 0xeff507f1ffb50fc0 expref: 158923 pid: 27145 timeout: 11179 lvb_type: 0
      Oct  6 21:27:46 es400nv-vm1 kernel: Lustre: ai400-OST0000: Connection restored to 85106299-51f6-4 (at 10.0.13.154@o2ib10)
      Oct  6 21:27:46 es400nv-vm1 kernel: Lustre: Skipped 127 previous similar messages
      Oct  6 21:27:49 es400nv-vm1 kernel: LustreError: 26369:0:(ldlm_lockd.c:261:expired_lock_main()) ### lock callback timer expired after 99s: evicting client at 10.0.13.152@o2ib10  ns: filter-ai400-OST0001_UUID lock: ffff9950aee0d8c0/0xe6d707f740dbdd52 lrc: 3/0,0 mode: PW/PW res: [0x8:0x0:0x0].0x0 rrc: 26996 type: EXT [5140467712->5140746239] (req 5140467712->5140516863) flags: 0x60000400020020 nid: 10.0.13.152@o2ib10 remote: 0xc5564b8fed3e7bd7 expref: 166664 pid: 26904 timeout: 11182 lvb_type: 0
      Oct  6 21:27:49 es400nv-vm1 kernel: LustreError: 26369:0:(ldlm_lockd.c:261:expired_lock_main()) Skipped 2 previous similar messages
      Oct  6 21:27:53 es400nv-vm1 kernel: LustreError: 26986:0:(ldlm_lockd.c:2437:ldlm_cancel_handler()) ldlm_cancel from 10.0.13.152@o2ib10 arrived at 1570364871 with bad export cookie 16633772507216549573
      Oct  6 21:27:59 es400nv-vm1 kernel: LustreError: 26369:0:(ldlm_lockd.c:261:expired_lock_main()) ### lock callback timer expired after 99s: evicting client at 10.0.13.159@o2ib10  ns: filter-ai400-OST0000_UUID lock: ffff9950587b6400/0xe6d707f7402a10dc lrc: 3/0,0 mode: PW/PW res: [0x15:0x0:0x0].0x0 rrc: 18417 type: EXT [6164672512->6164860927] (req 6164672512->6164721663) flags: 0x60000400020020 nid: 10.0.13.159@o2ib10 remote: 0xd7d9cb14933b97f6 expref: 153927 pid: 26985 timeout: 11192 lvb_type: 0

      Log from one of the clients:

      Oct  6 21:27:54 c05 kernel: NMI watchdog: BUG: soft lockup - CPU#9 stuck for 22s! [ldlm_bl_01:151183]
      Oct  6 21:27:54 c05 kernel: Modules linked in: mgc(OE) lustre(OE) lmv(OE) mdc(OE) fid(OE) osc(OE) lov(OE) fld(OE) ko2iblnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) ip6table_nat ip6_tables xt_statistic xt_physdev xt_NFLOG nfnetlink_log veth dummy vport_vxlan vxlan ip6_udp_tunnel udp_tunnel openvswitch nf_conntrack_ipv6 nf_nat_ipv6 nf_defrag_ipv6 ip_set_hash_ip xt_set ip_set xt_nat ipt_REJECT nf_reject_ipv4 xt_comment xt_mark ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netlink nfnetlink iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 xt_addrtype iptable_filter xt_conntrack nf_nat nf_conntrack libcrc32c br_netfilter bridge stp llc nfsv3 nfs_acl nfs lockd grace fscache overlay(T) rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_fpga_tools(OE) mlx5_ib(OE) ib_uverbs(OE)
      Oct  6 21:27:54 c05 kernel: mlx5_core(OE) mlxfw(OE) mlx4_en(OE) sunrpc iTCO_wdt iTCO_vendor_support mxm_wmi sb_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd pcspkr joydev sg i2c_i801 mei_me lpc_ich ioatdma mei ipmi_si ipmi_devintf ipmi_msghandler wmi acpi_pad acpi_power_meter binfmt_misc knem(OE) ip_tables ext4 mbcache jbd2 mlx4_ib(OE) ib_core(OE) sd_mod crc_t10dif crct10dif_generic mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci drm libahci igb crct10dif_pclmul crct10dif_common libata crc32c_intel mlx4_core(OE) ptp pps_core devlink dca i2c_algo_bit drm_panel_orientation_quirks mlx_compat(OE) [last unloaded: libcfs]
      Oct  6 21:27:54 c05 kernel: CPU: 9 PID: 151183 Comm: ldlm_bl_01 Kdump: loaded Tainted: G           OEL ------------ T 3.10.0-957.27.2.el7.x86_64 #1
      Oct  6 21:27:54 c05 kernel: Hardware name: Intel Corporation S2600KPR/S2600KPR, BIOS SE5C610.86B.01.01.0027.071020182329 07/10/2018
      Oct  6 21:27:54 c05 kernel: task: ffff8eb4c1eb4100 ti: ffff8eb64c178000 task.ti: ffff8eb64c178000
      Oct  6 21:27:54 c05 kernel: RIP: 0010:[<ffffffff89d13536>]  [<ffffffff89d13536>] native_queued_spin_lock_slowpath+0x126/0x200
      Oct  6 21:27:54 c05 kernel: RSP: 0018:ffff8eb64c17bc70  EFLAGS: 00000246
      Oct  6 21:27:54 c05 kernel: RAX: 0000000000000000 RBX: ffff8ec614df8280 RCX: 0000000000490000
      Oct  6 21:27:54 c05 kernel: RDX: ffff8eb67e79b780 RSI: 0000000000310001 RDI: ffff8ec614de0058
      Oct  6 21:27:54 c05 kernel: RBP: ffff8eb64c17bc70 R08: ffff8eb67e85b780 R09: 0000000000000000
      Oct  6 21:27:54 c05 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff8eb64c17bc38
      Oct  6 21:27:54 c05 kernel: R13: ffff8eb1aa758000 R14: ffff8eb2af3e46c8 R15: ffff8eb1aa758098
      Oct  6 21:27:54 c05 kernel: FS:  0000000000000000(0000) GS:ffff8eb67e840000(0000) knlGS:0000000000000000
      Oct  6 21:27:54 c05 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      Oct  6 21:27:54 c05 kernel: CR2: 00000000029cb000 CR3: 0000000f51610000 CR4: 00000000003607e0
      Oct  6 21:27:54 c05 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      Oct  6 21:27:54 c05 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Oct  6 21:27:54 c05 kernel: Call Trace:
      Oct  6 21:27:54 c05 kernel: [<ffffffff8a35e2cb>] queued_spin_lock_slowpath+0xb/0xf
      Oct  6 21:27:54 c05 kernel: [<ffffffff8a36c7a0>] _raw_spin_lock+0x20/0x30
      Oct  6 21:27:54 c05 kernel: [<ffffffffc0c3b32a>] cl_object_attr_lock+0x1a/0x20 [obdclass]
      Oct  6 21:27:54 c05 kernel: [<ffffffffc0fea746>] osc_ldlm_blocking_ast+0x2f6/0x3a0 [osc]
      Oct  6 21:27:54 c05 kernel: [<ffffffffc0e3e54a>] ldlm_cancel_callback+0x8a/0x330 [ptlrpc]
      Oct  6 21:27:54 c05 kernel: [<ffffffffc0c14979>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
      Oct  6 21:27:54 c05 kernel: [<ffffffffc0e49460>] ldlm_cli_cancel_local+0xa0/0x3f0 [ptlrpc]
      Oct  6 21:27:54 c05 kernel: [<ffffffffc0e4d97a>] ldlm_cli_cancel_list_local+0xea/0x260 [ptlrpc]
      Oct  6 21:27:54 c05 kernel: [<ffffffffc0e54370>] ldlm_bl_thread_main+0x5a0/0xa40 [ptlrpc]
      Oct  6 21:27:54 c05 kernel: [<ffffffff89cd7c40>] ? wake_up_state+0x20/0x20
      Oct  6 21:27:54 c05 kernel: [<ffffffffc0e53dd0>] ? ldlm_handle_bl_callback+0x4e0/0x4e0 [ptlrpc]
      Oct  6 21:27:54 c05 kernel: [<ffffffff89cc2e81>] kthread+0xd1/0xe0
      Oct  6 21:27:54 c05 kernel: [<ffffffff89cc2db0>] ? insert_kthread_work+0x40/0x40
      Oct  6 21:27:54 c05 kernel: [<ffffffff8a376c1d>] ret_from_fork_nospec_begin+0x7/0x21
      Oct  6 21:27:54 c05 kernel: [<ffffffff89cc2db0>] ? insert_kthread_work+0x40/0x40
      Oct  6 21:27:54 c05 kernel: Code: 0d 48 98 83 e2 30 48 81 c2 80 b7 01 00 48 03 14 c5 a0 bf 94 8a 4c 89 02 41 8b 40 08 85 c0 75 0f 0f 1f 44 00 00 f3 90 41 8b 40 08 <85> c0 74 f6 4d 8b 08 4d 85 c9 74 04 41 0f 18 09 8b 17 0f b7 c2 
      Oct  6 21:27:56 c05 kernel: LustreError: 152233:0:(ldlm_resource.c:1147:ldlm_resource_complain()) ai400-OST0006-osc-ffff8eafdda3d800: namespace resource [0x10:0x0:0x0].0x0 (ffff8ebf4ddcc840) refcount nonzero (1404) after lock cleanup; forcing cleanup.
      Oct  6 21:27:56 c05 kernel: Lustre: ai400-OST0006-osc-ffff8eafdda3d800: Connection restored to 10.0.10.178@o2ib10 (at 10.0.10.178@o2ib10)
      Oct  6 21:27:58 c05 kernel: LustreError: 152229:0:(ldlm_resource.c:1147:ldlm_resource_complain()) ai400-OST0002-osc-ffff8eafdda3d800: namespace resource [0x10:0x0:0x0].0x0 (ffff8ec4263355c0) refcount nonzero (1729) after lock cleanup; forcing cleanup.
      Oct  6 21:27:58 c05 kernel: LustreError: 152229:0:(ldlm_resource.c:1147:ldlm_resource_complain()) Skipped 59 previous similar messages
      Oct  6 21:27:58 c05 kernel: Lustre: ai400-OST0002-osc-ffff8eafdda3d800: Connection restored to 10.0.10.176@o2ib10 (at 10.0.10.176@o2ib10)

      Attachments

        1. vmcore-dmesg.txt (118 kB, Shuichi Ihara)
        2. vmcore-dmesg.txt (1021 kB, Shuichi Ihara)

        Issue Links

          Activity

            [LU-12832] soft lockup in ldlm_bl_xx threads at read for a single shared strided file
            adilger Andreas Dilger made changes -
            Link New: This issue is related to LU-13131 [ LU-13131 ]
            pjones Peter Jones made changes -
            Fix Version/s Original: Lustre 2.13.0 [ 14290 ]
            Resolution New: Not a Bug [ 6 ]
            Status Original: Open [ 1 ] New: Resolved [ 5 ]
            pjones Peter Jones added a comment -

            Seems to be a hardware issue.
            pjones Peter Jones made changes -
            Link New: This issue is related to LU-12786 [ LU-12786 ]
            simmonsja James A Simmons made changes -
            Comment [ Does patch [https://review.whamcloud.com/#/c/36229/] help? ]
            adilger Andreas Dilger made changes -
            Link New: This issue is related to LU-12377 [ LU-12377 ]

            sihara Shuichi Ihara added a comment -

            Aggressive background lock cancellation in advance helps to avoid this problem, e.g. setting lru_max_age to 30 seconds with 'lctl set_param ldlm.namespaces.*.lru_max_age=30000'.
            After this change, I didn't see the problem.
            However, I am now reproducing LU-12786 again, so I will re-open LU-12786.
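
            The workaround above, as a hedged sketch (only the set_param line comes from the comment; the rest is illustrative). lru_max_age is in milliseconds, so 30000 means locks idle for 30 seconds become eligible for cancellation; note that a plain set_param does not persist across a remount:

            # shrink the client LDLM lock LRU age to 30 seconds
            lctl set_param ldlm.namespaces.*.lru_max_age=30000
            # verify the value on every namespace
            lctl get_param ldlm.namespaces.*.lru_max_age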

            sihara Shuichi Ihara added a comment -

            I wonder if overstriping could be related. I have been enabling overstriping for this testing, but will test without overstriping on master. If the problem still exists, we can test an older commit. It takes too long to finish this ior job without overstriping, though...
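
            For comparison (a hedged sketch, not from the comment above; the _nostripe path is hypothetical), the overstriped and non-overstriped layouts would be set up roughly as follows:

            # overstriped layout used in the failing runs: 240 stripe instances, possibly several per OST
            lfs setstripe -C 240 -S 16M /scratch0/io500.out/ior_hard
            # non-overstriped baseline: at most one stripe per OST (-c -1 stripes across all OSTs)
            lfs setstripe -c -1 -S 16M /scratch0/io500.out/ior_hard_nostripe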
            sihara Shuichi Ihara made changes -
            Attachment New: vmcore-dmesg.txt [ 33641 ]

            sihara Shuichi Ihara added a comment -

            I've tested the above change with the same workload, but it caused another regression at write (not read). Please see the console messages from when the client crashed.

            [295995.970954] LustreError: 92739:0:(osc_cache.c:1153:osc_extent_make_ready()) ASSERTION( last_oap_count > 0 ) failed: 
            [295995.971011] LustreError: 92739:0:(osc_cache.c:1153:osc_extent_make_ready()) LBUG
            [295995.971047] Pid: 92739, comm: ldlm_bl_02 3.10.0-957.27.2.el7.x86_64 #1 SMP Mon Jul 29 17:46:05 UTC 2019
            [295995.971048] Call Trace:
            [295995.971058]  [<ffffffffc0cb58ac>] libcfs_call_trace+0x8c/0xc0 [libcfs]
            [295995.971071]  [<ffffffffc0cb595c>] lbug_with_loc+0x4c/0xa0 [libcfs]
            [295995.971077]  [<ffffffffc123204b>] osc_extent_make_ready+0x91b/0xe50 [osc]
            [295995.971097]  [<ffffffffc123499f>] osc_io_unplug0+0xd9f/0x1910 [osc]
            [295995.971105]  [<ffffffffc12391e0>] osc_cache_writeback_range+0x9c0/0xff0 [osc]
            [295995.971113]  [<ffffffffc1224275>] osc_lock_flush+0x195/0x290 [osc]
            [295995.971120]  [<ffffffffc1224733>] osc_ldlm_blocking_ast+0x2e3/0x3a0 [osc]
            [295995.971126]  [<ffffffffc107854a>] ldlm_cancel_callback+0x8a/0x330 [ptlrpc]
            [295995.971161]  [<ffffffffc1083460>] ldlm_cli_cancel_local+0xa0/0x3f0 [ptlrpc]
            [295995.971181]  [<ffffffffc10891c1>] ldlm_cli_cancel+0x161/0x650 [ptlrpc]
            [295995.971205]  [<ffffffffc12245ca>] osc_ldlm_blocking_ast+0x17a/0x3a0 [osc]
            [295995.971213]  [<ffffffffc108d9dd>] ldlm_handle_bl_callback+0xed/0x4e0 [ptlrpc]
            [295995.971233]  [<ffffffffc108e5d0>] ldlm_bl_thread_main+0x800/0xa40 [ptlrpc]
            [295995.971252]  [<ffffffff932c2e81>] kthread+0xd1/0xe0
            [295995.971258]  [<ffffffff93976c1d>] ret_from_fork_nospec_begin+0x7/0x21
            [295995.971264]  [<ffffffffffffffff>] 0xffffffffffffffff
            [295995.971294] Kernel panic - not syncing: LBUG
            

            People

              Assignee: wshilong Wang Shilong (Inactive)
              Reporter: sihara Shuichi Ihara
              Votes: 0
              Watchers: 8

              Dates

                Created:
                Updated:
                Resolved: