do not call blocking ops when !TASK_RUNNING occurs in osd-ldisk / quota path

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.17.0, Lustre 2.15.7
    • Affects Version/s: Lustre 2.17.0
    • Environment: Lustre 2.17 running ldiskfs with a debug kernel.
    • Severity: 3

    Description

      With a debug kernel, sanity-quota test 1i reports the following warning:

      [ 1276.935753] ------------[ cut here ]------------ 
      [ 1276.937120] do not call blocking ops when !TASK_RUNNING; state=402 set at [<00000000ccd913e0>] prepare_to_wait_event+0xc9/0x2a0 
      [ 1276.939652] WARNING: CPU: 2 PID: 15740 at kernel/sched/core.c:6733 __might_sleep+0xa3/0xc0 
      [ 1276.952941] CPU: 2 PID: 15740 Comm: ll_ost_io00_003 4.18.0rh8.5-debug #2 
      [ 1276.955682] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc38 04/01/2014 
      [ 1276.957525] RIP: 0010:__might_sleep+0xa3/0xc0 
      
      Call Trace:
      [ 1276.976327] down_read_nested+0x2e/0x430
      [ 1276.977082] osd_read_lock+0xc8/0x180 [osd_ldiskfs]
      [ 1276.977957] lquota_disk_read+0x8e/0x540 [lquota] 
      [ 1276.978994] qsd_refresh_usage+0x105/0x3d0 [lquota] 
      [ 1276.979956] qsd_acquire+0xbe/0x770 [lquota] 
      [ 1276.980753] qsd_op_begin0+0x5f8/0xd60 [lquota] 
      [ 1276.983353] qsd_op_begin+0x3fa/0x6d0 [lquota] 
      [ 1276.984720] osd_declare_qid+0x4da/0x770 [osd_ldiskfs] 
      [ 1276.985625] osd_declare_inode_qid+0x14f/0x630 [osd_ldiskfs] 
      [ 1276.986723] osd_declare_write_commit+0x810/0xaa0 [osd_ldiskfs] 
      [ 1276.988811] ofd_commitrw_write+0x60e/0x2010 [ofd] 
      [ 1276.989579] ofd_commitrw+0x838/0x15d0 [ofd] 
      [ 1276.991503] tgt_brw_write+0x19ab/0x3780 [ptlrpc] 
      [ 1276.993689] tgt_handle_request0+0x13c/0xb00 [ptlrpc] 
      [ 1276.994614] tgt_request_handle+0x351/0x1c10 [ptlrpc]
      [ 1276.995676] ptlrpc_server_handle_request+0x379/0x1320 [ptlrpc]
      [ 1276.997827] ptlrpc_main+0xd58/0x1500 [ptlrpc]
      

      Issue Links

        Related to LU-16807, LU-18980
          Activity

            [LU-18516] do not call blocking ops when !TASK_RUNNING occurs in osd-ldisk / quota path
            pjones Peter Jones made changes -
            Fix Version/s New: Lustre 2.15.7 [ 16821 ]
            hongchao.zhang Hongchao Zhang made changes -
            Link New: This issue is related to LU-18980 [ LU-18980 ]
            pjones Peter Jones made changes -
            Fix Version/s New: Lustre 2.17.0 [ 16192 ]
            Resolution New: Fixed [ 1 ]
            Status Original: Open [ 1 ] New: Resolved [ 5 ]
            adilger Andreas Dilger made changes -
            Labels New: janitor9x
            pjones Peter Jones made changes -
            Assignee Original: WC Triage [ wc-triage ] New: James A Simmons [ simmonsja ]
            adilger Andreas Dilger made changes -
            Description: joined the warning line that had wrapped after "set at" back onto a single line; log text otherwise unchanged (see Description above).
            adilger Andreas Dilger made changes -
            Description: moved the "do not call blocking ops" warning onto its own line, separate from the "cut here" marker; log text otherwise unchanged.
            adilger Andreas Dilger made changes -
            Description Original: With a debug kernel for sanity-quota test  1i reports the following:

            [ 1276.935753] ------------[ cut here ]------------
            [ 1276.937120] do not call blocking ops when !TASK_RUNNING; state=402 set at [<00000000ccd913e0>] prepare_to_wait_event+0xc9/0x2a0
            [ 1276.939652] WARNING: CPU: 2 PID: 15740 at kernel/sched/core.c:6733 __might_sleep+0xa3/0xc0
            [ 1276.941158] Modules linked in: zfs(O) zunicode(O) zzstd(O) zlua(O) zcommon(O) znvpair(O) zavl(O) icp(O) spl(O) lustre(O) osp(O) ofd(O) lod(O) ost(O) mdt(O) mdd(O) mgs(O) osd_ldiskfs(O) ldiskfs(O) lquota(O) lfsck(O) obdecho(O) mgc(O) mdc(O) lov(O) osc(O) lmv(O) fid(O) fld(O) ptlrpc_gss(O) ptlrpc(O) obdclass(O) ksocklnd(O) lnet(O) libcfs(O) dm_flakey rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver intel_rapl_msr intel_rapl_common sb_edac rapl i2c_piix4 pcspkr squashfs crct10dif_pclmul crc32_pclmul crc32c_intel ata_generic ata_piix ghash_clmulni_intel serio_raw libata dm_mirror dm_region_hash dm_log dm_mod sha512_ssse3 sha512_generic
            [ 1276.952941] CPU: 2 PID: 15740 Comm: ll_ost_io00_003 Kdump: loaded Tainted: G W O --------- - - 4.18.0rh8.5-debug #2
            [ 1276.955682] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc38 04/01/2014
            [ 1276.957525] RIP: 0010:__might_sleep+0xa3/0xc0
            [ 1276.958577] Code: 48 8b 70 10 48 c7 c7 d8 09 b4 82 c6 05 79 ab 42 02 01 48 83 05 dd 3a fa 02 01 48 89 d1 e8 fb a1 fa ff 48 83 05 d5 3a fa 02 01 <0f> 0b 48 83 05 d3 3a fa 02 01 48 83 05 d3 3a fa 02 01 eb 97 66 0f
            [ 1276.962324] RSP: 0018:ffffc900026eb760 EFLAGS: 00010202
            [ 1276.963405] RAX: 0000000000000000 RBX: ffffffff82b4350b RCX: 0000000000000000
            [ 1276.964927] RDX: ffff888141bf7660 RSI: ffff888141be69e8 RDI: ffff888141be69e8
            [ 1276.966405] RBP: 00000000000005cd R08: 0000000000000000 R09: 0000000000000000
            [ 1276.967825] R10: 0000000000000082 R11: ffffc900026eb5c0 R12: 0000000000000000
            [ 1276.969364] R13: 0000000000000000 R14: ffff888109f02300 R15: 0000000000002710
            [ 1276.970602] FS: 0000000000000000(0000) GS:ffff888141a00000(0000) knlGS:0000000000000000
            [ 1276.971975] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
            [ 1276.973214] CR2: 00007fe6fa710000 CR3: 0000000002e12001 CR4: 0000000000170ee0
            [ 1276.975751] Call Trace:
            [ 1276.976327] down_read_nested+0x2e/0x430
            [ 1276.977082] osd_read_lock+0xc8/0x180 [osd_ldiskfs]
            [ 1276.977957] lquota_disk_read+0x8e/0x540 [lquota]
            [ 1276.978994] qsd_refresh_usage+0x105/0x3d0 [lquota]
            [ 1276.979956] qsd_acquire+0xbe/0x770 [lquota]
            [ 1276.980753] qsd_op_begin0+0x5f8/0xd60 [lquota]
            [ 1276.981764] ? woken_wake_function+0x30/0x30
            [ 1276.983353] qsd_op_begin+0x3fa/0x6d0 [lquota]
            [ 1276.984720] osd_declare_qid+0x4da/0x770 [osd_ldiskfs]
            [ 1276.985625] osd_declare_inode_qid+0x14f/0x630 [osd_ldiskfs]
            [ 1276.986723] osd_declare_write_commit+0x810/0xaa0 [osd_ldiskfs]
            [ 1276.987728] ? osd_trans_create+0x3c1/0x600 [osd_ldiskfs]
            [ 1276.988811] ofd_commitrw_write+0x60e/0x2010 [ofd]
            [ 1276.989579] ofd_commitrw+0x838/0x15d0 [ofd]
            [ 1276.990630] ? tgt_brw_write+0x19ab/0x3780 [ptlrpc]
            [ 1276.991503] tgt_brw_write+0x19ab/0x3780 [ptlrpc]
            [ 1276.992476] ? tgt_request_preprocess.isra.14+0xad/0xba0 [ptlrpc]
            [ 1276.993689] tgt_handle_request0+0x13c/0xb00 [ptlrpc]
            [ 1276.994614] tgt_request_handle+0x351/0x1c10 [ptlrpc]
            [ 1276.995676] ptlrpc_server_handle_request+0x379/0x1320 [ptlrpc]
            [ 1276.996756] ? lprocfs_counter_add+0x14d/0x220 [obdclass]
            [ 1276.997827] ptlrpc_main+0xd58/0x1500 [ptlrpc]
            New: the same log wrapped in {noformat} and trimmed to the warning header and the call trace, without the module list, the register dump, and the "?" frames.
            simmonsja James A Simmons made changes -
            Link New: This issue is related to LU-16807 [ LU-16807 ]
            simmonsja James A Simmons created issue -

            People

              Assignee: simmonsja James A Simmons
              Reporter: simmonsja James A Simmons
              Votes: 0
              Watchers: 4