
ofd_internal.h:518:ofd_info_init()) ASSERTION( info )

Details


    Description

      Lustre: DEBUG MARKER: == parallel-scale test simul: simul == 22:08:33 (1367125713)
      LustreError: 19190:0:(ofd_internal.h:518:ofd_info_init()) ASSERTION( info ) failed:
      LustreError: 19182:0:(ofd_internal.h:518:ofd_info_init()) ASSERTION( info ) failed:
      LustreError: 19182:0:(ofd_internal.h:518:ofd_info_init()) LBUG
      Pid: 19182, comm: ll_ost_io00_001

      Call Trace:
      [<ffffffffa044e895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
      [<ffffffffa044ee97>] lbug_with_loc+0x47/0xb0 [libcfs]
      [<ffffffffa0e03e62>] ofd_info_init+0x92/0x130 [ofd]
      [<ffffffffa0e05835>] ofd_get_info+0x2e5/0xa90 [ofd]
      [<ffffffff812805cd>] ? pointer+0x8d/0x830
      [<ffffffffa029f7e5>] ? lprocfs_counter_add+0x125/0x182 [lvfs]
      [<ffffffffa078528a>] nrs_orr_range_fill_physical+0x18a/0x540 [ptlrpc]
      [<ffffffffa0762dd6>] ? __req_capsule_get+0x166/0x700 [ptlrpc]
      [<ffffffffa073e630>] ? lustre_swab_ost_body+0x0/0x10 [ptlrpc]
      [<ffffffffa07871d7>] nrs_orr_res_get+0x817/0xb80 [ptlrpc]
      [<ffffffffa077d306>] nrs_resource_get+0x56/0x110 [ptlrpc]
      [<ffffffffa077dccb>] nrs_resource_get_safe+0x8b/0x100 [ptlrpc]
      [<ffffffffa0780248>] ptlrpc_nrs_req_initialize+0x38/0x90 [ptlrpc]
      [<ffffffffa074cff0>] ptlrpc_main+0x1170/0x16f0 [ptlrpc]
      [<ffffffffa074be80>] ? ptlrpc_main+0x0/0x16f0 [ptlrpc]
      [<ffffffff8100c0ca>] child_rip+0xa/0x20
      [<ffffffffa074be80>] ? ptlrpc_main+0x0/0x16f0 [ptlrpc]
      [<ffffffffa074be80>] ? ptlrpc_main+0x0/0x16f0 [ptlrpc]
      [<ffffffff8100c0c0>] ? child_rip+0x0/0x20

      LustreError: dumping log to /tmp/lustre-log.1367125716.19182
      LustreError: 19190:0:(ofd_internal.h:518:ofd_info_init()) LBUG
      Pid: 19190, comm: ll_ost_io03_000

      Call Trace:
      [<ffffffffa044e895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
      [<ffffffffa044ee97>] lbug_with_loc+0x47/0xb0 [libcfs]
      [<ffffffffa0e03e62>] ofd_info_init+0x92/0x130 [ofd]
      [<ffffffffa0e05835>] ofd_get_info+0x2e5/0xa90 [ofd]
      [<ffffffff812805cd>] ? pointer+0x8d/0x830
      [<ffffffffa029f7e5>] ? lprocfs_counter_add+0x125/0x182 [lvfs]
      [<ffffffffa078528a>] nrs_orr_range_fill_physical+0x18a/0x540 [ptlrpc]
      [<ffffffffa0762dd6>] ? __req_capsule_get+0x166/0x700 [ptlrpc]
      [<ffffffffa073e630>] ? lustre_swab_ost_body+0x0/0x10 [ptlrpc]
      [<ffffffffa07871d7>] nrs_orr_res_get+0x817/0xb80 [ptlrpc]
      [<ffffffffa077d306>] nrs_resource_get+0x56/0x110 [ptlrpc]
      [<ffffffffa077dccb>] nrs_resource_get_safe+0x8b/0x100 [ptlrpc]
      [<ffffffffa0780248>] ptlrpc_nrs_req_initialize+0x38/0x90 [ptlrpc]
      [<ffffffffa074cff0>] ptlrpc_main+0x1170/0x16f0 [ptlrpc]
      [<ffffffffa074be80>] ? ptlrpc_main+0x0/0x16f0 [ptlrpc]
      [<ffffffff8100c0ca>] child_rip+0xa/0x20
      [<ffffffffa074be80>] ? ptlrpc_main+0x0/0x16f0 [ptlrpc]
      [<ffffffffa074be80>] ? ptlrpc_main+0x0/0x16f0 [ptlrpc]
      [<ffffffff8100c0c0>] ? child_rip+0x0/0x20

      LustreError: dumping log to /tmp/lustre-log.1367125719.19190
      LNet: Service thread pid 19182 was inactive for 200.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
      Pid: 19182, comm: ll_ost_io00_001

      I set the NRS policy on the MDS to crrn, on oss02 to orr, and on oss03 to trr, then ran parallel-scale.

          Activity

            [LU-3239] ofd_internal.h:518:ofd_info_init()) ASSERTION( info )
            pjones Peter Jones added a comment -

            Landed for 2.4


            adilger Andreas Dilger added a comment -

            Please do not close this bug until a test that exercises at least basic NRS policy functionality has been added.

            di.wang Di Wang added a comment -

            Hmm, yes, env_refill is probably better. I changed it back to env_refill in the updated patch.

            bzzz Alex Zhuravlev added a comment -

            get_info(KEY_FIEMAP) is used to order requests by physical offset, AFAIK.

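To illustrate the point about ordering by physical offset, here is a minimal sketch in plain C. It is not Lustre code: `struct io_req`, `cmp_physical()` and `order_by_physical()` are hypothetical names, and the physical offsets are assumed to have been resolved already (in the trace above that lookup is what reaches ofd_get_info). The real NRS policy is not reproduced here; the sketch only shows why the physical offset has to be known before requests can be ordered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical request record; NOT a Lustre structure. */
struct io_req {
	uint64_t logical_off;   /* offset within the object, as sent by the client */
	uint64_t physical_off;  /* on-disk offset, assumed to come from an extent map */
};

/* qsort() comparator: ascending physical offset.
 * Uses comparisons rather than subtraction to avoid overflow. */
static int cmp_physical(const void *a, const void *b)
{
	const struct io_req *ra = a;
	const struct io_req *rb = b;

	if (ra->physical_off < rb->physical_off)
		return -1;
	if (ra->physical_off > rb->physical_off)
		return 1;
	return 0;
}

/* Reorder a batch of requests so they are served in on-disk order. */
static void order_by_physical(struct io_req *reqs, size_t n)
{
	qsort(reqs, n, sizeof(*reqs), cmp_physical);
}
```

A caller would fill an array of `struct io_req` and pass it to `order_by_physical()`. Requests that are adjacent in logical offset may end up far apart after sorting, which is why the scheduler needs the extent lookup before normal request handling begins.
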
            di.wang Di Wang added a comment -

            Hmm, only for ofd_get_info, because get_info is the only use case from handle_request_in where env (le_ses) is not initialized correctly, so I figured initializing env (le_ses) for all requests in req_in might not be worth it. Doing this env_init only in ofd_get_info should be enough. Besides, we cannot use env_refill here, because le_ses might be in an uninitialized state, as you can see from Minh's test. Hmm, I did not see ofd_get_info being called in the process of read/write. Am I missing something?

            bzzz Alex Zhuravlev added a comment -

            env_refill() is fine, but recreating the env from scratch on potentially every read/write is not, I think.

            di.wang Di Wang added a comment -

            Well, those envs might be initialized too early, before the ofd stack is set up. Is that why we do env_refill in ost_handle?

            bzzz Alex Zhuravlev added a comment -

            We already have an env in ptlrpc_main()..

            tappro Mikhail Pershin added a comment -

            I tend to agree, if this is the only case.

            di.wang Di Wang added a comment - edited

            I just updated the patch. To avoid the hassle in handle_req_in, ofd_get_info(KEY_FIEMAP) will initialize the env itself. Since ofd_get_info(KEY_FIEMAP) is the only use case from handle_req_in, adding env initialization in handle_req_in might not be worth it for now.
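The approach described in this comment, where the one handler reachable before normal request setup prepares its own context, can be modelled with a small self-contained C sketch. Every name below (`struct env`, `struct session_ctx`, `struct thread_info`, `ses_init_if_needed()`, `get_info_early()`) is a hypothetical stand-in and not the Lustre API. The sketch shows only the general pattern and makes no claim about how the landed patch is written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these are Lustre types. */
struct thread_info {
	int initialized;
};

struct session_ctx {
	struct thread_info *info;   /* NULL until the session is set up */
};

struct env {
	struct session_ctx ses;
};

/* Look up per-request info; NULL if the session was never set up. */
static struct thread_info *info_get(struct env *env)
{
	return env->ses.info;
}

/* Set up the session context only if it is missing.
 * Returns 0 on success, -1 if allocation fails. */
static int ses_init_if_needed(struct env *env)
{
	if (env->ses.info != NULL)
		return 0;

	env->ses.info = calloc(1, sizeof(*env->ses.info));
	if (env->ses.info == NULL)
		return -1;

	env->ses.info->initialized = 1;
	return 0;
}

/* Release the session context. */
static void ses_fini(struct env *env)
{
	free(env->ses.info);
	env->ses.info = NULL;
}

/* Handler that may run before the usual request setup has happened,
 * so it prepares the context itself instead of assuming it exists. */
static int get_info_early(struct env *env)
{
	struct thread_info *info;

	if (ses_init_if_needed(env) != 0)
		return -1;

	info = info_get(env);
	assert(info != NULL);   /* analogue of ASSERTION( info ) in the log */
	return info->initialized;
}
```

Without the `ses_init_if_needed()` call, the `assert()` would fire on the first early call, which mirrors the failure in the description. Because the helper returns immediately when the context already exists, repeated calls reuse it rather than rebuilding it.
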

            People

              di.wang Di Wang
              mdiep Minh Diep