Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-16264

assertion in osp_precreate_send req->rq_transno == 0

    XMLWordPrintable

Details

    • Bug
    • Resolution: Unresolved
    • Major
    • None
    • Lustre 2.15.2, Lustre 2.15.3
    • None
    • 3
    • 9223372036854775807

    Description

      There's a regular crash in conf-sanity test 30a (janitor) or 30b (maloo) that unfolds like this https://testing.whamcloud.com/test_sets/683c15ea-ec59-46d0-89c1-7f10d715f355

      [ 5673.591855] Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-60vm1.onyx.whamcloud.com: executing wait_import_state_mount FULL osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
      [ 5673.592761] Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-60vm2.onyx.whamcloud.com: executing wait_import_state_mount FULL osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
      [ 5673.882066] Lustre: DEBUG MARKER: onyx-60vm1.onyx.whamcloud.com: executing wait_import_state_mount FULL osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
      [ 5673.948710] Lustre: DEBUG MARKER: onyx-60vm2.onyx.whamcloud.com: executing wait_import_state_mount FULL osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
      [ 5679.785437] Lustre: DEBUG MARKER: /usr/sbin/lctl conf_param lustre-OST0000.failover.node='10.240.25.54@tcp'
      [ 5684.257444] Lustre: DEBUG MARKER: /usr/sbin/lctl conf_param -d lustre-OST0000.failover.node
      [ 5686.387703] LustreError: 167-0: lustre-OST0000-osc-MDT0002: This client was evicted by lustre-OST0000; in progress operations using this service will fail.
      [ 5686.390343] LustreError: Skipped 1 previous similar message
      [ 5686.391768] Lustre: lustre-OST0000-osc-MDT0002: Connection restored to 10.240.25.54@tcp (at 10.240.25.54@tcp)
      [ 5686.393594] Lustre: Skipped 1 previous similar message
      [ 5686.576332] LustreError: 15573:0:(osp_precreate.c:684:osp_precreate_send()) ASSERTION( req->rq_transno == 0 ) failed: 
      [ 5686.578284] LustreError: 15573:0:(osp_precreate.c:684:osp_precreate_send()) LBUG
      [ 5686.579611] Pid: 15573, comm: osp-pre-0-2 3.10.0-1160.76.1.el7_lustre.ddn17.x86_64 #1 SMP Fri Oct 14 20:32:07 UTC 2022
      [ 5686.581497] Call Trace:
      [ 5686.582054] [<0>] libcfs_call_trace+0x90/0xf0 [libcfs]
      [ 5686.582998] [<0>] lbug_with_loc+0x4c/0xa0 [libcfs]
      [ 5686.583926] [<0>] osp_precreate_send+0x1106/0x11b0 [osp]
      [ 5686.584908] [<0>] osp_precreate_thread+0x6ba/0x13b0 [osp]
      [ 5686.585956] [<0>] kthread+0xd1/0xe0
      [ 5686.586645] [<0>] ret_from_fork_nospec_begin+0x21/0x21
      [ 5686.587621] [<0>] 0xfffffffffffffffe
      [ 5686.588300] Kernel panic - not syncing: LBUG
      [ 5686.589082] CPU: 0 PID: 15573 Comm: osp-pre-0-2 Kdump: loaded Tainted: G           OE  ------------   3.10.0-1160.76.1.el7_lustre.ddn17.x86_64 #1
      [ 5686.591324] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
      [ 5686.592332] Call Trace:
      [ 5686.592803]  [<ffffffff875865c9>] dump_stack+0x19/0x1b
      [ 5686.593723]  [<ffffffff875802d1>] panic+0xe8/0x21f
      [ 5686.594578]  [<ffffffffc075ba1b>] lbug_with_loc+0x9b/0xa0 [libcfs]
      [ 5686.595672]  [<ffffffffc12e1f56>] osp_precreate_send+0x1106/0x11b0 [osp]
      [ 5686.596843]  [<ffffffffc12e464a>] osp_precreate_thread+0x6ba/0x13b0 [osp]
      [ 5686.598027]  [<ffffffff86ec7080>] ? wake_up_atomic_t+0x30/0x30
      [ 5686.599052]  [<ffffffffc12e3f90>] ? osp_init_pre_fid+0x630/0x630 [osp]
      [ 5686.600244]  [<ffffffff86ec5f91>] kthread+0xd1/0xe0
      [ 5686.601114]  [<ffffffff86ec5ec0>] ? insert_kthread_work+0x40/0x40
      [ 5686.602190]  [<ffffffff87599df7>] ret_from_fork_nospec_begin+0x21/0x21
      [ 5686.603335]  [<ffffffff86ec5ec0>] ? insert_kthread_work+0x40/0x40 

      or http://testing.linuxhacker.ru/lustre-reports/25993/testresults/conf-sanity1-ldiskfs-DNE-centos7_x86_64-centos7_x86_64/

      [ 3227.316101] Lustre: Skipped 1 previous similar message
      [ 3227.321723] Lustre: lustre-OST0000: Imperative Recovery not enabled, recovery window 60-180
      [ 3228.406622] Lustre: DEBUG MARKER: oleg208-server.virtnet: executing set_default_debug -1 all 8
      [ 3230.985273] Lustre: DEBUG MARKER: oleg208-client.virtnet: executing wait_import_state_mount FULL osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
      [ 3231.624671] Lustre: lustre-OST0000: Connection restored to 50155b09-b14b-6aa1-56c4-bebc671f968e (at 192.168.201.8@tcp)
      [ 3231.627520] Lustre: Skipped 3 previous similar messages
      [ 3242.148194] Lustre: MGS: Connection restored to 50155b09-b14b-6aa1-56c4-bebc671f968e (at 192.168.201.8@tcp)
      [ 3242.149935] Lustre: Skipped 2 previous similar messages
      [ 3243.345638] LustreError: 11-0: lustre-OST0000-osc-MDT0000: operation ost_statfs to node 0@lo failed: rc = -107
      [ 3243.351508] Lustre: lustre-OST0000-osc-MDT0000: Connection to lustre-OST0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
      [ 3243.361084] Lustre: lustre-OST0000: Not available for connect from 0@lo (stopping)
      [ 3243.370317] LustreError: 167-0: lustre-OST0000-osc-MDT0000: This client was evicted by lustre-OST0000; in progress operations using this service will fail.
      [ 3243.404559] LustreError: 10962:0:(osp_precreate.c:683:osp_precreate_send()) ASSERTION( req->rq_transno == 0 ) failed: 
      [ 3243.411037] LustreError: 10962:0:(osp_precreate.c:683:osp_precreate_send()) LBUG
      [ 3243.414422] Pid: 10962, comm: osp-pre-0-1 3.10.0-7.9-debug #1 SMP Sat Mar 26 23:28:42 EDT 2022
      [ 3243.418602] Call Trace:
      [ 3243.419815]  [<ffffffffa018d7dc>] libcfs_call_trace+0x8c/0xc0 [libcfs]
      [ 3243.422930]  [<ffffffffa018d88c>] lbug_with_loc+0x4c/0xa0 [libcfs]
      [ 3243.425789]  [<ffffffffa0e7a8b6>] osp_precreate_send+0x11a6/0x11c0 [osp]
      [ 3243.428811]  [<ffffffffa0e7c218>] osp_precreate_thread+0x818/0x1360 [osp]
      [ 3243.431581]  [<ffffffff810ba114>] kthread+0xe4/0xf0
      [ 3243.433857]  [<ffffffff817f3e5d>] ret_from_fork_nospec_begin+0x7/0x21
      [ 3243.437050]  [<ffffffffffffffff>] 0xffffffffffffffff
      [ 3243.439596] Kernel panic - not syncing: LBUG 

      Happens on a variety of branches including b2_12, b2_15 and even master at times, and all the derivatives.

      Attachments

        Issue Links

          Activity

            People

              scherementsev Sergey Cheremencev
              green Oleg Drokin
              Votes:
              0 Vote for this issue
              Watchers:
              10 Start watching this issue

              Dates

                Created:
                Updated: