Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Minor

    Description

      [ 5302.557213] Lustre: DEBUG MARKER: == sanity test 205g: stress test for job_stats procfile == 00:21:32 (1735777292)
      [ 5393.581798] LustreError: 303135:0:(lprocfs_jobstats.c:133:job_putref()) ASSERTION( kref_read(&job->js_refcount) > 0 ) failed: 
      [ 5393.581997] LustreError: 303135:0:(lprocfs_jobstats.c:133:job_putref()) LBUG
      [ 5393.582044] CPU: 1 PID: 303135 Comm: lctl Tainted: G        W  O     --------- -  - 4.18.0 #11
      [ 5393.582084] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-1.fc39 04/01/2014
      [ 5393.582124] Call Trace:
      [ 5393.582161]  dump_stack+0x6e/0xa0
      [ 5393.582189]  lbug_with_loc.cold.4+0x5/0x63 [libcfs]
      [ 5393.582221]  job_putref+0xa6/0xe0 [obdclass]
      [ 5393.582297]  lprocfs_jobstats_seq_show+0x2d1/0x520 [obdclass]
      [ 5393.582374]  seq_read+0x2c8/0x3e0
      [ 5393.582398]  proc_reg_read+0x31/0x50
      [ 5393.582421]  vfs_read+0xa1/0x150
      [ 5393.582441]  ksys_read+0x3d/0xa0
      [ 5393.582462]  do_syscall_64+0x4b/0x1b0
      [ 5393.582483]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [ 5393.582511] RIP: 0033:0x7fe718b459b2
      

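      In lprocfs_job_cleanup(), expired jobs can be put, but they are dropped from the LRU at a separate point. A job that is still on the LRU can therefore be treated as expired and put a second time, which trips the assertion in job_putref().

      Jobs should be expired only once; however, it is desirable to avoid spinlocks in this tight loop.

      Instead, add a status flag that marks a job as expired, so that it is put only once and the double put is avoided.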
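
      The flag-based approach described above can be illustrated with a minimal userspace sketch. This is not the Lustre patch: struct job, job_init(), job_get(), job_put() and job_expire_once() are hypothetical stand-ins, and C11 atomics are assumed in place of the kernel's kref and atomic primitives. The point is only that an atomic exchange on a flag lets the first caller win without taking a spinlock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a job stats entry; not the Lustre struct. */
struct job {
	atomic_int  refcount;	/* plays the role of js_refcount */
	atomic_bool expired;	/* set once, when the job is first expired */
	bool        freed;	/* set when the last reference is dropped */
};

static inline void job_init(struct job *job)
{
	atomic_init(&job->refcount, 1);	/* reference held by the table/LRU */
	atomic_init(&job->expired, false);
	job->freed = false;
}

static inline void job_get(struct job *job)
{
	atomic_fetch_add(&job->refcount, 1);
}

/* Drop one reference; asserts in the same spirit as the failed check. */
static inline void job_put(struct job *job)
{
	int old = atomic_fetch_sub(&job->refcount, 1);

	assert(old > 0);
	if (old == 1)
		job->freed = true;
}

/*
 * Expire a job at most once. atomic_exchange() returns the previous
 * value of the flag, so only the first caller sees false and drops
 * the table/LRU reference. Returns true if this call did the put.
 */
static inline bool job_expire_once(struct job *job)
{
	if (atomic_exchange(&job->expired, true))
		return false;	/* already expired by an earlier pass */
	job_put(job);
	return true;
}
```

      In this sketch a second cleanup pass that finds the same job still on the LRU calls job_expire_once() again, gets false, and leaves the reference count untouched, so a concurrent reader's final job_put() remains the one that frees the job.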

            People

              stancheff Shaun Tancheff