[LU-2199] LBUG triggered in brw_interpret: "obdo already freed"

Details

    • Bug
    • Resolution: Cannot Reproduce
    • Blocker
    • None
    • Lustre 2.4.0
    • orion-2_3_49_54_2-75chaos
    • 3
    • 5243

    Description

      Hit this LBUG on one of our Sequoia IO nodes running the old Orion code base orion-2_3_49_54_2-75chaos:

      LustreError: 3216:0:(osc_request.c:1859:brw_interpret()) @@@ obdo already freed  req@c0000003c24cc800 x1415959349900530/t12885333672(12885333672) o4->ls1-OST00e3-osc-c0000003c6904c00@172.20.2.27@o2ib500:6/4 lens 456/416 e 0 to 0 dl 1350423562 ref 1 fl Interpret:R/4/0 rc 0/0
      LustreError: 3216:0:(osc_request.c:1860:brw_interpret()) LBUG
      Call Trace:
      [c0000003e9343870] [c000000000008190] .show_stack+0x7c/0x184 (unreliable)
      [c0000003e9343920] [80000000009f0c1c] .libcfs_debug_dumpstack+0x9c/0xe0 [libcfs]
      [c0000003e93439c0] [80000000009f1260] .lbug_with_loc+0x50/0xc0 [libcfs]
      [c0000003e9343a50] [8000000004445918] .brw_interpret+0xd28/0xec0 [osc]
      [c0000003e9343b70] [800000000383b8a4] .ptlrpc_check_set+0x384/0x3c40 [ptlrpc]
      [c0000003e9343d10] [800000000387d86c] .ptlrpcd_check+0x5bc/0x760 [ptlrpc]
      [c0000003e9343e30] [800000000387dd28] .ptlrpcd+0x318/0x4d0 [ptlrpc]
      [c0000003e9343f90] [c00000000001a9e0] .kernel_thread+0x54/0x70
      ^GMessage from syslogd@(none) at Oct 16 14:37:37 ...
       kernel:LustreError: 3216:0:(osc_request.c:1860:brw_interpret()) LBUG
      Kernel panic - not syncing: LBUG
      Call Trace:
      [c0000003e9343880] [c000000000008190] .show_stack+0x7c/0x184 (unreliable)
      [c0000003e9343930] [c000000000432c0c] .panic+0x80/0x1a8
      [c0000003e93439c0] [80000000009f12c0] .lbug_with_loc+0xb0/0xc0 [libcfs]
      [c0000003e9343a50] [8000000004445918] .brw_interpret+0xd28/0xec0 [osc]
      [c0000003e9343b70] [800000000383b8a4] .ptlrpc_check_set+0x384/0x3c40 [ptlrpc]
      [c0000003e9343d10] [800000000387d86c] .ptlrpcd_check+0x5bc/0x760 [ptlrpc]
      [c0000003e9343e30] [800000000387dd28] .ptlrpcd+0x318/0x4d0 [ptlrpc]
      [c0000003e9343f90] [c00000000001a9e0] .kernel_thread+0x54/0x70
      LustreError: dumping log to /tmp/lustre-log.1350423457.3216
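
      For triage, the console output above can be reduced to its location prefix and stack frames so that separate occurrences are easy to compare. The following is a minimal sketch, assuming the line layout seen in this ticket; the helper names `parse_error` and `parse_frames` and the field labels (`field2`, `sp`, `addr`) are hypothetical and not part of Lustre or the kernel.

```python
import re

# Location prefix as it appears in this ticket's logs:
#   LustreError: <pid>:<second field>:(<file>:<line>:<function>()) <message>
# The meaning of the second numeric field is not stated in the ticket,
# so it is only captured under the placeholder name "field2".
LUSTRE_ERROR = re.compile(
    r"LustreError: (?P<pid>\d+):(?P<field2>\d+):"
    r"\((?P<file>[^:]+):(?P<line>\d+):(?P<func>\w+)\(\)\) (?P<msg>.*)"
)

# One stack frame as printed in the call traces above:
#   [<hex>] [<hex>] .<symbol>+0x<offset>/0x<size> [<module>]
# Labelling the two hex values "sp" and "addr" is an assumption.
FRAME = re.compile(
    r"\[(?P<sp>[0-9a-f]+)\] \[(?P<addr>[0-9a-f]+)\] "
    r"\.(?P<symbol>\w+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?: \[(?P<module>\w+)\])?"
)


def parse_error(line):
    """Return the fields of a LustreError line as a dict, or None if no match."""
    m = LUSTRE_ERROR.search(line)
    return m.groupdict() if m else None


def parse_frames(text):
    """Return (symbol, offset, module) for every stack frame found in text."""
    return [
        (m.group("symbol"), m.group("offset"), m.group("module"))
        for m in FRAME.finditer(text)
    ]
```

      Comparing the `(symbol, offset, module)` tuples from two traces gives a quick check of whether they passed through the same call sites, independent of the stack addresses and any log prefix.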
      

        Activity

          jay Jinshan Xiong (Inactive) added a comment - Please reopen it if this problem can be seen again

          jay Jinshan Xiong (Inactive) added a comment - now that it can't be seen anymore, let's lower the priority and leave this ticket open.

          morrone Christopher Morrone (Inactive) added a comment -

          I updated the tags on our github site.

          The tag for 2.3.49.54-75chaos is orion-2_3_49_54_2-75chaos.
          The tag for 2.3.54-2chaos is 2.3.54-2chaos.

          jay Jinshan Xiong (Inactive) added a comment - Hi Chris, where can I refer to the code base you're using?

          morrone Christopher Morrone (Inactive) added a comment - I hit this pretty reliably today while we were running 2.3.49.54-75chaos, but after installing 2.3.54-2chaos I haven't hit it yet. It may be fixed...or perhaps just harder to hit now.

          Hit again.

          2012-11-01 14:23:35.708036 {DefaultControlEventListener} [mmcs]{103}.7.1: LustreError: 3336:0:(osc_request.c:1859:brw_interpret()) @@@ obdo already freed  req@c0000002e0f58c00 x1417459081295033/t21475646365(21475646365) o4->ls1-OST00d5-osc-c0000003ec619800@172.20.2.13@o2ib500:6/4 lens 456/416 e 0 to 0 dl 1351805120 ref 1 fl Interpret:R/4/0 rc 0/0
          2012-11-01 14:23:35.747969 {DefaultControlEventListener} [mmcs]{103}.7.1: LustreError: 3336:0:(osc_request.c:1860:brw_interpret()) LBUG
          2012-11-01 14:23:35.787805 {DefaultControlEventListener} [mmcs]{103}.7.1: Call Trace:
          2012-11-01 14:23:35.827832 {DefaultControlEventListener} [mmcs]{103}.7.1: [c0000003e90b3870] [c000000000008160] .show_stack+0x7c/0x184 (unreliable)
          2012-11-01 14:23:35.868181 {DefaultControlEventListener} [mmcs]{103}.7.1: [c0000003e90b3920] [80000000009f0c1c] .libcfs_debug_dumpstack+0x9c/0xe0 [libcfs]
          2012-11-01 14:23:35.908596 {DefaultControlEventListener} [mmcs]{103}.7.1: [c0000003e90b39c0] [80000000009f1260] .lbug_with_loc+0x50/0xc0 [libcfs]
          2012-11-01 14:23:35.947814 {DefaultControlEventListener} [mmcs]{103}.7.1: [c0000003e90b3a50] [8000000004445918] .brw_interpret+0xd28/0xec0 [osc]
          2012-11-01 14:23:35.987861 {DefaultControlEventListener} [mmcs]{103}.7.1: [c0000003e90b3b70] [800000000383b8a4] .ptlrpc_check_set+0x384/0x3c40 [ptlrpc]
          2012-11-01 14:23:36.027871 {DefaultControlEventListener} [mmcs]{103}.7.1: [c0000003e90b3d10] [800000000387d86c] .ptlrpcd_check+0x5bc/0x760 [ptlrpc]
          2012-11-01 14:23:36.068110 {DefaultControlEventListener} [mmcs]{103}.7.1: [c0000003e90b3e30] [800000000387dd28] .ptlrpcd+0x318/0x4d0 [ptlrpc]
          2012-11-01 14:23:36.107803 {DefaultControlEventListener} [mmcs]{103}.7.1: [c0000003e90b3f90] [c00000000001a9e0] .kernel_thread+0x54/0x70
          2012-11-01 14:23:36.198902 {DefaultControlEventListener} [mmcs]{103}.13.1: ^GMessage from syslogd@(none) at Nov  1 14:23:35 ...
          2012-11-01 14:23:36.237809 {DefaultControlEventListener} [mmcs]{103}.13.1:  kernel:LustreError: 3336:0:(osc_request.c:1860:brw_interpret()) LBUG
          2012-11-01 14:23:36.288623 {DefaultControlEventListener} [mmcs]{103}.7.2: Kernel panic - not syncing: LBUG
          2012-11-01 14:23:36.328575 {DefaultControlEventListener} [mmcs]{103}.7.2: Call Trace:
          2012-11-01 14:23:36.368579 {DefaultControlEventListener} [mmcs]{103}.7.2: [c0000003e90b3880] [c000000000008160] .show_stack+0x7c/0x184 (unreliable)
          2012-11-01 14:23:36.407824 {DefaultControlEventListener} [mmcs]{103}.7.2: [c0000003e90b3930] [c000000000432c0c] .panic+0x80/0x1a8
          2012-11-01 14:23:36.448613 {DefaultControlEventListener} [mmcs]{103}.7.2: [c0000003e90b39c0] [80000000009f12c0] .lbug_with_loc+0xb0/0xc0 [libcfs]
          2012-11-01 14:23:36.488034 {DefaultControlEventListener} [mmcs]{103}.7.2: [c0000003e90b3a50] [8000000004445918] .brw_interpret+0xd28/0xec0 [osc]
          2012-11-01 14:23:36.528883 {DefaultControlEventListener} [mmcs]{103}.7.2: [c0000003e90b3b70] [800000000383b8a4] .ptlrpc_check_set+0x384/0x3c40 [ptlrpc]
          2012-11-01 14:23:36.568987 {DefaultControlEventListener} [mmcs]{103}.7.2: [c0000003e90b3d10] [800000000387d86c] .ptlrpcd_check+0x5bc/0x760 [ptlrpc]
          2012-11-01 14:23:36.608557 {DefaultControlEventListener} [mmcs]{103}.7.2: [c0000003e90b3e30] [800000000387dd28] .ptlrpcd+0x318/0x4d0 [ptlrpc]
          2012-11-01 14:23:36.649014 {DefaultControlEventListener} [mmcs]{103}.7.2: [c0000003e90b3f90] [c00000000001a9e0] .kernel_thread+0x54/0x70
          
          morrone Christopher Morrone (Inactive) added a comment - Hit again (console log above).
          pjones Peter Jones added a comment -

          Alex

          Could someone please look into this one?

          Thanks

          Peter


          People

            jay Jinshan Xiong (Inactive)
            prakash Prakash Surya (Inactive)