Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-3346

The assignment of obt_sb was removed

Details

    • Bug
    • Resolution: Fixed
    • Major
    • Lustre 2.5.0
    • Lustre 2.4.0
    • 3
    • 8274

    Description

      Hi,

      Coverity has found a strange piece of code in check_and_start_recovery_timer() function:

      if (!(lsi->lsi_flags | LDD_F_IR_CAPABLE))
      service_time += 2 * (CONNECTION_SWITCH_MAX +
      CONNECTION_SWITCH_INC);

      According to the tool, 'lsi->lsi_flags | LDD_F_IR_CAPABLE' is useless because it is always true. So it seems the developer meant to use '&' instead of '|'.

      But if we change the code accordingly, then we hit a number of issues because of unprotected access to recovery timer, as seen in https://maloo.whamcloud.com/test_sessions/96125f26-bc36-11e2-b013-52540035b04c

      Attachments

        Issue Links

          Activity

            [LU-3346] The assignment of obt_sb was removed

            It was covered by LU-1330.

            dmiter Dmitry Eremin (Inactive) added a comment - It was covered by LU-1330 .
            dmiter Dmitry Eremin (Inactive) added a comment - Patch set: http://review.whamcloud.com/6357
            dmiter Dmitry Eremin (Inactive) added a comment - - edited

            The assignment of obt->obt_sb was removed in a recent project. The peace of code with obt->obt_sb usage from this function has never executed because of smart compiler remove it. This BUG was not discovered because we have IR so that even the timeout is much short it can still survive.

            dmiter Dmitry Eremin (Inactive) added a comment - - edited The assignment of obt->obt_sb was removed in a recent project. The peace of code with obt->obt_sb usage from this function has never executed because of smart compiler remove it. This BUG was not discovered because we have IR so that even the timeout is much short it can still survive.
            dmiter Dmitry Eremin (Inactive) added a comment - - edited

            The stack for crash is following:

            05:23:08:Lustre: lustre-MDT0000: used disk, loading
            05:23:08:LustreError: 11-0: lustre-MDT0000-lwp-MDT0000: Communicating with 0@lo, operation mds_connect failed with -11.
            05:23:08:Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u
            05:23:08:LNet: 11898:0:(debug.c:324:libcfs_debug_str2mask()) You are trying to use a numerical value for the mask - this will be deprecated in a future release.
            05:23:08:LNet: 11898:0:(debug.c:324:libcfs_debug_str2mask()) Skipped 1 previous similar message
            05:23:08:Lustre: DEBUG MARKER: e2label /dev/lvm-MDS/P1 2>/dev/null
            05:23:08:Lustre: DEBUG MARKER: lctl set_param fail_loc=0x194
            05:23:08:Lustre: lustre-MDT0000: Will be in recovery for at least 1:00, or until 2 clients reconnect
            05:23:08:BUG: unable to handle kernel NULL pointer dereference at 0000000000000290
            05:23:08:IP: [<ffffffffa074059f>] target_handle_connect+0x249f/0x2d00 [ptlrpc]
            05:23:08:PGD 0 
            05:23:08:Oops: 0000 [#1] SMP 
            05:23:08:last sysfs file: /sys/devices/system/cpu/possible
            05:23:08:CPU 0 
            05:23:08:Modules linked in: osp(U) lod(U) mdt(U) mgs(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) mdd(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) sha512_generic sha256_generic libcfs(U) ldiskfs(U) jbd2 nfsd exportfs autofs4 nfs lockd fscache auth_rpcgss nfs_acl sunrpc ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 ib_sa ib_mad ib_core microcode virtio_balloon 8139too 8139cp mii i2c_piix4 i2c_core ext3 jbd mbcache virtio_blk virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib]
            05:23:08:
            05:23:08:Pid: 11705, comm: mdt00_002 Not tainted 2.6.32-358.6.1.el6_lustre.g309e05a.x86_64 #1 Red Hat KVM
            05:23:08:RIP: 0010:[<ffffffffa074059f>]  [<ffffffffa074059f>] target_handle_connect+0x249f/0x2d00 [ptlrpc]
            05:23:08:RSP: 0018:ffff880056b9fb70  EFLAGS: 00010246
            05:23:08:RAX: 0000000000000000 RBX: ffff88006481e278 RCX: 0000000000000014
            05:23:08:RDX: 0000000000000005 RSI: 0000000000000010 RDI: 0000000000000010
            05:23:08:RBP: ffff880056b9fd10 R08: ffffffff81c07720 R09: 0000000000000000
            05:23:08:R10: ffff8800667f42d1 R11: 0000000000010000 R12: ffff88007bcf4c00
            05:23:08:R13: 0000000000000001 R14: ffff88006481e278 R15: ffff88007bcf4fa0
            05:23:08:FS:  00007f06d06ba700(0000) GS:ffff880002200000(0000) knlGS:0000000000000000
            05:23:08:CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
            05:23:08:CR2: 0000000000000290 CR3: 0000000037446000 CR4: 00000000000006f0
            05:23:08:DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
            05:23:08:DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
            05:23:08:Process mdt00_002 (pid: 11705, threadinfo ffff880056b9e000, task ffff880065172ae0)
            05:23:08:Stack:
            05:23:08: ffffffffa07e1bac ffffffffa07e1b9f 000000005190db27 000000005190db26
            05:23:08:<d> 0000000200000000 0000000affffffff ffffffffffffffff ffff88006578a800
            05:23:08:<d> ffff88006578a800 000000005190db27 ffffffffa07e1ba0 0000000200000001
            05:23:08:Call Trace:
            05:23:08: [<ffffffffa0dd745a>] mdt_connect+0x3a/0x4e0 [mdt]
            05:23:08: [<ffffffffa0dc8df8>] mdt_handle_common+0x648/0x1660 [mdt]
            05:23:08: [<ffffffffa0e058c5>] mds_regular_handle+0x15/0x20 [mdt]
            05:23:08: [<ffffffffa078b3a8>] ptlrpc_server_handle_request+0x398/0xc60 [ptlrpc]
            05:23:08: [<ffffffffa04795de>] ? cfs_timer_arm+0xe/0x10 [libcfs]
            05:23:08: [<ffffffffa048ad8f>] ? lc_watchdog_touch+0x6f/0x170 [libcfs]
            05:23:08: [<ffffffffa0782709>] ? ptlrpc_wait_event+0xa9/0x290 [ptlrpc]
            05:23:08: [<ffffffff81055ab3>] ? __wake_up+0x53/0x70
            05:23:08: [<ffffffffa078c73e>] ptlrpc_main+0xace/0x1700 [ptlrpc]
            05:23:08: [<ffffffffa078bc70>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
            05:23:08: [<ffffffff8100c0ca>] child_rip+0xa/0x20
            05:23:08: [<ffffffffa078bc70>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
            05:23:08: [<ffffffffa078bc70>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
            05:23:08: [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
            05:23:08:Code: 00 00 00 31 d2 89 c8 f7 f6 ba 05 00 00 00 44 89 ef 83 f8 05 0f 42 c2 c1 ef 02 41 8d 44 45 05 8d 3c 38 48 8b 83 10 05 00 00 89 fe <48> 8b 80 90 02 00 00 8b 00 f6 c4 20 75 15 83 f9 05 b8 32 00 00 
            05:23:08:RIP  [<ffffffffa074059f>] target_handle_connect+0x249f/0x2d00 [ptlrpc]
            05:23:08: RSP <ffff880056b9fb70>
            05:23:08:CR2: 0000000000000290
            
            static void
            check_and_start_recovery_timer(struct obd_device *obd,
                                           struct ptlrpc_request *req,
                                           int new_client)
            {
                    int service_time = lustre_msg_get_service_time(req->rq_reqmsg);
                    struct obd_device_target *obt = &obd->u.obt;
                    struct lustre_sb_info *lsi;
            
                    if (!new_client && service_time)
                            /* Teach server about old server's estimates, as first guess
                             * at how long new requests will take. */
            		at_measured(&req->rq_rqbd->rqbd_svcpt->scp_at_estimate,
                                        service_time);
            
                    target_start_recovery_timer(obd);
            
            	/* Convert the service time to RPC timeout,
            	 * and reuse service_time to limit stack usage. */
            	service_time = at_est2timeout(service_time);
            
            	/* We expect other clients to timeout within service_time, then try
            	 * to reconnect, then try the failover server.  The max delay between
            	 * connect attempts is SWITCH_MAX + SWITCH_INC + INITIAL. */
                    service_time += 2 * INITIAL_CONNECT_TIMEOUT;
            
                    LASSERT(obt->obt_magic == OBT_MAGIC);
                    lsi = s2lsi(obt->obt_sb);                <== obt->obt_sb is NULL here!!!
            	if (!(lsi->lsi_flags & LDD_F_IR_CAPABLE))
                            service_time += 2 * (CONNECTION_SWITCH_MAX +
                                                 CONNECTION_SWITCH_INC);
                    if (service_time > obd->obd_recovery_timeout && !new_client)
                            extend_recovery_timer(obd, service_time, false);
            }
            
            dmiter Dmitry Eremin (Inactive) added a comment - - edited The stack for crash is following: 05:23:08:Lustre: lustre-MDT0000: used disk, loading 05:23:08:LustreError: 11-0: lustre-MDT0000-lwp-MDT0000: Communicating with 0@lo, operation mds_connect failed with -11. 05:23:08:Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u 05:23:08:LNet: 11898:0:(debug.c:324:libcfs_debug_str2mask()) You are trying to use a numerical value for the mask - this will be deprecated in a future release. 05:23:08:LNet: 11898:0:(debug.c:324:libcfs_debug_str2mask()) Skipped 1 previous similar message 05:23:08:Lustre: DEBUG MARKER: e2label /dev/lvm-MDS/P1 2>/dev/null 05:23:08:Lustre: DEBUG MARKER: lctl set_param fail_loc=0x194 05:23:08:Lustre: lustre-MDT0000: Will be in recovery for at least 1:00, or until 2 clients reconnect 05:23:08:BUG: unable to handle kernel NULL pointer dereference at 0000000000000290 05:23:08:IP: [<ffffffffa074059f>] target_handle_connect+0x249f/0x2d00 [ptlrpc] 05:23:08:PGD 0 05:23:08:Oops: 0000 [#1] SMP 05:23:08:last sysfs file: /sys/devices/system/cpu/possible 05:23:08:CPU 0 05:23:08:Modules linked in: osp(U) lod(U) mdt(U) mgs(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) mdd(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) sha512_generic sha256_generic libcfs(U) ldiskfs(U) jbd2 nfsd exportfs autofs4 nfs lockd fscache auth_rpcgss nfs_acl sunrpc ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 ib_sa ib_mad ib_core microcode virtio_balloon 8139too 8139cp mii i2c_piix4 i2c_core ext3 jbd mbcache virtio_blk virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib] 05:23:08: 05:23:08:Pid: 11705, comm: mdt00_002 Not tainted 2.6.32-358.6.1.el6_lustre.g309e05a.x86_64 #1 Red Hat KVM 05:23:08:RIP: 0010:[<ffffffffa074059f>] [<ffffffffa074059f>] target_handle_connect+0x249f/0x2d00 [ptlrpc] 05:23:08:RSP: 0018:ffff880056b9fb70 EFLAGS: 00010246 05:23:08:RAX: 0000000000000000 RBX: ffff88006481e278 RCX: 0000000000000014 05:23:08:RDX: 0000000000000005 RSI: 0000000000000010 RDI: 0000000000000010 05:23:08:RBP: ffff880056b9fd10 R08: ffffffff81c07720 R09: 0000000000000000 05:23:08:R10: ffff8800667f42d1 R11: 0000000000010000 R12: ffff88007bcf4c00 05:23:08:R13: 0000000000000001 R14: ffff88006481e278 R15: ffff88007bcf4fa0 05:23:08:FS: 00007f06d06ba700(0000) GS:ffff880002200000(0000) knlGS:0000000000000000 05:23:08:CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b 05:23:08:CR2: 0000000000000290 CR3: 0000000037446000 CR4: 00000000000006f0 05:23:08:DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 05:23:08:DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 05:23:08:Process mdt00_002 (pid: 11705, threadinfo ffff880056b9e000, task ffff880065172ae0) 05:23:08:Stack: 05:23:08: ffffffffa07e1bac ffffffffa07e1b9f 000000005190db27 000000005190db26 05:23:08:<d> 0000000200000000 0000000affffffff ffffffffffffffff ffff88006578a800 05:23:08:<d> ffff88006578a800 000000005190db27 ffffffffa07e1ba0 0000000200000001 05:23:08:Call Trace: 05:23:08: [<ffffffffa0dd745a>] mdt_connect+0x3a/0x4e0 [mdt] 05:23:08: [<ffffffffa0dc8df8>] mdt_handle_common+0x648/0x1660 [mdt] 05:23:08: [<ffffffffa0e058c5>] mds_regular_handle+0x15/0x20 [mdt] 05:23:08: [<ffffffffa078b3a8>] ptlrpc_server_handle_request+0x398/0xc60 [ptlrpc] 05:23:08: [<ffffffffa04795de>] ? cfs_timer_arm+0xe/0x10 [libcfs] 05:23:08: [<ffffffffa048ad8f>] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 05:23:08: [<ffffffffa0782709>] ? ptlrpc_wait_event+0xa9/0x290 [ptlrpc] 05:23:08: [<ffffffff81055ab3>] ? __wake_up+0x53/0x70 05:23:08: [<ffffffffa078c73e>] ptlrpc_main+0xace/0x1700 [ptlrpc] 05:23:08: [<ffffffffa078bc70>] ? ptlrpc_main+0x0/0x1700 [ptlrpc] 05:23:08: [<ffffffff8100c0ca>] child_rip+0xa/0x20 05:23:08: [<ffffffffa078bc70>] ? ptlrpc_main+0x0/0x1700 [ptlrpc] 05:23:08: [<ffffffffa078bc70>] ? ptlrpc_main+0x0/0x1700 [ptlrpc] 05:23:08: [<ffffffff8100c0c0>] ? child_rip+0x0/0x20 05:23:08:Code: 00 00 00 31 d2 89 c8 f7 f6 ba 05 00 00 00 44 89 ef 83 f8 05 0f 42 c2 c1 ef 02 41 8d 44 45 05 8d 3c 38 48 8b 83 10 05 00 00 89 fe <48> 8b 80 90 02 00 00 8b 00 f6 c4 20 75 15 83 f9 05 b8 32 00 00 05:23:08:RIP [<ffffffffa074059f>] target_handle_connect+0x249f/0x2d00 [ptlrpc] 05:23:08: RSP <ffff880056b9fb70> 05:23:08:CR2: 0000000000000290 static void check_and_start_recovery_timer(struct obd_device *obd, struct ptlrpc_request *req, int new_client) { int service_time = lustre_msg_get_service_time(req->rq_reqmsg); struct obd_device_target *obt = &obd->u.obt; struct lustre_sb_info *lsi; if (!new_client && service_time) /* Teach server about old server's estimates, as first guess * at how long new requests will take. */ at_measured(&req->rq_rqbd->rqbd_svcpt->scp_at_estimate, service_time); target_start_recovery_timer(obd); /* Convert the service time to RPC timeout, * and reuse service_time to limit stack usage. */ service_time = at_est2timeout(service_time); /* We expect other clients to timeout within service_time, then try * to reconnect, then try the failover server. The max delay between * connect attempts is SWITCH_MAX + SWITCH_INC + INITIAL. */ service_time += 2 * INITIAL_CONNECT_TIMEOUT; LASSERT(obt->obt_magic == OBT_MAGIC); lsi = s2lsi(obt->obt_sb); <== obt->obt_sb is NULL here!!! if (!(lsi->lsi_flags & LDD_F_IR_CAPABLE)) service_time += 2 * (CONNECTION_SWITCH_MAX + CONNECTION_SWITCH_INC); if (service_time > obd->obd_recovery_timeout && !new_client) extend_recovery_timer(obd, service_time, false); }

            People

              dmiter Dmitry Eremin (Inactive)
              sebastien.buisson Sebastien Buisson (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: