
insanity test_1: out of memory on MDT in crypto_create_tfm()

Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Blocker
    • None
    • Affects Version/s: Lustre 2.6.0, Lustre 2.5.2, Lustre 2.5.3
    • None
    • Severity: 3
    • 14011

    Description

      This issue was created by maloo for wangdi <di.wang@intel.com>

      This issue relates to the following test suite run: http://maloo.whamcloud.com/test_sets/4f1c047c-dd60-11e3-9f1e-52540035b04c.

      The sub-test test_1 failed with the following error:

      16:39:44:Lustre: DEBUG MARKER: == insanity test 1: MDS/MDS failure == 15:39:30 (1400279970)
      16:39:44:Lustre: DEBUG MARKER: grep -c /mnt/mds1' ' /proc/mounts
      16:39:44:Lustre: DEBUG MARKER: umount -d /mnt/mds1
      16:39:44:Lustre: Failing over lustre-MDT0000
      16:42:00:LustreError: 137-5: lustre-MDT0001_UUID: not available for connect from 10.1.4.136@tcp (no target). If you are running an HA pair check that the target is mounted on the other server.
      16:42:00:LustreError: Skipped 2839 previous similar messages
      16:42:00:Lustre: 11949:0:(client.c:1914:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1400279972/real 1400279972] req@ffff88007a72f000 x1468278858237216/t0(0) o251->MGC10.1.4.144@tcp@0@lo:26/25 lens 224/224 e 0 to 1 dl 1400279978 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
      16:42:00:Lustre: 11949:0:(client.c:1914:ptlrpc_expire_one_request()) Skipped 12 previous similar messages
      16:42:00:Lustre: server umount lustre-MDT0000 complete
      16:42:00:Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
      16:42:00:Lustre: DEBUG MARKER: hostname
      16:42:00:Lustre: DEBUG MARKER: mkdir -p /mnt/mds1
      16:42:00:Lustre: DEBUG MARKER: test -b /dev/lvm-Role_MDS/P1
      16:42:00:Lustre: DEBUG MARKER: mkdir -p /mnt/mds1; mount -t lustre -o user_xattr,acl /dev/lvm-Role_MDS/P1 /mnt/mds1
      16:42:00:LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. quota=on. Opts:
      16:42:00:LustreError: 12262:0:(mdt_coordinator.c:976:mdt_hsm_cdt_start()) lustre-MDT0000: cannot take the layout locks needed for registered restore: -2
      16:42:00:Lustre: lustre-MDT0000: Will be in recovery for at least 1:00, or until 5 clients reconnect
      16:42:00:Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u
      16:42:00:LNet: 12440:0:(debug.c:218:libcfs_debug_str2mask()) You are trying to use a numerical value for the mask - this will be deprecated in a future release.
      16:42:00:LNet: 12440:0:(debug.c:218:libcfs_debug_str2mask()) Skipped 1 previous similar message
      16:42:00:Lustre: DEBUG MARKER: e2label /dev/lvm-Role_MDS/P1 2>/dev/null
      16:42:00:Lustre: mdt: This server is not able to keep up with request traffic (cpu-bound).
      16:42:00:Lustre: 12222:0:(service.c:1511:ptlrpc_at_check_timed()) earlyQ=1 reqQ=0 recA=0, svcEst=21, delay=0(jiff)
      16:42:00:Lustre: 12222:0:(service.c:1308:ptlrpc_at_send_early_reply()) @@@ Already past deadline (1s), not sending early reply. Consider increasing at_early_margin (5)? req@ffff88002ef5e400 x1468299694051176/t0(0) o400->ac3e68be-778c-5e81-3dfa-5ce31a3eb36d@10.1.4.142@tcp:0/0 lens 224/0 e 225451 to 0 dl 1400280113 ref 2 fl Complete:H/c0/ffffffff rc 0/-1
      16:42:00:Lustre: 12229:0:(service.c:1308:ptlrpc_at_send_early_reply()) @@@ Already past deadline (1s), not sending early reply. Consider increasing at_early_margin (5)? req@ffff880022bb0000 x1468278870043488/t0(0) o400->lustre-MDT0001-mdtlov_UUID@10.1.4.140@tcp:0/0 lens 224/0 e 1032545 to 0 dl 1400280114 ref 2 fl Complete:/c0/ffffffff rc 0/-1
      16:42:00:mdt_out00_000 invoked oom-killer: gfp_mask=0xd0, order=0, oom_adj=0, oom_score_adj=0
      16:42:00:mdt_out00_000 cpuset=/ mems_allowed=0
      16:42:00:Pid: 12229, comm: mdt_out00_000 Not tainted 2.6.32-431.17.1.el6_lustre.g684b38d.x86_64 #1
      16:42:00:Call Trace:
      16:42:00: [<ffffffff810d0211>] ? cpuset_print_task_mems_allowed+0x91/0xb0
      16:42:00: [<ffffffff811225c0>] ? dump_header+0x90/0x1b0
      16:42:00: [<ffffffff8129058d>] ? __bitmap_intersects+0x1d/0xa0
      16:42:00: [<ffffffff8122781c>] ? security_real_capable_noaudit+0x3c/0x70
      16:42:00: [<ffffffff81122a42>] ? oom_kill_process+0x82/0x2a0
      16:42:00: [<ffffffff8112293e>] ? select_bad_process+0x9e/0x120
      16:42:00: [<ffffffff81122e80>] ? out_of_memory+0x220/0x3c0
      16:42:00: [<ffffffff8112f79f>] ? __alloc_pages_nodemask+0x89f/0x8d0
      16:42:01: [<ffffffff8116e082>] ? kmem_getpages+0x62/0x170
      16:42:01: [<ffffffff8116ec9a>] ? fallback_alloc+0x1ba/0x270
      16:42:01: [<ffffffff8116e6ef>] ? cache_grow+0x2cf/0x320
      16:42:01: [<ffffffff8116ea19>] ? ____cache_alloc_node+0x99/0x160
      16:42:01: [<ffffffff8124bc3c>] ? crypto_create_tfm+0x3c/0xe0
      16:42:01: [<ffffffff8116f7e9>] ? __kmalloc+0x189/0x220
      16:42:01: [<ffffffff8124bc3c>] ? crypto_create_tfm+0x3c/0xe0
      16:42:01: [<ffffffff812525d8>] ? crypto_init_shash_ops+0x68/0x100
      16:42:01: [<ffffffff8124bd4a>] ? __crypto_alloc_tfm+0x6a/0x130
      16:42:01: [<ffffffff8124c5ba>] ? crypto_alloc_base+0x5a/0xb0
      16:42:01: [<ffffffffa04a2684>] ? cfs_percpt_unlock+0x24/0xb0 [libcfs]
      16:42:01: [<ffffffffa048d2ca>] ? cfs_crypto_hash_alloc+0x7a/0x290 [libcfs]
      16:42:01: [<ffffffffa048d5da>] ? cfs_crypto_hash_digest+0x6a/0xf0 [libcfs]
      16:42:01: [<ffffffff8116f86c>] ? __kmalloc+0x20c/0x220
      16:42:01: [<ffffffffa082bd73>] ? lustre_msg_calc_cksum+0xd3/0x130 [ptlrpc]
      16:42:01: [<ffffffffa0865881>] ? null_authorize+0xa1/0x100 [ptlrpc]
      16:42:01: [<ffffffffa0854a56>] ? sptlrpc_svc_wrap_reply+0x56/0x1c0 [ptlrpc]
      16:42:01: [<ffffffffa08241ec>] ? ptlrpc_send_reply+0x1fc/0x7f0 [ptlrpc]
      16:42:01: [<ffffffffa083b675>] ? ptlrpc_at_check_timed+0xc05/0x1360 [ptlrpc]
      16:42:01: [<ffffffffa0832c09>] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc]
      16:42:01: [<ffffffffa083cf68>] ? ptlrpc_main+0x1198/0x1980 [ptlrpc]
      16:42:01: [<ffffffffa083bdd0>] ? ptlrpc_main+0x0/0x1980 [ptlrpc]
      16:42:01: [<ffffffff8109ab56>] ? kthread+0x96/0xa0
      16:42:01: [<ffffffff8100c20a>] ? child_rip+0xa/0x20
      16:42:01: [<ffffffff8109aac0>] ? kthread+0x0/0xa0
      16:42:01: [<ffffffff8100c200>] ? child_rip+0x0/0x20
      16:42:01:Mem-Info:

      test failed to respond and timed out

      Info required for matching: insanity 1

    Attachments

    Issue Links

    Activity

            sarah Sarah Liu added a comment -

            seen in interop test with 2.6.0 server and master client
            https://testing.hpdd.intel.com/test_sets/8d7a5190-68fe-11e4-9444-5254006e85c2


            adilger Andreas Dilger added a comment -

            This was addressed by landing http://review.whamcloud.com/11213 instead of reverting the 9100 patch.

            simmonsja James A Simmons added a comment -

            I hate to see it reverted since the original patch helped reduce a lot of the client evictions I saw earlier. So reverting would cause just as much damage in my testing.
            hornc Chris Horn added a comment -

            I agree with James that the patch for LU-5079 should resolve this issue.


            simmonsja James A Simmons added a comment -

            Instead of reverting it, can we try the patch for LU-5079 - http://review.whamcloud.com/#/c/11213? It should address the problem with the LU-4578 patch.
            jhammond John Hammond added a comment -

            My local testing shows that running with http://review.whamcloud.com/#/c/12330/ eliminates the looping in early reply.


            adilger Andreas Dilger added a comment -

            I've created http://review.whamcloud.com/12330 to revert this patch.

            adilger Andreas Dilger added a comment -

            That the node is looping on early reply makes sense, since that is the main codepath that is invoking crypto_alloc_tfm().
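            For reference, the OOM stack in the description bottoms out in crypto_create_tfm() because cfs_crypto_hash_digest() allocates a fresh crypto transform for every checksum and frees it again before returning, so each early reply pays for a full tfm allocation. Below is a minimal sketch of that allocate-digest-free pattern against the 2.6.32-era kernel crypto API; sketch_hash_digest() is a hypothetical stand-in, not the actual libcfs code:

            #include <linux/crypto.h>
            #include <linux/err.h>
            #include <linux/scatterlist.h>

            /* Sketch only: one crypto_alloc_hash()/crypto_free_hash() pair per
             * digest.  Under the early-reply loop this means thousands of
             * transform allocations per second, which is where the oom-killer
             * eventually fires inside crypto_create_tfm(). */
            static int sketch_hash_digest(const char *alg_name, const void *buf,
                                          unsigned int len, u8 *out)
            {
                    struct hash_desc desc;
                    struct scatterlist sl;
                    int rc;

                    desc.tfm = crypto_alloc_hash(alg_name, 0, 0); /* -> crypto_create_tfm() */
                    if (IS_ERR(desc.tfm))
                            return PTR_ERR(desc.tfm);
                    desc.flags = 0;

                    sg_init_one(&sl, buf, len);
                    rc = crypto_hash_digest(&desc, &sl, len, out);

                    crypto_free_hash(desc.tfm); /* transform freed again immediately */
                    return rc;
            }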
            jhammond John Hammond added a comment -

            I think these OOMs are due to lnet+ptlrpc and not to LDLM. In the crash dumps I find only a trivial number of locks on the MDT, but lnet_memused may be as high as 700 MB. Running replay-single 0c (one place where we see OOMs) under trace, I see that ptlrpc_send_early_reply() is being called thousands of times a second to handle the same three OBD_PINGs:

            awk -F: '$4 ~ /1413555299/' 8.dk
            00000100:00000010:1.0:1413555299.934838:0:13846:0:(service.c:1335:ptlrpc_at_send_early_reply()) kmalloced 'reqmsg': 224 at ffff88006dc987c0.
            02000000:00000010:1.0:1413555299.934839:0:13846:0:(sec_null.c:327:null_alloc_rs()) kmalloced 'rs': 488 at ffff880068e26200.
            00000100:00000010:1.0:1413555299.934841:0:13846:0:(service.c:1394:ptlrpc_at_send_early_reply()) kfreed 'reqmsg': 224 at ffff88006dc987c0.
            00000100:00000010:1.0:1413555299.934842:0:13846:0:(client.c:414:ptlrpc_request_cache_free()) slab-freed '(req)': 984 at ffff880032b64400.
            00000100:00001000:1.0:1413555299.934843:0:13846:0:(service.c:1276:ptlrpc_at_send_early_reply()) @@@ sending early reply (deadline +0s, margin -21s) for 21+30  req@ffff880066e91400 x1482175522279684/t0(0) o400->lustre-MDT0002-mdtlov_UUID@10.1.5.202@tcp:0/0 lens 224/0 e 27720 to 0 dl 1413555299 ref 2 fl Complete:/c0/ffffffff rc 0/-1
            00000100:00001000:1.0:1413555299.934845:0:13846:0:(import.c:1570:at_measured()) add 21 to ffff88007cab0150 time=21 v=21 (21 0 0 0)
            00000100:00000010:1.0:1413555299.934846:0:13846:0:(client.c:408:ptlrpc_request_cache_alloc()) slab-alloced 'req': 984 at ffff880032b64400.
            00000100:00000010:1.0:1413555299.934847:0:13846:0:(service.c:1335:ptlrpc_at_send_early_reply()) kmalloced 'reqmsg': 224 at ffff88006dc987c0.
            02000000:00000010:1.0:1413555299.934848:0:13846:0:(sec_null.c:327:null_alloc_rs()) kmalloced 'rs': 488 at ffff880068e25e00.
            00000100:00000010:1.0:1413555299.934850:0:13846:0:(service.c:1394:ptlrpc_at_send_early_reply()) kfreed 'reqmsg': 224 at ffff88006dc987c0.
            00000100:00000010:1.0:1413555299.934851:0:13846:0:(client.c:414:ptlrpc_request_cache_free()) slab-freed '(req)': 984 at ffff880032b64400.
            00000100:00001000:1.0:1413555299.934852:0:13846:0:(service.c:1476:ptlrpc_at_check_timed()) timeout in +0s, asking for 30 secs on 2 early replies
            00000100:00001000:1.0:1413555299.934853:0:13846:0:(service.c:1276:ptlrpc_at_send_early_reply()) @@@ sending early reply (deadline +0s, margin -21s) for 21+30  req@ffff880066e91400 x1482175522279684/t0(0) o400->lustre-MDT0002-mdtlov_UUID@10.1.5.202@tcp:0/0 lens 224/0 e 27721 to 0 dl 1413555299 ref 2 fl Complete:/c0/ffffffff rc 0/-1
            00000100:00001000:1.0:1413555299.934855:0:13846:0:(import.c:1570:at_measured()) add 21 to ffff88007cab0150 time=21 v=21 (21 0 0 0)
            00000100:00000010:1.0:1413555299.934856:0:13846:0:(client.c:408:ptlrpc_request_cache_alloc()) slab-alloced 'req': 984 at ffff880032b64400.
            00000100:00000010:1.0:1413555299.934856:0:13846:0:(service.c:1335:ptlrpc_at_send_early_reply()) kmalloced 'reqmsg': 224 at ffff88006dc987c0.
            02000000:00000010:1.0:1413555299.934857:0:13846:0:(sec_null.c:327:null_alloc_rs()) kmalloced 'rs': 488 at ffff880068e25a00.
            00000100:00000010:1.0:1413555299.934860:0:13846:0:(service.c:1394:ptlrpc_at_send_early_reply()) kfreed 'reqmsg': 224 at ffff88006dc987c0.
            00000100:00000010:1.0:1413555299.934860:0:13846:0:(client.c:414:ptlrpc_request_cache_free()) slab-freed '(req)': 984 at ffff880032b64400.
            ...
            
            [root@shadow-37vm1 ~]# awk -F: '$4 ~ /1413555299/' 8.dk | grep -c ptlrpc_at_send_early_reply
            13749
            

            In every case the new request deadline is the same as the old deadline.

            [root@shadow-37vm1 ~]# awk -F: '$4 ~ /1413555299/' 8.dk | grep -c null_alloc_rs
            4583
            [root@shadow-37vm1 ~]# awk -F: '$4 ~ /1413555299/' 8.dk | grep -c null_free_rs
            2580
            [root@shadow-37vm1 ~]# awk -F: '$4 ~ /1413555300/' 8.dk | grep -c null_alloc_rs
            95
            [root@shadow-37vm1 ~]# awk -F: '$4 ~ /1413555300/' 8.dk | grep -c null_free_rs
            49217
            [root@shadow-37vm1 ~]# 
            

            Perhaps we should look at reverting or adjusting:

            commit 8b2f9c0e40822dd847494f27d9ef4c6858fee78c
            Author: Chris Horn <hornc@cray.com>
            Date:   Mon Feb 3 15:24:01 2014 -0600
            
                LU-4578 ptlrpc: Early replies need to honor at_max
                
                When determining whether an early reply can be sent the server will
                calculate the new deadline based on an offset from the request
                arrival time. However, when actually setting the new deadline
                the server offsets the current time. This can result in deadlines
                being extended more than at_max seconds past the request arrival
                time. Instead, the server should offset the arrival time when updating
                its request timeout.
                
                When a client receives an early reply it doesn't know the server side
                arrival time so we use the original sent time as an approximation.
                
                Signed-off-by: Chris Horn <hornc@cray.com>
                Change-Id: I1b7b04bd20ea5d471ab3777f987e89add744b813
                Reviewed-on: http://review.whamcloud.com/9100
                Tested-by: Jenkins
                Reviewed-by: James Simmons <uja.ornl@gmail.com>
                Tested-by: Maloo <hpdd-maloo@intel.com>
                Reviewed-by: Christopher J. Morrone <chris.morrone.llnl@gmail.com>
                Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
            
            ...
            
            @@ -1344,7 +1352,6 @@ static int ptlrpc_at_send_early_reply(struct ptlrpc_request *req)
                                    RETURN(-ETIMEDOUT);
                            }
                    }
            -       newdl = cfs_time_current_sec() + at_get(&svcpt->scp_at_estimate);
             
                    reqcopy = ptlrpc_request_cache_alloc(GFP_NOFS);
                    if (reqcopy == NULL)
            @@ -1390,13 +1397,14 @@ static int ptlrpc_at_send_early_reply(struct ptlrpc_request *req)
             
                     rc = ptlrpc_send_reply(reqcopy, PTLRPC_REPLY_EARLY);
             
            -        if (!rc) {
            -                /* Adjust our own deadline to what we told the client */
            -                req->rq_deadline = newdl;
            -                req->rq_early_count++; /* number sent, server side */
            -        } else {
            -                DEBUG_REQ(D_ERROR, req, "Early reply send failed %d", rc);
            -        }
            +       if (!rc) {
            +               /* Adjust our own deadline to what we told the client */
            +               req->rq_deadline = req->rq_arrival_time.tv_sec +
            +                                  at_get(&svcpt->scp_at_estimate);
            +               req->rq_early_count++; /* number sent, server side */
            +       } else {
            +               DEBUG_REQ(D_ERROR, req, "Early reply send failed %d", rc);
            +       }
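            
            Read together with the trace above, this hunk explains the looping: the recomputed deadline is anchored to the fixed arrival time, so once rq_arrival_time + scp_at_estimate falls behind the current time the request is past its deadline again the moment the early reply is sent, and ptlrpc_at_check_timed() picks it up on the very next pass. A hedged restatement of that feedback loop (field names taken from the diff, control flow heavily simplified, not the actual service.c logic):

            /* Sketch only: why the LU-4578 deadline update can spin. */
            while (req->rq_deadline <= cfs_time_current_sec()) {
                    /* sends another early reply for the same OBD_PING,
                     * kmallocing a reqcopy and a reply state each time ... */
                    ptlrpc_at_send_early_reply(req);

                    /* ... but the recomputed deadline never advances, because
                     * both terms are constant once the AT estimate has stopped
                     * growing: */
                    req->rq_deadline = req->rq_arrival_time.tv_sec +
                                       at_get(&svcpt->scp_at_estimate);
            }

            Each pass through that loop accounts for one 'reqmsg'/'rs' allocation pair in the trace, which is consistent with lnet_memused climbing into the hundreds of MB.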
            
            sarah Sarah Liu added a comment -

            another instance seen in replay-vbr: https://testing.hpdd.intel.com/test_sets/473c9eda-5428-11e4-abcf-5254006e85c2

            People

              Assignee: Niu Yawei (Inactive)
              Reporter: Maloo
              Votes: 0
              Watchers: 16
