[LU-1419] Tracking ticket for gnilnd push Created: 17/May/12  Updated: 19/Dec/12  Resolved: 19/Dec/12

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.4.0

Type: New Feature Priority: Minor
Reporter: Chris Horn Assignee: WC Triage
Resolution: Fixed Votes: 0
Labels: lnet

Attachments: File git-cleanup.sh    
Rank (Obsolete): 8135

 Description   

Cray is preparing to submit gnilnd for upstream adoption. This ticket is for tracking that work.



 Comments   
Comment by James A Simmons [ 17/May/12 ]

Will this work be aimed only at the master branch? Also, during ORNL testing of IR recovery we discovered some old leftover code from the Catamount days that impacted the recovery time. Should we merge that fix under this ticket? It's just the INITIAL_CONNECT_TIMEOUT in obd_support.h.

Comment by Cory Spitz [ 18/May/12 ]

Yes, we'll aim at master. Then, if there is reason to push back, we can do so later.

Other changes that aren't related to the gnilnd should be tracked in a different ticket. James, can you open one for the INITIAL_CONNECT_TIMEOUT problem?

Comment by James A Simmons [ 18/May/12 ]

The ticket for this is http://jira.whamcloud.com/browse/LU-1422. I wanted to let you know for peer review. Thanks.

Comment by Cory Spitz [ 11/Jul/12 ]

Our push of gnilnd is imminent. However, it was written before we adopted the retab policy. Can we please submit with the old whitespace style? It would save us from touching every single line or from keeping our version and the contributed version different. Thoughts or suggestions?

Comment by James A Simmons [ 11/Jul/12 ]

Here is a handy script I used for re-tabbing. After you commit your code, just run it and then push upstream.

#!/bin/bash -e
#
# Rewrite the last commit to remove any trailing whitespace
# in the new version of changed lines.
# Then replace space-based indentation with TAB based indentation
# based on TABS at every eight position
#
[[ -z $TRACE ]] || set -x
# Define both temp files before the trap so that cleanup removes them;
# fall back to /tmp when $TMP is unset.
tmpf1=${TMP:-/tmp}/$$.1.diff
tmpf2=${TMP:-/tmp}/$$.2.diff
trap "rm -f $tmpf1 $tmpf2" 0
git show --binary >$tmpf1
perl -p -e 's/^(\+.*?)[ \t]+$/$1/; while(m/^(\+\t*)( {1,7}\t| {8})(.*)/) { $_=$1."\t".$3."\n"; }' <$tmpf1 >$tmpf2
if ! cmp -s $tmpf1 $tmpf2
then
    # Back out the original commit contents, then apply the cleaned diff
    git apply --binary --index -R --whitespace=nowarn $tmpf1
    git apply --binary --index $tmpf2
    GIT_EDITOR=true git commit --amend
else
    echo "No changes"
fi
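To make the per-line rewrite easier to follow, here is a minimal Python sketch of what the Perl one-liner in the script above appears intended to do. It assumes the two patterns are a trailing-whitespace strip and a leading-indent retab applied only to added ("+") diff lines. The function name retab_diff_line and the regex constants are hypothetical and are not part of git-cleanup.sh.

```python
import re

# Assumed patterns, mirroring the Perl one-liner:
#   1. strip trailing spaces/tabs from an added line
#   2. replace one 8-column unit of leading indentation with a TAB
TRAILING_WS = re.compile(r"^(\+.*?)[ \t]+$")
INDENT_UNIT = re.compile(r"^(\+\t*)( {1,7}\t| {8})(.*)$")


def retab_diff_line(line: str) -> str:
    """Clean one unified-diff line (given without its trailing newline)."""
    # Only lines added by the commit are rewritten; context and
    # removed lines must stay byte-identical so the diff still applies.
    if not line.startswith("+"):
        return line

    line = TRAILING_WS.sub(r"\1", line)

    # Repeat until no full 8-column unit of space-based indentation is left.
    while True:
        match = INDENT_UNIT.match(line)
        if match is None:
            return line
        line = match.group(1) + "\t" + match.group(3)


if __name__ == "__main__":
    for sample in ["+        int rc;  ", "+    \treturn rc;", "-        old line"]:
        print(repr(retab_diff_line(sample)))
```

Indentation shorter than eight columns that is not followed by a TAB is left alone, so alignment spaces after the leading TABs survive.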

Comment by James A Simmons [ 11/Jul/12 ]

Attached to the ticket since JIRA mangled my script.

Comment by Chris Horn [ 11/Jul/12 ]

http://review.whamcloud.com/#change,3381

Comment by Doug Oucharek (Inactive) [ 01/Aug/12 ]

Question: What will be Cray's policy for keeping the gnilnd code synchronized between the Lustre main repository and the Cray repository? Will you be doing code drops to match shipping versions? How will changes made by non-Cray community members be handled?

Comment by Cory Spitz [ 02/Aug/12 ]

Doug, good question. I don't know exactly, but it likely won't be an instantaneous sync. However, Cray won't make code drops to match shipping versions. We intend to keep to the 'master' model. (The version submitted here is already "ahead" of our released versions.) Cray will push up changes on as regular a basis as we can manage. Surely, it would be nice to stay current with 'master'. For other contributions, I suggest that we handle those like any other: through Gerrit with community review.

Comment by James A Simmons [ 02/Aug/12 ]

Hi Cory.

I have been regularly testing master on our Gemini test bed, so I have a motivation to keep this driver working. If Cray doesn't mind, I have no issue with keeping this driver in sync. Please look it over. The error codes you can inject have changed and I just guessed what the values are, so if you attempt to regression test this latest patch internally, your tests will most likely fail. So we need to sync up on that. Also, with your earlier patch some of the Whamcloud engineers had concerns about certain parts of the code. I didn't want to change those parts of the code without a serious inspection from the Cray engineer working on this LNET driver as well. Any changes will be welcomed for testing.

Comment by James A Simmons [ 08/Nov/12 ]

I have pushed an updated driver with Isaac's suggestions. The driver appears to work pretty well. The only thing I have observed with the driver is that on a server node, after LNET is brought up, it can't ping the MGS until after I first ping the routers. Isaac, do you have any idea what could be causing that?

Comment by Isaac Huang (Inactive) [ 16/Nov/12 ]

Can you reproduce it?

Comment by James A Simmons [ 19/Nov/12 ]

Found the bug in the driver causing this. The latest patch works now.

Comment by Peter Jones [ 19/Dec/12 ]

Patch landed for 2.4. I suggest that any subsequent updates are handled under separate tickets for clarity.

Generated at Sat Feb 10 01:16:27 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.