[LU-6229] test racer with lustre_rsync Created: 10/Feb/15 Updated: 10/Dec/15 Resolved: 10/Dec/15 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.8.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | wu libin (Inactive) | Assignee: | Niu Yawei (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | patch | ||
| Attachments: |
|
| Severity: | 3 |
| Rank (Obsolete): | 17439 |
| Description |
|
As I describe in The attached file is the test script. |
| Comments |
| Comment by Oleg Drokin [ 10/Feb/15 ] |
|
Can you please elaborate on what your test is? Racer with lustre_rsync in parallel? The looping symlinks are probably to be expected and might be a test artifact where diff tries to follow symlinks. |
| Comment by wu libin (Inactive) [ 11/Feb/15 ] |
|
Ah, my test is in the attached script (could you check the attached file?). It runs like: I didn't find any holes in the changelogs. |
| Comment by wu libin (Inactive) [ 11/Feb/15 ] |
|
Here I want to share a simple script that causes the problem: pushd /mnt/lustre After running this script, use lustre_rsync to sync the data; the process: That is what I meant when I said there could be orphan nodes under .lustrerepl. |
| Comment by Oleg Drokin [ 11/Feb/15 ] |
|
I see. thanks for these steps. |
| Comment by wu libin (Inactive) [ 12/Feb/15 ] |
|
Yeah, I'm trying to fix it myself, but I have not found the root cause. I'd be glad if someone here could help me. |
| Comment by Li Xi (Inactive) [ 13/Apr/15 ] |
|
The following is how lustre_rsync works (copied from the code):

 * 1. creat
 *    If tfid is absent on the source-fs, ignore this operation
 *    If pfid is absent on the source-fs [or]
 *    if f2p(pfid) is not present on target-fs [or]
 *    if f2p(pfid)+name != f2p(tfid)
 *        creat .lustrerepl/tfid
 *        track [pfid,tfid,name]
 *    Else
 *        creat f2p[tfid]
 *
 * 2. remove
 *    If .lustrerepl/[tfid] is present on the target
 *        rm .lustrerepl/[tfid]
 *    Else if pfid is present on the source-fs,
 *        if f2p(pfid)+name is present,
 *            rm f2p(pfid)+name
 *
 * 3. move (spfid,sname) to (pfid,name)
 *    If pfid is present
 *        if spfid is also present, mv (spfid,sname) to (pfid,name)
 *        else mv .lustrerepl/[sfid] to (pfid,name)
 *    Else if pfid is not present,
 *        if spfid is present, mv (spfid,sname) .lustrerepl/[sfid]
 *    move out all its children in .lustrerepl.
 *    [pfid,tfid,name] tracked from (1) is used for this.

And the following is how the tree is created and replayed. Each line gives the operation, then the resulting tree.

Created:
1. mkdir racer -> racer
2. mkdir racer/11 -> racer/11
3. mkdir racer/13 -> racer/11 racer/13
4. mv racer/13 racer/14 -> racer/11 racer/14
5. mkdir racer/14/14 -> racer/11 racer/14/14
6. mv racer/14 racer/11 -> racer/11/14/14

Replayed:
1. mkdir racer -> racer
2. mkdir racer/11 -> racer/11
3. mkdir racer/13 -> racer/11 .lustrerepl/tfidp[13]
4. mv racer/13 racer/14 -> racer/11 racer/14
5. mkdir racer/14/14 -> racer/11 racer/14 .lustrerepl/tfidp[14]
6. mv racer/14 racer/11 -> racer/11/14 .lustrerepl/tfidp[14]

So it seems that step 4 of the replay has already gone wrong. Libin, are we able to reproduce this with a single-threaded script? |
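The creat rule quoted above can be sketched as a small decision function. This is only an illustrative model, not the lustre_rsync code: the function name, its parameters, and the use of a set of strings for the target tree are hypothetical, and it assumes that f2p means "resolve a FID to its current path on the source". It also assumes the replay runs after all six operations have finished, as in the reproduction steps given earlier.

```python
from typing import Optional, Set

REPL_DIR = ".lustrerepl"  # staging directory named in the quoted comment


def creat_destination(tfid: str, name: str,
                      source_tfid_path: Optional[str],
                      source_parent_path: Optional[str],
                      target_paths: Set[str]) -> Optional[str]:
    """Model of rule 1 (creat): return the path chosen on the target,
    or None when the record is ignored. All arguments are hypothetical
    stand-ins for f2p(tfid), f2p(pfid) and the state of the target-fs."""
    # tfid is absent on the source-fs: ignore this operation.
    if source_tfid_path is None:
        return None
    # Parent missing on source or target, or the current path of the file
    # differs from parent path + name: stage it under .lustrerepl/<tfid>.
    if (source_parent_path is None
            or source_parent_path not in target_paths
            or f"{source_parent_path}/{name}" != source_tfid_path):
        return f"{REPL_DIR}/{tfid}"
    # Otherwise create the file at its real path.
    return source_tfid_path
```

For replay step 3 (mkdir racer/13), the directory already lives at racer/11/14 on the source by the time the changelog is replayed, so "racer" + "13" no longer matches its current path and the model stages it under .lustrerepl, which matches the replay listing above.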
| Comment by wu libin (Inactive) [ 11/May/15 ] |
|
Yeah, a single thread of this script will cause the problem. Oleg, any suggestions about this problem? |
| Comment by Li Xi (Inactive) [ 12/May/15 ] |
|
I might be wrong, but the following might be incorrect:

 * 1. creat
 *    If tfid is absent on the source-fs, ignore this operation
 *    If pfid is absent on the source-fs [or]
 *    if f2p(pfid) is not present on target-fs [or]
 *    if f2p(pfid)+name != f2p(tfid)
 *        creat .lustrerepl/tfid
 *        track [pfid,tfid,name]
 *    Else
 *        creat f2p[tfid]

In the case of 'if f2p(pfid)+name != f2p(tfid)', we need to create file tfid at the path "f2p(pfid)+name". I will make a patch and check whether this is the cause. |
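One way to read the change suggested above is the following sketch. It is a hypothetical model of the proposal as worded in this comment, not the content of any patch: the function name and parameters are made up for illustration, and f2p is again assumed to mean "resolve a FID to its current path on the source".

```python
from typing import Optional, Set

REPL_DIR = ".lustrerepl"  # staging directory named in the quoted comment


def creat_destination_proposed(tfid: str, name: str,
                               source_tfid_path: Optional[str],
                               source_parent_path: Optional[str],
                               target_paths: Set[str]) -> Optional[str]:
    """Model of the suggested creat rule: a path mismatch alone no longer
    sends the file to .lustrerepl. Arguments are hypothetical stand-ins
    for f2p(tfid), f2p(pfid) and the state of the target-fs."""
    # tfid is absent on the source-fs: ignore this operation.
    if source_tfid_path is None:
        return None
    # Parent missing on source or target: still stage under .lustrerepl.
    if source_parent_path is None or source_parent_path not in target_paths:
        return f"{REPL_DIR}/{tfid}"
    # Parent is usable: create at f2p(pfid)+name, even if the file has
    # since been renamed or moved to a different path on the source.
    return f"{source_parent_path}/{name}"
```

Under this reading, replaying mkdir racer/13 would create racer/13 on the target even though the directory has since moved on the source, so the later rename records would find it in place instead of under .lustrerepl.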
| Comment by Gerrit Updater [ 22/May/15 ] |
|
Li Xi (lixi@ddn.com) uploaded a new patch: http://review.whamcloud.com/14914 |
| Comment by Li Xi (Inactive) [ 22/May/15 ] |
|
The script that can reproduce the problem easily. |
| Comment by Niu Yawei (Inactive) [ 28/May/15 ] |
|
Thank you for the patch, LiXi. |
| Comment by Gerrit Updater [ 10/Dec/15 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/14914/ |
| Comment by Joseph Gmitter (Inactive) [ 10/Dec/15 ] |
|
Landed for 2.8 |