[LU-3886] sanity test_56a: @@@@@@ FAIL: /usr/bin/lfs getstripe --obd wrong: found 6, expected 3 Created: 05/Sep/13 Updated: 17/Mar/20 Resolved: 17/Mar/20 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.5.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Li Xi (Inactive) | Assignee: | Emoly Liu |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None |
| Attachments: |
|
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 10111 |
| Description |
|
This problem is similar to |
| Comments |
| Comment by Peter Jones [ 06/Sep/13 ] |
|
Emoly, could you please comment on this one? Thanks, Peter |
| Comment by Emoly Liu [ 09/Sep/13 ] |
|
"lfs setstripe -d" should be enough to clear directory striping information. LiXi, could you tell me how you hit this problem? IMO, we can add "lfs getstripe -v" after "setstripe -d" to print that striping information, and see if this problem will happen again. |
| Comment by Li Xi (Inactive) [ 10/Sep/13 ] |
|
I've posted the script (run-sanity.sh) to hit this problem (and |
| Comment by Li Xi (Inactive) [ 10/Sep/13 ] |
|
What is interesting is that when I add a sleep to the test suite, the problem is gone. That makes me believe that the problem is similar to test_56a() { # was test_56 |
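If a fixed sleep hides the failure, the check is probably racing against something that settles a moment later. One way to probe that without picking an arbitrary delay is to poll for the expected value with a timeout. The following is only a minimal sketch: `poll_until` is a hypothetical helper written for this illustration, not a function taken from sanity.sh or the Lustre test framework.

```shell
# poll_until TIMEOUT_SECS EXPECTED CMD [ARGS...]
# Hypothetical helper (an assumption, not part of the Lustre test scripts).
# Re-runs CMD once per second until its output equals EXPECTED.
# Returns 0 on a match, 1 if TIMEOUT_SECS elapse without one.
poll_until() {
    timeout=$1
    expected=$2
    shift 2
    elapsed=0
    while :; do
        # Capture the command's current output; errors are ignored on purpose.
        actual=$("$@" 2>/dev/null)
        [ "$actual" = "$expected" ] && return 0
        [ "$elapsed" -ge "$timeout" ] && return 1
        sleep 1
        elapsed=$((elapsed + 1))
    done
}
```

A test could then wrap its counting command, for example `poll_until 10 3 some_count_command`, where `some_count_command` stands in for whatever the test uses to count matching objects. If the poll succeeds where the immediate check fails, that supports the timing explanation.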
| Comment by Emoly Liu [ 10/Sep/13 ] |
|
I just ran the script at "https://jira.hpdd.intel.com/secure/attachment/13414/run.sh" on my local VM. It printed "No errors after xxx iters" 10000 times. My step is: I tried several times, and no error happened. |
| Comment by Li Xi (Inactive) [ 10/Sep/13 ] |
|
Oh, sorry, please run it on the lustre mount point '/mnt/lustre' rather than its directory '/mnt/lustre/dir', i.e. sh run.sh /mnt/lustre/ I got the following output: No error after 640 iters |
| Comment by Emoly Liu [ 10/Sep/13 ] |
|
Yes, this time I hit that. I will investigate it. |
| Comment by Emoly Liu [ 11/Sep/13 ] |
|
The problem you found by run.sh is probably related to the following code:

When we set stripe for root (mount point), set_default is enabled in ll_dir_ioctl()

    case LL_IOC_LOV_SETSTRIPE: {
            ...
            int set_default = 0;
            ...
            if (inode->i_sb->s_root == file->f_dentry)
                    set_default = 1;

            /* in v1 and v3 cases lumv1 points to data */
            rc = ll_dir_setstripe(inode, lumv1, set_default);

Then, in ll_dir_setstripe() if set_default=1, we will call ll_send_mgc_param() to set information asynchronously.

    if (set_default && mgc->u.cli.cl_mgc_mgsexp) {
            /* Set root stripesize */
            /* Set root stripecount */
            /* Set root stripeoffset */
    }

Since you run setstripe very frequently and many times in run.sh, the config log queue might be very long (bottleneck), and mgs will take more time to process it.

BTW, can you hit this problem if you don't use run.sh, just run sanity.sh regularly? |
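The hypothesis above can be pictured with a toy model: updates are queued and the caller returns at once, while a slower consumer applies them later, so a read issued right after the write may still see the old value. This sketch is purely illustrative. The names `queue_update`, `apply_one`, and `read_value` are invented here and do not correspond to any Lustre function, and the model says nothing about how the MGS actually processes its config log.

```shell
# Toy model only: every name below is hypothetical, none exists in Lustre.
visible=1      # value that readers currently observe
pending=""     # space-separated updates queued but not yet applied

# Queue an update and return immediately, like an asynchronous send.
queue_update() { pending="$pending $1"; }

# The slow consumer applies exactly one queued update per call.
apply_one() {
    set -- $pending            # split the queue into positional parameters
    [ $# -gt 0 ] || return 0   # nothing queued
    visible=$1
    shift
    pending="$*"
}

# Readers only ever see the applied value.
read_value() { echo "$visible"; }

queue_update 2
queue_update 3
stale=$(read_value)   # read before the consumer has run
apply_one
apply_one
fresh=$(read_value)   # read after the queue has drained
echo "before drain: $stale, after drain: $fresh"
# prints "before drain: 1, after drain: 3"
```

The longer the queue, the longer a reader keeps observing a stale value, which would be consistent with the failure appearing only after many rapid setstripe calls on the mount point.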
| Comment by Li Xi (Inactive) [ 11/Sep/13 ] |
|
Yeah, I hit the problem of 'found 6, expected 3' every time I run sanity.sh. |
| Comment by James Nunez (Inactive) [ 25/Jan/15 ] |
|
I've hit this problem with lustre-master tag 2.6.92. Results at https://testing.hpdd.intel.com/test_sets/37e63f92-9f0d-11e4-91b3-5254006e85c2 |
| Comment by Andreas Dilger [ 17/Mar/20 ] |
|
Haven't seen this in a long time. |