[LU-2590] lfs setstripe broken on ppc Created: 08/Jan/13 Updated: 02/May/18 Resolved: 13/Feb/13 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.4.0 |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Christopher Morrone | Assignee: | Zhenyu Xu |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | MB, ppc, sequoia, topsequoia | ||
| Attachments: |
|
||||||||||||
| Issue Links: |
|
||||||||||||
| Severity: | 3 | ||||||||||||
| Rank (Obsolete): | 6037 | ||||||||||||
| Description |
|
On ppc64, with lustre version 2.3.58-4chaos (which is 2.3.58, plus LLNL patches, plus a few that we've pulled in for testing), lfs setstripe no longer seems to work on ppc64 clients. I'm not sure at what point in time they stopped working. Here's the error: $ lfs setstripe -c 2 testfile error on ioctl 0x8008669a for 'testfile' (3): Invalid argument Afterward there is a 0-length file created that has no stripe information, which makes sense if the striping ioctl() failed, but the mknod() succeeded. I will attach the lustre log from running the above command. It is very verbose; I used "-1" debug level. |
| Comments |
| Comment by Christopher Morrone [ 08/Jan/13 ] |
|
Client log from failed lfs setstripe. |
| Comment by Christopher Morrone [ 08/Jan/13 ] |
|
I am not able to set striping using xattrs any longer either. Interestingly, there are no error codes returned by the xattr striping path, the setting are just ignored. |
| Comment by Andreas Dilger [ 09/Jan/13 ] |
|
Chris, do you know what version was the last one that worked? The one patch I can think of off the top of my head is the lmm_layout_gen change (commit f90abfdc9), but it was landed way back in 2.1.53, so I'd think you have used PPC since then. |
| Comment by Peter Jones [ 09/Jan/13 ] |
|
Bobijam Could you please look into this one? Thanks Peter |
| Comment by Christopher Morrone [ 09/Jan/13 ] |
|
Andreas, unfortunately no, I can't remember specifically in which version last I last successfully used striping. I suspect it worked in either 2.1.56 or 2.1.57, because I was testing striping through xattrs, which led me to submit the |
| Comment by Zhenyu Xu [ 09/Jan/13 ] |
|
Can I have the verbose MDS log of the operation please? |
| Comment by Christopher Morrone [ 10/Jan/13 ] |
|
Attached files mds_lustre_log.txt.bz2 and client_lustre_log.txt.bz2. This was full debugging while running the following: seqio1-ib0:/p/lsfull/morrone$ lfs setstripe -c 2 twowide error on ioctl 0x8008669a for 'twowide' (3): Invalid argument error: setstripe: create stripe file 'twowide' failed The client's nid is 172.20.12.1@o2ib500. The MDS's nid is 172.20.5.1@o2ib500. |
| Comment by Zhenyu Xu [ 10/Jan/13 ] |
|
From mds log, it reports that the client pattern data is erroneous (stripe pattern: 0x0)
|
| Comment by Zhenyu Xu [ 10/Jan/13 ] |
|
patch tracking at http://review.whamcloud.com/4996 commit message LU-2590 lod: magic changed after swab In lod_qos_parse_config(), the stack magic variable should be changed to native endianness as well. |
| Comment by Christopher Morrone [ 15/Jan/13 ] |
|
lfs setstripe appears to work again with patch 4996 applied. Thanks! |
| Comment by Christopher Morrone [ 15/Jan/13 ] |
|
It looks like the fix still leaves something not-quite-right with the swabbing because I am seeing this: seqio1-ib0:/p/lsfull/morrone/striping_test$ dmesg seqio1-ib0:/p/lsfull/morrone/striping_test$ lfs setstripe -c 9 nine seqio1-ib0:/p/lsfull/morrone/striping_test$ dmesg LustreError: 6803:0:(lov_pack.c:160:lov_packmd()) bad mem LOV MAGIC: 0xD00BD10B != 0x0BD10BD0 nor 0x0BD30BD0 The command succeeds though. |
| Comment by Zhenyu Xu [ 15/Jan/13 ] |
|
What does further "$ lfs getstripe nine" come out? Also success w/o error message in dmesg? |
| Comment by Zhenyu Xu [ 15/Jan/13 ] |
|
Since I don't have ppc node, would you mind collecting -1 logs of MDS/Client w/ patch 4996 applied? |
| Comment by Christopher Morrone [ 17/Jan/13 ] |
|
lfs getstripe works without an error. I just did the following: seqio1-ib0:/p/lsfull/morrone/striping_test$ lfs setstripe -c 9 ninefile && sleep 1 && lfs getstripe ninefile
ninefile
lmm_stripe_count: 9
lmm_stripe_size: 1048576
lmm_layout_gen: 0
lmm_stripe_offset: 141
obdidx objid objid group
141 7811 0x1e83 0
142 7811 0x1e83 0
143 7779 0x1e63 0
144 7811 0x1e83 0
145 7875 0x1ec3 0
146 7811 0x1e83 0
147 7811 0x1e83 0
148 7811 0x1e83 0
149 7875 0x1ec3 0
The client log will be attached as "seqio1_log2.txt.bz2", and mds log will be in "grove-mds1_log2.txt.bz2". |
| Comment by Zhenyu Xu [ 22/Jan/13 ] |
|
In seqio1_log2.txt.bz2, I saw that in lov_getstripe(), lov_packmd() was called twice 00020000:00000001:0.0:1358445309.046861:2640:4308:0:(lov_obd.c:1901:lov_iocontrol()) Process entered 00020000:00000001:0.0:1358445309.046870:2928:4308:0:(lov_pack.c:590:lov_getstripe()) Process entered 00020000:00000001:0.0:1358445309.046879:3184:4308:0:(lov_pack.c:145:lov_packmd()) Process entered 00020000:00000010:0.0:1358445309.046888:3184:4308:0:(lov_pack.c:201:lov_packmd()) kmalloced '*lmmp': 248 at c0000003790ce900. 00020000:00000040:0.0:1358445309.046899:3184:4308:0:(lov_pack.c:207:lov_packmd()) lov_packmd: LOV_MAGIC 0x0BD10BD0, lmm_size = 248 00020000:00000001:0.0:1358445309.046908:3312:4308:0:(lov_pack.c:247:lov_packmd()) Process leaving (rc=248 : 248 : f8) 00000100:00000001:0.0:1358445309.046918:3104:4308:0:(pack_generic.c:2179:lustre_swab_lov_mds_md()) Process entered 00000100:00000080:0.0:1358445309.046929:3104:4308:0:(pack_generic.c:2180:lustre_swab_lov_mds_md()) swabbing lov_mds_md 00000100:00000001:0.0:1358445309.046936:3104:4308:0:(pack_generic.c:2188:lustre_swab_lov_mds_md()) Process leaving 00000100:00000001:0.0:1358445309.046945:3104:4308:0:(pack_generic.c:2196:lustre_swab_lov_user_md_objects()) Process entered 00000100:00000001:0.0:1358445309.046954:3104:4308:0:(pack_generic.c:2203:lustre_swab_lov_user_md_objects()) Process leaving 00020000:00000001:0.0:1358445309.046964:2928:4308:0:(obd_class.h:726:obd_packmd()) Process entered 00020000:00000001:0.0:1358445309.046971:3184:4308:0:(lov_pack.c:145:lov_packmd()) Process entered 00020000:00020000:0.0:1358445309.046977:3184:4308:0:(lov_pack.c:160:lov_packmd()) bad mem LOV MAGIC: 0xD00BD10B != 0x0BD10BD0 nor 0x0BD30BD0 00020000:00000001:0.0:1358445309.047032:3312:4308:0:(lov_pack.c:161:lov_packmd()) Process leaving (rc=18446744073709551594 : -22 : ffffffffffffffea) 00020000:00000001:0.0:1358445309.047043:3056:4308:0:(obd_class.h:732:obd_packmd()) Process leaving (rc=18446744073709551594 : -22 : ffffffffffffffea) 00020000:00000001:0.0:1358445309.047052:3056:4308:0:(lov_pack.c:671:lov_getstripe()) Process leaving (rc=0 : 0 : 0) 00020000:00000001:0.0:1358445309.047061:2768:4308:0:(lov_obd.c:2087:lov_iocontrol()) Process leaving (rc=0 : 0 : 0) While in our 2.4 source code, lov_getstripe() only calls it once, is it possible that there is discrepancy between our source codes? |
| Comment by Prakash Surya (Inactive) [ 22/Jan/13 ] |
|
I believe that log was taken when running 2.3.58-5chaos. Looking at lov_getstripe, lov_packmd is called once, and obd_packmd is called once (which appears to then call lov_packmd). Here's the source level information, with inlined git blame info for each line as well: c5050e41 502 (phil 2003-12-03 03:16:26 +0000 |578 int lov_getstripe(struct obd_export *exp, struct lov_stripe_md *lsm,
c5050e41 503 (phil 2003-12-03 03:16:26 +0000 |579 struct lov_user_md *lump)
ccb42f24 309 (adilger 2003-01-06 22:22:15 +0000 |580 {
...
041f9454 631 (yangsheng 2009-03-13 15:05:48 +0000 |617 }
041f9454 632 (yangsheng 2009-03-13 15:05:48 +0000 |618 rc = lov_packmd(exp, &lmmk, lsm);
041f9454 633 (yangsheng 2009-03-13 15:05:48 +0000 |619 if (rc < 0)
041f9454 634 (yangsheng 2009-03-13 15:05:48 +0000 |620 GOTO(out_set, rc);
...
041f9454 680 (yangsheng 2009-03-13 15:05:48 +0000 |667
041f9454 681 (yangsheng 2009-03-13 15:05:48 +0000 |668 obd_free_diskmd(exp, &lmmk);
041f9454 682 (yangsheng 2009-03-13 15:05:48 +0000 |669 out_set:
d2d56f38 463 (tappro 2007-07-30 21:08:59 +0000 |670 set_fs(seg);
ccb42f24 344 (adilger 2003-01-06 22:22:15 +0000 |671 RETURN(rc);
ccb42f24 345 (adilger 2003-01-06 22:22:15 +0000 |672 }
Looking at the git blame info, both calls have been in place for a long time. So this doesn't appear to be a discrepancy issue between our trees.. I'm curious why obd_free_diskmd (i.e. obd_packmd) isn't called in your local test case. |
| Comment by Zhenyu Xu [ 22/Jan/13 ] |
|
sorry, your are right, obi_free_diskmd() called lov_packmd(), and the root cause is it rc = lov_pack(exp, &lmmk, lsm); ---> lmmk LE endianness
...
if ((cpu_to_le32(LOV_MAGIC) != LOV_MAGIC) &&
((lmmk->lmm_magic == cpu_to_le32(LOV_MAGIC_V1)) ||
(lmmk->lmm_magic == cpu_to_le32(LOV_MAGIC_V3)))) {
lustre_swab_lov_mds_md(lmmk);
lustre_swab_lov_user_md_objects(
(struct lov_user_ost_data*)lmmk->lmm_objects,
lmmk->lmm_stripe_count);
} ---> lmmk host endianness (BE)
...
obd_free_diskmd(exp, &lmmk); ---> but obd_free_diskmd() presumes that lmmk is LE, so that lov_packmd() errored out w/o free the memory.
|
| Comment by Zhenyu Xu [ 22/Jan/13 ] |
|
the 2nd issue patch tracking at http://review.whamcloud.com/5145 commit message LU-2590 obdclass: correct swab lov_mds_md For caller's convenience, the 2nd parameter of obd_free_diskmd() could be host endianness, it needs swab to LE if necessary, while just lov_mds_md header needs it for figuring out how much memory needs to be freed. |
| Comment by Andreas Dilger [ 13/Feb/13 ] |
|
Second patch landed for 2.4. |