[LU-2590] lfs setstripe broken on ppc Created: 08/Jan/13  Updated: 02/May/18  Resolved: 13/Feb/13

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.4.0

Type: Bug Priority: Blocker
Reporter: Christopher Morrone Assignee: Zhenyu Xu
Resolution: Fixed Votes: 0
Labels: MB, ppc, sequoia, topsequoia

Attachments: File LU-2590_log.txt.bz2     File client_lustre_log.txt.bz2     File grove-mds1_log2.txt.bz2     File mds_lustre_log.txt.bz2     File seqio1_log2.txt.bz2    
Issue Links:
Related
is related to LU-10100 sanity test_27a: setstripe failed wit... Resolved
is related to LU-10984 ‘lfs setstripe’ fails with “lfs setst... Closed
Severity: 3
Rank (Obsolete): 6037

 Description   

On ppc64, with lustre version 2.3.58-4chaos (which is 2.3.58, plus LLNL patches, plus a few that we've pulled in for testing), lfs setstripe no longer seems to work on ppc64 clients. I'm not sure at what point in time they stopped working.

Here's the error:

$ lfs setstripe -c 2 testfile
error on ioctl 0x8008669a for 'testfile' (3): Invalid argument

Afterward there is a 0-length file created that has no stripe information, which makes sense if the striping ioctl() failed, but the mknod() succeeded.

I will attach the lustre log from running the above command. It is very verbose; I used "-1" debug level.



 Comments   
Comment by Christopher Morrone [ 08/Jan/13 ]

Client log from failed lfs setstripe.

Comment by Christopher Morrone [ 08/Jan/13 ]

I am not able to set striping using xattrs any longer either. Interestingly, there are no error codes returned by the xattr striping path, the setting are just ignored.

Comment by Andreas Dilger [ 09/Jan/13 ]

Chris, do you know what version was the last one that worked? The one patch I can think of off the top of my head is the lmm_layout_gen change (commit f90abfdc9), but it was landed way back in 2.1.53, so I'd think you have used PPC since then.

Comment by Peter Jones [ 09/Jan/13 ]

Bobijam

Could you please look into this one?

Thanks

Peter

Comment by Christopher Morrone [ 09/Jan/13 ]

Andreas, unfortunately no, I can't remember specifically in which version last I last successfully used striping. I suspect it worked in either 2.1.56 or 2.1.57, because I was testing striping through xattrs, which led me to submit the LU-2485 bug. But I could be wrong.

Comment by Zhenyu Xu [ 09/Jan/13 ]

Can I have the verbose MDS log of the operation please?

Comment by Christopher Morrone [ 10/Jan/13 ]

Attached files mds_lustre_log.txt.bz2 and client_lustre_log.txt.bz2. This was full debugging while running the following:

seqio1-ib0:/p/lsfull/morrone$ lfs setstripe -c 2 twowide
error on ioctl 0x8008669a for 'twowide' (3): Invalid argument
error: setstripe: create stripe file 'twowide' failed

The client's nid is 172.20.12.1@o2ib500.

The MDS's nid is 172.20.5.1@o2ib500.

Comment by Zhenyu Xu [ 10/Jan/13 ]

From mds log, it reports that the client pattern data is erroneous (stripe pattern: 0x0)

00000100:00000001:12.0:1357842862.414573:0:13308:0:(pack_generic.c:2160:lustre_swab_lov_user_md_v1()) Process entered
00000100:00000080:12.0:1357842862.414574:0:13308:0:(pack_generic.c:2161:lustre_swab_lov_user_md_v1()) swabbing lov_user_md v1
00000100:00000001:12.0:1357842862.414575:0:13308:0:(pack_generic.c:2146:lustre_swab_lov_user_md_common()) Process entered
00000100:00001000:12.0:1357842862.414575:0:13308:0:(pack_generic.c:2133:print_lum()) lov_user_md ffffc900a2aa11f8:
00000100:00001000:12.0:1357842862.414576:0:13308:0:(pack_generic.c:2134:print_lum()) lmm_magic: 0xbd10bd0
00000100:00001000:12.0:1357842862.414576:0:13308:0:(pack_generic.c:2135:print_lum()) lmm_pattern: 0x0
00000100:00001000:12.0:1357842862.414577:0:13308:0:(pack_generic.c:2136:print_lum()) lmm_object_id: 0
00000100:00001000:12.0:1357842862.414577:0:13308:0:(pack_generic.c:2137:print_lum()) lmm_object_gr: 0
00000100:00001000:12.0:1357842862.414577:0:13308:0:(pack_generic.c:2138:print_lum()) lmm_stripe_size: 0x0
00000100:00001000:12.0:1357842862.414578:0:13308:0:(pack_generic.c:2139:print_lum()) lmm_stripe_count: 0x2
00000100:00001000:12.0:1357842862.414578:0:13308:0:(pack_generic.c:2141:print_lum()) lmm_stripe_offset/lmm_layout_gen: 0xffff
00000100:00000001:12.0:1357842862.414578:0:13308:0:(pack_generic.c:2155:lustre_swab_lov_user_md_common()) Process leaving
00000100:00000001:12.0:1357842862.414579:0:13308:0:(pack_generic.c:2163:lustre_swab_lov_user_md_v1()) Process leaving
00020000:00000001:12.0:1357842862.414579:0:13308:0:(lod_qos.c:1213:lod_use_defined_striping()) Process entered
00000004:00000001:12.0:1357842862.414580:0:13308:0:(lod_lov.c:726:lod_verify_striping()) Process entered
00000004:00000080:12.0:1357842862.414581:0:13308:0:(lod_lov.c:742:lod_verify_striping()) bad userland stripe pattern: 0x0
00000004:00000001:12.0:1357842862.414581:0:13308:0:(lod_lov.c:743:lod_verify_striping()) Process leaving (rc=18446744073709551594 : -22 : ffffffffffffffea)

Comment by Zhenyu Xu [ 10/Jan/13 ]

patch tracking at http://review.whamcloud.com/4996

commit message
LU-2590 lod: magic changed after swab

In lod_qos_parse_config(), the stack magic variable should be changed
to native endianness as well.

Comment by Christopher Morrone [ 15/Jan/13 ]

lfs setstripe appears to work again with patch 4996 applied. Thanks!

Comment by Christopher Morrone [ 15/Jan/13 ]

It looks like the fix still leaves something not-quite-right with the swabbing because I am seeing this:

seqio1-ib0:/p/lsfull/morrone/striping_test$ dmesg
seqio1-ib0:/p/lsfull/morrone/striping_test$ lfs setstripe -c 9 nine
seqio1-ib0:/p/lsfull/morrone/striping_test$ dmesg
LustreError: 6803:0:(lov_pack.c:160:lov_packmd()) bad mem LOV MAGIC: 0xD00BD10B != 0x0BD10BD0 nor 0x0BD30BD0

The command succeeds though.

Comment by Zhenyu Xu [ 15/Jan/13 ]

What does further "$ lfs getstripe nine" come out? Also success w/o error message in dmesg?

Comment by Zhenyu Xu [ 15/Jan/13 ]

Since I don't have ppc node, would you mind collecting -1 logs of MDS/Client w/ patch 4996 applied?

Comment by Christopher Morrone [ 17/Jan/13 ]

lfs getstripe works without an error.

I just did the following:

seqio1-ib0:/p/lsfull/morrone/striping_test$ lfs setstripe -c 9 ninefile && sleep 1 && lfs getstripe ninefile
ninefile
lmm_stripe_count:   9
lmm_stripe_size:    1048576
lmm_layout_gen:     0
lmm_stripe_offset:  141
        obdidx           objid          objid            group
           141            7811         0x1e83                0
           142            7811         0x1e83                0
           143            7779         0x1e63                0
           144            7811         0x1e83                0
           145            7875         0x1ec3                0
           146            7811         0x1e83                0
           147            7811         0x1e83                0
           148            7811         0x1e83                0
           149            7875         0x1ec3                0

The client log will be attached as "seqio1_log2.txt.bz2", and mds log will be in "grove-mds1_log2.txt.bz2".

Comment by Zhenyu Xu [ 22/Jan/13 ]

In seqio1_log2.txt.bz2, I saw that in lov_getstripe(), lov_packmd() was called twice

00020000:00000001:0.0:1358445309.046861:2640:4308:0:(lov_obd.c:1901:lov_iocontrol()) Process entered
00020000:00000001:0.0:1358445309.046870:2928:4308:0:(lov_pack.c:590:lov_getstripe()) Process entered
00020000:00000001:0.0:1358445309.046879:3184:4308:0:(lov_pack.c:145:lov_packmd()) Process entered
00020000:00000010:0.0:1358445309.046888:3184:4308:0:(lov_pack.c:201:lov_packmd()) kmalloced '*lmmp': 248 at c0000003790ce900.
00020000:00000040:0.0:1358445309.046899:3184:4308:0:(lov_pack.c:207:lov_packmd()) lov_packmd: LOV_MAGIC 0x0BD10BD0, lmm_size = 248 
00020000:00000001:0.0:1358445309.046908:3312:4308:0:(lov_pack.c:247:lov_packmd()) Process leaving (rc=248 : 248 : f8)
00000100:00000001:0.0:1358445309.046918:3104:4308:0:(pack_generic.c:2179:lustre_swab_lov_mds_md()) Process entered
00000100:00000080:0.0:1358445309.046929:3104:4308:0:(pack_generic.c:2180:lustre_swab_lov_mds_md()) swabbing lov_mds_md
00000100:00000001:0.0:1358445309.046936:3104:4308:0:(pack_generic.c:2188:lustre_swab_lov_mds_md()) Process leaving
00000100:00000001:0.0:1358445309.046945:3104:4308:0:(pack_generic.c:2196:lustre_swab_lov_user_md_objects()) Process entered
00000100:00000001:0.0:1358445309.046954:3104:4308:0:(pack_generic.c:2203:lustre_swab_lov_user_md_objects()) Process leaving
00020000:00000001:0.0:1358445309.046964:2928:4308:0:(obd_class.h:726:obd_packmd()) Process entered
00020000:00000001:0.0:1358445309.046971:3184:4308:0:(lov_pack.c:145:lov_packmd()) Process entered
00020000:00020000:0.0:1358445309.046977:3184:4308:0:(lov_pack.c:160:lov_packmd()) bad mem LOV MAGIC: 0xD00BD10B != 0x0BD10BD0 nor 0x0BD30BD0
00020000:00000001:0.0:1358445309.047032:3312:4308:0:(lov_pack.c:161:lov_packmd()) Process leaving (rc=18446744073709551594 : -22 : ffffffffffffffea)
00020000:00000001:0.0:1358445309.047043:3056:4308:0:(obd_class.h:732:obd_packmd()) Process leaving (rc=18446744073709551594 : -22 : ffffffffffffffea)
00020000:00000001:0.0:1358445309.047052:3056:4308:0:(lov_pack.c:671:lov_getstripe()) Process leaving (rc=0 : 0 : 0)
00020000:00000001:0.0:1358445309.047061:2768:4308:0:(lov_obd.c:2087:lov_iocontrol()) Process leaving (rc=0 : 0 : 0)

While in our 2.4 source code, lov_getstripe() only calls it once, is it possible that there is discrepancy between our source codes?

Comment by Prakash Surya (Inactive) [ 22/Jan/13 ]

I believe that log was taken when running 2.3.58-5chaos. Looking at lov_getstripe, lov_packmd is called once, and obd_packmd is called once (which appears to then call lov_packmd).

Here's the source level information, with inlined git blame info for each line as well:

c5050e41 502 (phil              2003-12-03 03:16:26 +0000 |578 int lov_getstripe(struct obd_export *exp, struct lov_stripe_md *lsm,                                                                                                      
c5050e41 503 (phil              2003-12-03 03:16:26 +0000 |579                   struct lov_user_md *lump)                                                                                                                               
ccb42f24 309 (adilger           2003-01-06 22:22:15 +0000 |580 {                         

...

041f9454 631 (yangsheng         2009-03-13 15:05:48 +0000 |617         }                                                                                                                                                                 
041f9454 632 (yangsheng         2009-03-13 15:05:48 +0000 |618         rc = lov_packmd(exp, &lmmk, lsm);                                                                                                                                 
041f9454 633 (yangsheng         2009-03-13 15:05:48 +0000 |619         if (rc < 0)                                                                                                                                                       
041f9454 634 (yangsheng         2009-03-13 15:05:48 +0000 |620                 GOTO(out_set, rc); 

...

041f9454 680 (yangsheng         2009-03-13 15:05:48 +0000 |667                                                                                                                                                                           
041f9454 681 (yangsheng         2009-03-13 15:05:48 +0000 |668         obd_free_diskmd(exp, &lmmk);                                                                                                                                      
041f9454 682 (yangsheng         2009-03-13 15:05:48 +0000 |669 out_set:                                                                                                                                                                  
d2d56f38 463 (tappro            2007-07-30 21:08:59 +0000 |670         set_fs(seg);                                                                                                                                                      
ccb42f24 344 (adilger           2003-01-06 22:22:15 +0000 |671         RETURN(rc);                                                                                                                                                       
ccb42f24 345 (adilger           2003-01-06 22:22:15 +0000 |672 }    

Looking at the git blame info, both calls have been in place for a long time. So this doesn't appear to be a discrepancy issue between our trees.. I'm curious why obd_free_diskmd (i.e. obd_packmd) isn't called in your local test case.

Comment by Zhenyu Xu [ 22/Jan/13 ]

sorry, your are right, obi_free_diskmd() called lov_packmd(), and the root cause is it

   rc = lov_pack(exp, &lmmk, lsm);         ---> lmmk LE endianness
...
   if ((cpu_to_le32(LOV_MAGIC) != LOV_MAGIC) &&
            ((lmmk->lmm_magic == cpu_to_le32(LOV_MAGIC_V1)) ||
            (lmmk->lmm_magic == cpu_to_le32(LOV_MAGIC_V3)))) {
                lustre_swab_lov_mds_md(lmmk);
                lustre_swab_lov_user_md_objects(
                                (struct lov_user_ost_data*)lmmk->lmm_objects,
                                lmmk->lmm_stripe_count);
   }                                       ---> lmmk host endianness (BE)
...
   obd_free_diskmd(exp, &lmmk);    ---> but obd_free_diskmd() presumes that lmmk is LE, so that lov_packmd() errored out w/o free the memory.

Comment by Zhenyu Xu [ 22/Jan/13 ]

the 2nd issue patch tracking at http://review.whamcloud.com/5145

commit message
LU-2590 obdclass: correct swab lov_mds_md

For caller's convenience, the 2nd parameter of obd_free_diskmd()
could be host endianness, it needs swab to LE if necessary, while just
lov_mds_md header needs it for figuring out how much memory needs to
be freed.
Comment by Andreas Dilger [ 13/Feb/13 ]

Second patch landed for 2.4.

Generated at Sat Feb 10 01:26:29 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.