[LU-1411] panic in lmd_parse() on kernels < 2.6.18 Created: 15/May/12 Updated: 05/Dec/14 Resolved: 17/Oct/12 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Ned Bass | Assignee: | Zhenyu Xu |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | bgp, client, ppc | ||
| Environment: |
IBM BlueGene P |
||
| Severity: | 3 |
| Rank (Obsolete): | 6395 |
| Description |
|
We've begun testing the Lustre 2.1 client on our BlueGene P development system. It immediately panicked on the first mount attempt. This turns out to be related to a patch landed for

    #if (LINUX_VERSION_CODE < KERNEL_VERSION(2,6,18))
    struct super_block * lustre_get_sb(struct file_system_type *fs_type, int flags,
                                       const char *devname, void * data)
    {
            return get_sb_nodev(fs_type, flags, data, lustre_fill_super);
    }
    #else
    int lustre_get_sb(struct file_system_type *fs_type, int flags,
                      const char *devname, void * data, struct vfsmount *mnt)
    {
            struct lustre_mount_data2 lmd2 = {data, mnt};

            return get_sb_nodev(fs_type, flags, &lmd2, lustre_fill_super, mnt);
    }
    #endif

However, lustre_fill_super() unconditionally casts its void *data argument to a struct lustre_mount_data2 *, even though the < 2.6.18 branch above passes the raw mount data through unwrapped. This causes a crash when we dereference into it on older kernels.

    /** Parse mount line options
     * e.g. mount -v -t lustre -o abort_recov uml1:uml2:/lustre-client /mnt/lustre
     * dev is passed as device=uml1:/lustre by mount.lustre
     */
    static int lmd_parse(char *options, struct lustre_mount_data *lmd)
    {
            char *s1, *s2, *devname = NULL;
            struct lustre_mount_data *raw = (struct lustre_mount_data *)options;
    ...
            /* Options should be a string - try to detect old lmd data */
            if ((raw->lmd_magic & 0xffffff00) == (LMD_MAGIC & 0xffffff00)) { <--- crashes here
    ...
    int lustre_fill_super(struct super_block *sb, void *data, int silent)
    {
            struct lustre_mount_data *lmd;
            struct lustre_mount_data2 *lmd2 = data;
    ...

            /* Figure out the lmd from the mount options */
            if (lmd_parse((char *)(lmd2->lmd2_data), lmd)) {
|
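To make the failure mode concrete, here is a minimal userspace sketch (not Lustre code) of what the mismatched cast does. It assumes a simplified two-pointer stand-in for struct lustre_mount_data2; the names lmd2_model, lmd2_mnt, bogus_lmd2_data() and wrapped_lmd2_data() are hypothetical and exist only for illustration. When the raw option string is handed over unwrapped, reading lmd2_data amounts to reinterpreting the first pointer-sized bytes of the string as an address, which is what lmd_parse() then dereferences.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simplified, hypothetical stand-in for struct lustre_mount_data2. */
struct lmd2_model {
    void *lmd2_data; /* mount option string */
    void *lmd2_mnt;  /* vfsmount on newer kernels (assumed field name) */
};

/*
 * Value that a read of lmd2->lmd2_data would produce if `options` were a
 * raw option string mistaken for a struct lmd2_model. memcpy is used so
 * the model itself stays well defined.
 */
uintptr_t bogus_lmd2_data(const char *options)
{
    uintptr_t value = 0;

    /* The model needs at least a pointer's worth of bytes to copy. */
    assert(strlen(options) + 1 >= sizeof(value));
    memcpy(&value, options, sizeof(value));
    return value;
}

/* Value of the same read when the caller really passed a wrapper. */
uintptr_t wrapped_lmd2_data(const struct lmd2_model *lmd2)
{
    return (uintptr_t)lmd2->lmd2_data;
}
```

With an option string such as "device=uml1:/lustre", wrapped_lmd2_data() returns the address of the string, while bogus_lmd2_data() returns the string's leading characters reinterpreted as an address. A pointer built from that value is what the raw->lmd_magic check ends up reading through.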
| Comments |
| Comment by Ned Bass [ 15/May/12 ] |
|
Here is a proof-of-concept patch that at least lets me build and mount on 2.6.16 and 2.6.32 kernels. I'm not sure if passing a null vfsmount pointer to ll_fill_super() is the right thing to do on older kernels. |
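The attached patch itself is not reproduced in this ticket, so the following is only a guess at the general shape of such a workaround, modeled in userspace: wrap the mount data in the two-pointer structure on both kernel branches and pass a null mount pointer on the older one. All names here (lmd2_model, lmd2_mnt, fill_super_model(), get_sb_old_kernel_model(), get_sb_new_kernel_model()) are hypothetical. Whether a null vfsmount is acceptable to ll_fill_super() remains the open question raised above.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical stand-in for struct lustre_mount_data2. */
struct lmd2_model {
    void *lmd2_data; /* mount option string */
    void *lmd2_mnt;  /* NULL on the modeled older-kernel path */
};

/* Models lustre_fill_super(): always treats `data` as the wrapper. */
static const char *fill_super_model(void *data)
{
    struct lmd2_model *lmd2 = data;

    assert(lmd2 != NULL);
    return lmd2->lmd2_data;
}

/* Older-kernel path: no vfsmount is available, so wrap with NULL. */
const char *get_sb_old_kernel_model(void *data)
{
    struct lmd2_model lmd2 = { data, NULL };

    return fill_super_model(&lmd2);
}

/* Newer-kernel path: wrap the data together with the mount pointer. */
const char *get_sb_new_kernel_model(void *data, void *mnt)
{
    struct lmd2_model lmd2 = { data, mnt };

    return fill_super_model(&lmd2);
}
```

In this model both entry points hand fill_super_model() the same wrapper type, so the option string is recovered intact on either path. Any consumer of lmd2_mnt would still need to tolerate NULL.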
| Comment by Peter Jones [ 16/May/12 ] |
|
Bobijam, could you please comment on the validity of this suggested approach? Ned, wouldn't it be safer to continue to run 1.8.x clients that support this older kernel version? Peter |
| Comment by Ned Bass [ 16/May/12 ] |
|
Hi Peter, Yes, we originally planned to run Lustre 1.8 clients on our BGP systems until they retire. But then our BGP users ran into some bugs, i.e. |
| Comment by Zhenyu Xu [ 16/May/12 ] |
|
master branch port at http://review.whamcloud.com/2820 |
| Comment by Peter Jones [ 02/Jun/12 ] |
|
Ned, is my understanding correct that you have abandoned this approach and are now using 1.8.x clients for your systems running older kernels? Peter |
| Comment by Christopher Morrone [ 04/Jun/12 ] |
|
The plan to put 2.1 on BG/P is only delayed, not abandoned. We need to get 2.1 working reasonably well because I do not have the resources to test and debug 1.8 compatibility with every new 2.X release. The BG/P systems will be around for at least two more years, and the 1.8 to 2.1 compatibility is not good enough to last that long. We'll continually have new breakage as servers go beyond 2.1. |
| Comment by James A Simmons [ 17/Oct/12 ] |
|
This bug is a duplicate of http://review.whamcloud.com/#change,3661 Also NFS with Lustre is broken in older kernels which is addressed with |
| Comment by Peter Jones [ 17/Oct/12 ] |
|
Thanks for the tip James - it's always good to get older tickets cleaned up where possible! |
| Comment by James A Simmons [ 12/Nov/12 ] |
|
The patch from |