Details
-
Bug
-
Resolution: Fixed
-
Major
-
None
-
ZFS-based OST
-
3
-
11705
Description
Synopsis: a ZFS-based OST responds to 'lctl create/destroy' from an
obdecho client with an error, and obdfilter-survey always hits that
error.
The following details are in two parts. The first part lays out in
concrete detail how I built the ZFS-based Lustre instance on which
this experiment rests. The second part goes through the specific steps
to produce the error. The steps in the second part are, so far as I
can tell, exactly those steps that obdfilter-survey would take given the
command line (and results):
[root@oss01 andrew]# nobjhi=2 thrhi=2 size=1024 case=disk /usr/bin/obdfilter-survey
Fri Nov 15 14:33:43 PST 2013 Obdfilter-survey for case=disk from oss01
ost 1 sz 1048576K rsz 1024K obj 1 thr 1 write 853.95 SHORT rewrite 485.57 [ 382.97, 382.97] read 825.94 SHORT
ost 1 sz 1048576K rsz 1024K obj 1 thr 2 ERROR: 1 != 0
create: 1 objects
error: create: #1 - File exists
created object #s on localhost:zlustre-OST0000_ecc not contiguous
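When a survey sweeps many obj/thr combinations, it can help to pull out only the rows that failed. The following is a minimal sketch, assuming the row layout shown above; survey_failures is a hypothetical helper name and is not part of obdfilter-survey.

```shell
#!/bin/sh
# Hypothetical helper, not part of obdfilter-survey: print the obj/thr
# combination of every result row that contains an ERROR.
survey_failures() {
    awk '/^ost / && /ERROR:/ {
        obj = "?"; thr = "?"
        for (i = 1; i < NF; i++) {
            if ($i == "obj") obj = $(i + 1)
            if ($i == "thr") thr = $(i + 1)
        }
        print "obj=" obj " thr=" thr
    }'
}

# The two result rows from the run above.
survey_failures <<'EOF'
ost 1 sz 1048576K rsz 1024K obj 1 thr 1 write 853.95 SHORT rewrite 485.57 [ 382.97, 382.97] read 825.94 SHORT
ost 1 sz 1048576K rsz 1024K obj 1 thr 2 ERROR: 1 != 0
EOF
```

For the rows above this reports only the obj 1 / thr 2 combination, which is the first point at which the survey has to create objects a second time.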
----Part I----------------------------------------------------------
The OpenSFS cluster hosted at Indiana University has 24 "client" nodes
and 8 "server" nodes. My work has been on
c[21-24],mds[03-04],oss[01-02], but these details are really focused
on just oss01. I will give the details of the construction of the MGS
and the (separate) MDS, but feel free to skip them. I do not think
they are directly relevant.
--MGS---------
[root@mds04 andrew]# modprobe lustre
[root@mds04 andrew]# zpool create -f zpool-mgt0 mirror /dev/sd[ab] mirror /dev/sd[st]
[root@mds04 andrew]# umount zpool-mgt0
[root@mds04 andrew]# mkfs.lustre --mgs --fsname zlustre --backfstype=zfs --reformat zpool-mgt0/mgt0
Permanent disk data:
Target: MGS
Index: unassigned
Lustre FS: zlustre
Mount type: zfs
Flags: 0x64
(MGS first_time update )
Persistent mount opts:
Parameters:
mkfs_cmd = zfs create -o canmount=off -o xattr=sa zpool-mgt0/mgt0
Writing zpool-mgt0/mgt0 properties
lustre:version=1
lustre:flags=100
lustre:index=65535
lustre:fsname=zlustre
lustre:svname=MGS
[root@mds04 andrew]# mkdir /mgt0
[root@mds04 andrew]# mount -t lustre zpool-mgt0/mgt0 /mgt0
--MGS---------
--MDS---------
[root@mds03 andrew]# modprobe lustre
[root@mds03 andrew]# zpool create -f zpool-mdt0 mirror /dev/sd[gh] mirror /dev/sd[mn]
[root@mds03 andrew]# mkfs.lustre --mdt --index=0 --fsname zlustre --mgsnid=192.168.2.128@o2ib --backfstype=zfs --reformat zpool-mdt0/mdt0
Permanent disk data:
Target: zlustre:MDT0000
Index: 0
Lustre FS: zlustre
Mount type: zfs
Flags: 0x61
(MDT first_time update )
Persistent mount opts:
Parameters: mgsnode=192.168.2.128@o2ib
mkfs_cmd = zfs create -o canmount=off -o xattr=sa zpool-mdt0/mdt0
Writing zpool-mdt0/mdt0 properties
lustre:version=1
lustre:flags=97
lustre:index=0
lustre:fsname=zlustre
lustre:svname=zlustre:MDT0000
lustre:mgsnode=192.168.2.128@o2ib
[root@mds03 andrew]# mkdir /mdt0
[root@mds03 andrew]# mount -t lustre zpool-mdt0/mdt0 /mdt0
--MDS---------
And here is the OSS where the actual experiment takes place:
--OSS---------
[root@oss01 andrew]# modprobe lustre
[root@oss01 andrew]# zpool create -f zpool-ost0 raidz2 /dev/sd[rstuvwx]
[root@oss01 andrew]# mkfs.lustre --ost --index=0 --fsname zlustre --mgsnid=192.168.2.128@o2ib --backfstype=zfs --reformat zpool-ost0/ost0
Permanent disk data:
Target: zlustre:OST0000
Index: 0
Lustre FS: zlustre
Mount type: zfs
Flags: 0x62
(OST first_time update )
Persistent mount opts:
Parameters: mgsnode=192.168.2.128@o2ib
mkfs_cmd = zfs create -o canmount=off -o xattr=sa zpool-ost0/ost0
Writing zpool-ost0/ost0 properties
lustre:version=1
lustre:flags=98
lustre:index=0
lustre:fsname=zlustre
lustre:svname=zlustre:OST0000
lustre:mgsnode=192.168.2.128@o2ib
[root@oss01 andrew]# umount /zpool-ost0/
[root@oss01 andrew]# mkdir /ost0
[root@oss01 andrew]# mount -t lustre zpool-ost0/ost0 /ost0
[root@oss01 andrew]# nobjhi=2 thrhi=2 size=1024 case=disk /usr/bin/obdfilter-survey
Thu Nov 14 15:11:28 PST 2013 Obdfilter-survey for case=disk from oss01
ost 1 sz 1048576K rsz 1024K obj 1 thr 1 write 805.38 SHORT rewrite 450.65 [ 453.97, 453.97] read 757.87 SHORT
ost 1 sz 1048576K rsz 1024K obj 1 thr 2 ERROR: 1 != 0
create: 1 objects
error: create: #1 - File exists
created object #s on localhost:zlustre-OST0000_ecc not contiguous
--OSS---------
N.B. I also constructed two OSTs on oss02, but they also play no role in
this experiment.
Having built the above file system, I mounted it on two of the clients
and ran a few rudimentary 'dd' file creates and 'cp' file copies,
verifying that the file system itself appears to be working correctly.
Once a clean new instance of the file system had been recreated, I ran
the obdfilter-survey command shown at the top of this note, with the
results reported there. Cliff White has confirmed that this error also
occurs on his platform, Hyperion. After investigating how
obdfilter-survey actually interacts with 'lctl' and the 'obdecho'
client, I abstracted the core details and recreated the error (again
with a clean new file system) with a minimum of distraction. Those
details are in Part II, along with a brief note about the error that
appears on the console.
----Part I----------------------------------------------------------
----Part II---------------------------------------------------------
[root@oss01 andrew]# modprobe obdecho
[root@oss01 andrew]# lctl dl
0 UP osd-zfs zlustre-OST0000-osd zlustre-OST0000-osd_UUID 5
1 UP mgc MGC192.168.2.128@o2ib 46fb8d7b-7224-7952-9c9f-43af71bdf872 5
2 UP ost OSS OSS_uuid 3
3 UP obdfilter zlustre-OST0000 zlustre-OST0000_UUID 5
4 UP lwp zlustre-MDT0000-lwp-OST0000 zlustre-MDT0000-lwp-OST0000_UUID 5
[root@oss01 andrew]# lctl
lctl > attach echo_client zlustre-OST0000_ecc zlustre-OST0000_ecc_UUID
lctl > setup zlustre-OST0000
lctl > dl
0 UP osd-zfs zlustre-OST0000-osd zlustre-OST0000-osd_UUID 5
1 UP mgc MGC192.168.2.128@o2ib 46fb8d7b-7224-7952-9c9f-43af71bdf872 5
2 UP ost OSS OSS_uuid 3
3 UP obdfilter zlustre-OST0000 zlustre-OST0000_UUID 6
4 UP lwp zlustre-MDT0000-lwp-OST0000 zlustre-MDT0000-lwp-OST0000_UUID 5
5 UP echo_client zlustre-OST0000_ecc zlustre-OST0000_ecc_UUID 3
lctl > quit
[root@oss01 andrew]# lctl --device 5 create 1
create: 1 objects
create: #1 is object id 0x2
[root@oss01 andrew]# lctl
lctl > --threads 1 -1 5 test_brw 1024 wx q 256 1t2 p256
Print status every 1 seconds
--threads: starting 1 threads on device 5 running test_brw 1024 wx q 256 1t2 p256
Total: total 1024 threads 1 sec 1.196911 855.535625/second
lctl > --threads 1 -1 5 test_brw 1024 wx q 256 1t2 p256
Print status every 1 seconds
--threads: starting 1 threads on device 5 running test_brw 1024 wx q 256 1t2 p256
Total: total 1024 threads 1 sec 1.042479 982.273983/second
lctl > --threads 1 -1 5 test_brw 1024 rx q 256 1t2 p256
Print status every 1 seconds
--threads: starting 1 threads on device 5 running test_brw 1024 rx q 256 1t2 p256
Total: total 1024 threads 1 sec 1.110232 922.329747/second
lctl > quit
[root@oss01 andrew]# lctl --device 5 destroy 0x2 1
destroy: 1 objects
destroy: #1 is object id 0x2
[root@oss01 andrew]# lctl --device 5 create 1
create: 1 objects
error: create: #1 - No such file or directory
[root@oss01 andrew]# lctl
lctl > cfg zlustre-OST0000_ecc
lctl > cleanup
lctl > detach
lctl > quit
----------------------------------------------------------------------
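The sequence above can be sketched as a script so it is easier to replay on a freshly built file system. This is only a sketch under stated assumptions: devno_of, created_id and repro are hypothetical helper names; it assumes lctl accepts on stdin the same commands typed at its prompt above, that the create result is printed on stdout, and that a failed create gives a non-zero exit status. Nothing touches Lustre unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch only: replays the Part II steps. Names default to those in this report.
OST=${OST:-zlustre-OST0000}
ECC="${OST}_ecc"

# Device number of the named device, parsed from 'lctl dl' output on stdin.
devno_of() {
    awk -v name="$1" '$4 == name { print $1; exit }'
}

# Object id taken from a 'create: #1 is object id 0x2' line on stdin.
created_id() {
    sed -n 's/^create: #1 is object id \(0x[0-9a-fA-F]*\).*$/\1/p'
}

repro() {
    modprobe obdecho || return 1
    # Assumption: lctl reads these commands from stdin as it does at its prompt.
    printf '%s\n' "attach echo_client $ECC ${ECC}_UUID" "setup $OST" | lctl || return 1
    dev=$(lctl dl | devno_of "$ECC")
    [ -n "$dev" ] || return 1
    id=$(lctl --device "$dev" create 1 | created_id)
    [ -n "$id" ] || return 1
    # Same write, rewrite and read passes as in the transcript.
    for mode in wx wx rx; do
        echo "--threads 1 -1 $dev test_brw 1024 $mode q 256 1t2 p256" | lctl
    done
    lctl --device "$dev" destroy "$id" 1
    # In this report, this second create is the step that fails.
    lctl --device "$dev" create 1
    rc=$?
    printf '%s\n' "cfg $ECC" cleanup detach | lctl
    return "$rc"
}

if [ "${RUN:-0}" = 1 ]; then repro; fi
```

Run as root on the OSS with RUN=1 to execute it. The device number and object id are parsed from lctl output rather than hard-coded as 5 and 0x2, since they may differ on another system.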
At the point you load obdecho there is one comment on the
console. After that the console is silent until you hit the object
creation error:
<ConMan> Connection to console [oss01] opened.
Lustre: Echo OBD driver; http://www.lustre.org/
LustreError: 7139:0:(osd_handler.c:213:osd_trans_start()) zlustre-OST0000: can't assign tx: rc = -2
LustreError: 7139:0:(ofd_obd.c:1356:ofd_create()) zlustre-OST0000: unable to precreate: rc = -2
LustreError: 7139:0:(echo_client.c:2310:echo_create_object()) Cannot create objects: rc = -2
LustreError: 7139:0:(echo_client.c:2334:echo_create_object()) create object failed with: rc = -2
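The rc values in these messages are negated Linux errno numbers: -2 is ENOENT, which matches the "No such file or directory" that lctl printed, while the "File exists" seen in the survey output corresponds to EEXIST (17). A small sketch for decoding such console lines follows; errno_name and decode_console are hypothetical helper names, and the table covers only the two codes that appear in this report.

```shell
#!/bin/sh
# Map a (negative) rc to the errno name and message. Only the two codes
# seen in this report are listed.
errno_name() {
    case "${1#-}" in
        2)  echo "ENOENT (No such file or directory)" ;;
        17) echo "EEXIST (File exists)" ;;
        *)  echo "errno ${1#-} (not in this table)" ;;
    esac
}

# For each LustreError line on stdin, print the reporting function and
# the decoded rc.
decode_console() {
    sed -n 's/^LustreError: .*:\([a-z_]*\)()) .*rc = \(-[0-9]*\).*$/\1 \2/p' |
    while read -r func rc; do
        echo "$func: rc=$rc $(errno_name "$rc")"
    done
}

# Two of the console lines from above.
decode_console <<'EOF'
LustreError: 7139:0:(ofd_obd.c:1356:ofd_create()) zlustre-OST0000: unable to precreate: rc = -2
LustreError: 7139:0:(echo_client.c:2310:echo_create_object()) Cannot create objects: rc = -2
EOF
```

Read top to bottom, the decoded lines show the same ENOENT propagating from osd_trans_start() up through ofd_create() to the echo client.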
----Part II---------------------------------------------------------
If I mount the corresponding ZFS zpool as just a plain old zfs, I do see
a directory for .../O/2, but I do not know enough about the object
handling in Lustre to verify whether it "looks right".
Andrew Uselton
2013-11-15