[LU-7371]  Wrong read length over isize Created: 02/Nov/15  Updated: 03/Oct/19  Resolved: 02/Dec/15

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.8.0
Fix Version/s: Lustre 2.8.0

Type: Bug Priority: Critical
Reporter: Li Xi (Inactive) Assignee: Alex Zhuravlev
Resolution: Fixed Votes: 0
Labels: patch

Attachments: Text File lustre.log    
Issue Links:
Related
is related to LU-12275 Client-side file data encryption Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

When running on osd-ldiskfs, if the isize is equal to 4095, a read length of 4096 will be returned because of a wrong calculation of EOF.
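
To illustrate the arithmetic involved, here is a minimal sketch in C of how a read length can be clamped to the inode size, next to an off-by-one variant that reproduces the reported numbers. Both function names and the shape of the faulty comparison are assumptions made for illustration; this is not the osd-ldiskfs code and not the patch itself.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the Lustre source: return how many
 * bytes of a read of `len` bytes at `offset` lie inside a file of size
 * `isize`. Assumes len > 0.
 */
uint64_t clamp_read_len(uint64_t isize, uint64_t offset, uint64_t len)
{
	if (offset >= isize)
		return 0;               /* read starts at or past EOF */
	if (len > isize - offset)
		return isize - offset;  /* read crosses EOF: short read */
	return len;                     /* read lies entirely inside the file */
}

/*
 * Illustrative off-by-one variant (an assumption about how such a bug can
 * look, not the pre-patch code): comparing isize against the index of the
 * last requested byte with a strict `<` misses the case where the file
 * ends exactly one byte short of the requested range.
 */
uint64_t clamp_read_len_off_by_one(uint64_t isize, uint64_t offset, uint64_t len)
{
	if (offset >= isize)
		return 0;
	if (isize < offset + len - 1)
		return isize - offset;
	return len;                     /* wrongly reached for isize == offset + len - 1 */
}
```

With isize = 4095, offset = 0 and len = 4096, the first function yields 4095, while the off-by-one variant yields 4096, which matches the symptom described above.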



 Comments   
Comment by Gerrit Updater [ 02/Nov/15 ]

Li Xi (lixi@ddn.com) uploaded a new patch: http://review.whamcloud.com/17020
Subject: LU-7371 osd-ldiskfs: fix wrong read length over isize
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 669fc00eb6ad6a6e9c94e3814326c50c65a7e3af

Comment by Joseph Gmitter (Inactive) [ 02/Nov/15 ]

Hi Alex,
Can you take a look at this issue?
Thanks.
Joe

Comment by Li Xi (Inactive) [ 04/Nov/15 ]

I tried to write a regression test. However, the Lustre client has its own way to determine the file size on the client side, so it seems hard to reproduce the issue from the client side.

I collected the following messages while running the commands below on Lustre without the patch (the Lustre client and all servers run on the same machine):

dd if=/dev/zero of=file bs=4095 count=1
sync
echo 3 > /proc/sys/vm/drop_caches
dd if=file of=/dev/null bs=1048576

[root@server1 lustre]# grep tgt_brw_read /tmp/lustre.log | grep leaving
00000020:00000001:0.0:1446642073.288612:0:10887:0:(tgt_handler.c:1915:tgt_brw_read()) Process leaving (rc=4096 : 4096 : 1000)
[root@server1 lustre]# grep 4095 /tmp/lustre.log
00020000:00000001:0.0:1446642073.282732:0:12771:0:(lov_offset.c:68:lov_stripe_size()) Process leaving (rc=4095 : 4095 : fff)
00020000:00000001:0.0:1446642073.282733:0:12771:0:(lov_offset.c:68:lov_stripe_size()) Process leaving (rc=4095 : 4095 : fff)
00000080:00000001:1.0:1446642073.288760:0:12771:0:(file.c:1302:ll_file_aio_read()) Process leaving (rc=4095 : 4095 : fff)
00000080:00000001:1.0:1446642073.288762:0:12771:0:(file.c:1332:ll_file_read()) Process leaving (rc=4095 : 4095 : fff)
00020000:00000001:1.0:1446642073.288856:0:12771:0:(lov_offset.c:68:lov_stripe_size()) Process leaving (rc=4095 : 4095 : fff)
00020000:00000001:1.0:1446642073.288857:0:12771:0:(lov_offset.c:68:lov_stripe_size()) Process leaving (rc=4095 : 4095 : fff)

Comment by Andreas Dilger [ 05/Nov/15 ]

I had originally thought dd if=/dev/zero of=$DIR/$tfile bs=4095 count=1 conv=sync would be enough to create the file at 4095 bytes, and then a read with bs=4096 bytes would trigger the bug. If that doesn't work, then another option is to add an OBD_FAIL_OST_* check in the code to reproduce the original symptom.

Comment by Gerrit Updater [ 06/Nov/15 ]

Li Xi (lixi@ddn.com) uploaded a new patch: http://review.whamcloud.com/17060
Subject: LU-7371 test: wrong read length over isize
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: bc9893da37ae60918b61e2d5fd84b4cb87ad3b82

Comment by Gerrit Updater [ 11/Nov/15 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/17020/
Subject: LU-7371 osd-ldiskfs: fix wrong read length over isize
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 97c4c162be77d2ee9bad5d800c9b5803f252caa0

Comment by Gerrit Updater [ 02/Dec/15 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/17060/
Subject: LU-7371 test: wrong read length over isize
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 7023698133970372031a16beac276e5e3e64cfbe

Comment by Joseph Gmitter (Inactive) [ 02/Dec/15 ]

Both patches have landed for 2.8.0

Generated at Sat Feb 10 02:08:19 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.