[LU-13850] File in readwrite state and "No data available" error after file was added to cache on one node. Created: 03/Aug/20  Updated: 04/Aug/20  Resolved: 04/Aug/20

Status: Closed
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.14.0
Fix Version/s: None

Type: Bug
Priority: Major
Reporter: Vikentsi Lapa
Assignee: Qian Yingjin
Resolution: Fixed
Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

A file was created on one node, and a rule to add the file to the cache was created on that same node.

lctl pcc add /mnt/lustre /mnt/pccro --param "fname={*.doc} roid=7 ropcc=1"

pdsh -w trevis-59vm1[0,1,2] lfs pcc state /mnt/lustre/myfile.doc
trevis-59vm12: file: /mnt/lustre/myfile.doc, type: readwrite, PCC file: /0007/0000/0402/0000/0002/0000/0x200000402:0x7:0x0, user number: 0, flags: 0
trevis-59vm10: file: /mnt/lustre/myfile.doc, type: none
trevis-59vm11: file: /mnt/lustre/myfile.doc, type: none

On an attempt to read the file from the other nodes, I see the following error, and the file type is reported as readwrite (see the state output above). The expected type is readonly.

pdsh -w trevis-59vm1[0,1,2] cat /mnt/lustre/myfile.doc
trevis-59vm11: cat: /mnt/lustre/myfile.doc: No data available
trevis-59vm12: filedata
trevis-59vm10: cat: /mnt/lustre/myfile.doc: No data available

The last time I checked this, it worked successfully and the file showed the expected status.

# pdsh -w trevis-59vm1[0,1,2]  /usr/sbin/lctl pcc list /mnt/lustre  | dshbak
----------------
trevis-59vm10
----------------
pcc:
  -
    pccpath: /mnt/pccro
    hsmtool: posix
    rwid: 0
    roid: 7
    flags: 2e
    autocache: fname={*.doc}
----------------
trevis-59vm11
----------------
pcc:
  -
    pccpath: /mnt/pccro
    hsmtool: posix
    rwid: 0
    roid: 7
    flags: 2e
    autocache: fname={*.doc}
----------------
trevis-59vm12
----------------
pcc:
  -
    pccpath: /mnt/pccro
    hsmtool: posix
    rwid: 0
    roid: 7
    flags: 2e
    autocache: fname={*.doc}

 Comments   
Comment by James Nunez (Inactive) [ 03/Aug/20 ]

Yingjin,

Would you please review this issue and comment?

Comment by Qian Yingjin [ 04/Aug/20 ]

Could you please provide the full command lines that caused this error?

pdsh -w trevis-59vm1[0,1,2] lfs pcc state /mnt/lustre/myfile.doc
trevis-59vm12: file: /mnt/lustre/myfile.doc, type: readwrite, PCC file: /0007/0000/0402/0000/0002/0000/0x200000402:0x7:0x0, user number: 0, flags: 0

How did the file get into the PCC-RW (readwrite) state?

Thanks,

Qian

Comment by Qian Yingjin [ 04/Aug/20 ]

I reproduced the failure on my local machine.

Will update the patch soon.

Thanks,

Qian

Comment by Qian Yingjin [ 04/Aug/20 ]

This bug is fixed in https://review.whamcloud.com/#/c/38346/
LU-10918 pcc: auto RO-PCC caching when O_RDONLY open files

Please try the topmost patch:
https://review.whamcloud.com/#/c/38352
LU-12373 pcc: uncache the pcc copies when remove a PCC backend

Comment by Vikentsi Lapa [ 04/Aug/20 ]

Thank you, I started to test it.

Comment by Vikentsi Lapa [ 04/Aug/20 ]

Confirming that the bug is fixed. Verification was done with version lustre-2.13.55_3_g066eec1-1.el7.x86_64.

[root@trevis-59vm9 tests]# pdsh -w trevis-59vm[10-12]  bash /mnt/lustre/check_ro_cache.sh  | dshbak
----------------
trevis-59vm10
----------------
pcc:
  -
    pccpath: /mnt/pccro
    hsmtool: posix
    rwid: 0
    roid: 7
    flags: 2e
    autocache: fname={*.doc}
----------------
trevis-59vm11
----------------
pcc:
  -
    pccpath: /mnt/pccro
    hsmtool: posix
    rwid: 0
    roid: 7
    flags: 2e
    autocache: fname={*.doc}
----------------
trevis-59vm12
----------------
pcc:
  -
    pccpath: /mnt/pccro
    hsmtool: posix
    rwid: 0
    roid: 7
    flags: 2e
    autocache: fname={*.doc}
[root@trevis-59vm9 tests]# echo "YYYYY" > /mnt/lustre/fileabcd.doc
[root@trevis-59vm9 tests]#  pdsh -w trevis-59vm[10-12] lfs pcc state /mnt/lustre/fileabcd.doc
trevis-59vm12: file: /mnt/lustre/fileabcd.doc, type: readonly, PCC file: /002e/0000/0401/0000/0002/0000/0x200000401:0x2e:0x0, user number: 0, flags: 0
trevis-59vm10: file: /mnt/lustre/fileabcd.doc, type: readonly, PCC file: /002e/0000/0401/0000/0002/0000/0x200000401:0x2e:0x0, user number: 0, flags: 0
trevis-59vm11: file: /mnt/lustre/fileabcd.doc, type: readonly, PCC file: /002e/0000/0401/0000/0002/0000/0x200000401:0x2e:0x0, user number: 0, flags: 0
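
The per-node comparison above can also be scripted. The following is only a minimal sketch and is not the contents of check_ro_cache.sh, which is not shown in this ticket. The function name check_pcc_type is hypothetical, and the parsing assumes the "node: file: path, type: TYPE" line format that pdsh and lfs pcc state produce in this ticket.

```shell
#!/bin/sh
# Hypothetical helper (not part of Lustre): reads the output of
#   pdsh -w <nodes> lfs pcc state <file>
# on stdin and reports every node whose PCC type differs from the
# expected one. Assumes lines shaped like the ones in this ticket:
#   <node>: file: <path>, type: <type>[, PCC file: ..., ...]
check_pcc_type() {
    expected="${1:-readonly}"
    awk -v want="$expected" '
        NF == 0 { next }                      # skip blank lines
        {
            node = $1
            sub(/:$/, "", node)               # drop the colon pdsh appends
            type = "unknown"
            if (match($0, /, type: [a-z]+/))  # ", type: " is 8 characters
                type = substr($0, RSTART + 8, RLENGTH - 8)
            if (type != want) {
                printf "%s: type=%s (expected %s)\n", node, type, want
                bad = 1
            }
        }
        END { exit bad + 0 }                  # non-zero if any node mismatched
    '
}
```

It could be used as `pdsh -w trevis-59vm[10-12] lfs pcc state /mnt/lustre/fileabcd.doc | check_pcc_type readonly`. With the state output from the original report, it would list trevis-59vm12 (readwrite) and the two nodes reporting none, and return a non-zero status.
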
Comment by Vikentsi Lapa [ 04/Aug/20 ]

Verified with lustre-client-2.13.55_3_g066eec1-1.el7.x86_64

Generated at Sat Feb 10 03:04:45 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.