Details
-
Bug
-
Resolution: Cannot Reproduce
-
Major
-
None
-
Lustre 2.4.0, Lustre 1.8.x (1.8.0 - 1.8.5)
-
FEFS, based on Lustre-1.8.5
MDSx1, OSSx1(OSTx3), Clientx1
-
3
-
5516
Description
I have often observed that raising and lowering obd_refcount a couple of times in quick succession tends to cause two problems in class_decref() in Lustre-1.8.x. I looked into this for a while, and I now believe I have found the cause and that both problems can also occur in Lustre-2.3.x. I would like to report both problems here.
The first is that when two different threads call class_decref() at almost the same time, one thread can free the obd_device while the other is still processing class_decref()'s obd_refcount==1 path on the same obd_device.
The second is that if thread A increments and then decrements obd_refcount shortly after thread B has decremented it to 1, the refcount returns to 1 and thread A is allowed to run the obd_refcount==1 operations a second time. This case results in an LBUG in obd_precleanup() or class_unlink_export() in Lustre-1.8.x.
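To make the second problem easier to follow, here is a minimal single-threaded sketch that replays the interleaving step by step. It is not Lustre code: struct toy_obd, toy_incref(), toy_decref() and the cleanup_started flag are hypothetical names invented for illustration, and the "cleanup" is just a counter standing in for the obd_refcount==1 operations. The guarded variant shows one possible mitigation (a run-once flag), offered as an assumption rather than as the actual fix.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for obd_device (not the Lustre struct). */
struct toy_obd {
    int  refcount;         /* models obd_refcount */
    bool cleanup_started;  /* assumed run-once flag, used only when guarded */
    int  cleanup_runs;     /* how often the "refcount == 1" path executed */
};

static void toy_incref(struct toy_obd *obd)
{
    obd->refcount++;
}

/* Decrement, then act on the resulting value, as the report describes. */
static void toy_decref(struct toy_obd *obd, bool guard)
{
    assert(obd->refcount > 0);  /* catch refcount underflow in the sketch */
    obd->refcount--;
    if (obd->refcount != 1)
        return;
    if (guard) {
        /* In real code this test-and-set would have to happen under the
         * same lock that protects the refcount. */
        if (obd->cleanup_started)
            return;
        obd->cleanup_started = true;
    }
    obd->cleanup_runs++;        /* stands in for the refcount == 1 operations */
}

/* Replays: thread B decref (2 -> 1), thread A incref (1 -> 2), decref (2 -> 1). */
static int toy_replay(bool guard)
{
    struct toy_obd obd = { .refcount = 2 };

    toy_decref(&obd, guard);    /* thread B reaches 1, cleanup path runs */
    toy_incref(&obd);           /* thread A takes a short-lived reference */
    toy_decref(&obd, guard);    /* thread A drops it, refcount is 1 again */
    return obd.cleanup_runs;
}

int toy_replay_unguarded(void) { return toy_replay(false); }
int toy_replay_guarded(void)   { return toy_replay(true); }
```

In this model toy_replay_unguarded() counts two runs of the cleanup path, which corresponds to the repeated obd_refcount==1 processing described above, while toy_replay_guarded() counts one.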