False timeouts in authorization
|Assignee:||Ruben S. Montero||% Done:|
|Category:||Core & System|
|Target version:||Release 3.0|
This problem has been seen when using the authorization plugin framework.
In the ActionManager loop, the thread handling authorizations waits in the function pthread_cond_timedwait. It can happen that the expiration is reached just as the requesting thread (UserPool) is in the trigger function. Normally the ActionManager thread would check whether an actual timeout occurred by checking the timeout value of the request in AuthManager::timer_action. However, in this scenario the timeout value has not yet been set. This is because in the present code the value is set by UserPool in the wait function, which is called after the trigger function returns. Since the timeout value is zero, the code handles it as an expired request.
I have observed that this happens in about one out of every 3000 authorizations.
A fix that works is to set the timeout value just before calling the trigger function. A patch against the current master branch is attached.
I noticed one other thing in the code that I am worried about, though I have not seen effects from it in authorization. When the ActionManager thread is released from pthread_cond_timedwait by a signal from pthread_cond_signal in the trigger function, according to the man page the ActionManager thread now has the lock. As far as I understand, it takes the lock from the thread executing the trigger function. In the code, that thread then calls unlock(). Though in the code it appears that it unlocks the mutex of the triggering thread, depending on the timing it may actually unlocking it from the ActionManager thread. At that point, neither thread has the lock. Then the "actions" queue may be accessed by two threads simultaneously.
#2 Updated by Ruben S. Montero almost 8 years ago
- Category set to Core & System
- Status changed from New to Closed
- Assignee set to Ruben S. Montero
- Resolution set to fixed
Thanks for this one it's not an easy one ;) As explained in your issue the problem is that the auth gets in the auth vector without a timeout. I wanted a solution for any class using the Auth interface so there are two options:
- init the timeout before it enters the auth list
- Do not identify the request as timeout if there is no timeout set (it is set before wait)
I really do not see any clear winner between these two. I've implemented the second.
About the other issue threads monitor are using different mutexes (the lock() and unlock() acts on different threads), so I think we are safe.