Bug #695: False timeouts in authorization - OpenNebula - OpenNebula Development pages

Bug #695

False timeouts in authorization

Added by Ted Hesselroth about 10 years ago. Updated almost 10 years ago.

Status:

Closed

Start date:

06/23/2011

Priority:

Normal

Due date:

Assignee:

Ruben S. Montero

% Done:

Category:

Core & System

Target version:

Release 3.0

Resolution:

fixed

Pull request:

Affected Versions:

Description

This problem has been seen when using the authorization plugin framework.

In the ActionManager loop, the thread handling authorizations waits in the function pthread_cond_timedwait. It can happen that the expiration is reached just as the requesting thread (UserPool) is in the trigger function. Normally the ActionManager thread would check whether an actual timeout occurred by checking the timeout value of the request in AuthManager::timer_action. However, in this scenario the timeout value has not yet been set. This is because in the present code the value is set by UserPool in the wait function, which is called after the trigger function returns. Since the timeout value is zero, the code handles it as an expired request.

I have observed that this happens in about one out of every 3000 authorizations.

A fix that works is to set the timeout value just before calling the trigger function. A patch against the current master branch is attached.

I noticed one other thing in the code that I am worried about, though I have not seen effects from it in authorization. When the ActionManager thread is released from pthread_cond_timedwait by a signal from pthread_cond_signal in the trigger function, according to the man page the ActionManager thread now has the lock. As far as I understand, it takes the lock from the thread executing the trigger function. In the code, that thread then calls unlock(). Though in the code it appears that it unlocks the mutex of the triggering thread, depending on the timing it may actually unlocking it from the ActionManager thread. At that point, neither thread has the lock. Then the "actions" queue may be accessed by two threads simultaneously.

0001-Avoid-timeout-set-to-zero-in-authorization.patch (1.49 KB) Ted Hesselroth, 06/23/2011 06:34 PM

Associated revisions

Revision f9a4eaf5
Added by Ruben S. Montero almost 10 years ago

bug #695: False timeouts are now checked within the Authmanager

Revision 0364fd5f
Added by Ruben S. Montero almost 10 years ago

bug #695: False timeouts are now checked within the Authmanager
(cherry picked from commit f9a4eaf55a211d6f057d71ee1424f77c7a57ca39)

History

#1 Updated by Ruben S. Montero almost 10 years ago

Target version set to Release 3.0

#2 Updated by Ruben S. Montero almost 10 years ago

Category set to Core & System
Status changed from New to Closed
Assignee set to Ruben S. Montero
Resolution set to fixed

Hi Ted,

Thanks for this one it's not an easy one ;) As explained in your issue the problem is that the auth gets in the auth vector without a timeout. I wanted a solution for any class using the Auth interface so there are two options:

init the timeout before it enters the auth list
Do not identify the request as timeout if there is no timeout set (it is set before wait)

I really do not see any clear winner between these two. I've implemented the second.

About the other issue threads monitor are using different mutexes (the lock() and unlock() acts on different threads), so I think we are safe.

Also available in: Atom PDF

OpenNebula

Issues