bareos: master a79789a4
Author | Committer | Branch | Timestamp | Parent | |
---|---|---|---|---|---|
Sebastian Sura | Bareos Bot | master | 2024-01-23 10:32 | master cd082f1b | Pending |
Changeset | connection-pool: fix data race |

Some operations were improperly synchronized. Take cleanup() for example:

```
 |for (i = connections_->size() - 1; i >= 0; i--) {
1|  connection = connections_->get(i);
 |  Dmsg2(800, "checking connection %s (%d)\n", connection->name(), i);
2|  if (!connection->check()) {
 |    Dmsg2(120, "connection %s (%d) is terminated => removed\n",
 |          connection->name(), i);
 |    connections_->remove(i);
4|    delete (connection);
 |  }
 |}
```

We don't lock connections_ or connection in any way here. This means that not only could we get a NULL returned at (1), we also have to account for the fact that connection could be deleted from under us by a different thread at any moment, even if we are currently holding its lock. This happens if two threads call cleanup() at the same time and one is at (2) while the other one is at (4).

Similarly, the check() function just calls WaitDataIntr() on the socket without ensuring exclusive access (for example by locking the connection). WaitDataIntr() is not a const function, so it is not safe to call without exclusive access. It might look safe since the function just waits, but it can in fact write to some internal data (e.g. b_errno in case of an error), which can definitely cause problems.

Connection::in_use is also very misleading. While it does not suffer from the data race problem itself (it is an atomic value), its interpretation does: if you read false from it, you do not actually know whether some thread is using the connection (and has yet to update the bool) or whether the connection is actually unused.

All these problems and some more led to the decision to rewrite this code completely. The basic idea is that the connection pool is now simply a vector of connections protected by one lock. The connections themselves do not have a lock; the lock is owned by the vector. The only way to interact with the connections inside the pool is by locking the whole vector. This eliminates all the problems above.

The connections themselves are now also an RAII type. They own the socket they hold, which means they take care of closing/destroying the socket once they leave scope (similar to a unique pointer).
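The design described above can be illustrated with a minimal sketch. This is not the Bareos implementation: `FakeSocket`, `SocketCloser`, `Take()`, `Size()` and `RunPoolDemo()` are hypothetical names chosen for illustration, and the real code wraps an actual socket type. The sketch only shows the two ideas from the commit message: one mutex guarding the whole vector, and a move-only connection that closes its socket exactly once when it is destroyed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a socket (assumption, not a Bareos type).
struct FakeSocket {
  bool alive;
  int* close_count;  // incremented once when the socket is closed
};

// Deleter that "closes" the socket before freeing it.
struct SocketCloser {
  void operator()(FakeSocket* s) const {
    if (s->close_count) { ++*s->close_count; }
    delete s;
  }
};

// RAII connection: move-only, owns its socket like a unique pointer would.
class Connection {
 public:
  Connection(std::string name, bool alive, int* close_count)
      : name_(std::move(name)), sock_(new FakeSocket{alive, close_count}) {}

  bool Check() const { return sock_ && sock_->alive; }
  const std::string& name() const { return name_; }

 private:
  std::string name_;
  std::unique_ptr<FakeSocket, SocketCloser> sock_;
};

// The pool is a vector guarded by a single mutex; connections have no lock.
class ConnectionPool {
 public:
  void Add(Connection c) {
    std::lock_guard<std::mutex> guard(mutex_);
    connections_.push_back(std::move(c));
  }

  // Drops dead connections; the lock is held for the entire scan, so no
  // other thread can observe or delete an entry in the middle of it.
  std::size_t Cleanup() {
    std::lock_guard<std::mutex> guard(mutex_);
    auto first_dead =
        std::remove_if(connections_.begin(), connections_.end(),
                       [](const Connection& c) { return !c.Check(); });
    std::size_t removed =
        static_cast<std::size_t>(connections_.end() - first_dead);
    connections_.erase(first_dead, connections_.end());  // closes sockets
    return removed;
  }

  // Moves a connection out of the pool; the caller becomes its sole owner,
  // so no separate in_use flag is needed. Returns nullptr if not found.
  std::unique_ptr<Connection> Take(const std::string& name) {
    std::lock_guard<std::mutex> guard(mutex_);
    for (auto it = connections_.begin(); it != connections_.end(); ++it) {
      if (it->name() == name) {
        std::unique_ptr<Connection> taken(new Connection(std::move(*it)));
        connections_.erase(it);
        return taken;
      }
    }
    return nullptr;
  }

  std::size_t Size() {
    std::lock_guard<std::mutex> guard(mutex_);
    return connections_.size();
  }

 private:
  std::mutex mutex_;
  std::vector<Connection> connections_;
};

// Small demo: returns how many sockets were closed in total.
int RunPoolDemo() {
  int closed = 0;
  {
    ConnectionPool pool;
    pool.Add(Connection("client-a", true, &closed));
    pool.Add(Connection("client-b", false, &closed));
    std::size_t removed = pool.Cleanup();
    assert(removed == 1 && closed == 1);  // only the dead socket was closed
  }  // pool leaves scope: the remaining connection closes its socket
  return closed;
}
```

Because every access goes through the pool's mutex, two threads calling `Cleanup()` concurrently are serialized, and a connection handed out by `Take()` is no longer reachable through the pool at all. Whether the real pool hands connections out by moving them is an assumption of this sketch.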
mod - core/src/dird/fd_cmds.cc |
mod - core/src/dird/fd_cmds.h |
mod - core/src/dird/job.cc |
mod - core/src/dird/socket_server.cc |
mod - core/src/dird/ua_status.cc |
mod - core/src/lib/connection_pool.cc |
mod - core/src/lib/connection_pool.h |