View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0000593 | bareos-core | General | public | 2016-01-07 18:27 | 2019-12-18 15:25 |
Reporter | static_void | Assigned To | mvwieringen | ||
Priority | low | Severity | minor | Reproducibility | have not tried |
Status | closed | Resolution | fixed | ||
Platform | Linux | OS | Debian | OS Version | 7 |
Product Version | 15.2.2 | ||||
Fixed in Version | 15.2.3 | ||||
Summary | 0000593: Problem with a job on a passive client and limited Maximum Concurrent Jobs | ||||
Description | A job on a passive client apparently takes two File Daemon job slots, which I think the docs do not make clear. When all File Daemon concurrent job slots are taken, the Storage Daemon fails to connect to the passive client, reports fatal errors in the log, and eventually crashes with signal 11. My suggestions are: a) document this behaviour, for example in chapter 26.1 "Passive Clients"; b) make the Storage Daemon's reaction less drastic, as there is no need to treat this situation as a fatal error. | ||||
Steps To Reproduce | 1. Configure a client with Passive = yes (on the Director) and Maximum Concurrent Jobs = 1 (on the File Daemon). 2. Run a backup job on that client. - Observe two network connections being established with the File Daemon, one of which has packets queued (shown in netstat, rx-queue on the client). - Observe the job hanging for several minutes. - After several minutes, the Storage Daemon reports fatal errors concerning authentication and exits with signal 11. Increasing Maximum Concurrent Jobs on the client solves the problem. | ||||
Tags | No tags attached. | ||||
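The queued packets described in the steps above are consistent with a TCP connection that the kernel has already completed through the listen backlog, but that the daemon has not yet accepted. The following is a minimal Python sketch of that generic socket behaviour, not Bareos code; the function name `demo_queued_connection` and the loopback setup are assumptions made purely for illustration.

```python
import socket


def demo_queued_connection():
    """Show that data sent on a not-yet-accepted connection is queued."""
    # Listening socket on an ephemeral loopback port (stand-in for a daemon).
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.settimeout(5)
    server.bind(("127.0.0.1", 0))
    server.listen(5)
    addr = server.getsockname()
    opened = [server]
    try:
        # First connection: the server accepts it and is now "busy" with it.
        first = socket.create_connection(addr, timeout=5)
        opened.append(first)
        busy, _ = server.accept()
        opened.append(busy)

        # Second connection: the kernel completes the handshake on its own,
        # so connect() and sendall() succeed although nobody called accept().
        second = socket.create_connection(addr, timeout=5)
        opened.append(second)
        second.sendall(b"hello")

        # The bytes sit in the receive queue until the server accepts.
        late, _ = server.accept()
        opened.append(late)
        late.settimeout(5)
        return late.recv(5)
    finally:
        for sock in opened:
            sock.close()
```

Between `sendall()` and the second `accept()`, a tool such as netstat would show the unread bytes in the receive queue of the server side, which matches the symptom reported here.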
We should probably introduce a variable named MaximumConnections, like the one the Director got with the socket server refactoring. This variable would control the listen backlog and be sized to the right value based on the number of concurrent jobs allowed. M_FATAL is used for every fatal error, and it should only abort the Job. So I would be interested to see, under a debugger with debug symbols, what trap the stored runs into (probably some undetermined code path). |
|
I've added the Maximum Connections parameter to both the SD and FD. When you don't configure it, it is auto-sized based on the Maximum Concurrent Jobs setting. We put a set of autogenerated binaries under: http://download.bareos.org/bareos/people/mvw/bug593/ They are for Debian 7 for now, as that is what you said you are using. I would still be interested to know where the SD crashes, however. |
|
Thanks, going to try the new binaries some time later, maybe today. And I am also gonna try debugging with symbols. | |
bareos: master b3437eee 2016-01-11 22:37 Marco van Wieringen Ported: N/A |
socket: Introduce MaximumConnections for filed and stored. When refactoring the socket_server code for the director we added a new config option named MaximumConnections which is used to size the listen backlog. This patch introduces something similar for the filed and stored so we always size the backlog in a proper way according to the wanted concurrency. We now check in accepting a connection in the filed or stored if we already reached the wanted concurrency and then just disconnect the session. Fixes 0000593: Problem with a job on a passive client and limited Maximum Concurrent Jobs |
Affected Issues 0000593 |
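The accept-time check described in the commit message above can be sketched roughly as follows. This is a hedged Python illustration, not the actual C implementation in `socket_server.c`. The names `effective_max_connections` and `ConnectionGate` are hypothetical, and the auto-sizing rule (two connections per concurrent job, taken from the observation in this report that a passive-client job opens two connections) is an assumption, since the real formula is not stated here.

```python
from dataclasses import dataclass


def effective_max_connections(configured, max_concurrent_jobs, per_job=2):
    """Return the connection limit to apply.

    An explicitly configured value wins. Otherwise the limit is derived
    from the job concurrency. The factor `per_job=2` is an assumption for
    illustration only, not the formula Bareos uses.
    """
    if configured is not None:
        return configured
    return max_concurrent_jobs * per_job


@dataclass
class ConnectionGate:
    """Tracks active connections and rejects those over the limit."""
    max_connections: int
    active: int = 0

    def on_accept(self):
        """Return True to keep a new connection, False to disconnect it."""
        if self.active >= self.max_connections:
            return False  # caller closes the session right away
        self.active += 1
        return True

    def on_close(self):
        """Release the slot of a finished connection."""
        if self.active > 0:
            self.active -= 1
```

With Maximum Concurrent Jobs = 1 and no explicit limit, this sketch keeps two connections and drops a third immediately instead of leaving it queued until an authentication timeout.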
|
mod - src/filed/dir_cmd.c
mod - src/filed/filed.c
mod - src/filed/filed_conf.c
mod - src/filed/filed_conf.h
mod - src/filed/socket_server.c
mod - src/stored/dir_cmd.c
mod - src/stored/ndmp_tape.c
mod - src/stored/socket_server.c
mod - src/stored/stored.c
mod - src/stored/stored_conf.c
mod - src/stored/stored_conf.h
Date Modified | Username | Field | Change |
---|---|---|---|
2016-01-07 18:27 | static_void | New Issue | |
2016-01-11 14:49 | mvwieringen | Assigned To | => mvwieringen |
2016-01-11 14:49 | mvwieringen | Status | new => acknowledged |
2016-01-11 21:08 | mvwieringen | Note Added: 0002091 | |
2016-01-11 21:08 | mvwieringen | Status | acknowledged => feedback |
2016-01-13 11:13 | mvwieringen | Note Added: 0002096 | |
2016-01-13 11:28 | static_void | Note Added: 0002097 | |
2016-01-13 11:28 | static_void | Status | feedback => assigned |
2016-01-15 16:45 | mvwieringen | Changeset attached | => bareos master b3437eee |
2016-01-15 16:46 | mvwieringen | Status | assigned => resolved |
2016-01-15 16:46 | mvwieringen | Fixed in Version | => 15.2.3 |
2016-01-15 16:46 | mvwieringen | Resolution | open => fixed |
2016-02-25 17:01 | maik | Relationship added | child of 0000625 |
2019-12-18 15:25 | arogge | Status | resolved => closed |