View Issue Details

ID: 0000593
Project: bareos-core
Category: General
View Status: public
Last Update: 2019-12-18 15:25
Reporter: static_void
Assigned To: mvwieringen
Priority: low
Severity: minor
Reproducibility: have not tried
Status: closed
Resolution: fixed
Platform: Linux
OS: Debian
OS Version: 7
Product Version: 15.2.2
Fixed in Version: 15.2.3
Summary: 0000593: Problem with a job on a passive client and limited Maximum Concurrent Jobs
Description:
Apparently, a job on a passive client takes 2 File Daemon job slots, which I think is not quite reflected in the docs.
The Storage Daemon also eventually crashes with signal 11 (after reporting fatal errors in the log) when it tries to connect to a passive client whose File Daemon concurrent job slots are all taken.

My suggestions are:
a) reflect that point in the docs, for example in chapter 26.1 "Passive Clients"
b) Storage Daemon reaction is too drastic, no need to call this situation a fatal error
Steps To Reproduce:
1. Configure a client: Passive = yes (on the Director) and Maximum Concurrent Jobs = 1 (on the File Daemon)
2. Run a backup job on that client
 - observe two network connections being established with the File Daemon, one of which has packets queued (shown in netstat, rx-queue on client)
 - observe the job hanging for several minutes
 - after several minutes, the Storage Daemon reports fatal errors concerning authentication and exits with signal 11

Increasing Maximum Concurrent Jobs on client solves the problem.
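
The slot arithmetic behind these steps can be illustrated with a small model. This is only a sketch in Python, not Bareos code. It assumes, as the description suggests, that for a passive client both the Director and the Storage Daemon open a connection to the File Daemon and that each connection counts against Maximum Concurrent Jobs. The names `FileDaemonModel` and `run_passive_job` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class FileDaemonModel:
    """Toy model of a File Daemon's job slots (hypothetical, not Bareos code)."""
    max_concurrent_jobs: int
    active: list = field(default_factory=list)   # connections holding a slot
    queued: list = field(default_factory=list)   # connections left waiting

    def connect(self, peer: str) -> bool:
        """Return True if the peer got a slot, False if it is left waiting."""
        if len(self.active) < self.max_concurrent_jobs:
            self.active.append(peer)
            return True
        self.queued.append(peer)
        return False


def run_passive_job(fd: FileDaemonModel) -> bool:
    """Assumption: a passive-client job needs a Director and an SD connection."""
    got_director = fd.connect("director")
    got_storage = fd.connect("storage daemon")
    return got_director and got_storage
```

Under this assumption, `run_passive_job(FileDaemonModel(max_concurrent_jobs=1))` leaves the Storage Daemon connection queued, which matches the hanging job observed above, while a limit of 2 lets both connections through.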
Tags: No tags attached.

Relationships

child of 0000625 closed maik Release bareos-15.2.3

Activities

mvwieringen

2016-01-11 21:08

developer   ~0002091

We should probably introduce a variable named MaximumConnections, just
like the one the Director got in the socket server refactoring.
This variable then controls the listen backlog and is sized to the right
value based on the number of concurrent jobs allowed.
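
As a rough illustration of that idea, the sketch below derives a connection limit from the concurrent-job setting when none is configured and passes it to `listen()` as the backlog. It is written in Python rather than in the daemons' own code, and the sizing rule (two connections per job plus a small reserve) is an assumption made for this example, not the formula Bareos uses. Both function names are hypothetical.

```python
import socket
from typing import Optional


def effective_max_connections(max_concurrent_jobs: int,
                              configured: Optional[int] = None) -> int:
    """Return the configured limit, or derive one from the job limit.

    The derived value (2 connections per job + 2 spare) is a hypothetical
    rule chosen for this example only.
    """
    if configured is not None and configured > 0:
        return configured
    return 2 * max_concurrent_jobs + 2


def open_listener(port: int, max_connections: int) -> socket.socket:
    """Create a TCP listener on localhost whose backlog is max_connections."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("127.0.0.1", port))
    srv.listen(max_connections)
    return srv
```

An explicitly configured value always wins here; the derived value only applies when the setting is left out.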

Every fatal error is reported with M_FATAL, which should only abort the Job.
So I would be interested to see, under a debugger with debug symbols,
what trap the stored runs into (probably some undetermined code path).
mvwieringen

2016-01-13 11:13

developer   ~0002096

I've added the Maximum Connections parameter to both the SD and FD.
When you don't configure it, it is auto-sized based on the
setting of Maximum Concurrent Jobs. We put a set of autogenerated
binaries under:

http://download.bareos.org/bareos/people/mvw/bug593/

For Debian7 for now as that is what you said you are using.
I would still be interested to know where the SD crashes however.
static_void

2016-01-13 11:28

reporter   ~0002097

Thanks, going to try the new binaries some time later, maybe today. And I am also gonna try debugging with symbols.

Related Changesets

bareos: master b3437eee

2016-01-11 22:37

Marco van Wieringen

Ported: N/A

socket: Introduce MaximumConnections for filed and stored.

When refactoring the socket_server code for the director we added a new
config option named MaximumConnections which is used to size the listen
backlog. This patch introduces something similar for the filed and
stored so we always size the backlog in a proper way according to the
wanted concurrency. When accepting a connection in the filed or stored,
we now check whether we already reached the wanted concurrency and, if
so, just disconnect the session.

Fixes 0000593: Problem with a job on a passive client and limited Maximum
Concurrent Jobs
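
The accept-time check described in this commit message could look roughly like the following. This is a Python sketch of the general technique, not the code from the changeset; `ConnectionGate` and its methods are hypothetical names.

```python
import socket
import threading


class ConnectionGate:
    """Admit connections up to a fixed limit; close any beyond it."""

    def __init__(self, max_connections: int) -> None:
        self.max_connections = max_connections
        self._active = 0
        self._lock = threading.Lock()

    def admit(self, conn: socket.socket) -> bool:
        """Count the connection if there is room, otherwise disconnect it."""
        with self._lock:
            if self._active < self.max_connections:
                self._active += 1
                return True
        conn.close()  # limit reached: drop the session right away
        return False

    def release(self) -> None:
        """Call when an admitted session ends to free its place."""
        with self._lock:
            if self._active > 0:
                self._active -= 1
```

In an accept loop, `admit()` would be called on each socket returned by `accept()`, and `release()` once the admitted session has finished. Closing the extra connection immediately means the peer sees a disconnect instead of waiting in the listen queue.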
Affected Issues
0000593
mod - src/filed/dir_cmd.c
mod - src/filed/filed.c
mod - src/filed/filed_conf.c
mod - src/filed/filed_conf.h
mod - src/filed/socket_server.c
mod - src/stored/dir_cmd.c
mod - src/stored/ndmp_tape.c
mod - src/stored/socket_server.c
mod - src/stored/stored.c
mod - src/stored/stored_conf.c
mod - src/stored/stored_conf.h

Issue History

Date Modified Username Field Change
2016-01-07 18:27 static_void New Issue
2016-01-11 14:49 mvwieringen Assigned To => mvwieringen
2016-01-11 14:49 mvwieringen Status new => acknowledged
2016-01-11 21:08 mvwieringen Note Added: 0002091
2016-01-11 21:08 mvwieringen Status acknowledged => feedback
2016-01-13 11:13 mvwieringen Note Added: 0002096
2016-01-13 11:28 static_void Note Added: 0002097
2016-01-13 11:28 static_void Status feedback => assigned
2016-01-15 16:45 mvwieringen Changeset attached => bareos master b3437eee
2016-01-15 16:46 mvwieringen Status assigned => resolved
2016-01-15 16:46 mvwieringen Fixed in Version => 15.2.3
2016-01-15 16:46 mvwieringen Resolution open => fixed
2016-02-25 17:01 maik Relationship added child of 0000625
2019-12-18 15:25 arogge Status resolved => closed