View Issue Details

ID: 0000259
Project: bareos-core
Category: General
View Status: public
Last Update: 2014-01-23 09:20
Reporter: TomWork
Assigned To: (none)
Priority: normal
Severity: minor
Reproducibility: sometimes
Status: closed
Resolution: no change required
Platform: Linux
OS: CentOS
OS Version: 6
Product Version: 13.2.2
Summary: 0000259: bconsole memory usage and slowness
Description

*Short version*
When we execute many bconsole processes in parallel, from time to time they become very slow and start to chew up a lot of memory. We only execute bconsole commands like the following: sh -c echo 'list jobname=$FQDN' | /usr/sbin/bconsole

*Long version*
- Context:
We use nagios. NRPE is a nagios client that executes bconsole to get the status of completed jobs for each monitored host. Nagios checks each of the hosts/jobs we have (173 IIRC) every 5 min, and nrpe forks a checker script for each host. We use check_bacula_lastbackup.pl v1.0, which mainly executes: sh -c echo 'list jobname=$FQDN' | /usr/sbin/bconsole (the effective pipeline is reconstructed below). Lately I added 2 more nodes to back up in Bareos. I don't know if it's related, but it looks like it was the straw that broke the camel's back. My colleagues had the issue once or twice a few months ago, but it resolved itself and no effort was spent finding out what was going on; killall -9 bconsole was the answer. Now it has been raised to me ;)
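
The quoting above is as ps showed it; the effective pipeline each check runs boils down to something like this, where $FQDN is the monitored host's job name:

echo "list jobname=$FQDN" | /usr/sbin/bconsole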

- Analysis
If you stop nrpe and kill all remaining check_bacula and associated bconsole processes, the box recovers after a while. The load obviously decreases, but mainly bconsole works again; otherwise it hangs at the prompt. Indeed it does NOT display the '*', therefore you cannot execute anything. It works again after a while. If you wait a certain amount of time, it becomes quicker and quicker to get the '*' and to work from there. This makes me think it is related to a timeout or to socket overuse (bconsole-to-director sockets in TIME_WAIT). But I don't know where the bottleneck is or what is causing it, except the fact that we have too many bconsole processes. BTW, I agree that this monitoring design is terrible and I am happy to move to something else; if you have any recommendations feel free to share. But in the meantime, I think there is still a bug to solve regarding the crazy memory usage. Back to the issue: at first I thought it could be a postgresql issue because it was still running with the default el6 setup (32MB of shared memory). I increased the parameters to something bigger (see additional info) and even reindexed all tables. No go, same problem. A select * from pg_stat_activity did not display any running queries, nothing. So I believe the problem is between bconsole and the director, but I cannot confirm.

- Questions

1. Could the slowness be related to the number of concurrent connections?
2. Could the slowness be related to the postgresql backend?
3. Why would it use so much memory?
4. Why does bconsole take a long time to (progressively) recover once the flood is over?

- TODO
I am happy to follow any directions to help debug the issue, because I just don't know where to start.
If useful, next time I can run strace -tt and save the netstat output (illustrative commands below).
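
For example (<bconsole_pid> is a placeholder; 9101 is the default director port):

# follow one hanging bconsole with timestamps, across all threads
strace -tt -f -p <bconsole_pid> -o /tmp/bconsole.strace
# count director connections per TCP state
netstat -tan | awk '$4 ~ /:9101$/ || $5 ~ /:9101$/ {print $6}' | sort | uniq -c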


Steps To Reproduce

I can reproduce it by starting my nagios client and waiting a certain amount of time.

I am not sure how you can reproduce the issue because I don't know where the problem is yet (bareos or postgresql setup?). Maybe you could reproduce it by running a lot of sh -c echo 'list jobname=$FQDN' | /usr/sbin/bconsole invocations for a few hours on your test box.

Additional Information

* check_bacula
http://exchange.nagios.org/directory/Plugins/Backup-and-Recovery/Bacula/check_bacula_lastbackup-2Epl/details

* Version
$ rpm -qa |grep bareos
bareos-common-13.2.1-81.1.el6.x86_64
bareos-filedaemon-13.2.1-81.1.el6.x86_64
bareos-tools-13.2.1-81.1.el6.x86_64
bareos-database-common-13.2.1-81.1.el6.x86_64
bareos-database-tools-13.2.1-81.1.el6.x86_64
bareos-bconsole-13.2.1-81.1.el6.x86_64
bareos-director-13.2.1-81.1.el6.x86_64
bareos-storage-13.2.1-81.1.el6.x86_64
bareos-database-postgresql-13.2.1-81.1.el6.x86_64
bareos-client-13.2.1-81.1.el6.x86_64
$ cat /etc/redhat-release
CentOS release 6.4 (Final)


* Director config
- Password lines have been removed
- Servername has been replaced by bareos.server.fqdn
- Emails have been tweaked

# Bacula Director Master Configuration
# for bareos.server.fqdn

# Define the name of this director so other clients can
# connect to it and work with our system
Director {
  Name = "bareos.server.fqdn:director"
  Query File = "/usr/lib/bareos/scripts/query.sql"
  Working Directory = "/var/lib/bareos"
  PID Directory = "/var/run/bareos"
  Maximum Concurrent Jobs = 20
  Messages = "bareos.server.fqdn:messages:daemon"
}

# This is where the catalog information will be stored (basically
# this should be how to connect to whatever database we're using)
Catalog {
  Name = "bareos.server.fqdn:postgresql"
  dbname = "bareos"; dbdriver = postgresql
      user = bareos; password = "b4cul4p4ssw0rd"
  }

# Configure how the director will log and/or send messages. This
# should be for just about everything.
Messages {
  Name = "bareos.server.fqdn:messages:standard"
  Mail Command = "/usr/sbin/bsmtp -h localhost -f devnull@domain.tld -s \"Bacula %t %e (for %c)\" %r"
  Operator Command = "/usr/sbin/bsmtp -h localhost -f devnull@domain.tld -s \"Bacula Intervention Required (for %c)\" %r"
  Mail = devnull@domain.tld = all, !skipped
  Mail On Error = sysadmin@domain.tld = all, !skipped
  Operator = sysadmin@domain.tld = mount
  Console = all, !skipped, !saved
  # WARNING! the following will create a file that you must cycle from
  # time to time as it will grow indefinitely. However, it will
  # also keep all your messages if they scroll off the console.
  Append = "/var/log/bareos/bareos.server.fqdn:director.log" = all, !skipped
  Catalog = all
}

# These are messages directly from the various daemons themselves.
Messages {
  Name = "bareos.server.fqdn:messages:daemon"
  Mail Command = "/usr/sbin/bsmtp -h localhost -f devnull@domain.tld -s \"Bacula Notice (from Director %d)\" %r"
  Mail = sysadmin@domain.tld = all, !skipped
  Console = all, !skipped, !saved
  Append = "/var/log/bareos/bareos.server.fqdn:director.log" = all, !skipped
}


# DEFAULT STORAGE SERVER ------------------------------------------------------
# All the clients will define their own Storage Daemon configuration as they
# will connect to a dedicated File device on that director (to aid Pool & Volume
# management along with concurrent access). This section will define a default
# Storage Daemon to connect to (using the standard FileStorage device) and a
# Pool which will be used with that as well.
Storage {
  Name = "bareos.server.fqdn:storage:default"
  Address = bareos.server.fqdn
  Device = "DefaultFileStorage"
  Media Type = File
  Maximum Concurrent Jobs = 20
}
Storage {
  Name = "bareos.server.fqdn:storage:BackupCatalog"
  Address = bareos.server.fqdn
  Device = "FileStorage:BackupCatalog"
  Media Type = File-BackupCatalog
  Maximum Concurrent Jobs = 3
}

Pool {
  Name = "bareos.server.fqdn:pool:default"
  # All Volumes will have the format standard.date.time to ensure they
  # are kept unique throughout the operation and also aid quick analysis
  # We won't use a counter format for this at the moment.
  Label Format = "${Job}.${Year}${Month:p/2/0/r}${Day:p/2/0/r}.${Hour:p/2/0/r}${Minute:p/2/0/r}"
  Pool Type = Backup
  # Clean up any we don't need, and keep them for the Volume Retention period
  # set below (one year).
  # Note the files for the old volumes will still remain on the disk but will
  # be truncated to a zero size.
  Recycle = No
  Auto Prune = Yes
  Action On Purge = Truncate
  Volume Retention = 1 Year
  # Don't allow re-use of volumes; one volume per job only
  Maximum Volume Jobs = 1
}
Pool {
  Name = "bareos.server.fqdn:pool:default.full"
  # All Volumes will have the format standard.date.time to ensure they
  # are kept unique throughout the operation and also aid quick analysis
  # We won't use a counter format for this at the moment.
  Label Format = "${Job}.full.${Year}${Month:p/2/0/r}${Day:p/2/0/r}.${Hour:p/2/0/r}${Minute:p/2/0/r}"
  Pool Type = Backup
  # Clean up any we don't need, and keep them for a maximum of a year
  # Note the files for the old volumes will still remain on the disk but will
  # be truncated to a zero size.
  Recycle = No
  Auto Prune = Yes
  Action On Purge = Truncate
  Volume Retention = 1 Year
  # Don't allow re-use of volumes; one volume per job only
  Maximum Volume Jobs = 1
}

Pool {
  Name = "bareos.server.fqdn:pool:default.differential"
  # All Volumes will have the format standard.date.time to ensure they
  # are kept unique throughout the operation and also aid quick analysis
  # We won't use a counter format for this at the moment.
  Label Format = "${Job}.diff.${Year}${Month:p/2/0/r}${Day:p/2/0/r}.${Hour:p/2/0/r}${Minute:p/2/0/r}"
  Pool Type = Backup
  # Clean up any we don't need, and keep them for a maximum of three months
  # Note the files for the old volumes will still remain on the disk but will
  # be truncated to a zero size.
  Recycle = No
  Auto Prune = Yes
  Action On Purge = Truncate
  Volume Retention = 3 Months
  # Don't allow re-use of volumes; one volume per job only
  Maximum Volume Jobs = 1
}

Pool {
  Name = "bareos.server.fqdn:pool:default.incremental"
  # All Volumes will have the format standard.date.time to ensure they
  # are kept unique throughout the operation and also aid quick analysis
  # We won't use a counter format for this at the moment.
  Label Format = "${Job}.incr.${Year}${Month:p/2/0/r}${Day:p/2/0/r}.${Hour:p/2/0/r}${Minute:p/2/0/r}"
  Pool Type = Backup
  # Clean up any we don't need, and keep them for a maximum of 40 days
  # Note the files for the old volumes will still remain on the disk but will
  # be truncated to a zero size.
  Recycle = No
  Auto Prune = Yes
  Action On Purge = Truncate
  Volume Retention = 40 Days
  # Don't allow re-use of volumes; one volume per job only
  Maximum Volume Jobs = 1
}

Pool {
  Name = "bareos.server.fqdn:pool:catalog"
  # All Volumes will have the format director.catalog.date.time to ensure they
  # are kept unique throughout the operation and also aid quick analysis
  # Label Format = "${Job}.bareos.server.fqdn.${CounterLdr-bacula1Catalog+:p/3/0/r}"
  Label Format = "${Job}.bareos.server.fqdn.${Year}${Month:p/2/0/r}${Day:p/2/0/r}.${Hour:p/2/0/r}${Minute:p/2/0/r}"
  Pool Type = Backup
  # Clean up any we don't need, and keep them for the Volume Retention period
  # set below (one week).
  Recycle = No
  Auto Prune = Yes
  Action On Purge = Truncate
  # We have no limit on the number of volumes, but we will simply ensure that
  # we keep at least one week's worth of backups of the database
  Volume Retention = 1 Week
  # Don't allow re-use of volumes; one volume per job only
  Maximum Volume Jobs = 1
}

# Create a Counter which will be used to label the catalog volumes on the system
Counter {
  Name = "CounterLdr-bacula1Catalog"
  Minimum = 1
  Catalog = "bareos.server.fqdn:postgresql"
}

# FILE SETS -------------------------------------------------------------------
# Define the standard set of locations which will be backed up (along with
# what within those should not be). In general, we have two types:
#
# Basic:noHome This doesn't back up the /home directory, as it's mounted
# from an NFS server on the network (this is the default).
# Basic:withHome This one does, for servers where we don't mount NFS.

FileSet {
  Name = "Basic:noHome"
  Include {
    Options {
      Signature = SHA1
      Compression = lz4hc
      Shadowing = localwarn
    }

    # Don't worry about most of the system, as Puppet manages the
    # configuration. Ensure that per-machine state files or settings
    # are backed up, along with stuff from /var or /srv, which should be
    # most service-related files
    File = /boot
    File = /etc
    File = /usr/local
    File = /var
    File = /opt
    File = /srv
    # /home will not be backed up on any normal server, as it's managed from
    # a central file-server for most servers.
  }

  Exclude {
    # Ignore stuff that can be ignored
    File = /var/cache
    File = /var/tmp
    # The state of the packages installed, or their files, etc.
    # can be ignored as we use puppet to rebuild much of the server
    File = /var/lib/apt
    File = /var/lib/dpkg
    File = /var/lib/puppet
    File = /var/lib/yum
    # Ignore database stuff; this will need to be handled
    # using some sort of a dump script
    File = /var/lib/mysql
    File = /var/lib/postgresql
    File = /var/lib/ldap
    # Bacula's state files are no use to us on restore
    File = /var/lib/bareos
  }
}

FileSet {
  Name = "Basic:withHome"
  Include {
    Options {
      Signature = SHA1
      Compression = lz4hc
      Shadowing = localwarn
    }

    File = /boot
    File = /etc
    File = /usr/local
    File = /var
    File = /opt
    File = /srv
    # This set does include /home
    File = /home
  }

  Exclude {
    File = /var/cache
    File = /var/tmp
    File = /var/lib/apt
    File = /var/lib/dpkg
    File = /var/lib/puppet
    File = /var/lib/mysql
    File = /var/lib/postgresql
    File = /var/lib/ldap
    File = /var/lib/bareos
    File = /var/lib/yum
  }
}

FileSet {
  Name = "LinuxAll"
  Include {
    Options {
      Signature = SHA1
      Compression = lz4hc
      Shadowing = localwarn
      One FS = No # change into other filesystems
      FS Type = ext2 # filesystems of given types will be backed up
      FS Type = ext3 # others will be ignored
      FS Type = ext4
      FS Type = xfs
      FS Type = reiserfs
      FS Type = jfs
      FS Type = btrfs
      FS Type = vzfs
    }
    File = /
  }
  Exclude {
    File = /proc
    File = /tmp
    File = /.journal
    File = /.fsck
    File = /sys
    File = /dev
    File = /var/cache
    File = /var/cfengine/config
    File = /var/tmp
    File = /var/lib/apt
    File = /var/lib/dpkg
    File = /var/lib/puppet
    File = /var/lib/mysql
    # Nagios check results
    File = /var/nagios/spool/checkresults
    # postgresql data directories
    File = /var/lib/postgresql
    File = /var/lib/pgsql
    File = /var/lib/ldap
    # Backup Servers
    File = /var/lib/bareos
    File = /var/lib/bacula
    # Yum temp files
    File = /var/lib/yum
    # Cobbler Servers repo mirror
    File = /var/www/cobbler/repo_mirror
    # Virtuozzo/openvz containers
    File = /vz
    # Devsunserver localhomes
    File = /localhome
    # Spacewalk repositories
    File = /var/satellite
    # Dbbackup local dump dir
    File = /opt/dbbackup
    # Special filesystem to ignore
    File = /var/lib/nfs/rpc_pipefs
  }

}

# This set is specifically for Bacula to allow it to back up its own internal
# catalog as part of the normal process.
FileSet {
  Name = "Catalog"
  Include {
    Options {
      Signature = SHA1
      Compression = lz4hc
    }
    File = "/var/lib/bareos/bareos.sql"
  }
}


# SCHEDULE --------------------------------------------------------------------
# Define when jobs should be run, and what Levels of backups they will be when
# they are run.

# These two are the default backup schedule; don't change them
Schedule {
  Name = "WeeklyCycle"
  Run = Level=Full First Sun at 23:05
  Run = Level=Differential Second-Fifth Sun at 23:05
  Run = Level=Incremental Mon-Sat at 23:05
}

Schedule {
  Name = "WeeklyCycleAfterBackup"
  Run = Level=Full Mon-Sun at 05:10
}

# These cycles are set up so that we can spread out the full backups of our
# servers across the week. Some at the weekend, some mid-week.
Schedule {
  Name = "Weekly:onFriday"
  Run = Level=Full First Fri at 22:00
  Run = Level=Differential Second-Fifth Fri at 22:00
  Run = Level=Incremental Sat-Thu at 22:00
}

Schedule {
  Name = "Weekly:onSaturday"
  # Because this is a weekend job, we'll start the full runs earlier
  Run = Level=Full First Sat at 22:00
  Run = Level=Differential Second-Fifth Sat at 22:00
  Run = Level=Incremental Sun-Fri at 22:00
}

Schedule {
  Name = "Weekly:onSunday"
  # Because this is a weekend job, we'll start the full runs earlier
  Run = Level=Full First Sun at 22:00
  Run = Level=Differential Second-Fifth Sun at 22:00
  Run = Level=Incremental Mon-Sat at 22:00
}

Schedule {
  Name = "Weekly:onMonday"
  Run = Level=Full First Mon at 22:00
  Run = Level=Differential Second-Fifth Mon at 22:00
  Run = Level=Incremental Tue-Sun at 22:00
}

Schedule {
  Name = "Weekly:onTuesday"
  Run = Level=Full First Tue at 22:00
  Run = Level=Differential Second-Fifth Tue at 22:00
  Run = Level=Incremental Wed-Mon at 22:00
}

Schedule {
  Name = "Weekly:onWednesday"
  Run = Level=Full First Wed at 22:00
  Run = Level=Differential Second-Fifth Wed at 22:00
  Run = Level=Incremental Thu-Tue at 22:00
}

Schedule {
  Name = "Weekly:onThursday"
  Run = Level=Full First Thu at 22:00
  Run = Level=Differential Second-Fifth Thu at 22:00
  Run = Level=Incremental Fri-Wed at 22:00
}

Schedule {
  Name = "Hourly"
  Run = Level=Incremental hourly at 0:30
}

# JOB DEFINITIONS -------------------------------------------------------------
# Create the types of jobs we need to run.

# Backup the catalog database (after the nightly save)
Job {
  Name = "BackupCatalog"
  Type = Backup
  Client = bareos.server.fqdn
  FileSet="Catalog"
  Schedule = "WeeklyCycleAfterBackup"
  Storage = "bareos.server.fqdn:storage:BackupCatalog"
  Messages = "bareos.server.fqdn:messages:standard"
  Pool = "bareos.server.fqdn:pool:catalog"
  # This creates an ASCII copy of the catalog
  RunBeforeJob = "/usr/lib/bareos/scripts//make_catalog_backup.pl bareos.server.fqdn:postgresql"
  # This deletes the copy of the catalog
  RunAfterJob = "/usr/lib/bareos/scripts//delete_catalog_backup"
  Write Bootstrap = "/mnt/bareos/bootstraps/BackupCatalog.bsr"
  # Run after main backup
  Priority = 50
# This doesn't seem to be working correctly, so it has been removed.
# RunScript {
# RunsWhen=After
# RunsOnClient=No
# Console = "purge volume action=all allpools storage=File"
# }
}

# Create a standard profile for all normal servers
JobDefs {
  Name = "Basic:noHome:onMonday"
  Type = Backup
  Level = Incremental
  FileSet = "Basic:noHome"
  Schedule = "Weekly:onMonday"
  Messages = "bareos.server.fqdn:messages:standard"
  # Set the job to work as standard with the default Pool & Storage
  # (this will be overridden by the Job configuration for each Client)
  Storage = "bareos.server.fqdn:storage:default"
  Pool = "bareos.server.fqdn:pool:default"
  Write Bootstrap = "/mnt/bareos/bootstraps/%c.bsr"
  Priority = 15
  # Define how long any of these jobs are allowed to run for before we should
  # kill them. Note that this is the run time (how long the actual backup is
  # running for after starting, and not a maximum time after it was scheduled)
  Full Max Run Time = 36 Hours
  Differential Max Run Time = 6 Hours
  Incremental Max Run Time = 6 Hours
}
JobDefs {
  Name = "Basic:withHome:onMonday"
  Type = Backup
  Level = Incremental
  FileSet = "Basic:withHome"
  Schedule = "Weekly:onMonday"
  Messages = "bareos.server.fqdn:messages:standard"
  # Set the job to work as standard with the default Pool & Storage
  # (this will be overridden by the Job configuration for each Client)
  Storage = "bareos.server.fqdn:storage:default"
  Pool = "bareos.server.fqdn:pool:default"
  Write Bootstrap = "/mnt/bareos/bootstraps/%c.bsr"
  Priority = 15
  # Define how long any of these jobs are allowed to run for before we should
  # kill them. Note that this is the run time (how long the actual backup is
  # running for after starting, and not a maximum time after it was scheduled)
  Full Max Run Time = 36 Hours
  Differential Max Run Time = 6 Hours
  Incremental Max Run Time = 6 Hours
}
JobDefs {
  Name = "Basic:noHome:onTuesday"
  Type = Backup
  Level = Incremental
  FileSet = "Basic:noHome"
  Schedule = "Weekly:onTuesday"
  Messages = "bareos.server.fqdn:messages:standard"
  # Set the job to work as standard with the default Pool & Storage
  # (this will be overridden by the Job configuration for each Client)
  Storage = "bareos.server.fqdn:storage:default"
  Pool = "bareos.server.fqdn:pool:default"
  Write Bootstrap = "/mnt/bareos/bootstraps/%c.bsr"
  Priority = 15
  # Define how long any of these jobs are allowed to run for before we should
  # kill them. Note that this is the run time (how long the actual backup is
  # running for after starting, and not a maximum time after it was scheduled)
  Full Max Run Time = 36 Hours
  Differential Max Run Time = 6 Hours
  Incremental Max Run Time = 6 Hours
}
JobDefs {
  Name = "Basic:withHome:onTuesday"
  Type = Backup
  Level = Incremental
  FileSet = "Basic:withHome"
  Schedule = "Weekly:onTuesday"
  Messages = "bareos.server.fqdn:messages:standard"
  # Set the job to work as standard with the default Pool & Storage
  # (this will be overridden by the Job configuration for each Client)
  Storage = "bareos.server.fqdn:storage:default"
  Pool = "bareos.server.fqdn:pool:default"
  Write Bootstrap = "/mnt/bareos/bootstraps/%c.bsr"
  Priority = 15
  # Define how long any of these jobs are allowed to run for before we should
  # kill them. Note that this is the run time (how long the actual backup is
  # running for after starting, and not a maximum time after it was scheduled)
  Full Max Run Time = 36 Hours
  Differential Max Run Time = 6 Hours
  Incremental Max Run Time = 6 Hours
}
JobDefs {
  Name = "Basic:noHome:onWednesday"
  Type = Backup
  Level = Incremental
  FileSet = "Basic:noHome"
  Schedule = "Weekly:onWednesday"
  Messages = "bareos.server.fqdn:messages:standard"
  # Set the job to work as standard with the default Pool & Storage
  # (this will be overridden by the Job configuration for each Client)
  Storage = "bareos.server.fqdn:storage:default"
  Pool = "bareos.server.fqdn:pool:default"
  Write Bootstrap = "/mnt/bareos/bootstraps/%c.bsr"
  Priority = 15
  # Define how long any of these jobs are allowed to run for before we should
  # kill them. Note that this is the run time (how long the actual backup is
  # running for after starting, and not a maximum time after it was scheduled)
  Full Max Run Time = 36 Hours
  Differential Max Run Time = 6 Hours
  Incremental Max Run Time = 6 Hours
}
JobDefs {
  Name = "Basic:withHome:onWednesday"
  Type = Backup
  Level = Incremental
  FileSet = "Basic:withHome"
  Schedule = "Weekly:onWednesday"
  Messages = "bareos.server.fqdn:messages:standard"
  # Set the job to work as standard with the default Pool & Storage
  # (this will be overridden by the Job configuration for each Client)
  Storage = "bareos.server.fqdn:storage:default"
  Pool = "bareos.server.fqdn:pool:default"
  Write Bootstrap = "/mnt/bareos/bootstraps/%c.bsr"
  Priority = 15
  # Define how long any of these jobs are allowed to run for before we should
  # kill them. Note that this is the run time (how long the actual backup is
  # running for after starting, and not a maximum time after it was scheduled)
  Full Max Run Time = 36 Hours
  Differential Max Run Time = 6 Hours
  Incremental Max Run Time = 6 Hours
}
JobDefs {
  Name = "Basic:noHome:onThursday"
  Type = Backup
  Level = Incremental
  FileSet = "Basic:noHome"
  Schedule = "Weekly:onThursday"
  Messages = "bareos.server.fqdn:messages:standard"
  # Set the job to work as standard with the default Pool & Storage
  # (this will be overridden by the Job configuration for each Client)
  Storage = "bareos.server.fqdn:storage:default"
  Pool = "bareos.server.fqdn:pool:default"
  Write Bootstrap = "/mnt/bareos/bootstraps/%c.bsr"
  Priority = 15
  # Define how long any of these jobs are allowed to run for before we should
  # kill them. Note that this is the run time (how long the actual backup is
  # running for after starting, and not a maximum time after it was scheduled)
  Full Max Run Time = 36 Hours
  Differential Max Run Time = 6 Hours
  Incremental Max Run Time = 6 Hours
}
JobDefs {
  Name = "Basic:withHome:onThursday"
  Type = Backup
  Level = Incremental
  FileSet = "Basic:withHome"
  Schedule = "Weekly:onThursday"
  Messages = "bareos.server.fqdn:messages:standard"
  # Set the job to work as standard with the default Pool & Storage
  # (this will be overridden by the Job configuration for each Client)
  Storage = "bareos.server.fqdn:storage:default"
  Pool = "bareos.server.fqdn:pool:default"
  Write Bootstrap = "/mnt/bareos/bootstraps/%c.bsr"
  Priority = 15
  # Define how long any of these jobs are allowed to run for before we should
  # kill them. Note that this is the run time (how long the actual backup is
  # running for after starting, and not a maximum time after it was scheduled)
  Full Max Run Time = 36 Hours
  Differential Max Run Time = 6 Hours
  Incremental Max Run Time = 6 Hours
}
JobDefs {
  Name = "Basic:noHome:onFriday"
  Type = Backup
  Level = Incremental
  FileSet = "Basic:noHome"
  Schedule = "Weekly:onFriday"
  Messages = "bareos.server.fqdn:messages:standard"
  # Set the job to work as standard with the default Pool & Storage
  # (this will be overridden by the Job configuration for each Client)
  Storage = "bareos.server.fqdn:storage:default"
  Pool = "bareos.server.fqdn:pool:default"
  Write Bootstrap = "/mnt/bareos/bootstraps/%c.bsr"
  Priority = 15
  # Define how long any of these jobs are allowed to run for before we should
  # kill them. Note that this is the run time (how long the actual backup is
  # running for after starting, and not a maximum time after it was scheduled)
  Full Max Run Time = 36 Hours
  Differential Max Run Time = 6 Hours
  Incremental Max Run Time = 6 Hours
}
JobDefs {
  Name = "Basic:withHome:onFriday"
  Type = Backup
  Level = Incremental
  FileSet = "Basic:withHome"
  Schedule = "Weekly:onFriday"
  Messages = "bareos.server.fqdn:messages:standard"
  # Set the job to work as standard with the default Pool & Storage
  # (this will be overridden by the Job configuration for each Client)
  Storage = "bareos.server.fqdn:storage:default"
  Pool = "bareos.server.fqdn:pool:default"
  Write Bootstrap = "/mnt/bareos/bootstraps/%c.bsr"
  Priority = 15
  # Define how long any of these jobs are allowed to run for before we should
  # kill them. Note that this is the run time (how long the actual backup is
  # running for after starting, and not a maximum time after it was scheduled)
  Full Max Run Time = 36 Hours
  Differential Max Run Time = 6 Hours
  Incremental Max Run Time = 6 Hours
}
JobDefs {
  Name = "Basic:noHome:onSaturday"
  Type = Backup
  Level = Incremental
  FileSet = "Basic:noHome"
  Schedule = "Weekly:onSaturday"
  Messages = "bareos.server.fqdn:messages:standard"
  # Set the job to work as standard with the default Pool & Storage
  # (this will be overridden by the Job configuration for each Client)
  Storage = "bareos.server.fqdn:storage:default"
  Pool = "bareos.server.fqdn:pool:default"
  Write Bootstrap = "/mnt/bareos/bootstraps/%c.bsr"
  Priority = 15
  # Define how long any of these jobs are allowed to run for before we should
  # kill them. Note that this is the run time (how long the actual backup is
  # running for after starting, and not a maximum time after it was scheduled)
  Full Max Run Time = 36 Hours
  Differential Max Run Time = 6 Hours
  Incremental Max Run Time = 6 Hours
}
JobDefs {
  Name = "Basic:withHome:onSaturday"
  Type = Backup
  Level = Incremental
  FileSet = "Basic:withHome"
  Schedule = "Weekly:onSaturday"
  Messages = "bareos.server.fqdn:messages:standard"
  # Set the job to work as standard with the default Pool & Storage
  # (this will be overridden by the Job configuration for each Client)
  Storage = "bareos.server.fqdn:storage:default"
  Pool = "bareos.server.fqdn:pool:default"
  Write Bootstrap = "/mnt/bareos/bootstraps/%c.bsr"
  Priority = 15
  # Define how long any of these jobs are allowed to run for before we should
  # kill them. Note that this is the run time (how long the actual backup is
  # running for after starting, and not a maximum time after it was scheduled)
  Full Max Run Time = 36 Hours
  Differential Max Run Time = 6 Hours
  Incremental Max Run Time = 6 Hours
}
JobDefs {
  Name = "Basic:noHome:onSunday"
  Type = Backup
  Level = Incremental
  FileSet = "Basic:noHome"
  Schedule = "Weekly:onSunday"
  Messages = "bareos.server.fqdn:messages:standard"
  # Set the job to work as standard with the default Pool & Storage
  # (this will be overridden by the Job configuration for each Client)
  Storage = "bareos.server.fqdn:storage:default"
  Pool = "bareos.server.fqdn:pool:default"
  Write Bootstrap = "/mnt/bareos/bootstraps/%c.bsr"
  Priority = 15
  # Define how long any of these jobs are allowed to run for before we should
  # kill them. Note that this is the run time (how long the actual backup is
  # running for after starting, and not a maximum time after it was scheduled)
  Full Max Run Time = 36 Hours
  Differential Max Run Time = 6 Hours
  Incremental Max Run Time = 6 Hours
}
JobDefs {
  Name = "Basic:withHome:onSunday"
  Type = Backup
  Level = Incremental
  FileSet = "Basic:withHome"
  Schedule = "Weekly:onSunday"
  Messages = "bareos.server.fqdn:messages:standard"
  # Set the job to work as standard with the default Pool & Storage
  # (this will be overridden by the Job configuration for each Client)
  Storage = "bareos.server.fqdn:storage:default"
  Pool = "bareos.server.fqdn:pool:default"
  Write Bootstrap = "/mnt/bareos/bootstraps/%c.bsr"
  Priority = 15
  # Define how long any of these jobs are allowed to run for before we should
  # kill them. Note that this is the run time (how long the actual backup is
  # running for after starting, and not a maximum time after it was scheduled)
  Full Max Run Time = 36 Hours
  Differential Max Run Time = 6 Hours
  Incremental Max Run Time = 6 Hours
}

# Finally, bring in all the additional pieces of configuration from the
# different servers for which this Director was configured to manage
@|"sh -c 'for f in /etc/bareos/bareos-dir.d/*.conf ; do echo @${f} ; done'"

* Logs/Dumps

top - 13:36:09 up 48 days, 5:30, 3 users, load average: 75.72, 55.57, 41.56
Tasks: 994 total, 1 running, 992 sleeping, 0 stopped, 1 zombie
Cpu(s): 0.3%us, 0.2%sy, 0.0%ni, 99.1%id, 0.3%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 16315864k total, 16191248k used, 124616k free, 2524k buffers
Swap: 4194296k total, 4165692k used, 28604k free, 118164k cached

  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
19298 root 20 0 26568 2028 984 R 0.7 0.0 0:00.05 top
  107 root 20 0 0 0 0 S 0.2 0.0 2:23.14 kblockd/1
  179 root 20 0 0 0 0 D 0.2 0.0 16:49.87 kswapd1
  234 root 39 19 0 0 0 S 0.2 0.0 70:41.91 kipmi0
 6549 root 20 0 3222m 1.6g 212 D 0.2 10.5 0:13.18 bconsole <<< Look at RES
 6638 root 20 0 3226m 1.6g 128 D 0.2 10.2 0:12.90 bconsole <<<
15873 root 20 0 2529m 568m 128 D 0.2 3.6 0:02.99 bconsole <<<
15879 root 20 0 2529m 579m 128 D 0.2 3.6 0:03.06 bconsole ..
15967 root 20 0 2529m 560m 128 D 0.2 3.5 0:02.88 bconsole .
15981 root 20 0 2529m 558m 128 D 0.2 3.5 0:02.69 bconsole
15990 root 20 0 2529m 570m 128 D 0.2 3.6 0:02.79 bconsole
16050 root 20 0 2529m 572m 128 D 0.2 3.6 0:02.73 bconsole
19374 root 20 0 151m 2204 1716 D 0.2 0.0 0:00.01 sudo
    1 root 20 0 21304 228 88 S 0.0 0.0 2:50.25 init
    2 root 20 0 0 0 0 S 0.0 0.0 0:00.01 kthreadd
[..]


Process 26880 attached - interrupt to quit
read(3, ^C <unfinished ...>
Process 26880 detached
[root@ldr-bacula1.begen.iseek.com.au:~]# strace -ff -p 26880
Process 26880 attached with 3 threads - interrupt to quit
[pid 26882] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
[pid 26880] read(3, <unfinished ...>
[pid 26881] restart_syscall(<... resuming interrupted call ...>^C <unfinished ...>
Process 26880 detached
Process 26881 detached
Process 26882 detached
[root@ldr-bacula1.begen.iseek.com.au:~]# strace -ff -p 26880
Process 26880 attached with 3 threads - interrupt to quit
[pid 26882] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
[pid 26881] restart_syscall(<... resuming interrupted call ...> <unfinished ...> <<<<<<<<<<<<<<<<< UNTIL here it was slow, futex ? timeout ??
[pid 26880] read(3,
 <unfinished ...>
[pid 26881] <... restart_syscall resumed> ) = 0
[pid 26881] nanosleep({30, 0}, <unfinished ...>
[pid 26882] <... restart_syscall resumed> ) = -1 ETIMEDOUT (Connection timed out)
[pid 26882] futex(0x7f5746f56d80, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 26882] futex(0x7f5746f56dc4, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 7, {1385943234, 724898000}, ffffffff <unfinished ...>
[pid 26881] <... nanosleep resumed> NULL) = 0
[pid 26881] nanosleep({30, 0}, NULL) = 0
[pid 26881] nanosleep({30, 0}, <unfinished ...>
[pid 26882] <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out)
[pid 26882] futex(0x7f5746f56d80, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 26882] futex(0x7f5746f56dc4, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 9, {1385943294, 725191000}, ffffffff <unfinished ...>
[pid 26880] <... read resumed> "\0\0\0S", 4) = 4
[pid 26880] read(3, "auth cram-md5 <551356239.1385943"..., 83) = 83
[pid 26880] write(3, "\0\0\0\0279+/kl9/uOk/0Mx/AO0REED\0", 27) = 27
[pid 26880] poll([{fd=3, events=POLLIN}], 1, 180000) = 1 ([{fd=3, revents=POLLIN}])
[pid 26880] read(3, "\0\0\0\r", 4) = 4
[pid 26880] read(3, "1000 OK auth\n", 13) = 13
[pid 26880] uname({sys="Linux", node="ldr-bacula1.begen.iseek.com.au", ...}) = 0
[pid 26880] write(3, "\0\0\0004auth cram-md5 <327659900.138"..., 56) = 56
[pid 26880] poll([{fd=3, events=POLLIN}], 1, 180000) = 1 ([{fd=3, revents=POLLIN}])
[pid 26880] read(3, "\0\0\0\27", 4) = 4
[pid 26880] read(3, "F749NyhRlQ/sFi+2eA+tlC\0", 23) = 23
[pid 26880] write(3, "\0\0\0\r1000 OK auth\n", 17) = 17
[pid 26880] read(3, "\0\0\0T", 4) = 4
[pid 26880] read(3, "1000 OK: ldr-bacula1.begen.iseek"..., 84) = 84
[pid 26880] futex(0x7f5746f56dc4, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7f5746f56dc0, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
[pid 26880] nanosleep({0, 100000}, <unfinished ...>
[pid 26882] <... futex resumed> ) = 0
[pid 26882] futex(0x7f5746f56d80, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 26880] <... nanosleep resumed> NULL) = 0
[pid 26882] futex(0x7f5746f56dc4, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 11, {1385943298, 390011000}, ffffffff <unfinished ...>
[pid 26880] write(1, "Enter a period to cancel a comma"..., 36) = 36
Tags: No tags attached.

Activities

TomWork

2013-12-05 02:20

reporter   ~0000745

I forgot to say that when the system is stable it takes no time to display a job list. Memory footprint is probably very low as well.

real 0m0.062s
user 0m0.010s
sys 0m0.010s
mvwieringen

2013-12-05 13:35

developer   ~0000746

You have the following in your director config:

Maximum Concurrent Jobs = 20

That means you cannot run more than 20 jobs concurrently; keep in mind that
every bconsole session is a new Job too. There is a work queue in the director
that is sized by the above setting, which means that once it has reached
20 jobs it will only accept a new connection after another one has finished
(the directive is shown again below for reference).
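
For reference, this is the directive in the posted Director resource; the
larger value is purely illustrative, not a recommendation:

Director {
  Name = "bareos.server.fqdn:director"
  # ... other directives as posted ...
  Maximum Concurrent Jobs = 60   # sized for scheduled jobs plus console sessions
}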

First of all, I think it's kind of useless to query your system like this,
and especially by using so many bconsole sessions. Keep in mind that bconsole
is a hollow program: it does nothing more than display the user agent that the
director runs. So you are probably just blowing up the director, and as it
stops accepting connections that seems to trigger something in bconsole that
starts leaking memory in a frantic way.

I think it makes much more sense to use NSCA probes in Nagios for this
kind of monitoring, which turns the whole thing into something passive:
when the NSCA probe doesn't get the confirmation that the backup ran, it
will trigger an error in your monitoring.

As to why bconsole blows up: no idea, but I think the code is not very
robust when it comes to handling these resource-starvation errors. You
could do a kill -SEGV on one of the processes so it invokes the traceback
handler, but for that to have any info you need to install the debug rpms,
and I don't know exactly how that works on centos.
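
Roughly (<bconsole_pid> is a placeholder; this assumes core dumps are enabled
for the process, and where the core lands depends on kernel.core_pattern):

# force the traceback handler / a core dump of one bloated bconsole
kill -SEGV <bconsole_pid>
# with bareos-debuginfo installed, inspect the result
gdb /usr/sbin/bconsole /path/to/core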
TomWork

2013-12-06 09:03

reporter   ~0000747

1. I didn't know bconsole was a Job.

2. Yes, I agree that the current monitoring system is brain dead. It has to be redone. However, since you seem to know this area: what kind of passive check would you run?

3. For the kill -SEGV, are you talking about debug symbols for bareos? I can install bareos-debuginfo.x86_64. Just to be sure, I assume I will have to kill -SEGV `pidof bconsole` and not the director, right?
mvwieringen

2013-12-06 10:02

developer   ~0000748

Yes, you need to install the debug symbols, otherwise there is not much to
debug with in the created core. It should also create a proper core file
so we can do some postmortem debugging. And yes, it has to be the bconsole,
as you showed that to be what uses the large amount of memory. I think the dir
is fine as it won't run more than 20 jobs concurrently anyway.

As to the passive check: how about creating one in Nagios and then, in the
post-backup script, posting via NSCA to Nagios that the Job succeeded?
When you put a freshness threshold on such a passive check, it will
automatically trigger if your backups don't work for a longer time; e.g. if
you expect a backup every day, a freshness of 25 hours or so should work
(a sketch of such a hook follows below). It's been quite some time since I
used Nagios myself, but ages ago that was a nice and elegant solution.
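
A minimal sketch of such a hook, assuming send_nsca from the NSCA package is
installed and configured, and with hypothetical host/service names; it could
be wired in via a Run After Job directive on each backup job:

#!/bin/sh
# submit a passive check result to Nagios after a successful Bareos job;
# send_nsca expects: host<TAB>service<TAB>return-code<TAB>plugin-output
printf '%s\t%s\t%s\t%s\n' "$(hostname -f)" "bareos_backup" 0 "OK: backup job finished" \
  | /usr/sbin/send_nsca -H nagios.server.fqdn -c /etc/nagios/send_nsca.cfg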
mvwieringen

2013-12-06 10:05

developer   ~0000749

If you can capture a core file we can see how to analyze it and get an
idea of what is eating all the memory.
TomWork

2013-12-09 03:20

reporter   ~0000751

Hi,

I may be wrong, but I believe bconsole.debug is trying to execute something it cannot find, because everything is deployed under a /usr/lib/debug prefix. Is that something you intended? I am wondering because I believe you wanted to deploy it from /, hence the .debug suffix on all the binaries. If I have more time I will try to see what it is trying to start. Obviously, for the moment I cannot get a coredump :)

# rpm -q bareos-debuginfo
bareos-debuginfo-13.2.1-81.1.el6.x86_64

# rpm -ql bareos-debuginfo |grep bconsole
/usr/lib/debug/usr/sbin/bconsole.debug

# ls -l /usr/lib/debug/usr/sbin/bconsole.debug
-r-xr-xr-x 1 root root 86072 Sep 9 22:56 /usr/lib/debug/usr/sbin/bconsole.debug

[/usr/lib/debug/usr/sbin]# /usr/lib/debug/usr/sbin/bconsole.debug
-bash: /usr/lib/debug/usr/sbin/bconsole.debug: bad ELF interpreter: No such file or directory

# strace /usr/lib/debug/usr/sbin/bconsole.debug
execve("/usr/lib/debug/usr/sbin/bconsole.debug", ["/usr/lib/debug/usr/sbin/bconsole"...], [/* 21 vars */]) = -1 ENOENT (No such file or directory)
dup(2) = 3
fcntl(3, F_GETFL) = 0x8402 (flags O_RDWR|O_APPEND|O_LARGEFILE)
fstat(3, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f0408a01000
lseek(3, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek)
write(3, "strace: exec: No such file or di"..., 40strace: exec: No such file or directory
) = 40
close(3) = 0
munmap(0x7f0408a01000, 4096) = 0
exit_group(1) = ?

# ls -l /usr/lib/debug/usr/sbin/bconsole.debug
-r-xr-xr-x 1 root root 86072 Sep 9 22:56 bconsole.debug

# ls -l /usr/sbin/bconsole
-rwxr-xr-x 1 root root 37216 Sep 9 22:56 /usr/sbin/bconsole

# uname -r
2.6.32-358.23.2.el6.x86_64
TomWork

2013-12-11 08:05

reporter   ~0000758

Re,

I don't know why bconsole.debug is not working :/

# ls -l /usr/sbin/bconsole.debug
-r-xr-xr-x 1 root root 86072 Dec 11 16:57 /usr/sbin/bconsole.debug

# file /usr/sbin/bconsole*
/usr/sbin/bconsole: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.18, stripped
/usr/sbin/bconsole.debug: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.18, not stripped

# bconsole.debug
-bash: /usr/sbin/bconsole.debug: bad ELF interpreter: No such file or directory

# strace -s 1024 bconsole.debug
execve("/usr/sbin/bconsole.debug", ["bconsole.debug"], [/* 21 vars */]) = -1 ENOENT (No such file or directory) <<<<<<<<<<<<<<<<< WHY ??
dup(2) = 3
fcntl(3, F_GETFL) = 0x8002 (flags O_RDWR|O_LARGEFILE)
fstat(3, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f95647bb000
lseek(3, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek)
write(3, "strace: exec: No such file or directory\n", 40strace: exec: No such file or directory
) = 40
close(3) = 0
munmap(0x7f95647bb000, 4096) = 0
exit_group(1) = ?

# strace -s 1024 bconsole 2>&1|head
execve("/usr/sbin/bconsole", ["bconsole"], [/* 21 vars */]) = 0 <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< no problem here.
brk(0) = 0x1c89000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fa934130000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
open("/usr/lib64/tls/x86_64/libreadline.so.6", O_RDONLY) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/tls/x86_64", 0x7ffff42565c0) = -1 ENOENT (No such file or directory)
open("/usr/lib64/tls/libreadline.so.6", O_RDONLY) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/tls", {st_mode=S_IFDIR|0555, st_size=4096, ...}) = 0
open("/usr/lib64/x86_64/libreadline.so.6", O_RDONLY) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/x86_64", 0x7ffff42565c0) = -1 ENOENT (No such file or directory)
mvwieringen

2013-12-12 12:16

developer   ~0000762

Maybe you can look at this:

http://le-huy.blogspot.de/2011/01/using-debuginfo-packages-in-redhat.html

but it seems the debug file is nothing more than a so-called symbol table,
i.e. normally the binaries are stripped, so when you start gdb on them it
cannot really show any symbols in your binary, only some decoded assembly.

So you should just start the normal binary and get it to crash when it is so
big. It should then create a corefile; put gdb on that and it will show the
stacktrace using the symbol table in the .debug file (see below).
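
Roughly (the core path is a placeholder; on el6, gdb normally picks up
/usr/lib/debug/usr/sbin/bconsole.debug automatically once bareos-debuginfo
is installed, but it can also be loaded by hand):

gdb /usr/sbin/bconsole /path/to/core
(gdb) symbol-file /usr/lib/debug/usr/sbin/bconsole.debug
(gdb) thread apply all bt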
TomWork

2013-12-24 03:01

reporter   ~0000769

Hi,

I finally have time to look into it again. I have restarted nrpe, so a lot of bconsole processes should be launched again. However it has not crashed or increased the load yet. That could be related to the fact that I deleted a lot of volumes in the Catalog and on disk: I have 9228 media left and I removed 6659 media today. We will see how it goes, but could this be related to the number of rows in the media table, maybe? Does that sound possible?

I will keep you posted if I can reproduce the issue again.
TomWork

2013-12-26 13:08

reporter   ~0000770

Last edited: 2013-12-26 13:09

It finally happened again. I have a core file for:
 
 5557 root 20 0 1716m 1.1g 140 D 0.3 7.1 0:08.91 bconsole

You can download a 1.5GB core file via http://bogan.dmw.fr/~tom/bconsole.core.5557

Also note that my bareos-dir is currently using 4GB and I cannot connect with bconsole, but that could be because it's running our backup jobs at the moment.

14765 bareos 20 0 4368m 8468 776 S 0.0 0.1 13:20.46 bareos-dir

mvwieringen

2013-12-26 17:13

developer   ~0000771

Some first impressions:

- Compress the core file next time with bzip2 ==> 1.5 Gb => 83 Kb.
- It seems from the above that there is really a lot of data that compresses
  like hell. Doing a fast strings -a on the core already reveals one
  possible problem ==> /root/.bconsole_history, i.e. the readline history
  file is going out of control; that might account for quite some of the
  1.5 Gb of ram. Just remove it, as I guess it's not usable anyway (a
  one-liner for that is given below).
- You seem to use the world-famous one-volume-per-job scheme, which is known
  to scale poorly with both Bacula and Bareos.
- The last thing you say is that your dir does not respond; that is
  more or less the same problem as why all the NRPE-spawned bconsoles
  hang, namely that the director doesn't allow enough concurrent jobs
  (as every console is a job in the director).

So first, for analysis, drop the history file, and then start a complete
redesign of your backup environment; this is never going to work, you are
just way outside the design envelope of both Bacula and Bareos.

Maybe there is also a bug somewhere, but with these vast amounts of data it's
unlikely we will ever find most of it. I will see if I can get any info from
the core, but I would be interested to know whether, once you remove the
history file, a single bconsole uses a somewhat reasonable amount of memory.
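
The cleanup itself is a one-liner (path as seen in the strings output above):

# empty the runaway readline history that bconsole keeps appending to
: > /root/.bconsole_history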
TomWork

2013-12-27 03:07

reporter   ~0000772

Hi Marco,

My apologies for the big corefile; it didn't cross my mind that this thing would compress that much, or compress at all. Sorry.

About the .bconsole_history: I emptied it. I will check whether the problem is still as bad. Thank you for looking into the corefile. I quickly ran a sort | uniq on my bconsole history and indeed there is a lot of crap in there, probably due to check_bacula, but a lot of it didn't make much sense. Anyway, now it's empty and I will keep an eye on it.

About the redesign, that's rather unfortunate. Could you explain why this is not going to work?

We do not use tapes at all. I find the idea of one volume per job very simple to understand and very simple to manage. I can understand that Bareos/bacula is not designed for that, but I didn't know it; maybe you should state that somewhere. It is even explained in the Bacula documentation how to do it, IIRC. How could we guess as users that Bacula/Bareos is not designed for that? I am sure a lot of people are using a volume per job because it's simple and because it's now published everywhere on the Internet (blogs, articles, howtos, etc). If you believe this is a wrong design, you should write about it - if time allows - to explain why, and what the other solutions are. Otherwise it will get worse and worse. Tapes are not dead, but a lot of people back up D2D(2D), so the concepts of volume recycling, fixed sizes, etc. are maybe obsolete in that context, IMO.

At the moment, I am disappointed that Bareos won't meet my requirements. We moved from our old Bacula to Bareos with this design in mind, and now we have migrated more than 200 servers to it. I am not going to roll back now, but I will plan another migration based on your advice, or with another software (I do NOT mean to offend here). At the moment it works OK if I remove our silly monitoring, which I agree is badly designed. RunJob scripts + NSCA are perfect for that, thanks for the hint.

Please keep me posted about the design: what's wrong and how you would do it. Thank you.
mvwieringen

2013-12-27 10:13

developer   ~0000773

The console history had a million entries about deleting volumes,
which triggered me to think you were using one volume per job.
Could it be that there is a permission problem on the bconsole history?
Normally the history should have a maximum of 200 entries and it's truncated
by the code, but that doesn't seem to work.

The problem with one volume per job is that it doesn't scale. The queries
for volumes are sized for a tape-based system, i.e. a limited number of
volumes.

With one volume per job the number of volumes explodes (it works nicely
for small setups, but I wonder how well it will work for larger setups).
Also, you create new volumes all the time, and that means the mediaid will
increase all the time; that will end eventually (maybe not soon, but that
depends a bit on how many volumes you create a day).

About the documentation, that is work in progress; as you know we got it
by forking bacula, and there is quite some work to be done there.

Also, bacula and bareos both have the same problem with a large number of
volumes, so in that sense bacula would run into the same problems eventually.

My preferred setup is using fixed-size virtual tapes, keeping them in a pool
with automatic recycling, and then using copy jobs to tape. We also changed
the default config to use some saner defaults. The problem with fixed-size
volumes is however the chance of things becoming corrupt, and that is
something I have seen as a serious problem on Linux. I have been spoiled in
that sense on Solaris with ZFS, where that never happens anymore. Maybe in
the future, when BTRFS stabilizes or ZFS becomes a real option, that will
also be solved.

My first change would be to use one volume per day; that limits the risk
of corruption to one day, and it keeps you from creating a million volumes
a day (a sketch of such a pool is below).
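
A minimal sketch of what a daily-volume pool could look like, reusing the
label-format idiom from the config above (the name, retention, and recycling
policy are illustrative, not a recommendation):

Pool {
  Name = "bareos.server.fqdn:pool:daily"
  Pool Type = Backup
  # One volume per calendar day
  Label Format = "daily.${Year}${Month:p/2/0/r}${Day:p/2/0/r}"
  # Stop writing to a volume roughly a day after first use
  Volume Use Duration = 23 Hours
  Recycle = Yes
  Auto Prune = Yes
  Volume Retention = 40 Days
}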

About the real problem in the core: I guess it just crashes in the memory
allocator as it ran out of memory. The smartalloc stuff from bacula will
do a null pointer dereference as a way to force a core dump (a bit like an
assert).

Other than that, if the design you now have makes you happy you may want
to keep it, but I think it will break eventually.
TomWork

2013-12-30 07:10

reporter   ~0000775

What's the biggest number for a media/volume in the code?
With PostgreSQL, I can see it's an integer, so 2147483647 [1].

Ideally, if we can, it would be good to change it to a bigserial or a bigint, and in the code to an unsigned long ;) However, at the moment I am creating a bit fewer than 300 volumes a day. If the maximum is only the pgsql int type, i.e. 2147483647, and not something in the bareos code, then I still have about 19611 years in front of me at a rate of 300 volumes a day, minus the mediaIDs I have already used (quick check below).
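
Back-of-the-envelope, with 365-day years:

$ echo $(( 2147483647 / 300 / 365 ))
19611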

About the doco, I fully understand. I was not whinging about it.

About limitations: I know that Bareos's limitations come from Bacula. My point about changing software would be to find something else that allows me to have one volume per job. What everybody wants is backup concurrency (many SDs, since otherwise you cannot write to the same volume, IIRC) and a code design not tied to the tape model. I also understand that there is a lot of work involved if you guys want to move in that direction. IMO, keeping a design based on tapes would be a mistake. Tapes will still exist but are less important these days. Backups have to be fast and reliable, and should auto-clean on disk once the retention makes them obsolete, to free space for the new backups coming in.

I will think about your 1 volume per day. I am not sure it will suit our needs but it's a compromise.

About the issue: I don't know why .bconsole_history was clobbered with a lot of deletes. What I can tell you is that the .bconsole_history file was probably much longer than 200 lines: I stopped my sort | uniq -c after 5 min, and I believe sorting 200 lines should take less than 1 sec on my server. However, interestingly enough, at the moment .bconsole_history is exactly 200 lines long.

# wc -l .bconsole_history
200 .bconsole_history

Weird...

Anyway, if you don't hear from me within 3 weeks you can resolve this ticket: either the problem stopped, or I finally moved to nagios passive checks. I will do my best to either ask you to resolve the ticket or give you new info about the issue.

Thanks for your support.

[1] : http://www.postgresql.org/docs/9.2/static/datatype-numeric.html

Issue History

Date Modified Username Field Change
2013-12-05 02:15 TomWork New Issue
2013-12-05 02:20 TomWork Note Added: 0000745
2013-12-05 13:35 mvwieringen Note Added: 0000746
2013-12-06 09:03 TomWork Note Added: 0000747
2013-12-06 10:02 mvwieringen Note Added: 0000748
2013-12-06 10:05 mvwieringen Note Added: 0000749
2013-12-06 10:05 mvwieringen Assigned To => mvwieringen
2013-12-06 10:05 mvwieringen Status new => feedback
2013-12-09 03:20 TomWork Note Added: 0000751
2013-12-09 03:20 TomWork Status feedback => assigned
2013-12-11 08:05 TomWork Note Added: 0000758
2013-12-12 12:16 mvwieringen Note Added: 0000762
2013-12-24 03:01 TomWork Note Added: 0000769
2013-12-26 13:08 TomWork Note Added: 0000770
2013-12-26 13:09 TomWork Note Edited: 0000770
2013-12-26 17:13 mvwieringen Note Added: 0000771
2013-12-26 17:14 mvwieringen Status assigned => feedback
2013-12-27 03:07 TomWork Note Added: 0000772
2013-12-27 03:07 TomWork Status feedback => assigned
2013-12-27 10:13 mvwieringen Note Added: 0000773
2013-12-27 10:15 mvwieringen Status assigned => feedback
2013-12-30 07:10 TomWork Note Added: 0000775
2013-12-30 07:10 TomWork Status feedback => assigned
2014-01-23 09:20 mvwieringen Assigned To mvwieringen =>
2014-01-23 09:20 mvwieringen Status assigned => closed
2014-01-23 09:20 mvwieringen Resolution open => no change required