0001082: Bareos director crashes with segfault when restarting or reload from console

ID	Project	Category	View Status	Date Submitted	Last Update

0001082	bareos-core	director	public	2019-04-30 14:23	2023-07-14 14:20

Reporter	jurgengoedbloed	Assigned To	bruno-at-bareos
Priority	normal	Severity	crash	Reproducibility	sometimes
Status	closed	Resolution	no change required
Platform	Linux	OS	CentOS	OS Version	7
Product Version	18.2.5

Summary	0001082: Bareos director crashes with segfault when restarting or reload from console
Description	After a config change, I reloaded the Bareos director and it crashed with a segfault. After I restart the director, it seems to run for about a minute and then crashes again: sometimes almost immediately, sometimes after a couple of minutes. At startup, Bareos doesn't complain about a config error; after a while it just stops. I had the same issue in the past, but managed to get the system up and running by waiting a considerable amount of time (> 1 day) and then restarting the director. It then ran and backed up for over two weeks. The director and storage daemon are running 18.2.5; all clients run 17.2.4 or 18.2.5, all on CentOS 7. The director and storage daemon (both on the same machine) run on a fully patched CentOS 7 machine. I've had the same issue with the director on version 17.2.4 and a self-compiled 17.2.5. I suspect it is related to the fact that all clients use 'client initiated connection' and that something goes wrong as soon as clients reconnect after a restart of the director. A race condition, or a lack of resources?
Steps To Reproduce	When the crash occurs: - Start the Bareos director - Within a minute, the director will crash again.
Additional Information	As requested by Andreas, created this bug and attached the traceback file.
Tags	No tags attached.

jurgengoedbloed 2019-04-30 14:23 reporter	bareos-dir.1640.bactrace (1,079 bytes)
Attempt to dump current JCRs. njcrs=2
threadid=0x0000007f8e268bb8 JobId=0 JobStatus=R jcr=0x12a4b58 name=JobMonitor.2019-04-30_13.10.42_01
threadid=0x6200007f8e268bb8 killable=0 JobId=0 JobStatus=R jcr=0x12a4b58 name=JobMonitor.2019-04-30_13.10.42_01
UseCount=1 JobType=I JobLevel=
sched_time=30-Apr-2019 13:10 start_time=30-Apr-2019 13:10
end_time=01-Jan-1970 01:00 wait_time=01-Jan-1970 01:00
db=(nil) db_batch=(nil) batch_started=0
threadid=0xf000007f8e13fff7 JobId=0 JobStatus=R jcr=0x7f8e0c0008e8 name=StatisticsCollector.2019-04-30_13.10.42_02
threadid=0x6600007f8e13fff7 killable=0 JobId=0 JobStatus=R jcr=0x7f8e0c0008e8 name=StatisticsCollector.2019-04-30_13.10.42_02
UseCount=1 JobType=I JobLevel=
sched_time=30-Apr-2019 13:10 start_time=30-Apr-2019 13:10
end_time=01-Jan-1970 01:00 wait_time=01-Jan-1970 01:00
db=0x7f8e0c0024e8 db_batch=(nil) batch_started=0
BareosDb=0x7f8e0c0024e8 db_name=bareos db_user=bareos connected=true
cmd="cats/sql_create.cc:390 mediatype record File already exists " changes=0
RWLOCK=0x7f8e0c0024f0 w_active=0 w_wait=0

arogge 2019-04-30 15:06 manager ~0003349	Does the problem persist if you disable statistics collection?

jurgengoedbloed 2019-05-02 17:12 reporter ~0003352	Yes. Statistics collection was already turned off. The database tables 'devicestats' and 'jobstats' are also empty.

jurgengoedbloed 2019-05-02 17:25 reporter ~0003353	To add to this: the director had no statistics enabled, but the storage daemon did. I disabled it and restarted the storage daemon. Then I restarted the director and it crashed again. What I did then was the following: - Stop the storage daemon - Start the director and monitor whether it keeps running (it does) - Start the storage daemon. The director now keeps running. I did a small test backup job, which ran fine. Tonight a batch of backup jobs will run; tomorrow I will let you know the outcome.

jurgengoedbloed 2019-05-03 09:13 reporter ~0003354	All backups ran fine this night. Is there anything I can test or try?

arogge 2019-05-03 09:58 manager ~0003355	You can check whether you have a meaningful 'traceback' file next to the bactrace you attached. If you have gdb and the debug packages installed (no performance penalties), a crash will produce a traceback file in which we can see exactly in which function and on which line the crash happened. This helps us a lot in tracking down the crash.

jurgengoedbloed 2019-05-03 10:12 reporter	bareos.1640.traceback (1,657 bytes)
Created /var/lib/bareos/bareos-dir.core.1640 for doing postmortem debugging
warning: the debug information found in "/usr/lib/debug//usr/sbin/bareos-dir.debug" does not match "/usr/sbin/bareos-dir" (CRC mismatch).
warning: the debug information found in "/usr/lib/debug/usr/sbin/bareos-dir.debug" does not match "/usr/sbin/bareos-dir" (CRC mismatch).
[New LWP 1643] [New LWP 1644] [New LWP 1645] [New LWP 1789] [New LWP 1790] [New LWP 1791] [New LWP 1792] [New LWP 1793] [New LWP 1794] [New LWP 1795] [New LWP 1796] [New LWP 1797] [New LWP 1798] [New LWP 1799] [New LWP 1800] [New LWP 1801] [New LWP 1802] [New LWP 1803] [New LWP 1804] [New LWP 1805] [New LWP 1806] [New LWP 1807] [New LWP 1808] [New LWP 1809] [New LWP 1810] [New LWP 1811] [New LWP 1812] [New LWP 1813] [New LWP 1814] [New LWP 1815] [New LWP 1816] [New LWP 1817] [New LWP 1818] [New LWP 1640]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
warning: the debug information found in "/usr/lib/debug//usr/lib64/bareos/backends/libbareoscats-postgresql.so.debug" does not match "/usr/lib64/bareos/backends/libbareoscats-postgresql.so" (CRC mismatch).
warning: the debug information found in "/usr/lib/debug/usr/lib64/bareos/backends/libbareoscats-postgresql.so.debug" does not match "/usr/lib64/bareos/backends/libbareoscats-postgresql.so" (CRC mismatch).
Core was generated by `/usr/sbin/bareos-dir'.
#0 0x00007f8e236e420d in poll () from /lib64/libc.so.6
$1 = 1701994850
$2 = 19234424
$3 = 19234488
/usr/lib/bareos/scripts/btraceback.gdb:4: Error in sourced command file:
No symbol table is loaded. Use the "file" command.

jurgengoedbloed 2019-05-03 10:12 reporter ~0003356	Yes, I have. Here is the corresponding traceback file.

arogge 2019-05-03 10:16 manager ~0003357	From the traceback file (it is a simple text file) it looks like your debug packages don't match the binary packages you've got installed. Could you check this?

jurgengoedbloed 2019-05-03 10:55 reporter ~0003358	I installed from the bareos repository. Noticed that the package bareos-debuginfo was still 18.2.4rc. Updated to 18.2.5 and restarted director and storage (they are on the same machine). If that is what you meant, then the versions should be the same now.
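
The CRC-mismatch warnings in the first traceback came from a debuginfo package that lagged behind the installed binaries. A minimal sketch of how such a mismatch could be spotted is given here, assuming the package list is obtained with something like `rpm -q --qf '%{NAME} %{VERSION}\n' bareos-director bareos-debuginfo`. The function name `find_version_mismatches` and the choice of `bareos-director` as the reference package are illustrative assumptions, not part of Bareos.

```python
def find_version_mismatches(rpm_output, reference="bareos-director"):
    """Return {package: version} for every package whose version differs
    from the version of the reference package.

    rpm_output is expected to contain one "name version" pair per line.
    Lines with any other shape (for example rpm's "package X is not
    installed" message) are ignored.
    """
    versions = {}
    for line in rpm_output.splitlines():
        parts = line.split()
        if len(parts) == 2:
            versions[parts[0]] = parts[1]

    expected = versions.get(reference)
    if expected is None:
        raise ValueError(f"reference package {reference!r} not found in input")

    return {name: ver for name, ver in versions.items() if ver != expected}


if __name__ == "__main__":
    # Illustrative input modelled on the versions mentioned in this issue.
    sample = "bareos-director 18.2.5\nbareos-debuginfo 18.2.4rc\n"
    print(find_version_mismatches(sample))
```

An empty result means all listed packages carry the same version string as the reference package; it says nothing about packages that were not queried.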

jurgengoedbloed 2019-05-03 11:29 reporter ~0003359	Here is a new traceback file bareos.63319.traceback (12,707 bytes) Created /var/lib/bareos/bareos-dir.core.63319 for doing postmortem debugging [New LWP 63321] [New LWP 63323] [New LWP 63325] [New LWP 63333] [New LWP 63374] [New LWP 63375] [New LWP 63376] [New LWP 63377] [New LWP 63378] [New LWP 63379] [New LWP 63380] [New LWP 63381] [New LWP 63382] [New LWP 63383] [New LWP 63384] [New LWP 63385] [New LWP 63319] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib64/libthread_db.so.1". Core was generated by `/usr/sbin/bareos-dir'. #0 0x00007f1739d3397d in accept () from /lib64/libpthread.so.0 $1 = '\000' <repeats 127 times> $2 = 0x22d2e78 "bareos-dir" $3 = 0x22d2eb8 "/usr/sbin/bareos-dir" $4 = 0x232ef68 "PostgreSQL" $5 = 0x7f173a68ea25 "18.2.5 (30 January 2019)" $6 = 0x7f173a68ea0b "Linux-4.4.92-6.18-default" $7 = 0x7f173a68ea04 "redhat" $8 = 0x7f173a68ef68 "CentOS Linux release 7.6.1810 (Core) " $9 = "prd-mgmt-bareosstore1", '\000' <repeats 234 times> $10 = 0x7f173a68ef90 "redhat CentOS Linux release 7.6.1810 (Core) " Environment variable "TestName" not defined. 
#0 0x00007f1739d3397d in accept () from /lib64/libpthread.so.0 #1 0x00007f173a63b6bb in BnetThreadServerTcp (addr_list=addr_list@entry=0x242a818, max_clients=<optimized out>, sockfds=<optimized out>, client_wq=client_wq@entry=0x6d6180 <directordaemon::socket_workq>, nokeepalive=<optimized out>, HandleConnectionRequest=HandleConnectionRequest@entry=0x446c40 <directordaemon::HandleConnectionRequest(ConfigurationParser, void)>, config=<optimized out>, server_state=<optimized out>, server_state@entry=0x6d6160 <directordaemon::server_state>) at /usr/src/debug/bareos-18.2.5/src/lib/bnet_server_tcp.cc:356 #2 0x0000000000446c34 in directordaemon::connect_thread (arg=0x242a818) at /usr/src/debug/bareos-18.2.5/src/dird/socket_server.cc:132 #3 0x00007f1739d2cdd5 in start_thread () from /lib64/libpthread.so.0 #4 0x00007f17385d9ead in clone () from /lib64/libc.so.6 Thread 17 (Thread 0x7f173b7a6880 (LWP 63319)): #0 0x00007f1739d33e3d in nanosleep () from /lib64/libpthread.so.0 #1 0x00007f173a64a344 in Bmicrosleep (sec=sec@entry=60, usec=usec@entry=0) at /usr/src/debug/bareos-18.2.5/src/lib/bsys.cc:171 #2 0x000000000044cd0d in directordaemon::wait_for_next_job (one_shot_job_to_run=<optimized out>) at /usr/src/debug/bareos-18.2.5/src/dird/scheduler.cc:131 #3 0x000000000041dc81 in main (argc=<optimized out>, argv=<optimized out>) at /usr/src/debug/bareos-18.2.5/src/dird/dird.cc:449 Thread 16 (Thread 0x7f16feffd700 (LWP 63385)): #0 0x00007f1739d30d12 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00007f173a677a5e in workq_server (arg=0x6d6180 <directordaemon::socket_workq>) at /usr/src/debug/bareos-18.2.5/src/lib/workq.cc:210 #2 0x00007f1739d2cdd5 in start_thread () from /lib64/libpthread.so.0 #3 0x00007f17385d9ead in clone () from /lib64/libc.so.6 Thread 15 (Thread 0x7f16ff7fe700 (LWP 63384)): #0 0x00007f1739d30d12 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00007f173a677a5e in workq_server (arg=0x6d6180 
<directordaemon::socket_workq>) at /usr/src/debug/bareos-18.2.5/src/lib/workq.cc:210 #2 0x00007f1739d2cdd5 in start_thread () from /lib64/libpthread.so.0 #3 0x00007f17385d9ead in clone () from /lib64/libc.so.6 Thread 14 (Thread 0x7f16fffff700 (LWP 63383)): #0 0x00007f1739d30d12 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00007f173a677a5e in workq_server (arg=0x6d6180 <directordaemon::socket_workq>) at /usr/src/debug/bareos-18.2.5/src/lib/workq.cc:210 #2 0x00007f1739d2cdd5 in start_thread () from /lib64/libpthread.so.0 #3 0x00007f17385d9ead in clone () from /lib64/libc.so.6 Thread 13 (Thread 0x7f171cff9700 (LWP 63382)): #0 0x00007f1739d30d12 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00007f173a677a5e in workq_server (arg=0x6d6180 <directordaemon::socket_workq>) at /usr/src/debug/bareos-18.2.5/src/lib/workq.cc:210 #2 0x00007f1739d2cdd5 in start_thread () from /lib64/libpthread.so.0 #3 0x00007f17385d9ead in clone () from /lib64/libc.so.6 Thread 12 (Thread 0x7f171d7fa700 (LWP 63381)): #0 0x00007f1739d30d12 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00007f173a677a5e in workq_server (arg=0x6d6180 <directordaemon::socket_workq>) at /usr/src/debug/bareos-18.2.5/src/lib/workq.cc:210 #2 0x00007f1739d2cdd5 in start_thread () from /lib64/libpthread.so.0 #3 0x00007f17385d9ead in clone () from /lib64/libc.so.6 Thread 11 (Thread 0x7f171e7fc700 (LWP 63380)): #0 0x00007f1739d30d12 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00007f173a677a5e in workq_server (arg=0x6d6180 <directordaemon::socket_workq>) at /usr/src/debug/bareos-18.2.5/src/lib/workq.cc:210 #2 0x00007f1739d2cdd5 in start_thread () from /lib64/libpthread.so.0 #3 0x00007f17385d9ead in clone () from /lib64/libc.so.6 Thread 10 (Thread 0x7f171effd700 (LWP 63379)): #0 0x00007f1739d30d12 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00007f173a677a5e in workq_server 
(arg=0x6d6180 <directordaemon::socket_workq>) at /usr/src/debug/bareos-18.2.5/src/lib/workq.cc:210 #2 0x00007f1739d2cdd5 in start_thread () from /lib64/libpthread.so.0 #3 0x00007f17385d9ead in clone () from /lib64/libc.so.6 Thread 9 (Thread 0x7f171f7fe700 (LWP 63378)): #0 0x00007f1739d30d12 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00007f173a677a5e in workq_server (arg=0x6d6180 <directordaemon::socket_workq>) at /usr/src/debug/bareos-18.2.5/src/lib/workq.cc:210 #2 0x00007f1739d2cdd5 in start_thread () from /lib64/libpthread.so.0 #3 0x00007f17385d9ead in clone () from /lib64/libc.so.6 Thread 8 (Thread 0x7f16c3fff700 (LWP 63377)): #0 0x00007f1739d30d12 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00007f173a677a5e in workq_server (arg=0x6d6180 <directordaemon::socket_workq>) at /usr/src/debug/bareos-18.2.5/src/lib/workq.cc:210 #2 0x00007f1739d2cdd5 in start_thread () from /lib64/libpthread.so.0 #3 0x00007f17385d9ead in clone () from /lib64/libc.so.6 Thread 7 (Thread 0x7f171dffb700 (LWP 63376)): #0 0x00007f1739d30d12 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00007f173a677a5e in workq_server (arg=0x6d6180 <directordaemon::socket_workq>) at /usr/src/debug/bareos-18.2.5/src/lib/workq.cc:210 #2 0x00007f1739d2cdd5 in start_thread () from /lib64/libpthread.so.0 #3 0x00007f17385d9ead in clone () from /lib64/libc.so.6 Thread 6 (Thread 0x7f16a3fff700 (LWP 63375)): #0 0x00007f1739d30d12 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00007f173a677a5e in workq_server (arg=0x6d6180 <directordaemon::socket_workq>) at /usr/src/debug/bareos-18.2.5/src/lib/workq.cc:210 #2 0x00007f1739d2cdd5 in start_thread () from /lib64/libpthread.so.0 #3 0x00007f17385d9ead in clone () from /lib64/libc.so.6 Thread 5 (Thread 0x7f172c830700 (LWP 63374)): #0 0x00007f1739d30d12 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00007f173a677a5e in workq_server 
(arg=0x6d6180 <directordaemon::socket_workq>) at /usr/src/debug/bareos-18.2.5/src/lib/workq.cc:210 #2 0x00007f1739d2cdd5 in start_thread () from /lib64/libpthread.so.0 #3 0x00007f17385d9ead in clone () from /lib64/libc.so.6 Thread 4 (Thread 0x7f172d832700 (LWP 63333)): #0 0x00007f1739d34179 in waitpid () from /lib64/libpthread.so.0 #1 0x00007f173a669624 in SignalHandler (sig=11) at /usr/src/debug/bareos-18.2.5/src/lib/signal.cc:241 #2 <signal handler called> #3 Connection::check (this=this@entry=0xaaaaaaaaaaaaaaaa, timeout_data=timeout_data@entry=0) at /usr/src/debug/bareos-18.2.5/src/lib/connection_pool.cc:61 #4 0x00007f173a64e092 in ConnectionPool::cleanup (this=this@entry=0x2344718) at /usr/src/debug/bareos-18.2.5/src/lib/connection_pool.cc:140 #5 0x00007f173a64e17f in ConnectionPool::add (this=this@entry=0x2344718, connection=connection@entry=0x7f1724007aa8) at /usr/src/debug/bareos-18.2.5/src/lib/connection_pool.cc:156 #6 0x00007f173a64e29d in ConnectionPool::add_connection (this=this@entry=0x2344718, name=name@entry=0x7f172d831c90 "tst-civ-nominatim", fd_protocol_version=fd_protocol_version@entry=54, socket=socket@entry=0x7f1728038478, authenticated=authenticated@entry=true) at /usr/src/debug/bareos-18.2.5/src/lib/connection_pool.cc:168 #7 0x0000000000490d6f in directordaemon::HandleFiledConnection (connections=0x2344718, fd=fd@entry=0x7f1728038478, client_name=client_name@entry=0x7f172d831c90 "tst-civ-nominatim", fd_protocol_version=54) at /usr/src/debug/bareos-18.2.5/src/dird/fd_cmds.cc:1365 #8 0x0000000000446f4e in directordaemon::HandleConnectionRequest (config=0x22d43d0, arg=0x7f1728038478) at /usr/src/debug/bareos-18.2.5/src/dird/socket_server.cc:117 #9 0x00007f173a677b9d in workq_server (arg=0x6d6180 <directordaemon::socket_workq>) at /usr/src/debug/bareos-18.2.5/src/lib/workq.cc:232 #10 0x00007f1739d2cdd5 in start_thread () from /lib64/libpthread.so.0 #11 0x00007f17385d9ead in clone () from /lib64/libc.so.6 Thread 3 (Thread 0x7f171ffff700 (LWP 
63325)): #0 0x00007f1739d30d12 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x000000000044de37 in wait_for_next_run () at /usr/src/debug/bareos-18.2.5/src/dird/stats.cc:110 #2 directordaemon::statistics_thread (arg=<optimized out>) at /usr/src/debug/bareos-18.2.5/src/dird/stats.cc:293 #3 0x00007f1739d2cdd5 in start_thread () from /lib64/libpthread.so.0 #4 0x00007f17385d9ead in clone () from /lib64/libc.so.6 Thread 2 (Thread 0x7f172d031700 (LWP 63323)): #0 0x00007f1739d30d12 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00007f173a6771b7 in watchdog_thread (arg=<optimized out>) at /usr/src/debug/bareos-18.2.5/src/lib/watchdog.cc:313 #2 0x00007f1739d2cdd5 in start_thread () from /lib64/libpthread.so.0 #3 0x00007f17385d9ead in clone () from /lib64/libc.so.6 Thread 1 (Thread 0x7f172e033700 (LWP 63321)): #0 0x00007f1739d3397d in accept () from /lib64/libpthread.so.0 #1 0x00007f173a63b6bb in BnetThreadServerTcp (addr_list=addr_list@entry=0x242a818, max_clients=<optimized out>, sockfds=<optimized out>, client_wq=client_wq@entry=0x6d6180 <directordaemon::socket_workq>, nokeepalive=<optimized out>, HandleConnectionRequest=HandleConnectionRequest@entry=0x446c40 <directordaemon::HandleConnectionRequest(ConfigurationParser, void)>, config=<optimized out>, server_state=<optimized out>, server_state@entry=0x6d6160 <directordaemon::server_state>) at /usr/src/debug/bareos-18.2.5/src/lib/bnet_server_tcp.cc:356 #2 0x0000000000446c34 in directordaemon::connect_thread (arg=0x242a818) at /usr/src/debug/bareos-18.2.5/src/dird/socket_server.cc:132 #3 0x00007f1739d2cdd5 in start_thread () from /lib64/libpthread.so.0 #4 0x00007f17385d9ead in clone () from /lib64/libc.so.6 #0 0x00007f1739d3397d in accept () from /lib64/libpthread.so.0 No symbol table info available. 
#1 0x00007f173a63b6bb in BnetThreadServerTcp (addr_list=addr_list@entry=0x242a818, max_clients=<optimized out>, sockfds=<optimized out>, client_wq=client_wq@entry=0x6d6180 <directordaemon::socket_workq>, nokeepalive=<optimized out>, HandleConnectionRequest=HandleConnectionRequest@entry=0x446c40 <directordaemon::HandleConnectionRequest(ConfigurationParser, void)>, config=<optimized out>, server_state=<optimized out>, server_state@entry=0x6d6160 <directordaemon::server_state>) at /usr/src/debug/bareos-18.2.5/src/lib/bnet_server_tcp.cc:356 356 newsockfd = accept(fd_ptr->fd, &cli_addr, &clilen); cnt = <optimized out> cli_addr = {sa_family = 2, sa_data = "\334\374\271\252\a\214\000\000\000\000\000\000\000"} tlog = <optimized out> value = 1 ipaddr = 0x0 newsockfd = <optimized out> clilen = 16 fd_ptr = 0x7f172e0321b0 events = 195 pfds = 0x7f172e032190 status = <optimized out> buf = "185.170.7.250", '\000' <repeats 114 times> allbuf = '\000' <repeats 1080 times>... cleanup_object = {sockfds_ = <optimized out>, client_wq_ = 0x6d6180 <directordaemon::socket_workq>} next = <optimized out> to_free = <optimized out> nfds = <optimized out> #2 0x0000000000446c34 in directordaemon::connect_thread (arg=0x242a818) at /usr/src/debug/bareos-18.2.5/src/dird/socket_server.cc:132 132 HandleConnectionRequest, my_config, &server_state); No locals. #3 0x00007f1739d2cdd5 in start_thread () from /lib64/libpthread.so.0 No symbol table info available. #4 0x00007f17385d9ead in clone () from /lib64/libc.so.6 No symbol table info available. #0 0x0000000000000000 in ?? () No symbol table info available. #0 0x0000000000000000 in ?? () No symbol table info available. #0 0x0000000000000000 in ?? () No symbol table info available. bareos.63319.traceback (12,707 bytes)
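
In a gdb traceback like the one above, the faulting thread is the one whose stack contains `<signal handler called>`, and the frame right after that marker is where the signal was raised (here `Connection::check` at connection_pool.cc:61 in Thread 4). The following is a small sketch for picking that frame out of a long dump. The function name `find_crashing_thread` is hypothetical, and the regular expressions only assume the `Thread N (Thread 0x... (LWP n)):` header and `#k` frame layout visible in this attachment.

```python
import re

# Header printed by gdb before each thread's stack, e.g.
# "Thread 4 (Thread 0x7f172d832700 (LWP 63333)):"
THREAD_HEADER = re.compile(
    r"Thread\s+(\d+)\s+\(Thread\s+0x[0-9a-f]+\s+\(LWP\s+(\d+)\)\):"
)
# The frame that follows "<signal handler called>", up to the next "#k" frame.
FAULT_FRAME = re.compile(
    r"<signal handler called>\s+#\d+\s+(.+?)(?=\s+#\d+\s|\Z)", re.DOTALL
)


def find_crashing_thread(traceback_text):
    """Return (thread_number, lwp, faulting_frame) or None if no thread
    contains a signal-handler marker."""
    headers = list(THREAD_HEADER.finditer(traceback_text))
    for i, header in enumerate(headers):
        # A thread's stack runs until the next thread header (or end of text).
        end = headers[i + 1].start() if i + 1 < len(headers) else len(traceback_text)
        body = traceback_text[header.end():end]
        match = FAULT_FRAME.search(body)
        if match:
            frame = " ".join(match.group(1).split())  # collapse whitespace
            return int(header.group(1)), int(header.group(2)), frame
    return None
```

Because the patterns match on whitespace rather than line breaks, the function should work both on the original multi-line file and on a flattened copy such as the one pasted into this issue.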

arogge 2019-05-03 11:58 manager ~0003360	From what I see you're right: you're using client initiated connection and something is wrong with the connection-pool. When the client 'tst-civ-nominatim' connects to the director the director then should add that connection to the connection pool. However, it looks like the connection pool had already been destroyed at this point. Is this reproducible with just one client? I will have to reproduce it myself so we can write a test for this, and I would be glad if there was a simple way to reproduce it.
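
One possible reading of this note together with the Thread 4 frames (`ConnectionPool::add` calling `ConnectionPool::cleanup`, which calls `Connection::check` on a `this` pointer of `0xaaaaaaaaaaaaaaaa`) is that several client connections arriving at once touch the pool concurrently; the thread does not confirm the root cause. The sketch below is not Bareos code. It borrows the class and method names from the traceback and only illustrates the general pattern of running cleanup and insertion under one lock. Python cannot reproduce the segfault itself, so this shows the serialization idea rather than the bug.

```python
import threading


class Connection:
    """Hypothetical stand-in for a pooled client connection."""

    def __init__(self, name):
        self.name = name
        self.alive = True

    def check(self):
        # A real implementation would probe the socket; here it is a flag.
        return self.alive


class ConnectionPool:
    """Illustrative pool in which cleanup and add share a single lock, so
    no thread can iterate over the list while another thread modifies it."""

    def __init__(self):
        self._lock = threading.Lock()
        self._connections = []

    def add(self, connection):
        with self._lock:
            # Drop connections that fail their check, then append the new
            # one. Both steps happen while holding the same lock.
            self._connections = [c for c in self._connections if c.check()]
            self._connections.append(connection)

    def names(self):
        with self._lock:
            return [c.name for c in self._connections]


if __name__ == "__main__":
    pool = ConnectionPool()
    workers = [
        threading.Thread(target=pool.add, args=(Connection(f"client-{i}"),))
        for i in range(50)
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    print(len(pool.names()), "connections in pool")
```

The design choice here is deliberately coarse: one lock around the whole add path trades some concurrency for the guarantee that cleanup never sees a half-updated list.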

jurgengoedbloed 2019-05-03 17:10 reporter ~0003361	I will test this after the weekend and let you know.

jurgengoedbloed 2019-05-09 15:34 reporter ~0003363	I did some tests, but at the moment the director is running fine and I cannot reproduce the crash. I'll let you know once the director crashes again.

jurgengoedbloed 2019-05-15 08:29 reporter ~0003369	I got another crash. Disabled access from all filedaemons by blocking them with iptables, except for one host. In this situation, the director keeps on running.

jurgengoedbloed 2019-05-15 08:42 reporter ~0003370	After a minute or so, I removed the iptables block rule. All clients are now connected, the director now seems to run fine.

jurgengoedbloed 2019-05-24 15:03 reporter ~0003381	It seems that we have crossed a threshold in the number of clients. I block access to the director except for a small number of clients (<30). Start the director -> runs fine and shows the client initiated connection clients. As soon as I remove the block, the director crashes. To rule out 'bad clients', I have tried blocking different parts of the network, but without success. The only success I'm having now is this: - Block all clients - Stop storage daemon (runs on the same machine) - Start director - Allow clients subnet by subnet - Remove last block - Start storage. Is there anything I can do to test?
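
The staged start described in this note can be written down as an ordered command plan. The sketch below only builds the list of commands and does not execute anything. The reporter did not post the exact commands, so everything here is an assumption: the `iptables` rules, the systemd unit names `bareos-dir` and `bareos-sd`, the use of the default director port 9101, and the example subnets are all placeholders to adapt.

```python
def staged_start_plan(subnets, port=9101):
    """Return the command sequence for a staged director start:
    block every client subnet, stop the storage daemon, start the
    director, unblock the subnets one by one, start the storage daemon.
    """
    def rule(action, subnet):
        # "-I" inserts a DROP rule for the subnet, "-D" deletes that rule.
        return ["iptables", action, "INPUT", "-p", "tcp",
                "--dport", str(port), "-s", subnet, "-j", "DROP"]

    plan = [rule("-I", subnet) for subnet in subnets]       # block all clients
    plan.append(["systemctl", "stop", "bareos-sd"])         # stop storage daemon
    plan.append(["systemctl", "start", "bareos-dir"])       # start director
    plan.extend(rule("-D", subnet) for subnet in subnets)   # allow subnet by subnet
    plan.append(["systemctl", "start", "bareos-sd"])        # start storage daemon
    return plan


if __name__ == "__main__":
    # Hypothetical client subnets, for illustration only.
    for command in staged_start_plan(["10.0.1.0/24", "10.0.2.0/24"]):
        print(" ".join(command))
```

Printing the plan first makes it easy to review the order and to add a pause between the unblocking steps before running anything as root.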

bruno-at-bareos 2023-07-05 16:28 manager ~0005154	Are you still able to reproduce this with a newer version of Bareos (21 or 22)? A lot of fixes concerning director crashes have been incorporated.

bruno-at-bareos 2023-07-14 14:20 manager ~0005178	No answer, but it doesn't seem to be reproducible with Bareos 22.1.0.

Date Modified	Username	Field	Change
2019-04-30 14:23	jurgengoedbloed	New Issue
2019-04-30 14:23	jurgengoedbloed	File Added: bareos-dir.1640.bactrace
2019-04-30 15:06	arogge	Assigned To	=> arogge
2019-04-30 15:06	arogge	Status	new => feedback
2019-04-30 15:06	arogge	Note Added: 0003349
2019-05-02 17:12	jurgengoedbloed	Note Added: 0003352
2019-05-02 17:12	jurgengoedbloed	Status	feedback => assigned
2019-05-02 17:25	jurgengoedbloed	Note Added: 0003353
2019-05-03 09:13	jurgengoedbloed	Note Added: 0003354
2019-05-03 09:58	arogge	Note Added: 0003355
2019-05-03 10:01	arogge	Status	assigned => feedback
2019-05-03 10:12	jurgengoedbloed	File Added: bareos.1640.traceback
2019-05-03 10:12	jurgengoedbloed	Note Added: 0003356
2019-05-03 10:12	jurgengoedbloed	Status	feedback => assigned
2019-05-03 10:16	arogge	Status	assigned => feedback
2019-05-03 10:16	arogge	Note Added: 0003357
2019-05-03 10:55	jurgengoedbloed	Note Added: 0003358
2019-05-03 10:55	jurgengoedbloed	Status	feedback => assigned
2019-05-03 11:29	jurgengoedbloed	File Added: bareos.63319.traceback
2019-05-03 11:29	jurgengoedbloed	Note Added: 0003359
2019-05-03 11:58	arogge	Status	assigned => acknowledged
2019-05-03 11:58	arogge	Note Added: 0003360
2019-05-03 17:10	jurgengoedbloed	Note Added: 0003361
2019-05-09 15:34	jurgengoedbloed	Note Added: 0003363
2019-05-15 08:29	jurgengoedbloed	Note Added: 0003369
2019-05-15 08:42	jurgengoedbloed	Note Added: 0003370
2019-05-24 15:03	jurgengoedbloed	Note Added: 0003381
2019-07-10 17:45	arogge	Assigned To	arogge =>
2023-07-05 16:28	bruno-at-bareos	Assigned To	=> bruno-at-bareos
2023-07-05 16:28	bruno-at-bareos	Status	acknowledged => feedback
2023-07-05 16:28	bruno-at-bareos	Note Added: 0005154
2023-07-14 14:20	bruno-at-bareos	Status	feedback => closed
2023-07-14 14:20	bruno-at-bareos	Resolution	open => no change required
2023-07-14 14:20	bruno-at-bareos	Note Added: 0005178

Reporting new Issues is disabled, please Report new Issues at https://github.com/bareos/bareos/issues
