View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0001013 | bareos-core | director | public | 2018-09-30 14:52 | 2019-12-18 15:43 |
Reporter | progserega | Assigned To | arogge | ||
Priority | urgent | Severity | crash | Reproducibility | have not tried |
Status | closed | Resolution | unable to reproduce | ||
Platform | amd64 | OS | Debian | OS Version | 9.4 |
Product Version | 17.2.4 | ||||
Summary | 0001013: bareos-dir crashes after starting a job from the webui | ||||
Description | bareos-dir crashes after starting a job from the webui. | ||||
Steps To Reproduce | 1. Add a new client and a job for it 2. Reload bareos-dir 3. Open the webui 4. Start a backup job for the new client 5. bareos-dir crashes | ||||
Additional Information | Proxmox 5 lxc container with debian 9 ii bareos-bconsole 17.2.4-9.1 amd64 Backup Archiving Recovery Open Sourced - text console ii bareos-client 17.2.4-9.1 amd64 Backup Archiving Recovery Open Sourced - client metapackage ii bareos-common 17.2.4-9.1 amd64 Backup Archiving Recovery Open Sourced - common files ii bareos-database-common 17.2.4-9.1 amd64 Backup Archiving Recovery Open Sourced - common catalog files ii bareos-database-postgresql 17.2.4-9.1 amd64 Backup Archiving Recovery Open Sourced - PostgreSQL backend ii bareos-database-tools 17.2.4-9.1 amd64 Backup Archiving Recovery Open Sourced - database tools ii bareos-director 17.2.4-9.1 amd64 Backup Archiving Recovery Open Sourced - director daemon ii bareos-filedaemon 17.2.4-9.1 amd64 Backup Archiving Recovery Open Sourced - file daemon ii bareos-webui 17.2.4-15.1 all Backup Archiving Recovery Open Sourced - webui Linux bareos 4.9.0-6-amd64 #1 SMP Debian 4.9.82-1+deb9u3 (2018-03-02) x86_64 GNU/Linux in dmesg: [ 6.569348] input: PC Speaker as /devices/platform/pcspkr/input/input5 [ 6.571505] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input6 [ 6.571535] ACPI: Power Button [PWRF] [ 6.730165] EXT4-fs (vda1): mounting ext2 file system using the ext4 subsystem [ 6.754072] EXT4-fs (vda1): mounted filesystem without journal. Opts: (null) [ 6.805224] Adding 2097148k swap on /dev/mapper/debian--vg-swap_1. Priority:-1 extents:1 across:2097148k FS [ 105.056668] random: crng init done [2929435.899835] INFO: task jbd2/dm-0-8:158 blocked for more than 120 seconds. [2929435.909741] Not tainted 4.9.0-6-amd64 #1 Debian 4.9.82-1+deb9u3 [2929435.910333] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[2929435.910696] jbd2/dm-0-8 D 0 158 2 0x00000000 [2929435.910700] ffff950778b2f800 0000000000000000 ffff95073689e0c0 ffff95077fc18940 [2929435.910703] ffffffff93211500 ffffadecc0523b30 ffffffff92c0c649 0000000000000001 [2929435.910705] 00ffffff928fdbc4 ffff95077fc18940 ffff950700000000 ffff95073689e0c0 [2929435.910707] Call Trace: [2929435.910713] [<ffffffff92c0c649>] ? __schedule+0x239/0x6f0 [2929435.910715] [<ffffffff92c0d2f0>] ? bit_wait+0x50/0x50 [2929435.910717] [<ffffffff92c0cb32>] ? schedule+0x32/0x80 [2929435.910719] [<ffffffff92c0febd>] ? schedule_timeout+0x1dd/0x380 [2929435.910721] [<ffffffff926a9613>] ? update_load_avg+0x73/0x360 [2929435.910722] [<ffffffff926a9613>] ? update_load_avg+0x73/0x360 [2929435.910725] [<ffffffff92658e5a>] ? kvm_clock_get_cycles+0x1a/0x20 [2929435.910736] [<ffffffff926ed74e>] ? ktime_get+0x3e/0xb0 [2929435.910738] [<ffffffff92c0d2f0>] ? bit_wait+0x50/0x50 [2929435.910739] [<ffffffff92c0c3ad>] ? io_schedule_timeout+0x9d/0x100 [2929435.910741] [<ffffffff926b9887>] ? prepare_to_wait+0x57/0x80 [2929435.910743] [<ffffffff92c0d307>] ? bit_wait_io+0x17/0x60 [2929435.910744] [<ffffffff92c0cec5>] ? __wait_on_bit+0x55/0x80 [2929435.910746] [<ffffffff9269e4c6>] ? finish_task_switch+0x76/0x200 [2929435.910748] [<ffffffff92c0d2f0>] ? bit_wait+0x50/0x50 [2929435.910749] [<ffffffff92c0d02e>] ? out_of_line_wait_on_bit+0x7e/0xa0 [2929435.910751] [<ffffffff926b9cf0>] ? wake_atomic_t_function+0x60/0x60 [2929435.910758] [<ffffffffc04bbf85>] ? jbd2_journal_commit_transaction+0xf55/0x17b0 [jbd2] [2929435.910760] [<ffffffff9269e4c6>] ? finish_task_switch+0x76/0x200 [2929435.910764] [<ffffffffc04c0c02>] ? kjournald2+0xc2/0x260 [jbd2] [2929435.910765] [<ffffffff926b9c50>] ? prepare_to_wait_event+0xf0/0xf0 [2929435.910768] [<ffffffffc04c0b40>] ? commit_timeout+0x10/0x10 [jbd2] [2929435.910771] [<ffffffff926970c9>] ? kthread+0xd9/0xf0 [2929435.910773] [<ffffffff92696ff0>] ? kthread_park+0x60/0x60 [2929435.910775] [<ffffffff9267c3d0>] ? 
SyS_exit_group+0x10/0x10 [2929435.910776] [<ffffffff92c11537>] ? ret_from_fork+0x57/0x70 in syslog: Sep 30 21:51:15 bareos systemd-timesyncd[370]: Synchronized to time server 94.100.192.29:123 (3.debian.pool.ntp.org). Sep 30 22:09:01 bareos CRON[10341]: (root) CMD ( [ -x /usr/lib/php/sessionclean ] && if [ ! -d /run/systemd/system ]; then /usr/lib/php/sessionclean; fi) Sep 30 22:09:10 bareos systemd[1]: Starting Clean php session files... Sep 30 22:09:10 bareos systemd[1]: Started Clean php session files. Sep 30 22:12:39 bareos puppet-agent[10618]: Applied catalog in 21.57 seconds Sep 30 22:17:01 bareos CRON[11401]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly) Sep 30 22:22:06 bareos bareos-dir: BAREOS interrupted by signal 7: BUS error Sep 30 22:33:34 bareos systemd[1]: Stopping Puppet agent... Sep 30 22:33:34 bareos puppet-agent[16966]: Caught TERM; exiting Sep 30 22:33:34 bareos systemd[1]: Stopped Puppet agent. Sep 30 22:39:01 bareos CRON[12635]: (root) CMD ( [ -x /usr/lib/php/sessionclean ] && if [ ! -d /run/systemd/system ]; then /usr/lib/php/sessionclean; fi) | ||||
Tags | No tags attached. | ||||
Starting program: /usr/sbin/bareos-dir -f [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". [New Thread 0x7ffff41bc700 (LWP 13925)] [New Thread 0x7fffebfff700 (LWP 13927)] [New Thread 0x7fffeb7fe700 (LWP 13928)] [New Thread 0x7fffeaffd700 (LWP 13929)] [New Thread 0x7fffea7fc700 (LWP 13937)] [New Thread 0x7fffe9ffb700 (LWP 13945)] [Thread 0x7fffe9ffb700 (LWP 13945) exited] [Thread 0x7fffea7fc700 (LWP 13937) exited] [New Thread 0x7fffea7fc700 (LWP 13949)] [New Thread 0x7fffe9ffb700 (LWP 13950)] [New Thread 0x7fffe97fa700 (LWP 13960)] [New Thread 0x7fffe8ff9700 (LWP 13962)] [New Thread 0x7fffcbfff700 (LWP 13964)] [Thread 0x7fffea7fc700 (LWP 13949) exited] [Thread 0x7fffcbfff700 (LWP 13964) exited] [Thread 0x7fffe8ff9700 (LWP 13962) exited] [Thread 0x7fffe97fa700 (LWP 13960) exited] [Thread 0x7fffe9ffb700 (LWP 13950) exited] [New Thread 0x7fffe9ffb700 (LWP 13968)] [Thread 0x7fffe9ffb700 (LWP 13968) exited] [New Thread 0x7fffe9ffb700 (LWP 13969)] [Thread 0x7fffe9ffb700 (LWP 13969) exited] [New Thread 0x7fffe9ffb700 (LWP 13971)] [Thread 0x7fffe9ffb700 (LWP 13971) exited] [New Thread 0x7fffe9ffb700 (LWP 13972)] [New Thread 0x7fffcbfff700 (LWP 13974)] [Thread 0x7fffe9ffb700 (LWP 13972) exited] [Thread 0x7fffcbfff700 (LWP 13974) exited] [New Thread 0x7fffcbfff700 (LWP 13977)] [New Thread 0x7fffe9ffb700 (LWP 13980)] [Thread 0x7fffe9ffb700 (LWP 13980) exited] [Thread 0x7fffcbfff700 (LWP 13977) exited] [New Thread 0x7fffcbfff700 (LWP 13984)] [New Thread 0x7fffe9ffb700 (LWP 13985)] [Thread 0x7fffcbfff700 (LWP 13984) exited] [New Thread 0x7fffcbfff700 (LWP 13989)] [New Thread 0x7fffe8ff9700 (LWP 13990)] [Thread 0x7fffe8ff9700 (LWP 13990) exited] [New Thread 0x7fffe8ff9700 (LWP 14001)] [New Thread 0x7fffe97fa700 (LWP 14010)] [New Thread 0x7fffea7fc700 (LWP 14012)] [New Thread 0x7fffcb7fe700 (LWP 14016)] [Thread 0x7fffe97fa700 (LWP 14010) exited] [Thread 0x7fffcb7fe700 (LWP 14016) exited] [Thread 0x7fffea7fc700 
(LWP 14012) exited] [Thread 0x7fffe8ff9700 (LWP 14001) exited] [New Thread 0x7fffe8ff9700 (LWP 14032)] [Thread 0x7fffe8ff9700 (LWP 14032) exited] [New Thread 0x7fffe8ff9700 (LWP 14086)] [Thread 0x7fffe8ff9700 (LWP 14086) exited] [New Thread 0x7fffe8ff9700 (LWP 14144)] [New Thread 0x7fffcb7fe700 (LWP 14146)] [New Thread 0x7fffea7fc700 (LWP 14148)] [Thread 0x7fffea7fc700 (LWP 14148) exited] [Thread 0x7fffcb7fe700 (LWP 14146) exited] [New Thread 0x7fffea7fc700 (LWP 14161)] [Thread 0x7fffea7fc700 (LWP 14161) exited] [Thread 0x7fffe8ff9700 (LWP 14144) exited] [New Thread 0x7fffe8ff9700 (LWP 14191)] [Thread 0x7fffe8ff9700 (LWP 14191) exited] Thread 5 "bareos-dir" received signal SIGUSR2, User defined signal 2. [Switching to Thread 0x7fffeaffd700 (LWP 13929)] 0x00007ffff56f57dd in nanosleep () from /lib/x86_64-linux-gnu/libpthread.so.0 (gdb) bt full #0 0x00007ffff56f57dd in nanosleep () from /lib/x86_64-linux-gnu/libpthread.so.0 No symbol table info available. #1 0x00007ffff6ea51f4 in bmicrosleep(int, int) () from /usr/lib/bareos/libbareos-17.2.4.so No symbol table info available. #2 0x00007ffff6ecf0c8 in register_watchdog(s_watchdog_t*) () from /usr/lib/bareos/libbareos-17.2.4.so No symbol table info available. #3 0x00007ffff6ea8107 in start_thread_timer(JCR*, unsigned long, unsigned int) () from /usr/lib/bareos/libbareos-17.2.4.so No symbol table info available. #4 0x00007ffff6ea37c0 in BSOCK_TCP::connect(JCR*, int, long, long, char const*, char*, char*, int, bool) () from /usr/lib/bareos/libbareos-17.2.4.so No symbol table info available. #5 0x00005555555a74cd in ?? () No symbol table info available. #6 0x00005555555aa48c in ?? () No symbol table info available. #7 0x00007ffff6eb5d9f in lmgr_thread_launcher () from /usr/lib/bareos/libbareos-17.2.4.so No symbol table info available. #8 0x00007ffff56ec494 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0 No symbol table info available. 
#9 0x00007ffff498dacf in clone () from /lib/x86_64-linux-gnu/libc.so.6 No symbol table info available. (gdb) |
|
This happens when the client file in bacula-dir contains a wrong IP address for the client, which prevents bareos-dir from connecting to it. | |
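The condition described in this note (a wrong client IP, so the director cannot reach the client) can be checked before starting a job. The following is a minimal sketch in Python, not part of Bareos: `can_connect` is a hypothetical helper name, and the example assumes the client's file daemon listens on TCP port 9102, the Bareos default, which may differ in a given setup.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; it only tests TCP reachability and
    does not speak the Bareos protocol or authenticate.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Example call (192.0.2.10 is a placeholder documentation address, and 9102 is
# assumed to be the file daemon port):
#   can_connect("192.0.2.10", 9102, timeout=3.0)
```

A `False` result only indicates that the address or port is unreachable from the director host; it does not explain the BUS error itself.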
After fixing the client IP, the job runs successfully, but then the director crashes again: Sep 30 23:07:56 bareos systemd[1]: Starting Bareos Director Daemon service... Sep 30 23:07:56 bareos systemd[1]: bareos-director.service: PID file /var/lib/bareos/bareos-dir.9101.pid not readable (yet?) after start: No such file or directory Sep 30 23:07:56 bareos systemd[1]: Started Bareos Director Daemon service. Sep 30 23:24:56 bareos bareos-dir[14463]: bsock_tcp.c:407 Write error sending -1 bytes to client:127.0.0.1:9101: ERR=Broken pipe Sep 30 23:27:01 bareos bareos-dir[14463]: bsock_tcp.c:407 Write error sending -1 bytes to client:127.0.0.1:9101: ERR=Broken pipe Sep 30 23:27:56 bareos bareos-dir[14463]: BAREOS interrupted by signal 7: BUS error |
|
Link to bareos-dir.core.14463.gz https://yadi.sk/d/VN-HBejjny-Dhw |
|
If you still want this investigated, we will need a meaningful traceback. For this you need to install the debug packages (bareos-dbg on Debian) in addition to gdb. The traceback file will then contain much more information and allow us to investigate the issue further. |
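Once gdb and the debug packages are installed, a full backtrace can also be extracted from an existing core file. The sketch below, in Python, only builds the gdb command line; `gdb_backtrace_command` is a hypothetical helper name, the core file path is a placeholder (a gzipped core must be decompressed first), and `/usr/sbin/bareos-dir` is the binary path shown in the gdb session above.

```python
import shlex
from typing import List


def gdb_backtrace_command(binary: str, core_file: str) -> List[str]:
    """Build a gdb invocation that prints full backtraces for all threads.

    Hypothetical helper for illustration. It uses gdb's -batch mode with a
    single -ex command, so gdb exits after printing the backtraces.
    """
    return ["gdb", "-batch", "-ex", "thread apply all bt full", binary, core_file]


# Placeholder core file name; substitute the decompressed core file.
cmd = gdb_backtrace_command("/usr/sbin/bareos-dir", "bareos-dir.core")

# Shell-quoted form, suitable for pasting into a terminal.
print(shlex.join(cmd))
```

The list form can be passed to `subprocess.run(cmd, capture_output=True, text=True)` to save the output to a file for attaching to the issue.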
|
Closing due to no response to the feedback request. | |
Date Modified | Username | Field | Change |
---|---|---|---|
2018-09-30 14:52 | progserega | New Issue | |
2018-09-30 14:52 | progserega | File Added: bareos-dir.20528.bactrace | |
2018-09-30 15:04 | progserega | Note Added: 0003125 | |
2018-09-30 15:28 | progserega | Note Added: 0003126 | |
2018-09-30 15:37 | progserega | Note Added: 0003127 | |
2018-09-30 15:43 | progserega | File Added: bareos.14463.traceback | |
2018-09-30 15:46 | progserega | Note Added: 0003128 | |
2019-08-01 10:20 | arogge | Status | new => feedback |
2019-08-01 10:20 | arogge | Note Added: 0003549 | |
2019-12-18 15:43 | arogge | Assigned To | => arogge |
2019-12-18 15:43 | arogge | Status | feedback => resolved |
2019-12-18 15:43 | arogge | Resolution | open => unable to reproduce |
2019-12-18 15:43 | arogge | Status | resolved => closed |
2019-12-18 15:43 | arogge | Note Added: 0003695 |