View Issue Details

ID: 0001013
Project: bareos-core
Category: director
View Status: public
Last Update: 2019-12-18 15:43
Reporter: progserega
Assigned To: arogge
Priority: urgent
Severity: crash
Reproducibility: have not tried
Status: closed
Resolution: unable to reproduce
Platform: amd64
OS: Debian
OS Version: 9.4
Product Version: 17.2.4
Summary: 0001013: bareos-dir crashes after starting a job from the webui
Description: bareos-dir crashes after starting a job from the webui.

Steps To Reproduce:
1. Add a new client and a job for it
2. Reload bareos-dir
3. Open the webui
4. Start the backup job for the new client
5. bareos-dir crashes
Additional Information:
Proxmox 5
LXC container with Debian 9

ii bareos-bconsole 17.2.4-9.1 amd64 Backup Archiving Recovery Open Sourced - text console
ii bareos-client 17.2.4-9.1 amd64 Backup Archiving Recovery Open Sourced - client metapackage
ii bareos-common 17.2.4-9.1 amd64 Backup Archiving Recovery Open Sourced - common files
ii bareos-database-common 17.2.4-9.1 amd64 Backup Archiving Recovery Open Sourced - common catalog files
ii bareos-database-postgresql 17.2.4-9.1 amd64 Backup Archiving Recovery Open Sourced - PostgreSQL backend
ii bareos-database-tools 17.2.4-9.1 amd64 Backup Archiving Recovery Open Sourced - database tools
ii bareos-director 17.2.4-9.1 amd64 Backup Archiving Recovery Open Sourced - director daemon
ii bareos-filedaemon 17.2.4-9.1 amd64 Backup Archiving Recovery Open Sourced - file daemon
ii bareos-webui 17.2.4-15.1 all Backup Archiving Recovery Open Sourced - webui

Linux bareos 4.9.0-6-amd64 #1 SMP Debian 4.9.82-1+deb9u3 (2018-03-02) x86_64 GNU/Linux

in dmesg:

[ 6.569348] input: PC Speaker as /devices/platform/pcspkr/input/input5
[ 6.571505] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input6
[ 6.571535] ACPI: Power Button [PWRF]
[ 6.730165] EXT4-fs (vda1): mounting ext2 file system using the ext4 subsystem
[ 6.754072] EXT4-fs (vda1): mounted filesystem without journal. Opts: (null)
[ 6.805224] Adding 2097148k swap on /dev/mapper/debian--vg-swap_1. Priority:-1 extents:1 across:2097148k FS
[ 105.056668] random: crng init done
[2929435.899835] INFO: task jbd2/dm-0-8:158 blocked for more than 120 seconds.
[2929435.909741] Not tainted 4.9.0-6-amd64 #1 Debian 4.9.82-1+deb9u3
[2929435.910333] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[2929435.910696] jbd2/dm-0-8 D 0 158 2 0x00000000
[2929435.910700] ffff950778b2f800 0000000000000000 ffff95073689e0c0 ffff95077fc18940
[2929435.910703] ffffffff93211500 ffffadecc0523b30 ffffffff92c0c649 0000000000000001
[2929435.910705] 00ffffff928fdbc4 ffff95077fc18940 ffff950700000000 ffff95073689e0c0
[2929435.910707] Call Trace:
[2929435.910713] [<ffffffff92c0c649>] ? __schedule+0x239/0x6f0
[2929435.910715] [<ffffffff92c0d2f0>] ? bit_wait+0x50/0x50
[2929435.910717] [<ffffffff92c0cb32>] ? schedule+0x32/0x80
[2929435.910719] [<ffffffff92c0febd>] ? schedule_timeout+0x1dd/0x380
[2929435.910721] [<ffffffff926a9613>] ? update_load_avg+0x73/0x360
[2929435.910722] [<ffffffff926a9613>] ? update_load_avg+0x73/0x360
[2929435.910725] [<ffffffff92658e5a>] ? kvm_clock_get_cycles+0x1a/0x20
[2929435.910736] [<ffffffff926ed74e>] ? ktime_get+0x3e/0xb0
[2929435.910738] [<ffffffff92c0d2f0>] ? bit_wait+0x50/0x50
[2929435.910739] [<ffffffff92c0c3ad>] ? io_schedule_timeout+0x9d/0x100
[2929435.910741] [<ffffffff926b9887>] ? prepare_to_wait+0x57/0x80
[2929435.910743] [<ffffffff92c0d307>] ? bit_wait_io+0x17/0x60
[2929435.910744] [<ffffffff92c0cec5>] ? __wait_on_bit+0x55/0x80
[2929435.910746] [<ffffffff9269e4c6>] ? finish_task_switch+0x76/0x200
[2929435.910748] [<ffffffff92c0d2f0>] ? bit_wait+0x50/0x50
[2929435.910749] [<ffffffff92c0d02e>] ? out_of_line_wait_on_bit+0x7e/0xa0
[2929435.910751] [<ffffffff926b9cf0>] ? wake_atomic_t_function+0x60/0x60
[2929435.910758] [<ffffffffc04bbf85>] ? jbd2_journal_commit_transaction+0xf55/0x17b0 [jbd2]
[2929435.910760] [<ffffffff9269e4c6>] ? finish_task_switch+0x76/0x200
[2929435.910764] [<ffffffffc04c0c02>] ? kjournald2+0xc2/0x260 [jbd2]
[2929435.910765] [<ffffffff926b9c50>] ? prepare_to_wait_event+0xf0/0xf0
[2929435.910768] [<ffffffffc04c0b40>] ? commit_timeout+0x10/0x10 [jbd2]
[2929435.910771] [<ffffffff926970c9>] ? kthread+0xd9/0xf0
[2929435.910773] [<ffffffff92696ff0>] ? kthread_park+0x60/0x60
[2929435.910775] [<ffffffff9267c3d0>] ? SyS_exit_group+0x10/0x10
[2929435.910776] [<ffffffff92c11537>] ? ret_from_fork+0x57/0x70

in syslog:

Sep 30 21:51:15 bareos systemd-timesyncd[370]: Synchronized to time server 94.100.192.29:123 (3.debian.pool.ntp.org).
Sep 30 22:09:01 bareos CRON[10341]: (root) CMD ( [ -x /usr/lib/php/sessionclean ] && if [ ! -d /run/systemd/system ]; then /usr/lib/php/sessionclean; fi)
Sep 30 22:09:10 bareos systemd[1]: Starting Clean php session files...
Sep 30 22:09:10 bareos systemd[1]: Started Clean php session files.
Sep 30 22:12:39 bareos puppet-agent[10618]: Applied catalog in 21.57 seconds
Sep 30 22:17:01 bareos CRON[11401]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Sep 30 22:22:06 bareos bareos-dir: BAREOS interrupted by signal 7: BUS error
Sep 30 22:33:34 bareos systemd[1]: Stopping Puppet agent...
Sep 30 22:33:34 bareos puppet-agent[16966]: Caught TERM; exiting
Sep 30 22:33:34 bareos systemd[1]: Stopped Puppet agent.
Sep 30 22:39:01 bareos CRON[12635]: (root) CMD ( [ -x /usr/lib/php/sessionclean ] && if [ ! -d /run/systemd/system ]; then /usr/lib/php/sessionclean; fi)
Tags: No tags attached.

Activities

progserega

2018-09-30 14:52

reporter  

progserega

2018-09-30 15:04

reporter   ~0003125

Starting program: /usr/sbin/bareos-dir -f
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff41bc700 (LWP 13925)]
[New Thread 0x7fffebfff700 (LWP 13927)]
[New Thread 0x7fffeb7fe700 (LWP 13928)]
[New Thread 0x7fffeaffd700 (LWP 13929)]
[New Thread 0x7fffea7fc700 (LWP 13937)]
[New Thread 0x7fffe9ffb700 (LWP 13945)]
[Thread 0x7fffe9ffb700 (LWP 13945) exited]
[Thread 0x7fffea7fc700 (LWP 13937) exited]
[New Thread 0x7fffea7fc700 (LWP 13949)]
[New Thread 0x7fffe9ffb700 (LWP 13950)]
[New Thread 0x7fffe97fa700 (LWP 13960)]
[New Thread 0x7fffe8ff9700 (LWP 13962)]
[New Thread 0x7fffcbfff700 (LWP 13964)]
[Thread 0x7fffea7fc700 (LWP 13949) exited]
[Thread 0x7fffcbfff700 (LWP 13964) exited]
[Thread 0x7fffe8ff9700 (LWP 13962) exited]
[Thread 0x7fffe97fa700 (LWP 13960) exited]
[Thread 0x7fffe9ffb700 (LWP 13950) exited]
[New Thread 0x7fffe9ffb700 (LWP 13968)]
[Thread 0x7fffe9ffb700 (LWP 13968) exited]
[New Thread 0x7fffe9ffb700 (LWP 13969)]
[Thread 0x7fffe9ffb700 (LWP 13969) exited]
[New Thread 0x7fffe9ffb700 (LWP 13971)]
[Thread 0x7fffe9ffb700 (LWP 13971) exited]
[New Thread 0x7fffe9ffb700 (LWP 13972)]
[New Thread 0x7fffcbfff700 (LWP 13974)]
[Thread 0x7fffe9ffb700 (LWP 13972) exited]
[Thread 0x7fffcbfff700 (LWP 13974) exited]
[New Thread 0x7fffcbfff700 (LWP 13977)]
[New Thread 0x7fffe9ffb700 (LWP 13980)]
[Thread 0x7fffe9ffb700 (LWP 13980) exited]
[Thread 0x7fffcbfff700 (LWP 13977) exited]
[New Thread 0x7fffcbfff700 (LWP 13984)]
[New Thread 0x7fffe9ffb700 (LWP 13985)]
[Thread 0x7fffcbfff700 (LWP 13984) exited]
[New Thread 0x7fffcbfff700 (LWP 13989)]
[New Thread 0x7fffe8ff9700 (LWP 13990)]
[Thread 0x7fffe8ff9700 (LWP 13990) exited]
[New Thread 0x7fffe8ff9700 (LWP 14001)]
[New Thread 0x7fffe97fa700 (LWP 14010)]
[New Thread 0x7fffea7fc700 (LWP 14012)]
[New Thread 0x7fffcb7fe700 (LWP 14016)]
[Thread 0x7fffe97fa700 (LWP 14010) exited]
[Thread 0x7fffcb7fe700 (LWP 14016) exited]
[Thread 0x7fffea7fc700 (LWP 14012) exited]
[Thread 0x7fffe8ff9700 (LWP 14001) exited]
[New Thread 0x7fffe8ff9700 (LWP 14032)]
[Thread 0x7fffe8ff9700 (LWP 14032) exited]
[New Thread 0x7fffe8ff9700 (LWP 14086)]
[Thread 0x7fffe8ff9700 (LWP 14086) exited]
[New Thread 0x7fffe8ff9700 (LWP 14144)]
[New Thread 0x7fffcb7fe700 (LWP 14146)]
[New Thread 0x7fffea7fc700 (LWP 14148)]
[Thread 0x7fffea7fc700 (LWP 14148) exited]
[Thread 0x7fffcb7fe700 (LWP 14146) exited]
[New Thread 0x7fffea7fc700 (LWP 14161)]
[Thread 0x7fffea7fc700 (LWP 14161) exited]
[Thread 0x7fffe8ff9700 (LWP 14144) exited]
[New Thread 0x7fffe8ff9700 (LWP 14191)]
[Thread 0x7fffe8ff9700 (LWP 14191) exited]

Thread 5 "bareos-dir" received signal SIGUSR2, User defined signal 2.
[Switching to Thread 0x7fffeaffd700 (LWP 13929)]
0x00007ffff56f57dd in nanosleep () from /lib/x86_64-linux-gnu/libpthread.so.0
(gdb) bt full
#0 0x00007ffff56f57dd in nanosleep () from /lib/x86_64-linux-gnu/libpthread.so.0
No symbol table info available.
#1 0x00007ffff6ea51f4 in bmicrosleep(int, int) () from /usr/lib/bareos/libbareos-17.2.4.so
No symbol table info available.
#2 0x00007ffff6ecf0c8 in register_watchdog(s_watchdog_t*) () from /usr/lib/bareos/libbareos-17.2.4.so
No symbol table info available.
#3 0x00007ffff6ea8107 in start_thread_timer(JCR*, unsigned long, unsigned int) () from /usr/lib/bareos/libbareos-17.2.4.so
No symbol table info available.
#4 0x00007ffff6ea37c0 in BSOCK_TCP::connect(JCR*, int, long, long, char const*, char*, char*, int, bool) () from /usr/lib/bareos/libbareos-17.2.4.so
No symbol table info available.
#5 0x00005555555a74cd in ?? ()
No symbol table info available.
#6 0x00005555555aa48c in ?? ()
No symbol table info available.
#7 0x00007ffff6eb5d9f in lmgr_thread_launcher () from /usr/lib/bareos/libbareos-17.2.4.so
No symbol table info available.
#8 0x00007ffff56ec494 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
No symbol table info available.
#9 0x00007ffff498dacf in clone () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
(gdb)
progserega

2018-09-30 15:28

reporter   ~0003126

This happens when the client's IP address is set incorrectly in the director's client configuration file, so bareos-dir cannot connect to the client.
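
To rule out a wrong client address before starting a job, a quick reachability check can help. This is only a sketch: the function name `can_connect` is made up for illustration, and port 9102 is assumed because it is the usual default port of the Bareos file daemon; adjust it if the client uses another port.

```python
import socket


def can_connect(host: str, port: int = 9102, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper, not part of Bareos. The default port 9102 is an
    assumption (usual file daemon default).
    """
    try:
        # create_connection resolves the host and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


# Example call (192.0.2.10 is a placeholder documentation address):
# print(can_connect("192.0.2.10"))
```

A False result only shows that the port is not reachable from the director host; it does not explain why the director crashes instead of failing the job.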
progserega

2018-09-30 15:37

reporter   ~0003127

After fixing the client's IP, the job runs successfully, but then the director crashes again:

Sep 30 23:07:56 bareos systemd[1]: Starting Bareos Director Daemon service...
Sep 30 23:07:56 bareos systemd[1]: bareos-director.service: PID file /var/lib/bareos/bareos-dir.9101.pid not readable (yet?) after start: No such file or directory
Sep 30 23:07:56 bareos systemd[1]: Started Bareos Director Daemon service.
Sep 30 23:24:56 bareos bareos-dir[14463]: bsock_tcp.c:407 Write error sending -1 bytes to client:127.0.0.1:9101: ERR=Broken pipe
Sep 30 23:27:01 bareos bareos-dir[14463]: bsock_tcp.c:407 Write error sending -1 bytes to client:127.0.0.1:9101: ERR=Broken pipe
Sep 30 23:27:56 bareos bareos-dir[14463]: BAREOS interrupted by signal 7: BUS error
progserega

2018-09-30 15:43

reporter  

progserega

2018-09-30 15:46

reporter   ~0003128

Link to bareos-dir.core.14463.gz
https://yadi.sk/d/VN-HBejjny-Dhw
arogge

2019-08-01 10:20

manager   ~0003549

If you still want this investigated, we will need a meaningful traceback.
For this you need to install the debug-packages (bareos-dbg on Debian) in addition to gdb.
The traceback file will then contain a lot more information and will allow us to investigate the issue further.
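
As a rough illustration of collecting such a traceback from an existing core file once the debug packages are installed, something like the following could work. It is a sketch under assumptions: the helper names and the output path are invented for the example, the core file must already be uncompressed, and it relies only on gdb's standard `--batch` and `-ex` options.

```python
import subprocess
from pathlib import Path
from typing import List


def gdb_backtrace_command(binary: str, core: str) -> List[str]:
    """Build a gdb command line that prints full backtraces of all threads."""
    # --batch: exit after running the commands; -ex: run one gdb command.
    return ["gdb", "--batch", "-ex", "thread apply all bt full", binary, core]


def collect_backtrace(binary: str, core: str, out_file: str) -> int:
    """Run gdb on the core file and save its output (hypothetical helper)."""
    result = subprocess.run(
        gdb_backtrace_command(binary, core),
        capture_output=True,
        text=True,
        check=False,
    )
    # Keep stderr too, since gdb reports missing symbols there.
    Path(out_file).write_text(result.stdout + result.stderr)
    return result.returncode


# Example call; the core and output paths are placeholders:
# collect_backtrace("/usr/sbin/bareos-dir", "/tmp/bareos-dir.core", "/tmp/bt.txt")
```

If the output still shows "No symbol table info available." for the Bareos frames, the debug symbols are most likely not installed or do not match the running version.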
arogge

2019-12-18 15:43

manager   ~0003695

Closing due to no response to the feedback request.

Issue History

Date Modified Username Field Change
2018-09-30 14:52 progserega New Issue
2018-09-30 14:52 progserega File Added: bareos-dir.20528.bactrace
2018-09-30 15:04 progserega Note Added: 0003125
2018-09-30 15:28 progserega Note Added: 0003126
2018-09-30 15:37 progserega Note Added: 0003127
2018-09-30 15:43 progserega File Added: bareos.14463.traceback
2018-09-30 15:46 progserega Note Added: 0003128
2019-08-01 10:20 arogge Status new => feedback
2019-08-01 10:20 arogge Note Added: 0003549
2019-12-18 15:43 arogge Assigned To => arogge
2019-12-18 15:43 arogge Status feedback => resolved
2019-12-18 15:43 arogge Resolution open => unable to reproduce
2019-12-18 15:43 arogge Status resolved => closed
2019-12-18 15:43 arogge Note Added: 0003695