View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0000552 | bareos-core | storage daemon | public | 2015-11-05 23:50 | 2019-12-18 15:25 |
Reporter | avantsysadm@avant.ca | Assigned To | |||
Priority | normal | Severity | crash | Reproducibility | sometimes |
Status | closed | Resolution | fixed | ||
Platform | Linux | OS | CentOS | OS Version | 6 |
Product Version | 15.4.0 | ||||
Summary | 0000552: SD crashes in -current | ||||
Description | While attempting to reproduce a next-tape-selection problem on the mailing list, my SD crashed. Maybe this backtrace is useful. I was trying to run a "status storage=TL1000" at the time. | ||||
Additional Information | Created /var/lib/bareos/bareos-sd.core.7501 for doing postmortem debugging
Missing separate debuginfo for
Try: yum --enablerepo='*-debug*' install /usr/lib/debug/.build-id/fa/be1ca508dffca0ce7e6bffdc6197edd22e4583
[New Thread 7503]
[New Thread 7505]
[New Thread 7506]
[New Thread 18715]
[New Thread 7501]
[Thread debugging using libthread_db enabled]
Core was generated by `/usr/sbin/bareos-sd -g bareos -c /etc/bareos/bareos-sd.conf'.
#0 0x00007fefedf13fbd in nanosleep () at ../sysdeps/unix/syscall-template.S:82
82 T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
$1 = '\000' <repeats 127 times>
$2 = 0x2027068 "bareos-sd"
$3 = 0x20270a8 "/usr/sbin/bareos-sd"
$4 = 0x0
$5 = 0x7fefef0156d2 "15.4.0 (03 October 2015)"
$6 = 0x7fefef0156eb "x86_64-redhat-linux-gnu"
$7 = 0x7fefef015703 "redhat"
$8 = 0x7fefef01570a "CentOS release 6.6 (Final)"
$9 = "backup1.ad.avant.ca", '\000' <repeats 236 times>
$10 = 0x7fefef015c48 "redhat CentOS release 6.6 (Final)"
Environment variable "TestName" not defined.

#0 0x00007fefedf13fbd in nanosleep () at ../sysdeps/unix/syscall-template.S:82
#1 0x00007fefeefe51f2 in bmicrosleep (sec=30, usec=0) at bsys.c:171
#2 0x00007fefeeff5f31 in check_deadlock () at lockmgr.c:566
#3 0x00007fefedf0ca51 in start_thread (arg=0x7fefe596c700) at pthread_create.c:301
#4 0x00007fefecea693d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 5 (Thread 0x7fefefb1e7e0 (LWP 7501)):
#0 0x00007fefece9d113 in __poll (fds=<value optimized out>, nfds=<value optimized out>, timeout=<value optimized out>) at ../sysdeps/unix/sysv/linux/poll.c:87
#1 0x00007fefeefd9386 in bnet_thread_server_tcp (addr_list=0x204d198, max_clients=1076814496, sockfds=0x204c698, client_wq=0x628540, nokeepalive=false, handle_client_request=0x1) at bnet_server_tcp.c:298
#2 0x000000000041d78e in main (argc=<value optimized out>, argv=<value optimized out>) at stored.c:325

Thread 4 (Thread 0x7fefdffff700 (LWP 18715)):
#0 0x00007fefedf1432d in __libc_waitpid (pid=<value optimized out>, stat_loc=<value optimized out>, options=<value optimized out>) at ../sysdeps/unix/sysv/linux/waitpid.c:41
#1 0x00007fefef0054d1 in signal_handler (sig=11) at signal.c:240
#2 <signal handler called>
#3 smart_alloc_msg (file=<value optimized out>, line=229, fmt=0x7fefef018580 "Overrun buffer: len=%d addr=%p allocated: %s:%d called from %s:%d\n") at smartall.c:113
#4 0x00007fefef006792 in sm_free (file=0x7fefef015275 "mem_pool.c", line=254, fp=0x7fefb4001248) at smartall.c:230
#5 0x00007fefeeff7004 in sm_free_pool_memory (fname=<value optimized out>, lineno=<value optimized out>, obuf=0x7fefb4001260 "") at mem_pool.c:254
#6 0x00007fefef45ca6a in DEVICE::set_blocksizes (this=0x7fefd8003c18, dcr=0x7fefb4112e78) at dev.c:484
#7 0x00007fefef462c43 in read_dev_volume_label (dcr=0x7fefb4112e78) at label.c:286
#8 0x00007fefef464fcd in DCR::check_volume_label (this=0x7fefb4112e78, ask=@0x7fefdfffe8bf, autochanger=@0x7fefdfffe8be) at mount.c:431
#9 0x00007fefef465cce in DCR::mount_next_write_volume (this=0x7fefb4112e78) at mount.c:259
#10 0x00007fefef44f0dc in acquire_device_for_append (dcr=0x7fefb4112e78) at acquire.c:436
#11 0x000000000040892c in do_append_data (jcr=0x7fefb4001a18, bs=0x204d198, what=0x420750 "FD") at append.c:76
#12 0x00000000004114c3 in append_data_cmd (jcr=0x7fefb4001a18) at fd_cmds.c:269
#13 0x0000000000410c99 in do_fd_commands (jcr=0x7fefb4001a18) at fd_cmds.c:225
#14 0x0000000000411640 in run_job (jcr=0x7fefb4001a18) at fd_cmds.c:181
#15 0x0000000000412757 in do_job_run (jcr=0x7fefb4001a18) at job.c:237
#16 0x00000000004109cf in handle_director_connection (dir=0x2054588) at dir_cmd.c:286
#17 0x00000000004198ab in handle_connection_request (arg=0x2054588) at socket_server.c:99
#18 0x00007fefef00f77d in workq_server (arg=0x628540) at workq.c:335
#19 0x00007fefeeff5e6d in lmgr_thread_launcher (x=0x204c6f8) at lockmgr.c:926
#20 0x00007fefedf0ca51 in start_thread (arg=0x7fefdffff700) at pthread_create.c:301
#21 0x00007fefecea693d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 3 (Thread 0x7fefdebfd700 (LWP 7506)):
#0 0x00007fefece9d113 in __poll (fds=<value optimized out>, nfds=<value optimized out>, timeout=<value optimized out>) at ../sysdeps/unix/sysv/linux/poll.c:87
#1 0x0000000000416edc in ndmp_thread_server (arg=0x628310) at ndmp_tape.c:1467
#2 0x00007fefeeff5e6d in lmgr_thread_launcher (x=0x204d678) at lockmgr.c:926
#3 0x00007fefedf0ca51 in start_thread (arg=0x7fefdebfd700) at pthread_create.c:301
#4 0x00007fefecea693d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 2 (Thread 0x7fefdf5fe700 (LWP 7505)):
#0 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:239
#1 0x00007fefeeff4c40 in bthread_cond_timedwait_p (cond=0x7fefef229b20, m=0x7fefef229ae0, abstime=0x7fefdf5fdd60, file=0x7fefef019d0a "watchdog.c", line=313) at lockmgr.c:811
#2 0x00007fefef00f2d8 in watchdog_thread (arg=<value optimized out>) at watchdog.c:313
#3 0x00007fefeeff5e6d in lmgr_thread_launcher (x=0x204cf48) at lockmgr.c:926
#4 0x00007fefedf0ca51 in start_thread (arg=0x7fefdf5fe700) at pthread_create.c:301
#5 0x00007fefecea693d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 1 (Thread 0x7fefe596c700 (LWP 7503)):
#0 0x00007fefedf13fbd in nanosleep () at ../sysdeps/unix/syscall-template.S:82
#1 0x00007fefeefe51f2 in bmicrosleep (sec=30, usec=0) at bsys.c:171
#2 0x00007fefeeff5f31 in check_deadlock () at lockmgr.c:566
#3 0x00007fefedf0ca51 in start_thread (arg=0x7fefe596c700) at pthread_create.c:301
#4 0x00007fefecea693d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

#0 0x00007fefedf13fbd in nanosleep () at ../sysdeps/unix/syscall-template.S:82
82 T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
No locals.
#1 0x00007fefeefe51f2 in bmicrosleep (sec=30, usec=0) at bsys.c:171
171 status = nanosleep(&timeout, NULL);
timeout = {tv_sec = 30, tv_nsec = 0}
tv = {tv_sec = 3, tv_usec = 140668483622179}
tz = {tz_minuteswest = 0, tz_dsttime = 0}
status = <value optimized out>
#2 0x00007fefeeff5f31 in check_deadlock () at lockmgr.c:566
566 while (!bmicrosleep(30, 0)) {
__clframe = {__cancel_routine = 0x7fefeeff5a20 <cln_hdl(void*)>, __cancel_arg = 0x0, __do_it = 1, __cancel_type = <value optimized out>}
old = 0
#3 0x00007fefedf0ca51 in start_thread (arg=0x7fefe596c700) at pthread_create.c:301
301 THREAD_SETMEM (pd, result, CALL_THREAD_FCT (pd));
__res = <value optimized out>
pd = 0x7fefe596c700
now = <value optimized out>
unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140668325775104, 8087716692990061641, 140668468073312, 140668325775808, 0, 3, -8078722887656444855, -8078740185181098935}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
not_first_call = <value optimized out>
pagesize_m1 = <value optimized out>
sp = <value optimized out>
freesize = <value optimized out>
#4 0x00007fefecea693d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
115 call *%rax
No locals.
#0 0x0000000000000000 in ?? ()
No symbol table info available.
#0 0x0000000000000000 in ?? ()
No symbol table info available.
#0 0x0000000000000000 in ?? ()
No symbol table info available. | ||||
Tags | No tags attached. | ||||
Are the packages from download.bareos.org / nightly? Which version? | |
Which package versions are you using? | |
Yes, sorry, should have been more specific... bareos-storage-15.4.0.git.1446221083.2c08394-1171.1.el6.x86_64 (all the other pkgs match that version) |
|
OK, thanks for the information. Does it also happen on 15.2.1? If not, we will look into it later and concentrate on 15.2.1-related issues first. |
|
I have only observed this on this nightly build so far. (Perhaps "nightly" or "experimental" should be an option, too, when reporting bugs? See 0000538.) |
|
Fix committed to bareos bareos-15.2 branch with changesetid 5766. | |
Fix committed to bareos bareos-14.2 branch with changesetid 5818. | |
bareos: bareos-15.2 0b6435d7 2015-11-12 18:39 Committer: mvwieringen Ported: N/A |
Fix random crashes on sd
The block variable was set to dcr->block, but that can be altered in the call to dev->set_label_blocksize(dcr). When that happens, the code goes on with the wrong block. We removed the whole local variable as it makes no sense and is only referenced 3 times when calling empty_block().
Fixes 0000414: Bareos storage daemon crashes during backups
Fixes 0000483: bareos-sd crash during backup
Fixes 0000522: storage daemon crashes occasionally when starting a new job
Fixes 0000552: SD crashes in -current
Signed-off-by: Marco van Wieringen <marco.van.wieringen@bareos.com> |
Affected Issues 0000414, 0000483, 0000522, 0000552, 0000564 |
|
mod - src/stored/label.c |
bareos: bareos-14.2 3a09212c 2015-11-12 18:39 Committer: mvwieringen Ported: N/A |
Fix random crashes on sd
The block variable was set to dcr->block, but that can be altered in the call to dev->set_label_blocksize(dcr). When that happens, the code goes on with the wrong block. We removed the whole local variable as it makes no sense and is only referenced 3 times when calling empty_block().
Fixes 0000414: Bareos storage daemon crashes during backups
Fixes 0000483: bareos-sd crash during backup
Fixes 0000522: storage daemon crashes occasionally when starting a new job
Fixes 0000552: SD crashes in -current
Signed-off-by: Marco van Wieringen <marco.van.wieringen@bareos.com> |
Affected Issues 0000414, 0000522, 0000552 |
|
mod - src/stored/label.c |
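
For readers following the changeset notes, the pattern described in the commit message (a local pointer copied from dcr->block before a call that may replace it) can be illustrated with a small stand-alone sketch. This is not the code from src/stored/label.c: the `Block` and `Dcr` types, the `replace_block` helper, the `read_label_cached` and `read_label_direct` functions, and the block sizes are hypothetical stand-ins, and the sketch deliberately keeps replaced blocks alive so the stale pointer stays valid and the effect can be observed without undefined behaviour. In a real daemon the old block may be freed, which would be consistent with a crash like the one in the backtrace.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-ins for illustration only; not the Bareos definitions.
struct Block {
    std::size_t size;
    bool emptied = false;
    explicit Block(std::size_t s) : size(s) {}
};

struct Dcr {
    Block* block = nullptr;                       // block currently in use
    std::vector<std::unique_ptr<Block>> storage;  // owns every block ever created, so
                                                  // old pointers stay valid in this demo

    explicit Dcr(std::size_t size) { replace_block(size); }

    // Allocate a new block and make it the current one.
    void replace_block(std::size_t size) {
        storage.push_back(std::make_unique<Block>(size));
        block = storage.back().get();
    }
};

// Stand-in for dev->set_label_blocksize(dcr): may swap dcr.block for a new block.
void set_label_blocksize(Dcr& dcr, std::size_t label_size) {
    if (dcr.block->size != label_size) {
        dcr.replace_block(label_size);
    }
}

void empty_block(Block* b) {
    assert(b != nullptr);
    b->emptied = true;
}

// Problem pattern: the pointer is cached before the call that may replace it.
// Returns whether the block now in use was actually emptied.
bool read_label_cached(Dcr& dcr, std::size_t label_size) {
    Block* block = dcr.block;              // local copy taken too early
    set_label_blocksize(dcr, label_size);  // dcr.block may now point elsewhere
    empty_block(block);                    // acts on the old block
    return dcr.block->emptied;
}

// Pattern after the fix: no local copy, always go through dcr.block.
bool read_label_direct(Dcr& dcr, std::size_t label_size) {
    set_label_blocksize(dcr, label_size);
    empty_block(dcr.block);
    return dcr.block->emptied;
}
```

With a `Dcr` created at one size and a label size that differs, `read_label_cached` reports that the block in use was left untouched, while `read_label_direct` empties the right one. When the sizes match, no replacement happens and both variants behave the same, which would fit the "sometimes" reproducibility recorded for this issue.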
Date Modified | Username | Field | Change |
---|---|---|---|
2015-11-05 23:50 | avantsysadm@avant.ca | New Issue | |
2015-11-06 16:46 | maik | Note Added: 0001925 | |
2015-11-06 16:47 | maik | Note Added: 0001926 | |
2015-11-06 16:47 | maik | Status | new => feedback |
2015-11-06 16:48 | avantsysadm@avant.ca | Note Added: 0001927 | |
2015-11-06 16:48 | avantsysadm@avant.ca | Status | feedback => new |
2015-11-06 17:02 | maik | Note Added: 0001930 | |
2015-11-06 17:02 | maik | Status | new => feedback |
2015-11-06 17:13 | avantsysadm@avant.ca | Note Added: 0001934 | |
2015-11-06 17:13 | avantsysadm@avant.ca | Status | feedback => new |
2015-11-06 17:17 | maik | Status | new => acknowledged |
2015-11-06 17:17 | maik | Product Version | => 15.4.0 |
2015-11-13 10:19 | stephand | Relationship added | related to 0000414 |
2015-11-13 17:27 | mvwieringen | Changeset attached | => bareos bareos-15.2 0b6435d7 |
2015-11-13 17:27 | mvwieringen | Note Added: 0001962 | |
2015-11-13 17:27 | mvwieringen | Status | acknowledged => resolved |
2015-11-13 17:27 | mvwieringen | Resolution | open => fixed |
2015-11-17 12:01 | mvwieringen | Changeset attached | => bareos bareos-14.2 3a09212c |
2015-11-17 12:01 | mvwieringen | Note Added: 0001977 | |
2015-11-30 18:45 | joergs | Relationship added | child of 0000474 |
2019-12-18 15:25 | arogge | Status | resolved => closed |