View Issue Details

IDProjectCategoryView StatusLast Update
0000799bareos-core[All Projects] directorpublic2017-07-17 15:59
ReporterShodanAssigned To 
PrioritynormalSeveritycrashReproducibilityalways
Status newResolutionopen 
PlatformLinuxOSDebianOS Version8
Product Version16.2.4 
Fixed in Version 
Summary0000799: bareos-director crashes on backup job
DescriptionSegmentation fault during backup job

> bconsole
> run
Select Job resource (1-78): 77
Run Backup job
JobName: TL-ECN-MT
Level: Full
Client: tl-srv-str-01-fd
Format: Native
FileSet: ECN-MT
Pool: TL-TPL (From Job resource)
Storage: tl-bkp-tpl-01 (From Pool resource)
When: 2017-03-21 22:08:24
Priority: 10
OK to run? (yes/mod/no):

21-Mar 21:59 ld4-bar-dir JobId 134859: Start Backup JobId 134859, Job=TL-ECN-MT.2017-03-21_21.59.27_04
21-Mar 21:59 ld4-bar-dir JobId 134859: Using Device "ULT3580-HH7" to write.
21-Mar 21:59 ld4-bar-dir: ERROR in mem_pool.c:238 Failed ASSERT: obuf
Steps To Reproducerun backup job
Additional InformationDell tape loader is used for backup job
Traceback file attached
TagsNo tags attached.
bareos-master: impact
bareos-master: action
bareos-19.2: impact
bareos-19.2: action
bareos-18.2: impact
bareos-18.2: action
bareos-17.2: impact
bareos-17.2: action
bareos-16.2: impact
bareos-16.2: action
bareos-15.2: impact
bareos-15.2: action
bareos-14.2: impact
bareos-14.2: action
bareos-13.2: impact
bareos-13.2: action
bareos-12.4: impact
bareos-12.4: action

Activities

Shodan

Shodan

2017-03-21 21:10

reporter  

traceback.txt (9,262 bytes)
Created /var/lib/bareos/bareos-dir.core.42247 for doing postmortem debugging
[New LWP 42248]
[New LWP 42251]
[New LWP 42252]
[New LWP 42253]
[New LWP 43824]
[New LWP 43866]
[New LWP 43867]
[New LWP 42247]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/sbin/bareos-dir'.
#0  0x00007f0418da514d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
81      ../sysdeps/unix/syscall-template.S: No such file or directory.
$1 = 0x6b1e00 <my_name> "ld4-bar-dir"
$2 = 0x9bf738 "bareos-dir"
$3 = 0x9bf778 "/usr/sbin/bareos-dir"
$4 = 0xa51868 "MySQL"
$5 = 0x7f041a4e0fc6 "16.2.4 (01 July 2016)"
$6 = 0x7f041a4e0fb2 "x86_64-pc-linux-gnu"
$7 = 0x7f041a4e0fab "debian"
$8 = 0x7f041a4e0f8d "Debian GNU/Linux 8.0 (jessie)"
$9 = "ld4-bar-dir-01", '\000' <repeats 241 times>
$10 = 0x7f041a4e14a8 "debian Debian GNU/Linux 8.0 (jessie)"
Environment variable "TestName" not defined.
#0  0x00007f0418da514d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007f041a4b1354 in bmicrosleep (sec=sec@entry=30, usec=usec@entry=0) at bsys.c:171
#2  0x00007f041a4c205c in check_deadlock () at lockmgr.c:566
#3  0x00007f0418d9e064 in start_thread (arg=0x7f0417ba7700) at pthread_create.c:309
#4  0x00007f04180ad62d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 8 (Thread 0x7f041b5ec740 (LWP 42247)):
#0  0x00007f0418da514d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007f041a4b1354 in bmicrosleep (sec=47, usec=0) at bsys.c:171
#2  0x00000000004450bf in wait_for_next_job (one_shot_job_to_run=0x58d18611 <error: Cannot access memory at address 0x58d18611>) at scheduler.c:171
#3  0x0000000000410236 in main (argc=<optimized out>, argv=<optimized out>) at dird.c:433

Thread 7 (Thread 0x7f03ffde8700 (LWP 43867)):
#0  0x00007f0418da5489 in __libc_waitpid (pid=43868, stat_loc=0x7f03ffde6f8c, options=0) at ../sysdeps/unix/sysv/linux/waitpid.c:40
#1  0x00007f041a4d1264 in signal_handler (sig=11) at signal.c:240
#2  <signal handler called>
#3  0x00007f041a4c36bb in sm_check_pool_memory_size (fname=<optimized out>, lineno=<optimized out>, obuf=0x0, size=<optimized out>) at mem_pool.c:238
#4  0x0000000000416a8f in update_attribute (jcr=0x7f0404005568, msg=0x7f03f80016f0 "UpdCat Job=TLL1-ECN-MT.2017-03-21_21.59.27_04 FileAttributes ", msglen=206) at catreq.c:472
#5  0x0000000000418088 in catalog_update (jcr=0x0, bs=0x0) at catreq.c:660
#6  0x000000000042758b in bget_dirmsg (bs=0x7f03f8001598, allow_any_message=false) at getmsg.c:260
#7  0x000000000043462c in msg_thread (arg=0x0, arg@entry=0x7f0404005568) at msgchan.c:434
#8  0x00007f041a4c20ef in lmgr_thread_launcher (x=0x7f03f8002838) at lockmgr.c:926
#9  0x00007f0418d9e064 in start_thread (arg=0x7f03ffde8700) at pthread_create.c:309
#10 0x00007f04180ad62d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 6 (Thread 0x7f0414a4b700 (LWP 43866)):
#0  0x00007f0418da4a9d in read () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007f041a4af540 in BSOCK_TCP::read_nbytes (this=0x7f03f80043b8, ptr=<optimized out>, nbytes=4) at bsock_tcp.c:957
#2  0x00007f041a4aedbf in BSOCK_TCP::recv (this=0x7f03f80043b8) at bsock_tcp.c:529
#3  0x000000000042728f in bget_dirmsg (bs=0x7f03f80043b8, allow_any_message=false) at getmsg.c:150
#4  0x0000000000412e02 in wait_for_job_termination (jcr=0x7f0404005568, timeout=0) at backup.c:655
#5  0x0000000000414eab in do_native_backup (jcr=jcr@entry=0x7f0404005568) at backup.c:596
#6  0x000000000042a145 in job_thread (arg=0x7f0404005568) at job.c:494
#7  0x000000000042f171 in jobq_server (arg=0x6b2640 <job_queue>) at jobq.c:484
#8  0x00007f041a4c20ef in lmgr_thread_launcher (x=0x7f0404006b88) at lockmgr.c:926
#9  0x00007f0418d9e064 in start_thread (arg=0x7f0414a4b700) at pthread_create.c:309
#10 0x00007f04180ad62d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 5 (Thread 0x7f041524c700 (LWP 43824)):
#0  0x00007f0418da4a9d in read () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007f041a4af540 in BSOCK_TCP::read_nbytes (this=0x7f04080011a8, ptr=<optimized out>, nbytes=4) at bsock_tcp.c:957
#2  0x00007f041a4aedbf in BSOCK_TCP::recv (this=0x7f04080011a8) at bsock_tcp.c:529
#3  0x000000000046c608 in handle_UA_client_request (user=user@entry=0x7f04080011a8) at ua_server.c:91
#4  0x00000000004401f3 in handle_connection_request (arg=0x7f04080011a8) at socket_server.c:97
#5  0x00007f041a4dad15 in workq_server (arg=arg@entry=0x6b2940 <socket_workq>) at workq.c:336
#6  0x00007f041a4c20ef in lmgr_thread_launcher (x=0x7f0408002658) at lockmgr.c:926
#7  0x00007f0418d9e064 in start_thread (arg=0x7f041524c700) at pthread_create.c:309
#8  0x00007f04180ad62d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 4 (Thread 0x7f0415a4d700 (LWP 42253)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1  0x00007f041a4c258c in bthread_cond_timedwait_p (cond=0x6b2c60 <wait_for_next_run_cond>, m=0x6b2ca0 <_ZL5mutex>, abstime=0x7f0415a4cd60, file=0x48a204 "stats.c", line=117) at lockmgr.c:811
#2  0x0000000000445e4e in wait_for_next_run () at stats.c:117
#3  statistics_thread_runner (arg=0x6b2c64 <wait_for_next_run_cond+4>, arg@entry=0x0) at stats.c:327
#4  0x00007f041a4c20ef in lmgr_thread_launcher (x=0xa39fa8) at lockmgr.c:926
#5  0x00007f0418d9e064 in start_thread (arg=0x7f0415a4d700) at pthread_create.c:309
#6  0x00007f04180ad62d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 3 (Thread 0x7f041624e700 (LWP 42252)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1  0x00007f041a4c258c in bthread_cond_timedwait_p (cond=cond@entry=0x7f041a6f7b80 <_ZL5timer>, m=m@entry=0x7f041a6f7bc0 <_ZL11timer_mutex>, abstime=abstime@entry=0x7f041624de20, file=file@entry=0x7f041a4e586a "watchdog.c", line=line@entry=313) at lockmgr.c:811
#2  0x00007f041a4da36d in watchdog_thread (arg=arg@entry=0x0) at watchdog.c:313
#3  0x00007f041a4c20ef in lmgr_thread_launcher (x=0xa39fa8) at lockmgr.c:926
#4  0x00007f0418d9e064 in start_thread (arg=0x7f041624e700) at pthread_create.c:309
#5  0x00007f04180ad62d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 2 (Thread 0x7f0416c5b700 (LWP 42251)):
#0  0x00007f04180a4aed in poll () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007f041a4a6203 in bnet_thread_server_tcp (addr_list=addr_list@entry=0x9e7088, max_clients=<optimized out>, sockfds=<optimized out>, client_wq=client_wq@entry=0x6b2940 <socket_workq>, nokeepalive=<optimized out>, handle_client_request=handle_client_request@entry=0x440170 <handle_connection_request(void*)>) at bnet_server_tcp.c:302
#2  0x00000000004403ff in connect_thread (arg=arg@entry=0x9e7088) at socket_server.c:114
#3  0x00007f041a4c20ef in lmgr_thread_launcher (x=0xa3a348) at lockmgr.c:926
#4  0x00007f0418d9e064 in start_thread (arg=0x7f0416c5b700) at pthread_create.c:309
#5  0x00007f04180ad62d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 1 (Thread 0x7f0417ba7700 (LWP 42248)):
#0  0x00007f0418da514d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007f041a4b1354 in bmicrosleep (sec=sec@entry=30, usec=usec@entry=0) at bsys.c:171
#2  0x00007f041a4c205c in check_deadlock () at lockmgr.c:566
#3  0x00007f0418d9e064 in start_thread (arg=0x7f0417ba7700) at pthread_create.c:309
#4  0x00007f04180ad62d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
#0  0x00007f0418da514d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
81      in ../sysdeps/unix/syscall-template.S
No locals.
#1  0x00007f041a4b1354 in bmicrosleep (sec=sec@entry=30, usec=usec@entry=0) at bsys.c:171
171     bsys.c: No such file or directory.
timeout = {tv_sec = 30, tv_nsec = 0}
tv = {tv_sec = 0, tv_usec = 0}
tz = {tz_minuteswest = 398096128, tz_dsttime = 32516}
status = <optimized out>
#2  0x00007f041a4c205c in check_deadlock () at lockmgr.c:566
566     lockmgr.c: No such file or directory.
__cancel_buf = {__cancel_jmp_buf = {{__cancel_jmp_buf = {0, -5982215179877381202, 0, 139655615885408, 0, 139655554692864, 5977061249686801326, 5977070986946251694}, __mask_was_saved = 0}}, __pad = {0x7f0417ba6f30, 0x0, 0x7f0417ba7700, 0x7f0417ba7700}}
__cancel_routine = 0x7f041a4c2170 <cln_hdl(void*)>
__not_first_call = <optimized out>
old = 0
#3  0x00007f0418d9e064 in start_thread (arg=0x7f0417ba7700) at pthread_create.c:309
309     pthread_create.c: No such file or directory.
__res = <optimized out>
pd = 0x7f0417ba7700
now = <optimized out>
unwind_buf = {cancel_jmp_buf = {{jmp_buf = {139655554692864, -5982215179877381202, 0, 139655615885408, 0, 139655554692864, 5977061249735035822, 5977076389527915438}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
not_first_call = <optimized out>
pagesize_m1 = <optimized out>
sp = <optimized out>
freesize = <optimized out>
__PRETTY_FUNCTION__ = "start_thread"
#4  0x00007f04180ad62d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
traceback.txt (9,262 bytes)
Shodan

Shodan

2017-03-21 21:14

reporter   ~0002616

Last edited: 2017-03-22 10:14

View 2 revisions

bareos-dbcheck -f -c /etc/bareos/bareos-dir.conf does not fix the issue

Additional debug

gdb -q -ex 'file /usr/sbin/bareos-dir' -ex 'run -v -f'
Reading symbols from /usr/sbin/bareos-dir...Reading symbols from /usr/lib/debug//usr/sbin/bareos-dir...done.
done.
Starting program: /usr/sbin/bareos-dir -v -f
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff45a1700 (LWP 47133)]
[New Thread 0x7ffff3655700 (LWP 47134)]
[Thread 0x7ffff3655700 (LWP 47134) exited]
[New Thread 0x7ffff3655700 (LWP 47135)]
[New Thread 0x7ffff2c48700 (LWP 47136)]
[New Thread 0x7ffff2447700 (LWP 47137)]
[New Thread 0x7ffff1c46700 (LWP 47138)]
[New Thread 0x7ffff1445700 (LWP 47192)]
[New Thread 0x7ffff0827700 (LWP 47214)]
ld4-bar-dir: mem_pool.c:238-134869 Failed ASSERT: obuf

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff0827700 (LWP 47214)]
0x00007ffff6ebd6bb in sm_check_pool_memory_size (fname=<optimized out>, lineno=<optimized out>, obuf=0x0, size=<optimized out>) at mem_pool.c:238
238 mem_pool.c: No such file or directory.

Shodan

Shodan

2017-03-22 10:12

reporter  

traceback2.txt (9,274 bytes)
Created /var/lib/bareos/bareos-dir.core.35014 for doing postmortem debugging
[New LWP 35017]
[New LWP 35060]
[New LWP 35061]
[New LWP 35062]
[New LWP 45839]
[New LWP 45881]
[New LWP 45882]
[New LWP 35014]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/sbin/bareos-dir'.
#0  0x00007fa9afcf114d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
81      ../sysdeps/unix/syscall-template.S: No such file or directory.
$1 = 0x6b1e00 <my_name> "ld4-bar-dir"
$2 = 0x778738 "bareos-dir"
$3 = 0x778778 "/usr/sbin/bareos-dir"
$4 = 0x7fa964069878 "MySQL"
$5 = 0x7fa9b142cfc6 "16.2.4 (01 July 2016)"
$6 = 0x7fa9b142cfb2 "x86_64-pc-linux-gnu"
$7 = 0x7fa9b142cfab "debian"
$8 = 0x7fa9b142cf8d "Debian GNU/Linux 8.0 (jessie)"
$9 = "ld4-bar-dir-01", '\000' <repeats 241 times>
$10 = 0x7fa9b142d4a8 "debian Debian GNU/Linux 8.0 (jessie)"
Environment variable "TestName" not defined.
#0  0x00007fa9afcf114d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007fa9b13fd354 in bmicrosleep (sec=sec@entry=30, usec=usec@entry=0) at bsys.c:171
#2  0x00007fa9b140e05c in check_deadlock () at lockmgr.c:566
#3  0x00007fa9afcea064 in start_thread (arg=0x7fa9aeaf3700) at pthread_create.c:309
#4  0x00007fa9aeff962d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 8 (Thread 0x7fa9b2538740 (LWP 35014)):
#0  0x00007fa9afcf114d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007fa9b13fd354 in bmicrosleep (sec=60, usec=0) at bsys.c:171
#2  0x00000000004450bf in wait_for_next_job (one_shot_job_to_run=0x58d23ec3 <error: Cannot access memory at address 0x58d23ec3>) at scheduler.c:171
#3  0x0000000000410236 in main (argc=<optimized out>, argv=<optimized out>) at dird.c:433

Thread 7 (Thread 0x7fa99e7fc700 (LWP 45882)):
#0  0x00007fa9afcf1489 in __libc_waitpid (pid=45924, stat_loc=0x7fa99e7faf8c, options=0) at ../sysdeps/unix/sysv/linux/waitpid.c:40
#1  0x00007fa9b141d264 in signal_handler (sig=11) at signal.c:240
#2  <signal handler called>
#3  0x00007fa9b140f6bb in sm_check_pool_memory_size (fname=<optimized out>, lineno=<optimized out>, obuf=0x0, size=<optimized out>) at mem_pool.c:238
#4  0x0000000000416a8f in update_attribute (jcr=0x7fa9640bc598, msg=0x7fa9a0001300 "UpdCat Job=TLL1-ECN-MT.2017-03-22_11.07.41_04 FileAttributes ", msglen=206) at catreq.c:472
#5  0x0000000000418088 in catalog_update (jcr=0x0, bs=0x0) at catreq.c:660
#6  0x000000000042758b in bget_dirmsg (bs=0x7fa9640be018, allow_any_message=false) at getmsg.c:260
#7  0x000000000043462c in msg_thread (arg=0x0, arg@entry=0x7fa9640bc598) at msgchan.c:434
#8  0x00007fa9b140e0ef in lmgr_thread_launcher (x=0x7fa9640bd3b8) at lockmgr.c:926
#9  0x00007fa9afcea064 in start_thread (arg=0x7fa99e7fc700) at pthread_create.c:309
#10 0x00007fa9aeff962d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 6 (Thread 0x7fa99dffb700 (LWP 45881)):
#0  0x00007fa9afcf0a9d in read () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007fa9b13fb540 in BSOCK_TCP::read_nbytes (this=0x7fa9640be1d8, ptr=<optimized out>, nbytes=4) at bsock_tcp.c:957
#2  0x00007fa9b13fadbf in BSOCK_TCP::recv (this=0x7fa9640be1d8) at bsock_tcp.c:529
#3  0x000000000042728f in bget_dirmsg (bs=0x7fa9640be1d8, allow_any_message=false) at getmsg.c:150
#4  0x0000000000412e02 in wait_for_job_termination (jcr=0x7fa9640bc598, timeout=0) at backup.c:655
#5  0x0000000000414eab in do_native_backup (jcr=jcr@entry=0x7fa9640bc598) at backup.c:596
#6  0x000000000042a145 in job_thread (arg=0x7fa9640bc598) at job.c:494
#7  0x000000000042f171 in jobq_server (arg=0x6b2640 <job_queue>) at jobq.c:484
#8  0x00007fa9b140e0ef in lmgr_thread_launcher (x=0x7fa9640bd3b8) at lockmgr.c:926
#9  0x00007fa9afcea064 in start_thread (arg=0x7fa99dffb700) at pthread_create.c:309
#10 0x00007fa9aeff962d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 5 (Thread 0x7fa99cff9700 (LWP 45839)):
#0  0x00007fa9afcf0a9d in read () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007fa9b13fb540 in BSOCK_TCP::read_nbytes (this=0x7fa9a00011a8, ptr=<optimized out>, nbytes=4) at bsock_tcp.c:957
#2  0x00007fa9b13fadbf in BSOCK_TCP::recv (this=0x7fa9a00011a8) at bsock_tcp.c:529
#3  0x000000000046c608 in handle_UA_client_request (user=user@entry=0x7fa9a00011a8) at ua_server.c:91
#4  0x00000000004401f3 in handle_connection_request (arg=0x7fa9a00011a8) at socket_server.c:97
#5  0x00007fa9b1426d15 in workq_server (arg=arg@entry=0x6b2940 <socket_workq>) at workq.c:336
#6  0x00007fa9b140e0ef in lmgr_thread_launcher (x=0x7fa9a0002338) at lockmgr.c:926
#7  0x00007fa9afcea064 in start_thread (arg=0x7fa99cff9700) at pthread_create.c:309
#8  0x00007fa9aeff962d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 4 (Thread 0x7fa9ac999700 (LWP 35062)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1  0x00007fa9b140e58c in bthread_cond_timedwait_p (cond=0x6b2c60 <wait_for_next_run_cond>, m=0x6b2ca0 <_ZL5mutex>, abstime=0x7fa9ac998d60, file=0x48a204 "stats.c", line=117) at lockmgr.c:811
#2  0x0000000000445e4e in wait_for_next_run () at stats.c:117
#3  statistics_thread_runner (arg=0x6b2c64 <wait_for_next_run_cond+4>, arg@entry=0x0) at stats.c:327
#4  0x00007fa9b140e0ef in lmgr_thread_launcher (x=0x7efcf8) at lockmgr.c:926
#5  0x00007fa9afcea064 in start_thread (arg=0x7fa9ac999700) at pthread_create.c:309
#6  0x00007fa9aeff962d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 3 (Thread 0x7fa9ad19a700 (LWP 35061)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1  0x00007fa9b140e58c in bthread_cond_timedwait_p (cond=cond@entry=0x7fa9b1643b80 <_ZL5timer>, m=m@entry=0x7fa9b1643bc0 <_ZL11timer_mutex>, abstime=abstime@entry=0x7fa9ad199e20, file=file@entry=0x7fa9b143186a "watchdog.c", line=line@entry=313) at lockmgr.c:811
#2  0x00007fa9b142636d in watchdog_thread (arg=arg@entry=0x0) at watchdog.c:313
#3  0x00007fa9b140e0ef in lmgr_thread_launcher (x=0x7efcf8) at lockmgr.c:926
#4  0x00007fa9afcea064 in start_thread (arg=0x7fa9ad19a700) at pthread_create.c:309
#5  0x00007fa9aeff962d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 2 (Thread 0x7fa9adba7700 (LWP 35060)):
#0  0x00007fa9aeff0aed in poll () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007fa9b13f2203 in bnet_thread_server_tcp (addr_list=addr_list@entry=0x79fd28, max_clients=<optimized out>, sockfds=<optimized out>, client_wq=client_wq@entry=0x6b2940 <socket_workq>, nokeepalive=<optimized out>, handle_client_request=handle_client_request@entry=0x440170 <handle_connection_request(void*)>) at bnet_server_tcp.c:302
#2  0x00000000004403ff in connect_thread (arg=arg@entry=0x79fd28) at socket_server.c:114
#3  0x00007fa9b140e0ef in lmgr_thread_launcher (x=0x7f2e58) at lockmgr.c:926
#4  0x00007fa9afcea064 in start_thread (arg=0x7fa9adba7700) at pthread_create.c:309
#5  0x00007fa9aeff962d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 1 (Thread 0x7fa9aeaf3700 (LWP 35017)):
#0  0x00007fa9afcf114d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007fa9b13fd354 in bmicrosleep (sec=sec@entry=30, usec=usec@entry=0) at bsys.c:171
#2  0x00007fa9b140e05c in check_deadlock () at lockmgr.c:566
#3  0x00007fa9afcea064 in start_thread (arg=0x7fa9aeaf3700) at pthread_create.c:309
#4  0x00007fa9aeff962d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
#0  0x00007fa9afcf114d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
81      in ../sysdeps/unix/syscall-template.S
No locals.
#1  0x00007fa9b13fd354 in bmicrosleep (sec=sec@entry=30, usec=usec@entry=0) at bsys.c:171
171     bsys.c: No such file or directory.
timeout = {tv_sec = 30, tv_nsec = 0}
tv = {tv_sec = 0, tv_usec = 0}
tz = {tz_minuteswest = -1364248832, tz_dsttime = 32681}
status = <optimized out>
#2  0x00007fa9b140e05c in check_deadlock () at lockmgr.c:566
566     lockmgr.c: No such file or directory.
__cancel_buf = {__cancel_jmp_buf = {{__cancel_jmp_buf = {0, 5896314722769966371, 0, 140366818111584, 0, 140366756919040, -5872493397977328349, -5872542724098353885}, __mask_was_saved = 0}}, __pad = {0x7fa9aeaf2f30, 0x0, 0x7fa9aeaf3700, 0x7fa9aeaf3700}}
__cancel_routine = 0x7fa9b140e170 <cln_hdl(void*)>
__not_first_call = <optimized out>
old = 0
#3  0x00007fa9afcea064 in start_thread (arg=0x7fa9aeaf3700) at pthread_create.c:309
309     pthread_create.c: No such file or directory.
__res = <optimized out>
pd = 0x7fa9aeaf3700
now = <optimized out>
unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140366756919040, 5896314722769966371, 0, 140366818111584, 0, 140366756919040, -5872493398025562845, -5872495326232439517}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
not_first_call = <optimized out>
pagesize_m1 = <optimized out>
sp = <optimized out>
freesize = <optimized out>
__PRETTY_FUNCTION__ = "start_thread"
#4  0x00007fa9aeff962d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
traceback2.txt (9,274 bytes)
joergs

joergs

2017-06-08 16:18

administrator   ~0002661

Thank you for the stack traces. Unfortunately, this looks strange.
If, as stated in 0000005, jcr is really NULL, than a Segmentation Fault must have appeared earlier:

0000003 0x00007f041a4c36bb in sm_check_pool_memory_size (fname=<optimized out>, lineno=<optimized out>, obuf=0x0, size=<optimized out>) at mem_pool.c:238
0000004 0x0000000000416a8f in update_attribute (jcr=0x7f0404005568, msg=0x7f03f80016f0 "UpdCat Job=TLL1-ECN-MT.2017-03-21_21.59.27_04 FileAttributes ", msglen=206) at catreq.c:472
0000005 0x0000000000418088 in catalog_update (jcr=0x0, bs=0x0) at catreq.c:660
0000006 0x000000000042758b in bget_dirmsg (bs=0x7f03f8001598, allow_any_message=false) at getmsg.c:260

In what situation, this bug occurs? Have you got a small test configuration, were we could reproduce this error? If not, maybe we get some more insight, if you enable debug messages:
bareos-dir -f -d 900
Shodan

Shodan

2017-06-09 11:23

reporter   ~0002668

Sorry, unable to reproduce the issue anymore :(
The pool has been recreated and debian bareos package has been updated and recompiled to exclude unneeded modules (glusterfs, ceph). After this, the problem disappeared.
divanikus

divanikus

2017-07-17 14:25

reporter  

bareos.log (123,597 bytes)
divanikus

divanikus

2017-07-17 14:26

reporter   ~0002683

Having the same issue with 16.2.6 (debian, package built from source).
Bactrace says the following:

Attempt to dump locks
threadid=0x7ff63bfff700 max=1 current=-1
threadid=0x7ff640f26700 max=2 current=-1
threadid=0x7ff641727700 max=2 current=-1
threadid=0x7ff639727700 max=0 current=-1
threadid=0x7ff641f28700 max=0 current=-1
threadid=0x7ff649b97700 max=0 current=-1
threadid=0x7ff64d5e0740 max=2 current=-1
Attempt to dump current JCRs. njcrs=3
threadid=0x7ff64d5e0740 JobId=0 JobStatus=R jcr=0x773008 name=*JobMonitor*.2017-07-17_15.12.14_01
threadid=0x7ff64d5e0740 killable=0 JobId=0 JobStatus=R jcr=0x773008 name=*JobMonitor*.2017-07-17_15.12.14_01
        use_count=1
        JobType=I JobLevel=
        sched_time=17-Jul-2017 15:12 start_time=17-Jul-2017 15:12
        end_time=01-Jan-1970 03:00 wait_time=01-Jan-1970 03:00
        db=(nil) db_batch=(nil) batch_started=0
threadid=0x7ff641727700 JobId=0 JobStatus=R jcr=0x7ff62c001078 name=*StatisticsCollector*.2017-07-17_15.12.14_02
threadid=0x7ff641727700 killable=0 JobId=0 JobStatus=R jcr=0x7ff62c001078 name=*StatisticsCollector*.2017-07-17_15.12.14_02
        use_count=1
        JobType=I JobLevel=
        sched_time=17-Jul-2017 15:12 start_time=17-Jul-2017 15:12
        end_time=01-Jan-1970 03:00 wait_time=01-Jan-1970 03:00
        db=0x7ff62c002c68 db_batch=(nil) batch_started=0
B_DB=0x7ff62c002c68 db_name=bareos db_user=bareos connected=true
        cmd="UPDATE Media SET VolJobs=15,VolFiles=2,VolBlocks=191267,VolBytes=12338415310,VolMounts=93,VolErrors=0,VolWrites=1439631,MaxVolBytes=26843545600,VolStatus='Append',Slot=0,InChanger=0,VolReadTime=0,VolWriteTime=39231914,LabelType=0,StorageId=3,PoolId=2,VolRetention=2592000,VolUseDuration=0,MaxVolJobs=0,MaxVolFiles=0,Enabled=1,LocationId=0,ScratchPoolId=0,RecyclePoolId=0,RecycleCount=3,Recycle=1,ActionOnPurge=1,MinBlocksize=0,MaxBlocksize=0 WHERE VolumeName='Incremental-0012'" changes=6
        RWLOCK=0x7ff62c002c70 w_active=0 w_wait=0
threadid=0x7ff640f26700 JobId=5224 JobStatus=R jcr=0x7ff630006f88 name=backup-frontend-00-int.2017-07-17_15.12.29_09
threadid=0x7ff640f26700 killable=1 JobId=5224 JobStatus=R jcr=0x7ff630006f88 name=backup-frontend-00-int.2017-07-17_15.12.29_09
        use_count=2
        JobType=B JobLevel=I
        sched_time=17-Jul-2017 15:12 start_time=17-Jul-2017 15:12
        end_time=01-Jan-1970 03:00 wait_time=01-Jan-1970 03:00
        db=0x7ff62c002c68 db_batch=(nil) batch_started=0
B_DB=0x7ff62c002c68 db_name=bareos db_user=bareos connected=true
        cmd="UPDATE Media SET VolJobs=15,VolFiles=2,VolBlocks=191267,VolBytes=12338415310,VolMounts=93,VolErrors=0,VolWrites=1439631,MaxVolBytes=26843545600,VolStatus='Append',Slot=0,InChanger=0,VolReadTime=0,VolWriteTime=39231914,LabelType=0,StorageId=3,PoolId=2,VolRetention=2592000,VolUseDuration=0,MaxVolJobs=0,MaxVolFiles=0,Enabled=1,LocationId=0,ScratchPoolId=0,RecyclePoolId=0,RecycleCount=3,Recycle=1,ActionOnPurge=1,MinBlocksize=0,MaxBlocksize=0 WHERE VolumeName='Incremental-0012'" changes=6
        RWLOCK=0x7ff62c002c70 w_active=0 w_wait=0

Stripped down bareos-dir log is in the attachment. I'm not using ceph or glusterfs if any.
divanikus

divanikus

2017-07-17 15:59

reporter   ~0002686

Seems to be caused by this
17-Jul 16:35 bs-00 JobId 5225: Volume "Incremental-0012" previously written, moving to end of data.
17-Jul 16:35 bs-00 JobId 5225: Warning: For Volume "Incremental-0012":
The sizes do not match! Volume=12338423873 Catalog=12338415310
Correcting Catalog
17-Jul 16:35 bareos-dir: ERROR in mem_pool.c:238 Failed ASSERT: obuf

I have temporary disabled the Incremental-0012 volume and now jobs work again. How can I fix it?

Issue History

Date Modified Username Field Change
2017-03-21 21:10 Shodan New Issue
2017-03-21 21:10 Shodan File Added: traceback.txt
2017-03-21 21:14 Shodan Note Added: 0002616
2017-03-22 10:12 Shodan File Added: traceback2.txt
2017-03-22 10:14 Shodan Note Edited: 0002616 View Revisions
2017-06-08 13:29 joergs bareos-master: impact => yes
2017-06-08 13:29 joergs bareos-master: action => will care
2017-06-08 13:29 joergs bareos-16.2: impact => yes
2017-06-08 13:29 joergs bareos-16.2: action => will care
2017-06-08 13:29 joergs bareos-15.2: impact => no
2017-06-08 13:29 joergs bareos-14.2: impact => no
2017-06-08 13:29 joergs bareos-13.2: impact => no
2017-06-08 13:29 joergs bareos-12.4: impact => no
2017-06-08 13:29 joergs Status new => resolved
2017-06-08 13:29 joergs Fixed in Version => 16.2.6
2017-06-08 13:29 joergs Resolution open => fixed
2017-06-08 13:29 joergs Assigned To => joergs
2017-06-08 13:38 joergs bareos-master: impact yes =>
2017-06-08 13:38 joergs bareos-master: action will care =>
2017-06-08 13:38 joergs bareos-16.2: impact yes =>
2017-06-08 13:38 joergs bareos-16.2: action will care =>
2017-06-08 13:38 joergs bareos-15.2: impact no =>
2017-06-08 13:38 joergs bareos-14.2: impact no =>
2017-06-08 13:38 joergs bareos-13.2: impact no =>
2017-06-08 13:38 joergs bareos-12.4: impact no =>
2017-06-08 13:38 joergs Assigned To joergs =>
2017-06-08 13:38 joergs Status resolved => new
2017-06-08 13:38 joergs Resolution fixed => open
2017-06-08 13:38 joergs Fixed in Version 16.2.6 =>
2017-06-08 16:18 joergs Note Added: 0002661
2017-06-08 16:18 joergs Status new => feedback
2017-06-09 11:23 Shodan Note Added: 0002668
2017-06-09 11:23 Shodan Status feedback => new
2017-07-17 14:25 divanikus File Added: bareos.log
2017-07-17 14:26 divanikus Note Added: 0002683
2017-07-17 15:59 divanikus Note Added: 0002686