View Issue Details

ID: 0000418
Project: bareos-core
Category: director
View Status: public
Last Update: 2015-06-01 16:23
Reporter: jbehrend
Assigned To:
Priority: normal
Severity: crash
Reproducibility: always
Status: closed
Resolution: fixed
Platform: Linux
OS: Debian
OS Version: 8
Product Version: 14.2.2
Summary: 0000418: traceback running after job
Description: I ran into tracebacks when running this after-job RunScript:

RunScript {
  RunsWhen=After
  RunsOnClient=No
  Console = "purge volume action=truncate pool=auxme-DailyPool storage=backupsrv1-sd-auxme-DailyPool"
}
Steps To Reproduce: run job=auxme storage=Spectra-Logic-T950EF spooldata=yes level=Full yes
Additional Information:
Created /var/lib/bareos/bareos-dir.core.9559 for doing postmortem debugging
[New LWP 9560]
[New LWP 9564]
[New LWP 9565]
[New LWP 9566]
[New LWP 6191]
[New LWP 10256]
[New LWP 11480]
[New LWP 11996]
[New LWP 9559]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/sbin/bareos-dir'.
#0 0x00007f211613318d in nanosleep () from /lib/x86_64-linux-gnu/libpthread.so.0
$1 = 0x69d980 <my_name> "backupsrv1-dir"
$2 = 0xa1c058 "bareos-dir"
$3 = 0xa1c098 "/usr/sbin/bareos-dir"
$4 = 0x7f20f02a3c18 "PostgreSQL"
$5 = 0x7f2116e28c1c "14.4.0 (16 November 2014)"
$6 = 0x7f2116e28c08 "x86_64-pc-linux-gnu"
$7 = 0x7f2116e28c01 "debian"
$8 = 0x7f2116e28be3 "Debian GNU/Linux 8.0 (jessie)"
$9 = "backupsrv1", '\000' <repeats 245 times>
$10 = 0x7f2116e290f0 "debian Debian GNU/Linux 8.0 (jessie)"
Environment variable "TestName" not defined.
#0 0x00007f211613318d in nanosleep () from /lib/x86_64-linux-gnu/libpthread.so.0
#1 0x00007f2116dfd024 in bmicrosleep (sec=sec@entry=30, usec=usec@entry=0) at bsys.c:100
#2 0x00007f2116e0c30c in check_deadlock () at lockmgr.c:566
#3 0x00007f211612c0a4 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#4 0x00007f211502accd in clone () from /lib/x86_64-linux-gnu/libc.so.6

Thread 9 (Thread 0x7f2117f29740 (LWP 9559)):
#0 0x00007f211613318d in nanosleep () from /lib/x86_64-linux-gnu/libpthread.so.0
#1 0x00007f2116dfd024 in bmicrosleep (sec=60, usec=0) at bsys.c:100
#2 0x0000000000440a69 in wait_for_next_job (one_shot_job_to_run=0x7f20f0014ec8 "8P\001\360 \177") at scheduler.c:124
#3 0x000000000040e02a in main (argc=<optimized out>, argv=<optimized out>) at dird.c:387

Thread 8 (Thread 0x7f2106ffd700 (LWP 11996)):
#0 0x00007f21161334c9 in waitpid () from /lib/x86_64-linux-gnu/libpthread.so.0
#1 0x00007f2116e19514 in signal_handler (sig=11) at signal.c:240
#2 <signal handler called>
#3 BSOCK::fsend (this=0x0, fmt=0x484128 "%s") at bsock.c:205
#4 0x00007f2117461df5 in list_result (jcr=jcr@entry=0x7f20f80085f8, mdb=mdb@entry=0x7f20f800a8c8, send=send@entry=0x44d370 <prtit(void*, char const*)>, ctx=ctx@entry=0x7f20f80095a8, type=type@entry=HORZ_LIST) at sql.c:637
#5 0x00007f211746c7d9 in db_list_media_records (jcr=0x7f20f80085f8, mdb=0x7f20f800a8c8, mdbr=mdbr@entry=0x7f2106ffc580, sendit=0x44d370 <prtit(void*, char const*)>, ctx=ctx@entry=0x7f20f80095a8, type=type@entry=HORZ_LIST) at sql_list.c:191
#6 0x000000000045e4df in select_media_dbr (ua=ua@entry=0x7f20f80095a8, mr=mr@entry=0x7f2106ffc580) at ua_select.c:677
#7 0x0000000000452abc in purge_cmd (ua=0x7f20f80095a8, cmd=<optimized out>) at ua_purge.c:155
#8 0x00000000004449ae in do_a_command (ua=0x7f20f80095a8) at ua_cmds.c:301
#9 0x000000000042b51a in run_console_command (jcr=<optimized out>, cmd=0x7f20f8001f70 "purge volume action=truncate pool=auxme-DailyPool storage=backupsrv1-sd-auxme-DailyPool") at job.c:2044
#10 0x00007f2116e15cc7 in RUNSCRIPT::run (this=0x7f20f0165ba8, jcr=0x7f20f000e418, name=0x46f90f "AfterJob") at runscript.c:305
#11 0x00007f2116e1630e in run_scripts (jcr=0x7f20f000e418, runscripts=0x7f20f0165b48, label=0x7f211747077a "No results to list.\n", label@entry=0x46f90f "AfterJob", allowed_script_dirs=0x0) at runscript.c:212
#12 0x0000000000426db6 in job_thread (arg=0x7f20f000e418) at job.c:509
#13 0x000000000042c9e1 in jobq_server (arg=0x69e3c0 <job_queue>) at jobq.c:450
#14 0x00007f2116e0c39f in lmgr_thread_launcher (x=0x7f20f02b50b8) at lockmgr.c:926
#15 0x00007f211612c0a4 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#16 0x00007f211502accd in clone () from /lib/x86_64-linux-gnu/libc.so.6

Thread 7 (Thread 0x7f2105ffb700 (LWP 11480)):
#0 0x00007f2116132add in read () from /lib/x86_64-linux-gnu/libpthread.so.0
#1 0x00007f2116dfb170 in BSOCK_TCP::read_nbytes (this=0x7f2100002978, ptr=<optimized out>, nbytes=4) at bsock_tcp.c:906
#2 0x00007f2116dfa9df in BSOCK_TCP::recv (this=0x7f2100002978) at bsock_tcp.c:478
#3 0x000000000045ff9a in handle_UA_client_request (user=user@entry=0x7f2100002978) at ua_server.c:89
#4 0x000000000043b8cc in handle_connection_request (arg=0x7f2100002978) at socket_server.c:84
#5 0x00007f2116e22e45 in workq_server (arg=arg@entry=0x69e680 <socket_workq>) at workq.c:335
#6 0x00007f2116e0c39f in lmgr_thread_launcher (x=0x7f2100002b88) at lockmgr.c:926
#7 0x00007f211612c0a4 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#8 0x00007f211502accd in clone () from /lib/x86_64-linux-gnu/libc.so.6

Thread 6 (Thread 0x7f21067fc700 (LWP 10256)):
#0 0x00007f2116132add in read () from /lib/x86_64-linux-gnu/libpthread.so.0
#1 0x00007f2116dfb170 in BSOCK_TCP::read_nbytes (this=0x7f21000027a8, ptr=<optimized out>, nbytes=4) at bsock_tcp.c:906
#2 0x00007f2116dfa9df in BSOCK_TCP::recv (this=0x7f21000027a8) at bsock_tcp.c:478
#3 0x000000000045ff9a in handle_UA_client_request (user=user@entry=0x7f21000027a8) at ua_server.c:89
#4 0x000000000043b8cc in handle_connection_request (arg=0x7f21000027a8) at socket_server.c:84
#5 0x00007f2116e22e45 in workq_server (arg=arg@entry=0x69e680 <socket_workq>) at workq.c:335
#6 0x00007f2116e0c39f in lmgr_thread_launcher (x=0x7f2100002978) at lockmgr.c:926
#7 0x00007f211612c0a4 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#8 0x00007f211502accd in clone () from /lib/x86_64-linux-gnu/libc.so.6

Thread 5 (Thread 0x7f21077fe700 (LWP 6191)):
#0 0x00007f2116132add in read () from /lib/x86_64-linux-gnu/libpthread.so.0
#1 0x00007f2116dfb170 in BSOCK_TCP::read_nbytes (this=0x7f2100001168, ptr=<optimized out>, nbytes=4) at bsock_tcp.c:906
#2 0x00007f2116dfa9df in BSOCK_TCP::recv (this=0x7f2100001168) at bsock_tcp.c:478
#3 0x000000000045ff9a in handle_UA_client_request (user=user@entry=0x7f2100001168) at ua_server.c:89
#4 0x000000000043b8cc in handle_connection_request (arg=0x7f2100001168) at socket_server.c:84
#5 0x00007f2116e22e45 in workq_server (arg=arg@entry=0x69e680 <socket_workq>) at workq.c:335
#6 0x00007f2116e0c39f in lmgr_thread_launcher (x=0x7f2100002618) at lockmgr.c:926
#7 0x00007f211612c0a4 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#8 0x00007f211502accd in clone () from /lib/x86_64-linux-gnu/libc.so.6

Thread 4 (Thread 0x7f2107fff700 (LWP 9566)):
#0 0x00007f2116130438 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1 0x00007f2116e0c83c in bthread_cond_timedwait_p (cond=0x69e9e0 <wait_for_next_run_cond>, m=0x69ea20 <_ZL5mutex>, abstime=0x7f2107ffed60, file=0x47a8cd "stats.c", line=117) at lockmgr.c:811
#2 0x0000000000441083 in wait_for_next_run () at stats.c:117
#3 statistics_thread_runner (arg=0x69e9e4 <wait_for_next_run_cond+4>, arg@entry=0x0) at stats.c:290
#4 0x00007f2116e0c39f in lmgr_thread_launcher (x=0xd04488) at lockmgr.c:926
#5 0x00007f211612c0a4 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#6 0x00007f211502accd in clone () from /lib/x86_64-linux-gnu/libc.so.6

Thread 3 (Thread 0x7f210cb34700 (LWP 9565)):
#0 0x00007f2116130438 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1 0x00007f2116e0c83c in bthread_cond_timedwait_p (cond=cond@entry=0x7f211703d900 <_ZL5timer>, m=m@entry=0x7f211703d940 <_ZL11timer_mutex>, abstime=abstime@entry=0x7f210cb33e20, file=file@entry=0x7f2116e2cf02 "watchdog.c", line=line@entry=313) at lockmgr.c:811
#2 0x00007f2116e224f9 in watchdog_thread (arg=arg@entry=0x0) at watchdog.c:313
#3 0x00007f2116e0c39f in lmgr_thread_launcher (x=0xd02df8) at lockmgr.c:926
#4 0x00007f211612c0a4 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#5 0x00007f211502accd in clone () from /lib/x86_64-linux-gnu/libc.so.6

Thread 2 (Thread 0x7f210d335700 (LWP 9564)):
#0 0x00007f211502218d in poll () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007f2116df2493 in bnet_thread_server_tcp (addr_list=addr_list@entry=0xa1e2a8, max_clients=<optimized out>, sockfds=<optimized out>, client_wq=client_wq@entry=0x69e680 <socket_workq>, nokeepalive=<optimized out>, handle_client_request=handle_client_request@entry=0x43b870 <handle_connection_request(void*)>) at bnet_server_tcp.c:298
#2 0x000000000043bacf in connect_thread (arg=arg@entry=0xa1e2a8) at socket_server.c:101
#3 0x00007f2116e0c39f in lmgr_thread_launcher (x=0xd04488) at lockmgr.c:926
#4 0x00007f211612c0a4 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#5 0x00007f211502accd in clone () from /lib/x86_64-linux-gnu/libc.so.6

Thread 1 (Thread 0x7f211482f700 (LWP 9560)):
#0 0x00007f211613318d in nanosleep () from /lib/x86_64-linux-gnu/libpthread.so.0
#1 0x00007f2116dfd024 in bmicrosleep (sec=sec@entry=30, usec=usec@entry=0) at bsys.c:100
#2 0x00007f2116e0c30c in check_deadlock () at lockmgr.c:566
#3 0x00007f211612c0a4 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#4 0x00007f211502accd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#0 0x00007f211613318d in nanosleep () from /lib/x86_64-linux-gnu/libpthread.so.0
No symbol table info available.
#1 0x00007f2116dfd024 in bmicrosleep (sec=sec@entry=30, usec=usec@entry=0) at bsys.c:100
100 bsys.c: No such file or directory.
timeout = {tv_sec = 30, tv_nsec = 0}
tv = {tv_sec = 0, tv_usec = 0}
tz = {tz_minuteswest = 344127232, tz_dsttime = 32545}
status = <optimized out>
#2 0x00007f2116e0c30c in check_deadlock () at lockmgr.c:566
566 lockmgr.c: No such file or directory.
__cancel_buf = {__cancel_jmp_buf = {{__cancel_jmp_buf = {0, -3986468669457419380, 0, 139780112523360, 0, 139780054775552, 3958416510911001484, 3958411306325259148}, __mask_was_saved = 0}}, __pad = {0x7f211482ef30, 0x0, 0x7f211482f700, 0x7f211482f700}}
__cancel_routine = 0x7f2116e0c420 <cln_hdl(void*)>
__not_first_call = <optimized out>
old = 0
#3 0x00007f211612c0a4 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
No symbol table info available.
#4 0x00007f211502accd in clone () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#0 0x0000000000000000 in ?? ()
No symbol table info available.
#0 0x0000000000000000 in ?? ()
No symbol table info available.
#0 0x0000000000000000 in ?? ()
No symbol table info available.
Tags: No tags attached.

Relationships

child of 0000447 (closed, joergs): Release bareos-14.2.5

Activities

jbehrend

2015-01-30 21:52

reporter  

bareos.config.tar.gz (34,087 bytes)
pstorz

2015-03-27 18:16

administrator   ~0001657

Fix committed to bareos master branch with changesetid 5155.
mvwieringen

2015-05-29 18:06

developer   ~0001752

Fix committed to bareos bareos-14.2 branch with changesetid 5321.

Related Changesets

bareos: master 0a34bd6b

2015-03-24 11:20

mvwieringen


Committer: pstorz

Ported: N/A

Don't crash when ua->UA_sock == NULL in prtit()

Some admin jobs have a UA context but no ua->UA_sock, and when you
then blindly use ua->UA_sock->fsend() you crash in the worst possible way.
We now call ua->send_msg(), which has some fallback logic when
ua->UA_sock is NULL and then uses Jmsg with M_INFO to redirect the info
to the Job.

This might not fully fix 0000418, as it seems the admin Job wants to
interact with the user, which isn't going to work in an admin Job, but
crashing is about the worst thing that can happen.

Fixes 0000418: traceback running after job
Affected Issues
0000418
mod - src/dird/ua_output.c

bareos: bareos-14.2 79d1036f

2015-03-24 11:20

mvwieringen

Ported: N/A

Don't crash when ua->UA_sock == NULL in prtit()

Some admin jobs have a UA context but no ua->UA_sock, and when you
then blindly use ua->UA_sock->fsend() you crash in the worst possible way.
We now call ua->send_msg(), which has some fallback logic when
ua->UA_sock is NULL and then uses Jmsg with M_INFO to redirect the info
to the Job.

This might not fully fix 0000418, as it seems the admin Job wants to
interact with the user, which isn't going to work in an admin Job, but
crashing is about the worst thing that can happen.

Fixes 0000418: traceback running after job
Affected Issues
0000418
mod - src/dird/ua_output.c
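
For readers who want to see the shape of the fix described in the changeset message, here is a minimal sketch of the fallback pattern. It is not the Bareos source: `ConsoleSocket`, `JobLog`, `UserAgentContext` and `print_item` are hypothetical stand-ins for the BSOCK, the Jmsg/M_INFO path, the UA context and `prtit()`. The only idea taken from the changeset is that the output callback goes through a `send_msg()`-style method that checks for a missing console socket instead of dereferencing it.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a console socket (NOT the real Bareos BSOCK).
struct ConsoleSocket {
    std::vector<std::string> sent;
    void send(const std::string& msg) { sent.push_back(msg); }
};

// Hypothetical stand-in for the job message log; in the changeset this
// role is played by Jmsg with M_INFO.
struct JobLog {
    std::vector<std::string> info;
    void log_info(const std::string& msg) { info.push_back(msg); }
};

// Hypothetical user-agent context. A job started without an interactive
// console has no socket, so `sock` may legitimately be null.
struct UserAgentContext {
    ConsoleSocket* sock = nullptr;
    JobLog* job_log = nullptr;

    // Send to the console if one is attached, otherwise fall back to the
    // job log. If neither exists the message is dropped, never dereferenced.
    void send_msg(const std::string& msg) {
        if (sock != nullptr) {
            sock->send(msg);
        } else if (job_log != nullptr) {
            job_log->log_info(msg);
        }
    }
};

// Output callback with the same shape as prtit(void*, char const*) in the
// backtrace. It always goes through send_msg() and never touches the
// socket pointer directly.
void print_item(void* ctx, const char* msg) {
    UserAgentContext* ua = static_cast<UserAgentContext*>(ctx);
    ua->send_msg(msg);
}
```

With a context whose `sock` is null, which is the situation frame #3 of Thread 8 shows as `this=0x0`, calling `print_item(&ua, "No results to list.\n")` appends the text to the job log rather than crashing.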

Issue History

Date Modified | Username | Field | Change
2015-01-30 21:52 | jbehrend | New Issue |
2015-01-30 21:52 | jbehrend | File Added: bareos.config.tar.gz |
2015-03-27 18:16 | pstorz | Changeset attached | => bareos master 0a34bd6b
2015-03-27 18:16 | pstorz | Note Added: 0001657 |
2015-03-27 18:16 | pstorz | Status | new => resolved
2015-03-27 18:16 | pstorz | Resolution | open => fixed
2015-05-27 12:49 | joergs | Relationship added | child of 0000447
2015-05-27 12:49 | joergs | Additional Information Updated |
2015-05-29 18:06 | mvwieringen | Changeset attached | => bareos bareos-14.2 79d1036f
2015-05-29 18:06 | mvwieringen | Note Added: 0001752 |
2015-06-01 16:23 | joergs | Status | resolved => closed