View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0000418 | bareos-core | director | public | 2015-01-30 21:52 | 2015-06-01 16:23 |
Reporter | jbehrend | Assigned To | |||
Priority | normal | Severity | crash | Reproducibility | always |
Status | closed | Resolution | fixed | ||
Platform | Linux | OS | Debian | OS Version | 8 |
Product Version | 14.2.2 | ||||
Summary | 0000418: traceback running after job | ||||
Description | I ran into tracebacks when running this AfterJob RunScript:
RunScript {
  RunsWhen = After
  RunsOnClient = No
  Console = "purge volume action=truncate pool=auxme-DailyPool storage=backupsrv1-sd-auxme-DailyPool"
}
Steps To Reproduce | run job=auxme storage=Spectra-Logic-T950EF spooldata=yes level=Full yes | ||||
Additional Information | Created /var/lib/bareos/bareos-dir.core.9559 for doing postmortem debugging
[New LWP 9560]
[New LWP 9564]
[New LWP 9565]
[New LWP 9566]
[New LWP 6191]
[New LWP 10256]
[New LWP 11480]
[New LWP 11996]
[New LWP 9559]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/sbin/bareos-dir'.
#0 0x00007f211613318d in nanosleep () from /lib/x86_64-linux-gnu/libpthread.so.0
$1 = 0x69d980 <my_name> "backupsrv1-dir"
$2 = 0xa1c058 "bareos-dir"
$3 = 0xa1c098 "/usr/sbin/bareos-dir"
$4 = 0x7f20f02a3c18 "PostgreSQL"
$5 = 0x7f2116e28c1c "14.4.0 (16 November 2014)"
$6 = 0x7f2116e28c08 "x86_64-pc-linux-gnu"
$7 = 0x7f2116e28c01 "debian"
$8 = 0x7f2116e28be3 "Debian GNU/Linux 8.0 (jessie)"
$9 = "backupsrv1", '\000' <repeats 245 times>
$10 = 0x7f2116e290f0 "debian Debian GNU/Linux 8.0 (jessie)"
Environment variable "TestName" not defined.

#0 0x00007f211613318d in nanosleep () from /lib/x86_64-linux-gnu/libpthread.so.0
#1 0x00007f2116dfd024 in bmicrosleep (sec=sec@entry=30, usec=usec@entry=0) at bsys.c:100
#2 0x00007f2116e0c30c in check_deadlock () at lockmgr.c:566
#3 0x00007f211612c0a4 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#4 0x00007f211502accd in clone () from /lib/x86_64-linux-gnu/libc.so.6

Thread 9 (Thread 0x7f2117f29740 (LWP 9559)):
#0 0x00007f211613318d in nanosleep () from /lib/x86_64-linux-gnu/libpthread.so.0
#1 0x00007f2116dfd024 in bmicrosleep (sec=60, usec=0) at bsys.c:100
#2 0x0000000000440a69 in wait_for_next_job (one_shot_job_to_run=0x7f20f0014ec8 "8P\001\360 \177") at scheduler.c:124
#3 0x000000000040e02a in main (argc=<optimized out>, argv=<optimized out>) at dird.c:387

Thread 8 (Thread 0x7f2106ffd700 (LWP 11996)):
#0 0x00007f21161334c9 in waitpid () from /lib/x86_64-linux-gnu/libpthread.so.0
#1 0x00007f2116e19514 in signal_handler (sig=11) at signal.c:240
#2 <signal handler called>
#3 BSOCK::fsend (this=0x0, fmt=0x484128 "%s") at bsock.c:205
#4 0x00007f2117461df5 in list_result (jcr=jcr@entry=0x7f20f80085f8, mdb=mdb@entry=0x7f20f800a8c8, send=send@entry=0x44d370 <prtit(void*, char const*)>, ctx=ctx@entry=0x7f20f80095a8, type=type@entry=HORZ_LIST) at sql.c:637
#5 0x00007f211746c7d9 in db_list_media_records (jcr=0x7f20f80085f8, mdb=0x7f20f800a8c8, mdbr=mdbr@entry=0x7f2106ffc580, sendit=0x44d370 <prtit(void*, char const*)>, ctx=ctx@entry=0x7f20f80095a8, type=type@entry=HORZ_LIST) at sql_list.c:191
#6 0x000000000045e4df in select_media_dbr (ua=ua@entry=0x7f20f80095a8, mr=mr@entry=0x7f2106ffc580) at ua_select.c:677
#7 0x0000000000452abc in purge_cmd (ua=0x7f20f80095a8, cmd=<optimized out>) at ua_purge.c:155
#8 0x00000000004449ae in do_a_command (ua=0x7f20f80095a8) at ua_cmds.c:301
#9 0x000000000042b51a in run_console_command (jcr=<optimized out>, cmd=0x7f20f8001f70 "purge volume action=truncate pool=auxme-DailyPool storage=backupsrv1-sd-auxme-DailyPool") at job.c:2044
#10 0x00007f2116e15cc7 in RUNSCRIPT::run (this=0x7f20f0165ba8, jcr=0x7f20f000e418, name=0x46f90f "AfterJob") at runscript.c:305
#11 0x00007f2116e1630e in run_scripts (jcr=0x7f20f000e418, runscripts=0x7f20f0165b48, label=0x7f211747077a "No results to list.\n", label@entry=0x46f90f "AfterJob", allowed_script_dirs=0x0) at runscript.c:212
#12 0x0000000000426db6 in job_thread (arg=0x7f20f000e418) at job.c:509
#13 0x000000000042c9e1 in jobq_server (arg=0x69e3c0 <job_queue>) at jobq.c:450
#14 0x00007f2116e0c39f in lmgr_thread_launcher (x=0x7f20f02b50b8) at lockmgr.c:926
#15 0x00007f211612c0a4 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#16 0x00007f211502accd in clone () from /lib/x86_64-linux-gnu/libc.so.6

Thread 7 (Thread 0x7f2105ffb700 (LWP 11480)):
#0 0x00007f2116132add in read () from /lib/x86_64-linux-gnu/libpthread.so.0
#1 0x00007f2116dfb170 in BSOCK_TCP::read_nbytes (this=0x7f2100002978, ptr=<optimized out>, nbytes=4) at bsock_tcp.c:906
#2 0x00007f2116dfa9df in BSOCK_TCP::recv (this=0x7f2100002978) at bsock_tcp.c:478
#3 0x000000000045ff9a in handle_UA_client_request (user=user@entry=0x7f2100002978) at ua_server.c:89
#4 0x000000000043b8cc in handle_connection_request (arg=0x7f2100002978) at socket_server.c:84
#5 0x00007f2116e22e45 in workq_server (arg=arg@entry=0x69e680 <socket_workq>) at workq.c:335
#6 0x00007f2116e0c39f in lmgr_thread_launcher (x=0x7f2100002b88) at lockmgr.c:926
#7 0x00007f211612c0a4 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#8 0x00007f211502accd in clone () from /lib/x86_64-linux-gnu/libc.so.6

Thread 6 (Thread 0x7f21067fc700 (LWP 10256)):
#0 0x00007f2116132add in read () from /lib/x86_64-linux-gnu/libpthread.so.0
#1 0x00007f2116dfb170 in BSOCK_TCP::read_nbytes (this=0x7f21000027a8, ptr=<optimized out>, nbytes=4) at bsock_tcp.c:906
#2 0x00007f2116dfa9df in BSOCK_TCP::recv (this=0x7f21000027a8) at bsock_tcp.c:478
#3 0x000000000045ff9a in handle_UA_client_request (user=user@entry=0x7f21000027a8) at ua_server.c:89
#4 0x000000000043b8cc in handle_connection_request (arg=0x7f21000027a8) at socket_server.c:84
#5 0x00007f2116e22e45 in workq_server (arg=arg@entry=0x69e680 <socket_workq>) at workq.c:335
#6 0x00007f2116e0c39f in lmgr_thread_launcher (x=0x7f2100002978) at lockmgr.c:926
#7 0x00007f211612c0a4 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#8 0x00007f211502accd in clone () from /lib/x86_64-linux-gnu/libc.so.6

Thread 5 (Thread 0x7f21077fe700 (LWP 6191)):
#0 0x00007f2116132add in read () from /lib/x86_64-linux-gnu/libpthread.so.0
#1 0x00007f2116dfb170 in BSOCK_TCP::read_nbytes (this=0x7f2100001168, ptr=<optimized out>, nbytes=4) at bsock_tcp.c:906
#2 0x00007f2116dfa9df in BSOCK_TCP::recv (this=0x7f2100001168) at bsock_tcp.c:478
#3 0x000000000045ff9a in handle_UA_client_request (user=user@entry=0x7f2100001168) at ua_server.c:89
#4 0x000000000043b8cc in handle_connection_request (arg=0x7f2100001168) at socket_server.c:84
#5 0x00007f2116e22e45 in workq_server (arg=arg@entry=0x69e680 <socket_workq>) at workq.c:335
#6 0x00007f2116e0c39f in lmgr_thread_launcher (x=0x7f2100002618) at lockmgr.c:926
#7 0x00007f211612c0a4 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#8 0x00007f211502accd in clone () from /lib/x86_64-linux-gnu/libc.so.6

Thread 4 (Thread 0x7f2107fff700 (LWP 9566)):
#0 0x00007f2116130438 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1 0x00007f2116e0c83c in bthread_cond_timedwait_p (cond=0x69e9e0 <wait_for_next_run_cond>, m=0x69ea20 <_ZL5mutex>, abstime=0x7f2107ffed60, file=0x47a8cd "stats.c", line=117) at lockmgr.c:811
#2 0x0000000000441083 in wait_for_next_run () at stats.c:117
#3 statistics_thread_runner (arg=0x69e9e4 <wait_for_next_run_cond+4>, arg@entry=0x0) at stats.c:290
#4 0x00007f2116e0c39f in lmgr_thread_launcher (x=0xd04488) at lockmgr.c:926
#5 0x00007f211612c0a4 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#6 0x00007f211502accd in clone () from /lib/x86_64-linux-gnu/libc.so.6

Thread 3 (Thread 0x7f210cb34700 (LWP 9565)):
#0 0x00007f2116130438 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1 0x00007f2116e0c83c in bthread_cond_timedwait_p (cond=cond@entry=0x7f211703d900 <_ZL5timer>, m=m@entry=0x7f211703d940 <_ZL11timer_mutex>, abstime=abstime@entry=0x7f210cb33e20, file=file@entry=0x7f2116e2cf02 "watchdog.c", line=line@entry=313) at lockmgr.c:811
#2 0x00007f2116e224f9 in watchdog_thread (arg=arg@entry=0x0) at watchdog.c:313
#3 0x00007f2116e0c39f in lmgr_thread_launcher (x=0xd02df8) at lockmgr.c:926
#4 0x00007f211612c0a4 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#5 0x00007f211502accd in clone () from /lib/x86_64-linux-gnu/libc.so.6

Thread 2 (Thread 0x7f210d335700 (LWP 9564)):
#0 0x00007f211502218d in poll () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007f2116df2493 in bnet_thread_server_tcp (addr_list=addr_list@entry=0xa1e2a8, max_clients=<optimized out>, sockfds=<optimized out>, client_wq=client_wq@entry=0x69e680 <socket_workq>, nokeepalive=<optimized out>, handle_client_request=handle_client_request@entry=0x43b870 <handle_connection_request(void*)>) at bnet_server_tcp.c:298
#2 0x000000000043bacf in connect_thread (arg=arg@entry=0xa1e2a8) at socket_server.c:101
#3 0x00007f2116e0c39f in lmgr_thread_launcher (x=0xd04488) at lockmgr.c:926
#4 0x00007f211612c0a4 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#5 0x00007f211502accd in clone () from /lib/x86_64-linux-gnu/libc.so.6

Thread 1 (Thread 0x7f211482f700 (LWP 9560)):
#0 0x00007f211613318d in nanosleep () from /lib/x86_64-linux-gnu/libpthread.so.0
#1 0x00007f2116dfd024 in bmicrosleep (sec=sec@entry=30, usec=usec@entry=0) at bsys.c:100
#2 0x00007f2116e0c30c in check_deadlock () at lockmgr.c:566
#3 0x00007f211612c0a4 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#4 0x00007f211502accd in clone () from /lib/x86_64-linux-gnu/libc.so.6

#0 0x00007f211613318d in nanosleep () from /lib/x86_64-linux-gnu/libpthread.so.0
No symbol table info available.
#1 0x00007f2116dfd024 in bmicrosleep (sec=sec@entry=30, usec=usec@entry=0) at bsys.c:100
100 bsys.c: No such file or directory.
        timeout = {tv_sec = 30, tv_nsec = 0}
        tv = {tv_sec = 0, tv_usec = 0}
        tz = {tz_minuteswest = 344127232, tz_dsttime = 32545}
        status = <optimized out>
#2 0x00007f2116e0c30c in check_deadlock () at lockmgr.c:566
566 lockmgr.c: No such file or directory.
        __cancel_buf = {__cancel_jmp_buf = {{__cancel_jmp_buf = {0, -3986468669457419380, 0, 139780112523360, 0, 139780054775552, 3958416510911001484, 3958411306325259148}, __mask_was_saved = 0}}, __pad = {0x7f211482ef30, 0x0, 0x7f211482f700, 0x7f211482f700}}
        __cancel_routine = 0x7f2116e0c420 <cln_hdl(void*)>
        __not_first_call = <optimized out>
        old = 0
#3 0x00007f211612c0a4 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
No symbol table info available.
#4 0x00007f211502accd in clone () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#0 0x0000000000000000 in ?? ()
No symbol table info available.
#0 0x0000000000000000 in ?? ()
No symbol table info available.
#0 0x0000000000000000 in ?? ()
No symbol table info available.
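For readers unfamiliar with the code paths in thread 8 of the backtrace above: list_result() emits each output row through the prtit() callback, and prtit() writes via the UA's console socket. The following is a minimal, self-contained C++ sketch of that failure pattern, using simplified stand-in types rather than the actual Bareos classes; it only illustrates why calling fsend() through a NULL ua->UA_sock ends in the SIGSEGV caught by signal_handler().

```cpp
#include <cstdarg>
#include <cstdio>

// Stand-in for Bareos' BSOCK: fsend() formats into the socket's message buffer,
// so calling it through a NULL pointer dereferences 'this' and faults.
struct BSOCK {
   char msgbuf[256];
   bool fsend(const char *fmt, ...) {
      va_list ap;
      va_start(ap, fmt);
      int len = vsnprintf(msgbuf, sizeof(msgbuf), fmt, ap);  // writes through 'this'
      va_end(ap);
      return len >= 0;
   }
};

// Stand-in for Bareos' UA context: UA_sock stays NULL when no interactive
// console is attached, e.g. for a Console command run from an AfterJob RunScript.
struct UAContext {
   BSOCK *UA_sock = nullptr;
};

// Simplified analogue of prtit(), the callback list_result() uses to print rows.
static void prtit(void *ctx, const char *msg)
{
   UAContext *ua = static_cast<UAContext *>(ctx);
   ua->UA_sock->fsend("%s", msg);   // NULL dereference when no console socket exists
}

int main()
{
   UAContext ua;                           // admin-job UA without a console socket
   prtit(&ua, "No results to list.\n");    // reproduces the crash pattern from the core dump
}
```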
Tags | No tags attached. | ||||
Fix committed to bareos master branch with changesetid 5155. | |
Fix committed to bareos bareos-14.2 branch with changesetid 5321. | |
bareos: master 0a34bd6b 2015-03-24 11:20 Committer: pstorz Ported: N/A |
Don't crash when ua->UA_sock == NULL in prtit(). Some admin jobs have a UA context but no ua->UA_sock, and blindly calling ua->UA_sock->fsend() then crashes in the worst possible way. We now call ua->send_msg(), which has fallback logic for ua->UA_sock being NULL and uses Jmsg with M_INFO to redirect the output to the Job. This might not fully fix 0000418, as the admin Job apparently wants interaction with the user, which is not going to work in an admin Job, but crashing is about the worst that can happen. Fixes 0000418: traceback running after job |
Affected Issues 0000418 |
mod - src/dird/ua_output.c
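The changeset above routes the output through ua->send_msg() instead of dereferencing ua->UA_sock directly. Below is a minimal sketch of that defensive pattern, again with simplified stand-in types (BSOCK, UAContext, a Jmsg-like helper) rather than the real src/dird/ua_output.c code; it shows how a NULL console socket falls back to the job's message stream instead of crashing.

```cpp
#include <cstdarg>
#include <cstdio>

// Stand-in for Bareos' BSOCK; here "sending" just writes to stdout.
struct BSOCK {
   char msgbuf[512];
   bool fsend_v(const char *fmt, va_list ap) {
      vsnprintf(msgbuf, sizeof(msgbuf), fmt, ap);
      return fputs(msgbuf, stdout) >= 0;   // real code writes to the console socket
   }
};

struct JCR { };                            // stand-in job control record

// Stand-in for Jmsg(jcr, M_INFO, ...): deliver the text to the job report instead.
static void Jmsg_info(JCR * /*jcr*/, const char *fmt, va_list ap)
{
   vfprintf(stderr, fmt, ap);
}

struct UAContext {
   BSOCK *UA_sock = nullptr;
   JCR *jcr = nullptr;

   // Fallback in the spirit of ua->send_msg(): use the console socket if one is
   // attached, otherwise redirect the message to the Job so nothing is lost.
   void send_msg(const char *fmt, ...) {
      va_list ap;
      va_start(ap, fmt);
      if (UA_sock) {
         UA_sock->fsend_v(fmt, ap);
      } else {
         Jmsg_info(jcr, fmt, ap);
      }
      va_end(ap);
   }
};

// prtit() after the fix: no direct UA_sock dereference, so a Console command
// run from an AfterJob RunScript can no longer take the director down.
static void prtit(void *ctx, const char *msg)
{
   static_cast<UAContext *>(ctx)->send_msg("%s", msg);
}

int main()
{
   UAContext ua;                            // admin-job UA: no console socket attached
   prtit(&ua, "No results to list.\n");     // now logged via Jmsg_info() instead of crashing
}
```

The point of the fallback is that an AfterJob Console command has no interactive console attached, so any output the command produces can only usefully land in the job report.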
bareos: bareos-14.2 79d1036f 2015-03-24 11:20 Ported: N/A |
Don't crash when ua->UA_sock == NULL in prtit(). Some admin jobs have a UA context but no ua->UA_sock, and blindly calling ua->UA_sock->fsend() then crashes in the worst possible way. We now call ua->send_msg(), which has fallback logic for ua->UA_sock being NULL and uses Jmsg with M_INFO to redirect the output to the Job. This might not fully fix 0000418, as the admin Job apparently wants interaction with the user, which is not going to work in an admin Job, but crashing is about the worst that can happen. Fixes 0000418: traceback running after job |
Affected Issues 0000418 |
mod - src/dird/ua_output.c
Date Modified | Username | Field | Change |
---|---|---|---|
2015-01-30 21:52 | jbehrend | New Issue | |
2015-01-30 21:52 | jbehrend | File Added: bareos.config.tar.gz | |
2015-03-27 18:16 | pstorz | Changeset attached | => bareos master 0a34bd6b |
2015-03-27 18:16 | pstorz | Note Added: 0001657 | |
2015-03-27 18:16 | pstorz | Status | new => resolved |
2015-03-27 18:16 | pstorz | Resolution | open => fixed |
2015-05-27 12:49 | joergs | Relationship added | child of 0000447 |
2015-05-27 12:49 | joergs | Additional Information Updated | |
2015-05-29 18:06 | mvwieringen | Changeset attached | => bareos bareos-14.2 79d1036f |
2015-05-29 18:06 | mvwieringen | Note Added: 0001752 | |
2015-06-01 16:23 | joergs | Status | resolved => closed |