View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0000527 | bareos-core | director | public | 2015-09-30 19:52 | 2019-12-18 15:25 |
Reporter | iloving | Assigned To | arogge | ||
Priority | normal | Severity | major | Reproducibility | always |
Status | closed | Resolution | won't fix | ||
Platform | Linux | OS | CentOS | OS Version | 6 |
Product Version | 15.2.2 | ||||
Summary | 0000527: Director crashes (not reproduceable) , the first time you go to Jobs section in bat | ||||
Description | Going to jobs section in bat causes bareos-dir to crash. | ||||
Steps To Reproduce | 1. Start bareos-dir if it's not running. 2. Launch bat. 3. Go to the Jobs tab. Bat will appear to hang while populating the jobs list window, but what has actually happened is that bareos-dir has crashed. If you restart bareos-dir without closing bat, bat will reconnect and everything continues as normal. If you quit bat and re-run it, the problem happens again. | ||||
Additional Information | The following is mailed to the root user:
    From root@localhost.jonahgroup.com Wed Sep 30 13:49:23 2015
    Return-Path: <root@localhost.jonahgroup.com>
    X-Original-To: root@localhost
    Delivered-To: root@localhost.jonahgroup.com
    From: root@localhost.jonahgroup.com
    Subject: Bareos GDB traceback of bareos-dir on archive01.jonahgroup.com
    Sender: bareos@archive01.jonahgroup.com
    To: root@localhost.jonahgroup.com
    Date: Wed, 30 Sep 2015 13:49:23 -0400 (EDT)
    Status: RO
    Created /archivepool/bareos/bareos-dir.core.12195 for doing postmortem debugging
    [New Thread 12197]
    [New Thread 12200]
    [New Thread 12201]
    [New Thread 12202]
    [New Thread 12234]
    [New Thread 12195]
    [Thread debugging using libthread_db enabled]
    Core was generated by `/usr/sbin/bareos-dir -g bareos -c /etc/bareos/bareos-dir.conf'.
    #0 0x00007f9f3a1bdfbd in nanosleep () from /lib64/libpthread.so.0
    $1 = 1751347809
    $2 = 10276984
    $3 = 10277048
    /usr/lib/bareos/scripts/btraceback.gdb:4: Error in sourced command file:
    No symbol table is loaded. Use the "file" command. | ||||
Tags | No tags attached. | ||||
Hello, can you please provide the exact version of your OS? Also, please install the debuginfo package and send the files that are then generated. It would also be very interesting if you could reproduce the problem on version 15.2. You can find the repos here: http://download.bareos.org/bareos/release/15.2/ |
|
I've installed bareos-contrib-debuginfo and made it crash again, but I think something might be missing, since the traceback file says 'no symbol table loaded':
    Created /archivepool/bareos/bareos-dir.core.28293 for doing postmortem debugging
    [New Thread 28294]
    [New Thread 28297]
    [New Thread 28298]
    [New Thread 28299]
    [New Thread 28328]
    [New Thread 28293]
    [Thread debugging using libthread_db enabled]
    Core was generated by `/usr/sbin/bareos-dir -g bareos -c /etc/bareos/bareos-dir.conf'.
    #0 0x0000003bb720efbd in nanosleep () from /lib64/libpthread.so.0
    $1 = 1751347809
    $2 = 10899576
    $3 = 10899640
    /usr/lib/bareos/scripts/btraceback.gdb:4: Error in sourced command file:
    No symbol table is loaded. Use the "file" command. |
|
Ok, this is weird... I left bat open while I was collecting the info you wanted, and even though bareos-dir crashed, the jobs panel (which I assume timed out waiting for the director) displayed a partial data set, stopping at one specific job that happens to be a Copy job. Dunno if this helps at all, but the job it seems to have died on contains the following:
    # Fake fileset for copy jobs
    Fileset {
      Name = None_Full
      Include {
        Options {
          signature = MD5
        }
      }
    }
    # Fake client for copy jobs
    Client {
      Name = None_Full
      Address = localhost
      Password = "NoNe"
      Catalog = MyCatalog
    }
    Job {
      Name = "Copy-Full"
      Type = Copy
      Client = None_Full
      Fileset = None_Full
      Level = Full
      Messages = Standard
      Pool = Full
      Selection Type = PoolUncopiedJobs
      Priority = 100
    } |
|
Hello, after this commit: https://github.com/bareos/bareos/commit/858a8a642b3d9e78ce7431be5f59e36498d057af you should not be obligated to define a fileset for a copy job anymore. Please check if the problem persists in 15.2. Thanks for your time and help Philippp |
|
I have verified that the problem continues in 15.2 when using the same config files. I've tried removing the Client and Fileset lines from the jobs in question, and the issue persists. Here are the last lines generated by bat before everything dies:
    bat: console/console.cpp:438-0 job_defaults: key=job, value=Copy-Full
    bat: console/console.cpp:438-0 job_defaults: key=pool, value=Full
    bat: console/console.cpp:438-0 job_defaults: key=messages, value=Standard
    bat: console/console.cpp:438-0 job_defaults: key=client, value=*None*
The Job definition now looks like this:
    Job {
      Name = "Copy-Full"
      Type = Copy
      Level = Full
      Messages = Standard
      Pool = Full
      Selection Type = PoolUncopiedJobs
      Priority = 100
    } |
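To see at a glance which job and key bat was processing when the director went away, the `job_defaults` debug lines quoted above can be pulled apart with a few lines of Python. This is only a sketch: the helper name `parse_job_defaults` is hypothetical, and the pattern assumes the `key=..., value=...` shape seen in the four quoted lines and that values contain no whitespace.

```python
import re

# Assumption: every debug line has the shape quoted in the note above, e.g.
#   bat: console/console.cpp:438-0 job_defaults: key=client, value=*None*
# and values never contain whitespace.
JOB_DEFAULTS_RE = re.compile(r"job_defaults: key=(?P<key>\w+), value=(?P<value>\S+)")


def parse_job_defaults(text):
    """Return (key, value) pairs in the order they appear in bat's output."""
    return [(m.group("key"), m.group("value")) for m in JOB_DEFAULTS_RE.finditer(text)]


# The four lines quoted in the note above.
sample = """\
bat: console/console.cpp:438-0 job_defaults: key=job, value=Copy-Full
bat: console/console.cpp:438-0 job_defaults: key=pool, value=Full
bat: console/console.cpp:438-0 job_defaults: key=messages, value=Standard
bat: console/console.cpp:438-0 job_defaults: key=client, value=*None*
"""

pairs = parse_job_defaults(sample)
# The last pair is the last default bat logged before the output stopped.
print(pairs[-1])
```

For this sample the last logged default is the `client` key with the value `*None*`, which matches the job definition no longer naming a client.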
|
I've uploaded the backtrace. The core file is 27 MB when gzipped, so I can't upload it. | |
I installed a CentOS 6 test system with Bareos 15.2.1-rc2 and ran bat. However, I was not able to reproduce this problem. Of course, I only have a small test setup here. Please enable "Debug Comm" in bat (Settings -> Preferences -> Debug -> Debug Comm) and run it again. If the director crashes again, please check the last "send" line from the stdout output of bat. It should look similar to: bat: bcomm/dircomm.cpp:276-0 conn 0 send: .defaults job="RestoreFiles" The text after "send:" is the last command sent to the director. Restart the director and run this command in bconsole to see if the director also crashes without using bat. |
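If bat's stdout has been captured to a file, the last "send" line can be picked out automatically rather than by scrolling. The following is a minimal sketch, not part of Bareos: the function name `last_sent_command` is hypothetical, and the regular expression assumes every send line has the same shape as the single example quoted above.

```python
import re

# Assumption: send lines look like the example in the note above:
#   bat: bcomm/dircomm.cpp:276-0 conn 0 send: .defaults job="RestoreFiles"
SEND_RE = re.compile(r"^bat: \S+ conn \d+ send: (?P<cmd>.*)$")


def last_sent_command(lines):
    """Return the text after the last 'send:' marker, or None if there is none."""
    last = None
    for line in lines:
        match = SEND_RE.match(line.rstrip("\n"))
        if match:
            last = match.group("cmd")
    return last


# Sample input: one send line in the quoted format plus an unrelated debug line.
sample_output = [
    'bat: bcomm/dircomm.cpp:276-0 conn 0 send: .defaults job="RestoreFiles"\n',
    "bat: console/console.cpp:438-0 job_defaults: key=job, value=RestoreFiles\n",
]

print(last_sent_command(sample_output))
```

With a real capture, the list can be replaced by an open file object, for example `last_sent_command(open("bat.log"))`, where `bat.log` is a placeholder name for wherever the output was saved. The returned text is what would then be typed into bconsole.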
|
bat was replaced by bareos-webui, so problems with bat won't be handled anymore. | |
Date Modified | Username | Field | Change |
---|---|---|---|
2015-09-30 19:52 | iloving | New Issue | |
2015-10-01 10:13 | pstorz | Note Added: 0001855 | |
2015-10-01 10:13 | pstorz | Assigned To | => pstorz |
2015-10-01 10:13 | pstorz | Status | new => feedback |
2015-10-01 17:20 | iloving | Note Added: 0001859 | |
2015-10-01 17:20 | iloving | Status | feedback => assigned |
2015-10-01 17:23 | iloving | Note Added: 0001860 | |
2015-10-02 10:11 | pstorz | Note Added: 0001861 | |
2015-10-21 19:32 | iloving | Note Added: 0001885 | |
2015-10-21 19:41 | iloving | File Added: bareos.12556.traceback | |
2015-10-21 19:42 | iloving | Note Added: 0001886 | |
2015-11-06 18:30 | maik | Relationship added | child of 0000554 |
2015-11-16 18:21 | joergs | Assigned To | pstorz => joergs |
2015-11-16 19:03 | joergs | Note Added: 0001968 | |
2015-11-16 19:04 | joergs | Status | assigned => feedback |
2015-11-20 13:30 | maik | Priority | high => normal |
2015-11-20 13:30 | maik | Severity | block => major |
2015-11-20 13:30 | maik | Summary | Director crashes, the first time you go to Jobs section in bat => Director crashes (not reproduceable) , the first time you go to Jobs section in bat |
2015-12-11 09:48 | joergs | Assigned To | joergs => |
2015-12-11 09:48 | joergs | Relationship deleted | child of 0000554 |
2015-12-11 09:49 | joergs | Product Version | 14.2.2 => 15.2.2 |
2019-01-16 11:36 | arogge | Note Added: 0003185 | |
2019-01-16 11:36 | arogge | Status | feedback => resolved |
2019-01-16 11:36 | arogge | Resolution | open => won't fix |
2019-01-16 11:36 | arogge | Assigned To | => arogge |
2019-12-18 15:25 | arogge | Status | resolved => closed |