View Issue Details

ID: 0000662
Project: bareos-core
Category: storage daemon
View Status: public
Last Update: 2016-10-29 09:02
Reporter: M1Sports20
Assigned To: joergs
Priority: normal
Severity: crash
Reproducibility: have not tried
Status: closed
Resolution: fixed
Platform: Raspberry Pi
OS: ArchLinux
OS Version: Rolling
Product Version: 15.2.3
Fixed in Version: 16.2.4
Summary: 0000662: bareos-storage daemon dies
Description: The bareos-storage daemon dies after the director connects to it and reads status:

Director:
Connecting to Director 192.168.5.20:9101
1000 OK: my-dir Version: 15.2.2 (16 November 2015)
Enter a period to cancel a command.
*status
Status available for:
     1: Director
     2: Storage
     3: Client
     4: Scheduler
     5: All
Select daemon type for status (1-5): 2
Automatically selected Storage: Offsite
Connecting to Storage daemon Offsite at bstorage01:9103

bstorage01 Version: 15.2.3 (07 March 2016) armv7l-unknown-linux-gnueabihf debian
Daemon started 25-May-16 23:10. Jobs: run=0, running=0.
 Heap: heap=135,168 smbytes=28,481 max_bytes=88,486 bufs=80 max_bufs=81
 Sizes: boffset_t=8 size_t=4 int32_t=4 int64_t=8 mode=0 bwlimit=0kB/s


Storage Daemon:
[root@bstorage01 user]# /usr/sbin/bareos-sd -d99 -c /etc/bareos/bareos-sd.conf
bareos-sd (90): stored_conf.c:845-0 Inserting Director res: my-mon
[root@bstorage01 user]# bstorage01 (8): crypto_cache.c:55-0 Could not open crypto cache file. /var/lib/bareos/bareos-sd.9103.cryptoc ERR=No such file or directory
bstorage01 (10): socket_server.c:112-0 stored: listening on port 9103
bstorage01 (90): stored.c:656-0 calling init_dev /mnt/drive_a
bstorage01 (10): stored.c:658-0 SD init done /mnt/drive_a
bstorage01 (50): cram-md5.c:68-0 send: auth cram-md5 <454384897.1464232877@bstorage01> ssl=2
bstorage01 (99): cram-md5.c:143-0 sending resp to challenge: Q4oTXTqzhE4rk7I83j2SSB
bstorage01 (50): bnet.c:143-0 TLS server negotiation established.
bstorage01 (90): dir_cmd.c:277-0 Message channel init completed.
BAREOS interrupted by signal 11: Segmentation violation
Kaboom! bareos-sd, bstorage01 got signal 11 - Segmentation violation. Attempting traceback.
Kaboom! exepath=/usr/sbin/
Calling: /usr/sbin/btraceback /usr/sbin/bareos-sd 8413 /var/lib/bareos
bsmtp: bsmtp.c:501-0 Failed to connect to mailhost localhost
The btraceback call returned 1
Dumping: /var/lib/bareos/bstorage01.8413.bactrace


/var/lib/bareos/bstorage01.8413.bactrace:
Attempt to dump locks
threadid=0x75cff450 max=2 current=-1
threadid=0x752ff450 max=0 current=-1
threadid=0x76670450 max=0 current=-1
threadid=0x76f7b000 max=0 current=-1
Attempt to dump current JCRs. njcrs=1
threadid=0x75cff450 JobId=0 JobStatus=C jcr=0x753008e0 name=*System*
threadid=0x75cff450 killable=1 JobId=0 JobStatus=C jcr=0x753008e0 name=*System*
        use_count=1
        JobType=I JobLevel=
        sched_time=25-May-2016 23:21 start_time=31-Dec-1969 19:00
        end_time=31-Dec-1969 19:00 wait_time=31-Dec-1969 19:00
        db=(nil) db_batch=(nil) batch_started=0
Steps To Reproduce: Run "status storage" from director
Tags: No tags attached.

Activities

mvwieringen

2016-05-27 15:47

developer   ~0002277

bactrace files are of little use here; we need a proper stack trace, i.e. a
traceback file. You only get those when you have debug symbols and a debugger
such as gdb installed.
M1Sports20

2016-05-27 16:07

reporter   ~0002279

I agree, but I didn't know whether the bactrace file plus the log would give you any clues, for example if you had seen this before.

I can provide a stack trace and core file after the holiday weekend. I may even try to debug it a bit.

Thanks,
Michael
mvwieringen

2016-05-27 16:15

developer   ~0002280

A signal 11 (SEGV) always needs a stack trace, because in the debugger you can
then see any suspicious function arguments, where exactly it crashes, and which
call triggered the SEGV. So yes, you might be able to answer the question
yourself by examining the crash in a debugger. With more information it may
turn out to be easy to solve, or not; who knows.

A core file is usually not that useful, because you need the exact same
environment (same platform, etc.) to do anything with it.

But the stack trace is usually a good starting point.
mvwieringen

2016-05-27 16:16

developer   ~0002281

I'll leave this on feedback until you have new input.
M1Sports20

2016-06-04 05:11

reporter   ~0002286

Not too much information:

(gdb) bt
#0 0x76ec182c in list_plugins(alist*, POOL_MEM&) () from /usr/lib/bareos/libbareos-15.2.3.so
Backtrace stopped: Cannot access memory at address 0x392e
M1Sports20

2016-09-23 04:29

reporter   ~0002360

More details:

(gdb) bt
#0 0x76ec1854 in list_plugins(alist*, POOL_MEM&) () from /usr/lib/bareos/libbareos-15.2.4.so
#1 0x00027174 in list_status_header (sp=0x75cfeb80, sp@entry=0x75cfeb78) at status.c:443
#2 0x00028b2c in output_status (devicenames=0x46e38 "DriveA", sp=0x75cfeb78, jcr=0x753008e0) at status.c:77
#3 status_cmd (jcr=jcr@entry=0x753008e0) at status.c:973
#4 0x0001ea64 in handle_director_connection (dir=dir@entry=0x63f08) at dir_cmd.c:307
#5 0x000269e4 in handle_connection_request (arg=0x63f08) at socket_server.c:99
#6 0x76ed19e4 in workq_server () from /usr/lib/bareos/libbareos-15.2.4.so
#7 0x76eb8e8c in lmgr_thread_launcher () from /usr/lib/bareos/libbareos-15.2.4.so
#8 0x76c24fac in start_thread () from /usr/lib/libpthread.so.0
#9 0x76993cc0 in ?? () from /usr/lib/libc.so.6
M1Sports20

2016-09-23 05:59

reporter   ~0002361

Implemented quick fix @ https://github.com/bareos/bareos/pull/53

I admit I do not fully understand the code execution path, but this does seem to fix the issue. Let me know if you want me to do something else or test something.

I do have the default plugin directory along with a single plugin in the directory:
/usr/lib/bareos/plugins/autoxflate-sd.so
joergs

2016-09-23 09:38

developer   ~0002362

Interesting. We recently changed this (and other places) in a similar way in bareos-16.2 and master. This was required for Fedora 24 because of gcc >= 6.

What compiler did you use?
joergs

2016-09-23 09:44

developer   ~0002363

If your problem only comes from gcc (with optimization enabled), the commit https://github.com/bareos/bareos/commit/805800af8b88b07efe0748f9a7533af67095e85d should also be relevant.
M1Sports20

2016-09-23 14:07

reporter   ~0002364

gcc (GCC) 6.1.1 20160501

I was going to try 16.2 but didn't want to upgrade since it wasn't released or considered stable yet.
joergs

2016-09-23 14:21

developer   ~0002365

Sure, it makes sense to wait for the release. I'm optimistic that we'll have a release candidate today.

A proper fix for bareos-15.2 would include the patch above. However, as this might influence other things, we will not add it to bareos-15.2.

I guess the problem only occurred in this place in the code because you're only running the bareos-sd on the Raspberry?

AFAIK a solution would be to compile it without optimization (-O0).
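
For readers wondering why a crash can appear only with gcc >= 6 and optimization enabled: the thread does not say what exactly was optimized away, so the following is only an illustration of one well-known cause, not a description of the actual Bareos code. Since GCC 6 the optimizer assumes that `this` is never null inside a member function, so a null check written on `this` may be silently dropped. The type `PluginList` and the function `plugin_count` below are hypothetical stand-ins, not Bareos names.

```cpp
// Hypothetical stand-in for a list type; this is NOT the real Bareos alist.
struct PluginList {
    int num_items;

    // Fragile pattern (shown only as a comment, because calling a member
    // function through a null pointer is undefined behavior):
    //
    //   bool empty() const { return this == NULL || num_items == 0; }
    //
    // With optimization enabled, GCC 6 and later may assume `this` is
    // non-null and drop the first half of that check.
    constexpr bool empty() const { return num_items == 0; }
};

// Safer pattern: test the pointer in the caller, before any member access.
// constexpr is used only so the behavior can be checked at compile time.
constexpr int plugin_count(const PluginList* list) {
    return (list == nullptr || list->empty()) ? 0 : list->num_items;
}
```

If this is indeed the kind of code involved, it would be consistent with -O0 hiding the problem. The GCC 6 release notes also mention `-fno-delete-null-pointer-checks` as a temporary workaround for code bases that rely on such checks; whether that flag helps in this particular case was not tested in this thread.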
M1Sports20

2016-09-23 20:06

reporter   ~0002366

I confirmed disabling optimizations fixes the problem.
M1Sports20

2016-09-23 20:08

reporter   ~0002367

Also, yes, I am running the bareos-sd on the Raspberry.
M1Sports20

2016-09-24 03:50

reporter   ~0002368

FYI: My director box has also moved to gcc 6.1.1. I am now seeing this problem on directors too.
M1Sports20

2016-10-03 16:45

reporter   ~0002373

This has been fixed in 16.2.4 rc1.
joergs

2016-10-29 09:01

developer   ~0002421

Bareos 16.2.4 is released. Therefore changes to bareos 15.2 do not seem to be required.

Issue History

Date Modified Username Field Change
2016-05-26 05:26 M1Sports20 New Issue
2016-05-27 15:47 mvwieringen Note Added: 0002277
2016-05-27 15:48 mvwieringen Status new => feedback
2016-05-27 16:07 M1Sports20 Note Added: 0002279
2016-05-27 16:07 M1Sports20 Status feedback => new
2016-05-27 16:15 mvwieringen Note Added: 0002280
2016-05-27 16:16 mvwieringen Note Added: 0002281
2016-05-27 16:16 mvwieringen Status new => feedback
2016-06-04 05:11 M1Sports20 Note Added: 0002286
2016-06-04 05:11 M1Sports20 Status feedback => new
2016-09-23 04:29 M1Sports20 Note Added: 0002360
2016-09-23 05:59 M1Sports20 Note Added: 0002361
2016-09-23 09:36 joergs Assigned To => joergs
2016-09-23 09:36 joergs Status new => assigned
2016-09-23 09:38 joergs Note Added: 0002362
2016-09-23 09:39 joergs Status assigned => feedback
2016-09-23 09:44 joergs Note Added: 0002363
2016-09-23 14:07 M1Sports20 Note Added: 0002364
2016-09-23 14:07 M1Sports20 Status feedback => assigned
2016-09-23 14:21 joergs Note Added: 0002365
2016-09-23 20:06 M1Sports20 Note Added: 0002366
2016-09-23 20:08 M1Sports20 Note Added: 0002367
2016-09-24 03:50 M1Sports20 Note Added: 0002368
2016-10-03 16:45 M1Sports20 Note Added: 0002373
2016-10-29 09:01 joergs Note Added: 0002421
2016-10-29 09:01 joergs Status assigned => resolved
2016-10-29 09:01 joergs Fixed in Version => 16.2.4
2016-10-29 09:01 joergs Resolution open => fixed
2016-10-29 09:02 joergs Status resolved => closed