View Issue Details

ID: 0001024
Project: bareos-core
Category: storage daemon
View Status: public
Last Update: 2019-07-15 10:33
Reporter: NTANMA
Assigned To: arogge
Priority: none
Severity: minor
Reproducibility: always
Status: closed
Resolution: duplicate
Platform: Linux
OS: CentOS
OS Version: 7
Product Version: 17.2.6
Summary: 0001024: crash SD if 2 jobs are running using one tape
Description: In general, everything works stably. However, when 2 jobs that write to the same pool are started (by schedule or manually), the first job runs while the second waits, as expected, until the tape is released. A few minutes after the first job completes, the SD dies.
P.S. Sorry for the quality of the text; I use Google Translate.
Steps To Reproduce: Run 2 jobs using the same tape.
Additional Information:
# cat bareos-sd.585.bactrace
Attempt to dump locks
threadid=0x0000007f2c46a707 max=0 current=-1
threadid=0x3000007f2c4626f7 max=0 current=-1
threadid=0x6600007f2c4807d7 max=0 current=-1
threadid=0x6400007f2c52caa8 max=0 current=-1
Attempt to dump current JCRs. njcrs=0
Attempt to dump plugins. Hook count=1
Plugin 0xe6c938 name="autoxflate-sd.so"
Plugin 0xe6d388 name="scsicrypto-sd.so"
Plugin 0xe6dbd8 name="scsitapealert-sd.so"
----------------------------------------------
# cat bareos.585.traceback
Created /var/lib/bareos/bareos-sd.core.585 for doing postmortem debugging
[New LWP 587]
[New LWP 589]
[New LWP 590]
[New LWP 585]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/sbin/bareos-sd'.
#0 0x00007f2c50ff5f3d in nanosleep () from /usr/lib64/libpthread.so.0
$1 = '\000' <repeats 127 times>
$2 = 0xe3c068 "bareos-sd"
$3 = 0xe3c0a8 "/usr/sbin/bareos-sd"
$4 = 0x0
$5 = 0x7f2c5218317c "17.2.6 (19 Jun 2018)"
$6 = 0x7f2c52183164 "x86_64-redhat-linux-gnu"
$7 = 0x7f2c5218315d "redhat"
$8 = 0x7f2c52183660 "CentOS Linux release 7.4.1708 (Core) "
$9 = "as-bareos-msk", '\000' <repeats 242 times>
$10 = 0x7f2c52183688 "redhat CentOS Linux release 7.4.1708 (Core) "
Environment variable "TestName" not defined.
#0 0x00007f2c50ff5f3d in nanosleep () from /usr/lib64/libpthread.so.0
#1 0x00007f2c5214f2b4 in bmicrosleep (sec=sec@entry=30, usec=usec@entry=0) at bsys.c:171
#2 0x00007f2c521613d1 in check_deadlock () at lockmgr.c:568
#3 0x00007f2c50feee25 in start_thread () from /usr/lib64/libpthread.so.0
#4 0x00007f2c4fee4bad in clone () from /usr/lib64/libc.so.6
 
Thread 4 (Thread 0x7f2c52caa880 (LWP 585)):
#0 0x00007f2c50ff6279 in waitpid () from /usr/lib64/libpthread.so.0
#1 0x00007f2c52172964 in signal_handler (sig=11) at signal.c:240
#2 <signal handler called>
#3 e_msg (file=file@entry=0x7f2c5217e163 "bnet_server_tcp.c", line=line@entry=227, type=type@entry=1, level=level@entry=0, fmt=0x7f2c5217e193 "Cannot bind port %d: ERR=%s.\n") at message.c:1537
#4 0x00007f2c52142f55 in bnet_thread_server_tcp (addr_list=0xe3e1f8, max_clients=42, sockfds=0xe64f58, client_wq=client_wq@entry=0x628620 <socket_workq>, nokeepalive=false, handle_client_request=handle_client_request@entry=0x419b40 <handle_connection_request(void*)>) at bnet_server_tcp.c:227
#5 0x0000000000419eb8 in start_socket_server (addrs=<optimized out>) at socket_server.c:122
#6 0x00000000004091ea in main (argc=<optimized out>, argv=<optimized out>) at stored.c:322
 
Thread 3 (Thread 0x7f2c4626f700 (LWP 590)):
#0 0x00007f2c50ff2d42 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /usr/lib64/libpthread.so.0
#1 0x00007f2c5216189c in bthread_cond_timedwait_p (cond=cond@entry=0x628580 <wait_for_next_run>, m=m@entry=0x6285c0 <_ZL5mutex>, abstime=abstime@entry=0x7f2c4626ed50, file=file@entry=0x421340 "sd_stats.c", line=line@entry=400) at lockmgr.c:813
#2 0x0000000000418e8f in statistics_thread_runner (arg=arg@entry=0x0) at sd_stats.c:400
#3 0x00007f2c52161440 in lmgr_thread_launcher (x=0xe65cd8) at lockmgr.c:928
#4 0x00007f2c50feee25 in start_thread () from /usr/lib64/libpthread.so.0
#5 0x00007f2c4fee4bad in clone () from /usr/lib64/libc.so.6
 
Thread 2 (Thread 0x7f2c46a70700 (LWP 589)):
#0 0x00007f2c50ff2d42 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /usr/lib64/libpthread.so.0
#1 0x00007f2c5216189c in bthread_cond_timedwait_p (cond=cond@entry=0x7f2c5239a800 <_ZL5timer>, m=m@entry=0x7f2c5239a840 <_ZL11timer_mutex>, abstime=abstime@entry=0x7f2c46a6fd40, file=file@entry=0x7f2c52187a82 "watchdog.c", line=line@entry=313) at lockmgr.c:813
#2 0x00007f2c5217c458 in watchdog_thread (arg=arg@entry=0x0) at watchdog.c:313
#3 0x00007f2c52161440 in lmgr_thread_launcher (x=0xe65708) at lockmgr.c:928
#4 0x00007f2c50feee25 in start_thread () from /usr/lib64/libpthread.so.0
#5 0x00007f2c4fee4bad in clone () from /usr/lib64/libc.so.6
 
Thread 1 (Thread 0x7f2c4807d700 (LWP 587)):
#0 0x00007f2c50ff5f3d in nanosleep () from /usr/lib64/libpthread.so.0
#1 0x00007f2c5214f2b4 in bmicrosleep (sec=sec@entry=30, usec=usec@entry=0) at bsys.c:171
#2 0x00007f2c521613d1 in check_deadlock () at lockmgr.c:568
#3 0x00007f2c50feee25 in start_thread () from /usr/lib64/libpthread.so.0
#4 0x00007f2c4fee4bad in clone () from /usr/lib64/libc.so.6
#0 0x00007f2c50ff5f3d in nanosleep () from /usr/lib64/libpthread.so.0
No symbol table info available.
#1 0x00007f2c5214f2b4 in bmicrosleep (sec=sec@entry=30, usec=usec@entry=0) at bsys.c:171
171 status = nanosleep(&timeout, NULL);
timeout = {tv_sec = 30, tv_nsec = 0}
tv = {tv_sec = 0, tv_usec = -1930332233579481344}
tz = {tz_minuteswest = 1073746040, tz_dsttime = 32556}
status = <optimized out>
#2 0x00007f2c521613d1 in check_deadlock () at lockmgr.c:568
568 while (!bmicrosleep(30, 0)) {
__clframe = <optimized out>
old = 0
#3 0x00007f2c50feee25 in start_thread () from /usr/lib64/libpthread.so.0
No symbol table info available.
#4 0x00007f2c4fee4bad in clone () from /usr/lib64/libc.so.6
No symbol table info available.
No symbol table info available.
#0 0x0000000000000000 in ?? ()
No symbol table info available.
#0 0x0000000000000000 in ?? ()
No symbol table info available.
#0 0x0000000000000000 in ?? ()
No symbol table info available.

-------------------------------
 systemctl status bareos-sd
● bareos-sd.service - SYSV: Backup Archiving REcovery Open Sourced.
   Loaded: loaded (/etc/rc.d/init.d/bareos-sd; bad; vendor preset: disabled)
   Active: active (exited) since Thu 2018-10-25 10:39:46 MSK; 33min ago
     Docs: man:systemd-sysv-generator(8)
  Process: 31223 ExecStop=/etc/rc.d/init.d/bareos-sd stop (code=exited, status=0/SUCCESS)
  Process: 573 ExecStart=/etc/rc.d/init.d/bareos-sd start (code=exited, status=0/SUCCESS)
 
Oct 25 10:39:46 as-bareos-msk systemd[1]: Starting SYSV: Backup Archiving REcovery Open Sourced....
Oct 25 10:39:46 as-bareos-msk runuser[582]: pam_unix(runuser:session): session opened for user bareos by (uid=0)
Oct 25 10:39:46 as-bareos-msk runuser[582]: pam_unix(runuser:session): session closed for user bareos
Oct 25 10:39:46 as-bareos-msk bareos-sd[573]: Starting Bareos Storage services: [ OK ]
Oct 25 10:39:46 as-bareos-msk systemd[1]: Started SYSV: Backup Archiving REcovery Open Sourced..
Oct 25 11:09:46 as-bareos-msk bareos-sd[585]: bareos-sd: ABORTING due to ERROR in bnet_server_tcp.c:227
                                              Cannot bind port 9103: ERR=Address already in use.
Oct 25 11:09:46 as-bareos-msk bareos-sd[585]: BAREOS interrupted by signal 11: Segmentation violation
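
The log above shows the SD aborting with "Cannot bind port 9103: ERR=Address already in use" exactly 30 minutes after startup. As a minimal sketch of that error condition only, and not of any Bareos code, the following Python snippet reproduces an "Address already in use" failure by binding two listening sockets to the same address and port. The helper names `try_bind` and `demo_double_bind` are hypothetical and exist only for this illustration; it says nothing about why the SD attempted the second bind.

```python
import errno
import socket


def try_bind(port, host="127.0.0.1"):
    """Try to bind and listen on host:port.

    Returns (socket, None) on success or (None, errno_value) on failure.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen(1)
        return sock, None
    except OSError as exc:
        sock.close()
        return None, exc.errno


def demo_double_bind():
    """Bind one listener, then try to bind a second one to the same port."""
    # Port 0 asks the kernel for any free port, so the demo does not
    # depend on 9103 (the port from the log) being available.
    first, first_err = try_bind(0)
    port = first.getsockname()[1]
    second, second_err = try_bind(port)
    if second is not None:
        second.close()
    first.close()
    return first_err, second_err


if __name__ == "__main__":
    first_err, second_err = demo_double_bind()
    print("first bind error:", first_err)
    print("second bind is EADDRINUSE:", second_err == errno.EADDRINUSE)
```

On an affected host, checking which process already holds the SD port at the time of the abort would be the analogous diagnostic step.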
Tags: No tags attached.

Relationships

duplicate of 0001006 (closed, joergs): Storage daemon segfaults in update_job_statistics when starting scheduled jobs

Activities

NTANMA (reporter), 2018-10-25 12:30
image001.png (36,345 bytes)

NTANMA (reporter), 2018-10-26 02:21
bareos-sd.trace (1,011,487 bytes)

NTANMA (reporter), 2018-11-28 01:32, note ~0003154
Please close the incident; it is a duplicate of 0001006.

Issue History

Date Modified Username Field Change
2018-10-25 12:30 NTANMA New Issue
2018-10-25 12:30 NTANMA File Added: image001.png
2018-10-26 02:21 NTANMA File Added: bareos-sd.trace
2018-11-28 01:32 NTANMA Note Added: 0003154
2019-07-15 10:33 arogge Relationship added duplicate of 0001006
2019-07-15 10:33 arogge Assigned To => arogge
2019-07-15 10:33 arogge Status new => resolved
2019-07-15 10:33 arogge Resolution open => duplicate
2019-07-15 10:33 arogge Status resolved => closed