View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0000845 | bareos-core | storage daemon | public | 2017-08-17 16:20 | 2019-01-31 10:22 |
Reporter | frank | Assigned To | arogge_adm | ||
Priority | high | Severity | crash | Reproducibility | always |
Status | closed | Resolution | fixed | ||
Product Version | 16.2.6 | ||||
Summary | 0000845: NetApp OnCommand System Manager calls on SD Port 10000 lead to Segmentation Violation | ||||
Description | Calls from the NetApp OnCommand System Manager on SD port 10000 appear to crash the SD with a segmentation violation. This was detected via tcpdump, tracing, and logs while we were investigating frequent SD crashes. The tcpdump capture showed that the SD crashed each time the NetApp System Manager sent a CONNECT_OPEN, CONFIG_GET_SERVER_INFO, and CONNECT_CLOSE request via NDMP on SD port 10000.
Thread 4 (Thread 0x7f9343fff700 (LWP 9520)):
#0 0x00007f934a360489 in __libc_waitpid (pid=9521, stat_loc=0x7f9343ffdecc, options=0) at ../sysdeps/unix/sysv/linux/waitpid.c:40
#1 0x00007f934b468324 in signal_handler (sig=11) at signal.c:240
#2 <signal handler called>
#3 strlen () at ../sysdeps/x86_64/strlen.S:106
#4 0x00007f93491f197e in __GI___strdup (s=0xaaaaaaaaaaaaaaaa <error: Cannot access memory at address 0xaaaaaaaaaaaaaaaa>) at strdup.c:41
#5 0x00007f934bb24cee in convert_strdup (src=<optimized out>, dstp=dstp@entry=0x7f9343ffebb8) at ndmp_translate.c:154
#6 0x00007f934bb33ed9 in ndmp_9to4_config_get_server_info_reply (reply9=0x7f9343ffea10, reply4=0x7f9343ffebb0) at ndmp4_translate.c:809
#7 0x00007f934bb4334b in ndma_dispatch_request (sess=sess@entry=0x7f933c001078, arg_xa=arg_xa@entry=0x7f9343ffeae0, ref_conn=ref_conn@entry=0x7f933c0041a0) at ndma_comm_dispatch.c:240
#8 0x00007f934bb43463 in ndma_dispatch_conn (sess=sess@entry=0x7f933c001078, conn=0x7f933c0041a0) at ndma_comm_dispatch.c:306
#9 0x00007f934bb4447f in ndma_session_quantum (sess=0x7f933c001078, max_delay_secs=<optimized out>) at ndma_comm_session.c:407
#10 0x0000000000413b6d in handle_ndmp_client_request (arg=0x7f9338001108) at ndmp_tape.c:1231
#11 0x00007f934b471dd5 in workq_server (arg=arg@entry=0x625900 <ndmp_workq>) at workq.c:336
#12 0x00007f934b4590ef in lmgr_thread_launcher (x=0x7f9338001218) at lockmgr.c:926
#13 0x00007f934a359064 in start_thread (arg=0x7f9343fff700) at pthread_create.c:309
#14 0x00007f934925862d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111 | ||||
Steps To Reproduce | 1. Run NetApp OnCommand System Manager on another host in your network.
2. Run "tcpdump -i <interface-name> -w output.dump dst <ip-addr-bareos-sd> and tcp port 10000" on the SD host.
3. Use setdebug in bconsole to produce trace files. | ||||
Additional Information | After port 10000 was blocked for the host that runs the NetApp OnCommand System Manager, no further SD crashes occurred. | ||||
Tags | No tags attached. | ||||
I think we encountered the same problem in another environment: a RHEL 7.4 host with the Bareos 17.2.4 SD installed. The SD crashes frequently with signal 11.
Thread 7 (Thread 0x7efd5d7fa700 (LWP 4371)):
#0 0x00007efd9fb2c229 in waitpid () from /usr/lib64/libpthread.so.0
#1 0x00007efda0ca87e4 in signal_handler (sig=11) at signal.c:240
#2 <signal handler called>
#3 0x00007efd9e9a8a81 in __strlen_sse2 () from /usr/lib64/libc.so.6
#4 0x00007efd9e9a878e in strdup () from /usr/lib64/libc.so.6
#5 0x00007efda136d61e in convert_strdup (src=<optimized out>, dstp=dstp@entry=0x7efd5d7f9aa8) at ndmp_translate.c:154
#6 0x00007efda137d599 in ndmp_9to4_config_get_server_info_reply (reply9=0x7efd5d7f98f0, reply4=0x7efd5d7f9aa0) at ndmp4_translate.c:809
#7 0x00007efda138e38f in ndma_dispatch_request (sess=sess@entry=0x7efcfc0008e8, arg_xa=arg_xa@entry=0x7efd5d7f99d0, ref_conn=ref_conn@entry=0x7efcfc001120) at ndma_comm_dispatch.c:240
#8 0x00007efda138e514 in ndma_dispatch_conn (sess=sess@entry=0x7efcfc0008e8, conn=0x7efcfc001120) at ndma_comm_dispatch.c:306
#9 0x00007efda138f58f in ndma_session_quantum (sess=0x7efcfc0008e8, max_delay_secs=<optimized out>) at ndma_comm_session.c:411
#10 0x000000000041512d in handle_ndmp_client_request (arg=0x7efd80001108) at ndmp_tape.c:1233
#11 0x00007efda0cb2ce5 in workq_server (arg=arg@entry=0x628040 <ndmp_workq>) at workq.c:336
#12 0x00007efda0c97320 in lmgr_thread_launcher (x=0x7efd80001368) at lockmgr.c:928
#13 0x00007efd9fb24dd5 in start_thread () from /usr/lib64/libpthread.so.0
#14 0x00007efd9ea1ab3d in clone () from /usr/lib64/libc.so.6
Line 809 of ndmp4_translate.c suggests that vendor_name is not initialized properly:
CNVT_STRDUP_FROM_9x (reply4, reply9, vendor_name, config_info.vendor_name); |
|
Fix committed to bareos dev branch with changesetid 10740. | |
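Both backtraces end in strdup() being called on a pointer that cannot be dereferenced (0xaaaaaaaaaaaaaaaa in the first trace, which looks like a fill pattern for uninitialized memory). The following is a minimal sketch of that failure mode and one defensive way to avoid it. It is not the committed changeset and does not use the real NDMP types: `struct server_info_reply`, `dup_string`, `safe_convert_strdup`, `translate_reply`, and `free_reply` are hypothetical names introduced only for illustration. The sketch assumes the source reply is zero-initialized before its fields are filled in, so that a string the server never set is NULL rather than garbage.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a server-info reply; NOT the real NDMP struct. */
struct server_info_reply {
    int   error;
    char *vendor_name;
    char *product_name;
    char *revision_number;
};

/* Portable replacement for strdup(); returns NULL if malloc fails. */
static char *dup_string(const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}

/* Copy src into *dstp. A NULL source is treated as "field not set"
 * instead of being passed on to strlen(), which would crash.
 * Returns 0 on success, -1 if the allocation fails. */
static int safe_convert_strdup(const char *src, char **dstp)
{
    if (src == NULL) {
        *dstp = NULL;
        return 0;
    }
    *dstp = dup_string(src);
    return (*dstp != NULL) ? 0 : -1;
}

/* Translate one reply into another. The destination is cleared first so
 * that every pointer has a defined value even if translation stops early.
 * Assumption: `from` was zero-initialized by its producer. */
static int translate_reply(const struct server_info_reply *from,
                           struct server_info_reply *to)
{
    *to = (struct server_info_reply){0};
    to->error = from->error;
    if (safe_convert_strdup(from->vendor_name, &to->vendor_name) != 0)
        return -1;
    if (safe_convert_strdup(from->product_name, &to->product_name) != 0)
        return -1;
    if (safe_convert_strdup(from->revision_number, &to->revision_number) != 0)
        return -1;
    return 0;
}

/* Release the strings owned by a translated reply and reset it. */
static void free_reply(struct server_info_reply *r)
{
    free(r->vendor_name);
    free(r->product_name);
    free(r->revision_number);
    *r = (struct server_info_reply){0};
}
```

The NULL check only helps when unset fields really are NULL. If the source structure is left uninitialized, as the note on line 809 suspects, the pointer holds an arbitrary value and no check in the copy routine can detect it, so the initialization of the source reply is the part that matters.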
Date Modified | Username | Field | Change |
---|---|---|---|
2017-08-17 16:20 | frank | New Issue | |
2017-08-17 16:23 | frank | Status | new => confirmed |
2018-12-13 12:47 | frank | Note Added: 0003159 | |
2018-12-19 10:43 | pstorz | Changeset attached | => bareos dev 0fe91c0c |
2018-12-19 10:43 | pstorz | Note Added: 0003161 | |
2018-12-19 10:43 | pstorz | Status | confirmed => resolved |
2018-12-19 10:43 | pstorz | Resolution | open => fixed |
2019-01-31 10:08 | arogge_adm | Relationship added | child of 0001040 |
2019-01-31 10:22 | arogge_adm | Assigned To | => arogge_adm |
2019-01-31 10:22 | arogge_adm | Status | resolved => closed |