View Issue Details

ID: 0000845
Project: bareos-core
Category: [All Projects] storage daemon
View Status: public
Last Update: 2019-01-31 10:22
Reporter: frank
Assigned To: arogge_adm
Priority: high
Severity: crash
Reproducibility: always
Status: closed
Resolution: fixed
Product Version: 16.2.6
Target Version:
Fixed in Version:
Summary: 0000845: NetApp OnCommand System Manager calls on SD Port 10000 lead to Segmentation Violation
Description: Calls from the NetApp OnCommand System Manager to SD port 10000
appear to crash the SD with a segmentation violation.

This was detected via tcpdump, tracing, and logs while we were investigating an SD crash that
happened frequently.

The tcpdump showed that each time the NetApp System Manager sent a
CONNECT_OPEN, CONFIG_GET_SERVER_INFO and CONNECT_CLOSE request via NDMP
on SD port 10000, the SD crashed.


Thread 4 (Thread 0x7f9343fff700 (LWP 9520)):
#0 0x00007f934a360489 in __libc_waitpid (pid=9521, stat_loc=0x7f9343ffdecc, options=0) at ../sysdeps/unix/sysv/linux/waitpid.c:40
#1 0x00007f934b468324 in signal_handler (sig=11) at signal.c:240
#2 <signal handler called>
#3 strlen () at ../sysdeps/x86_64/strlen.S:106
#4 0x00007f93491f197e in __GI___strdup (s=0xaaaaaaaaaaaaaaaa <error: Cannot access memory at address 0xaaaaaaaaaaaaaaaa>) at strdup.c:41
#5 0x00007f934bb24cee in convert_strdup (src=<optimized out>, dstp=dstp@entry=0x7f9343ffebb8) at ndmp_translate.c:154
#6 0x00007f934bb33ed9 in ndmp_9to4_config_get_server_info_reply (reply9=0x7f9343ffea10, reply4=0x7f9343ffebb0) at ndmp4_translate.c:809
#7 0x00007f934bb4334b in ndma_dispatch_request (sess=sess@entry=0x7f933c001078, arg_xa=arg_xa@entry=0x7f9343ffeae0, ref_conn=ref_conn@entry=0x7f933c0041a0) at ndma_comm_dispatch.c:240
#8 0x00007f934bb43463 in ndma_dispatch_conn (sess=sess@entry=0x7f933c001078, conn=0x7f933c0041a0) at ndma_comm_dispatch.c:306
#9 0x00007f934bb4447f in ndma_session_quantum (sess=0x7f933c001078, max_delay_secs=<optimized out>) at ndma_comm_session.c:407
#10 0x0000000000413b6d in handle_ndmp_client_request (arg=0x7f9338001108) at ndmp_tape.c:1231
#11 0x00007f934b471dd5 in workq_server (arg=arg@entry=0x625900 <ndmp_workq>) at workq.c:336
#12 0x00007f934b4590ef in lmgr_thread_launcher (x=0x7f9338001218) at lockmgr.c:926
#13 0x00007f934a359064 in start_thread (arg=0x7f9343fff700) at pthread_create.c:309
#14 0x00007f934925862d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Steps To Reproduce:
1. Run NetApp OnCommand System Manager on another host in your network.

2. Run "tcpdump -i <interface-name> -w output.dump dst <ip-addr-bareos-sd> and tcp port 10000" on the SD host.

3. Use setdebug in bconsole to produce trace files.
Additional Information: Blocking port 10000 for the host that the NetApp OnCommand System Manager runs on led to no further SD crashes.
Tags: No tags attached.

Relationships

child of 0001040 (resolved, pstorz): Release bareos-18.2.5

Activities

frank

2018-12-13 12:47

manager   ~0003159

I think we encountered the same problem in another environment on a RHEL 7.4 host,
Bareos 17.2.4 SD installed.

The SD is crashing frequently with SIG 11.

Thread 7 (Thread 0x7efd5d7fa700 (LWP 4371)):
#0 0x00007efd9fb2c229 in waitpid () from /usr/lib64/libpthread.so.0
#1 0x00007efda0ca87e4 in signal_handler (sig=11) at signal.c:240
#2 <signal handler called>
#3 0x00007efd9e9a8a81 in __strlen_sse2 () from /usr/lib64/libc.so.6
#4 0x00007efd9e9a878e in strdup () from /usr/lib64/libc.so.6
#5 0x00007efda136d61e in convert_strdup (src=<optimized out>, dstp=dstp@entry=0x7efd5d7f9aa8) at ndmp_translate.c:154
#6 0x00007efda137d599 in ndmp_9to4_config_get_server_info_reply (reply9=0x7efd5d7f98f0, reply4=0x7efd5d7f9aa0) at ndmp4_translate.c:809
#7 0x00007efda138e38f in ndma_dispatch_request (sess=sess@entry=0x7efcfc0008e8, arg_xa=arg_xa@entry=0x7efd5d7f99d0, ref_conn=ref_conn@entry=0x7efcfc001120) at ndma_comm_dispatch.c:240
#8 0x00007efda138e514 in ndma_dispatch_conn (sess=sess@entry=0x7efcfc0008e8, conn=0x7efcfc001120) at ndma_comm_dispatch.c:306
#9 0x00007efda138f58f in ndma_session_quantum (sess=0x7efcfc0008e8, max_delay_secs=<optimized out>) at ndma_comm_session.c:411
#10 0x000000000041512d in handle_ndmp_client_request (arg=0x7efd80001108) at ndmp_tape.c:1233
#11 0x00007efda0cb2ce5 in workq_server (arg=arg@entry=0x628040 <ndmp_workq>) at workq.c:336
#12 0x00007efda0c97320 in lmgr_thread_launcher (x=0x7efd80001368) at lockmgr.c:928
#13 0x00007efd9fb24dd5 in start_thread () from /usr/lib64/libpthread.so.0
#14 0x00007efd9ea1ab3d in clone () from /usr/lib64/libc.so.6

Judging from line 809 of ndmp4_translate.c, vendor_name does not appear to be initialized properly before it is copied:

   CNVT_STRDUP_FROM_9x (reply4, reply9,
         vendor_name, config_info.vendor_name);
pstorz

2018-12-19 10:43

administrator   ~0003161

Fix committed to bareos dev branch with changesetid 10740.

Related Changesets

bareos: dev 0fe91c0c

2018-12-19 09:48:02

pstorz

Ported: N/A

NDMP: fix sd crash when ndmp info is queried

Fixes 0000845: NetApp OnCommand System Manager calls on SD Port 10000 lead to Segmentation Violation
Affected Issues
0000845
mod - core/src/ndmp/ndmos_common.c

Issue History

Date Modified Username Field Change
2017-08-17 16:20 frank New Issue
2017-08-17 16:23 frank Status new => confirmed
2018-12-13 12:47 frank Note Added: 0003159
2018-12-19 10:43 pstorz Changeset attached => bareos dev 0fe91c0c
2018-12-19 10:43 pstorz Note Added: 0003161
2018-12-19 10:43 pstorz Status confirmed => resolved
2018-12-19 10:43 pstorz Resolution open => fixed
2019-01-31 10:08 arogge_adm Relationship added child of 0001040
2019-01-31 10:22 arogge_adm Assigned To => arogge_adm
2019-01-31 10:22 arogge_adm Status resolved => closed