View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0001073 | bareos-core | director | public | 2019-04-05 12:29 | 2019-12-18 15:24 |
Reporter | bratkartoffel | Assigned To | arogge | ||
Priority | normal | Severity | major | Reproducibility | always |
Status | closed | Resolution | fixed | ||
Platform | Linux | OS | Alpine | OS Version | 3.8+ |
Product Version | 17.2.7 | ||||
Fixed in Version | 19.2.1 | ||||
Summary | 0001073: bsmtp and bareos-dir crashing with segfaults due to undefined behaviour in pthread usage | ||||
Description | Using Alpine 3.8+ (with musl libc) bsmtp crashes when invoked and bareos-dir crashes when running a job or getting a connection with bconsole. Bug at alpine: https://bugs.alpinelinux.org/issues/10156 Problem may still be there in latest bareos release. Proposed patches are attached. I'm not that deep into c programming and I'm absolutely not sure if my checks are correct. Please adjust as needed. Thanks, Simon | ||||
Steps To Reproduce | Simple reproduction of bsmtp crash: --- $> docker run -it alpine:3.9 /bin/ash / # apk add bareos [...] / # bsmtp Segmentation fault (core dumped) -- | ||||
Additional Information | bsmtp: There should be a thread key set called jcr_key, but its initial value is undefined. This is passed into tss_get which is a C11 wrapper for pthread_getspecific. http://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_getspecific.html -> The effect of calling pthread_getspecific() or pthread_setspecific() with a key value not obtained from pthread_key_create() or after key has been deleted with pthread_key_delete() is undefined. --- 0x00007ffff7fbe2c3 in tss_get () from /lib/ld-musl-x86_64.so.1 (gdb) bt #0 0x00007ffff7fbe2c3 in tss_get () from /lib/ld-musl-x86_64.so.1 0000001 0x00007ffff7f2d94f in get_jobid_from_tsd () at jcr.c:739 0000002 0x00007ffff7f34279 in p_msg (file=0x555555558006 "bsmtp.c", line=364, level=0, fmt=0x555555558568 "Fatal error: no recipient given.\n") at message.c:1343 0000003 0x00005555555564d5 in main (argc=0, argv=<optimized out>) at bsmtp.c:364 --- bareos-dir (when bconsole connects): The issue is that the code tries to do a pthread_detach() on an already detached thread. This leads to undefined behaviour and a crash. http://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_detach.html -> The behavior is undefined if the value specified by the thread argument to pthread_detach() does not refer to a joinable thread. --- (gdb) bt #0 __pthread_timedjoin_np (t=0x7f9c93e91b10, res=res@entry=0x0, at=at@entry=0x0) at src/thread/pthread_join.c:11 0000001 0x00007f9c95899383 in __pthread_join (t=<optimized out>, res=res@entry=0x0) at src/thread/pthread_join.c:24 0000002 0x00007f9c9589918c in __pthread_detach (t=<optimized out>) at src/thread/pthread_detach.c:9 0000003 __pthread_detach (t=<optimized out>) at src/thread/pthread_detach.c:4 0000004 0x00005559ea34ebd7 in handle_UA_client_request (user=user@entry=0x5559ea99d448) at ua_server.c:76 0000005 0x00005559ea329325 in handle_connection_request (arg=0x5559ea99d448) at socket_server.c:98 0000006 0x00007f9c957d5bc6 in workq_server (arg=arg@entry=0x5559ea38eae0 <socket_workq>) at workq.c:336 0000007 0x00007f9c957c0116 in lmgr_thread_launcher (x=0x5559ea99d588) at lockmgr.c:928 0000008 0x00007f9c95898c4c in start (p=<optimized out>) at src/thread/pthread_create.c:144 0000009 0x00007f9c9589acf2 in __clone () at src/thread/x86_64/clone.s:22 Backtrace stopped: frame did not save the PC --- bareos-dir (when job starts): The issue is that the code tries to do a pthread_detach() on an already detached thread. This leads to undefined behaviour and a crash. http://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_detach.html -> The behavior is undefined if the value specified by the thread argument to pthread_detach() does not refer to a joinable thread. --- (gdb) bt #0 __pthread_timedjoin_np (t=0x7f7daf1ffb10, res=res@entry=0x0, at=at@entry=0x0) at src/thread/pthread_join.c:11 0000001 0x00007f7db0c2a383 in __pthread_join (t=<optimized out>, res=res@entry=0x0) at src/thread/pthread_join.c:24 0000002 0x00007f7db0c2a18c in __pthread_detach (t=<optimized out>) at src/thread/pthread_detach.c:9 0000003 __pthread_detach (t=<optimized out>) at src/thread/pthread_detach.c:4 0000004 0x000055e353dce058 in job_thread (arg=0x55e35419f3e8) at job.c:423 0000005 0x000055e353dd2f2b in jobq_server (arg=arg@entry=0x55e353e3f880 <job_queue>) at jobq.c:485 0000006 0x00007f7db0b51116 in lmgr_thread_launcher (x=0x55e3541a1248) at lockmgr.c:928 0000007 0x00007f7db0c29c4c in start (p=<optimized out>) at src/thread/pthread_create.c:144 0000008 0x00007f7db0c2bcf2 in __clone () at src/thread/x86_64/clone.s:22 Backtrace stopped: frame did not save the PC --- | ||||
Tags | No tags attached. | ||||
fix-bsmtp-segfault.patch (1,449 bytes)
Patch from Michael Cassaniti, posted here: https://bugs.alpinelinux.org/issues/10156#note-5 Until this issue is fixed upstream this patch is needed. diff --git a/src/lib/jcr.c b/src/lib/jcr.c index 00bfe8c87..338f90e59 100644 --- a/src/lib/jcr.c +++ b/src/lib/jcr.c @@ -77,6 +77,7 @@ static pthread_mutex_t jcr_lock = PTHREAD_MUTEX_INITIALIZER; static pthread_mutex_t job_start_mutex = PTHREAD_MUTEX_INITIALIZER; static pthread_mutex_t last_jobs_mutex = PTHREAD_MUTEX_INITIALIZER; +static bool jcr_initialized = false; #ifdef HAVE_WIN32 static bool tsd_initialized = false; static pthread_key_t jcr_key; /* Pointer to jcr for each thread */ @@ -351,6 +352,8 @@ static void create_jcr_key() berrno be; Jmsg1(NULL, M_ABORT, 0, _("pthread key create failed: ERR=%s\n"), be.bstrerror(status)); + } else { + jcr_initialized = true; } } @@ -719,7 +722,10 @@ void set_jcr_in_tsd(JCR *jcr) */ JCR *get_jcr_from_tsd() { - JCR *jcr = (JCR *)pthread_getspecific(jcr_key); + JCR *jcr = (JCR *)INVALID_JCR; + if (jcr_initialized){ + jcr = (JCR *)pthread_getspecific(jcr_key); + } /* * Set any INVALID_JCR to NULL which the rest of BAREOS understands @@ -736,7 +742,7 @@ JCR *get_jcr_from_tsd() */ uint32_t get_jobid_from_tsd() { - JCR *jcr = (JCR *)pthread_getspecific(jcr_key); + JCR *jcr = get_jcr_from_tsd(); uint32_t JobId = 0; if (jcr && jcr != INVALID_JCR) { pthread-double-detach.patch (1,238 bytes)
This patch fixes a double pthread_detach(), which is undefined behaviour and leads to a segfault when running with musl libc. Until this issue is fixed upstream this patch is needed. --- a/src/dird/ua_server.c +++ b/src/dird/ua_server.c @@ -73,7 +73,15 @@ UAContext *ua; JCR *jcr; - pthread_detach(pthread_self()); + /* only detach if not yet detached */ + int _detachstate; + pthread_attr_t _gattr; + pthread_getattr_np(pthread_self(), &_gattr); + pthread_attr_getdetachstate(&_gattr, &_detachstate); + pthread_attr_destroy(&_gattr); + if(_detachstate != PTHREAD_CREATE_DETACHED) { + pthread_detach(pthread_self()); + } jcr = new_control_jcr("-Console-", JT_CONSOLE); --- a/src/dird/job.c +++ b/src/dird/job.c @@ -420,7 +420,16 @@ { JCR *jcr = (JCR *)arg; - pthread_detach(pthread_self()); + /* only detach if not yet detached */ + int _detachstate; + pthread_attr_t _gattr; + pthread_getattr_np(pthread_self(), &_gattr); + pthread_attr_getdetachstate(&_gattr, &_detachstate); + pthread_attr_destroy(&_gattr); + if(_detachstate != PTHREAD_CREATE_DETACHED) { + pthread_detach(pthread_self()); + } + Dsm_check(100); Dmsg0(200, "=====Start Job=========\n"); |
|
First of all: thank you very much for putting this amount of effort into this. To get your patches into Bareos easily, you should clone the git repository, apply your changes against master and then either create a github pull-request (this is the preferred way) or git format-patch the commits and send them to the bareos-devel mailing-list. see also: https://docs.bareos.org/DeveloperGuide/generaldevel.html#contributions The director-patch will need work. It is the exact same code twice, so it should be moved into its own function. |
|
Thank you for taking a look at the patches. Yes, I'll create a PR at github, I just wanted to make sure the patches / approach is correct. | |
PR open: https://github.com/bareos/bareos/pull/169 | |
Can you please take a look at my branch (as I already suggested in your PR). I guess we can merge the change, but I cannot test on Alpine. | |
Thanks for you change, I've seen the PR at github. I'll compile and install it on saturday and test it next week. As far as I see you haven't changed that much, so I expect it to work without problems. |
|
The new PR is https://github.com/bareos/bareos/pull/220 Most changes concern CMake. |
|
The first batch of backups ran without a problem yesterday, I'll play with bconsole the next days and give you feedback here. But as far as I see It looks very good |
|
bareos: master ad273196 2019-04-18 17:30 Simon Frankenberger Committer: arogge Ported: N/A Details Diff |
dird: Fix pthread issues Fixes two double pthread_detach() calls and one pthread_getspecific() without initialization, all leading to a SEGFAULT with musl libc and undefined behaviour on other libc implementations. This closes Bug 0001073 |
Affected Issues 0001073 |
|
mod - core/src/dird/dird.cc | Diff File | ||
mod - core/src/dird/dird.h | Diff File | ||
mod - core/src/dird/job.cc | Diff File | ||
mod - core/src/dird/ua_server.cc | Diff File | ||
mod - core/src/lib/jcr.cc | Diff File |
Date Modified | Username | Field | Change |
---|---|---|---|
2019-04-05 12:29 | bratkartoffel | New Issue | |
2019-04-05 12:29 | bratkartoffel | File Added: fix-bsmtp-segfault.patch | |
2019-04-05 12:29 | bratkartoffel | File Added: pthread-double-detach.patch | |
2019-04-11 09:29 | arogge | Assigned To | => arogge |
2019-04-11 09:29 | arogge | Status | new => assigned |
2019-04-11 09:42 | arogge | Status | assigned => feedback |
2019-04-11 09:42 | arogge | Note Added: 0003329 | |
2019-04-11 18:18 | bratkartoffel | Note Added: 0003333 | |
2019-04-11 18:18 | bratkartoffel | Status | feedback => assigned |
2019-04-18 15:39 | bratkartoffel | Note Added: 0003342 | |
2019-07-10 17:46 | arogge | Status | assigned => feedback |
2019-07-10 17:46 | arogge | Note Added: 0003436 | |
2019-07-11 10:36 | bratkartoffel | Note Added: 0003439 | |
2019-07-11 10:36 | bratkartoffel | Status | feedback => assigned |
2019-07-11 14:08 | arogge | Note Added: 0003441 | |
2019-07-15 09:27 | arogge | Assigned To | arogge => |
2019-07-15 09:27 | arogge | Status | assigned => feedback |
2019-07-15 09:35 | bratkartoffel | Note Added: 0003461 | |
2019-07-15 09:35 | bratkartoffel | Status | feedback => new |
2019-07-16 10:22 | arogge | Changeset attached | => bareos master ad273196 |
2019-08-01 10:14 | arogge | Assigned To | => arogge |
2019-08-01 10:14 | arogge | Status | new => resolved |
2019-08-01 10:14 | arogge | Resolution | open => fixed |
2019-08-01 10:14 | arogge | Fixed in Version | => 19.2.1 |
2019-12-18 15:24 | arogge | Status | resolved => closed |