View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0000987 | bareos-core | director | public | 2018-07-18 10:42 | 2023-07-05 16:20 |
Reporter | franku | Assigned To | bruno-at-bareos | ||
Priority | normal | Severity | major | Reproducibility | random |
Status | closed | Resolution | fixed | ||
Platform | Linux | OS | Debian | OS Version | 9 |
Product Version | 17.2.6 | ||||
Summary | 0000987: Canceling a job leads to a director crash (TT4200333) | ||||
Description | When a job is canceled from bconsole, the director occasionally crashes with a core dump. The likely cause is a race condition while a signal is being sent to the job thread: the thread_id used for the signal is a member of the JobControlRecord class, whose memory may already have been freed by that point. | ||||
Steps To Reproduce | This issue occurs very rarely. There is no known way to reproduce it reliably yet. | ||||
Additional Information | Excerpt from the coredump: #1 0x00007f4107389c14 in signal_handler (sig=11) at signal.c:240; #2 <signal handler called>; #3 __pthread_kill (threadid=139913685083904, signo=signo@entry=12) at ../sysdeps/unix/sysv/linux/pthread_kill.c:40; #4 0x00007f4107377434 in JCR::my_thread_send_signal (this=this@entry=0x558a5e446318, sig=sig@entry=12) at jcr.c:682; #5 0x0000558a5b0983ec in cancel_file_daemon_job (ua=ua@entry=0x7f3f0c00ed28, jcr=jcr@entry=0x558a5e446318) at fd_cmds.c:1080 | ||||
Tags | No tags attached. | ||||
Current solution: Refactor the function that frees JobControlRecord (JCR) memory in order to lock the JCR mutex consecutively. | |
This issue affects me as well. We run a lot of copy and migrate jobs, and canceling them crashes the director about 50% of the time. If I can provide anything to help get this fixed, please let me know. Regards, Dennis |
|
We have not seen this crash for a long time with recent code (Bareos 21, 22), so closing. |
|
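
The race described in this issue, and the mutex-based approach mentioned in the first note, can be sketched roughly as follows. This is not the Bareos code: `JobRecord`, `AttachCurrentThread`, `DetachThread`, `SendSignal`, and `CancelJob` are hypothetical names used only for illustration. The sketch assumes two things: the thread id is read and cleared only while the record's mutex is held, and the canceling side holds a `std::shared_ptr` so the record cannot be freed while the signal is being sent.

```cpp
#include <pthread.h>
#include <signal.h>

#include <cassert>
#include <memory>
#include <mutex>

// Hypothetical stand-in for a job control record (NOT the Bareos JCR class).
class JobRecord {
 public:
  // Called by the job thread when it starts working on this record.
  void AttachCurrentThread() {
    std::lock_guard<std::mutex> guard(mutex_);
    thread_id_ = pthread_self();
    has_thread_ = true;
  }

  // Called by the job thread before it exits, so that no later
  // SendSignal() call can use a stale thread id.
  void DetachThread() {
    std::lock_guard<std::mutex> guard(mutex_);
    has_thread_ = false;
  }

  // Sends `sig` to the attached thread. Returns false if no thread is
  // attached or if pthread_kill() reports an error.
  bool SendSignal(int sig) {
    std::lock_guard<std::mutex> guard(mutex_);
    if (!has_thread_) return false;
    return pthread_kill(thread_id_, sig) == 0;
  }

 private:
  std::mutex mutex_;
  pthread_t thread_id_{};
  bool has_thread_ = false;
};

// Hypothetical cancel path: taking the shared_ptr by value keeps the
// record alive for the whole call, even if another owner drops its
// reference concurrently.
inline bool CancelJob(std::shared_ptr<JobRecord> job, int sig) {
  return job && job->SendSignal(sig);
}
```

With this layout, a cancel request that arrives after the job thread has detached simply returns `false` instead of calling `pthread_kill` on a thread id that is no longer valid. Signal `0` can be used to exercise the path without delivering a signal, since `pthread_kill` then only performs its error checking.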
Date Modified | Username | Field | Change |
---|---|---|---|
2018-07-18 10:42 | franku | New Issue | |
2018-07-18 10:42 | franku | Status | new => assigned |
2018-07-18 10:42 | franku | Assigned To | => franku |
2018-07-18 15:13 | franku | Note Added: 0003074 | |
2018-07-18 15:13 | franku | Description Updated | |
2018-07-18 15:13 | franku | Steps to Reproduce Updated | |
2019-01-23 11:30 | stephand | Relationship added | child of 0000984 |
2019-07-15 15:51 | franku | Status | assigned => new |
2019-07-15 15:52 | franku | Assigned To | franku => |
2019-09-15 16:17 | therm | Note Added: 0003572 | |
2023-07-05 16:20 | bruno-at-bareos | Assigned To | => bruno-at-bareos |
2023-07-05 16:20 | bruno-at-bareos | Status | new => closed |
2023-07-05 16:20 | bruno-at-bareos | Resolution | open => fixed |
2023-07-05 16:20 | bruno-at-bareos | Note Added: 0005149 |