View Issue Details
| ID | Project | Category | View Status | Date Submitted | Last Update |
|---|---|---|---|---|---|
| 0000987 | bareos-core | director | public | 2018-07-18 10:42 | 2023-07-05 16:20 |
| Reporter | franku | Assigned To | bruno-at-bareos | ||
| Priority | normal | Severity | major | Reproducibility | random |
| Status | closed | Resolution | fixed | ||
| Platform | Linux | OS | Debian | OS Version | 9 |
| Product Version | 17.2.6 | ||||
| Summary | 0000987: Canceling a job leads to a director crash (TT4200333) | ||||
| Description | When a job is canceled using bconsole, the director can occasionally crash with a coredump. This is likely the result of a race condition in which a signal is sent to a job thread: the thread_id used is a member of the job's JobControlRecord object, whose memory may already have been freed in the meantime. | ||||
| Steps To Reproduce | This issue occurs very rarely. There is no known way to reproduce it reliably yet. | ||||
| Additional Information | Excerpt from the coredump: #1 0x00007f4107389c14 in signal_handler (sig=11) at signal.c:240 #2 <signal handler called> #3 __pthread_kill (threadid=139913685083904, signo=signo@entry=12) at ../sysdeps/unix/sysv/linux/pthread_kill.c:40 #4 0x00007f4107377434 in JCR::my_thread_send_signal (this=this@entry=0x558a5e446318, sig=sig@entry=12) at jcr.c:682 #5 0x0000558a5b0983ec in cancel_file_daemon_job (ua=ua@entry=0x7f3f0c00ed28, jcr=jcr@entry=0x558a5e446318) at fd_cmds.c:1080 | ||||
| Tags | No tags attached. | ||||
| Current solution: refactor the function that frees JobControlRecord (JCR) memory so that it locks the JCR mutex consistently. | |
|
This one affects me as well. We have a lot of copy and migrate jobs. Canceling them crashes the director about 50% of the time. If I can provide anything to help get this fixed, please let me know. Regards, Dennis |
|
|
We have not seen this crash for a long time with recent code (Bareos 21, 22), so closing. |
|
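
The race described in this issue (a cancel path signaling a job thread through a record that another thread may be freeing) can be illustrated with a small sketch. This is not Bareos code: `JobRecord`, `JobRegistry`, and all of their members are hypothetical names, and a plain counter stands in for the real `pthread_kill()` call. The sketch only shows the general idea that the code that frees a record and the code that signals through it take the same mutex, so a late cancel finds no record instead of touching freed memory.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <mutex>

// Hypothetical stand-in for a job control record (not the Bareos class).
struct JobRecord {
  explicit JobRecord(std::uint32_t id) : job_id(id) {}
  std::uint32_t job_id;
  int signals_received = 0;  // stands in for pthread_kill() on the job thread
};

// Hypothetical registry that owns all live job records.
class JobRegistry {
 public:
  void Add(std::uint32_t job_id) {
    std::lock_guard<std::mutex> guard(mutex_);
    jobs_[job_id] = std::make_unique<JobRecord>(job_id);
  }

  // Frees the record. Takes the same mutex as Signal(), so a record is never
  // destroyed while a signal is being delivered through it.
  bool Remove(std::uint32_t job_id) {
    std::lock_guard<std::mutex> guard(mutex_);
    return jobs_.erase(job_id) == 1;
  }

  // Looks up the record and "signals" it under one lock. Returns false if the
  // job has already been freed.
  bool Signal(std::uint32_t job_id) {
    std::lock_guard<std::mutex> guard(mutex_);
    auto it = jobs_.find(job_id);
    if (it == jobs_.end()) return false;
    ++it->second->signals_received;
    return true;
  }

  // Returns the number of signals delivered, or -1 if the job is gone.
  int SignalCount(std::uint32_t job_id) {
    std::lock_guard<std::mutex> guard(mutex_);
    auto it = jobs_.find(job_id);
    return it == jobs_.end() ? -1 : it->second->signals_received;
  }

 private:
  std::mutex mutex_;
  std::map<std::uint32_t, std::unique_ptr<JobRecord>> jobs_;
};

// Usage: a cancel that arrives after the job was freed is rejected cleanly.
inline bool CancelAfterFreeIsRejected() {
  JobRegistry registry;
  registry.Add(42);
  assert(registry.Signal(42));   // job is alive: signal delivered
  assert(registry.Remove(42));   // job finishes and its record is freed
  return !registry.Signal(42);   // late cancel: no record, no signal
}
```

The design choice here is to pass around a job ID rather than a raw pointer, so the lookup and the signal happen inside one critical section. Whether the actual fix in later Bareos versions works this way is not stated in this issue.
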
| Date Modified | Username | Field | Change |
|---|---|---|---|
| 2018-07-18 10:42 | franku | New Issue | |
| 2018-07-18 10:42 | franku | Status | new => assigned |
| 2018-07-18 10:42 | franku | Assigned To | => franku |
| 2018-07-18 15:13 | franku | Note Added: 0003074 | |
| 2018-07-18 15:13 | franku | Description Updated | |
| 2018-07-18 15:13 | franku | Steps to Reproduce Updated | |
| 2019-01-23 11:30 | stephand | Relationship added | child of 0000984 |
| 2019-07-15 15:51 | franku | Status | assigned => new |
| 2019-07-15 15:52 | franku | Assigned To | franku => |
| 2019-09-15 16:17 | therm | Note Added: 0003572 | |
| 2023-07-05 16:20 | bruno-at-bareos | Assigned To | => bruno-at-bareos |
| 2023-07-05 16:20 | bruno-at-bareos | Status | new => closed |
| 2023-07-05 16:20 | bruno-at-bareos | Resolution | open => fixed |
| 2023-07-05 16:20 | bruno-at-bareos | Note Added: 0005149 |