View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0000528 | bareos-core | director | public | 2015-10-01 09:38 | 2019-12-18 15:25 |
Reporter | jkhradil | Assigned To | pstorz | ||
Priority | normal | Severity | minor | Reproducibility | always |
Status | closed | Resolution | fixed | ||
Summary | 0000528: Migration job hangs waiting for max Storage jobs | ||||
Description | Since version 15.2, migration (copy) jobs do not work when run from a schedule. The control job doesn't get the next pool setting applied and hangs waiting for max Storage jobs. This doesn't happen when the job is run manually, because in that case the next pool is set in the reset_restore_context function (ua_run.c). A patch fixing this issue is attached. | ||||
Steps To Reproduce | 1) Define a migration or copy job as per the documentation 2) Set this job to run from a schedule and wait for the specified time 3) See the job hang waiting for max Storage jobs | ||||
Tags | No tags attached. | ||||
0001-Fix-migration-control-job-hanging-due-to-next-pool-n.patch (858 bytes)
From 3bf26b78b49992c15c9b06891c1068eb686ae783 Mon Sep 17 00:00:00 2001
From: Jakub Hradil <jkhradil@gmail.com>
Date: Thu, 1 Oct 2015 09:12:38 +0200
Subject: [PATCH] Fix migration control job hanging due to next pool not being set when run from schedule

---
 src/dird/migrate.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/src/dird/migrate.c b/src/dird/migrate.c
index a8b9d1c..cdbdf31 100644
--- a/src/dird/migrate.c
+++ b/src/dird/migrate.c
@@ -1223,6 +1223,11 @@ bool do_migration_init(JCR *jcr)
        * one to the writing SD.
        */
       jcr->remote_replicate = !is_same_storage_daemon(jcr->res.rstore, jcr->res.wstore);
+   } else {
+      /*
+       * Set next pool even for control job, otherwise it will hang waiting for max Storage jobs
+       */
+      set_migration_next_pool(jcr, &pool);
    }
 
    return true;
-- 
2.4.3
|
Hello, we have been using 15.2.1 for quite some time, and we also use copy jobs, which have always worked. Reproducing the problem in a regression test was not successful either. Can you please specify exactly how to reproduce the problem? Thank you very much. |
|
My configuration files for the director and storage daemon are in the attached file bareos.zip. This configuration worked with version 14.2.4 on CentOS 7; after upgrading to version 15.2.1, the copy job hangs waiting for max Storage jobs. The same happens using a build from master on Fedora. Steps to reproduce: 1) create an empty bareos database 2) use the attached config files 3) schedule Full and Copy jobs (DIR_Bareos_Schedule_Bareos_Backup, DIR_Bareos_Schedule_Bareos_Copy) 4) see the full backup finish successfully 5) see the copy job hang; the director's status shows: "job" is waiting for max Storage jobs. Running this scenario under a debugger, I see it go through the do_migration_init(JCR *jcr) function only once, to init the control job. It jumps over the if (jcr->MigrateJobId != 0) block (this block is new since version 14.2) and never enters do_migration_init(JCR *jcr) a second time to init the copy job. If I run the copy job manually, do_migration_init(JCR *jcr) is executed twice and the job runs successfully. The difference I can see is that in a scheduled job rstorage and wstorage are the same, whereas in a manually run job they differ, since wstorage is set to the right value in the reset_restore_context function (ua_run.c). |
|
Hello, looks like you forgot to upload the bareos.zip file. Is that right? |
|
Yeah, sorry about that. I selected the file, but forgot to click the Upload File button. Now it's uploaded. | |
Fixing your problem is very easy: when the director reports "waiting for max Storage jobs", you only have to increase the maximum concurrent jobs on your storage to something more than one. I added "Maximum Concurrent Jobs = 10" to each of the Storages in your storage.conf, and everything works without any other change. However, we will have a look at your patch anyway. Thanks, best regards, Philipp |
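To illustrate why raising the limit helps, here is a minimal sketch in C. It is a simplified model, not the Bareos director code: `storage_t`, `inc_store`, `dec_store` and `acquire_storages` are hypothetical names, and it assumes that a job must take one concurrency slot on its read storage and one on its write storage. Under that assumption, a job whose read and write storage are the same resource can never start with a limit of 1.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a Storage resource; not the Bareos struct. */
typedef struct {
   int max_concurrent_jobs; /* models "Maximum Concurrent Jobs" */
   int num_concurrent_jobs; /* slots currently held */
} storage_t;

/* Take one slot on a storage, or fail if the limit is reached. */
bool inc_store(storage_t *st)
{
   if (st->num_concurrent_jobs >= st->max_concurrent_jobs) {
      return false;
   }
   st->num_concurrent_jobs++;
   return true;
}

/* Give one slot back. */
void dec_store(storage_t *st)
{
   assert(st->num_concurrent_jobs > 0);
   st->num_concurrent_jobs--;
}

/*
 * Assumption: a job needs one slot on its read storage and one on its
 * write storage. Returns false when the job would have to wait.
 */
bool acquire_storages(storage_t *rstore, storage_t *wstore)
{
   if (!inc_store(rstore)) {
      return false;
   }
   if (!inc_store(wstore)) {
      dec_store(rstore); /* roll back so the job can retry later */
      return false;
   }
   return true;
}
```

In this model, calling `acquire_storages(&st, &st)` on a storage with a limit of 1 always fails, because the read side already holds the only slot; with a limit of 10 the same call succeeds and holds two slots.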
|
Fix committed to bareos bareos-15.2 branch with changesetid 5893. | |
bareos: bareos-15.2 4cc7481f 2015-11-17 16:27 Committer: mvwieringen Ported: N/A |
migration control jobs don't count for concurrency. Migration control jobs do not touch the storage in any way, so they do not need to be counted when checking the maximum concurrent jobs for storages. Also did a cleanup of the code and comments along the way. Fixes 0000528: Migration job hangs waiting for max Storage jobs. Signed-off-by: Marco van Wieringen <marco.van.wieringen@bareos.com> |
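The idea in this commit message can be sketched as follows. This is an illustration only, not the actual change to src/dird/jobq.c or src/include/jcr.h: the `job_t` type, its `is_control_job` flag and the function names are hypothetical, and the model assumes that ordinary jobs take one slot on each of their read and write storages.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a Storage resource. */
typedef struct {
   int max_concurrent_jobs;
   int num_concurrent_jobs;
} storage_t;

/* Hypothetical stand-in for the job control record. */
typedef struct {
   storage_t *rstore;
   storage_t *wstore;
   bool is_control_job; /* assumed flag: job only spawns the real workers */
} job_t;

/* Take one slot on a storage, or fail if the limit is reached. */
bool inc_store(storage_t *st)
{
   if (st->num_concurrent_jobs >= st->max_concurrent_jobs) {
      return false;
   }
   st->num_concurrent_jobs++;
   return true;
}

/* Returns false when the job has to wait for a storage slot. */
bool acquire_storages(job_t *job)
{
   if (job->is_control_job) {
      return true; /* touches no storage, so it is not counted */
   }
   if (!inc_store(job->rstore)) {
      return false;
   }
   if (!inc_store(job->wstore)) {
      job->rstore->num_concurrent_jobs--; /* roll back the read slot */
      return false;
   }
   return true;
}
```

With this exemption, a control job whose read and write storage point at the same resource starts even when the limit is 1, while the worker jobs it spawns are still counted as before.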
Affected Issues 0000528 |
|
mod - src/dird/jobq.c |
mod - src/include/jcr.h |
Date Modified | Username | Field | Change |
---|---|---|---|
2015-10-01 09:38 | jkhradil | New Issue | |
2015-10-01 09:38 | jkhradil | File Added: 0001-Fix-migration-control-job-hanging-due-to-next-pool-n.patch | |
2015-10-01 13:10 | pstorz | Note Added: 0001856 | |
2015-10-01 13:10 | pstorz | Assigned To | => pstorz |
2015-10-01 13:10 | pstorz | Status | new => feedback |
2015-10-01 15:29 | jkhradil | Note Added: 0001857 | |
2015-10-01 15:29 | jkhradil | Status | feedback => assigned |
2015-10-02 10:14 | pstorz | Note Added: 0001862 | |
2015-10-02 11:38 | jkhradil | File Added: bareos.zip | |
2015-10-02 11:39 | jkhradil | Note Added: 0001863 | |
2015-10-02 12:19 | pstorz | Note Added: 0001864 | |
2015-11-19 14:49 | mvwieringen | Changeset attached | => bareos bareos-15.2 4cc7481f |
2015-11-19 14:49 | mvwieringen | Note Added: 0002004 | |
2015-11-19 14:49 | mvwieringen | Status | assigned => resolved |
2015-11-19 14:49 | mvwieringen | Resolution | open => fixed |
2019-12-18 15:25 | arogge | Status | resolved => closed |