View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0000326 | bareos-core | storage daemon | public | 2014-08-06 18:27 | 2015-01-26 16:05 |
Reporter | holtkamp | Assigned To | |||
Priority | normal | Severity | tweak | Reproducibility | always |
Status | closed | Resolution | fixed | ||
Platform | Linux | OS | RHEL | OS Version | 6 |
Summary | 0000326: Enhance read device selection to respect media_type | ||||
Description | When looking for a device to mount a volume that is to be read bareos goes for the device the volume was last mounted in. If the original device is blocked but there are other devices that can be used those will not be selected and the job will fail. -> I do not think this applies to autochangers, only file devices | ||||
Steps To Reproduce | - Have a basic backup setup for a client with a next pool definition - have a storage resource with multiple devices / same media type - run some backups - run a virtual full that is written to the next pool - run some backups - run another virtual full -> bareos creates a new volume to write to but refuses to select one of the available devices to read from (because original is blocked with the volume that is to be written). | ||||
Additional Information | I attached a patch with a suggestion on how this could be fixed. Also attached are two debug-logs from stored, one in original state, one with the fix. For each device it checks it also compares if the media_type would match and if it does just do the same as if the device name matches. The function is only used at two points in the code and it should have no other implications. Maybe there is a more elegant way to do this, i´m open for suggestions :) | ||||
Tags | No tags attached. | ||||
bareos-reserve-failure.txt (4,971 bytes)
06-Aug-2014 16:47:04 backup.test-sd: mac.c:435-0 Start read data. 06-Aug-2014 16:47:04 backup.test-sd: mac.c:572-0 read_dcr=7f82ec016028 write_dcr=7f82ec026738 06-Aug-2014 16:47:04 backup.test-sd: mac.c:574-0 Found 3 volumes names for Virtual Backup. First=library.test.0895 06-Aug-2014 16:47:04 backup.test-sd: acquire.c:63-0 dcr=7f82ec016028 dev=7f82ec0016d8 06-Aug-2014 16:47:04 backup.test-sd: acquire.c:64-0 MediaType dcr=backup.test.filestorage dev=backup.test.filestorage 06-Aug-2014 16:47:04 backup.test-sd: acquire.c:92-0 Want Vol=library.test.0895 Slot=0 06-Aug-2014 16:47:04 backup.test-sd: acquire.c:106-0 MediaType dcr=library.test.filestorage dev=backup.test.filestorage 06-Aug-2014 16:47:04 backup.test-sd: acquire.c:117-0 Changing read device. Want Media Type="library.test.filestorage" have="backup.test.filestorage" device="backup.test.filestorage-1" (/var/backup/backup.test.filestorage/) 06-Aug-2014 16:47:04 backup.test-sd: acquire.c:473-0 release_device device "backup.test.filestorage-1" (/var/backup/backup.test.filestorage/) is disk 06-Aug-2014 16:47:04 backup.test-sd: reserve.c:134-0 Dec reserve=0 dev="backup.test.filestorage-1" (/var/backup/backup.test.filestorage/) 06-Aug-2014 16:47:04 backup.test-sd: acquire.c:484-0 dir_update_vol_info. label=0 Vol= 06-Aug-2014 16:47:04 backup.test-sd: acquire.c:537-0 0 writers, 0 reserve, dev="backup.test.filestorage-1" (/var/backup/backup.test.filestorage/) 06-Aug-2014 16:47:04 backup.test-sd: dev.c:883-0 close_dev "backup.test.filestorage-1" (/var/backup/backup.test.filestorage/) 06-Aug-2014 16:47:04 backup.test-sd: dev.c:890-0 device "backup.test.filestorage-1" (/var/backup/backup.test.filestorage/) already closed vol= 06-Aug-2014 16:47:04 backup.test-sd: vol_mgr.c:732-0 No vol on dev "backup.test.filestorage-1" (/var/backup/backup.test.filestorage/) 06-Aug-2014 16:47:04 backup.test-sd: acquire.c:595-0 JobId=4123 broadcast wait_device_release at 06-Aug-2014 16:47:04 06-Aug-2014 16:47:04 backup.test-sd: acquire.c:781-0 Detach Jid=4123 dcr=7f82ec016028 size=1 to dev="backup.test.filestorage-1" (/var/backup/backup.test.filestorage/) 06-Aug-2014 16:47:04 backup.test-sd: acquire.c:617-0 Device "backup.test.filestorage-1" (/var/backup/backup.test.filestorage/) released by JobId=4123 06-Aug-2014 16:47:04 backup.test-sd: reserve.c:557-0 search res for library.test.filestorage-1 06-Aug-2014 16:47:04 backup.test-sd: reserve.c:591-0 Try match res=backup.test.filestorage-1 06-Aug-2014 16:47:04 backup.test-sd: reserve.c:591-0 Try match res=backup.test.filestorage-2 06-Aug-2014 16:47:04 backup.test-sd: reserve.c:591-0 Try match res=backup.test.filestorage-3 06-Aug-2014 16:47:04 backup.test-sd: reserve.c:591-0 Try match res=backup.test.filestorage-4 06-Aug-2014 16:47:04 backup.test-sd: reserve.c:591-0 Try match res=backup.test.filestorage-5 06-Aug-2014 16:47:04 backup.test-sd: reserve.c:591-0 Try match res=backup.test.filestorage-6 06-Aug-2014 16:47:04 backup.test-sd: reserve.c:591-0 Try match res=backup.test.filestorage-7 06-Aug-2014 16:47:04 backup.test-sd: reserve.c:591-0 Try match res=backup.test.filestorage-8 06-Aug-2014 16:47:04 backup.test-sd: reserve.c:591-0 Try match res=backup.test.filestorage-9 06-Aug-2014 16:47:04 backup.test-sd: reserve.c:591-0 Try match res=backup.test.filestorage-10 06-Aug-2014 16:47:04 backup.test-sd: reserve.c:591-0 Try match res=backup.test.copy.filestorage-1 06-Aug-2014 16:47:04 backup.test-sd: reserve.c:591-0 Try match res=backup.test.copy.filestorage-2 06-Aug-2014 16:47:04 backup.test-sd: reserve.c:591-0 Try match res=library.test.filestorage-1 06-Aug-2014 16:47:04 backup.test-sd: reserve.c:628-0 chk MediaType device=library.test.filestorage request=library.test.filestorage 06-Aug-2014 16:47:04 backup.test-sd: reserve.c:651-0 try reserve library.test.filestorage-1 06-Aug-2014 16:47:04 backup.test-sd: block.c:127-0 created new block of blocksize 64512 (dev->device->label_block_size) as dev->max_block_size is zero 06-Aug-2014 16:47:04 backup.test-sd: acquire.c:760-0 Attach Jid=4123 dcr=7f82ec016028 size=1 dev="library.test.filestorage-1" (/var/backup/library.test.filestorage/) 06-Aug-2014 16:47:04 backup.test-sd: reserve.c:804-0 Device "library.test.filestorage-1" (/var/backup/library.test.filestorage/) is busy ST_READREADY=0 num_writers=0 reserved=1. 06-Aug-2014 16:47:04 backup.test-sd: reserve.c:770-0 Not OK. 06-Aug-2014 16:47:04 backup.test-sd: reserve.c:591-0 Try match res=library.test.filestorage-2 06-Aug-2014 16:47:04 backup.test-sd: reserve.c:591-0 Try match res=library.test.filestorage.alternative-1 06-Aug-2014 16:47:04 backup.test-sd: reserve.c:591-0 Try match res=library.test.filestorage.alternative-2 06-Aug-2014 16:47:04 backup.test-sd: reserve.c:591-0 Try match res=archive.test.filestorage-1 06-Aug-2014 16:47:04 backup.test-sd: reserve.c:591-0 Try match res=archive.test.filestorage-2 06-Aug-2014 16:47:04 backup.test-sd: acquire.c:170-0 No suitable device found to read Volume "library.test.0895" |
|
bareos-reserve-fixed.txt (5,583 bytes)
06-Aug-2014 16:53:13 backup.test-sd: mac.c:435-0 Start read data. 06-Aug-2014 16:53:13 backup.test-sd: mac.c:572-0 read_dcr=7f5de4004a08 write_dcr=7f5de4015118 06-Aug-2014 16:53:13 backup.test-sd: mac.c:574-0 Found 3 volumes names for Virtual Backup. First=library.test.0895 06-Aug-2014 16:53:13 backup.test-sd: acquire.c:63-0 dcr=7f5de4004a08 dev=7f5de80016c8 06-Aug-2014 16:53:13 backup.test-sd: acquire.c:64-0 MediaType dcr=backup.test.filestorage dev=backup.test.filestorage 06-Aug-2014 16:53:13 backup.test-sd: acquire.c:92-0 Want Vol=library.test.0895 Slot=0 06-Aug-2014 16:53:13 backup.test-sd: acquire.c:106-0 MediaType dcr=library.test.filestorage dev=backup.test.filestorage 06-Aug-2014 16:53:13 backup.test-sd: acquire.c:117-0 Changing read device. Want Media Type="library.test.filestorage" have="backup.test.filestorage" device="backup.test.filestorage-1" (/var/backup/backup.test.filestorage/) 06-Aug-2014 16:53:13 backup.test-sd: acquire.c:473-0 release_device device "backup.test.filestorage-1" (/var/backup/backup.test.filestorage/) is disk 06-Aug-2014 16:53:13 backup.test-sd: reserve.c:134-0 Dec reserve=0 dev="backup.test.filestorage-1" (/var/backup/backup.test.filestorage/) 06-Aug-2014 16:53:13 backup.test-sd: acquire.c:484-0 dir_update_vol_info. label=0 Vol= 06-Aug-2014 16:53:13 backup.test-sd: acquire.c:537-0 0 writers, 0 reserve, dev="backup.test.filestorage-1" (/var/backup/backup.test.filestorage/) 06-Aug-2014 16:53:13 backup.test-sd: dev.c:883-0 close_dev "backup.test.filestorage-1" (/var/backup/backup.test.filestorage/) 06-Aug-2014 16:53:13 backup.test-sd: dev.c:890-0 device "backup.test.filestorage-1" (/var/backup/backup.test.filestorage/) already closed vol= 06-Aug-2014 16:53:13 backup.test-sd: vol_mgr.c:732-0 No vol on dev "backup.test.filestorage-1" (/var/backup/backup.test.filestorage/) 06-Aug-2014 16:53:13 backup.test-sd: acquire.c:595-0 JobId=4124 broadcast wait_device_release at 06-Aug-2014 16:53:13 06-Aug-2014 16:53:13 backup.test-sd: acquire.c:781-0 Detach Jid=4124 dcr=7f5de4004a08 size=1 to dev="backup.test.filestorage-1" (/var/backup/backup.test.filestorage/) 06-Aug-2014 16:53:13 backup.test-sd: acquire.c:617-0 Device "backup.test.filestorage-1" (/var/backup/backup.test.filestorage/) released by JobId=4124 06-Aug-2014 16:53:13 backup.test-sd: reserve.c:557-0 search res for library.test.filestorage-1 06-Aug-2014 16:53:13 backup.test-sd: reserve.c:591-0 Try match res=backup.test.filestorage-1 06-Aug-2014 16:53:13 backup.test-sd: reserve.c:591-0 Try match res=backup.test.filestorage-2 06-Aug-2014 16:53:13 backup.test-sd: reserve.c:591-0 Try match res=backup.test.filestorage-3 06-Aug-2014 16:53:13 backup.test-sd: reserve.c:591-0 Try match res=backup.test.filestorage-4 06-Aug-2014 16:53:13 backup.test-sd: reserve.c:591-0 Try match res=backup.test.filestorage-5 06-Aug-2014 16:53:13 backup.test-sd: reserve.c:591-0 Try match res=backup.test.filestorage-6 06-Aug-2014 16:53:13 backup.test-sd: reserve.c:591-0 Try match res=backup.test.filestorage-7 06-Aug-2014 16:53:13 backup.test-sd: reserve.c:591-0 Try match res=backup.test.filestorage-8 06-Aug-2014 16:53:13 backup.test-sd: reserve.c:591-0 Try match res=backup.test.filestorage-9 06-Aug-2014 16:53:13 backup.test-sd: reserve.c:591-0 Try match res=backup.test.filestorage-10 06-Aug-2014 16:53:13 backup.test-sd: reserve.c:591-0 Try match res=backup.test.copy.filestorage-1 06-Aug-2014 16:53:13 backup.test-sd: reserve.c:591-0 Try match res=backup.test.copy.filestorage-2 06-Aug-2014 16:53:13 backup.test-sd: reserve.c:591-0 Try match res=library.test.filestorage-1 06-Aug-2014 16:53:13 backup.test-sd: reserve.c:643-0 chk MediaType device=library.test.filestorage request=library.test.filestorage 06-Aug-2014 16:53:13 backup.test-sd: reserve.c:666-0 try reserve library.test.filestorage-1 06-Aug-2014 16:53:13 backup.test-sd: block.c:127-0 created new block of blocksize 64512 (dev->device->label_block_size) as dev->max_block_size is zero 06-Aug-2014 16:53:13 backup.test-sd: acquire.c:760-0 Attach Jid=4124 dcr=7f5de4004a08 size=1 dev="library.test.filestorage-1" (/var/backup/library.test.filestorage/) 06-Aug-2014 16:53:13 backup.test-sd: reserve.c:819-0 Device "library.test.filestorage-1" (/var/backup/library.test.filestorage/) is busy ST_READREADY=0 num_writers=0 reserved=1. 06-Aug-2014 16:53:13 backup.test-sd: reserve.c:785-0 Not OK. 06-Aug-2014 16:53:13 backup.test-sd: reserve.c:591-0 Try match res=library.test.filestorage-2 06-Aug-2014 16:53:13 backup.test-sd: reserve.c:643-0 chk MediaType device=library.test.filestorage request=library.test.filestorage 06-Aug-2014 16:53:13 backup.test-sd: reserve.c:666-0 try reserve library.test.filestorage-2 06-Aug-2014 16:53:13 backup.test-sd: block.c:127-0 created new block of blocksize 64512 (dev->device->label_block_size) as dev->max_block_size is zero 06-Aug-2014 16:53:13 backup.test-sd: acquire.c:781-0 Detach Jid=4124 dcr=7f5de4004a08 size=2 to dev="library.test.filestorage-1" (/var/backup/library.test.filestorage/) 06-Aug-2014 16:53:13 backup.test-sd: acquire.c:760-0 Attach Jid=4124 dcr=7f5de4004a08 size=0 dev="library.test.filestorage-2" (/var/backup/library.test.filestorage/) 06-Aug-2014 16:53:13 backup.test-sd: reserve.c:125-0 Inc reserve=0 dev="library.test.filestorage-2" (/var/backup/library.test.filestorage/) 06-Aug-2014 16:53:13 backup.test-sd: reserve.c:763-0 Read reserved=1 dev_name=library.test.filestorage-1 mediatype=library.test.filestorage pool=db2.dev.riege.de ok=1 06-Aug-2014 16:53:13 backup.test-sd: reserve.c:619-0 Device library.test.filestorage-2 reserved=1 for read. |
|
I forgot to mention the version numbers: original: bareos-storage-14.3.0.git.1407164810-951.1.el6.x86_64 fixed: applied to a fresh git version |
|
The patch looks a bit fishy could it be that you hand copied it over and applied it with a offset of one line ? It now looks that the else if is after the foreach loop which won't compile, if I move it one line up its within the foreach loop. I cannot really say this will not have any "side-effects" as the stored kind of "assumes" a lot so just like your other patch we will have to see what side effects its gives. I hope we find some time later this year to do some major changes to the storage daemon. We just finished some serious refactoring so that its already something that looks quite a bit better then before but its kind of a minefield as to how some things work (or don't work to be more precise.) |
|
That could be true, i generate the diffs on my machine and did all the testing on my test server. I might have messed up copy & paste :) Will check tomorrow. |
|
find-device-also-by-media_type-fix2.diff (1,110 bytes)
diff --git a/src/stored/reserve.c b/src/stored/reserve.c index 46b95e0..7ce6486 100644 --- a/src/stored/reserve.c +++ b/src/stored/reserve.c @@ -604,6 +604,21 @@ int search_res_for_device(RCTX &rctx) rctx.device->hdr.name, rctx.jcr->read_dcr->dev->num_reserved()); } return status; + } else if (bstrcmp(rctx.store->media_type, rctx.device->media_type)) { + Dmsg0(dbglvl, "Alternative matching by media_type works.\n"); + status = reserve_device(rctx); + if (status != 1) { /* try another device */ + continue; + } + /* Debug code */ + if (rctx.store->append == SD_APPEND) { + Dmsg2(dbglvl, "Device %s reserved=%d for append.\n", + rctx.device->hdr.name, rctx.jcr->dcr->dev->num_reserved()); + } else { + Dmsg2(dbglvl, "Device %s reserved=%d for read.\n", + rctx.device->hdr.name, rctx.jcr->read_dcr->dev->num_reserved()); + } + return status; } } } |
|
Yeah, i messed up. This time i copied the file that compiled so it should work. That happens when you get called to a meeting in the middle of preparing the bug report... |
|
The new patch looks like what I already thought it should be. I think the original design never anticipated that you would have multiple file device entries pointing to the same storage daemon. It then opts for using either the virtual autochanger using the /dev/null changer path or tricks like the diskchanger or vchanger. As I already said as the code assumes a lot this might work but it might also have quite some side effects (which sometimes don't show directly) I once tried implementing some LRU technique for the autochanger so it doesn't always tries to use the first drive but that also had some strange side effects. So its something I want to fix but until now it always has been preempted with other work. |
|
It will all come together in my talk at the OSBC :) Looking forward to meeting you there. | |
Fix committed to bareos master branch with changesetid 2013. | |
bareos: master aed0a0c6 2014-08-08 21:29 Ported: N/A Details Diff |
Enhance read device selection to respect media_type When looking for a device to mount a volume that is to be read bareos goes for the device the volume was last mounted in. If the original device is blocked but there are other devices that can be used those will not be selected and the job will fail. Fixes 0000326: Enhance read device selection to respect media_type Signed-off-by: Marco van Wieringen <marco.van.wieringen@bareos.com> |
Affected Issues 0000326 |
|
mod - src/stored/reserve.c | Diff File |
Date Modified | Username | Field | Change |
---|---|---|---|
2014-08-06 18:27 | holtkamp | New Issue | |
2014-08-06 18:27 | holtkamp | File Added: find-device-also-by-media_type.diff | |
2014-08-06 18:28 | holtkamp | File Added: bareos-reserve-failure.txt | |
2014-08-06 18:28 | holtkamp | File Added: bareos-reserve-fixed.txt | |
2014-08-06 18:29 | holtkamp | Note Added: 0000938 | |
2014-08-06 20:22 | mvwieringen | Note Added: 0000939 | |
2014-08-06 20:22 | mvwieringen | Status | new => feedback |
2014-08-06 21:19 | holtkamp | Note Added: 0000940 | |
2014-08-06 21:19 | holtkamp | Status | feedback => new |
2014-08-07 12:24 | holtkamp | File Added: find-device-also-by-media_type-fix2.diff | |
2014-08-07 12:26 | holtkamp | Note Added: 0000941 | |
2014-08-07 13:09 | mvwieringen | File Deleted: find-device-also-by-media_type.diff | |
2014-08-07 13:15 | mvwieringen | Note Added: 0000943 | |
2014-08-07 13:41 | holtkamp | Note Added: 0000944 | |
2014-08-13 09:42 | mvwieringen | Status | new => acknowledged |
2014-08-18 15:49 | holtkamp | Changeset attached | => bareos master aed0a0c6 |
2014-08-18 15:49 | holtkamp | Note Added: 0000956 | |
2014-08-18 15:49 | holtkamp | Assigned To | => holtkamp |
2014-08-18 15:49 | holtkamp | Status | acknowledged => resolved |
2014-08-18 15:49 | holtkamp | Resolution | open => fixed |
2015-01-26 16:05 |
|
Assigned To | holtkamp => |
2015-01-26 16:05 |
|
Status | resolved => closed |