View Issue Details

IDProjectCategoryView StatusLast Update
0000326bareos-corestorage daemonpublic2015-01-26 16:05
Reporterholtkamp Assigned To 
PrioritynormalSeveritytweakReproducibilityalways
Status closedResolutionfixed 
PlatformLinuxOSRHELOS Version6
Summary0000326: Enhance read device selection to respect media_type
DescriptionWhen looking for a device to mount a volume that is to be read bareos goes for the device the volume was last mounted in. If the original device is blocked but there are other devices that can be used those will not be selected and the job will fail.

-> I do not think this applies to autochangers, only file devices
Steps To Reproduce- Have a basic backup setup for a client with a next pool definition
- have a storage resource with multiple devices / same media type
- run some backups
- run a virtual full that is written to the next pool
- run some backups
- run another virtual full -> bareos creates a new volume to write to but refuses to select one of the available devices to read from (because original is blocked with the volume that is to be written).
Additional InformationI attached a patch with a suggestion on how this could be fixed.
Also attached are two debug-logs from stored, one in original state, one with the fix.

For each device it checks it also compares if the media_type would match and if it does just do the same as if the device name matches.

The function is only used at two points in the code and it should have no other implications.

Maybe there is a more elegant way to do this, i´m open for suggestions :)
TagsNo tags attached.

Activities

holtkamp

holtkamp

2014-08-06 18:28

reporter  

bareos-reserve-failure.txt (4,971 bytes)   
06-Aug-2014 16:47:04 backup.test-sd: mac.c:435-0 Start read data.
06-Aug-2014 16:47:04 backup.test-sd: mac.c:572-0 read_dcr=7f82ec016028 write_dcr=7f82ec026738
06-Aug-2014 16:47:04 backup.test-sd: mac.c:574-0 Found 3 volumes names for Virtual Backup. First=library.test.0895
06-Aug-2014 16:47:04 backup.test-sd: acquire.c:63-0 dcr=7f82ec016028 dev=7f82ec0016d8
06-Aug-2014 16:47:04 backup.test-sd: acquire.c:64-0 MediaType dcr=backup.test.filestorage dev=backup.test.filestorage
06-Aug-2014 16:47:04 backup.test-sd: acquire.c:92-0 Want Vol=library.test.0895 Slot=0
06-Aug-2014 16:47:04 backup.test-sd: acquire.c:106-0 MediaType dcr=library.test.filestorage dev=backup.test.filestorage
06-Aug-2014 16:47:04 backup.test-sd: acquire.c:117-0 Changing read device. Want Media Type="library.test.filestorage" have="backup.test.filestorage"
  device="backup.test.filestorage-1" (/var/backup/backup.test.filestorage/)
06-Aug-2014 16:47:04 backup.test-sd: acquire.c:473-0 release_device device "backup.test.filestorage-1" (/var/backup/backup.test.filestorage/) is disk
06-Aug-2014 16:47:04 backup.test-sd: reserve.c:134-0 Dec reserve=0 dev="backup.test.filestorage-1" (/var/backup/backup.test.filestorage/)
06-Aug-2014 16:47:04 backup.test-sd: acquire.c:484-0 dir_update_vol_info. label=0 Vol=
06-Aug-2014 16:47:04 backup.test-sd: acquire.c:537-0 0 writers, 0 reserve, dev="backup.test.filestorage-1" (/var/backup/backup.test.filestorage/)
06-Aug-2014 16:47:04 backup.test-sd: dev.c:883-0 close_dev "backup.test.filestorage-1" (/var/backup/backup.test.filestorage/)
06-Aug-2014 16:47:04 backup.test-sd: dev.c:890-0 device "backup.test.filestorage-1" (/var/backup/backup.test.filestorage/) already closed vol=
06-Aug-2014 16:47:04 backup.test-sd: vol_mgr.c:732-0 No vol on dev "backup.test.filestorage-1" (/var/backup/backup.test.filestorage/)
06-Aug-2014 16:47:04 backup.test-sd: acquire.c:595-0 JobId=4123 broadcast wait_device_release at 06-Aug-2014 16:47:04
06-Aug-2014 16:47:04 backup.test-sd: acquire.c:781-0 Detach Jid=4123 dcr=7f82ec016028 size=1 to dev="backup.test.filestorage-1" (/var/backup/backup.test.filestorage/)
06-Aug-2014 16:47:04 backup.test-sd: acquire.c:617-0 Device "backup.test.filestorage-1" (/var/backup/backup.test.filestorage/) released by JobId=4123
06-Aug-2014 16:47:04 backup.test-sd: reserve.c:557-0 search res for library.test.filestorage-1
06-Aug-2014 16:47:04 backup.test-sd: reserve.c:591-0 Try match res=backup.test.filestorage-1
06-Aug-2014 16:47:04 backup.test-sd: reserve.c:591-0 Try match res=backup.test.filestorage-2
06-Aug-2014 16:47:04 backup.test-sd: reserve.c:591-0 Try match res=backup.test.filestorage-3
06-Aug-2014 16:47:04 backup.test-sd: reserve.c:591-0 Try match res=backup.test.filestorage-4
06-Aug-2014 16:47:04 backup.test-sd: reserve.c:591-0 Try match res=backup.test.filestorage-5
06-Aug-2014 16:47:04 backup.test-sd: reserve.c:591-0 Try match res=backup.test.filestorage-6
06-Aug-2014 16:47:04 backup.test-sd: reserve.c:591-0 Try match res=backup.test.filestorage-7
06-Aug-2014 16:47:04 backup.test-sd: reserve.c:591-0 Try match res=backup.test.filestorage-8
06-Aug-2014 16:47:04 backup.test-sd: reserve.c:591-0 Try match res=backup.test.filestorage-9
06-Aug-2014 16:47:04 backup.test-sd: reserve.c:591-0 Try match res=backup.test.filestorage-10
06-Aug-2014 16:47:04 backup.test-sd: reserve.c:591-0 Try match res=backup.test.copy.filestorage-1
06-Aug-2014 16:47:04 backup.test-sd: reserve.c:591-0 Try match res=backup.test.copy.filestorage-2
06-Aug-2014 16:47:04 backup.test-sd: reserve.c:591-0 Try match res=library.test.filestorage-1
06-Aug-2014 16:47:04 backup.test-sd: reserve.c:628-0 chk MediaType device=library.test.filestorage request=library.test.filestorage
06-Aug-2014 16:47:04 backup.test-sd: reserve.c:651-0 try reserve library.test.filestorage-1
06-Aug-2014 16:47:04 backup.test-sd: block.c:127-0 created new block of blocksize 64512 (dev->device->label_block_size) as dev->max_block_size is zero
06-Aug-2014 16:47:04 backup.test-sd: acquire.c:760-0 Attach Jid=4123 dcr=7f82ec016028 size=1 dev="library.test.filestorage-1" (/var/backup/library.test.filestorage/)
06-Aug-2014 16:47:04 backup.test-sd: reserve.c:804-0 Device "library.test.filestorage-1" (/var/backup/library.test.filestorage/) is busy ST_READREADY=0 num_writers=0 reserved=1.
06-Aug-2014 16:47:04 backup.test-sd: reserve.c:770-0 Not OK.
06-Aug-2014 16:47:04 backup.test-sd: reserve.c:591-0 Try match res=library.test.filestorage-2
06-Aug-2014 16:47:04 backup.test-sd: reserve.c:591-0 Try match res=library.test.filestorage.alternative-1
06-Aug-2014 16:47:04 backup.test-sd: reserve.c:591-0 Try match res=library.test.filestorage.alternative-2
06-Aug-2014 16:47:04 backup.test-sd: reserve.c:591-0 Try match res=archive.test.filestorage-1
06-Aug-2014 16:47:04 backup.test-sd: reserve.c:591-0 Try match res=archive.test.filestorage-2
06-Aug-2014 16:47:04 backup.test-sd: acquire.c:170-0 No suitable device found to read Volume "library.test.0895"
bareos-reserve-failure.txt (4,971 bytes)   
holtkamp

holtkamp

2014-08-06 18:28

reporter  

bareos-reserve-fixed.txt (5,583 bytes)   
06-Aug-2014 16:53:13 backup.test-sd: mac.c:435-0 Start read data.
06-Aug-2014 16:53:13 backup.test-sd: mac.c:572-0 read_dcr=7f5de4004a08 write_dcr=7f5de4015118
06-Aug-2014 16:53:13 backup.test-sd: mac.c:574-0 Found 3 volumes names for Virtual Backup. First=library.test.0895
06-Aug-2014 16:53:13 backup.test-sd: acquire.c:63-0 dcr=7f5de4004a08 dev=7f5de80016c8
06-Aug-2014 16:53:13 backup.test-sd: acquire.c:64-0 MediaType dcr=backup.test.filestorage dev=backup.test.filestorage
06-Aug-2014 16:53:13 backup.test-sd: acquire.c:92-0 Want Vol=library.test.0895 Slot=0
06-Aug-2014 16:53:13 backup.test-sd: acquire.c:106-0 MediaType dcr=library.test.filestorage dev=backup.test.filestorage
06-Aug-2014 16:53:13 backup.test-sd: acquire.c:117-0 Changing read device. Want Media Type="library.test.filestorage" have="backup.test.filestorage"
  device="backup.test.filestorage-1" (/var/backup/backup.test.filestorage/)
06-Aug-2014 16:53:13 backup.test-sd: acquire.c:473-0 release_device device "backup.test.filestorage-1" (/var/backup/backup.test.filestorage/) is disk
06-Aug-2014 16:53:13 backup.test-sd: reserve.c:134-0 Dec reserve=0 dev="backup.test.filestorage-1" (/var/backup/backup.test.filestorage/)
06-Aug-2014 16:53:13 backup.test-sd: acquire.c:484-0 dir_update_vol_info. label=0 Vol=
06-Aug-2014 16:53:13 backup.test-sd: acquire.c:537-0 0 writers, 0 reserve, dev="backup.test.filestorage-1" (/var/backup/backup.test.filestorage/)
06-Aug-2014 16:53:13 backup.test-sd: dev.c:883-0 close_dev "backup.test.filestorage-1" (/var/backup/backup.test.filestorage/)
06-Aug-2014 16:53:13 backup.test-sd: dev.c:890-0 device "backup.test.filestorage-1" (/var/backup/backup.test.filestorage/) already closed vol=
06-Aug-2014 16:53:13 backup.test-sd: vol_mgr.c:732-0 No vol on dev "backup.test.filestorage-1" (/var/backup/backup.test.filestorage/)
06-Aug-2014 16:53:13 backup.test-sd: acquire.c:595-0 JobId=4124 broadcast wait_device_release at 06-Aug-2014 16:53:13
06-Aug-2014 16:53:13 backup.test-sd: acquire.c:781-0 Detach Jid=4124 dcr=7f5de4004a08 size=1 to dev="backup.test.filestorage-1" (/var/backup/backup.test.filestorage/)
06-Aug-2014 16:53:13 backup.test-sd: acquire.c:617-0 Device "backup.test.filestorage-1" (/var/backup/backup.test.filestorage/) released by JobId=4124
06-Aug-2014 16:53:13 backup.test-sd: reserve.c:557-0 search res for library.test.filestorage-1
06-Aug-2014 16:53:13 backup.test-sd: reserve.c:591-0 Try match res=backup.test.filestorage-1
06-Aug-2014 16:53:13 backup.test-sd: reserve.c:591-0 Try match res=backup.test.filestorage-2
06-Aug-2014 16:53:13 backup.test-sd: reserve.c:591-0 Try match res=backup.test.filestorage-3
06-Aug-2014 16:53:13 backup.test-sd: reserve.c:591-0 Try match res=backup.test.filestorage-4
06-Aug-2014 16:53:13 backup.test-sd: reserve.c:591-0 Try match res=backup.test.filestorage-5
06-Aug-2014 16:53:13 backup.test-sd: reserve.c:591-0 Try match res=backup.test.filestorage-6
06-Aug-2014 16:53:13 backup.test-sd: reserve.c:591-0 Try match res=backup.test.filestorage-7
06-Aug-2014 16:53:13 backup.test-sd: reserve.c:591-0 Try match res=backup.test.filestorage-8
06-Aug-2014 16:53:13 backup.test-sd: reserve.c:591-0 Try match res=backup.test.filestorage-9
06-Aug-2014 16:53:13 backup.test-sd: reserve.c:591-0 Try match res=backup.test.filestorage-10
06-Aug-2014 16:53:13 backup.test-sd: reserve.c:591-0 Try match res=backup.test.copy.filestorage-1
06-Aug-2014 16:53:13 backup.test-sd: reserve.c:591-0 Try match res=backup.test.copy.filestorage-2
06-Aug-2014 16:53:13 backup.test-sd: reserve.c:591-0 Try match res=library.test.filestorage-1
06-Aug-2014 16:53:13 backup.test-sd: reserve.c:643-0 chk MediaType device=library.test.filestorage request=library.test.filestorage
06-Aug-2014 16:53:13 backup.test-sd: reserve.c:666-0 try reserve library.test.filestorage-1
06-Aug-2014 16:53:13 backup.test-sd: block.c:127-0 created new block of blocksize 64512 (dev->device->label_block_size) as dev->max_block_size is zero
06-Aug-2014 16:53:13 backup.test-sd: acquire.c:760-0 Attach Jid=4124 dcr=7f5de4004a08 size=1 dev="library.test.filestorage-1" (/var/backup/library.test.filestorage/)
06-Aug-2014 16:53:13 backup.test-sd: reserve.c:819-0 Device "library.test.filestorage-1" (/var/backup/library.test.filestorage/) is busy ST_READREADY=0 num_writers=0 reserved=1.
06-Aug-2014 16:53:13 backup.test-sd: reserve.c:785-0 Not OK.
06-Aug-2014 16:53:13 backup.test-sd: reserve.c:591-0 Try match res=library.test.filestorage-2
06-Aug-2014 16:53:13 backup.test-sd: reserve.c:643-0 chk MediaType device=library.test.filestorage request=library.test.filestorage
06-Aug-2014 16:53:13 backup.test-sd: reserve.c:666-0 try reserve library.test.filestorage-2
06-Aug-2014 16:53:13 backup.test-sd: block.c:127-0 created new block of blocksize 64512 (dev->device->label_block_size) as dev->max_block_size is zero
06-Aug-2014 16:53:13 backup.test-sd: acquire.c:781-0 Detach Jid=4124 dcr=7f5de4004a08 size=2 to dev="library.test.filestorage-1" (/var/backup/library.test.filestorage/)
06-Aug-2014 16:53:13 backup.test-sd: acquire.c:760-0 Attach Jid=4124 dcr=7f5de4004a08 size=0 dev="library.test.filestorage-2" (/var/backup/library.test.filestorage/)
06-Aug-2014 16:53:13 backup.test-sd: reserve.c:125-0 Inc reserve=0 dev="library.test.filestorage-2" (/var/backup/library.test.filestorage/)
06-Aug-2014 16:53:13 backup.test-sd: reserve.c:763-0 Read reserved=1 dev_name=library.test.filestorage-1 mediatype=library.test.filestorage pool=db2.dev.riege.de ok=1
06-Aug-2014 16:53:13 backup.test-sd: reserve.c:619-0 Device library.test.filestorage-2 reserved=1 for read.
bareos-reserve-fixed.txt (5,583 bytes)   
holtkamp

holtkamp

2014-08-06 18:29

reporter   ~0000938

I forgot to mention the version numbers:

original: bareos-storage-14.3.0.git.1407164810-951.1.el6.x86_64

fixed: applied to a fresh git version
mvwieringen

mvwieringen

2014-08-06 20:22

developer   ~0000939

The patch looks a bit fishy could it be that you hand copied it over and
applied it with a offset of one line ? It now looks that the else if is
after the foreach loop which won't compile, if I move it one line up its
within the foreach loop.

I cannot really say this will not have any "side-effects" as the stored
kind of "assumes" a lot so just like your other patch we will have to see
what side effects its gives. I hope we find some time later this year to
do some major changes to the storage daemon. We just finished some serious
refactoring so that its already something that looks quite a bit better then
before but its kind of a minefield as to how some things work (or don't work
to be more precise.)
holtkamp

holtkamp

2014-08-06 21:19

reporter   ~0000940

That could be true, i generate the diffs on my machine and did all the testing on my test server. I might have messed up copy & paste :)

Will check tomorrow.
holtkamp

holtkamp

2014-08-07 12:24

reporter  

find-device-also-by-media_type-fix2.diff (1,110 bytes)   
diff --git a/src/stored/reserve.c b/src/stored/reserve.c
index 46b95e0..7ce6486 100644
--- a/src/stored/reserve.c
+++ b/src/stored/reserve.c
@@ -604,6 +604,21 @@ int search_res_for_device(RCTX &rctx)
                   rctx.device->hdr.name, rctx.jcr->read_dcr->dev->num_reserved());
             }
             return status;
+         } else if (bstrcmp(rctx.store->media_type, rctx.device->media_type)) {
+            Dmsg0(dbglvl, "Alternative matching by media_type works.\n");
+            status = reserve_device(rctx);
+            if (status != 1) {             /* try another device */
+               continue;
+            }
+            /* Debug code */
+            if (rctx.store->append == SD_APPEND) {
+               Dmsg2(dbglvl, "Device %s reserved=%d for append.\n",
+                  rctx.device->hdr.name, rctx.jcr->dcr->dev->num_reserved());
+            } else {
+               Dmsg2(dbglvl, "Device %s reserved=%d for read.\n",
+                  rctx.device->hdr.name, rctx.jcr->read_dcr->dev->num_reserved());
+            }
+            return status;
          }
       }
    }
holtkamp

holtkamp

2014-08-07 12:26

reporter   ~0000941

Yeah, i messed up. This time i copied the file that compiled so it should work.

That happens when you get called to a meeting in the middle of preparing the bug report...
mvwieringen

mvwieringen

2014-08-07 13:15

developer   ~0000943

The new patch looks like what I already thought it should be. I think the
original design never anticipated that you would have multiple file device
entries pointing to the same storage daemon. It then opts for using either
the virtual autochanger using the /dev/null changer path or tricks like the
diskchanger or vchanger. As I already said as the code assumes a lot this might
work but it might also have quite some side effects (which sometimes don't show
directly) I once tried implementing some LRU technique for the autochanger so
it doesn't always tries to use the first drive but that also had some strange
side effects. So its something I want to fix but until now it always has been
preempted with other work.
holtkamp

holtkamp

2014-08-07 13:41

reporter   ~0000944

It will all come together in my talk at the OSBC :) Looking forward to meeting you there.
holtkamp

holtkamp

2014-08-18 15:49

reporter   ~0000956

Fix committed to bareos master branch with changesetid 2013.

Related Changesets

bareos: master aed0a0c6

2014-08-08 21:29

holtkamp

Ported: N/A

Details Diff
Enhance read device selection to respect media_type

When looking for a device to mount a volume that is to be read bareos
goes for the device the volume was last mounted in. If the original
device is blocked but there are other devices that can be used those
will not be selected and the job will fail.

Fixes 0000326: Enhance read device selection to respect media_type

Signed-off-by: Marco van Wieringen <marco.van.wieringen@bareos.com>
Affected Issues
0000326
mod - src/stored/reserve.c Diff File

Issue History

Date Modified Username Field Change
2014-08-06 18:27 holtkamp New Issue
2014-08-06 18:27 holtkamp File Added: find-device-also-by-media_type.diff
2014-08-06 18:28 holtkamp File Added: bareos-reserve-failure.txt
2014-08-06 18:28 holtkamp File Added: bareos-reserve-fixed.txt
2014-08-06 18:29 holtkamp Note Added: 0000938
2014-08-06 20:22 mvwieringen Note Added: 0000939
2014-08-06 20:22 mvwieringen Status new => feedback
2014-08-06 21:19 holtkamp Note Added: 0000940
2014-08-06 21:19 holtkamp Status feedback => new
2014-08-07 12:24 holtkamp File Added: find-device-also-by-media_type-fix2.diff
2014-08-07 12:26 holtkamp Note Added: 0000941
2014-08-07 13:09 mvwieringen File Deleted: find-device-also-by-media_type.diff
2014-08-07 13:15 mvwieringen Note Added: 0000943
2014-08-07 13:41 holtkamp Note Added: 0000944
2014-08-13 09:42 mvwieringen Status new => acknowledged
2014-08-18 15:49 holtkamp Changeset attached => bareos master aed0a0c6
2014-08-18 15:49 holtkamp Note Added: 0000956
2014-08-18 15:49 holtkamp Assigned To => holtkamp
2014-08-18 15:49 holtkamp Status acknowledged => resolved
2014-08-18 15:49 holtkamp Resolution open => fixed
2015-01-26 16:05 mvwieringen adm Assigned To holtkamp =>
2015-01-26 16:05 mvwieringen adm Status resolved => closed