View Issue Details

IDProjectCategoryView StatusLast Update
0000435bareos-corestorage daemonpublic2015-07-10 14:09
ReporterrOger Assigned To 
PrioritynormalSeveritymajorReproducibilityalways
Status closedResolutionno change required 
PlatformLinuxOSUbuntuOS Version12.04
Product Version14.2.2 
Summary0000435: bareos-sd with scsicrypto plugin and LTO6 drive will not reset drive properly after encryption key was used
DescriptionWe are using a HP Ultrium-6 drive with LTO5 tapes and scsicrypto plugin enabled. After a successful backup job on a volume A another job on volume B will fail with Read/Write Error. The best work around we found is to power cycle the drive. After that the first backup run works like expected - the second one will fail again.
Steps To ReproduceBasically it is easy; every time the backup starts the access to the drive will fail with a read/write error, see example logs of bareos in "Additional Information". Also if you trigger the backup job manually it will fail with the same error message. If you first do a power cycle of the drive (and of course with the same tape as before) the job will run without issues.

As a remark: we want to have more than one backup on a volume (tape) so we defined the corresponding pool like:

Pool {
  Name = WeeklyData
  Pool Type = Backup
  AutoPrune = Yes
  VolumeRetention = 2 weeks
  Maximum Volumes = 5
  Maximum Volume Jobs = 3
  Recycle = Yes
  Storage = TapeLTO5
  Label Format = "WeeklyData-LTO5-"
}
Additional InformationEnvironment:
* Ubuntu 12.04, latest patches, no HWE (means still kernel 3.2.0-76-generic #111-Ubuntu SMP)
* Tape Drive
** HP StoreEver Ultrium-6 6250 External SAS (HP ProductID: EH970A)
** Firmware 353D/Standalone (latest one)
* The tests mentioned in the bareos manual where all successful

Log lines of bareos:

01-Mar 23:30 fs0-sd JobId 45: Recycled volume "WeeklyData-LTO5-0001" on device "TapeStorageLTO5" (/dev/nst0), all previous data lost.
01-Mar 23:30 fs0-sd JobId 45: Error: block.c:631 Write error at 0:1 on device "TapeStorageLTO5" (/dev/nst0). ERR=Device or resource busy.
01-Mar 23:30 fs0-sd JobId 45: Error: block.c:645 Write error on fd=6 at file:blk 0:1 on device "TapeStorageLTO5" (/dev/nst0). ERR=Device or resource busy.
01-Mar 23:31 fs0-sd JobId 45: Error: block.c:1004 Read error on fd=6 at file:blk 0:4294967295 on device "TapeStorageLTO5" (/dev/nst0). ERR=Device or resource busy.
01-Mar 23:31 fs0-sd JobId 45: Error: Re-read last block at EOT failed. ERR=block.c:1004 Read error on fd=6 at file:blk 0:4294967295 on device "TapeStorageLTO5" (/dev/nst0).
01-Mar 23:31 fs0-sd JobId 45: End of medium on Volume "WeeklyData-LTO5-0001" Bytes=64,512 Blocks=0 at 01-Mar-2015 23:31.
01-Mar 23:31 fs0-sd JobId 45: Please mount append Volume "WeeklyData-LTO5-0003" or label a new one for:

More information or configuration snippets are available upon request.
TagsNo tags attached.

Activities

mvwieringen

mvwieringen

2015-03-10 16:29

developer   ~0001305

Think for starters we need a debug output level 200 of the storage daemon
to see what the plugin is doing and what not and why.

e.g.

bareos-sd -f -d 200 > logfile 2>&1 run as user bareos via su - bareos -s /bin/bash

And probably the exact config of the tape device in the bareos-sd.conf
rOger

rOger

2015-03-16 10:10

reporter  

bareos-sd.conf (2,841 bytes)
rOger

rOger

2015-03-16 10:10

reporter  

rOger

rOger

2015-03-16 10:11

reporter   ~0001317

As requested I attached the desired files (for security reasons we removed keys).
mvwieringen

mvwieringen

2015-03-16 18:08

developer   ~0001318

Ok log shows that the plugin thinks it doesn't need
to clear the encryption key because no encryption is enabled
on the drive. Given you have the following in your config:

Query Crypto Status = yes;

It means that the plugin queries the actual encryption status
of the drive in determining ic crypto is enabled.

You could change Query Crypto Status = yes; to Query Crypto Status = no;
then it doesn't query the drive but in our testing that always worked ok.

I would be interested in the output of status storage=<storage_name> when
the volume is loaded as that prints the encryption status of the drive
and uses the same low level SCSI page so we can see what the drive
encryption status is.
rOger

rOger

2015-03-17 08:20

reporter  

20150317-bareos-status-storage.txt (2,917 bytes)   
*status storage=TapeLTO5
Connecting to Storage daemon TapeLTO5 at fs0-sd.terreactive.ch:9103

fs0-sd Version: 14.2.2 (12 December 2014) x86_64-pc-linux-gnu ubuntu Ubuntu 12.04 LTS
Daemon started 11-Mar-15 11:47. Jobs: run=9, running=0.
 Heap: heap=270,336 smbytes=197,003 max_bytes=2,291,483 bufs=155 max_bufs=185
 Sizes: boffset_t=8 size_t=8 int32_t=4 int64_t=8 mode=0 bwlimit=0kB/s
Plugin Info:
 Plugin     : scsicrypto-sd.so
 Description: SCSI Encryption Storage Daemon Plugin
 Version    : 1, Date: March 2012
 Author     : Marco van Wieringen
 License    : Bareos AGPLv3
 Usage      : (No usage yet)



Running Jobs:
No Jobs running.
====

Jobs waiting to reserve a drive:
====

Terminated Jobs:
 JobId  Level    Files      Bytes   Status   Finished        Name 
===================================================================
    62  Incr        391    121.8 M  OK       10-Mar-15 23:16 DataBackup
    63  Incr        367    1.257 G  OK       11-Mar-15 23:16 DataBackup
    64  Incr        463    63.22 M  OK       12-Mar-15 23:16 DataBackup
    65  Incr        324    46.41 M  OK       13-Mar-15 23:16 DataBackup
    66  Incr      4,958    1.709 G  OK       14-Mar-15 23:16 DataBackup
    67  Incr        238    18.92 M  OK       15-Mar-15 23:16 DataBackup
    68  Full         17    1.019 M  Cancel   16-Mar-15 08:21 DataBackup
    69  Full         17    1.019 M  Cancel   16-Mar-15 08:45 DataBackup
    70  Full         17    1.019 M  Cancel   16-Mar-15 08:49 DataBackup
    71  Incr        237    18.14 M  OK       16-Mar-15 23:16 DataBackup
====

Device status:

Device "FileStorage" (/bareos) is not open.
Configured device capabilities:
  EOF BSR BSF FSR FSF EOM !REM RACCESS AUTOMOUNT LABEL !ANONVOLS ALWAYSOPEN
Device state:
  !OPENED !TAPE !LABEL !MALLOC !APPEND !READ !EOT !WEOT !EOF !NEXTVOL !SHORT !MOUNTED
  num_writers=0 reserves=0 block=0
Attached Jobs: 
Device parameters:
  Archive name: /bareos Device name: FileStorage
  File=0 block=0
  Min block=64512 Max block=64512
==

Device "TapeStorageLTO5" (/dev/nst0) is not open.
Configured device capabilities:
  EOF BSR BSF FSR FSF EOM REM !RACCESS AUTOMOUNT !LABEL !ANONVOLS !ALWAYSOPEN
Device state:
  !OPENED TAPE !LABEL !MALLOC !APPEND !READ !EOT !WEOT !EOF !NEXTVOL !SHORT !MOUNTED
  num_writers=0 reserves=0 block=0
Attached Jobs: 
Device parameters:
  Archive name: /dev/nst0 Device name: TapeStorageLTO5
  File=0 block=0
  Min block=64512 Max block=64512
==

Device "TapeStorageLTO6" (/dev/nst0) is not open.
Configured device capabilities:
  EOF BSR BSF FSR FSF EOM REM !RACCESS AUTOMOUNT !LABEL !ANONVOLS !ALWAYSOPEN
Device state:
  !OPENED TAPE !LABEL !MALLOC !APPEND !READ !EOT !WEOT !EOF !NEXTVOL !SHORT !MOUNTED
  num_writers=0 reserves=0 block=0
Attached Jobs: 
Device parameters:
  Archive name: /dev/nst0 Device name: TapeStorageLTO6
  File=0 block=0
  Min block=0 Max block=1048576
==
====

Used Volume status:
====

====

rOger

rOger

2015-03-17 08:28

reporter   ~0001319

I attached the output of status storage=TapeLTO5. The tape was still in the drive.

I'm not sure what you want to see. But due the fact we set AlwaysOpen=no I assume that you do not see what you expected. I tried thereafter a "mount" and then the output looks basically like the one already attached except the end with the info about Used Volume status:

---- 8< -------- 8< -------- 8< -------- 8< -------- 8< ----
Used Volume status:
WeeklyData-LTO5-0001 on device "TapeStorageLTO5" (/dev/nst0)
    Reader=0 writers=0 reserves=0 volinuse=0
====

====
---- 8< -------- 8< -------- 8< -------- 8< -------- 8< ----

Even the change of the mentioned flag (Query Crypto Status to "no") does not mean that the issue is solved, right?
mvwieringen

mvwieringen

2015-03-17 14:43

developer   ~0001320

I was somewhat lured into thinking that the plugin didn't want to clear
the key but it seems there are multiple backups in the debug log so the
real info somehow got lost.

Its seems this is what happens:

fs0-sd: scsicrypto-sd.c:401-0 scsicrypto-sd: Loading new crypto key <key>
...
fs0-sd: block.c:591-0 ===== write retry=1 status=-1 errno=16: ERR=Device or resource busy
fs0-sd: block.c:591-0 ===== write retry=2 status=-1 errno=16: ERR=Device or resource busy
fs0-sd: block.c:591-0 ===== write retry=3 status=-1 errno=16: ERR=Device or resource busy
...
fs0-sd: scsicrypto-sd.c:457-0 scsicrypto-sd: Clearing crypto key

So the plugin is setting the key and clearing it but after setting it the
drive seems to go into some weird state where it doesn't want things
to be written to it. Normally that should not happen even when you changed
keys over time it will just mean you cannot read older data on the tape
because of changing the key. But in this case it puts the drive into
some weird mode.

Maybe it makes more sense to test the encryption of the drive outside
of the storage daemon using bscrypto (which essentially uses the same
low level routines) on a test tape.

e.g. disable the bareos-sd and insert a test tape into the drive.

# bscrypto -s - /dev/sg?
Type Some key^D
# bscrypto -e /dev/sg?
# tar cvf /dev/nst0 /etc
# bscrypto -c /dev/sg?

First sets a key, second gets the drive encryption status.
Then we write some data using tar
last clears the key again.

That should at least show how the drive reacts to the crypto operations.
rOger

rOger

2015-03-17 16:23

reporter   ~0001322

I tried the thing you suggested:

# bscrypto -s - /dev/sg5
Type Some key^D
# bscrypto -e /dev/sg5
Drive encryption status:
   Encryption Mode: Encrypt
   Decryption Mode: Mixed
   Raw Decryption Mode Disabled (RDMD): Enabled
   Volume Contains Encrypted Logical Blocks (VCELB): Enabled
   Logical Block encryption parameters: Application Managed
   Key Associated Data (KAD) Descriptor: Normal key
# tar cvf /dev/nst0 /etc
# bscrypto -c /dev/sg5

After this reading from the tape was not possible -> OK because key was missing.

Then I setup the key again and tried to reread the tape. Worked -> OK

I cleared the key and ejected the tape and mounted it again. After setting up the key again I was able to read the backup from tape. -> OK

So in general it seems to work.

Could it be that the positioning operation on the tape could produce the error? Background: we want to make three backups on the same tape. I got a similar error in the case I forgot to rewind the tape to the beginning when I tried to do a restore/test with tar.
rOger

rOger

2015-03-23 10:18

reporter  

20150323_bareos-sd.log (21,103 bytes)
rOger

rOger

2015-03-23 10:59

reporter   ~0001333

I just uploaded a new log file of the storage daemon. From the logs it looks for me that bareos is not able to append to the tape if there is already a backup on it. As you can see in the logs it mounts the tape with the encryption key. Then it tries to write the "session_label". I do not understand the logs; it mentions that the writing of the label seems to be ok ("Leave" log line) but then the following block writing operation fails: This operation fails according to the logs with the lines "retry=[1..3] status=-1 errno=16: ERR=Device or resource busy". After some tries it give up on this tape and want a new one (volume). I'm not sure about this block write error if it belongs to the labeling or already data will be written.

Is there a way to confirm this issue somehow? Could we add some debug code to get more debug information?
mvwieringen

mvwieringen

2015-03-24 11:21

developer   ~0001336

The new log shows exactly the same as before. After setting the crypto key
the drive gets inaccessible for any further write.

Some things that are strange in your sd config:

- Two storage entries accessing the same physical drive
  (May work but not really within the design envelope)
  Maybe change the media type to LTO (instead of LTO4/LTO5)
- Always open to no on tape
  No idea why you set this but for tapes that is most of the time
  not the best option. e.g. is keeps also rewinding tapes this way to
  re-read the tape label.

B.T.W. the tape label gets reread but it doesn't get updated after
its written once to the tape. We also made the actual tape label
unencrypted as then we can determine a encrypted tape vs a empty tape
without a label.

Have you tried using the default blocksize on the tape instead of this
big blocksize ?

Any messages in the unix messages given that the device gives an EBUSY I
wonder if the ST driver logs something.
rOger

rOger

2015-03-26 16:43

reporter   ~0001653

* Two storage entries: As I understood this is to specify which media the drive supports. Unfortunatly a LTO6 drive does not automatically support all older LTO tapes; as an example LTO4 is not supported on such drives. I'm not aware that different media types could be specifed in "Media Type" so this was for me the "cleanest" solution. To be sure that this config section does not interferer with "our" problem I changed the device to /dev/nst1

* Always Open: We set this option to NO because we anyway eject the tape after a backup operation to bring it to a different location (keyword BCM). So when we set this option to YES we just have to unmount it after a backup job by hand - anyway as soon as the tape gets ejected it was rewound before. So in my opinion I do not see any benefit of setting this value to YES.

Yes, we tried also with the default block size with the same error.

Unfortunatly the ST driver does not log anything - is there a special option to force the ST driver to log something?
mvwieringen

mvwieringen

2015-03-26 17:05

developer   ~0001654

You could try if setting always open to yes and release (e.g. not unmount)
the device after the backup to see if that changes anything.

Given the exact same setting using bscrypto work it seems the drive is
somewhat sick as it locks itself with device busy after loading a key.
Never seen that on any piece of other hardware so I'm out of ideas.

Given the ST driver also doesn't log anything only option left if trying
with always open = yes.

If that also doesn't bring anything then I guess we are out of options for
a non subscription bug report.
rOger

rOger

2015-07-01 08:53

reporter   ~0001779

As it turned out the problem was the setting

  Maximum Blocksize = 1048576;

We removed it so it reverted to default and as it looks now it works like expected (as mentioned before).

In my opinion the documentation should be adjusted because we followed the section to optimize the tape speed which suggests to adjust this parameter according to the defined procedure in the manual.

Another side note: You must be careful with mounting of the tape; in our experience the encryption was only active if we mounted the tape manual before a scheduled task wrote data on it.

From my point of view this bug can be closed.
mvwieringen

mvwieringen

2015-07-10 14:07

developer   ~0001787

Given that each vendor implements its hardware different it can happen
that it doesn't support hardware encryption with bigger blocksizes that
was also one of the reasons that I advised earlier to test with the default
blocksize.

As to mounting the tape every trace I saw until now showed things worked
fine as to loading the key and the right key was cleared or loaded when needed.
The key gets loaded when the tape label is verified by an event that is then
triggered and that is true for any invocation of a backup e.g. mounted volume
or not.

I will close the ticket and ask the original author of the tape tuning
document to add a section on the EBUSY error the Linux ST driver seems
to issue when it has buffering problems with bigger blocksizes.

Issue History

Date Modified Username Field Change
2015-03-10 13:41 rOger New Issue
2015-03-10 16:29 mvwieringen Note Added: 0001305
2015-03-10 16:30 mvwieringen Status new => feedback
2015-03-16 10:10 rOger File Added: bareos-sd.conf
2015-03-16 10:10 rOger File Added: 20150316-bareos-sd_NOT_WORKING.log
2015-03-16 10:11 rOger Note Added: 0001317
2015-03-16 10:11 rOger Status feedback => new
2015-03-16 18:08 mvwieringen Note Added: 0001318
2015-03-16 18:08 mvwieringen Status new => feedback
2015-03-17 08:20 rOger File Added: 20150317-bareos-status-storage.txt
2015-03-17 08:28 rOger Note Added: 0001319
2015-03-17 08:28 rOger Status feedback => new
2015-03-17 14:43 mvwieringen Note Added: 0001320
2015-03-17 14:43 mvwieringen Status new => feedback
2015-03-17 16:23 rOger Note Added: 0001322
2015-03-17 16:23 rOger Status feedback => new
2015-03-23 10:18 rOger File Added: 20150323_bareos-sd.log
2015-03-23 10:59 rOger Note Added: 0001333
2015-03-24 11:21 mvwieringen Note Added: 0001336
2015-03-24 11:21 mvwieringen Status new => feedback
2015-03-26 16:43 rOger Note Added: 0001653
2015-03-26 16:43 rOger Status feedback => new
2015-03-26 17:05 mvwieringen Note Added: 0001654
2015-03-26 17:05 mvwieringen Status new => feedback
2015-07-01 08:53 rOger Note Added: 0001779
2015-07-01 08:53 rOger Status feedback => new
2015-07-10 14:07 mvwieringen Note Added: 0001787
2015-07-10 14:09 mvwieringen Status new => closed
2015-07-10 14:09 mvwieringen Resolution open => no change required