View Issue Details
|ID||Project||Category||View Status||Date Submitted||Last Update|
|0001308||bareos-core||[All Projects] General||public||2021-01-19 02:06||2021-08-18 10:55|
|Reporter||Ruth Ivimey-Cook||Assigned To|
|Platform||amd64||OS||Linux||OS Version||Ubuntu 20.04|
|Fixed in Version|
|Summary||0001308: If a job fails, but has written files to tape, don't leave tape full of undescribed data.|
|Description||If a job fails to complete, the job is currently terminated and, AFAICT, no records describing it are added to the database. Nevertheless, the data it wrote remains on tape and consumes tape space, potentially several tapes.|
I would like the behaviour to be modified to either:
1. Clean up, that is, return the tapes and catalogue to roughly the state they were in before the failed job started:
- The tape drive is rewound to the end of the last good job and a tape EOD (end-of-data) mark is written there.
- If the job spanned multiple complete tapes, those tapes are marked for recycling in the appropriate way in the database.
- If the job started on a tape that is no longer mounted, the operator is prompted (email/console) to insert it so that an EOD can be written, with the option of not doing so and leaving that tape Full.
2. Describe what was written in the DB: Let the job be described in the catalogue completely, up to the point of failure.
- This may involve rewinding the tape to find the end of the good data, or even doing a full scan of the tape (or even of the whole job, though preferably not!).
- I'm not sure whether Bareos retains enough info to properly continue from this state, but ideally a subsequent Incremental or Differential backup following this would be able to back up the 'right' things, bearing in mind the failed job.
3. Work harder at not failing at all: If a tape error happens, call the operator and provide ways to carry on.
- If there is an error when re-reading the end-of-tape block, rewind further and rescan the written data to determine what is there, write an EOF, and then carry on with a new tape in the usual way.
- If the tape drive itself fails in some way, enable the operator to direct the job onto another drive and carry on.
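To make option 1 more concrete, here is a rough sketch of the per-tape decision it implies. This is only an illustration of the request, not Bareos code: `Volume`, `Action` and `plan_cleanup` are hypothetical names, and the rule that a tape holding nothing but failed-job data gets recycled is my assumption about how "complete tapes" would be detected.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple


class Action(Enum):
    # Hypothetical action names; none of these exist in Bareos.
    TRUNCATE_AT_LAST_GOOD_JOB = "rewind to end of last good job and write EOD"
    MARK_FOR_RECYCLING = "mark volume for recycling in the catalogue"
    PROMPT_OPERATOR = "ask operator to mount volume, or leave it Full"


@dataclass
class Volume:
    name: str
    mounted: bool          # is the tape currently in a drive?
    had_prior_jobs: bool   # did good jobs exist on it before the failed job?


def plan_cleanup(volumes: List[Volume]) -> List[Tuple[str, Action]]:
    """Decide, per tape touched by the failed job, how to undo its writes."""
    plan = []
    for vol in volumes:
        if not vol.had_prior_jobs:
            # Assumption: tape contains only failed-job data, so recycle it whole.
            plan.append((vol.name, Action.MARK_FOR_RECYCLING))
        elif vol.mounted:
            # Good data precedes the failed job and the tape is reachable.
            plan.append((vol.name, Action.TRUNCATE_AT_LAST_GOOD_JOB))
        else:
            # Good data precedes the failed job but the tape must be re-inserted.
            plan.append((vol.name, Action.PROMPT_OPERATOR))
    return plan


if __name__ == "__main__":
    touched = [
        Volume("Tape-0001", mounted=False, had_prior_jobs=True),
        Volume("Tape-0002", mounted=False, had_prior_jobs=False),
    ]
    for name, action in plan_cleanup(touched):
        print(f"{name}: {action.value}")
```

The point of splitting "plan" from "execute" is that the operator could be shown the plan (and decline the prompt, leaving a tape Full) before anything is written to tape or catalogue.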
|Tags||No tags attached.|