View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0001012 | bareos-core | director | public | 2018-09-25 17:01 | 2023-10-12 09:55 |
Reporter | stephand | Assigned To | bruno-at-bareos | | |
Priority | normal | Severity | minor | Reproducibility | always |
Status | closed | Resolution | fixed | | |
Product Version | 17.2.7 | | | | |
Summary | 0001012: If a Job terminates with error then status client should not report OK for that job |
Description | When a fatal SD error occurs while a restore job is running, the "status client=<ClientName>" output nevertheless shows an OK status for the failed restore job. |
Steps To Reproduce | While the restore job is running, fill the volume currently being used with zeroes using dd. (Note: never do this on a production system.) Example:

```
[root@vgr-c7bpdev2-test-pgsql storage]# ls -lah
total 299M
drwxrwxr-x. 2 bareos bareos  23 Sep 25 10:39 .
drwxrwx---. 3 bareos bareos 210 Sep 25 11:49 ..
-rw-r-----. 1 bareos bareos 299M Sep 25 10:46 Full-0001
[root@vgr-c7bpdev2-test-pgsql storage]# ll /dev/zero
crw-rw-rw-. 1 root root 1, 5 Sep 25 10:24 /dev/zero
[root@vgr-c7bpdev2-test-pgsql storage]# dd if=/dev/zero of=Full-0001 bs=1M count=300
300+0 records in
300+0 records out
314572800 bytes (315 MB) copied, 0.168414 s, 1.9 GB/s
```

This will cause a fatal SD error:

```
*restore client=bareos-fd select all done yes
Using Catalog "MyCatalog"
Automatically selected FileSet: Data1Set
+-------+-------+-----------+----------+---------------------+------------+
| jobid | level | jobfiles  | jobbytes | starttime           | volumename |
+-------+-------+-----------+----------+---------------------+------------+
| 1     | F     | 1,001,111 | 0        | 2018-09-25 10:39:01 | Full-0001  |
+-------+-------+-----------+----------+---------------------+------------+
You have selected the following JobId: 1

Building directory tree for JobId(s) 1 ... +++++++++++++++++++++++++++++++++++++++++++++++++
1,000,000 files inserted into the tree and marked for extraction.

Bootstrap records written to /var/lib/bareos/bareos-dir.restore.4.bsr

The job will require the following
   Volume(s)                 Storage(s)                SD Device(s)
===========================================================================

   Full-0001                 File                      FileStorage

Volumes marked with "*" are online.

1,001,111 files selected to be restored.

Using Catalog "MyCatalog"
Job queued. JobId=7
*
You have messages.
*mes
25-Sep 12:28 bareos-dir JobId 7: Start Restore Job RestoreFiles.2018-09-25_12.28.13_41
25-Sep 12:28 bareos-dir JobId 7: Using Device "FileStorage" to read.
25-Sep 12:28 bareos-sd JobId 7: Ready to read from volume "Full-0001" on device "FileStorage" (/var/lib/bareos/storage).
25-Sep 12:28 bareos-sd JobId 7: Forward spacing Volume "Full-0001" to file:block 0:219.
25-Sep 12:28 bareos-sd JobId 7: Error: block.c:288 Volume data error at 0:18192512! Wanted ID: "BB02", got "". Buffer discarded.
25-Sep 12:28 bareos-sd JobId 7: Releasing device "FileStorage" (/var/lib/bareos/storage).
25-Sep 12:28 bareos-sd JobId 7: Fatal error: fd_cmds.c:236 Command error with FD, hanging up.
25-Sep 12:28 bareos-dir JobId 7: Error: Bareos bareos-dir 17.2.7 (16Jul18):
  Build OS:               x86_64-redhat-linux-gnu redhat CentOS Linux release 7.5.1804 (Core)
  JobId:                  7
  Job:                    RestoreFiles.2018-09-25_12.28.13_41
  Restore Client:         bareos-fd
  Start time:             25-Sep-2018 12:28:15
  End time:               25-Sep-2018 12:28:19
  Elapsed time:           4 secs
  Files Expected:         1,001,111
  Files Restored:         74,872
  Bytes Restored:         0
  Rate:                   0.0 KB/s
  FD Errors:              0
  FD termination status:  OK
  SD termination status:  Fatal Error
  Termination:            *** Restore Error ***
```

But the "status client" output lists this restore job as OK:

```
*status client=bareos-fd
Connecting to Client bareos-fd at localhost:9102

vgr-c7bpdev2-test-pgsql-fd Version: 17.2.7 (16 Jul 2018) x86_64-redhat-linux-gnu redhat CentOS Linux release 7.5.1804 (Core)
Daemon started 25-Sep-18 10:25. Jobs: run=7 running=0.
 Heap: heap=135,168 smbytes=113,964 max_bytes=181,086 bufs=89 max_bufs=122
 Sizeof: boffset_t=8 size_t=8 debug=0 trace=0 bwlimit=0kB/s

Running Jobs:
bareos-dir (director) connected at: 25-Sep-18 12:28
No Jobs running.
====

Terminated Jobs:
 JobId  Level      Files      Bytes  Status   Finished         Name
======================================================================
     1  Full   1,001,111          0  OK       25-Sep-18 10:39  Data1
     2  Full     106,836          0  Error    25-Sep-18 10:45  Data1
     3  Full     175,194          0  Error    25-Sep-18 10:46  Data1
     4         1,001,111          0  OK       25-Sep-18 10:50  RestoreFiles
     5            36,734          0  Error    25-Sep-18 10:53  RestoreFiles
     6            49,344          0  Error    25-Sep-18 11:49  RestoreFiles
     7            74,872          0  OK       25-Sep-18 12:28  RestoreFiles
====
```
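To see the discrepancy side by side, it helps to compare the director's catalog view of the job with the client's view. A minimal sketch, assuming bconsole is available on the director host and reusing JobId 7 from the transcript above:

```
# Director/catalog view of the job: its termination status reflects
# the restore error ("*** Restore Error ***" in the job report).
echo "llist jobid=7" | bconsole

# Client view: the Terminated Jobs list on the FD wrongly reports the
# same job as OK, which is the bug described here.
echo "status client=bareos-fd" | bconsole
```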
Additional Information | The expected behavior is that the status of a job is the same no matter which way you look at it. Note that this is not reproducible by simply stopping bareos-sd while a restore job is running. I also wasn't able to reproduce this kind of status discrepancy for backup jobs; in both cases the "status client" termination status correctly showed Error. |
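The terminal status recorded in the catalog can also be checked directly, which makes the comparison unambiguous. A hedged sketch, assuming the default PostgreSQL catalog database named "bareos" (an assumption; adjust to the local setup); the single-character JobStatus codes include 'T' (terminated OK), 'E' (terminated with errors), 'f' (fatal error) and 'A' (canceled):

```
# Read the job's terminal status straight from the catalog.
# (The database name "bareos" and the sudo access path are assumptions.)
sudo -u bareos psql bareos -c \
  "SELECT jobid, name, jobstatus FROM job WHERE jobid = 7;"
```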
Tags | No tags attached. | ||||
Notes

bruno-at-bareos | 2023-09-11 17:21 | Is this still happening?
stephand | 2023-10-11 18:57 |
The behaviour changed; now getting:

```
11-Oct 16:47 bareos-sd JobId 12: Error: stored/block.cc:294 Volume data error at 0:0! Wanted ID: "BB02", got "". Buffer discarded.
11-Oct 16:47 bareos-sd JobId 12: Warning: stored/acquire.cc:325 Read acquire: Requested Volume "Full-0001" on "FileStorage" (/var/lib/bareos/storage) is not a Bareos labeled Volume, because: ERR=stored/block.cc:294 Volume data error at 0:0! Wanted ID: "BB02", got "". Buffer discarded.
11-Oct 16:47 bareos-sd JobId 12: Please mount read Volume "Full-0001" for:
    Job:          RestoreFiles.2023-10-11_16.47.46_20
    Storage:      "FileStorage" (/var/lib/bareos/storage)
    Pool:         Incremental
    Media type:   File
```

and the job stays in status "waiting for a mount request". After cancelling, the job correctly displays status "Cancel" in the "status client" output. I just verified this with packages from https://download.bareos.org/next/, so the version is 23.0.0~pre1020.b1d94178f (09 October 2023). I can't tell since when the behaviour changed, but we can close this, as the reported issue obviously no longer exists.
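For a job stuck on such a mount request, cancelling it from bconsole is the way out; afterwards the client reports the job consistently as cancelled. A minimal sketch, reusing JobId 12 from the log above:

```
# Cancel the job waiting for the mount request, then re-check the
# client's view; the Terminated Jobs list now shows status "Cancel".
echo "cancel jobid=12" | bconsole
echo "status client=bareos-fd" | bconsole
```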
bruno-at-bareos | 2023-10-12 09:55 | In the meantime, it has been fixed.
Date Modified | Username | Field | Change |
---|---|---|---|
2018-09-25 17:01 | stephand | New Issue | |
2023-09-11 17:21 | bruno-at-bareos | Assigned To | => bruno-at-bareos |
2023-09-11 17:21 | bruno-at-bareos | Status | new => feedback |
2023-09-11 17:21 | bruno-at-bareos | Note Added: 0005408 | |
2023-10-11 18:57 | stephand | Note Added: 0005469 | |
2023-10-11 18:57 | stephand | Status | feedback => assigned |
2023-10-12 09:55 | bruno-at-bareos | Status | assigned => closed |
2023-10-12 09:55 | bruno-at-bareos | Resolution | open => fixed |
2023-10-12 09:55 | bruno-at-bareos | Note Added: 0005470 | |