View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0000764 | bareos-core | director | public | 2017-01-12 11:13 | 2023-07-04 14:33 |

Field | Value |
---|---|
Reporter | nktl |
Assigned To | bruno-at-bareos |
Priority | high |
Severity | major |
Reproducibility | always |
Status | closed |
Resolution | no change required |
Platform | Linux |
OS | RHEL |
OS Version | 7 |
Product Version | 16.2.4 |
Summary | 0000764: VirtualFull job does not do anything |
Description

Latest Bareos from the repo on vanilla RHEL 7.

A VirtualFull job does not consolidate any data; it is just sitting there. This happens regardless of whether the VirtualFull job is spawned by a Consolidate job or run individually.

The sequence of events is as follows (when using a Consolidate job to spawn VirtualFulls):

1. The Consolidate job is started with the following definition:

```
Job {
  Name = "AI-Consolidate-Job"
  Enabled = yes
  Client = "some-client"
  FileSet = "some-fileset"
  Accurate = Yes
  Pool = AI-Consolidated
  Type = Consolidate
  Schedule = "Consolidation"
  Storage = File
  Messages = Standard
  Priority = 10
}
```

2. Consolidate spawns child jobs and goes through the data:

```
12-Jan 09:44 bareos-dir JobId 592: Consolidating JobId 594 started.
12-Jan 09:44 bareos-dir JobId 592: Looking at always incremental job SomeBackupJob
12-Jan 09:44 bareos-dir JobId 593: Consolidating JobIds 5,25,45,65,85,109,131,153,175,197,219,241,263,285,307,329,351,373,395,417,439
12-Jan 09:44 bareos-dir JobId 592: SomeBackupJob: considering jobs older than 05-Jan-2017 09:44:02 for consolidation.
12-Jan 09:44 bareos-dir JobId 592: SomeBackupJob: Start new consolidation
12-Jan 09:44 bareos-dir JobId 592: Using Catalog "MyCatalog"
```

3. The Consolidate job finishes successfully:

```
12-Jan 09:44 bareos-dir JobId 592: BAREOS 16.2.4 (01Jul16): 12-Jan-2017 09:44:06
  JobId:          592
  Job:            AI-Consolidate-Job.2017-01-12_09.44.00_09
  Scheduled time: 12-Jan-2017 09:44:00
  Start time:     12-Jan-2017 09:44:02
  End time:       12-Jan-2017 09:44:06
  Termination:    Consolidate OK
```

4. At the same time, a VirtualFull job is spawned:

```
12-Jan 09:44 bareos-dir JobId 595: Start Virtual Backup JobId 595, Job=SomeBackupJob.2017-01-12_09.44.03_13
```

There is a lot of database activity at this point.

5. Eventually the database activity ceases and the following entry appears in the log:

```
12-Jan 09:49 bareos-dir JobId 595: Bootstrap records written to /var/lib/bareos/bareos-dir.restore.5.bsr
```

At this point nothing else happens. The VirtualFull job is just sitting there in 'Running' state; there is no I/O or CPU activity on the server at all. I left it in this state for days, and it does not seem to do anything.

When stracing the Bareos processes, they seem to be just spinning in a sleep state:

```
[root@backup /]# strace -fF -p `pidof bareos-dir`
[pid 163519] <... nanosleep resumed> NULL) = 0
[pid 163519] nanosleep({2, 0}, <unfinished ...>
[pid 163526] <... nanosleep resumed> NULL) = 0
[pid 163526] nanosleep({2, 0}, <unfinished ...>
[pid 163529] <... nanosleep resumed> NULL) = 0
```

and:

```
[root@backup /]# strace -fF -p `pidof bareos-sd`
[pid 162137] nanosleep({30, 0}, <unfinished ...>
[pid 162183] <... restart_syscall resumed> ) = 0
[pid 162183] nanosleep({30, 0}, <unfinished ...>
[pid 162175] <... restart_syscall resumed> ) = 0
[pid 162175] nanosleep({30, 0}, <unfinished ...>
[pid 162176] <... restart_syscall resumed> ) = 0
```

It looks like some obscure bug. Any idea how to tell what this thing is waiting for? The pools and media (all disk based) are all accessible.
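For probing what a stuck job is waiting on from inside Bareos, the usual first stop is bconsole rather than strace. A minimal sketch, using the jobid and storage name that appear in this report:

```
# bconsole: what does the Director think the job is doing?
status director
list joblog jobid=595

# ask the Storage Daemon about its devices; a VirtualFull that cannot
# obtain a second device for reading typically shows up as waiting here
status storage=File
```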
Steps To Reproduce

Start a Consolidate job using the following definition:

```
Job {
  Name = "AI-Consolidate-Job"
  Enabled = yes
  Client = "some-client"
  FileSet = "some-fileset"
  Accurate = Yes
  Pool = AI-Consolidated
  Type = Consolidate
  Schedule = "Consolidation"
  Storage = File
  Messages = Standard
  Priority = 10
}
```
Tags: No tags attached.
joergs, 2017-01-12 17:57 (note 0002503)

Do you have multiple storages in place, as requested in http://doc.bareos.org/master/html/bareos-manual-main-reference.html#StoragesAndPools ?

If yes, what is the output of: list joblog jobid=595
nktl, 2017-01-12 18:36 (note 0002504)

Yes, there are multiple storage pools, defined as:

```
Pool {
  Name = AI-Incremental
  Pool Type = Backup
  Recycle = yes
  Auto Prune = no
  Maximum Volume Bytes = 100G
  Label Format = "AI-Incremental-"
  Volume Use Duration = 23h
  Storage = File
  Next Pool = AI-Consolidated
}
```

and:

```
Pool {
  Name = AI-Consolidated
  Pool Type = Backup
  Recycle = yes
  Auto Prune = no
  Volume Retention = 10 years
  Maximum Volume Bytes = 100G
  Label Format = "AI-Consolidated-"
  Volume Use Duration = 23h
  Storage = File
  Next Pool = AI-Longterm
}
```

(plus there is a long-term pool too)

I have just realized that both pools point to the same storage definition, 'File', which is not the case in the docs. Is this a problem?

The command does not reveal much:

```
2017-01-12 17:27:14 bareos-dir JobId 616: Start Virtual Backup JobId 616, Job=SomeJob.2017-01-12_17.27.14_15
2017-01-12 17:27:14 bareos-dir JobId 616: Consolidating JobIds 7,27,47,67,87,111,133,155,177,199,221,243,265,287,309,331,353,375,397,419,441
2017-01-12 17:33:02 bareos-dir JobId 616: Bootstrap records written to /var/lib/bareos/bareos-dir.restore.5.bsr
```
joergs, 2017-01-12 18:51 (note 0002505)

Yes, having only one storage is the problem, as Bareos needs separate storages for reading and writing. You are right, your joblog does not reveal this fact.

See http://doc.bareos.org/master/html/bareos-manual-main-reference.html#UsingMultipleStorageDevices
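For reference, a minimal sketch of the Director-side layout the linked section describes: two Storage resources, each pointing at its own device on the Storage Daemon, so that a VirtualFull can read from one and write to the other. All names, the address and the password below are placeholders, not taken from the reporter's configuration:

```
# bareos-dir.conf (sketch; names, address and password are hypothetical)
Storage {
  Name = File                       # existing storage, target of the incremental jobs
  Address = backup.example.com      # host running bareos-sd
  Password = "sd-secret"            # must match the SD's Director resource
  Device = FileStorage              # device defined in bareos-sd.conf
  Media Type = File
}

Storage {
  Name = File-Consolidate          # second storage, write target for the VirtualFull
  Address = backup.example.com     # the same SD host is possible; the Device must differ
  Password = "sd-secret"
  Device = FileStorage-Consolidate # a different device on the same SD
  Media Type = File                # see the linked docs for Media Type considerations
}
```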
nktl, 2017-01-12 18:57 (note 0002506)

OK, many thanks for clarifying! I created another storage (File-Consolidate), pointing to the same SD and device, and assigned it to the Consolidate job, but this does not seem to be enough; the job is stuck again.

Does this other storage have to point to a different SD, or is a different device on the same SD good enough? It would be good to understand what the exact requirement is here.
nktl, 2017-01-12 19:16 (note 0002507)

OK, I created a new device on the SD pointing to a different filesystem path, and used this device as a storage target in the Director. It seems to work now!

It might be a good idea to clarify the documentation here, as it is not clear what exactly is required to get this up and running: it only says that "at least two storages are needed", but this can be interpreted as 'storage pools' or 'Storage definitions in the Director'. Some additional logging to detect this situation would possibly be useful too.
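To make the resolution concrete, here is a sketch of the Storage Daemon side of that fix, with a hypothetical device name and path (the thread does not give the actual ones): a second Device resource whose Archive Device lives on a different filesystem path, matching the Device name used by a second Storage definition in the Director such as the File-Consolidate sketch above:

```
# bareos-sd.conf (sketch; the name and path are hypothetical)
Device {
  Name = FileStorage-Consolidate            # referenced by the Director's second Storage resource
  Media Type = File
  Archive Device = /srv/bareos/consolidate  # a different path than the first File device
  LabelMedia = yes                          # let Bareos label new volumes itself
  Random Access = yes
  AutomaticMount = yes
  RemovableMedia = no
  AlwaysOpen = no
}
```

With this in place, pointing the AI-Consolidated pool (or the Consolidate job) at the second storage gives the VirtualFull one device to read the incrementals from and a separate device to write the consolidated volume to, which appears to be exactly the "at least two storages" requirement the reporter found underdocumented.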
bruno-at-bareos, 2023-07-04 14:33 (note 0005120)

Forgotten resolved issue.
Date Modified | Username | Field | Change |
---|---|---|---|
2017-01-12 11:13 | nktl | New Issue | |
2017-01-12 17:57 | joergs | Note Added: 0002503 | |
2017-01-12 18:00 | joergs | Status | new => feedback |
2017-01-12 18:36 | nktl | Note Added: 0002504 | |
2017-01-12 18:36 | nktl | Status | feedback => new |
2017-01-12 18:51 | joergs | Note Added: 0002505 | |
2017-01-12 18:57 | nktl | Note Added: 0002506 | |
2017-01-12 19:16 | nktl | Note Added: 0002507 | |
2023-07-04 14:33 | bruno-at-bareos | Assigned To | => bruno-at-bareos |
2023-07-04 14:33 | bruno-at-bareos | Status | new => closed |
2023-07-04 14:33 | bruno-at-bareos | Resolution | open => no change required |
2023-07-04 14:33 | bruno-at-bareos | Note Added: 0005120 |