View Issue Details

ID:               0000764
Project:          bareos-core
Category:         [All Projects] director
View Status:      public
Last Update:      2017-01-12 19:16
Reporter:         nktl
Assigned To:
Priority:         high
Severity:         major
Reproducibility:  always
Status:           new
Resolution:       open
Platform:         Linux
OS:               RHEL
OS Version:       7
Product Version:  16.2.4
Fixed in Version:
Summary: 0000764: VirtualFull job does not do anything
Description: Latest Bareos from the repo on vanilla RHEL7.

The VirtualFull job does not consolidate any data - it just sits there.

This happens regardless of whether the VirtualFull job is spawned by the Consolidate job or run individually.

The sequence of events is as follows (when using Consolidate job to spawn VirtualFulls):

1. Consolidate job is started with the following definition:

Job {
  Name = "AI-Consolidate-Job"
  Enabled = yes
  Client = "some-client"
  FileSet = "some-fileset"
  Accurate = Yes
  Pool = AI-Consolidated
  Type = Consolidate
  Schedule = "Consolidation"
  Storage = File
  Messages = Standard
  Priority = 10
}


2. Consolidate spawns child jobs and goes through the data:

12-Jan 09:44 bareos-dir JobId 592: Consolidating JobId 594 started.
12-Jan 09:44 bareos-dir JobId 592: Looking at always incremental job SomeBackupJob
12-Jan 09:44 bareos-dir JobId 593: Consolidating JobIds 5,25,45,65,85,109,131,153,175,197,219,241,263,285,307,329,351,373,395,417,439
12-Jan 09:44 bareos-dir JobId 592: SomeBackupJob: considering jobs older than 05-Jan-2017 09:44:02 for consolidation.
12-Jan 09:44 bareos-dir JobId 592: SomeBackupJob: Start new consolidation
12-Jan 09:44 bareos-dir JobId 592: Using Catalog "MyCatalog"


3. Consolidate job finishes successfully:

12-Jan 09:44 bareos-dir JobId 592: BAREOS 16.2.4 (01Jul16): 12-Jan-2017 09:44:06
  JobId: 592
  Job: AI-Consolidate-Job.2017-01-12_09.44.00_09
  Scheduled time: 12-Jan-2017 09:44:00
  Start time: 12-Jan-2017 09:44:02
  End time: 12-Jan-2017 09:44:06
  Termination: Consolidate OK

4. At the same time, VirtualFull job is spawned:

12-Jan 09:44 bareos-dir JobId 595: Start Virtual Backup JobId 595, Job=SomeBackupJob.2017-01-12_09.44.03_13

There is a lot of database activity at this point.

5. Eventually database activity ceases and following entry appears in the log:

12-Jan 09:49 bareos-dir JobId 595: Bootstrap records written to /var/lib/bareos/bareos-dir.restore.5.bsr

At this point nothing else happens. The VirtualFull job just sits in the 'Running' state - there is no I/O or CPU activity on the server at all. I left it in this state for days and it does not seem to do anything. When stracing the bareos processes, they appear to be just spinning in a sleep loop:

[root@backup/]# strace -fF -p `pidof bareos-dir`
[pid 163519] <... nanosleep resumed> NULL) = 0
[pid 163519] nanosleep({2, 0}, <unfinished ...>
[pid 163526] <... nanosleep resumed> NULL) = 0
[pid 163526] nanosleep({2, 0}, <unfinished ...>
[pid 163529] <... nanosleep resumed> NULL) = 0

and:

[root@backup/]# strace -fF -p `pidof bareos-sd`

[pid 162137] nanosleep({30, 0}, <unfinished ...>
[pid 162183] <... restart_syscall resumed> ) = 0
[pid 162183] nanosleep({30, 0}, <unfinished ...>
[pid 162175] <... restart_syscall resumed> ) = 0
[pid 162175] nanosleep({30, 0}, <unfinished ...>
[pid 162176] <... restart_syscall resumed> ) = 0


It looks like some obscure bug - any idea how to tell what this thing is waiting for? The pools and media (all disk-based) are all accessible.
Steps To Reproduce: Start a Consolidate job using the following definition:

Job {
  Name = "AI-Consolidate-Job"
  Enabled = yes
  Client = "some-client"
  FileSet = "some-fileset"
  Accurate = Yes
  Pool = AI-Consolidated
  Type = Consolidate
  Schedule = "Consolidation"
  Storage = File
  Messages = Standard
  Priority = 10
}
Tags: No tags attached.
bareos-master: impact yes
bareos-master: action
bareos-19.2: impact
bareos-19.2: action
bareos-18.2: impact
bareos-18.2: action
bareos-17.2: impact
bareos-17.2: action
bareos-16.2: impact yes
bareos-16.2: action
bareos-15.2: impact no
bareos-15.2: action
bareos-14.2: impact no
bareos-14.2: action
bareos-13.2: impact no
bareos-13.2: action
bareos-12.4: impact yes
bareos-12.4: action

Activities

joergs (administrator)   2017-01-12 17:57   ~0002503

Do you have multiple storages in place, as requested in http://doc.bareos.org/master/html/bareos-manual-main-reference.html#StoragesAndPools ?

If yes, what is the output of

list joblog jobid=595
nktl (reporter)   2017-01-12 18:36   ~0002504

Yes, there are multiple storage pools defined as:

Pool {
  Name = AI-Incremental
  Pool Type = Backup
  Recycle = yes
  Auto Prune = no
  Maximum Volume Bytes = 100G
  Label Format = "AI-Incremental-"
  Volume Use Duration = 23h
  Storage = File
  Next Pool = AI-Consolidated
}


and:

Pool {
  Name = AI-Consolidated
  Pool Type = Backup
  Recycle = yes
  Auto Prune = no
  Volume Retention = 10 years
  Maximum Volume Bytes = 100G
  Label Format = "AI-Consolidated-"
  Volume Use Duration = 23h
  Storage = File
  Next Pool = AI-Longterm
}

(plus there is a long-term pool too)

I have just realized that both pools point to the same storage definition, 'File' - which is not the case in the docs. Is this a problem?

The command does not reveal much:

2017-01-12 17:27:14 bareos-dir JobId 616: Start Virtual Backup JobId 616, Job=SomeJob.2017-01-12_17.27.14_15
2017-01-12 17:27:14 bareos-dir JobId 616: Consolidating JobIds 7,27,47,67,87,111,133,155,177,199,221,243,265,287,309,331,353,375,397,419,441
2017-01-12 17:33:02 bareos-dir JobId 616: Bootstrap records written to /var/lib/bareos/bareos-dir.restore.5.bsr
joergs (administrator)   2017-01-12 18:51   ~0002505

Yes, having only one storage is the problem, as Bareos will need a separate one for reading and writing.

You are right. Your joblog does not reveal this fact.

See http://doc.bareos.org/master/html/bareos-manual-main-reference.html#UsingMultipleStorageDevices
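
For illustration only (the resource names, address, password, and device names below are hypothetical and not taken from this report), two director Storage resources addressing two separate devices on the same bareos-sd might look roughly like this:

Storage {
  Name = File                    # used for writing incremental backups
  Address = backup.example.com   # hypothetical SD address
  Password = "sd-password"       # hypothetical
  Device = FileStorage           # first device defined on the SD
  Media Type = File
}

Storage {
  Name = File-Consolidate        # second storage, used by the consolidated pool
  Address = backup.example.com
  Password = "sd-password"
  Device = FileStorage2          # a different device on the same SD
  Media Type = File
}

With a setup along these lines, the VirtualFull job can read volumes through one device while writing the consolidated volume through the other.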
nktl (reporter)   2017-01-12 18:57   ~0002506

OK, many thanks for clarifying!

I created another storage (File-Consolidate), pointing to the same SD and device, and assigned it to the Consolidate job - but this does not seem to be enough; the job is stuck again.

Does this other storage have to point to a different SD? Or is a different device on the same SD good enough? It would be good to understand what the exact requirement is here.
nktl (reporter)   2017-01-12 19:16   ~0002507

OK, I created a new device on the SD pointing to a different filesystem path, and used this device as a storage target in the director - it seems to work now!

It might be a good idea to clarify the documentation here, as it is not clear what exactly is required to get this up and running: it only says that "at least two storages are needed", but this can be interpreted as 'Storage Pools' or as 'Storage definitions in the director'. Some additional logging to detect this situation would be useful too.
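
As a hedged sketch of the SD-side change described above (the device name and filesystem path are assumptions, not taken from this report), the second device on the same bareos-sd could look like:

Device {
  Name = FileStorage2
  Media Type = File
  Archive Device = /srv/bareos/consolidate  # a different filesystem path than the first device
  LabelMedia = yes                          # allow the SD to label new volumes
  Random Access = yes
  AutomaticMount = yes
  RemovableMedia = no
  AlwaysOpen = no
}

The corresponding director Storage resource then references this device via its Device directive, giving Bareos one device to read from and one to write to during the VirtualFull.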

Issue History

Date Modified Username Field Change
2017-01-12 11:13 nktl New Issue
2017-01-12 17:57 joergs Note Added: 0002503
2017-01-12 18:00 joergs bareos-master: impact => yes
2017-01-12 18:00 joergs bareos-16.2: impact => yes
2017-01-12 18:00 joergs bareos-15.2: impact => no
2017-01-12 18:00 joergs bareos-14.2: impact => no
2017-01-12 18:00 joergs bareos-13.2: impact => no
2017-01-12 18:00 joergs bareos-12.4: impact => yes
2017-01-12 18:00 joergs Status new => feedback
2017-01-12 18:36 nktl Note Added: 0002504
2017-01-12 18:36 nktl Status feedback => new
2017-01-12 18:51 joergs Note Added: 0002505
2017-01-12 18:57 nktl Note Added: 0002506
2017-01-12 19:16 nktl Note Added: 0002507