View Issue Details

IDProjectCategoryView StatusLast Update
0001598bareos-coredirectorpublic2024-02-21 11:14
ReporterInt Assigned To 
PrioritynormalSeveritymajorReproducibilityhave not tried
Status newResolutionopen 
PlatformLinuxOSRHEL (and clones)OS Version8
Summary0001598: Bareos 23.0.1 creates new Volume in catalog but no physical File volume in storage
DescriptionI ran into the issue that Bareos 23.0.1 created a new Volume in catalog but no physical File volume in storage.

This happend during a running backup job when the Pool reached its limit of the configured "Maximum Volumes". The job then got stuck because no free volume was available - message in the job log:

26 2024-02-09 06:25:40 truenas-sd JobId 35326: Job filebackup-s01-fd.2024-02-09_01.00.00_16 is waiting. Cannot find any appendable volumes.
Please use the "label" command to create a new Volume for:
Storage: "FileStorageTrueNAS0002" (/var/lib/bareos/storage)
Pool: Incremental
Media type: File

I then increased the "Maximum Volumes" in the Pool config file and issued a "reload" command in bconsole to load the changed configuration.
The job then continued and created a new Volume - message in the job log:

27 2024-02-09 09:51:01 bareos-dir JobId 35326: Created new Volume "Incremental-0920" in catalog.

But Bareos 23.0.1 created the new Volume only in the catalog but no physical File volume in storage. This lead to this error:

28 2024-02-09 09:51:01 truenas-sd JobId 35326: Warning: stored/mount.cc:248 Open device "FileStorageTrueNAS0002" (/var/lib/bareos/storage) Volume "Incremental-0920" failed: ERR=stored/dev.cc:598 Could not open: /var/lib/bareos/storage/Incremental-0920, ERR=No such file or directory
Steps To ReproduceSee Description
TagsNo tags attached.

Activities

bruno-at-bareos

bruno-at-bareos

2024-02-12 13:32

manager   ~0005761

Just to confirm: you didn't do the update pool from resource command after your reload right ?
Int

Int

2024-02-12 14:02

reporter   ~0005762

I edited the config file /etc/bareos/bareos-dir.d/pool/Incremental.conf with a text editor and increased the "Maximum Volumes".
Then I started bconsole and issued the "reload" command.
I did not do the update pool from resource command after the "reload" command in bconsole.
bruno-at-bareos

bruno-at-bareos

2024-02-13 14:00

manager   ~0005770

Hi we would like to check if its something new. We tried to reproduce the case and the software is working as expected.
Increase pool limit, reload, and a new volume is created and then used.

We would like to see the full joblog, which you can extract with (then compress and attach here)
bconsole <<< "list joblog jobid=35326" > /var/tmp/jobid_35326.log

We have seen in the past, similar issue, but mostly it was because at the time of creating the volume the /var/lib/bareos/storage mount point was not available, and your error looks really similar.
Otherwise the sd would have created that file and would have failed later being not able to read the label.
It maybe be worse the effort to double check system's logs happening at incident time.
Here the SD can't access the file (so the directory).
Int

Int

2024-02-13 14:46

reporter   ~0005771

I attached the full joblog as requested.

I checked the system's logs of the storage server and it doesn't show any errors during that time.
Later (on Feb 13th) the storage server created new volumes successfully without any changes to config or access rights:

root@truenas-sd:~ # ls -la /var/lib/bareos/storage/
total 114812986408
drwxr-x--- 2 bareos bareos 375 Feb 13 01:38 .
drwxr-x--- 3 bareos bareos 4 Feb 6 14:45 ..
...
-rw-r----- 1 bareos bareos 39998980267 Feb 8 05:19 Incremental-0918
-rw-r----- 1 bareos bareos 39999236662 Feb 9 01:00 Incremental-0919
-rw-r----- 1 bareos bareos 26896987171 Feb 13 01:55 Incremental-0934
-rw-r----- 1 bareos bareos 39999706259 Feb 13 02:04 Incremental-0935
-rw-r----- 1 bareos bareos 32469240858 Feb 13 12:00 Incremental-0936
-rw-r----- 1 bareos bareos 37987976031 Feb 13 01:45 Incremental-0937
-rw-r----- 1 bareos bareos 39999120592 Feb 13 01:38 Incremental-0938
-rw-r----- 1 bareos bareos 35671581125 Feb 13 02:05 Incremental-0939
root@truenas-sd:~ #


I guess the special circumstance here was that the job was running and waiting for a new volume because it ran out of empty/recyclable volumes. And I increased pool limit while the job was running and waiting.
jobid_35326.log.gz (2,096 bytes)
bruno-at-bareos

bruno-at-bareos

2024-02-13 14:51

manager   ~0005772

Thanks for the log.

You assumption "I guess the special circumstance here was that the job was running and waiting for a new volume because it ran out of empty/recyclable volumes. And I increased pool limit while the job was running and waiting." is what we tested and tried to reproduce without seeing any failures :-( unfortunately as it is more harder to fix something that doesn't failed, or only on unknown circonstances.

I'm seeing root@truenas-sd is the SD detached from the dir ?
Int

Int

2024-02-13 15:04

reporter   ~0005773

Yes, baros-dir is running in a VM with AlmaLinux release 8.9 (Midnight Oncilla)
bareos-sd is running in a jail with FreeBSD 13.2-RELEASE-p9 on TrueNAS 13.0-U6.1
bruno-at-bareos

bruno-at-bareos

2024-02-14 09:48

manager   ~0005774

Thank for the details, this may help to reproduce the case.

Are you able to reproduce it easily ( maybe creating a dumb pool with low allowed volumes, and run small job on it to recreate the situation ) ?
Int

Int

2024-02-15 09:30

reporter   ~0005775

I was able to reproduce it with another pool with low allowed volumes and a small job.
See attached job log.

This time at first nothing happened when I ran the "reload" command in bconsole. The job did not continue and was stuck at

2024-02-15 09:15:52 truenas-sd JobId 35413: Job migrate.2024-02-15_09.10.03_01 is waiting. Cannot find any appendable volumes.
Please use the "label" command to create a new Volume for:
Storage: "FileStorageTrueNAS0001" (/var/lib/bareos/storage)
Pool: Migrate
Media type: File

When I ran the "reload" command in bconsole a second time the job did continue and ran into the same issue.
jobid_35413.log.gz (1,489 bytes)
bruno-at-bareos

bruno-at-bareos

2024-02-15 09:39

manager   ~0005776

Thanks you. I will forward this to the dev's team.
bruno-at-bareos

bruno-at-bareos

2024-02-15 14:13

manager   ~0005777

Hi, I'm failing to reproduce yet the case. In the meantime

what is the output of
bconsole <<< "llist volume=Migrate-0950"

Would you mind to rerun the test job with the following set before

bconsole <<< "setdebug level=100 trace=1 timestamp=1 storage=FileStorageTrueNAS0001"

after the job you can remove the debug level

bconsole <<< "setdebug level=0 trace=0 timestamp=0 storage=FileStorageTrueNAS0001"

then check on the storage in /var/lib/bareos and join compressed the trace file there (can be removed afterwards)
Int

Int

2024-02-16 16:04

reporter   ~0005780

Hi, I collected the information you wanted.

I could not collect the output of
bconsole <<< "llist volume=Migrate-0950"
since I deleted that test volume already. But I collected the output for the newly failed volume "Migrate-0956" instead.

I had to change
bconsole <<< "setdebug level=100 trace=1 timestamp=1 storage=FileStorageTrueNAS0001"
to
bconsole <<< "setdebug level=100 trace=1 timestamp=1 storage=File"
because "FileStorageTrueNAS0001" is the name of the device not the name of the storage.
The storage daemon is using five devices to write jobs in parallel but the issue also happens when only one job is running.
I attached the config files of the storage daemon - maybe they help to reproduce the issue.
bareos-sd.conf (400 bytes)   
Storage {
  Name = truenas-sd
  Maximum Concurrent Jobs = 20

  # remove comment from "Plugin Directory" to load plugins from specified directory.
  # if "Plugin Names" is defined, only the specified plugins will be loaded,
  # otherwise all storage plugins (*-sd.so) from the "Plugin Directory".
  #
  # Plugin Directory = "/usr/local/lib/bareos/plugins"
  # Plugin Names = ""

  TLS Require = no
}
bareos-sd.conf (400 bytes)   
FileStorage.conf (445 bytes)   
Device {
  Count = 5

  Name = FileStorageTrueNAS
  Media Type = File
  Archive Device = /var/lib/bareos/storage
  LabelMedia = yes;                   # lets Bareos label unlabeled media
  Random Access = yes;
  AutomaticMount = yes;               # when device opened, read it
  RemovableMedia = no;
  AlwaysOpen = no;
  Description = "File device. A connecting Director must have the same Name and MediaType."
  Maximum Concurrent Jobs = 1
}

FileStorage.conf (445 bytes)   
jobid_35437.log.gz (1,950 bytes)
truenas-sd.trace.gz (6,235 bytes)
bruno-at-bareos

bruno-at-bareos

2024-02-20 10:04

manager   ~0005784

Thanks for the report, I will check if our dev's can detect something with this. I was not able to reproduce until now.
how is mounted /var/lib/storage/ ?
Int

Int

2024-02-20 15:46

reporter   ~0005786

I created a mount point for the bareos jail which mounts a path from TrueNAS data pool to /var/lib/bareos/storage/ inside the jail.
See screenshot attached.
JailMountPoint.png (31,827 bytes)   
JailMountPoint.png (31,827 bytes)   
Int

Int

2024-02-21 09:55

reporter   ~0005787

I just realized that I forgot to mention the bareos storage daemon installed in the bareos jail is "bareos-server-23.0.1_1.pkg" from the official FreeBSD repository
https://pkg.freebsd.org/FreeBSD:13:amd64/latest/All/bareos-server-23.0.1_1.pkg
bruno-at-bareos

bruno-at-bareos

2024-02-21 09:58

manager   ~0005788

Would you mind to test if this is the case with the official Bareos package available here
https://download.bareos.org/current/FreeBSD_13.2/

You can use our helper script to get the repo installed https://download.bareos.org/current/FreeBSD_13.2/add_bareos_repositories.sh
bruno-at-bareos

bruno-at-bareos

2024-02-21 10:43

manager   ~0005789

Ok seems we may are able to reproduce the case, it will only appear if the sd has emited the first "Please label message"
Int

Int

2024-02-21 11:14

reporter   ~0005790

Thanks for the feed back. I will skip the effort of switching to the Bareos repository for now.
Let me know if it will be necessary to test with the packages from https://download.bareos.org/current/FreeBSD_13.2/

Issue History

Date Modified Username Field Change
2024-02-09 10:37 Int New Issue
2024-02-12 13:32 bruno-at-bareos Note Added: 0005761
2024-02-12 14:02 Int Note Added: 0005762
2024-02-13 14:00 bruno-at-bareos Note Added: 0005770
2024-02-13 14:46 Int Note Added: 0005771
2024-02-13 14:46 Int File Added: jobid_35326.log.gz
2024-02-13 14:51 bruno-at-bareos Note Added: 0005772
2024-02-13 15:04 Int Note Added: 0005773
2024-02-14 09:48 bruno-at-bareos Note Added: 0005774
2024-02-15 09:30 Int Note Added: 0005775
2024-02-15 09:30 Int File Added: jobid_35413.log.gz
2024-02-15 09:39 bruno-at-bareos Note Added: 0005776
2024-02-15 14:13 bruno-at-bareos Note Added: 0005777
2024-02-16 16:04 Int Note Added: 0005780
2024-02-16 16:04 Int File Added: bareos-sd.conf
2024-02-16 16:04 Int File Added: FileStorage.conf
2024-02-16 16:04 Int File Added: jobid_35437.log.gz
2024-02-16 16:04 Int File Added: llist-Migrate-0956.log.gz
2024-02-16 16:04 Int File Added: truenas-sd.trace.gz
2024-02-20 10:04 bruno-at-bareos Note Added: 0005784
2024-02-20 15:46 Int Note Added: 0005786
2024-02-20 15:46 Int File Added: JailMountPoint.png
2024-02-21 09:55 Int Note Added: 0005787
2024-02-21 09:58 bruno-at-bareos Note Added: 0005788
2024-02-21 10:43 bruno-at-bareos Note Added: 0005789
2024-02-21 11:14 Int Note Added: 0005790