View Issue Details

ID: 0001202
Project: bareos-core
Category: storage daemon
View Status: public
Last Update: 2023-08-23 13:56
Reporter: stephand
Assigned To: bruno-at-bareos
Priority: high
Severity: major
Reproducibility: always
Status: closed
Resolution: fixed
Product Version: 19.2.6
Summary: 0001202: droplet backend (S3): Using MaximumConcurrentJobs > 1 causes restore errors although backup job terminated with "Backup OK"
Description: When using the droplet S3 bareos-sd backend with MaximumConcurrentJobs > 1 and running concurrent jobs, the backup terminates with "Backup OK", but problems occur on restore.

This issue can be mitigated by using multiple storage devices, each with MaximumConcurrentJobs = 1, as described in
https://docs.bareos.org/TasksAndConcepts/StorageBackends.html#configuration or
https://docs.bareos.org/Configuration/StorageDaemon.html#autochanger-resource
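The documented workaround can be sketched like this (a hypothetical fragment based on the linked documentation; the second device and the resource names are illustrative, not taken from this report): several identical droplet devices, each limited to one concurrent job, grouped in an Autochanger resource so the director can still run jobs in parallel.

```
Device {
  Name = S3_Dev1
  Media Type = S3_T1
  Device Type = droplet
  Device Options = "profile=/etc/bareos/bareos-sd.d/device/droplet/droplet.profile,bucket=bareos-backup-test1-bucket1,chunksize=100M"
  Archive Device = S3 Object Storage
  Maximum Concurrent Jobs = 1          # one job per device
  Label Media = yes
  Random Access = yes
  Automatic Mount = yes
  Removable Media = no
}

Device {
  Name = S3_Dev2                       # identical to S3_Dev1 except for the name
  Media Type = S3_T1
  Device Type = droplet
  Device Options = "profile=/etc/bareos/bareos-sd.d/device/droplet/droplet.profile,bucket=bareos-backup-test1-bucket1,chunksize=100M"
  Archive Device = S3 Object Storage
  Maximum Concurrent Jobs = 1
  Label Media = yes
  Random Access = yes
  Automatic Mount = yes
  Removable Media = no
}

Autochanger {
  Name = S3_Autochanger
  Device = S3_Dev1, S3_Dev2
  Changer Device = /dev/null
  Changer Command = ""
}
```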
Steps To Reproduce:
1. Configure as in the attached config
2. Run two concurrent jobs, can be done like this:
    echo -e "run job=S3Data1 yes\nrun job=S3Data2 yes" |bconsole
3. Try to restore
Additional Information: The problematic part seems to happen when the first job terminates while the second is still writing:

25-Feb 17:42 bareos-sd JobId 238: Releasing device "S3_Dev1" (S3).
25-Feb 17:42 bareos-sd JobId 239: End of Volume "S3Full-0" at 1:4189926 on device "S3_Dev1" (S3). Write of 64512 bytes got -1.
25-Feb 17:42 bareos-sd JobId 239: End of medium on Volume "S3Full-0" Bytes=4,299,157,223 Blocks=66,642 at 25-Feb-2020 17:42.
25-Feb 17:42 bareos-dir JobId 239: Created new Volume "S3Full-1" in catalog.

Restore of the first job restores the files but never terminates; the bareos-sd debug messages repeat the following every 10 s until the process is terminated:
25-Feb-2020 22:22:08.359365 bareos-sd (100): backends/droplet_device.cc:949-250 get chunked_remote_volume_size(S3Full-0)
25-Feb-2020 22:22:08.363639 bareos-sd (100): backends/droplet_device.cc:240-250 chunk /S3Full-0/0000 exists. Calling callback.
25-Feb-2020 22:22:08.367575 bareos-sd (100): backends/droplet_device.cc:240-250 chunk /S3Full-0/0001 exists. Calling callback.
...
25-Feb-2020 22:22:08.568750 bareos-sd (100): backends/droplet_device.cc:240-250 chunk /S3Full-0/0040 exists. Calling callback.
25-Feb-2020 22:22:08.572301 bareos-sd (100): backends/droplet_device.cc:257-250 chunk /S3Full-0/0041 does not exists. Exiting.
25-Feb-2020 22:22:08.572366 bareos-sd (100): backends/droplet_device.cc:960-250 Size of volume /S3Full-0: 4246192871
25-Feb-2020 22:22:08.572402 bareos-sd (100): backends/chunked_device.cc:1256-250 volume: S3Full-0, chunked_remote_volume_size = 4246192871, VolCatInfo.VolCatBytes = 4299157223
25-Feb-2020 22:22:08.572436 bareos-sd (100): backends/chunked_device.cc:1262-250 volume S3Full-0 is pending, as 'remote volume size' = 4246192871 < 'catalog volume size' = 4299157223
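The check driving this retry loop can be sketched in Python. This is a simplified model of the behaviour visible in the debug log above, not the actual droplet_device/chunked_device code: the SD sums the sizes of consecutive chunks (/S3Full-0/0000, /0001, ...) until one is missing, then compares the sum with the catalog's VolCatBytes; while remote < catalog, the volume stays "pending" and the restore never finishes.

```python
def chunked_remote_volume_size(chunks: dict) -> int:
    """Sum chunk sizes in index order until the first missing chunk.

    `chunks` maps chunk index -> chunk size in bytes (a stand-in for
    what is stored in the S3 bucket).
    """
    total = 0
    index = 0
    while index in chunks:      # "chunk /S3Full-0/%04d exists. Calling callback."
        total += chunks[index]
        index += 1
    return total                # "chunk %04d does not exists. Exiting."


def volume_is_pending(chunks: dict, catalog_bytes: int) -> bool:
    """Volume stays 'pending' while the remote size is below the catalog size."""
    return chunked_remote_volume_size(chunks) < catalog_bytes


# Numbers from the log: the bucket holds 4,246,192,871 bytes of chunks,
# but the catalog recorded VolCatBytes = 4,299,157,223 -- so the volume
# is forever "pending" and the size check is retried every 10 s.
remote_size = 4246192871
catalog_size = 4299157223
print(remote_size < catalog_size)   # the condition that makes the restore hang
```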

Restoring the second job fails with

25-Feb 22:28 bareos2-fd JobId 253: Error: findlib/attribs.cc:441 File size of restored file /tmp/bareos-restores/data/2/testfile4.txt not correct. Original 545837338, restored 492912238.

When larger chunk sizes are configured, e.g. chunksize=1000M, the following error message also appears before the "File size not correct" error:

21-Feb 17:36 bareos-sd JobId 13: Error: stored/block.cc:1127 Volume data error at 0:3145669842! Short block of 58157 bytes on device "S3_Wasabi1" (S3) discarded.
21-Feb 17:36 bareos-sd JobId 13: Error: stored/read_record.cc:256 stored/block.cc:1127 Volume data error at 0:3145669842! Short block of 58157 bytes on device "S3_Wasabi1" (S3) discarded.
Tags: No tags attached.

Activities

stephand
2020-02-26 00:24
developer
S3_Dev1.conf (1,516 bytes)   
Device {
  Name = S3_Dev1
  Media Type = S3_T1
  Archive Device = S3 Object Storage

  #
  # Device Options:
  #    profile=      - Droplet profile path, e.g. /etc/bareos/bareos-sd.d/device/droplet/droplet.profile
  #    acl=          - Canned ACL
  #    storageclass= - Storage Class to use.
  #    bucket=       - Bucket to store objects in.
  #    chunksize=    - Size of Volume Chunks (default = 10 Mb)
  #    iothreads=    - Number of IO-threads to use for upload (use blocking uploads if not defined)
  #    ioslots=      - Number of IO-slots per IO-thread (0-255, default 10)
  #    retries=      - Number of retries if a write fails (0-255, default = 0, which means unlimited retries)
  #    mmap=         - Use mmap to allocate Chunk memory instead of malloc().
  #    location=     - Deprecated. If required (AWS only), it has to be set in the Droplet profile.
  #

  # testing:
  #Device Options = "profile=/etc/bareos/bareos-sd.d/droplet/droplet.profile,bucket=bareos-backup-test1-bucket1,chunksize=100M,iothreads=0,retries=1"

  # performance:
  Device Options = "profile=/etc/bareos/bareos-sd.d/device/droplet/droplet.profile,bucket=bareos-backup-test1-bucket1,chunksize=100M"

  Device Type = droplet
  Label Media = yes                    # lets Bareos label unlabeled media
  Random Access = yes
  Automatic Mount = yes                # when device opened, read it
  Removable Media = no
  Always Open = no
  Description = "S3 device on bareos-backup-test1-bucket1"
  Maximum Concurrent Jobs = 2
}

stephand
2020-02-26 00:25
developer

S3_Storage1.conf (253 bytes)   
Storage {
  Name = S3_Storage1
  Address = bareostest.example.com                # N.B. Use a fully qualified name here (do not use "localhost" here).
  Password = "rrandomsecret"
  Device = S3_Dev1
  Media Type = S3_T1
  Maximum Concurrent Jobs = 2
}

stephand
2020-02-26 00:26
developer

S3Job.conf (282 bytes)   
JobDefs {
  Name = "S3Job"
  Type = Backup
  Level = Full
  Client = bareos-fd
  FileSet = "SelfTest"
  Schedule = "WeeklyCycle"
  Storage = S3_Storage1
  Messages = Standard
  Pool = S3Full
  Priority = 10
  Write Bootstrap = "/var/lib/bareos/%c.bsr"
  Full Backup Pool = S3Full
}
stephand
2020-02-26 00:26
developer

S3testjobs.conf (189 bytes)   
Job {
  Name = "S3Data1"
  JobDefs = "S3Job"
  FileSet = "Data1Set"
  Client = "bareos1-fd"
}

Job {
  Name = "S3Data2"
  JobDefs = "S3Job"
  FileSet = "Data2Set"
  Client = "bareos2-fd"
}
stephand
2020-02-26 00:27
developer

testfilesets.conf (221 bytes)   
FileSet {
  Name = "Data1Set"
  Include {
    Options {
      signature = SHA1
    }
    File = /data/1
  }
}

FileSet {
  Name = "Data2Set"
  Include {
    Options {
      signature = SHA1
    }
    File = /data/2
  }
}
stephand
2020-02-26 00:27
developer

gen_testfile.py (228 bytes)   
#!/usr/bin/python
import string

# Write roughly 512 MiB of numbered ASCII lines to stdout,
# one line per iteration: "<line number>:<a..zA..Z>".
max_size = 1024 * 1024 * 512
line_number = 0
size = 0

while size < max_size:
    line = str(line_number) + ":" + string.ascii_letters
    print(line)
    size += len(line)
    line_number += 1

stephand
2020-02-26 00:29
developer ~0003858

To reproduce with the attached config, create test files like this:
mkdir -p /data/1
mkdir -p /data/2
./gen_testfile.py > /data/1/testfile1.txt
./gen_testfile.py > /data/1/testfile2.txt
./gen_testfile.py > /data/2/testfile1.txt
./gen_testfile.py > /data/2/testfile2.txt
bruno-at-bareos
2023-08-23 13:56
manager ~0005348

MaximumConcurrentJobs = 1 is recommended for all disk-based jobs; it will be enforced to 1 in Bareos 23.

Issue History

Date Modified Username Field Change
2020-02-26 00:15 stephand New Issue
2020-02-26 00:24 stephand File Added: S3_Dev1.conf
2020-02-26 00:25 stephand File Added: S3_Storage1.conf
2020-02-26 00:26 stephand File Added: S3Job.conf
2020-02-26 00:26 stephand File Added: S3testjobs.conf
2020-02-26 00:27 stephand File Added: testfilesets.conf
2020-02-26 00:27 stephand File Added: gen_testfile.py
2020-02-26 00:29 stephand Note Added: 0003858
2023-08-23 13:56 bruno-at-bareos Assigned To => bruno-at-bareos
2023-08-23 13:56 bruno-at-bareos Status new => closed
2023-08-23 13:56 bruno-at-bareos Resolution open => fixed
2023-08-23 13:56 bruno-at-bareos Note Added: 0005348