View Issue Details
| Field | Value |
|---|---|
| ID | 0001202 |
| Project | bareos-core |
| Category | storage daemon |
| View Status | public |
| Date Submitted | 2020-02-26 00:15 |
| Last Update | 2023-08-23 13:56 |
| Reporter | stephand |
| Assigned To | bruno-at-bareos |
| Priority | high |
| Severity | major |
| Reproducibility | always |
| Status | closed |
| Resolution | fixed |
| Product Version | 19.2.6 |
| Summary | 0001202: droplet backend (S3): Using MaximumConcurrentJobs > 1 causes restore errors although backup job terminated with "Backup OK" |
Description

When using the droplet S3 bareos-sd backend with MaximumConcurrentJobs > 1 and running concurrent jobs, the backups terminate with "Backup OK", but problems occur on restore. The issue can be mitigated by using multiple storage devices, each with MaximumConcurrentJobs = 1, as described in https://docs.bareos.org/TasksAndConcepts/StorageBackends.html#configuration or https://docs.bareos.org/Configuration/StorageDaemon.html#autochanger-resource.
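The mitigation described in the linked documentation boils down to defining several otherwise identical droplet devices, each limited to one concurrent job, and grouping them in an Autochanger resource in bareos-sd. The following is only a minimal sketch along those lines; the resource names S3_Dev2 and S3_Autochanger are illustrative and not taken from this report:

Device {
  Name = S3_Dev1
  Media Type = S3_T1
  Archive Device = S3 Object Storage
  Device Type = droplet
  Device Options = "profile=/etc/bareos/bareos-sd.d/device/droplet/droplet.profile,bucket=bareos-backup-test1-bucket1,chunksize=100M"
  Maximum Concurrent Jobs = 1        # one job per device
  Label Media = yes
  Random Access = yes
  Automatic Mount = yes
  Removable Media = no
  Always Open = no
}
# S3_Dev2 is identical to S3_Dev1 except for Name = S3_Dev2
Autochanger {
  Name = S3_Autochanger
  Device = S3_Dev1, S3_Dev2
  Changer Device = /dev/null         # no real changer hardware
  Changer Command = ""
}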
Steps To Reproduce

1. Configure as in the attached configuration files.
2. Run two concurrent jobs, for example: echo -e "run job=S3Data1 yes\nrun job=S3Data2 yes" | bconsole
3. Try to restore (see the bconsole sketch below).
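For step 3, a restore can be started from bconsole; the following invocation is only illustrative (the client names and the where path are taken from the error messages quoted below, and "select all done" marks every file of the chosen backup):

restore client=bareos1-fd where=/tmp/bareos-restores select all done yes
restore client=bareos2-fd where=/tmp/bareos-restores select all done yes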
Additional Information

The problematic part seems to happen when the first job terminates while the second is still writing:

25-Feb 17:42 bareos-sd JobId 238: Releasing device "S3_Dev1" (S3).
25-Feb 17:42 bareos-sd JobId 239: End of Volume "S3Full-0" at 1:4189926 on device "S3_Dev1" (S3). Write of 64512 bytes got -1.
25-Feb 17:42 bareos-sd JobId 239: End of medium on Volume "S3Full-0" Bytes=4,299,157,223 Blocks=66,642 at 25-Feb-2020 17:42.
25-Feb 17:42 bareos-dir JobId 239: Created new Volume "S3Full-1" in catalog.

A restore of the first job restores the files but never terminates; the bareos-sd debug messages repeat the following every 10 seconds until the process is terminated:

25-Feb-2020 22:22:08.359365 bareos-sd (100): backends/droplet_device.cc:949-250 get chunked_remote_volume_size(S3Full-0)
25-Feb-2020 22:22:08.363639 bareos-sd (100): backends/droplet_device.cc:240-250 chunk /S3Full-0/0000 exists. Calling callback.
25-Feb-2020 22:22:08.367575 bareos-sd (100): backends/droplet_device.cc:240-250 chunk /S3Full-0/0001 exists. Calling callback.
...
25-Feb-2020 22:22:08.568750 bareos-sd (100): backends/droplet_device.cc:240-250 chunk /S3Full-0/0040 exists. Calling callback.
25-Feb-2020 22:22:08.572301 bareos-sd (100): backends/droplet_device.cc:257-250 chunk /S3Full-0/0041 does not exists. Exiting.
25-Feb-2020 22:22:08.572366 bareos-sd (100): backends/droplet_device.cc:960-250 Size of volume /S3Full-0: 4246192871
25-Feb-2020 22:22:08.572402 bareos-sd (100): backends/chunked_device.cc:1256-250 volume: S3Full-0, chunked_remote_volume_size = 4246192871, VolCatInfo.VolCatBytes = 4299157223
25-Feb-2020 22:22:08.572436 bareos-sd (100): backends/chunked_device.cc:1262-250 volume S3Full-0 is pending, as 'remote volume size' = 4246192871 < 'catalog volume size' = 4299157223

Restoring the second job fails with:

25-Feb 22:28 bareos2-fd JobId 253: Error: findlib/attribs.cc:441 File size of restored file /tmp/bareos-restores/data/2/testfile4.txt not correct. Original 545837338, restored 492912238.

When larger chunk sizes are configured, e.g. chunksize=1000M, the following error also appears before the "File size not correct" error:

21-Feb 17:36 bareos-sd JobId 13: Error: stored/block.cc:1127 Volume data error at 0:3145669842! Short block of 58157 bytes on device "S3_Wasabi1" (S3) discarded.
21-Feb 17:36 bareos-sd JobId 13: Error: stored/read_record.cc:256 stored/block.cc:1127 Volume data error at 0:3145669842! Short block of 58157 bytes on device "S3_Wasabi1" (S3) discarded.
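The "pending" state in the debug output comes from comparing the summed size of the chunk objects in the bucket (.../0000, .../0001, ...) with the catalog's VolCatInfo.VolCatBytes. The same comparison can be reproduced outside Bareos with a small script; the sketch below is not Bareos code, it assumes boto3 with S3 credentials available in the environment, and it reuses the bucket name, volume name and catalog size from this report:

#!/usr/bin/python3
# Sum the sizes of the chunk objects of one droplet volume and compare the
# total with the volume size recorded in the catalog.
import boto3

BUCKET = "bareos-backup-test1-bucket1"   # bucket= from Device Options
VOLUME = "S3Full-0"                      # volume name from the job log
CATALOG_BYTES = 4299157223               # VolCatInfo.VolCatBytes from the debug log

s3 = boto3.client("s3")
paginator = s3.get_paginator("list_objects_v2")

remote_bytes = 0
for page in paginator.paginate(Bucket=BUCKET, Prefix=VOLUME + "/"):
    for obj in page.get("Contents", []):
        remote_bytes += obj["Size"]      # each object is one chunk (0000, 0001, ...)

print("remote volume size :", remote_bytes)
print("catalog volume size:", CATALOG_BYTES)
if remote_bytes < CATALOG_BYTES:
    # This is the condition that makes the storage daemon report the volume as pending.
    print("remote size < catalog size: one or more chunks are missing or short")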
Tags: No tags attached.
S3_Dev1.conf (1,516 bytes)
Device {
Name = S3_Dev1
Media Type = S3_T1
Archive Device = S3 Object Storage
#
# Device Options:
# profile= - Droplet profile path, e.g. /etc/bareos/bareos-sd.d/device/droplet/droplet.profile
# acl= - Canned ACL
# storageclass= - Storage Class to use.
# bucket= - Bucket to store objects in.
# chunksize= - Size of Volume Chunks (default = 10 Mb)
# iothreads= - Number of IO-threads to use for upload (use blocking uploads if not defined)
# ioslots= - Number of IO-slots per IO-thread (0-255, default 10)
# retries= - Number of retries if a write fails (0-255, default = 0, which means unlimited retries)
# mmap= - Use mmap to allocate Chunk memory instead of malloc().
# location= - Deprecated. If required (AWS only), it has to be set in the Droplet profile.
#
# testing:
#Device Options = "profile=/etc/bareos/bareos-sd.d/droplet/droplet.profile,bucket=bareos-backup-test1-bucket1,chunksize=100M,iothreads=0,retries=1"
# performance:
Device Options = "profile=/etc/bareos/bareos-sd.d/device/droplet/droplet.profile,bucket=bareos-backup-test1-bucket1,chunksize=100M"
Device Type = droplet
Label Media = yes # lets Bareos label unlabeled media
Random Access = yes
Automatic Mount = yes # when device opened, read it
Removable Media = no
Always Open = no
Description = "S3 device on bareos-backup-test1-bucket1"
Maximum Concurrent Jobs = 2
}
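The Device Options above reference a droplet profile that is not attached to this report. For orientation, such a profile typically looks roughly like the following; every value here is a placeholder and the exact set of keys depends on the S3 endpoint in use:

host = s3.amazonaws.com
use_https = true
access_key = myaccesskey
secret_key = mysecretkey
backend = s3
aws_auth_sign_version = 4
aws_region = us-east-1
pricing_dir = ""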
S3_Storage1.conf (253 bytes)
Storage {
Name = S3_Storage1
Address = bareostest.example.com # N.B. Use a fully qualified name here (do not use "localhost" here).
Password = "rrandomsecret"
Device = S3_Dev1
Media Type = S3_T1
Maximum Concurrent Jobs = 2
}
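If the autochanger mitigation sketched after the Description is used, the Director's Storage resource points at the Autochanger resource instead of an individual device. A hedged sketch continuing the illustrative names from that sketch:

Storage {
  Name = S3_Storage1
  Address = bareostest.example.com
  Password = "rrandomsecret"
  Device = S3_Autochanger       # name of the Autochanger resource in bareos-sd
  Media Type = S3_T1
  Auto Changer = yes
  Maximum Concurrent Jobs = 2   # concurrent jobs are spread over the devices behind the autochanger
}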
S3Job.conf (282 bytes)
JobDefs {
Name = "S3Job"
Type = Backup
Level = Full
Client = bareos-fd
FileSet = "SelfTest"
Schedule = "WeeklyCycle"
Storage = S3_Storage1
Messages = Standard
Pool = S3Full
Priority = 10
Write Bootstrap = "/var/lib/bareos/%c.bsr"
Full Backup Pool = S3Full
}
S3testjobs.conf (189 bytes)
Job {
Name = "S3Data1"
JobDefs = "S3Job"
FileSet = "Data1Set"
Client = "bareos1-fd"
}
Job {
Name = "S3Data2"
JobDefs = "S3Job"
FileSet = "Data2Set"
Client = "bareos2-fd"
}
testfilesets.conf (221 bytes)
FileSet {
  Name = "Data1Set"
  Include {
    Options {
      signature = SHA1
    }
    File = /data/1
  }
}
FileSet {
  Name = "Data2Set"
  Include {
    Options {
      signature = SHA1
    }
    File = /data/2
  }
}
gen_testfile.py (228 bytes)
#!/usr/bin/python
# Write roughly 512 MiB of numbered lines of ASCII letters to stdout.
import string

max_size = 1024 * 1024 * 512
line_number = 0
size = 0
while size < max_size:
    line = str(line_number) + ":" + string.ascii_letters
    print(line)
    size += len(line)
    line_number += 1
Note (stephand): To reproduce with the attached config, create test files like this:

mkdir -p /data/1
mkdir -p /data/2
./gen_testfile.py > /data/1/testfile1.txt
./gen_testfile.py > /data/1/testfile2.txt
./gen_testfile.py > /data/2/testfile1.txt
./gen_testfile.py > /data/2/testfile2.txt
Note (bruno-at-bareos): MaximumConcurrentJobs = 1 is recommended for all disk-based jobs; it will be enforced to 1 in Bareos 23.
| Date Modified | Username | Field | Change |
|---|---|---|---|
| 2020-02-26 00:15 | stephand | New Issue | |
| 2020-02-26 00:24 | stephand | File Added: S3_Dev1.conf | |
| 2020-02-26 00:25 | stephand | File Added: S3_Storage1.conf | |
| 2020-02-26 00:26 | stephand | File Added: S3Job.conf | |
| 2020-02-26 00:26 | stephand | File Added: S3testjobs.conf | |
| 2020-02-26 00:27 | stephand | File Added: testfilesets.conf | |
| 2020-02-26 00:27 | stephand | File Added: gen_testfile.py | |
| 2020-02-26 00:29 | stephand | Note Added: 0003858 | |
| 2023-08-23 13:56 | bruno-at-bareos | Assigned To | => bruno-at-bareos |
| 2023-08-23 13:56 | bruno-at-bareos | Status | new => closed |
| 2023-08-23 13:56 | bruno-at-bareos | Resolution | open => fixed |
| 2023-08-23 13:56 | bruno-at-bareos | Note Added: 0005348 |