View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0001202 | bareos-core | storage daemon | public | 2020-02-26 00:15 | 2023-08-23 13:56 |
Reporter | stephand | Assigned To | bruno-at-bareos
Priority | high | Severity | major | Reproducibility | always
Status | closed | Resolution | fixed
Product Version | 19.2.6
Summary | 0001202: droplet backend (S3): Using MaximumConcurrentJobs > 1 causes restore errors although backup job terminated with "Backup OK"
Description

When using the droplet S3 bareos-sd backend with MaximumConcurrentJobs > 1 and running concurrent jobs, the backup terminates with "Backup OK", but problems occur on restore.

This issue can be mitigated by using multiple storage devices, each with MaximumConcurrentJobs = 1, as described in https://docs.bareos.org/TasksAndConcepts/StorageBackends.html#configuration or https://docs.bareos.org/Configuration/StorageDaemon.html#autochanger-resource
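The linked documentation describes grouping several droplet devices, each limited to a single concurrent job, under one Autochanger resource. The following is a minimal, untested sketch of that layout based on the attached S3_Dev1.conf; the second device S3_Dev2 and the changer name S3_Changer are illustrative names that do not appear in the attached configuration.

# Sketch only: two identical droplet devices, each restricted to one job,
# grouped by a virtual Autochanger resource (no physical changer involved).
Device {
  Name = S3_Dev1
  Media Type = S3_T1
  Archive Device = S3 Object Storage
  Device Type = droplet
  Device Options = "profile=/etc/bareos/bareos-sd.d/device/droplet/droplet.profile,bucket=bareos-backup-test1-bucket1,chunksize=100M"
  Label Media = yes
  Random Access = yes
  Automatic Mount = yes
  Removable Media = no
  Always Open = no
  Maximum Concurrent Jobs = 1     # one job per device
}

Device {
  Name = S3_Dev2                  # hypothetical second device, identical apart from the name
  Media Type = S3_T1
  Archive Device = S3 Object Storage
  Device Type = droplet
  Device Options = "profile=/etc/bareos/bareos-sd.d/device/droplet/droplet.profile,bucket=bareos-backup-test1-bucket1,chunksize=100M"
  Label Media = yes
  Random Access = yes
  Automatic Mount = yes
  Removable Media = no
  Always Open = no
  Maximum Concurrent Jobs = 1
}

Autochanger {
  Name = S3_Changer
  Device = S3_Dev1, S3_Dev2
  Changer Device = /dev/null      # virtual changer, no hardware
  Changer Command = ""
}

With such a layout, the Director-side Storage resource (S3_Storage1.conf in the attachments) would reference S3_Changer as its Device, with its autochanger flag enabled, so that two concurrent jobs each get their own device and never interleave on one volume.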
Steps To Reproduce

1. Configure as in the attached config.
2. Run two concurrent jobs, which can be done like this:
   echo -e "run job=S3Data1 yes\nrun job=S3Data2 yes" | bconsole
3. Try to restore.
Additional Information

The problematic part seems to happen when the first job terminates while the second is still writing:

25-Feb 17:42 bareos-sd JobId 238: Releasing device "S3_Dev1" (S3).
25-Feb 17:42 bareos-sd JobId 239: End of Volume "S3Full-0" at 1:4189926 on device "S3_Dev1" (S3). Write of 64512 bytes got -1.
25-Feb 17:42 bareos-sd JobId 239: End of medium on Volume "S3Full-0" Bytes=4,299,157,223 Blocks=66,642 at 25-Feb-2020 17:42.
25-Feb 17:42 bareos-dir JobId 239: Created new Volume "S3Full-1" in catalog.

Restore of the first job restores the files but never terminates; the bareos-sd debug messages repeat the following every 10 seconds until the process is terminated:

25-Feb-2020 22:22:08.359365 bareos-sd (100): backends/droplet_device.cc:949-250 get chunked_remote_volume_size(S3Full-0)
25-Feb-2020 22:22:08.363639 bareos-sd (100): backends/droplet_device.cc:240-250 chunk /S3Full-0/0000 exists. Calling callback.
25-Feb-2020 22:22:08.367575 bareos-sd (100): backends/droplet_device.cc:240-250 chunk /S3Full-0/0001 exists. Calling callback.
...
25-Feb-2020 22:22:08.568750 bareos-sd (100): backends/droplet_device.cc:240-250 chunk /S3Full-0/0040 exists. Calling callback.
25-Feb-2020 22:22:08.572301 bareos-sd (100): backends/droplet_device.cc:257-250 chunk /S3Full-0/0041 does not exists. Exiting.
25-Feb-2020 22:22:08.572366 bareos-sd (100): backends/droplet_device.cc:960-250 Size of volume /S3Full-0: 4246192871
25-Feb-2020 22:22:08.572402 bareos-sd (100): backends/chunked_device.cc:1256-250 volume: S3Full-0, chunked_remote_volume_size = 4246192871, VolCatInfo.VolCatBytes = 4299157223
25-Feb-2020 22:22:08.572436 bareos-sd (100): backends/chunked_device.cc:1262-250 volume S3Full-0 is pending, as 'remote volume size' = 4246192871 < 'catalog volume size' = 4299157223

Restoring the second job fails with:

25-Feb 22:28 bareos2-fd JobId 253: Error: findlib/attribs.cc:441 File size of restored file /tmp/bareos-restores/data/2/testfile4.txt not correct. Original 545837338, restored 492912238.

When larger chunk sizes are configured, e.g. chunksize=1000M, the following error messages also appear before the "File size ... not correct" error:

21-Feb 17:36 bareos-sd JobId 13: Error: stored/block.cc:1127 Volume data error at 0:3145669842! Short block of 58157 bytes on device "S3_Wasabi1" (S3) discarded.
21-Feb 17:36 bareos-sd JobId 13: Error: stored/read_record.cc:256 stored/block.cc:1127 Volume data error at 0:3145669842! Short block of 58157 bytes on device "S3_Wasabi1" (S3) discarded.
Tags | No tags attached. | ||||
S3_Dev1.conf (1,516 bytes)
Device {
  Name = S3_Dev1
  Media Type = S3_T1
  Archive Device = S3 Object Storage
  #
  # Device Options:
  #   profile=      - Droplet profile path, e.g. /etc/bareos/bareos-sd.d/device/droplet/droplet.profile
  #   acl=          - Canned ACL
  #   storageclass= - Storage Class to use.
  #   bucket=       - Bucket to store objects in.
  #   chunksize=    - Size of Volume Chunks (default = 10 Mb)
  #   iothreads=    - Number of IO-threads to use for upload (use blocking uploads if not defined)
  #   ioslots=      - Number of IO-slots per IO-thread (0-255, default 10)
  #   retries=      - Number of retires if a write fails (0-255, default = 0, which means unlimited retries)
  #   mmap=         - Use mmap to allocate Chunk memory instead of malloc().
  #   location=     - Deprecated. If required (AWS only), it has to be set in the Droplet profile.
  #
  # testing:
  #Device Options = "profile=/etc/bareos/bareos-sd.d/droplet/droplet.profile,bucket=bareos-backup-test1-bucket1,chunksize=100M,iothreads=0,retries=1"
  # performance:
  Device Options = "profile=/etc/bareos/bareos-sd.d/device/droplet/droplet.profile,bucket=bareos-backup-test1-bucket1,chunksize=100M"
  Device Type = droplet
  Label Media = yes               # lets Bareos label unlabeled media
  Random Access = yes
  Automatic Mount = yes           # when device opened, read it
  Removable Media = no
  Always Open = no
  Description = "S3 device on bareos-backup-test1-bucket1"
  Maximum Concurrent Jobs = 2
}
S3_Storage1.conf (253 bytes)
Storage {
  Name = S3_Storage1
  Address = bareostest.example.com   # N.B. Use a fully qualified name here (do not use "localhost" here).
  Password = "rrandomsecret"
  Device = S3_Dev1
  Media Type = S3_T1
  Maximum Concurrent Jobs = 2
}
S3Job.conf (282 bytes)
JobDefs {
  Name = "S3Job"
  Type = Backup
  Level = Full
  Client = bareos-fd
  FileSet = "SelfTest"
  Schedule = "WeeklyCycle"
  Storage = S3_Storage1
  Messages = Standard
  Pool = S3Full
  Priority = 10
  Write Bootstrap = "/var/lib/bareos/%c.bsr"
  Full Backup Pool = S3Full
}
S3testjobs.conf (189 bytes)
Job { Name = "S3Data1" JobDefs = "S3Job" FileSet = "Data1Set" Client = "bareos1-fd" } Job { Name = "S3Data2" JobDefs = "S3Job" FileSet = "Data2Set" Client = "bareos2-fd" } |
|
testfilesets.conf (221 bytes)
FileSet {
  Name = "Data1Set"
  Include {
    Options {
      signature = SHA1
    }
    File = /data/1
  }
}

FileSet {
  Name = "Data2Set"
  Include {
    Options {
      signature = SHA1
    }
    File = /data/2
  }
}
gen_testfile.py (228 bytes)
#!/usr/bin/python
# Writes roughly 512 MiB of numbered lines to stdout; redirect into a test file.
import string

max_size = 1024 * 1024 * 512
line_number = 0
size = 0
while size < max_size:
    line = str(line_number) + ":" + string.ascii_letters
    print(line)
    size += len(line)
    line_number += 1
Note 0003858 (stephand):

To reproduce with the attached config, create test files like this:

mkdir -p /data/1
mkdir -p /data/2
./gen_testfile.py > /data/1/testfile1.txt
./gen_testfile.py > /data/1/testfile2.txt
./gen_testfile.py > /data/2/testfile1.txt
./gen_testfile.py > /data/2/testfile2.txt
Note 0005348 (bruno-at-bareos):

MaximumConcurrentJobs = 1 is recommended for all disk-based jobs; it will be enforced to 1 in Bareos 23.
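Applied to the attached reproduction setup, this recommendation amounts to restricting the droplet device to a single concurrent job. A minimal sketch of the changed S3_Dev1.conf follows (all other directives as in the attachment); to keep two jobs running in parallel, further devices would be added and grouped as sketched under Description above.

Device {
  Name = S3_Dev1
  Media Type = S3_T1
  Archive Device = S3 Object Storage
  Device Type = droplet
  Device Options = "profile=/etc/bareos/bareos-sd.d/device/droplet/droplet.profile,bucket=bareos-backup-test1-bucket1,chunksize=100M"
  Label Media = yes
  Random Access = yes
  Automatic Mount = yes
  Removable Media = no
  Always Open = no
  Maximum Concurrent Jobs = 1   # was 2 in the attached config; 1 prevents two jobs interleaving on one volume
}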
Date Modified | Username | Field | Change |
---|---|---|---|
2020-02-26 00:15 | stephand | New Issue | |
2020-02-26 00:24 | stephand | File Added: S3_Dev1.conf | |
2020-02-26 00:25 | stephand | File Added: S3_Storage1.conf | |
2020-02-26 00:26 | stephand | File Added: S3Job.conf | |
2020-02-26 00:26 | stephand | File Added: S3testjobs.conf | |
2020-02-26 00:27 | stephand | File Added: testfilesets.conf | |
2020-02-26 00:27 | stephand | File Added: gen_testfile.py | |
2020-02-26 00:29 | stephand | Note Added: 0003858 | |
2023-08-23 13:56 | bruno-at-bareos | Assigned To | => bruno-at-bareos |
2023-08-23 13:56 | bruno-at-bareos | Status | new => closed |
2023-08-23 13:56 | bruno-at-bareos | Resolution | open => fixed |
2023-08-23 13:56 | bruno-at-bareos | Note Added: 0005348 |