View Issue Details

ID: 0001202
Project: bareos-core
Category: [All Projects] storage daemon
View Status: public
Last Update: 2020-02-26 00:29
Reporter: stephand
Assigned To:
Priority: high
Severity: major
Reproducibility: always
Status: new
Resolution: open
Product Version: 19.2.6
Fixed in Version:
Summary: 0001202: droplet backend (S3): Using MaximumConcurrentJobs > 1 causes restore errors although backup job terminated with "Backup OK"
Description: When using the droplet S3 bareos-sd backend with MaximumConcurrentJobs > 1 and running concurrent jobs, the backup terminates with "Backup OK", but problems occur on restore.

This issue can be mitigated by using multiple storage devices, each with MaximumConcurrentJobs = 1, as described in
https://docs.bareos.org/TasksAndConcepts/StorageBackends.html#configuration or
https://docs.bareos.org/Configuration/StorageDaemon.html#autochanger-resource
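
A minimal sketch of that mitigation for the bareos-sd configuration (device names, profile path, bucket and chunksize are assumptions, not taken from the attached configs; see the linked documentation for the authoritative example): several droplet devices with one job each, grouped behind a virtual autochanger.

    Device {
      Name = "S3_Dev1"
      Media Type = "S3_Object1"
      Archive Device = "S3"
      Device Type = droplet
      Device Options = "profile=/etc/bareos/bareos-sd.d/device/droplet/droplet.profile,bucket=bareos-bucket,chunksize=100M"
      Label Media = yes
      Random Access = yes
      Automatic Mount = yes
      Removable Media = no
      Maximum Concurrent Jobs = 1   # one job per device
    }

    Device {
      Name = "S3_Dev2"
      # same options as S3_Dev1, also with Maximum Concurrent Jobs = 1
    }

    Autochanger {
      Name = "S3_Autochanger"
      Device = S3_Dev1, S3_Dev2
      Changer Device = /dev/null
      Changer Command = ""
    }
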
Steps To Reproduce:
1. Configure as in the attached config (a minimal sketch of such a Device resource follows after this list)
2. Run two concurrent jobs, can be done like this:
    echo -e "run job=S3Data1 yes\nrun job=S3Data2 yes" |bconsole
3. Try to restore
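
The sketch mentioned in step 1 (the attached S3_Dev1.conf is not reproduced here; options are hypothetical, as in the mitigation sketch above): a single droplet device with more than one concurrent job is the relevant part.

    Device {
      Name = "S3_Dev1"
      Media Type = "S3_Object1"
      Archive Device = "S3"
      Device Type = droplet
      Device Options = "profile=/etc/bareos/bareos-sd.d/device/droplet/droplet.profile,bucket=bareos-bucket,chunksize=100M"
      Maximum Concurrent Jobs = 2   # > 1: two jobs can write to this device at once
    }
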
Additional Information: The problematic part seems to happen when the first job terminates while the second is still writing:

25-Feb 17:42 bareos-sd JobId 238: Releasing device "S3_Dev1" (S3).
25-Feb 17:42 bareos-sd JobId 239: End of Volume "S3Full-0" at 1:4189926 on device "S3_Dev1" (S3). Write of 64512 bytes got -1.
25-Feb 17:42 bareos-sd JobId 239: End of medium on Volume "S3Full-0" Bytes=4,299,157,223 Blocks=66,642 at 25-Feb-2020 17:42.
25-Feb 17:42 bareos-dir JobId 239: Created new Volume "S3Full-1" in catalog.

Restoring the first job restores the files but never terminates; the bareos-sd debug log repeats the following messages every 10 seconds until the process is terminated:
25-Feb-2020 22:22:08.359365 bareos-sd (100): backends/droplet_device.cc:949-250 get chunked_remote_volume_size(S3Full-0)
25-Feb-2020 22:22:08.363639 bareos-sd (100): backends/droplet_device.cc:240-250 chunk /S3Full-0/0000 exists. Calling callback.
25-Feb-2020 22:22:08.367575 bareos-sd (100): backends/droplet_device.cc:240-250 chunk /S3Full-0/0001 exists. Calling callback.
...
25-Feb-2020 22:22:08.568750 bareos-sd (100): backends/droplet_device.cc:240-250 chunk /S3Full-0/0040 exists. Calling callback.
25-Feb-2020 22:22:08.572301 bareos-sd (100): backends/droplet_device.cc:257-250 chunk /S3Full-0/0041 does not exists. Exiting.
25-Feb-2020 22:22:08.572366 bareos-sd (100): backends/droplet_device.cc:960-250 Size of volume /S3Full-0: 4246192871
25-Feb-2020 22:22:08.572402 bareos-sd (100): backends/chunked_device.cc:1256-250 volume: S3Full-0, chunked_remote_volume_size = 4246192871, VolCatInfo.VolCatBytes = 4299157223
25-Feb-2020 22:22:08.572436 bareos-sd (100): backends/chunked_device.cc:1262-250 volume S3Full-0 is pending, as 'remote volume size' = 4246192871 < 'catalog volume size' = 4299157223
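
In other words, the restore never completes because the volume stays pending: summing the chunks found on S3 stops at the first missing chunk, so the remote size ends up smaller than the byte count recorded in the catalog. A rough Python illustration of the check the debug log describes (not the actual backends/chunked_device.cc code; chunk sizes below are made up):

    def chunked_remote_volume_size(chunk_sizes):
        # Sum chunk sizes in order (/S3Full-0/0000, 0001, ...) and stop at the
        # first chunk that does not exist, as the debug output above shows.
        total = 0
        for size in chunk_sizes:
            if size is None:
                break
            total += size
        return total

    remote = chunked_remote_volume_size(41 * [100_000_000] + [None])  # hypothetical chunk sizes
    catalog = 4_299_157_223                                           # VolCatBytes from the log
    if remote < catalog:
        print(f"volume pending: remote size {remote} < catalog size {catalog}")  # restore keeps waiting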

Restoring the second job fails with

25-Feb 22:28 bareos2-fd JobId 253: Error: findlib/attribs.cc:441 File size of restored file /tmp/bareos-restores/data/2/testfile4.txt not correct. Original 545837338, restored 492912238.

When larger chunk sizes are configured, e.g. chunksize=1000M, the following error message also appears before the "File size not correct" error:

21-Feb 17:36 bareos-sd JobId 13: Error: stored/block.cc:1127 Volume data error at 0:3145669842! Short block of 58157 bytes on device "S3_Wasabi1" (S3) discarded.
21-Feb 17:36 bareos-sd JobId 13: Error: stored/read_record.cc:256 stored/block.cc:1127 Volume data error at 0:3145669842! Short block of 58157 bytes on device "S3_Wasabi1" (S3) discarded.
Tags: No tags attached.

Activities

stephand   2020-02-26 00:24   File Added: S3_Dev1.conf (1,516 bytes)
stephand   2020-02-26 00:25   File Added: S3_Storage1.conf (253 bytes)
stephand   2020-02-26 00:26   File Added: S3Job.conf (282 bytes)
stephand   2020-02-26 00:26   File Added: S3testjobs.conf (189 bytes)
stephand   2020-02-26 00:27   File Added: testfilesets.conf (221 bytes)
stephand   2020-02-26 00:27   File Added: gen_testfile.py (228 bytes)

stephand   2020-02-26 00:29   Note ~0003858
To reproduce with the attached config, create testfiles like this:
mkdir -p /data/1
mkdir -p /data/2
./gen_testfile.py > /data/1/testfile1.txt
./gen_testfile.py > /data/1/testfile2.txt
./gen_testfile.py > /data/2/testfile1.txt
./gen_testfile.py > /data/2/testfile2.txt
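
The attached gen_testfile.py (228 bytes) is not reproduced in the report; a hypothetical equivalent that writes a few hundred MB of text to stdout, so that each volume spans many S3 chunks, could look like this:

    #!/usr/bin/env python3
    # Hypothetical stand-in for the attached gen_testfile.py: print roughly
    # 500 MB of pseudo-random digit lines. Line count and format are assumptions.
    import random
    import sys

    for _ in range(25_000_000):   # ~25M lines * 21 bytes ~ 500 MB
        sys.stdout.write("%020d\n" % random.getrandbits(64))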

Issue History

Date Modified Username Field Change
2020-02-26 00:15 stephand New Issue
2020-02-26 00:24 stephand File Added: S3_Dev1.conf
2020-02-26 00:25 stephand File Added: S3_Storage1.conf
2020-02-26 00:26 stephand File Added: S3Job.conf
2020-02-26 00:26 stephand File Added: S3testjobs.conf
2020-02-26 00:27 stephand File Added: testfilesets.conf
2020-02-26 00:27 stephand File Added: gen_testfile.py
2020-02-26 00:29 stephand Note Added: 0003858