View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0001180 | bareos-core | storage daemon | public | 2020-02-07 00:24 | 2020-02-11 17:19 |
Reporter | antiduh | Assigned To | arogge | ||
Priority | immediate | Severity | block | Reproducibility | always |
Status | closed | Resolution | fixed | ||
Platform | x86-64 | OS | FreeBSD | OS Version | 12.1 |
Product Version | 19.2.5 | ||||
Fixed in Version | 19.2.6 | ||||
Summary | 0001180: CRC checksum algorithm changed between 18.2.7 and 19.2.5, all volumes fail | ||||
Description | I tried upgrading from 18.2.7 to 19.2.5 today. After upgrading, every single time stored tries to load a volume, it results in a crc checksum error while reading the first block in the volume:
06-Feb 16:37 HomeFile JobId 8110: Error: stored/block.cc:350 Volume data error at 0:0! Block checksum mismatch in block=0 len=192: calc=f625ccaf blk=64bbb13a
06-Feb 16:37 HomeFile JobId 8110: Warning: Volume "Incr-0012" not on device "FileStorage" (/space/bareos).
06-Feb 16:37 HomeFile JobId 8110: Marking Volume "Incr-0012" in Error in Catalog.
06-Feb 16:43 HomeFile JobId 8111: Error: stored/block.cc:350 Volume data error at 0:0! Block checksum mismatch in block=0 len=208: calc=8dc2aa1c blk=466e7c3c
06-Feb 16:37 angst-fd JobId 8111: ACL support is enabled
06-Feb 16:43 HomeFile JobId 8111: Warning: Volume "Full-0005" not on device "FileStorage" (/space/bareos).
06-Feb 16:43 HomeFile JobId 8111: Marking Volume "Full-0005" in Error in Catalog.
I suspect the files are fine:
- The files are on a raidz1 3-disk ZFS pool, and ZFS reports no sha checksum errors.
- The volumes were working just fine this morning before the update.
- The drives are in good health and there are no IO errors reported by the OS.
I noticed that 6 months ago, you completely replaced the CRC implementation used: https://github.com/bareos/bareos/commit/838aef14bddc69241221900ba962690ac9e26203#diff-5d8d24f7d7fbd164fc11057003dd30c5
Any chance you accidentally installed a CRC implementation that gives different answers than the historical implementation? I would expect that you tested this or that someone else would've run in to this, but it's a suspicious coincidence.
For what it's worth, Bareos is able to plow ahead and save the backup to a fresh volume after recycling.
I'm running on FreeBSD 12.1 on both the client and server machine.
FreeBSD masheen.redacted.com 12.1-STABLE FreeBSD 12.1-STABLE r357422 GENERIC amd64
Downgrading back down to 18.2.5 fixes everything aside from the volumes that are now tainted by the 19.2.5 CRC. | ||||
Steps To Reproduce | 1) Install 18.2.7 2) Create a fresh instance using file volumes 3) Back up some files 4) Install 19.2.5 5) Attempt to restore, observe failure | ||||
Tags | No tags attached. | ||||
Hi, thanks for writing a report. The correctness of the CRC is tested automatically. However, such a test may or may not be 100% accurate. As far as I can tell the old and the new algorithm both yield the exact same results. Nevertheless I'll try to reproduce your issue. Do you see the same issue with volumes created on 19 when read in 18? |
|
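A byte-at-a-time reference implementation is one way to cross-check a faster CRC routine, since it never loads multi-byte words and so cannot depend on host byte order. The sketch below is not Bareos code: `Crc32Bytewise` is a hypothetical name, and it assumes the standard reflected CRC-32 parameters (polynomial 0xEDB88320, initial value and final XOR of 0xFFFFFFFF). Whether the Bareos block checksum uses exactly these parameters is not stated in this thread.

```cpp
#include <cstddef>
#include <cstdint>

// Reference CRC-32, processed one bit at a time.
// Assumed parameters: reflected polynomial 0xEDB88320,
// initial value 0xFFFFFFFF, final XOR 0xFFFFFFFF.
inline std::uint32_t Crc32Bytewise(const unsigned char* data, std::size_t len) {
  std::uint32_t crc = 0xFFFFFFFFu;
  for (std::size_t i = 0; i < len; ++i) {
    crc ^= data[i];  // fold the next byte into the low 8 bits
    for (int bit = 0; bit < 8; ++bit) {
      // Shift out one bit; apply the polynomial if that bit was set.
      crc = (crc >> 1) ^ ((crc & 1u) ? 0xEDB88320u : 0u);
    }
  }
  return ~crc;
}
```

With these parameters, the nine ASCII bytes `123456789` yield the well-known check value 0xCBF43926, which makes a convenient known-answer test for any optimized variant.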
For what it's worth: if you run something like ZFS and don't require the checksumming, you can disable it with "Block Checksum" in the SD's device configuration. (I'm not encouraging you to do so and it will be hard to go back, but you can work around the problem with this). | |
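For reference, the workaround mentioned above might look like the following fragment of the storage daemon's device configuration. This is only a sketch: the device name and path are taken from the log lines in the description, while `Media Type = File` is an assumption and the reporter's actual resource will contain other directives.

```
Device {
  Name = FileStorage              # device name as it appears in the log above
  Media Type = File               # assumption, not shown in this report
  Archive Device = /space/bareos  # path as it appears in the log above
  Block Checksum = no             # skip the per-block CRC32 check
}
```

As the note says, this is a workaround rather than a recommendation.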
I can use 18.2's "bls" on a volume written with 19.2, so at least it doesn't seem to be an obvious problem. | |
Is it possible to dump the first few blocks of your volumes using dd and attach these, so I have some test-data to work with? I know that this might contain some sensitive data, so please make sure it doesn't. However, it would really help me. If you cannot provide this, maybe you could run a test yourself by trying to read the volumes written with 18.2/19.2 on FreeBSD with a bls/bscan/bextract on Linux (and 18.2/19.2). |
|
I can reproduce this on FreeBSD. However, it works on Linux and when I copy the volume from FreeBSD to a Linux machine it can be read. So it looks like the block checksumming is broken on FreeBSD. |
|
Looks like the endianness is not detected correctly on FreeBSD. | |
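One way a byte-order check can go wrong without any compiler error (an assumption for illustration; the thread does not show the faulty code) is that the preprocessor replaces every identifier that is not a defined macro with 0 inside `#if`. A comparison between two undefined byte-order macros is therefore silently true. The macro names in this sketch are hypothetical and are deliberately never defined.

```cpp
// Hypothetical macro names; neither is defined anywhere in this sketch.
// Inside #if, both undefined identifiers evaluate to 0, so the test is 0 == 0.
#if SKETCH_BYTE_ORDER == SKETCH_BIG_ENDIAN
constexpr bool kNaiveSaysBigEndian = true;   // this branch is taken
#else
constexpr bool kNaiveSaysBigEndian = false;
#endif

// Guarded variant: first check that the macros exist at all.
#if defined(SKETCH_BYTE_ORDER) && defined(SKETCH_BIG_ENDIAN)
constexpr bool kByteOrderKnown = true;
#else
constexpr bool kByteOrderKnown = false;      // this branch is taken
#endif

static_assert(kNaiveSaysBigEndian, "undefined macros compare equal in #if");
static_assert(!kByteOrderKnown, "the guarded check notices the missing macros");
```

On a platform whose headers spell the macros differently, the naive form would select the big-endian code path even on a little-endian machine.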
I have created a PR that will fix the problem on GitHub: https://github.com/bareos/bareos/pull/412 Testing packages are building right now and should show up at https://download.bareos.org/bareos/experimental/CD/PR-412/ later today. I would be glad if you could check that this change actually fixes your problem. I have tested the change, but I also did the testing for the original change and it looked good to all of us. |
|
That's great news, I'll do a test sometime in the next few hours and let you know. Thank you so much for the quick turnaround time on this! |
|
Took me a while to get the patch tested, but it looks like it's working just fine now. Thanks! | |
The fix has been merged into the master-branch and will be backported to 19.2. The next release 19.2.6 will contain a fix. | |
Fix committed to bareos bareos-19.2 branch with changesetid 12810. | |
Fixed in Bareos 19.2.6 | |
bareos: master 4e482a27 2020-02-07 11:30 Ported: N/A |
tests: test crc32 with a real label block
Bug 0001180: CRC checksum algorithm changed between 18.2.7 and 19.2.5
Previously the crc32 tests did only rudimentary changes, but did not check with a real bareos block. This patch now adds a dumped label block from a test-installation and calculates the checksum for that. This patch also changes the pattern for another test, so it triggers on an endianness problem too. |
Affected Issues 0001180 |
|
mod - core/src/tests/test_crc32.cc |
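Why the test pattern matters can be shown with a small word-at-a-time CRC in the slicing-by-4 style. This is a sketch, not the Bareos implementation: all names are hypothetical, standard CRC-32 parameters are assumed, and words are assembled explicitly the way a little-endian host would load them, so the result does not depend on the machine running it. The `assume_big_endian` flag stands in for the code path a compile-time byte-order check would select.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Table = std::array<std::array<std::uint32_t, 256>, 4>;

// Build four lookup tables for the reflected CRC-32 polynomial 0xEDB88320.
inline Table MakeTables() {
  Table t{};
  for (std::uint32_t i = 0; i < 256; ++i) {
    std::uint32_t c = i;
    for (int bit = 0; bit < 8; ++bit) c = (c >> 1) ^ ((c & 1u) ? 0xEDB88320u : 0u);
    t[0][i] = c;
  }
  for (std::size_t k = 1; k < 4; ++k)
    for (std::size_t i = 0; i < 256; ++i)
      t[k][i] = (t[k - 1][i] >> 8) ^ t[0][t[k - 1][i] & 0xFFu];
  return t;
}

inline std::uint32_t Swap32(std::uint32_t x) {
  return (x >> 24) | ((x >> 8) & 0x0000FF00u) | ((x << 8) & 0x00FF0000u) | (x << 24);
}

// CRC-32 over whole 32-bit words; trailing bytes (len % 4) are ignored.
inline std::uint32_t Crc32Words(const unsigned char* d, std::size_t len,
                                bool assume_big_endian) {
  static const Table t = MakeTables();
  std::uint32_t crc = 0xFFFFFFFFu;
  for (std::size_t i = 0; i + 4 <= len; i += 4) {
    // The word as a little-endian host would load it from memory.
    const std::uint32_t word = d[i] | (std::uint32_t{d[i + 1]} << 8) |
                               (std::uint32_t{d[i + 2]} << 16) |
                               (std::uint32_t{d[i + 3]} << 24);
    if (assume_big_endian) {  // wrong path for this little-endian word
      const std::uint32_t one = word ^ Swap32(crc);
      crc = t[0][one & 0xFFu] ^ t[1][(one >> 8) & 0xFFu] ^
            t[2][(one >> 16) & 0xFFu] ^ t[3][one >> 24];
    } else {                  // correct path for this little-endian word
      const std::uint32_t one = word ^ crc;
      crc = t[0][one >> 24] ^ t[1][(one >> 16) & 0xFFu] ^
            t[2][(one >> 8) & 0xFFu] ^ t[3][one & 0xFFu];
    }
  }
  return ~crc;
}
```

For a single word of identical bytes such as `AAAA`, both paths return the same value, because neither the data word nor the initial CRC changes under a byte swap. For `1234` the two paths disagree. A test built on a uniform pattern can therefore pass even when the wrong path is selected, which is consistent with the commit's switch to a different pattern and a real label block.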
bareos: bareos-19.2 bf4250b8 2020-02-07 11:30 Ported: N/A |
tests: test crc32 with a real label block
Bug 0001180: CRC checksum algorithm changed between 18.2.7 and 19.2.5
Previously the crc32 tests did only rudimentary changes, but did not check with a real bareos block. This patch now adds a dumped label block from a test-installation and calculates the checksum for that. This patch also changes the pattern for another test, so it triggers on an endianness problem too.
(cherry picked from commit 4e482a27661ae6221e077811b714e6cb985fdb5e) |
Affected Issues 0001180 |
|
mod - core/src/tests/test_crc32.cc |
bareos: master ee0b908a 2020-02-07 17:58 Ported: N/A |
stored: use correct algorithm on FreeBSD
Fixes 0001180: CRC checksum algorithm changed between 18.2.7 and 19.2.5
Previously crc32.cc did not detect when it couldn't find out what endianness the machine was. This is now fixed so that
1. FreeBSD detects endianness correctly
2. the compile fails when there is no __BYTE_ORDER |
Affected Issues 0001180 |
|
mod - core/src/stored/crc32/crc32.cc |
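The two properties named in the commit message (detect the byte order correctly, and fail the build when it cannot be determined) can be sketched as follows. This is not the code from the changeset: it relies on the `__BYTE_ORDER__` family of macros that GCC and Clang predefine, not the `__BYTE_ORDER` macro the commit mentions, and `SKETCH_HOST_LITTLE_ENDIAN` and `RuntimeIsLittleEndian` are hypothetical names.

```cpp
#include <cstdint>
#include <cstring>

// Compile-time detection that refuses to guess.
#if defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__) && \
    defined(__ORDER_BIG_ENDIAN__)
#  if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
#    define SKETCH_HOST_LITTLE_ENDIAN 1
#  elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#    define SKETCH_HOST_LITTLE_ENDIAN 0
#  else
#    error "unsupported byte order"
#  endif
#else
#  error "cannot determine byte order at compile time"
#endif

// Runtime cross-check: inspect the first byte of a known 32-bit value.
inline bool RuntimeIsLittleEndian() {
  const std::uint32_t probe = 1;
  unsigned char first = 0;
  std::memcpy(&first, &probe, 1);
  return first == 1;  // the low-order byte comes first on little-endian hosts
}
```

A unit test can compare `RuntimeIsLittleEndian()` against `SKETCH_HOST_LITTLE_ENDIAN` so that a mismatch between the compile-time and runtime view shows up on every platform the tests run on.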
bareos: bareos-19.2 ace6c834 2020-02-07 17:58 Ported: N/A |
stored: use correct algorithm on FreeBSD
Fixes 0001180: CRC checksum algorithm changed between 18.2.7 and 19.2.5
Previously crc32.cc did not detect when it couldn't find out what endianness the machine was. This is now fixed so that
1. FreeBSD detects endianness correctly
2. the compile fails when there is no __BYTE_ORDER
(cherry picked from commit ee0b908a19cd740379e483da01d3054c9ccfd0a9) |
Affected Issues 0001180 |
|
mod - core/src/stored/crc32/crc32.cc |
Date Modified | Username | Field | Change |
---|---|---|---|
2020-02-07 00:24 | antiduh | New Issue | |
2020-02-07 07:50 | arogge | Note Added: 0003740 | |
2020-02-07 08:00 | arogge | Note Added: 0003741 | |
2020-02-07 08:13 | arogge | Note Added: 0003742 | |
2020-02-07 08:20 | arogge | Assigned To | => arogge |
2020-02-07 08:20 | arogge | Status | new => feedback |
2020-02-07 08:20 | arogge | Note Added: 0003743 | |
2020-02-07 09:15 | arogge | Status | feedback => confirmed |
2020-02-07 09:15 | arogge | Note Added: 0003745 | |
2020-02-07 11:29 | arogge | Note Added: 0003746 | |
2020-02-07 17:05 | arogge | Note Added: 0003750 | |
2020-02-07 23:00 | antiduh | Note Added: 0003753 | |
2020-02-09 18:46 | antiduh | Note Added: 0003755 | |
2020-02-10 10:37 | arogge | Relationship added | related to 0001177 |
2020-02-10 10:39 | arogge | Status | confirmed => resolved |
2020-02-10 10:39 | arogge | Resolution | open => fixed |
2020-02-10 10:39 | arogge | Fixed in Version | => 19.2.6 |
2020-02-10 10:39 | arogge | Note Added: 0003757 | |
2020-02-10 11:22 | arogge | Changeset attached | => bareos master ee0b908a |
2020-02-10 11:22 | arogge | Changeset attached | => bareos master 4e482a27 |
2020-02-10 11:22 | arogge | Changeset attached | => bareos bareos-19.2 ace6c834 |
2020-02-10 11:22 | arogge | Changeset attached | => bareos bareos-19.2 bf4250b8 |
2020-02-10 11:22 | arogge | Note Added: 0003760 | |
2020-02-11 17:19 | arogge | Status | resolved => closed |
2020-02-11 17:19 | arogge | Note Added: 0003786 |