View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0000284 | bareos-core | director | public | 2014-03-15 22:32 | 2015-01-26 15:35 |
Reporter | andre | Assigned To | |||
Priority | normal | Severity | major | Reproducibility | always |
Status | closed | Resolution | unable to reproduce | ||
Platform | x86_64 | OS | Gentoo | ||
Product Version | 13.2.2 | ||||
Summary | 0000284: Despooling of attributes leads to fatal error | ||||
Description | Setting "Spool Data" to yes causes the following job failures: 15-Mär 19:19 fs-sd JobId 7: Sending spooled attrs to the Director. Despooling 258,159,689 bytes ... 15-Mär 19:19 fs-sd JobId 7: Fatal error: bsock.c:654 fread attr spool error. ERR=Success 15-Mär 19:19 fs-dir JobId 7: Fatal error: catreq.c:674 fread attr spool error. ERR=Success Rerunning the failed job results in the same error. If "Spool Data" is turned off, backup starts working again. My first investigations showed, that spooling of data and writing to tape is successful. Only despooling of attributes causes the mentioned error. The debug output between job runs is not exactly the same, but all debug logs contain weird characters like "UpdCat Job=NightlySave.2014-03-15_15.14K<*<F8>QK<BF><E1><E3><E3><81> a$05A><B9>|^B<8F><9B><E9>r<BF>F(<E6><A2><U+D9A1><DB>rH\<D3><C1><8E>%<E2><9F>^P" when job is aborted. Every job fails in the same directory (/home/andre/.cache/chromium/Default/Cache/). The directory is not modified when backup is running (headless server). Debug output of one errenous job run: ==================================== console 15-Mär 19:19 fs-sd JobId 7: Sending spooled attrs to the Director. Despooling 258,159,689 bytes ... 15-Mär 19:19 fs-sd JobId 7: Fatal error: bsock.c:654 fread attr spool error. ERR=Success 15-Mär 19:19 fs-dir JobId 7: Fatal error: catreq.c:674 fread attr spool error. ERR=Success 15-Mär 19:19 fs-dir JobId 7: Error: Bareos fs-dir 13.2.2 (12Nov13): Build OS: x86_64-pc-linux-gnu gentoo JobId: 7 Job: NightlySave.2014-03-15_17.20.26_03 Backup Level: Full Client: "fs-fd" 13.2.2 (12Nov13) x86_64-pc-linux-gnu,gentoo, FileSet: "Local Server FileSet" 2014-03-15 09:27:50 Pool: "Daily" (From Job resource) Catalog: "MyCatalog" (From Client resource) Storage: "Autochanger" (From Job resource) Scheduled time: 15-Mär-2014 17:20:25 Start time: 15-Mär-2014 17:29:04 End time: 15-Mär-2014 19:19:20 Elapsed time: 1 hour 50 mins 16 secs Priority: 10 FD Files Written: 761,198 SD Files Written: 761,198 FD Bytes Written: 79,135,982,464 (79.13 GB) SD Bytes Written: 79,479,550,632 (79.47 GB) Rate: 11961.3 KB/s Software Compression: None VSS: no Encryption: yes Accurate: no Volume name(s): DA0001L3 Volume Session Id: 1 Volume Session Time: 1394900387 Last Volume Bytes: 79,584,583,680 (79.58 GB) Non-fatal FD errors: 1 SD Errors: 1 FD termination status: OK SD termination status: Error Termination: *** Backup Error *** ==================================== dir [look at the weird characters at line 2 and 4] 15-Mär-2014 19:19:11 fs-dir: catreq.c:474-7 dird<stored: stream=1 /home/andre/.cache/chromium/Default/Cache/f_000c9c 15-Mär-2014 19:19:11 fs-dir: catreq.c:475-7 dird<stored: attr=gH gMB IGA B Po Po A GnS BAA 4 BRWFSX BRWFSX BRWF<8D><A7>˱<90>ݣ<DB><EF>"<B4>+ <8B> 15-Mär-2014 19:19:11 fs-dir: catreq.c:494-7 dird<filed: stream=1 /home/andre/.cache/chromium/Default/Cache/f_000c9c 15-Mär-2014 19:19:11 fs-dir: catreq.c:495-7 dird<filed: attr=gH gMB IGA B Po Po A GnS BAA 4 BRWFSX BRWFSX BRWF<8D><A7>˱<90>ݣ<DB><EF>"<B4>+ <8B> 15-Mär-2014 19:19:11 fs-dir: msgchan.c:461-7 I'm waiting for message thread termination. 15-Mär-2014 19:19:12 fs-dir: catreq.c:672-7 nbytes=239267896 msglen=1049961239 15-Mär-2014 19:19:12 fs-dir: catreq.c:704-7 End despool_attributes_from_file retval=0 15-Mär-2014 19:19:12 fs-dir: getmsg.c:135-7 bget_dirmsg 203: UpdCat Job=NightlySave.2014-03-15_17.20.26_03 FileAttributes 15-Mär-2014 19:19:12 fs-dir: catreq.c:435-7 UpdCat msg=UpdCat Job=NightlySave.2014-03-15_17.20.26_03 FileAttributes 15-Mär-2014 19:19:12 fs-dir: catreq.c:437-7 UpdCat VolSessId=1 VolSessT=1394900387 FI=1 Strm=1 reclen=122 15-Mär-2014 19:19:12 fs-dir: catreq.c:443-7 Cached attr. Stream=1 fname=/home/andre/.cache/chromium/Default/Cache/f_000c9c 15-Mär-2014 19:19:12 fs-dir: sql_create.c:890-7 Fname=/home/andre/.cache/chromium/Default/Cache/f_000c9c 15-Mär-2014 19:19:12 fs-dir: sql_create.c:891-7 put_file_into_catalog 15-Mär-2014 19:19:12 fs-dir: sql.c:406-7 split path=/home/andre/.cache/chromium/Default/Cache/ file=f_000c9c 15-Mär-2014 19:19:12 fs-dir: mysql.c:453-7 sql_query starts with 'INSERT INTO batch VALUES (56545,7,'/home/andre/.cache/chromium/Default/Cache/','f_000cf3','gH gNG IGA B Po Po A Ia3 BAA BI BRWFY6 BRWFY6 BRWFY6 A A U','GcuvSSabHNcwhOtU0EjfYN+m9V0',0),(56546,7,'/home/andre/.cache/chromium/Default/Cache/','f_00070b','gH geY IGA B Po Po A ELS BAA o BQ3ctW BQ3ctW BQ3ctW A A U','yM8RzEv8aqK36DlInoAKHxE+J24',0),(56547,7,'/home/andre/.cache/chromium/Default/Cache/','f_000d19','gH gcH IGA B Po Po A FZi BAA w BRWFZW BRWFZW BRWFZW A A U','LqUrMSns7g2CY5VJdbU8ASUrPnI',0),(56548,7,'/home/andre/.cache/chromium/Default/Cache/','f_000df4','gH gvy IGA B Po Po A KC5 BAA BY BRWF8C BRWF8C BRWF8C A A U','dTNfFVmOXP+gn2HIhPYQfL4c/ts',0),(56549,7,'/home/andre/.cache/chromium/Default/Cache/','f_000eec','gH gV3 IGA B Po Po A PR/ BAA CA BRXBlN BRXBlN BRXBlN A A U','X8+pdPeoqRYCG2kf0s8TPpoG8ig',0),(56550,7,'/home/andre/.cache/chromium/Default/Cache/','f_000d64','gH gtn IGA B Po Po A MnA BAA Bo BRWFaM BRWFaM BRWFaM A A U','3UPBks3DJZsYlOHdjT3t5F24QF8',0),(56551,7,'/home/andre/.cache/chromium/Default/Cache/','f_000e3f','gH gw3 IGA B Po Po A I9Q BAA BI BRWGcH BRWGcH BRWGcH A A U','v5JTaIHls2H6AX5rckbGTEk9Eqw',0),(56552,7,'/home/andre/.cache/chromium/Default/Cache/','f_000777','gH ggS IGA B Po Po A BegC BAA L4 BQ4FMe BQ4FMe BQ4FMe A A U','jO2w9geXEfPnvO5Y9acu08EZGTE',0),(56553,7,'/home/andre/.cache/chromium/Default/Cache/','f_000d2e','gH gqQ IGA B Po Po A ImB BAA BI BRWFZk BRWFZk BRWFZk A A U','P5Pxplv6zJQWSOKVeEbwmnbnpkc',0),(56554,7,'/home/andre/.cache/chromium/Default/Cache/','f_000f37','gH gGw IGA B Po Po A NTh BAA Bw BRXBsO BRXBsO BRXBsO A A U','z2j53N4sNFXCnYStPuQHHyA/dd4',0),(56555,7,'/home/andre/.cache/chromium/Default/Cache/','f_000f6d','gH gY7 IGA B Po Po A JrO BAA BQ BRXBvm BRXBvm BRXBvm A A U','d2F/dIJ5kQBDqWicT63fKy7Oz8Y',0),(56556,7,'/home/andre/.cache/chromium/Default/Cache/','f_0006c3','gH gdT IGA B Po Po A J9V BAA BQ BQ3bpf BQ3bpf BQ3bpf A A U','nQGbLp2W2/8RN/o6eBdZjLYXEa8',0),(56557,7,'/home/andre/.cache/chromium/Default/Cache/','f_000c48','gH gFk IGA B Po Po A OQ+ BAA B4 BRQz4Z BRQz4Z BRQz4Z A A U','XaUb/8BEkWcjoaOaKhyLeyNveEI',0),(56558,7,'/home/andre/.cache/chromium/Default/Cache/','f_000dac','gH gu2 IGA B Po Po A rOi BAA Fg BRWFtr BRWFtr BRWFtr A A U','aHqyvGpxy5h/+fn1/BF/86a1A+g',0),(56559,7,'/home/andre/.cache/chromium/Default/Cache/','f_000e1f','gH gwb IGA B Po Po A FME BAA w BRWGVo BRWGVo BRWGVo A A U','JXjaAemMzpivuUdC7RRrMa78jxU',0),(56560,7,'/home/andre/.cache/chromium/Default/Cache/','f_000ba2','gH gnp IGA B Po Po A H3t BAA BA BRQzy+ BRQzy+ BRQzy+ A A U','xYi/8oYQzckpqIHqgaY/JwUPrbU',0),(56561,7,'/home/andre/.cache/chromium/Default/Cache/','f_000fa3','gH gH2 IGA B Po Po A Lyr BAA Bg BRZRGs BRZRGs BRZRGs A A U','Pp3afFwbtvl/hMhQ6wP4/O/VNqA',0),(56562,7,'/home/andre/.cache/chromium/Default/Cache/','f_000743','gH gfg IGA B Po Po A Hh1 BAA BA BQ4FDD BQ4FDD BQ4FDD A A U','I74B2ZU8RPIfXi4/k4ySetoeZFA',0),(56563,7,'/home/andre/.cache/chromium/Default/Cache/','f_0007e7','gH gCt IGA B Po Po A G6vs BAA 3Y BQ4Xrz BQ4Xr5 BQ4Xr5 A A U','z6ZYe1dxHeU/64VxA4ikucxIZ28',0),(56564,7,'/home/andre/.cache/chromium/Default/Cache/','f_000c1f','gH gpo IGA B Po Po A IpC BAA BI BRQz3t BRQz3t BRQz3t A A U','SI5PefsS4UAkn6NFkZdQQURlw/0',0),(56565,7,'/home/andre/.cache/chromium/Default/Cache/','f_000c9c','gH gMB IGA B Po Po A GnS BAA 4 BRWFSX BRWFSX BRWF<8D><A7>˱<90>ݣ<DB><EF>"<B4>+<8B>','0',0),(1,7,'/mnt/backup/bedework-backup/bedework/','bwaddgroupmember','PsB w0B0w IHt B Pz Pz A BH BAA I BSWxri BLxiTC BSWxri A A U','kAZY6NLhP7rFo1VuSiTFqqPS22E',0),(2,7,'/mnt/backup/bedework-backup/bedework/','bwinitdir','PsB w0B0y IGk B Pz Pz A BA BAA I BSWxri BLxiTC BSWxri A A U','vlwcBpUk1Qo6JyR84ignKP/BslI',0),(3,7,'/mnt/backup/bedework-backup/bedework/','startjboss.bat','PsB w0B03 IGk B Pz Pz A BV BAA I BSWxri BLxiVt BSWxri A A U','mlYWyPJFsayA4Wq5VkPxQjaBF7c',0),(4,7,'/mnt/backup/bedework-backup/bedework/','README','PsB w0B0p IGk B Pz Pz A CF BAA I BSWxri BLxiTC BSWxri A A U','l3iW9+l+e6OcX2tBds960hhmLsA',0),(5,7,'/mnt/backup/bedework-backup/bedework/bedework/','bedework-cleanup.xml','PsB w4B1R IGk B Pz Pz A 4z BAA I BSWxrr BNZY5m BSWxrr A A U','FA8GB5o+qpDnJSo7FJsUMn6hSe8',0),(6,7,'/mnt/backup/bedework-backup/bedework/bedework/dist/','caldavTest.zip','PsB xECCj IGk B Pz Pz A C+/3 BAA X4 BSWxsR BOIESD BSWxsR A A U','xnm+A6Ecg8o53kHG1u0X0uSe9uw',0),(7,7,'/mnt/backup/bedework-backup/bedework/bedework/dist/cal.war/WEB-INF/','','PsB xaB2W EHt C Pz Pz A BAA BAA I BSWxsR BOIERL BSWxsR A A C','0',0),(8,7,'/mnt/backup/bedework-backup/bedework/bedework/dist/cal.war/','setup.do','PsB xECCn IGk B Pz Pz A Eg BAA I BSWxsR BOIERL BSWxsR A A U','wGCGRPjThFNoo6Ol5VxaqvT11ng',0),(9,7,'/mnt/backup/bedework-backup/bedework/bedework/dist/cal.war/docs/','','PsB xcB1d EHt C Pz Pz A BAA BAA I BSWxsR BOIERL BSWxsR A A C','0',0),(10,7,'/mnt/backup/bedework-backup/bedework/bedework/dist/cal.war/','','PsB xECCg EHt E Pz Pz A BAA BAA I BSWxsS BOIERL BSWxsS A A C','0',0),(11,7,'/mnt/backup/bedework-backup/bedework15-Mär-2014 19:19:12 fs-dir: mysql.c:468-7 we have a result 15-Mär-2014 19:19:12 fs-dir: mysql.c:484-7 we have 32 rows 15-Mär-2014 19:19:12 fs-dir: getmsg.c:135-7 bget_dirmsg 238: UpdCat Job=NightlySave.2014-03-15_17.20.26_03 FileAttributes 15-Mär-2014 19:19:12 fs-dir: catreq.c:435-7 UpdCat msg=UpdCat Job=NightlySave.2014-03-15_17.20.26_03 FileAttributes 15-Mär-2014 19:19:12 fs-dir: catreq.c:437-7 UpdCat VolSessId=1 VolSessT=1394900387 FI=12 Strm=1 reclen=157 [snip] 15-Mär-2014 19:19:16 fs-dir: catreq.c:474-7 dird<stored: stream=1 /home/andre/.cache/chromium/Default/Cache/f_000c9c 15-Mär-2014 19:19:16 fs-dir: catreq.c:475-7 dird<stored: attr=gH gMB IGA B Po Po A GnS BAA 4 BRWFSX BRWFSX BRWF<8D><A7>˱<90>ݣ<DB><EF>"<B4>+ <8B> 15-Mär-2014 19:19:16 fs-dir: catreq.c:494-7 dird<filed: stream=1 /home/andre/.cache/chromium/Default/Cache/f_000c9c 15-Mär-2014 19:19:16 fs-dir: catreq.c:495-7 dird<filed: attr=gH gMB IGA B Po Po A GnS BAA 4 BRWFSX BRWFSX BRWF<8D><A7>˱<90>ݣ<DB><EF>"<B4>+ <8B> 15-Mär-2014 19:19:16 fs-dir: getmsg.c:135-7 bget_dirmsg 59: Status Job=NightlySave.2014-03-15_17.20.26_03 JobStatus=97 15-Mär-2014 19:19:16 fs-dir: getmsg.c:135-7 bget_dirmsg 144: Jmsg Job=NightlySave.2014-03-15_17.20.26_03 type=3 level=1394907556 fs-sd JobId 7: Fatal error: bsock.c:654 fread attr spool error. ERR=Success 15-Mär-2014 19:19:16 fs-dir: mysql.c:453-7 sql_query starts with 'INSERT INTO Log (JobId, Time, LogText) VALUES (7,'2014-03-15 19:19:16','fs-sd JobId 7: Fatal error: bsock.c:654 fread attr spool error. ERR=Success\n\0')' ==================================== fd 15-Mär-2014 17:35:15 fs-fd: find_one.c:936-0 File ----: /home/andre/.cache/chromium/Default/Cache/f_000c9c 15-Mär-2014 17:35:15 fs-fd: find.c:337-0 enter accept_file: fname=/home/andre/.cache/chromium/Default/Cache/f_000c9c 15-Mär-2014 17:35:15 fs-fd: backup.c:505-0 FT_REG saving: /home/andre/.cache/chromium/Default/Cache/f_000c9c 15-Mär-2014 17:35:15 fs-fd: backup.c:606-0 filed: sending /home/andre/.cache/chromium/Default/Cache/f_000c9c to stored 15-Mär-2014 17:35:15 fs-fd: crypto_openssl.c:529-0 crypto_digest_new jcr=7fd130001078 15-Mär-2014 17:35:15 fs-fd: crypto_openssl.c:529-0 crypto_digest_new jcr=7fd130001078 15-Mär-2014 17:35:15 fs-fd: backup.c:1161-0 encode_and_send_attrs fname=/home/andre/.cache/chromium/Default/Cache/f_000c9c 15-Mär-2014 17:35:15 fs-fd: backup.c:1178-0 File /home/andre/.cache/chromium/Default/Cache/f_000c9c attribs=gH gMB IGA B Po Po A GnS BAA 4 BRWFSX BRWFSX BRWFSX A A U attribsEx= 15-Mär-2014 17:35:15 fs-fd: backup.c:1205-0 >stored: attrhdr 56565 1 0 15-Mär-2014 17:35:15 fs-fd: backup.c:1356-0 No strip for /home/andre/.cache/chromium/Default/Cache/f_000c9c 15-Mär-2014 17:35:15 fs-fd: backup.c:1298-0 >stored: attr len=121: 56565 3 /home/andre/.cache/chromium/Default/Cache/f_000c9c 15-Mär-2014 17:35:15 fs-fd: crypto.c:121-0 Send hdr fi=56565 stream=22 15-Mär-2014 17:35:15 fs-fd: crypto.c:129-0 Send data len=640 15-Mär-2014 17:35:15 fs-fd: backup.c:722-0 type=3 do_read=1 15-Mär-2014 17:35:15 fs-fd: bfile.c:1031-0 open file /home/andre/.cache/chromium/Default/Cache/f_000c9c 15-Mär-2014 17:35:15 fs-fd: bfile.c:1055-0 Open file 12 15-Mär-2014 17:35:15 fs-fd: backup.c:1028-0 Saving data, type=3 15-Mär-2014 17:35:15 fs-fd: backup.c:1049-0 >stored: datahdr 56565 20 0 15-Mär-2014 17:35:15 fs-fd: crypto.c:487-0 Encrypt len=27090 15-Mär-2014 17:35:15 fs-fd: crypto.c:513-0 encrypted len=27088 unencrypted len=27090 15-Mär-2014 17:35:15 fs-fd: backup.c:983-0 Send data to SD len=27088 15-Mär-2014 17:35:15 fs-fd: backup.c:1112-0 Send data to SD len=16 15-Mär-2014 17:35:15 fs-fd: bfile.c:1095-0 Close file 12 15-Mär-2014 17:35:15 fs-fd: crypto_openssl.c:629-0 crypto_sign_new jcr=7fd130001078 15-Mär-2014 17:35:15 fs-fd: backup.c:355-0 filed>stored:header 56565 19 0 15-Mär-2014 17:35:15 fs-fd: backup.c:387-0 filed>stored:header 56565 10 0 15-Mär-2014 17:35:15 fs-fd: find_one.c:523-0 FT_REG FI=56565 linked=0 file=/home/andre/.cache/chromium/Default/Cache/f_000c9c ==================================== sd 15-Mär-2014 19:19:02 fs-sd: spool.c:694-0 Commit attributes at 15-Mär-2014 19:19:02 15-Mär-2014 19:19:16 fs-sd: bsock.c:652-0 nbytes=239267896 msglen=1049961239 15-Mär-2014 19:19:16 fs-sd: spool.c:781-0 Close attr spool file at 15-Mär-2014 19:19:16 | ||||
Tags | No tags attached. | ||||
It seems things get seriously corrupted some where. Your setup is probably with a SD on a separate machine and a DIR on an other right ? Not that it matters much but when SD and DIR are on the same machine the spooled attributes are not transferred via the network. The spool file is some binary format and it seems its corrupted when you look at things like 'msglen=1049961239' which should normally be only a few bytes as the spooled data is nothing more then the normal data send to the DIR directly when a file is being backup. The spooling is done in the working directory of the SD. Given that we didn't change much in this code and its been working for quite some time I get the feeling that its something local. You are sure your filesystem is not running out of space ? |
|
This specific backup job runs on the same machine, so there is no network in between. The spool directory points to /mnt/spool. It is a 100 gig partition and I verified if spool files are really written there. Enough space is left on that device (85G free). To be 100 percent sure I already limited the spool size to a maximum of 20 gig. Of course the spool directory is excluded from backup. Backing up with spooling disabled works perfectly. For me this looks like there is a bug in the serializer or deserializer. Can I somehow verify this? Do you need some more debug output? I will check in the meantime if this bug exists in bacula, too. |
|
> I will check in the meantime if this bug exists in bacula, too. This bug exists in bacula, too. Any ideas how to track it down? |
|
We found some nasty bugs in the spooling code which we changed in the current master code I would be interested if that still has the same kind of problems you report here. |
|
Date Modified | Username | Field | Change |
---|---|---|---|
2014-03-15 22:32 | andre | New Issue | |
2014-03-19 16:53 | mvwieringen | Note Added: 0000834 | |
2014-03-19 16:53 | mvwieringen | Assigned To | => mvwieringen |
2014-03-19 16:53 | mvwieringen | Status | new => feedback |
2014-03-19 21:36 | andre | Note Added: 0000836 | |
2014-03-19 21:36 | andre | Status | feedback => assigned |
2014-03-28 19:03 | andre | Note Added: 0000839 | |
2014-11-23 15:18 | mvwieringen | Note Added: 0001073 | |
2014-11-23 15:18 | mvwieringen | Status | assigned => feedback |
2015-01-26 15:35 | mvwieringen | Assigned To | mvwieringen => |
2015-01-26 15:35 | mvwieringen | Status | feedback => closed |
2015-01-26 15:35 | mvwieringen | Resolution | open => unable to reproduce |