View Issue Details

IDProjectCategoryView StatusLast Update
0000284bareos-coredirectorpublic2015-01-26 15:35
Reporterandre Assigned To 
PrioritynormalSeveritymajorReproducibilityalways
Status closedResolutionunable to reproduce 
Platformx86_64OSGentoo 
Product Version13.2.2 
Summary0000284: Despooling of attributes leads to fatal error
DescriptionSetting "Spool Data" to yes causes the following job failures:

15-Mär 19:19 fs-sd JobId 7: Sending spooled attrs to the Director. Despooling 258,159,689 bytes ...
15-Mär 19:19 fs-sd JobId 7: Fatal error: bsock.c:654 fread attr spool error. ERR=Success
15-Mär 19:19 fs-dir JobId 7: Fatal error: catreq.c:674 fread attr spool error. ERR=Success

Rerunning the failed job results in the same error. If "Spool Data" is turned off, backup starts working again.

My first investigations showed, that spooling of data and writing to tape is successful. Only despooling of attributes causes the mentioned error. The debug output between job runs is not exactly the same, but all debug logs contain weird characters like "UpdCat Job=NightlySave.2014-03-15_15.14K<*<F8>QK<BF><E1><E3><E3><81>
a$05A><B9>|^B<8F><9B><E9>r<BF>F(<E6><A2><U+D9A1><DB>rH\<D3><C1><8E>%<E2><9F>^P" when job is aborted.

Every job fails in the same directory (/home/andre/.cache/chromium/Default/Cache/). The directory is not modified when backup is running (headless server).



Debug output of one errenous job run:

==================================== console
15-Mär 19:19 fs-sd JobId 7: Sending spooled attrs to the Director. Despooling 258,159,689 bytes ...
15-Mär 19:19 fs-sd JobId 7: Fatal error: bsock.c:654 fread attr spool error. ERR=Success
15-Mär 19:19 fs-dir JobId 7: Fatal error: catreq.c:674 fread attr spool error. ERR=Success
15-Mär 19:19 fs-dir JobId 7: Error: Bareos fs-dir 13.2.2 (12Nov13):
  Build OS: x86_64-pc-linux-gnu gentoo
  JobId: 7
  Job: NightlySave.2014-03-15_17.20.26_03
  Backup Level: Full
  Client: "fs-fd" 13.2.2 (12Nov13) x86_64-pc-linux-gnu,gentoo,
  FileSet: "Local Server FileSet" 2014-03-15 09:27:50
  Pool: "Daily" (From Job resource)
  Catalog: "MyCatalog" (From Client resource)
  Storage: "Autochanger" (From Job resource)
  Scheduled time: 15-Mär-2014 17:20:25
  Start time: 15-Mär-2014 17:29:04
  End time: 15-Mär-2014 19:19:20
  Elapsed time: 1 hour 50 mins 16 secs
  Priority: 10
  FD Files Written: 761,198
  SD Files Written: 761,198
  FD Bytes Written: 79,135,982,464 (79.13 GB)
  SD Bytes Written: 79,479,550,632 (79.47 GB)
  Rate: 11961.3 KB/s
  Software Compression: None
  VSS: no
  Encryption: yes
  Accurate: no
  Volume name(s): DA0001L3
  Volume Session Id: 1
  Volume Session Time: 1394900387
  Last Volume Bytes: 79,584,583,680 (79.58 GB)
  Non-fatal FD errors: 1
  SD Errors: 1
  FD termination status: OK
  SD termination status: Error
  Termination: *** Backup Error ***



==================================== dir
[look at the weird characters at line 2 and 4]

15-Mär-2014 19:19:11 fs-dir: catreq.c:474-7 dird<stored: stream=1 /home/andre/.cache/chromium/Default/Cache/f_000c9c
15-Mär-2014 19:19:11 fs-dir: catreq.c:475-7 dird<stored: attr=gH gMB IGA B Po Po A GnS BAA 4 BRWFSX BRWFSX BRWF<8D><A7>˱<90>ݣ<DB><EF>"<B4>+
<8B>
15-Mär-2014 19:19:11 fs-dir: catreq.c:494-7 dird<filed: stream=1 /home/andre/.cache/chromium/Default/Cache/f_000c9c
15-Mär-2014 19:19:11 fs-dir: catreq.c:495-7 dird<filed: attr=gH gMB IGA B Po Po A GnS BAA 4 BRWFSX BRWFSX BRWF<8D><A7>˱<90>ݣ<DB><EF>"<B4>+
<8B>
15-Mär-2014 19:19:11 fs-dir: msgchan.c:461-7 I'm waiting for message thread termination.
15-Mär-2014 19:19:12 fs-dir: catreq.c:672-7 nbytes=239267896 msglen=1049961239
15-Mär-2014 19:19:12 fs-dir: catreq.c:704-7 End despool_attributes_from_file retval=0
15-Mär-2014 19:19:12 fs-dir: getmsg.c:135-7 bget_dirmsg 203: UpdCat Job=NightlySave.2014-03-15_17.20.26_03 FileAttributes
15-Mär-2014 19:19:12 fs-dir: catreq.c:435-7 UpdCat msg=UpdCat Job=NightlySave.2014-03-15_17.20.26_03 FileAttributes
15-Mär-2014 19:19:12 fs-dir: catreq.c:437-7 UpdCat VolSessId=1 VolSessT=1394900387 FI=1 Strm=1 reclen=122
15-Mär-2014 19:19:12 fs-dir: catreq.c:443-7 Cached attr. Stream=1 fname=/home/andre/.cache/chromium/Default/Cache/f_000c9c
15-Mär-2014 19:19:12 fs-dir: sql_create.c:890-7 Fname=/home/andre/.cache/chromium/Default/Cache/f_000c9c
15-Mär-2014 19:19:12 fs-dir: sql_create.c:891-7 put_file_into_catalog
15-Mär-2014 19:19:12 fs-dir: sql.c:406-7 split path=/home/andre/.cache/chromium/Default/Cache/ file=f_000c9c


15-Mär-2014 19:19:12 fs-dir: mysql.c:453-7 sql_query starts with 'INSERT INTO batch VALUES (56545,7,'/home/andre/.cache/chromium/Default/Cache/','f_000cf3','gH gNG IGA B Po Po A Ia3 BAA BI BRWFY6 BRWFY6 BRWFY6 A A U','GcuvSSabHNcwhOtU0EjfYN+m9V0',0),(56546,7,'/home/andre/.cache/chromium/Default/Cache/','f_00070b','gH geY IGA B Po Po A ELS BAA o BQ3ctW BQ3ctW BQ3ctW A A U','yM8RzEv8aqK36DlInoAKHxE+J24',0),(56547,7,'/home/andre/.cache/chromium/Default/Cache/','f_000d19','gH gcH IGA B Po Po A FZi BAA w BRWFZW BRWFZW BRWFZW A A U','LqUrMSns7g2CY5VJdbU8ASUrPnI',0),(56548,7,'/home/andre/.cache/chromium/Default/Cache/','f_000df4','gH gvy IGA B Po Po A KC5 BAA BY BRWF8C BRWF8C BRWF8C A A U','dTNfFVmOXP+gn2HIhPYQfL4c/ts',0),(56549,7,'/home/andre/.cache/chromium/Default/Cache/','f_000eec','gH gV3 IGA B Po Po A PR/ BAA CA BRXBlN BRXBlN BRXBlN A A U','X8+pdPeoqRYCG2kf0s8TPpoG8ig',0),(56550,7,'/home/andre/.cache/chromium/Default/Cache/','f_000d64','gH gtn IGA B Po Po A MnA BAA Bo BRWFaM BRWFaM BRWFaM A A U','3UPBks3DJZsYlOHdjT3t5F24QF8',0),(56551,7,'/home/andre/.cache/chromium/Default/Cache/','f_000e3f','gH gw3 IGA B Po Po A I9Q BAA BI BRWGcH BRWGcH BRWGcH A A U','v5JTaIHls2H6AX5rckbGTEk9Eqw',0),(56552,7,'/home/andre/.cache/chromium/Default/Cache/','f_000777','gH ggS IGA B Po Po A BegC BAA L4 BQ4FMe BQ4FMe BQ4FMe A A U','jO2w9geXEfPnvO5Y9acu08EZGTE',0),(56553,7,'/home/andre/.cache/chromium/Default/Cache/','f_000d2e','gH gqQ IGA B Po Po A ImB BAA BI BRWFZk BRWFZk BRWFZk A A U','P5Pxplv6zJQWSOKVeEbwmnbnpkc',0),(56554,7,'/home/andre/.cache/chromium/Default/Cache/','f_000f37','gH gGw IGA B Po Po A NTh BAA Bw BRXBsO BRXBsO BRXBsO A A U','z2j53N4sNFXCnYStPuQHHyA/dd4',0),(56555,7,'/home/andre/.cache/chromium/Default/Cache/','f_000f6d','gH gY7 IGA B Po Po A JrO BAA BQ BRXBvm BRXBvm BRXBvm A A U','d2F/dIJ5kQBDqWicT63fKy7Oz8Y',0),(56556,7,'/home/andre/.cache/chromium/Default/Cache/','f_0006c3','gH gdT IGA B Po Po A J9V BAA BQ BQ3bpf BQ3bpf BQ3bpf A A U','nQGbLp2W2/8RN/o6eBdZjLYXEa8',0),(56557,7,'/home/andre/.cache/chromium/Default/Cache/','f_000c48','gH gFk IGA B Po Po A OQ+ BAA B4 BRQz4Z BRQz4Z BRQz4Z A A U','XaUb/8BEkWcjoaOaKhyLeyNveEI',0),(56558,7,'/home/andre/.cache/chromium/Default/Cache/','f_000dac','gH gu2 IGA B Po Po A rOi BAA Fg BRWFtr BRWFtr BRWFtr A A U','aHqyvGpxy5h/+fn1/BF/86a1A+g',0),(56559,7,'/home/andre/.cache/chromium/Default/Cache/','f_000e1f','gH gwb IGA B Po Po A FME BAA w BRWGVo BRWGVo BRWGVo A A U','JXjaAemMzpivuUdC7RRrMa78jxU',0),(56560,7,'/home/andre/.cache/chromium/Default/Cache/','f_000ba2','gH gnp IGA B Po Po A H3t BAA BA BRQzy+ BRQzy+ BRQzy+ A A U','xYi/8oYQzckpqIHqgaY/JwUPrbU',0),(56561,7,'/home/andre/.cache/chromium/Default/Cache/','f_000fa3','gH gH2 IGA B Po Po A Lyr BAA Bg BRZRGs BRZRGs BRZRGs A A U','Pp3afFwbtvl/hMhQ6wP4/O/VNqA',0),(56562,7,'/home/andre/.cache/chromium/Default/Cache/','f_000743','gH gfg IGA B Po Po A Hh1 BAA BA BQ4FDD BQ4FDD BQ4FDD A A U','I74B2ZU8RPIfXi4/k4ySetoeZFA',0),(56563,7,'/home/andre/.cache/chromium/Default/Cache/','f_0007e7','gH gCt IGA B Po Po A G6vs BAA 3Y BQ4Xrz BQ4Xr5 BQ4Xr5 A A U','z6ZYe1dxHeU/64VxA4ikucxIZ28',0),(56564,7,'/home/andre/.cache/chromium/Default/Cache/','f_000c1f','gH gpo IGA B Po Po A IpC BAA BI BRQz3t BRQz3t BRQz3t A A U','SI5PefsS4UAkn6NFkZdQQURlw/0',0),(56565,7,'/home/andre/.cache/chromium/Default/Cache/','f_000c9c','gH gMB IGA B Po Po A GnS BAA 4 BRWFSX BRWFSX BRWF<8D><A7>˱<90>ݣ<DB><EF>"<B4>+<8B>','0',0),(1,7,'/mnt/backup/bedework-backup/bedework/','bwaddgroupmember','PsB w0B0w IHt B Pz Pz A BH BAA I BSWxri BLxiTC BSWxri A A U','kAZY6NLhP7rFo1VuSiTFqqPS22E',0),(2,7,'/mnt/backup/bedework-backup/bedework/','bwinitdir','PsB w0B0y IGk B Pz Pz A BA BAA I BSWxri BLxiTC BSWxri A A U','vlwcBpUk1Qo6JyR84ignKP/BslI',0),(3,7,'/mnt/backup/bedework-backup/bedework/','startjboss.bat','PsB w0B03 IGk B Pz Pz A BV BAA I BSWxri BLxiVt BSWxri A A U','mlYWyPJFsayA4Wq5VkPxQjaBF7c',0),(4,7,'/mnt/backup/bedework-backup/bedework/','README','PsB w0B0p IGk B Pz Pz A CF BAA I BSWxri BLxiTC BSWxri A A U','l3iW9+l+e6OcX2tBds960hhmLsA',0),(5,7,'/mnt/backup/bedework-backup/bedework/bedework/','bedework-cleanup.xml','PsB w4B1R IGk B Pz Pz A 4z BAA I BSWxrr BNZY5m BSWxrr A A U','FA8GB5o+qpDnJSo7FJsUMn6hSe8',0),(6,7,'/mnt/backup/bedework-backup/bedework/bedework/dist/','caldavTest.zip','PsB xECCj IGk B Pz Pz A C+/3 BAA X4 BSWxsR BOIESD BSWxsR A A U','xnm+A6Ecg8o53kHG1u0X0uSe9uw',0),(7,7,'/mnt/backup/bedework-backup/bedework/bedework/dist/cal.war/WEB-INF/','','PsB xaB2W EHt C Pz Pz A BAA BAA I BSWxsR BOIERL BSWxsR A A C','0',0),(8,7,'/mnt/backup/bedework-backup/bedework/bedework/dist/cal.war/','setup.do','PsB xECCn IGk B Pz Pz A Eg BAA I BSWxsR BOIERL BSWxsR A A U','wGCGRPjThFNoo6Ol5VxaqvT11ng',0),(9,7,'/mnt/backup/bedework-backup/bedework/bedework/dist/cal.war/docs/','','PsB xcB1d EHt C Pz Pz A BAA BAA I BSWxsR BOIERL BSWxsR A A C','0',0),(10,7,'/mnt/backup/bedework-backup/bedework/bedework/dist/cal.war/','','PsB xECCg EHt E Pz Pz A BAA BAA I BSWxsS BOIERL BSWxsS A A C','0',0),(11,7,'/mnt/backup/bedework-backup/bedework15-Mär-2014 19:19:12 fs-dir: mysql.c:468-7 we have a result
15-Mär-2014 19:19:12 fs-dir: mysql.c:484-7 we have 32 rows
15-Mär-2014 19:19:12 fs-dir: getmsg.c:135-7 bget_dirmsg 238: UpdCat Job=NightlySave.2014-03-15_17.20.26_03 FileAttributes
15-Mär-2014 19:19:12 fs-dir: catreq.c:435-7 UpdCat msg=UpdCat Job=NightlySave.2014-03-15_17.20.26_03 FileAttributes
15-Mär-2014 19:19:12 fs-dir: catreq.c:437-7 UpdCat VolSessId=1 VolSessT=1394900387 FI=12 Strm=1 reclen=157

[snip]

15-Mär-2014 19:19:16 fs-dir: catreq.c:474-7 dird<stored: stream=1 /home/andre/.cache/chromium/Default/Cache/f_000c9c
15-Mär-2014 19:19:16 fs-dir: catreq.c:475-7 dird<stored: attr=gH gMB IGA B Po Po A GnS BAA 4 BRWFSX BRWFSX BRWF<8D><A7>˱<90>ݣ<DB><EF>"<B4>+
<8B>
15-Mär-2014 19:19:16 fs-dir: catreq.c:494-7 dird<filed: stream=1 /home/andre/.cache/chromium/Default/Cache/f_000c9c
15-Mär-2014 19:19:16 fs-dir: catreq.c:495-7 dird<filed: attr=gH gMB IGA B Po Po A GnS BAA 4 BRWFSX BRWFSX BRWF<8D><A7>˱<90>ݣ<DB><EF>"<B4>+
<8B>
15-Mär-2014 19:19:16 fs-dir: getmsg.c:135-7 bget_dirmsg 59: Status Job=NightlySave.2014-03-15_17.20.26_03 JobStatus=97

15-Mär-2014 19:19:16 fs-dir: getmsg.c:135-7 bget_dirmsg 144: Jmsg Job=NightlySave.2014-03-15_17.20.26_03 type=3 level=1394907556 fs-sd JobId 7: Fatal error: bsock.c:654 fread attr spool error. ERR=Success

15-Mär-2014 19:19:16 fs-dir: mysql.c:453-7 sql_query starts with 'INSERT INTO Log (JobId, Time, LogText) VALUES (7,'2014-03-15 19:19:16','fs-sd JobId 7: Fatal error: bsock.c:654 fread attr spool error. ERR=Success\n\0')'




==================================== fd
15-Mär-2014 17:35:15 fs-fd: find_one.c:936-0 File ----: /home/andre/.cache/chromium/Default/Cache/f_000c9c
15-Mär-2014 17:35:15 fs-fd: find.c:337-0 enter accept_file: fname=/home/andre/.cache/chromium/Default/Cache/f_000c9c
15-Mär-2014 17:35:15 fs-fd: backup.c:505-0 FT_REG saving: /home/andre/.cache/chromium/Default/Cache/f_000c9c
15-Mär-2014 17:35:15 fs-fd: backup.c:606-0 filed: sending /home/andre/.cache/chromium/Default/Cache/f_000c9c to stored
15-Mär-2014 17:35:15 fs-fd: crypto_openssl.c:529-0 crypto_digest_new jcr=7fd130001078
15-Mär-2014 17:35:15 fs-fd: crypto_openssl.c:529-0 crypto_digest_new jcr=7fd130001078
15-Mär-2014 17:35:15 fs-fd: backup.c:1161-0 encode_and_send_attrs fname=/home/andre/.cache/chromium/Default/Cache/f_000c9c
15-Mär-2014 17:35:15 fs-fd: backup.c:1178-0 File /home/andre/.cache/chromium/Default/Cache/f_000c9c
attribs=gH gMB IGA B Po Po A GnS BAA 4 BRWFSX BRWFSX BRWFSX A A U
attribsEx=
15-Mär-2014 17:35:15 fs-fd: backup.c:1205-0 >stored: attrhdr 56565 1 0
15-Mär-2014 17:35:15 fs-fd: backup.c:1356-0 No strip for /home/andre/.cache/chromium/Default/Cache/f_000c9c
15-Mär-2014 17:35:15 fs-fd: backup.c:1298-0 >stored: attr len=121: 56565 3 /home/andre/.cache/chromium/Default/Cache/f_000c9c
15-Mär-2014 17:35:15 fs-fd: crypto.c:121-0 Send hdr fi=56565 stream=22
15-Mär-2014 17:35:15 fs-fd: crypto.c:129-0 Send data len=640
15-Mär-2014 17:35:15 fs-fd: backup.c:722-0 type=3 do_read=1
15-Mär-2014 17:35:15 fs-fd: bfile.c:1031-0 open file /home/andre/.cache/chromium/Default/Cache/f_000c9c
15-Mär-2014 17:35:15 fs-fd: bfile.c:1055-0 Open file 12
15-Mär-2014 17:35:15 fs-fd: backup.c:1028-0 Saving data, type=3
15-Mär-2014 17:35:15 fs-fd: backup.c:1049-0 >stored: datahdr 56565 20 0
15-Mär-2014 17:35:15 fs-fd: crypto.c:487-0 Encrypt len=27090
15-Mär-2014 17:35:15 fs-fd: crypto.c:513-0 encrypted len=27088 unencrypted len=27090
15-Mär-2014 17:35:15 fs-fd: backup.c:983-0 Send data to SD len=27088
15-Mär-2014 17:35:15 fs-fd: backup.c:1112-0 Send data to SD len=16
15-Mär-2014 17:35:15 fs-fd: bfile.c:1095-0 Close file 12
15-Mär-2014 17:35:15 fs-fd: crypto_openssl.c:629-0 crypto_sign_new jcr=7fd130001078
15-Mär-2014 17:35:15 fs-fd: backup.c:355-0 filed>stored:header 56565 19 0
15-Mär-2014 17:35:15 fs-fd: backup.c:387-0 filed>stored:header 56565 10 0
15-Mär-2014 17:35:15 fs-fd: find_one.c:523-0 FT_REG FI=56565 linked=0 file=/home/andre/.cache/chromium/Default/Cache/f_000c9c


==================================== sd
15-Mär-2014 19:19:02 fs-sd: spool.c:694-0 Commit attributes at 15-Mär-2014 19:19:02
15-Mär-2014 19:19:16 fs-sd: bsock.c:652-0 nbytes=239267896 msglen=1049961239
15-Mär-2014 19:19:16 fs-sd: spool.c:781-0 Close attr spool file at 15-Mär-2014 19:19:16
TagsNo tags attached.

Activities

mvwieringen

mvwieringen

2014-03-19 16:53

developer   ~0000834

It seems things get seriously corrupted some where. Your setup is probably
with a SD on a separate machine and a DIR on an other right ? Not that it
matters much but when SD and DIR are on the same machine the spooled attributes
are not transferred via the network. The spool file is some binary format and
it seems its corrupted when you look at things like 'msglen=1049961239' which
should normally be only a few bytes as the spooled data is nothing more then
the normal data send to the DIR directly when a file is being backup. The
spooling is done in the working directory of the SD. Given that we didn't change
much in this code and its been working for quite some time I get the feeling that
its something local. You are sure your filesystem is not running out of space ?
andre

andre

2014-03-19 21:36

reporter   ~0000836

This specific backup job runs on the same machine, so there is no network in between. The spool directory points to /mnt/spool. It is a 100 gig partition and I verified if spool files are really written there. Enough space is left on that device (85G free). To be 100 percent sure I already limited the spool size to a maximum of 20 gig. Of course the spool directory is excluded from backup.

Backing up with spooling disabled works perfectly. For me this looks like there is a bug in the serializer or deserializer. Can I somehow verify this?
Do you need some more debug output?

I will check in the meantime if this bug exists in bacula, too.
andre

andre

2014-03-28 19:03

reporter   ~0000839

> I will check in the meantime if this bug exists in bacula, too.

This bug exists in bacula, too.

Any ideas how to track it down?
mvwieringen

mvwieringen

2014-11-23 15:18

developer   ~0001073

We found some nasty bugs in the spooling code which we changed in the current
master code I would be interested if that still has the same kind of problems
you report here.

Issue History

Date Modified Username Field Change
2014-03-15 22:32 andre New Issue
2014-03-19 16:53 mvwieringen Note Added: 0000834
2014-03-19 16:53 mvwieringen Assigned To => mvwieringen
2014-03-19 16:53 mvwieringen Status new => feedback
2014-03-19 21:36 andre Note Added: 0000836
2014-03-19 21:36 andre Status feedback => assigned
2014-03-28 19:03 andre Note Added: 0000839
2014-11-23 15:18 mvwieringen Note Added: 0001073
2014-11-23 15:18 mvwieringen Status assigned => feedback
2015-01-26 15:35 mvwieringen Assigned To mvwieringen =>
2015-01-26 15:35 mvwieringen Status feedback => closed
2015-01-26 15:35 mvwieringen Resolution open => unable to reproduce