View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0000498 | bareos-core | file daemon | public | 2015-07-29 11:03 | 2016-04-26 10:15 |
Reporter | Gunther | Assigned To | |||
Priority | normal | Severity | minor | Reproducibility | always |
Status | closed | Resolution | unable to reproduce | ||
Platform | x86_64 on vmware vsphere 5.5 (2) | OS | CentOS | OS Version | 6.6 |
Product Version | 14.2.2 | ||||
Summary | 0000498: centos - backup fails after 5 minutes because of clientside network reset | ||||
Description | Old bug - existed already with bacula. (google finds old references) Starting a backup-run of a centos-client will end up in a "broken connection" / network reset caused by the client. Reset takes place after 5 minutes (and 1 or 2 seconds). .............. Already changed on client: net.ipv4.tcp_keepalive_time = 600 (instead 7200) no effect.. All other clients (>100; different linux flavours) work flawlessly. | ||||
Steps To Reproduce | Start backup of centos-client - wait for the error. | ||||
Additional Information | /usr/sbin/bareos-fd -d 200 -f -c /etc/bareos/bareos-fd.conf start backup-run:: bareos-fd: filed_conf.c:484-0 Inserting director res: bareos-server-mon centos-client-fd: jcr.c:141-0 read_last_jobs seek to 192 centos-client-fd: jcr.c:148-0 Read num_items=9 centos-client-fd: fd_plugins.c:1560-0 plugin dir is NULL centos-client-fd: filed.c:267-0 filed: listening on port 9102 centos-client-fd: bnet_server_tcp.c:166-0 Addresses host[ipv4;0.0.0.0;9102] centos-client-fd: dir_cmd.c:576-0 Conn: Hello Director bareos-server-dir calling centos-client-fd: dir_cmd.c:582-0 Got a DIR connection at 29-Jul-2015 10:38:30 centos-client-fd: dir_cmd.c:444-0 Calling Authenticate centos-client-fd: cram-md5.c:68-0 send: auth cram-md5 <1794404934.1438159110@centos-client-fd> ssl=0 centos-client-fd: cram-md5.c:123-0 cram-get received: auth cram-md5 <1285386616.1438159110@bareos-server-dir> ssl=0 centos-client-fd: cram-md5.c:143-0 sending resp to challenge: m9Q+ZmYMq4Z92n4pA9Fc9A centos-client-fd: dir_cmd.c:449-0 OK Authenticate centos-client-fd: dir_cmd.c:463-0 <dird: JobId=69 Job=centos-client.2015-07-29_10.38.28_25 SDid=64 SDtime=1438092081 Authorization=JPMM-EIGA-HIAJ-HGNO-GHKF-MNPN-MGIP-KEBO centos-client-fd: dir_cmd.c:474-0 Executing JobId= command. centos-client-fd: dir_cmd.c:1430-0 set sd auth key centos-client-fd: dir_cmd.c:810-0 JobId=69 Auth=JPMM-EIGA-HIAJ-HGNO-GHKF-MNPN-MGIP-KEBO centos-client-fd: fd_plugins.c:1691-0 plugin list is NULL centos-client-fd: dir_cmd.c:463-0 <dird: level = full mtime_only=0 centos-client-fd: dir_cmd.c:474-0 Executing level = command. centos-client-fd: dir_cmd.c:1265-0 level_cmd: level = full mtime_only=0 centos-client-fd: dir_cmd.c:463-0 <dird: fileset vss=1 centos-client-fd: dir_cmd.c:474-0 Executing fileset command. centos-client-fd: fileset.c:251-0 I centos-client-fd: fileset.c:251-0 O fMZ6 centos-client-fd: fileset.c:251-0 N centos-client-fd: fileset.c:251-0 F / centos-client-fd: fileset.c:251-0 N centos-client-fd: fileset.c:251-0 N centos-client-fd: fileset.c:251-0 E centos-client-fd: fileset.c:251-0 F /proc centos-client-fd: fileset.c:251-0 F /sys centos-client-fd: fileset.c:251-0 F /tmp centos-client-fd: fileset.c:251-0 F /.journal centos-client-fd: fileset.c:251-0 F /.fsck centos-client-fd: fileset.c:251-0 F /srv/chroot/el5/dev .... .... tail: ..... centos-client-fd: backup.c:627-0 filed: sending /usr/share/doc/gcc-c++-4.4.7/ChangeLog-2003.bz2 to stored centos-client-fd: crypto_openssl.c:529-0 crypto_digest_new jcr=7f14480018c8 centos-client-fd: backup.c:1501-0 No strip for /usr/share/doc/gcc-c++-4.4.7/ChangeLog-2003.bz2 centos-client-fd: backup.c:748-0 type=3 do_read=1 centos-client-fd: bfile.c:1101-0 bopen: fname /usr/share/doc/gcc-c++-4.4.7/ChangeLog-2003.bz2, flags 0, mode 0, rdev 0 centos-client-fd: bfile.c:1111-0 open file /usr/share/doc/gcc-c++-4.4.7/ChangeLog-2003.bz2 centos-client-fd: backup.c:177-0 end blast_data ok=0 centos-client-fd: dir_cmd.c:1712-0 Error in blast_data. centos-client-fd: dir_cmd.c:477-0 Quit command loop. Canceled=1 centos-client-fd: runscript.c:149-0 runscript: running all RUNSCRIPT object (ClientAfterJob) JobStatus=f centos-client-fd: dir_cmd.c:513-0 End FD msg: Jmsg Job=centos-client.2015-07-29_10.38.28_25 type=4 level=1438159416 centos-client-fd JobId 69: Error: bsock_tcp.c:422 Write error sending 41896 bytes to Storage daemon:bareos-server.backup-domain:9103: ERR=Broken pipe centos-client-fd: dir_cmd.c:533-0 Calling term_find_files centos-client-fd: htable.c:84-0 free malloc buf=7f14634c2038 centos-client-fd: mem_pool.c:432-0 Freed mem_pool count=33 size=25,957 Pool Maxsize Maxused Inuse NoPool 256 3 1 NAME 256 3 0 FNAME 2304 20 3 MSG 512 10 3 EMSG 1024 3 0 BSOCK 65636 5 2 RECORD 128 0 0 centos-client-fd: htable.c:590-0 Done destroy. centos-client-fd: dir_cmd.c:538-0 Done with term_find_files centos-client-fd: jcr.c:184-0 write_last_jobs seek to 192 centos-client-fd: dir_cmd.c:540-0 Done with free_jcr centos-client-fd: mem_pool.c:394-0 garbage collect memory pool ................ ................ Server logs: 29-Jul 10:38 bareos-server-dir JobId 69: No prior Full backup Job record found. 29-Jul 10:38 bareos-server-dir JobId 69: No prior or suitable Full backup found in catalog. Doing FULL backup. 29-Jul 10:38 bareos-server-dir JobId 69: Start Backup JobId 69, Job=centos-client.2015-07-29_10.38.28_25 29-Jul 10:38 bareos-server-dir JobId 69: Using Device "bareos-server-div-storage" to write. 29-Jul 10:38 bareos-server-sd JobId 69: Volume "div0040" previously written, moving to end of data. 29-Jul 10:38 bareos-server-sd JobId 69: Ready to append to end of Volume "div0040" size=11965573289 29-Jul 10:43 bareos-server-dir JobId 69: Fatal error: Network error with FD during Backup: ERR=Die Verbindung wurde vom Kommunikationspartner zurückgesetzt 29-Jul 10:43 bareos-server-dir JobId 69: Fatal error: No Job status returned from FD. 29-Jul 10:43 bareos-server-dir JobId 69: Error: Bareos bareos-server-dir 14.2.2 (12Dec14): Build OS: x86_64-pc-linux-gnu debian Debian GNU/Linux 7.0 (wheezy) JobId: 69 Job: centos-client.2015-07-29_10.38.28_25 Backup Level: Full (upgraded from Incremental) Client: "centos-client-fd" 14.2.2 (12Dec14) x86_64-unknown-linux-gnu,redhat,CentOS release 6.2 (Final) FileSet: "centos-client-fs" 2015-07-28 15:57:12 Pool: "Div" (From Job resource) Catalog: "MyCatalog" (From Client resource) Storage: "bareos-server-sd-div" (From Pool resource) Scheduled time: 29-Jul-2015 10:37:21 Start time: 29-Jul-2015 10:38:30 End time: 29-Jul-2015 10:43:36 Elapsed time: 5 mins 6 secs Priority: 10 FD Files Written: 0 SD Files Written: 0 FD Bytes Written: 0 (0 B) SD Bytes Written: 10,994,848 (10.99 MB) Rate: 0.0 KB/s Software Compression: None VSS: no Encryption: no Accurate: no Volume name(s): Volume Session Id: 64 Volume Session Time: 1438092081 Last Volume Bytes: 0 (0 B) Non-fatal FD errors: 1 SD Errors: 0 FD termination status: Error SD termination status: Running Termination: *** Backup Error *** ............ ............ Status client (bconsole): Connecting to Client centos-client-fd at centos-client.backup-domain:9102 centos-client-fd Version: 14.2.2 (12 December 2014) x86_64-unknown-linux-gnu redhat CentOS release 6.2 (Final) Daemon started 29-Jul-15 10:36. Jobs: run=1 running=0. Heap: heap=36,864 smbytes=93,929 max_bytes=2,240,953 bufs=65 max_bufs=214 Sizeof: boffset_t=8 size_t=8 debug=200 trace=0 bwlimit=0kB/s Running Jobs: Director connected at: 29-Jul-15 10:57 No Jobs running. ==== Terminated Jobs: JobId Level Files Bytes Status Finished Name ====================================================================== 112616 Full 83,474 841.4 M Error 28-Jul-15 09:59 centos-client 112617 Full 44,966 430.8 M Error 28-Jul-15 10:05 centos-client 112623 Full 127,076 1.048 G Error 28-Jul-15 10:28 centos-client 112624 Full 54,257 572.9 M Error 28-Jul-15 11:08 centos-client 112625 Full 79,922 829.4 M Error 28-Jul-15 11:35 centos-client 4 Full 0 0 Error 28-Jul-15 16:00 centos-client 5 Full 34,339 351.6 M Error 28-Jul-15 16:06 centos-client 6 Full 65,083 742.3 M Error 28-Jul-15 16:28 centos-client 7 Full 59,222 674.6 M Error 28-Jul-15 16:42 centos-client 69 Full 76,035 813.4 M Error 29-Jul-15 10:43 centos-client ==== * (new jobid because of re-installation of bareos) *est Select Job resource (1-63): 12 Using Catalog "MyCatalog" Connecting to Client centos-client-fd at centos-client.backup-domain:9102 2000 OK estimate files=128,588 bytes=3,221,471,374 * | ||||
Tags | No tags attached. | ||||
I have very similiar situation. my file daemon is on Gentoo Version: 14.2.1 my storage and director are on Ubuntu Version: 15.2.0 They are placed on one network (so nothing can break the connection) so when I try to run backup i get this on FD mail-fd: backup.c:989-0 Send data to SD len=59174 mail-fd: backup.c:989-0 Send data to SD len=59232 mail-fd: backup.c:989-0 Send data to SD len=59547 mail-fd: heartbeat.c:98-0 wait_intr=0 stop=0 mail-fd: heartbeat.c:98-0 wait_intr=0 stop=0 mail-fd: heartbeat.c:98-0 wait_intr=0 stop=0 and on SD i get bareos-sd (50): askdir.c:177-1769 <dird 1000 OK VolName=lab00 VolJobs=3 VolFiles=0 VolBlocks=0 VolBytes=146958542 VolMounts=3 VolErrors=0 VolWrites=1 MaxVolBytes=268435456000 VolCapacityBytes=0 VolStatus=Append Slot=0 MaxVolJobs=0 MaxVolFiles=0 InChanger=0 VolReadTime=0 VolWriteTime=27 EndFile=0 EndBlock=0 LabelType=0 MediaId=144 EncryptionKey= MinBlocksize=0 MaxBlocksize=0 bareos-sd (50): askdir.c:213-1769 do_get_volume_info return true slot=0 Volume=lab00, VolminBlocksize=0 VolMaxBlocksize=0 bareos-sd (50): askdir.c:215-1769 setting dcr->VolMinBlocksize(0) to vol.VolMinBlocksize(0) bareos-sd (50): askdir.c:217-1769 setting dcr->VolMaxBlocksize(0) to vol.VolMaxBlocksize(0) bareos-sd (150): reserve.c:134-1769 Dec reserve=0 dev="device-synology" (/mnt/data/backup/storage) bareos-sd (50): append.c:89-1769 Begin append device="device-synology" (/mnt/data/backup/storage) bareos-sd (100): append.c:100-1769 Just after acquire_device_for_append bareos-sd (130): label.c:803-1769 session_label record=980053c0 bareos-sd (150): label.c:848-1769 Write sesson_label record JobId=1769 FI=SOS_LABEL SessId=1 Strm=1769 len=153 remainder=0 bareos-sd (150): label.c:852-1769 Leave write_session_label Block=146 958542d File=0d So at some moment FD looses connection to SD. IDK if I should open a new ticket or this is the same problem |
|
Probably an individual network problem. Please reopen, if it occurs on other clients, too. | |