View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0000457 | bareos-core | storage daemon | public | 2015-04-15 00:01 | 2015-05-11 22:15 |
Reporter | systemike | Assigned To | |||
Priority | normal | Severity | crash | Reproducibility | always |
Status | closed | Resolution | no change required | ||
Platform | Linux | OS | CentOS | OS Version | 7 |
Summary | 0000457: NDMP BACKUP HANG | ||||
Description | Hello, I'm trying to back up an NFS export hosted by an EMC VNX via NDMP. The backup starts and after a few seconds it hangs. The log is below:
Timestamp Log Job Message
2015-04-14 23:54:21 47139 33 bareos-sd JobId 33: Elapsed time=00:00:53, Transfer rate=15.54 M Bytes/second
2015-04-14 23:54:01 47138 33 bareos-dir JobId 33: Operation ended questionably
2015-04-14 23:54:01 47137 33 bareos-dir JobId 33: Operation halted, stopping
2015-04-14 23:54:01 47136 33 bareos-dir JobId 33: Closing tape drive BJGH-JPPG-ABKD-AHLI-LODJ-NBMJ-LDBL-FPIP@/pathnfs
2015-04-14 23:54:01 47135 33 bareos-dir JobId 33: Commanding tape drive to rewind
2015-04-14 23:54:01 47134 33 bareos-dir JobId 33: Commanding tape drive to NDMP9_MTIO_EOF 2 times
2015-04-14 23:54:01 47133 33 bareos-dir JobId 33: Waiting for operation to halt
2015-04-14 23:54:01 47132 33 bareos-dir JobId 33: Operation done, cleaning up
2015-04-14 23:54:01 47131 33 bareos-dir JobId 33: DATA: bytes 0KB MOVER: written 804573KB record 12770
2015-04-14 23:54:01 47130 33 bareos-dir JobId 33: Error: LOG_MESSAGE: 'Backup is aborted.'
2015-04-14 23:54:01 47129 33 bareos-dir JobId 33: LOG_MESSAGE: 'server_archive: emctar vol 1, 8132 files, 0 bytes read, 823853056 bytes written'
2015-04-14 23:54:01 47128 33 bareos-dir JobId 33: Warning: LOG_MESSAGE: 'Backup was aborted'
2015-04-14 23:53:58 47127 33 bareos-dir JobId 33: DATA: bytes 679296KB MOVER: written 679077KB record 10778
2015-04-14 23:53:28 47126 33 bareos-dir JobId 33: Monitoring backup
2015-04-14 23:53:28 47125 33 bareos-dir JobId 33: Operation started
2015-04-14 23:53:28 47124 33 bareos-dir JobId 33: Async request NDMP4_LOG_MESSAGE
2015-04-14 23:53:28 47123 33 bareos-dir JobId 33: Waiting for operation to start
2015-04-14 23:53:28 47122 33 bareos-dir JobId 33: Commanding tape drive to rewind
2015-04-14 23:53:28 47121 33 bareos-sd JobId 33: Wrote label to prelabeled Volume "ndmp-0004" on device "NDMPStorage-device-3" (/BACKUP/PAIRED)
2015-04-14 23:53:28 47120 33 bareos-sd JobId 33: Labeled new Volume "ndmp-0004" on device "NDMPStorage-device-3" (/BACKUP/PAIRED).
2015-04-14 23:53:28 47119 33 bareos-dir JobId 33: Opening tape drive BJGH-JPPG-ABKD-AHLI-LODJ-NBMJ-LDBL-FPIP@/pathnfs read/write
2015-04-14 23:53:28 47118 33 bareos-dir JobId 33: Async request NDMP4_LOG_MESSAGE
2015-04-14 23:53:28 47117 33 bareos-dir JobId 33: Using Device "NDMPStorage-device-3" to write.
2015-04-14 23:52:16 47116 33 bareos-dir JobId 33: Start NDMP Backup JobId 33, Job=VNXNDMPBackup.2015-04-14_23.52.10_12
2015-04-14 23:52:10 47113 33 bareos-dir JobId 33: No prior or suitable Full backup found in catalog. Doing FULL backup.
2015-04-14 23:52:10 47112 33 bareos-dir JobId 33: No prior Full backup Job record found.
bconsole output of status dir:
Running Jobs:
Console connected at 14-Apr-15 23:47
JobId Level Name Status
======================================================================
33 Full VNXNDMPBackup.2015-04-14_23.52.10_12 has erred
==== | ||||
Additional Information | OS version: CentOS Linux release 7.1.1503 (Core)
Bareos packages:
bareos-webui-14.2.0.git.1428491022-63.1.el7.noarch
bareos-database-common-14.2.2-46.1.el7.x86_64
bareos-filedaemon-14.2.2-46.1.el7.x86_64
bareos-database-mysql-14.2.2-46.1.el7.x86_64
bareos-database-tools-14.2.2-46.1.el7.x86_64
bareos-storage-14.2.2-46.1.el7.x86_64
bareos-bconsole-14.2.2-46.1.el7.x86_64
bareos-14.2.2-46.1.el7.x86_64
bareos-common-14.2.2-46.1.el7.x86_64
bareos-director-14.2.2-46.1.el7.x86_64
bareos-client-14.2.2-46.1.el7.x86_64 | ||||
Tags | No tags attached. | ||||
duplicate of | 0000330 | closed | NDMP protocol error |
enabling NDMP debugging fatal error: NDMP protocol error, FHDB add_node request for unknown node |
|
Hi Marco, thanks for your answer. I have enabled the debugging and snooping directives; the log is attached. Can you take a look and check where the problem is? You told me "NDMP library which seems to have some problem resolving the hostname": did you mean that the NDMP Datamover has trouble resolving its hostname, or the Bareos Director/Storage Daemon? Thanks and regards |
|
Hello Marco, the problem seems to be related to the number of objects. I tried backing up a small subfolder (13 MB) and that backup runs fine. The folder I tried to back up previously is 4 GB and contains more than 100,000 files. My goal is to back up one filesystem of 2.5 TB with more than 4 million files. Are there any settings I need to change in order to back up this filesystem via NDMP?
Build OS: x86_64-redhat-linux-gnu redhat CentOS Linux release 7.0.1406 (Core)
JobId: 61
Job: VNXNDMPBackup.2015-04-17_12.24.14_04
Backup Level: Full (upgraded from Incremental)
Client: "server_2"
FileSet: "NDMP Fileset" 2015-04-17 12:24:14
Pool: "NDMPStoragePool" (From Job resource)
Catalog: "MyCatalog" (From Client resource)
Storage: "NDMPStorage-1" (From Pool resource)
Scheduled time: 17-Apr-2015 12:24:13
Start time: 17-Apr-2015 12:24:16
End time: 17-Apr-2015 12:24:22
Elapsed time: 6 secs
Priority: 10
NDMP Files Written: 539
SD Files Written: 1
NDMP Bytes Written: 0 (0 B)
SD Bytes Written: 10,644,619 (10.64 MB)
Rate: 0.0 KB/s
Volume name(s): ndmp-0007
Volume Session Id: 2
Volume Session Time: 1429264970
Last Volume Bytes: 12,475,661,851 (12.47 GB)
Termination: Backup OK |
|
Addendum: every time the job finishes with status "has erred" (e.g. 62 Full VNXNDMPBackup.2015-04-17_12.32.48_04 has erred), I need to restart the SD because cancelling the job doesn't work. Regards, Michele |
|
I closed this bug as it is the same as bug 330. It also has little to do with the amount of data saved; it has more to do with the fact that some NDMP implementations send the NDMP FH DATA records out of order, e.g. the elements of a directory before the directory itself. The way things are currently implemented, the backup is aborted because this cannot be handled. For bug 330 there is a patch in 14.2 and master that allows you to set "savefilehistory" to "false" in the job definition, which should work around this problem. The only drawback is that single-file restore won't work, but with the mentioned number of files it is questionable whether that would work well anyway. |
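To illustrate the ordering problem described in this note, here is a minimal Python sketch. It is not Bareos code: the class names `StrictTree` and `BufferingTree`, the `add_node(node_id, parent_id)` signature, and the sample records are all made up for illustration. The strict variant rejects an entry whose parent directory has not been seen yet, which is roughly the failure mode reported here; the buffering variant shows one possible way a consumer could tolerate out-of-order records. The workaround actually discussed in this thread is disabling file history, not this approach.

```python
class UnknownParentError(Exception):
    """Raised when an entry references a directory that has not been seen yet."""


class StrictTree:
    """Hypothetical file-history tree that requires parents to arrive first."""

    def __init__(self, root_id):
        self.parent_of = {root_id: None}

    def add_node(self, node_id, parent_id):
        if parent_id not in self.parent_of:
            raise UnknownParentError(
                f"add_node request for unknown node {parent_id}"
            )
        self.parent_of[node_id] = parent_id


class BufferingTree(StrictTree):
    """Hypothetical variant that parks orphaned entries until the parent arrives."""

    def __init__(self, root_id):
        super().__init__(root_id)
        self.pending = {}  # parent_id -> list of child ids waiting for it

    def add_node(self, node_id, parent_id):
        if parent_id not in self.parent_of:
            # Parent not seen yet: remember the child instead of failing.
            self.pending.setdefault(parent_id, []).append(node_id)
            return
        self.parent_of[node_id] = parent_id
        # Attach any children that were waiting for this node.
        for child in self.pending.pop(node_id, []):
            self.add_node(child, node_id)


# Made-up records: node 3 (a file) is announced before its directory, node 2.
records = [(3, 2), (2, 1)]

tolerant = BufferingTree(root_id=1)
for node_id, parent_id in records:
    tolerant.add_node(node_id, parent_id)
print(tolerant.parent_of)
```

Feeding the same records to `StrictTree` raises `UnknownParentError` on the first record, because directory 2 has not been announced yet.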
|
Hi Marco, when I last wrote to you, I had already tested the savefilehistory option, without success. I found that the problem was caused by a file-level replication process from one EMC array to another EMC array (the one I'm reading the data from via NDMP). I have now successfully backed up the data. I have a couple of questions: 1) Is it possible to restore the NDMP backup to the file daemon like a normal file backup? I have tried saving in native format, dump, tar, etc., but when I restored the data on the FD I found that it is not possible to browse files or directories. 2) I backed up 71 GB in 22 minutes, which is not bad, but is there any fine tuning that can improve the backup speed? NDMP I/O size or anything else? |
|
Hi Marco, you can close the ticket. Thanks and Regards |
|
NDMP data is saved as opaque data, so it can only be restored by an NDMP data agent. As such, the content cannot be restored in any way by a native backup file daemon. The only real thing you can do is bextract the actual NDMP stream and use that as input on the NDMP box. People have done that to restore their NetApp SM_TAPE NDMP backups. As to speed, it may help to spool data, but that mostly matters when writing directly to tape. Other than that, it depends on the network, on the filesystem or tape you write your backup to, and of course on the speed at which the source system provides the data. As to record size, that is something you have to test; as always, it depends. |
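For reference, the figures quoted earlier in this thread (71 GB in 22 minutes, and a target filesystem of 2.5 TB) can be turned into a rough throughput and duration estimate. The sketch below is only back-of-the-envelope arithmetic: it assumes decimal units (1 GB = 10^9 bytes) and a constant rate, neither of which is confirmed in the thread, and the function names are made up.

```python
def throughput_mb_per_s(total_bytes, seconds):
    """Average rate in decimal megabytes per second."""
    return total_bytes / seconds / 1e6


def hours_needed(total_bytes, mb_per_s):
    """Time to move total_bytes at a constant rate, in hours."""
    return total_bytes / (mb_per_s * 1e6) / 3600


# Figures from the thread, interpreted as decimal units (an assumption).
rate = throughput_mb_per_s(71e9, 22 * 60)
eta = hours_needed(2.5e12, rate)
print(f"{rate:.1f} MB/s, about {eta:.1f} h for 2.5 TB")
```

Under those assumptions this comes to roughly 54 MB/s and about 13 hours for the full filesystem; real runs will vary with the factors listed in the note above.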
|
Date Modified | Username | Field | Change |
---|---|---|---|
2015-04-15 00:01 | systemike | New Issue | |
2015-04-15 04:58 | systemike | Note Added: 0001689 | |
2015-04-16 22:39 | mvwieringen | Relationship added | duplicate of 0000330 |
2015-04-16 22:40 | mvwieringen | Status | new => closed |
2015-04-16 22:40 | mvwieringen | Resolution | open => duplicate |
2015-04-17 10:50 | systemike | Note Added: 0001693 | |
2015-04-17 10:50 | systemike | Status | closed => feedback |
2015-04-17 10:50 | systemike | Resolution | duplicate => reopened |
2015-04-17 10:53 | systemike | File Added: bareos-trace_ndmp_17042015.txt.zip | |
2015-04-17 11:11 | systemike | File Added: bareos-trace_ndmp_2_17042015.txt.zip | |
2015-04-17 12:30 | systemike | Note Added: 0001694 | |
2015-04-17 12:30 | systemike | Status | feedback => new |
2015-04-17 13:08 | systemike | Note Added: 0001695 | |
2015-04-21 17:42 | mvwieringen | Note Added: 0001698 | |
2015-04-21 17:42 | mvwieringen | Status | new => feedback |
2015-04-23 15:02 | systemike | Note Added: 0001699 | |
2015-04-23 15:02 | systemike | Status | feedback => new |
2015-05-07 10:16 | systemike | Note Added: 0001711 | |
2015-05-11 22:14 | mvwieringen | Note Added: 0001717 | |
2015-05-11 22:14 | mvwieringen | File Deleted: bareos-trace_ndmp_17042015.txt.zip | |
2015-05-11 22:14 | mvwieringen | File Deleted: bareos-trace_ndmp_2_17042015.txt.zip | |
2015-05-11 22:15 | mvwieringen | Status | new => closed |
2015-05-11 22:15 | mvwieringen | Resolution | reopened => no change required |