View Issue Details

ID: 0000457
Project: bareos-core
Category: storage daemon
View Status: public
Last Update: 2015-05-11 22:15
Reporter: systemike
Assigned To:
Priority: normal
Severity: crash
Reproducibility: always
Status: closed
Resolution: no change required
Platform: Linux
OS: CentOS
OS Version: 7
Summary: 0000457: NDMP BACKUP HANG
Description:
Hello,
I'm trying to back up an NFS export hosted by an EMC VNX via NDMP.
The backup starts and hangs after a few seconds.

Below is the log:



Timestamp Log Job Message
2015-04-14 23:54:21 47139 33 bareos-sd JobId 33: Elapsed time=00:00:53, Transfer rate=15.54 M Bytes/second
2015-04-14 23:54:01 47138 33 bareos-dir JobId 33: Operation ended questionably
2015-04-14 23:54:01 47137 33 bareos-dir JobId 33: Operation halted, stopping
2015-04-14 23:54:01 47136 33 bareos-dir JobId 33: Closing tape drive BJGH-JPPG-ABKD-AHLI-LODJ-NBMJ-LDBL-FPIP@/pathnfs
2015-04-14 23:54:01 47135 33 bareos-dir JobId 33: Commanding tape drive to rewind
2015-04-14 23:54:01 47134 33 bareos-dir JobId 33: Commanding tape drive to NDMP9_MTIO_EOF 2 times
2015-04-14 23:54:01 47133 33 bareos-dir JobId 33: Waiting for operation to halt
2015-04-14 23:54:01 47132 33 bareos-dir JobId 33: Operation done, cleaning up
2015-04-14 23:54:01 47131 33 bareos-dir JobId 33: DATA: bytes 0KB MOVER: written 804573KB record 12770
2015-04-14 23:54:01 47130 33 bareos-dir JobId 33: Error: LOG_MESSAGE: 'Backup is aborted.'
2015-04-14 23:54:01 47129 33 bareos-dir JobId 33: LOG_MESSAGE: 'server_archive: emctar vol 1, 8132 files, 0 bytes read, 823853056 bytes written'
2015-04-14 23:54:01 47128 33 bareos-dir JobId 33: Warning: LOG_MESSAGE: 'Backup was aborted'
2015-04-14 23:53:58 47127 33 bareos-dir JobId 33: DATA: bytes 679296KB MOVER: written 679077KB record 10778
2015-04-14 23:53:28 47126 33 bareos-dir JobId 33: Monitoring backup
2015-04-14 23:53:28 47125 33 bareos-dir JobId 33: Operation started
2015-04-14 23:53:28 47124 33 bareos-dir JobId 33: Async request NDMP4_LOG_MESSAGE
2015-04-14 23:53:28 47123 33 bareos-dir JobId 33: Waiting for operation to start
2015-04-14 23:53:28 47122 33 bareos-dir JobId 33: Commanding tape drive to rewind
2015-04-14 23:53:28 47121 33 bareos-sd JobId 33: Wrote label to prelabeled Volume "ndmp-0004" on device "NDMPStorage-device-3" (/BACKUP/PAIRED)
2015-04-14 23:53:28 47120 33 bareos-sd JobId 33: Labeled new Volume "ndmp-0004" on device "NDMPStorage-device-3" (/BACKUP/PAIRED).
2015-04-14 23:53:28 47119 33 bareos-dir JobId 33: Opening tape drive BJGH-JPPG-ABKD-AHLI-LODJ-NBMJ-LDBL-FPIP@/pathnfs read/write
2015-04-14 23:53:28 47118 33 bareos-dir JobId 33: Async request NDMP4_LOG_MESSAGE
2015-04-14 23:53:28 47117 33 bareos-dir JobId 33: Using Device "NDMPStorage-device-3" to write.
2015-04-14 23:52:16 47116 33 bareos-dir JobId 33: Start NDMP Backup JobId 33, Job=VNXNDMPBackup.2015-04-14_23.52.10_12
2015-04-14 23:52:10 47113 33 bareos-dir JobId 33: No prior or suitable Full backup found in catalog. Doing FULL backup.
2015-04-14 23:52:10 47112 33 bareos-dir JobId 33: No prior Full backup Job record found.


bconsole output status dir:

Running Jobs:
Console connected at 14-Apr-15 23:47
 JobId Level Name Status
======================================================================
    33 Full VNXNDMPBackup.2015-04-14_23.52.10_12 has erred
====

Additional Information:
OS version: CentOS Linux release 7.1.1503 (Core)
Bareos packages:
bareos-webui-14.2.0.git.1428491022-63.1.el7.noarch
bareos-database-common-14.2.2-46.1.el7.x86_64
bareos-filedaemon-14.2.2-46.1.el7.x86_64
bareos-database-mysql-14.2.2-46.1.el7.x86_64
bareos-database-tools-14.2.2-46.1.el7.x86_64
bareos-storage-14.2.2-46.1.el7.x86_64
bareos-bconsole-14.2.2-46.1.el7.x86_64
bareos-14.2.2-46.1.el7.x86_64
bareos-common-14.2.2-46.1.el7.x86_64
bareos-director-14.2.2-46.1.el7.x86_64
bareos-client-14.2.2-46.1.el7.x86_64
Tags: No tags attached.

Relationships

duplicate of 0000330 closed NDMP protocol error 

Activities

systemike

2015-04-15 04:58

reporter   ~0001689

After enabling NDMP debugging:

fatal error: NDMP protocol error, FHDB add_node request for unknown node
systemike

2015-04-17 10:50

reporter   ~0001693

Hi Marco,
thanks for your answer.
I have enabled the debugging and snooping directives. The log is attached.

Can you take a look at this and check where the problem is?
You told me: "NDMP library which seems to have some problem resolving the hostname".
Did you mean that the NDMP Datamover has trouble resolving its hostname, or the Bareos Director/Storage Daemon?


Thanks and Regards
systemike

2015-04-17 12:30

reporter   ~0001694

Hello Marco,
the problem seems to be related to the number of objects.

I tried to back up a small subfolder (13 MB) and the backup runs fine.
The folder I previously tried to back up is 4 GB in size and has more than 100,000 files.
My goal is to back up one filesystem of 2.5 TB with more than 4 million files.
Are there any improvements or configuration changes I need to make in order to back up this filesystem via NDMP?


  Build OS: x86_64-redhat-linux-gnu redhat CentOS Linux release 7.0.1406 (Core)
  JobId: 61
  Job: VNXNDMPBackup.2015-04-17_12.24.14_04
  Backup Level: Full (upgraded from Incremental)
  Client: "server_2"
  FileSet: "NDMP Fileset" 2015-04-17 12:24:14
  Pool: "NDMPStoragePool" (From Job resource)
  Catalog: "MyCatalog" (From Client resource)
  Storage: "NDMPStorage-1" (From Pool resource)
  Scheduled time: 17-Apr-2015 12:24:13
  Start time: 17-Apr-2015 12:24:16
  End time: 17-Apr-2015 12:24:22
  Elapsed time: 6 secs
  Priority: 10
  NDMP Files Written: 539
  SD Files Written: 1
  NDMP Bytes Written: 0 (0 B)
  SD Bytes Written: 10,644,619 (10.64 MB)
  Rate: 0.0 KB/s
  Volume name(s): ndmp-0007
  Volume Session Id: 2
  Volume Session Time: 1429264970
  Last Volume Bytes: 12,475,661,851 (12.47 GB)
  Termination: Backup OK
systemike

2015-04-17 13:08

reporter   ~0001695

Addition:
every time the job finishes in status "has erred" (e.g. 62 Full VNXNDMPBackup.2015-04-17_12.32.48_04 has erred), I need to restart the SD because cancelling the job doesn't work.


Regards
Michele
mvwieringen

2015-04-21 17:42

developer   ~0001698

I closed this bug as it's the same as bug 330. It also has little to do
with the amount of data saved and more to do with the fact that some NDMP
implementations, when sending the NDMP FH DATA records, send things out of
order, e.g. the elements of a directory before the actual directory. The way
things are currently implemented, this means that it will abort the backup,
as it cannot handle this.

For bug 330 there is a patch in 14.2 and master that allows you to set
"savefilehistory" to "false" in the job definition, which should work around
this problem. This only means that single file restore won't work, but with
the mentioned number of files it is questionable whether that would work well
anyway.
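
The ordering problem described in this note can be illustrated with a minimal Python sketch. This is not Bareos code: the function names, the `(node_id, parent_id)` entry format and the `ROOT` id are hypothetical and chosen only for illustration. The strict builder fails as soon as an entry refers to a directory it has not seen yet, which mirrors the "add_node request for unknown node" error reported earlier. The deferred builder shows one possible way to tolerate such ordering; nothing in this thread says Bareos works that way.

```python
ROOT = 1  # hypothetical id of the backup root directory


class UnknownNodeError(Exception):
    """Raised when an entry refers to a parent node that is not known."""


def build_tree_strict(entries):
    """Build {node_id: parent_id}; fail on the first entry with an unknown parent."""
    tree = {ROOT: None}
    for node_id, parent_id in entries:
        if parent_id not in tree:
            raise UnknownNodeError(f"add_node request for unknown node {parent_id}")
        tree[node_id] = parent_id
    return tree


def build_tree_deferred(entries):
    """Like the strict builder, but retry entries whose parent arrives later."""
    tree = {ROOT: None}
    pending = list(entries)
    while pending:
        resolved = [(n, p) for n, p in pending if p in tree]
        if not resolved:
            # No remaining entry can be attached: the parent never arrived.
            raise UnknownNodeError(
                f"add_node request for unknown node {pending[0][1]}"
            )
        pending = [(n, p) for n, p in pending if p not in tree]
        for node_id, parent_id in resolved:
            tree[node_id] = parent_id
    return tree


# A directory (node 2) containing one file (node 3), sent in two different orders.
in_order = [(2, ROOT), (3, 2)]
out_of_order = [(3, 2), (2, ROOT)]

print(build_tree_strict(in_order))
print(build_tree_deferred(out_of_order))
try:
    build_tree_strict(out_of_order)
except UnknownNodeError as exc:
    print("strict builder aborts:", exc)
```

Disabling file history, as suggested in the note, sidesteps the problem entirely because no such tree has to be built, at the cost of single file restore.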
systemike

2015-04-23 15:02

reporter   ~0001699

Hi Marco,
when I last wrote to you, I had already tested the savefilehistory option without success.
I found that the problem was caused by a file-level replication process from one EMC array to another EMC array (the one I'm reading the data from via NDMP).
I have now successfully backed up the data.
I have a couple of questions:

1) Is it possible to restore the NDMP backup to the file daemon as a normal file backup? I have tried saving in native format, dump, tar, etc., but when I restored the data on the FD I found that it is not possible to browse files or directories.
2) I have successfully backed up 71 GB in 22 minutes, which is not bad, but is there some fine tuning that can improve the backup speed? NDMP I/O size or anything else?
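
For reference, the figures in this thread can be turned into average rates with a short Python sketch. The helper name `throughput_mb_per_s` is hypothetical, and decimal units (1 MB = 10^6 bytes, 1 GB = 10^9 bytes) are an assumption; the same calculation applied to the aborted job 33 (823853056 bytes in 53 seconds) gives the 15.54 MB/s shown in its log, which suggests the storage daemon reports decimal megabytes.

```python
def throughput_mb_per_s(total_bytes, seconds):
    """Average transfer rate in decimal megabytes per second."""
    return total_bytes / seconds / 1_000_000


# 71 GB in 22 minutes, assuming decimal gigabytes.
successful_run = throughput_mb_per_s(71 * 10**9, 22 * 60)

# Aborted job 33: bytes written and elapsed time taken from the log above.
aborted_run = throughput_mb_per_s(823_853_056, 53)

print(f"successful run: {successful_run:.1f} MB/s")
print(f"aborted job 33: {aborted_run:.2f} MB/s")
```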
systemike

2015-05-07 10:16

reporter   ~0001711

Hi Marco,
you can close the ticket.

Thanks and Regards
mvwieringen

2015-05-11 22:14

developer   ~0001717

NDMP data is saved as opaque data, so it can only be restored by an NDMP data
agent. As such, the content cannot be restored in any way by a native backup
file daemon.

The only real thing you can do is bextract the actual NDMP stream and use that
as input on the NDMP box. People have done that when restoring their NetApp
SM_TAPE NDMP backups.

As to speed, it may help if you spool data, but that is mostly for writing
directly to tape. Other than that, it depends on the network, on the
filesystem you write your backup to or the tape speed, and of course on the
speed at which the source system provides the data. As to record size, that
is something you have to test; it depends, as always.

Issue History

Date Modified Username Field Change
2015-04-15 00:01 systemike New Issue
2015-04-15 04:58 systemike Note Added: 0001689
2015-04-16 22:39 mvwieringen Relationship added duplicate of 0000330
2015-04-16 22:40 mvwieringen Status new => closed
2015-04-16 22:40 mvwieringen Resolution open => duplicate
2015-04-17 10:50 systemike Note Added: 0001693
2015-04-17 10:50 systemike Status closed => feedback
2015-04-17 10:50 systemike Resolution duplicate => reopened
2015-04-17 10:53 systemike File Added: bareos-trace_ndmp_17042015.txt.zip
2015-04-17 11:11 systemike File Added: bareos-trace_ndmp_2_17042015.txt.zip
2015-04-17 12:30 systemike Note Added: 0001694
2015-04-17 12:30 systemike Status feedback => new
2015-04-17 13:08 systemike Note Added: 0001695
2015-04-21 17:42 mvwieringen Note Added: 0001698
2015-04-21 17:42 mvwieringen Status new => feedback
2015-04-23 15:02 systemike Note Added: 0001699
2015-04-23 15:02 systemike Status feedback => new
2015-05-07 10:16 systemike Note Added: 0001711
2015-05-11 22:14 mvwieringen Note Added: 0001717
2015-05-11 22:14 mvwieringen File Deleted: bareos-trace_ndmp_17042015.txt.zip
2015-05-11 22:14 mvwieringen File Deleted: bareos-trace_ndmp_2_17042015.txt.zip
2015-05-11 22:15 mvwieringen Status new => closed
2015-05-11 22:15 mvwieringen Resolution reopened => no change required