View Issue Details

IDProjectCategoryView StatusLast Update
0001372bareos-coreGeneralpublic2022-08-10 11:00
ReporterInt Assigned To 
PriorityhighSeveritymajorReproducibilityalways
Status newResolutionopen 
PlatformLinuxOSCentOSOS Version7
Product Version19.2.10 
Summary0001372: Restore/Migrate of job distributed on two tapes fails with ERR=Cannot allocate memory
DescriptionWhen I try to Restore/Migrate a specific job distributed on two tapes, the first tape is read successfully but when mounting the second tape the job fails with error

```
20-Jul 08:02 bareos-sd JobId 21833: Ready to read from volume "NIX417L6" on device "tapedrive-0" (/dev/tape/by-id/scsi-350050763121a063c-nst).
20-Jul 08:02 bareos-sd JobId 21833: Forward spacing Volume "NIX417L6" to file:block 0:1.
20-Jul 08:02 bareos-sd JobId 21833: Error: stored/block.cc:1057 Read error on fd=7 at file:blk 0:1 on device "tapedrive-0" (/dev/tape/by-id/scsi-350050763121a063c-nst). ERR=Cannot allocate memory.
```

This only happens on this specific tape. Other jobs distributed on multiple tapes work correctly.

I did configure the Bareos Tape device /etc/bareos/bareos-sd.d/device/tapedrive-0.conf with "Maximum Block Size = 1048576" from the very beginning and all tapes were written using this configuration.
I started using Bareos after Version 14.2.0 so the new block size handling (label block 64k, data blocks 1M) was used on all tapes.

But the error suggests that there is a problem with the block size.
This is seconded by /var/log/messages
```
Jul 20 08:02:14 igms07 kernel: st 3:0:0:0: [st0] Failed to read 1048576 byte block with 64512 byte transfer.
```

So it seems that Bareos is trying to read with 64k block size although it is configured otherwise?


Further tests showed that I can read the tape label correctly with btape:

```
]$ btape /dev/tape/by-id/scsi-350050763121a063c-nst
Tape block granularity is 1024 bytes.
btape: stored/butil.cc:290-0 Using device: "/dev/tape/by-id/scsi-350050763121a063c-nst" for writing.
btape: stored/btape.cc:485-0 open device "tapedrive-0" (/dev/tape/by-id/scsi-350050763121a063c-nst): OK
*status
 Bareos status: file=0 block=0
 Device status: BOT ONLINE IM_REP_EN file=0 block=0
Device status: TAPE BOT ONLINE IMMREPORT. ERR=
*readlabel
btape: stored/btape.cc:532-0 Volume label read correctly.

Volume Label:
Id : Bareos 2.0 immortal
VerNo : 20
VolName : NIX417L6
PrevVolName :
VolFile : 0
LabelType : VOL_LABEL
LabelSize : 167
PoolName : Scratch
MediaType : LTO
PoolType : Backup
HostName : igms07.vision.local
Date label written: 05-Nov-2018 11:42
```


I can read the tape with dd with a blocksize of 1M:
dd if=/dev/tape/by-id/scsi-350050763121a063c-nst bs=1M


btape scan fails with the same error as Bareos:

```
]$ btape /dev/tape/by-id/scsi-350050763121a063c-nst
Tape block granularity is 1024 bytes.
btape: stored/butil.cc:290-0 Using device: "/dev/tape/by-id/scsi-350050763121a063c-nst" for writing.
btape: stored/btape.cc:485-0 open device "tapedrive-0" (/dev/tape/by-id/scsi-350050763121a063c-nst): OK
*rewind
btape: stored/btape.cc:581-0 Rewound "tapedrive-0" (/dev/tape/by-id/scsi-350050763121a063c-nst)
*scan
btape: stored/btape.cc:1901-0 Starting scan at file 0
btape: stored/btape.cc:1909-0 Bad status from read -1. ERR=stored/btape.cc:1907 read error on /dev/tape/by-id/scsi-350050763121a063c-nst. ERR=Cannot allocate memory.
```

but btape scanblocks works fine:

```
]$ btape /dev/tape/by-id/scsi-350050763121a063c-nst
Tape block granularity is 1024 bytes.
btape: stored/butil.cc:290-0 Using device: "/dev/tape/by-id/scsi-350050763121a063c-nst" for writing.
btape: stored/btape.cc:485-0 open device "tapedrive-0" (/dev/tape/by-id/scsi-350050763121a063c-nst): OK
*rewind
btape: stored/btape.cc:581-0 Rewound "tapedrive-0" (/dev/tape/by-id/scsi-350050763121a063c-nst)
*scanblocks
1 block of 203 bytes in file 0
2213 blocks of 1048576 bytes in file 0
1 block of 1048566 bytes in file 0
1072 blocks of 1048576 bytes in file 0
1 block of 1048568 bytes in file 0
```


The question is, what is the issue here?
And how do I fix it?
TagsNo tags attached.
bareos-master: impact
bareos-master: action
bareos-19.2: impact
bareos-19.2: action
bareos-18.2: impact
bareos-18.2: action
bareos-17.2: impact
bareos-17.2: action
bareos-16.2: impact
bareos-16.2: action
bareos-15.2: impact
bareos-15.2: action
bareos-14.2: impact
bareos-14.2: action
bareos-13.2: impact
bareos-13.2: action
bareos-12.4: impact
bareos-12.4: action

Activities

Int

Int

2021-07-20 10:02

reporter   ~0004183

PS: my device configuration:

Device {

    Name = "tapedrive-0"
    DeviceType = tape
    Maximum Concurrent Jobs = 1

    # default:0, only required if the autoloader have multiple drives.
    DriveIndex = 0

    ArchiveDevice = /dev/tape/by-id/scsi-350050763121a063c-nst # Quantum LTO-7 standalone drive

    MediaType = LTO

    AutoChanger = no # default: no
    AutomaticMount = yes # default: no

    MaximumFileSize = 40GB
    MaximumBlockSize = 1048576

    Maximum Spool Size = 40GB
    Spool Directory = /var/lib/bareos/spool
}
Int

Int

2021-07-31 09:13

reporter   ~0004195

PPS:
When I read other jobs that are also on this tape, the restore/migration works just fine.
aleigh

aleigh

2022-08-04 22:28

reporter   ~0004714

I am having this same problem on a restore which spans 2 tapes. The file is particularly large, so probably the case here is that a single file spans two tapes.

bareos-dir Version: 21.0.0 (21 December 2021) Rocky Linux release 8.4 (Green Obsidian) redhat Rocky Linux release 8.4 (Green Obsidian)

[358442.558093] st 9:0:40:0: [st0] Failed to read 262144 byte block with 64512 byte transfer.

bareos-sd JobId 81: Error: stored/block.cc:1004 Read error on fd=8 at file:blk 0:1 on device "tapedrive-0" (/dev/nst0). ERR=Cannot allocate memory.

See attached job log. Note that despite everything it says, the correct media (LC0004) was inserted and mounted at the correct time. Shortly after mounting LC0004 the job failed.
bruno-at-bareos

bruno-at-bareos

2022-08-10 11:00

developer   ~0004721

Hello aleigh.

We are interested by getting more debugging information, to be able to reproduce 100% of time the problem (if we can, then getting a fix is almost certain).

So if you are able to rerun that restore, would you be able to run it with a higher debug level ?
You can consult this page to understand how to do this.
https://servicedesk.bareos.com/help/en-us/2-support/2-how-do-i-set-the-debug-level

here a debug level of 1000 would be nice.
director, storage, and concerned client should be activated.

We would like to have attached here in text form not picture.
mainly the following command on bconsole

list joblog jobid=81

(or the new jobid of restore try with debug) and also the initial backup jobid.

The whole configuration and logs are a plus do debug the situation.
Maybe our support-info tool can be used and run on director
see instruction here
https://servicedesk.bareos.com/help/en-us/2-support/3-how-to-use-the-bareos-support-info-sh-tool

You can attach the result if not too big, as private (all credential being already removed by the tool)

Issue History

Date Modified Username Field Change
2021-07-20 09:27 Int New Issue
2021-07-20 10:02 Int Note Added: 0004183
2021-07-31 09:13 Int Note Added: 0004195
2022-08-04 22:28 aleigh Note Added: 0004714
2022-08-04 22:28 aleigh File Added: Screen Shot 2022-08-04 at 1.27.33 PM.png
2022-08-10 11:00 bruno-at-bareos Note Added: 0004721