View Issue Details

ID: 0001414
Project: bareos-core
Category: director
View Status: public
Last Update: 2023-03-21 17:20
Reporter: embareossed
Assigned To: bruno-at-bareos
Priority: normal
Severity: minor
Reproducibility: random
Status: closed
Resolution: no change required
Platform: Linux
OS: Devuan (Debian)
OS Version: Chimaera (11)
Product Version: 21.0.0
Summary: 0001414: Some jobs not followed up by email
Description: On random days for random jobs, emails are not sent out indicating final disposition. The GUI dashboard shows that a job has completed, even successfully, but no email for the job arrives in my inbox.

The jobs I am referring to are kicked off on a schedule. All of the jobs run -- or at least attempt to run -- according to the schedule, and I normally receive an email for each job as they complete.

I've checked the mail log files, searching around the time the job with the missing email finished, but there is no record of an email being sent. The log does show, however, all the other emails that were sent successfully for the rest of the jobs with the times corresponding correctly to each respective job finishing.

(I experienced this problem on 20.0.0 also, just for comparison. bareos 18 sends me emails for each job, regardless of final disposition, failed or OK).
Steps To Reproduce: Set up several backup jobs, add them all to a schedule, and set them to run at some time.

Each job should send an email indicating when and how it finished.
Additional Information: I did not modify schedules or jobs for bareos 20 or 21. They were taken verbatim from my bareos 18 configuration, which works correctly.

Tags: No tags attached.

Activities

bruno-at-bareos

2021-12-28 09:32

manager   ~0004424

Hello, could you share your Messages and Daemon configuration, please?
Also check whether any XXXXX.mail files are left in the /var/lib/bareos/ directory.

Maybe your backup server occasionally sends too many messages and hits a rate limit on a server in the mail path?
Or the message is dropped by some software in the mail delivery chain that considers it junk, spam, or similar.
These are the only reasons I can think of, and they are of course not related to Bareos: I have received 100% of my Bareos mail for more than a decade.

Let's see whether your configuration shows anything that could hurt message delivery.
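
The check for leftover .mail files can be scripted. The following is a minimal Python sketch, not part of Bareos: the function name `leftover_mail_files` is hypothetical, and the only path taken from this thread is /var/lib/bareos. It prints each file's age, on the assumption that an old file is more likely to be stale than one belonging to a job still in progress.

```python
import time
from pathlib import Path


def leftover_mail_files(spool_dir, now=None):
    """Return (path, age_in_seconds) for every *.mail file in spool_dir, oldest first.

    `leftover_mail_files` is a hypothetical helper name, not a Bareos API.
    """
    now = time.time() if now is None else now
    root = Path(spool_dir)
    if not root.is_dir():
        # Directory missing or unreadable as a directory: nothing to report.
        return []
    found = [
        (path, now - path.stat().st_mtime)
        for path in root.glob("*.mail")
        if path.is_file()
    ]
    # Largest age first, so the most suspicious files are listed at the top.
    return sorted(found, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Working directory mentioned in this thread; adjust if yours differs.
    for path, age in leftover_mail_files("/var/lib/bareos"):
        print(f"{path.name}\t{age / 3600:.1f} h old")
```

Reading the directory may require running the script as root or as the bareos user.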
embareossed

2021-12-28 18:19

reporter   ~0004430

Yes, there are indeed many *.mail files left in the /var/lib/bareos directory!

Here are my messages *.conf files (not modified from bareos 18, but maybe need to be?):

Messages {
  Name = Daemon
  Description = "Message delivery for daemon messages (no job)."
  mailcommand = "/usr/bin/bsmtp -h localhost -f \"\(Bareos\) \<%r\>\" -s \"Bareos daemon message\" %r"
  mail = root = all, !skipped, !audit # (0000002)
  console = all, !skipped, !saved, !audit
  append = "/var/log/bareos/bareos.log" = all, !skipped, !audit
  append = "/var/log/bareos/bareos-audit.log" = audit
}
Messages {
  Name = Standard
  Description = "Reasonable message delivery -- send most everything to email address and to the console."
  operatorcommand = "/usr/bin/bsmtp -h localhost -f \"\(Bareos\) \<%r\>\" -s \"Bareos: Intervention needed for %j\" %r"
  mailcommand = "/usr/bin/bsmtp -h localhost -f \"\(Bareos\) \<%r\>\" -s \"Bareos: %t %e of %n %l\" %r"
  operator = root = mount # (0000003)
  mail = root = all, !skipped, !saved, !audit # (0000002)
  console = all, !skipped, !saved, !audit
  append = "/var/log/bareos/bareos.log" = all, !skipped, !saved, !audit
  catalog = all, !skipped, !saved, !audit
}

As far as my mail config, I am using postfix (and dovecot, to retrieve the messages on an admin host) and have not observed any errors in the mail logs. There are no additional mail systems (MTAs?) in the mail "path" -- it is simply one host, running director, storage, postfix, and dovecot and another host running thunderbird to read the messages. This same arrangement works in my bareos 18 setup, which is where I got most of my config files for bareos 21.
embareossed

2022-01-03 22:41

reporter   ~0004447

I probably need to remark that I have bareos 18 running on devuan ascii (debian 9) and bareos 21 on devuan chimaera (debian 11).

I suppose there could be differences in the mail systems between the two, but I have mostly implemented both mail systems from the defaults out-of-the-box. Since those *.mail files are sitting in /var/lib/bareos, I am assuming bareos director still has not sent them? Does it not retry after some period of time?
bruno-at-bareos

2022-01-04 11:45

manager   ~0004448

Nothing special in your config (you have already applied the %c and %n changes between 18 and newer versions).

Usually, leftover .mail files mean the job is still running.
If the job has terminated and the .mail files are still there, they are just orphans.
This can happen when the director crashes and is then restarted (depending on how the systemd unit is set up),
but a crash is normally visible in the machine logs and in systemctl status.
embareossed

2022-01-05 01:51

reporter   ~0004453

I had to change %c to %n due to a change in the behavior in bareos 21. But that is a superficial change anyway.

The jobs were not running, afaik. But if I notice missing emails again, I'll double check.

I am not aware of crashes -- these jobs run overnight, automatically, per the schedule object set up for them.

Devuan does not use systemd/systemctl, but I can still look at the system logs if/when this occurs again.

I have not observed missing emails for several days now.
embareossed

2022-01-16 23:15

reporter   ~0004477

Last edited: 2022-01-16 23:19

1 missing email today. This time, though, I do not see any *.mail messages in /var/lib/bareos.

I looked at the source to the mail messages that did make it to the other system, and mapped them to the mail.log messages on the bareos system. It appears that there was another message that did get processed by the mail server on the bareos system that does not map to any message received by my mail reader.
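
The comparison described above can be sketched in Python. This is only an illustration under assumptions: the two regular expressions assume the usual Postfix syslog line shapes (a `postfix/cleanup` line carrying `message-id=<...>` and a delivery line carrying `to=<...>` and `status=...`), they were not written against the actual mail.log from this system, and the function name `unseen_messages` is hypothetical. The set of received Message-IDs would have to be collected by hand from the message sources in the mail reader.

```python
import re

# Assumed Postfix log shapes (not copied from the logs in this report):
#   ... postfix/cleanup[123]: 4F1A2C0123: message-id=<abc@host>
#   ... postfix/local[456]: 4F1A2C0123: to=<root@host>, relay=local, ..., status=sent (...)
MSGID_RE = re.compile(r"postfix/cleanup\[\d+\]: ([0-9A-Za-z]+): message-id=(<[^>]*>)")
STATUS_RE = re.compile(r"postfix/\w+\[\d+\]: ([0-9A-Za-z]+): to=<[^>]*>.*\bstatus=(\w+)")


def unseen_messages(log_lines, received_ids):
    """Map each logged Message-ID that is not in received_ids to its last logged status."""
    msgid_by_queue = {}
    status_by_queue = {}
    for line in log_lines:
        match = MSGID_RE.search(line)
        if match:
            # Remember which Message-ID belongs to which queue ID.
            msgid_by_queue[match.group(1)] = match.group(2)
            continue
        match = STATUS_RE.search(line)
        if match:
            # Later delivery lines for the same queue ID overwrite earlier ones.
            status_by_queue[match.group(1)] = match.group(2)
    return {
        msgid: status_by_queue.get(queue_id, "unknown")
        for queue_id, msgid in msgid_by_queue.items()
        if msgid not in received_ids
    }
```

A message that the mail server logged but the mail reader never showed appears in the result together with its last logged status. A status of "unknown" means no delivery line was found for that queue ID.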

It seems likely that this is not a bareos problem after all. I will continue looking at this problem on my side for now. I'd say to close this issue for now.

bruno-at-bareos

2023-03-21 17:20

manager   ~0004917

This doesn't seem to be a Bareos problem. It is not reproducible in any of the automated tests here.

Issue History

Date Modified Username Field Change
2021-12-28 02:59 embareossed New Issue
2021-12-28 09:32 bruno-at-bareos Note Added: 0004424
2021-12-28 18:19 embareossed Note Added: 0004430
2022-01-03 22:41 embareossed Note Added: 0004447
2022-01-04 11:45 bruno-at-bareos Note Added: 0004448
2022-01-05 01:51 embareossed Note Added: 0004453
2022-01-16 23:15 embareossed Note Added: 0004477
2022-01-16 23:19 embareossed Note Edited: 0004477
2023-03-21 17:20 bruno-at-bareos Assigned To => bruno-at-bareos
2023-03-21 17:20 bruno-at-bareos Status new => closed
2023-03-21 17:20 bruno-at-bareos Resolution open => no change required
2023-03-21 17:20 bruno-at-bareos Note Added: 0004917