View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0001406 | bareos-core | file daemon | public | 2021-12-08 07:15 | 2022-01-31 09:31 |
Reporter | Int | Assigned To | bruno-at-bareos | ||
Priority | normal | Severity | crash | Reproducibility | sometimes |
Status | closed | Resolution | unable to reproduce | ||
Platform | 64bit | OS | Windows | OS Version | Server 2016 |
Product Version | 19.2.11 | ||||
Summary | 0001406: file daemon crashes on Windows Server 2016 | ||||
Description | The file daemon sometimes crashes on Windows Server 2016. This happened twice in the last month. The server is patched up to the Microsoft October 2021 Patch Day. File daemon version: 19.2.7 (16 April 2020) VSS Linux Cross-compile Win64 Microsoft Windows Server 2012 Standard Edition (build 9200), 64-bit. Error in the Windows event log: Event 1000, Application Error. Faulting application name: bareos-fd.exe, version: 0.0.0.0, time stamp: 0x5e98651f. Faulting module name: libbareos.dll, version: 0.0.0.0, time stamp: 0x5e9864a8. Exception code: 0xc0000005. Fault offset: 0x0000000000022e8b. Faulting process ID: 0x3db8. Faulting application start time: 0x01d7e4e8765df91e. Faulting application path: C:\Program Files\Bareos\bareos-fd.exe. Faulting module path: C:\Program Files\Bareos\libbareos.dll. Report ID: 37b0a4b3-d23d-4cf0-9149-b94dce2d6d2d. Faulting package full name: Faulting package-relative application ID: | ||||
Tags | No tags attached. | ||||
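The hexadecimal values in the event log entry above can be made readable with a few lines of Python. This is only a sketch under stated assumptions: it assumes the "time stamp" fields are the link times from the executable headers (seconds since 1970-01-01 UTC) and that the application start time is a Windows FILETIME (100-nanosecond ticks since 1601-01-01 UTC). The helper names and the small status-code table are illustrative and are not part of Bareos.

```python
from datetime import datetime, timedelta, timezone

# Values copied from the event log entry in the description.
EXCEPTION_CODE = 0xC0000005
FD_TIMESTAMP = 0x5E98651F            # bareos-fd.exe time stamp
DLL_TIMESTAMP = 0x5E9864A8           # libbareos.dll time stamp
START_FILETIME = 0x01D7E4E8765DF91E  # faulting application start time

# A few well-known NTSTATUS codes (illustrative, not exhaustive).
NTSTATUS_NAMES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
    0xC0000409: "STATUS_STACK_BUFFER_OVERRUN",
}


def pe_timestamp_to_utc(value):
    """Interpret a time stamp as seconds since the Unix epoch (assumption)."""
    return datetime.fromtimestamp(value, tz=timezone.utc)


def filetime_to_utc(value):
    """Interpret a value as a FILETIME: 100 ns ticks since 1601-01-01 UTC."""
    epoch = datetime(1601, 1, 1, tzinfo=timezone.utc)
    return epoch + timedelta(microseconds=value // 10)


if __name__ == "__main__":
    name = NTSTATUS_NAMES.get(EXCEPTION_CODE, "unknown")
    print(f"exception code {EXCEPTION_CODE:#010x}: {name}")
    print("bareos-fd.exe built:", pe_timestamp_to_utc(FD_TIMESTAMP).isoformat())
    print("libbareos.dll built:", pe_timestamp_to_utc(DLL_TIMESTAMP).isoformat())
    print("process started:   ", filetime_to_utc(START_FILETIME).isoformat())
```

Under these assumptions, the exception code maps to an access violation inside libbareos.dll, and the decoded build date agrees with the "16 April 2020" reported in the file daemon version string.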
Could you help us understand what is going on? Could you describe your FD configuration in more detail (the config files, for example; you can blank out passwords), as well as the job and fileset involved, any plugins used, and so on? Does this happen when the FD is doing something special, and how often does it occur (number of times per day, week, month)? Could you increase the debug level to 200 on the client to get a timestamped trace and report it here? |
|
file daemon configuration:

myself.conf

Client {
  Name = igms00-fd
  Maximum Concurrent Jobs = 20
}

bareos-dir.conf

Director {
  Name = bareos-dir
  Password = "xxx"
  Description = "Allow the configured Director to access this file daemon."
}

This is the fileset and job during which the fd crashed last time. When the crash happened, the job had been running for about 36 of an estimated 96 hours. The total backup volume of a successful job would have been about 18 TB in several million files.

Fileset:

FileSet {
  Name = "FileSetIGMS00_bilddaten"
  Enable VSS = yes
  Include {
    Options {
      Signature = MD5
      Drive Type = fixed
      IgnoreCase = yes
      # if supported by the OS, the read time won't be adapted
      # this would generate a bunch of writes for no reason on the client machine.
      noatime = yes
      # If enabled, the Client will check size, age of each file after their backup
      # to see if they have changed during backup. If time or size mismatch, an error will raise.
      # In general, it is recommended to use this option.
      checkfilechanges = yes
      WildFile = "[A-Z]:/pagefile.sys"
      WildDir = "[A-Z]:/RECYCLER"
      WildDir = "[A-Z]:/$RECYCLE.BIN"
      WildDir = "[A-Z]:/System Volume Information"
      WildDir = "[A-Z]:/tmp/bareos-restores"
      WildDir = "[A-Z]:/Temp"
      Exclude = yes
    }
    File = "d:/Bilddaten"
  }
  Exclude {
    # Don’t add trailing /
    File = "d:/Bilddaten/_archivieren"
    File = "d:/Bilddaten/_restored"
  }
}

Job:

Job {
  Name = "filebackup_bilddaten-igms00-fd"
  JobDefs = "DefaultFileJob"
  Pool = Bilddaten # Pools must be specified explicitly, otherwise the pools from "DefaultFileJob" are used!
  Full Backup Pool = Bilddaten
  Differential Backup Pool = Bilddaten
  Incremental Backup Pool = Bilddaten
  Client = "igms00-fd"
  FileSet = "FileSetIGMS00_bilddaten"
  Schedule = "YearlyCycle"
  Enabled = yes
}

The previous crash happened while five jobs were running in parallel. The fileset and job configurations were different from, but similar to, the ones above.

I started the file daemon with the option "-d 200", see the attached screenshot. Is this the correct syntax for the Windows version of the file daemon? How can I verify that the service is running with debug level 200? |
|
>Is this happening when the FD is doing something special, what are the occurrences (number of times per day, week, month)
Nothing special was done. The jobs and filesets being run had not changed for months. The crash happened twice, on 2021-12-08 and 2021-11-26. These are the only occurrences so far. |
|
I would not have changed how the daemon is started (especially on finicky Windows); the command given in the previous comment lets you set and remove the debug level dynamically. If you want to do so anyway, you can refer to the documentation: https://docs.bareos.org/master/TasksAndConcepts/TheWindowsVersionOfBareos.html?highlight=windows#windows-service As the crash occurred twice quite recently, it would be interesting to check whether any traces were generated. Could you check whether you can find files with the .traceback or .bactrace extension on the system (they are normally located in the Bareos working directory, but I can't be 100% sure under Windows)? If so, please attach them here. |
|
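The search for crash files requested above could be scripted roughly as follows. This is a minimal sketch: the function name `find_crash_files` is made up for illustration, and the directory to search has to be supplied by the user, because the location of the Bareos working directory on a given Windows installation is not confirmed in this thread.

```python
import sys
from pathlib import Path

# Extensions named in the note above.
CRASH_SUFFIXES = (".traceback", ".bactrace")


def find_crash_files(root, suffixes=CRASH_SUFFIXES):
    """Return all files below `root` whose extension is one of `suffixes`."""
    root = Path(root)
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in suffixes
    )


if __name__ == "__main__":
    # The directory is passed on the command line; it defaults to the
    # current directory because the real working dir is not known here.
    search_root = sys.argv[1] if len(sys.argv) > 1 else "."
    matches = find_crash_files(search_root)
    if not matches:
        print(f"no .traceback or .bactrace files found below {search_root}")
    for match in matches:
        print(match)
```

An empty result only means that no such files exist below the directory that was searched; it does not show that no crash happened.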
Ping? Any news on this? | |
Sorry, my colleague is out of office. Unfortunately, I have no spare time to help you with this. My colleague will be back after Christmas. | |
>If you want to do so you can refer to the documentation
>https://docs.bareos.org/master/TasksAndConcepts/TheWindowsVersionOfBareos.html?highlight=windows#windows-service
Thank you for pointing me to the right documentation chapter - with the new documentation system it is very hard to find information on a specific topic, since the search is not working well (see bug 0001351). This helped me to start the Windows service in debug mode. It also showed that the method I had used was not working correctly.
>As the crash occur two times quite recently, it would be interesting to check if there's any traces that would have been generated.
I could not find any .traceback or .bactrace files. So far the crash has not occurred again. I will inform you as soon as it happens again, and since debug mode is working now, I will hopefully also be able to provide a trace file. |
|
Hello, beware that running with a debug level can create really large *trace* files. Have a look from time to time, and rotate them manually if they become too big. |
|
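A periodic size check along the lines suggested above might look like this. It is only a sketch: the `*.trace` file name pattern, the default size limit, and the function name `large_trace_files` are assumptions made for illustration, and the directory must be supplied by the user. The script only reports oversized files; it deliberately does not move or delete anything, since a trace file may still be held open by the running daemon.

```python
import sys
from pathlib import Path


def large_trace_files(directory, limit_bytes, pattern="*.trace"):
    """Return (path, size) pairs for trace files larger than `limit_bytes`.

    The "*.trace" pattern is an assumption about how trace files are named.
    """
    results = []
    for path in Path(directory).glob(pattern):
        if path.is_file():
            size = path.stat().st_size
            if size > limit_bytes:
                results.append((path, size))
    # Largest files first, so the worst offender is at the top.
    return sorted(results, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Arguments: directory to check and size limit in MiB (both optional).
    directory = sys.argv[1] if len(sys.argv) > 1 else "."
    limit_mib = int(sys.argv[2]) if len(sys.argv) > 2 else 1024
    for path, size in large_trace_files(directory, limit_mib * 1024 * 1024):
        print(f"{path}: {size / (1024 * 1024):.1f} MiB")
```

Run from a scheduled task, a check like this gives a warning before a long-running job fills the disk with trace output.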
Hello again, have you had any crash, and therefore a trace, in the past month? It may also be time to upgrade to the latest stable 21 release. Without a crash or debug output, keeping this ticket open doesn't make much sense. |
|
Hello,
>Hello, beware that using a debug level may create really large *trace* file.
>Have a look from time to time, rotate them manually if they became to big.
I had already learned that the hard way before your hint, when our server ran out of disk space ;) Unfortunately, I had to stop tracing because of that. Our server does not have enough free space on C: to store the trace from a full backup job. But it wasn't a big loss, since the crash did not reappear either.
>Maybe also it would be the time to upgrade to last stable 21 release.
I agree - I am already planning for that.
>Without crash or debug, keeping this ticket open doesn't make too much sense.
Yes, you can close the ticket. Many thanks for your effort! |
|
Not reproducible. |
|
Date Modified | Username | Field | Change |
---|---|---|---|
2021-12-08 07:15 | Int | New Issue | |
2021-12-09 10:31 | bruno-at-bareos | Note Added: 0004385 | |
2021-12-10 13:11 | Int | File Added: bareos-fd_debug200.png | |
2021-12-10 13:11 | Int | Note Added: 0004388 | |
2021-12-10 13:16 | Int | Note Added: 0004389 | |
2021-12-13 17:00 | bruno-at-bareos | Note Added: 0004391 | |
2021-12-22 16:33 | bruno-at-bareos | Note Added: 0004407 | |
2021-12-22 16:42 | Int | Note Added: 0004410 | |
2021-12-27 09:38 | Int | Note Added: 0004420 | |
2022-01-10 10:43 | bruno-at-bareos | Assigned To | => bruno-at-bareos |
2022-01-10 10:43 | bruno-at-bareos | Status | new => assigned |
2022-01-10 10:44 | bruno-at-bareos | Status | assigned => feedback |
2022-01-10 10:44 | bruno-at-bareos | Note Added: 0004461 | |
2022-01-27 16:24 | bruno-at-bareos | Note Added: 0004490 | |
2022-01-28 07:56 | Int | Note Added: 0004492 | |
2022-01-28 07:56 | Int | Status | feedback => assigned |
2022-01-31 09:31 | bruno-at-bareos | Status | assigned => closed |
2022-01-31 09:31 | bruno-at-bareos | Resolution | open => unable to reproduce |
2022-01-31 09:31 | bruno-at-bareos | Note Added: 0004495 |