View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0000361 | bareos-core | storage daemon | public | 2014-11-06 20:25 | 2016-01-14 13:54 |
Reporter | mgostomski | Assigned To | |||
Priority | high | Severity | crash | Reproducibility | unable to reproduce |
Status | closed | Resolution | suspended | ||
Platform | Windows | OS | Server | OS Version | 2008 64bit |
Product Version | 14.2.1 | ||||
Summary | 0000361: Storage crash when running e.g. two concurrent jobs | ||||
Description | When I try to run two concurrent jobs on the same SD daemon (multiple devices), it crashes with this error in the Windows Event Log: Problem signature: Problem Event Name: APPCRASH Application Name: bareos-sd.exe Application Version: 1.0.0.0 Application Timestamp: 543d899b Fault Module Name: pthreadGCE2.dll Fault Module Version: 2.8.0.0 Fault Module Timestamp: 00790070 Exception Code: c0000005 Exception Offset: 0000000000005471 OS Version: 6.1.7601.2.1.0.274.10 Locale ID: 1045 Additional Information 1: f9b3 Additional Information 2: f9b3fd4dd4211be525ef3120e05e4107 Additional Information 3: 5f9e Additional Information 4: 5f9e152a0a74f7ea07fc904fcf144ab6 | ||||
Steps To Reproduce | 1. Create and configure a storage daemon. 2. Create multiple devices on the same disk but in different directories. 3. Run two jobs at the same time. 4. One job runs, and when the next one connects to the storage, the bareos-sd daemon crashes. After the daemon is restarted, bareos-dir tries to rerun the job and bareos-sd crashes again, and again. | ||||
Additional Information | 2014-11-06 19:57:51 avn01bac001-dir JobId 1030: Rescheduled Job Backup_Users.2014-11-06_19.35.58_25 at 06-Nov-2014 19:57 to re-run in 60 seconds (06-Nov-2014 19:58). 2014-11-06 19:57:51 avn01bac001-dir JobId 1030: Job Backup_Users.2014-11-06_19.35.58_25 waiting 60 seconds for scheduled start time. 2014-11-06 19:58:53 avn01bac001-dir JobId 1030: Start Backup JobId 1030, Job=Backup_Users.2014-11-06_19.35.58_25 2014-11-06 19:58:54 avn01bac001-dir JobId 1030: Using Device "avn02adc001" to write. 2014-11-06 20:07:37 avn02adc001 JobId 1030: Volume "Avena0114" previously written, moving to end of data. 2014-11-06 20:07:37 avn02adc001 JobId 1030: Ready to append to end of Volume "Avena0114" size=205 2014-11-06 20:07:37 rwodzik-fd JobId 1030: Created 28 wildcard excludes from FilesNotToBackup Registry key 2014-11-06 20:07:40 rwodzik-fd JobId 1030: Generate VSS snapshots. Driver="Win64 VSS", Drive(s)="C" VMP(s)=0 2014-11-06 20:08:17 avn02adc001 JobId 1030: Fatal error: stored/append.c:191 FI=8 from FD not positive or sequential=0 2014-11-06 20:08:17 avn02adc001 JobId 1030: Elapsed time=00:00:39, Transfer rate=0 Bytes/second 2014-11-06 20:08:17 rwodzik-fd JobId 1030: Error: lib/bsock_tcp.c:422 Write error sending 6124 bytes to Storage daemon:188.252.6.146:9103: ERR=Input/output error 2014-11-06 20:08:17 rwodzik-fd JobId 1030: Fatal error: filed/backup.c:984 Network send error to SD. 
ERR=Input/output error 2014-11-06 20:08:20 rwodzik-fd JobId 1030: VSS Writer (BackupComplete): "Task Scheduler Writer", State: 0x1 (VSS_WS_STABLE) 2014-11-06 20:08:20 rwodzik-fd JobId 1030: VSS Writer (BackupComplete): "VSS Metadata Store Writer", State: 0x1 (VSS_WS_STABLE) 2014-11-06 20:08:20 rwodzik-fd JobId 1030: VSS Writer (BackupComplete): "Performance Counters Writer", State: 0x1 (VSS_WS_STABLE) 2014-11-06 20:08:20 rwodzik-fd JobId 1030: VSS Writer (BackupComplete): "System Writer", State: 0x1 (VSS_WS_STABLE) 2014-11-06 20:08:20 rwodzik-fd JobId 1030: VSS Writer (BackupComplete): "Shadow Copy Optimization Writer", State: 0x1 (VSS_WS_STABLE) 2014-11-06 20:08:20 rwodzik-fd JobId 1030: VSS Writer (BackupComplete): "ASR Writer", State: 0x1 (VSS_WS_STABLE) 2014-11-06 20:08:20 rwodzik-fd JobId 1030: VSS Writer (BackupComplete): "COM+ REGDB Writer", State: 0x1 (VSS_WS_STABLE) 2014-11-06 20:08:20 rwodzik-fd JobId 1030: VSS Writer (BackupComplete): "Registry Writer", State: 0x1 (VSS_WS_STABLE) 2014-11-06 20:08:20 rwodzik-fd JobId 1030: VSS Writer (BackupComplete): "BITS Writer", State: 0x1 (VSS_WS_STABLE) 2014-11-06 20:08:20 rwodzik-fd JobId 1030: VSS Writer (BackupComplete): "WMI Writer", State: 0x1 (VSS_WS_STABLE) 2014-11-06 20:08:20 rwodzik-fd JobId 1030: VSS Writer (BackupComplete): "MSSearch Service Writer", State: 0x1 (VSS_WS_STABLE) 2014-11-06 19:59:47 avn01bac001-dir JobId 1030: Error: Bareos avn01bac001-dir 14.2.1 (12Sep14): Build OS: x86_64-pc-linux-gnu debian Debian GNU/Linux 7.0 (wheezy) JobId: 1030 Job: Backup_Users.2014-11-06_19.35.58_25 Backup Level: Full Client: "rwodzik-fd" 14.4.0 (08Oct14) Microsoft Windows 7 Professional Service Pack 1 (build 7601), 64-bit,Cross-compile,Win64 FileSet: "Users" 2014-11-06 11:02:52 Pool: "AvenaFull" (From Job FullPool override) Catalog: "MyCatalog" (From Client resource) Storage: "avn02adc001" (From command line) Scheduled time: 06-Nov-2014 19:35:58 Start time: 06-Nov-2014 19:58:54 End time: 06-Nov-2014 19:59:47 
Elapsed time: 53 secs Priority: 1 FD Files Written: 8 SD Files Written: 0 FD Bytes Written: 0 (0 B) SD Bytes Written: 0 (0 B) Rate: 0.0 KB/s Software Compression: 100.0 % VSS: yes Encryption: no Accurate: no Volume name(s): Volume Session Id: 1 Volume Session Time: 1415300855 Last Volume Bytes: 0 (0 B) Non-fatal FD errors: 1 SD Errors: 1 FD termination status: Error SD termination status: Error Termination: *** Backup Error *** | ||||
Tags | No tags attached. | ||||
Can you please give me the exact Windows version you are running the server on? Is your director also running on Windows? What version is your director? Please give as much info on your setup as possible so that we can hopefully reproduce it. |
|
If we run only one job on any one device, the job runs OK, but if we add a second job, the storage daemon crashes.
Version of Windows Storage Bareos: 14.4.0
My Director is running on Linux Debian (version: avn01bac001-dir Version: 14.2.1)
#####################
# WINDOWS STORAGE #
# bareos-sd.conf #
#####################
Storage { # definition of myself
  Name = avn02adc001
  Heartbeat Interval = 20
  Maximum Concurrent Jobs = 25
}
Director {
  Name = avn01bac001-dir
  Password = "#SERCRETPASS#"
}
Director {
  Name = avn02adc001-mon
  Password = "#SERCRETPASS#"
  Monitor = yes
}
Device {
  Name = avn02adc001
  Device Type = File
  Media Type = File
  Archive Device = H:\Bareos\
  Random Access = yes
  RemovableMedia = no
  Autoselect = yes
  Requires Mount = no
  LabelMedia = yes
  Maximum Concurrent Jobs = 25
}
Device {
  Name = avn02adc001Mgostomski
  Device Type = File
  Media Type = FileMgostomski
  Archive Device = H:\BareosMgostomski\
  Random Access = yes
  RemovableMedia = no
  Autoselect = yes
  Requires Mount = no
  LabelMedia = yes
  Maximum Concurrent Jobs = 25
}
Device {
  Name = avn02adc001Dnaczk
  Device Type = File
  Media Type = FileDnaczk
  Archive Device = H:\BareosDnaczk\
  Random Access = yes
  RemovableMedia = no
  Autoselect = yes
  Requires Mount = no
  LabelMedia = yes
  Maximum Concurrent Jobs = 25
}
Messages {
  Name = Standard
  director = avn01bac001-dir = all
}
#########################
# END OF BAREOS-SD.CONF #
#########################
#####################
# DIRECTOR ON LINUX #
# storages.conf #
#####################
Storage {
  Name = avn02adc001
  Address = #SERCRETPASS#
  Password = "#SERCRETPASS#"
  Device = avn02adc001
  Device = avn02adc001Mgostomski
  Device = avn02adc001Dnaczk
  Media Type = File
  SDPort = 9103
  Maximum Concurrent Jobs = 8
}
Storage {
  Name = avn02adc001Mgostomski
  Address = #SERCRETPASS#
  Password = "#SERCRETPASS#"
  Device = avn02adc001Mgostomski
  Media Type = FileMgostomski
  SDPort = 9103
  Maximum Concurrent Jobs = 8
}
Storage {
  Name = avn02adc001Dnaczk
  Address = #SERCRETPASS#
  Password = "#SERCRETPASS#"
  Device = avn02adc001Dnaczk
  Media Type = FileDnaczk
  SDPort = 9103
  Maximum Concurrent Jobs = 8
}
#########################
# END OF STORAGES.CONF #
######################### |
|
The crash is on an assert in the SD code: "Fatal error: stored/append.c:191 FI=8 from FD not positive or sequential=0". It seems the filed sends FileIndex 8 first instead of 1. |
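
The wording of that message suggests a check along the following lines. This is only a sketch inferred from the log text, not the code in stored/append.c; the function name and the exact acceptance rule (same index as before, or the previous index plus one) are assumptions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, NOT taken from stored/append.c.
// Assumed rule: a FileIndex received from the FD is accepted when it is
// positive and either repeats the previous index (another stream of the
// same file) or advances it by exactly one (the next file).
bool file_index_is_sequential(int32_t file_index, int32_t last_file_index) {
    return file_index > 0 &&
           (file_index == last_file_index ||
            file_index == last_file_index + 1);
}
```

Under that assumed rule, a first record with FI=8 while the last seen index is still 0 is rejected, which matches "FI=8 ... sequential=0" in the job log, whereas a first record with FI=1 would be accepted.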
|
@mvwieringen How can this issue be resolved? When we run only one job, everything is OK. The storage daemon crashes only if we run two or more concurrent jobs. |
|
Error from the first job, raised after the second job is started. This job backs up 206 MB and then crashes. 2014-11-07 12:48:11 avn01bac001-dir JobId 15: Start Backup JobId 15, Job=Backup_Users.2014-11-07_12.48.09_22 2014-11-07 12:48:12 avn01bac001-dir JobId 15: Using Device "avn02adc001" to write. 2014-11-07 12:56:58 avn02adc001 JobId 15: Volume "Avn010002" previously written, moving to end of data. 2014-11-07 12:56:58 avn02adc001 JobId 15: Ready to append to end of Volume "Avn010002" size=463712440 2014-11-07 12:56:58 avn02adc001 JobId 15: Spooling data ... 2014-11-07 12:56:58 dnaczk-fd JobId 15: Created 28 wildcard excludes from FilesNotToBackup Registry key 2014-11-07 12:57:01 dnaczk-fd JobId 15: Generate VSS snapshots. Driver="Win32 VSS", Drive(s)="C" VMP(s)=0 2014-11-07 12:57:46 dnaczk-fd JobId 15: Error: lib/bsock_tcp.c:422 Write error sending 65536 bytes to Storage daemon:188.252.6.146:9103: ERR=Input/output error 2014-11-07 12:57:46 dnaczk-fd JobId 15: Fatal error: filed/backup.c:984 Network send error to SD. 
ERR=Input/output error 2014-11-07 12:57:51 dnaczk-fd JobId 15: VSS Writer (BackupComplete): "Task Scheduler Writer", State: 0x1 (VSS_WS_STABLE) 2014-11-07 12:57:51 dnaczk-fd JobId 15: VSS Writer (BackupComplete): "VSS Metadata Store Writer", State: 0x1 (VSS_WS_STABLE) 2014-11-07 12:57:51 dnaczk-fd JobId 15: VSS Writer (BackupComplete): "Performance Counters Writer", State: 0x1 (VSS_WS_STABLE) 2014-11-07 12:57:51 dnaczk-fd JobId 15: VSS Writer (BackupComplete): "System Writer", State: 0x1 (VSS_WS_STABLE) 2014-11-07 12:57:51 dnaczk-fd JobId 15: VSS Writer (BackupComplete): "ASR Writer", State: 0x1 (VSS_WS_STABLE) 2014-11-07 12:57:51 dnaczk-fd JobId 15: VSS Writer (BackupComplete): "MSSearch Service Writer", State: 0x1 (VSS_WS_STABLE) 2014-11-07 12:57:51 dnaczk-fd JobId 15: VSS Writer (BackupComplete): "Shadow Copy Optimization Writer", State: 0x1 (VSS_WS_STABLE) 2014-11-07 12:57:51 dnaczk-fd JobId 15: VSS Writer (BackupComplete): "WMI Writer", State: 0x1 (VSS_WS_STABLE) 2014-11-07 12:57:51 dnaczk-fd JobId 15: VSS Writer (BackupComplete): "Registry Writer", State: 0x1 (VSS_WS_STABLE) 2014-11-07 12:57:51 dnaczk-fd JobId 15: VSS Writer (BackupComplete): "COM+ REGDB Writer", State: 0x1 (VSS_WS_STABLE) 2014-11-07 12:49:00 avn01bac001-dir JobId 15: Error: Director's comm line to SD dropped. 
2014-11-07 12:50:08 avn01bac001-dir JobId 15: Error: Bareos avn01bac001-dir 14.2.1 (12Sep14): Build OS: x86_64-pc-linux-gnu debian Debian GNU/Linux 7.0 (wheezy) JobId: 15 Job: Backup_Users.2014-11-07_12.48.09_22 Backup Level: Full Client: "dnaczk-fd" 14.4.0 (08Oct14) Microsoft Windows 7 Professional Service Pack 1 (build 7601), 32-bit,Cross-compile,Win32 FileSet: "Users" 2014-11-07 12:13:51 Pool: "AvenaFull" (From Job FullPool override) Catalog: "MyCatalog" (From Client resource) Storage: "avn02adc001" (From Job resource) Scheduled time: 07-Nov-2014 12:48:09 Start time: 07-Nov-2014 12:48:12 End time: 07-Nov-2014 12:50:08 Elapsed time: 1 min 56 secs Priority: 1 FD Files Written: 770 SD Files Written: 0 FD Bytes Written: 208,004,626 (208.0 MB) SD Bytes Written: 0 (0 B) Rate: 1793.1 KB/s Software Compression: None VSS: yes Encryption: no Accurate: no Volume name(s): Volume Session Id: 1 Volume Session Time: 1415361377 Last Volume Bytes: 0 (0 B) Non-fatal FD errors: 2 SD Errors: 0 FD termination status: Error SD termination status: Error Termination: *** Backup Error *** |
|
Program blows up in line 786 of vol_mgr.c:
(gdb) n
784 free(vol->vol_name);
(gdb)
785 vol->vol_name = NULL;
(gdb)
786 vol->destroy_mutex();
(gdb)
Program received signal SIGSEGV, Segmentation fault.
0x000000006ea05471 in ?? () from c:\Program Files\Bareos\pthreadGCE2.dll
(gdb) |
|
Mutex is zero:
VOLRES::destroy_mutex (this=0x24fee18) at ../../stored/vol_mgr.h:66
66 void destroy_mutex() { pthread_mutex_destroy(&m_mutex); };
(gdb) p m_mutex
$3 = (pthread_mutex_t) 0x0
(gdb) |
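
A zero `m_mutex` at the time of `destroy_mutex()` would be consistent with the mutex never having been initialized, or having been destroyed already; the notes do not say which. One defensive pattern is to track the initialization state and skip the destroy call when there is nothing to destroy. The class below is a hypothetical stand-in for illustration, not the Bareos `VOLRES` class and not the fix that was applied upstream; `GuardedVolume`, `init_mutex()` and `m_mutex_ready` are invented names.

```cpp
#include <pthread.h>
#include <cassert>

// Hypothetical stand-in for a volume reservation object (NOT Bareos code).
// The mutex is destroyed only if it was successfully initialized before.
class GuardedVolume {
public:
    GuardedVolume() : m_mutex_ready(false) {}

    // Initialize the mutex once; returns true if it is usable afterwards.
    bool init_mutex() {
        if (!m_mutex_ready) {
            m_mutex_ready = (pthread_mutex_init(&m_mutex, nullptr) == 0);
        }
        return m_mutex_ready;
    }

    // Returns true only if an initialized mutex was actually destroyed.
    // Calling this on a never-initialized or already-destroyed mutex
    // is a harmless no-op instead of undefined behavior.
    bool destroy_mutex() {
        if (!m_mutex_ready) {
            return false;
        }
        m_mutex_ready = false;
        return pthread_mutex_destroy(&m_mutex) == 0;
    }

private:
    pthread_mutex_t m_mutex;
    bool m_mutex_ready;
};
```

With such a guard, a second call to `destroy_mutex()` on the same object, or a call on an object whose mutex was never set up, returns `false` rather than handing an invalid mutex to the pthread library. It does not by itself explain why two concurrent jobs reach that state, which still needs to be found in the volume list handling.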
|
This is probably related to 0000414 | |
Could you please check whether you still get this error with the new version: http://download.bareos.org/bareos/release/15.2/windows/winbareos-15.2.2-postvista-64-bit-r35.1.exe |
|
Date Modified | Username | Field | Change |
---|---|---|---|
2014-11-06 20:25 | mgostomski | New Issue | |
2014-11-07 09:14 | pstorz | Note Added: 0001043 | |
2014-11-07 09:15 | pstorz | Assigned To | => pstorz |
2014-11-07 09:15 | pstorz | Status | new => assigned |
2014-11-07 11:02 | mgostomski | Note Added: 0001045 | |
2014-11-07 11:13 | mvwieringen | Note Added: 0001046 | |
2014-11-07 11:41 | mgostomski | Note Added: 0001049 | |
2014-11-07 13:02 | mgostomski | Note Added: 0001051 | |
2014-11-07 13:03 | mgostomski | Note Edited: 0001051 | |
2014-11-18 16:15 | pstorz | Note Added: 0001064 | |
2014-11-18 16:21 | pstorz | Note Added: 0001065 | |
2015-03-31 14:42 | pstorz | Status | assigned => confirmed |
2015-03-31 14:58 | mvwieringen | Assigned To | pstorz => |
2015-11-13 22:30 | stephand | Note Added: 0001964 | |
2015-11-26 12:13 | stephand | Note Added: 0002016 | |
2015-11-26 12:13 | stephand | Assigned To | => stephand |
2015-11-26 12:13 | stephand | Status | confirmed => feedback |
2016-01-14 13:54 | mvwieringen | Status | feedback => closed |
2016-01-14 13:54 | mvwieringen | Assigned To | stephand => |
2016-01-14 13:54 | mvwieringen | Resolution | open => suspended |