View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0000961 | bareos-core | director | public | 2018-06-09 06:07 | 2023-04-06 15:21 |
Reporter | progserega | Assigned To | bruno-at-bareos | ||
Priority | urgent | Severity | crash | Reproducibility | sometimes |
Status | closed | Resolution | unable to reproduce | ||
Platform | Linux | OS | Debian | OS Version | 9 |
Product Version | 17.2.5 | ||||
Summary | 0000961: director crash after 4-10 hours of work | ||||
Description | Bareos runs in an LXC container on Proxmox 5.1. The container has an 8 GB hard drive and 16 GB RAM. bareos-dir crashes after every few hours of work (4-10 hours, per the summary). I started bareos-dir under gdb to get a stack trace:
(gdb) r
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /usr/sbin/bareos-dir -f -v -d 10
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
bareos-dir (10): dird.c:245-0 Debug level = 10
bareos-dir (9): inc_conf.c:390-0 set wildfile 555555846518 size=1 [A-Z]:/pagefile.sys
bareos-dir (9): inc_conf.c:390-0 set wilddir 555555846518 size=1 [A-Z]:/RECYCLER
bareos-dir (9): inc_conf.c:390-0 set wilddir 555555846518 size=2 [A-Z]:/$RECYCLE.BIN
bareos-dir (9): inc_conf.c:390-0 set wilddir 555555846518 size=3 [A-Z]:/System Volume Information
bareos-dir (9): inc_conf.c:390-0 set wildbase 555555841088 size=1 *.avi
bareos-dir (9): inc_conf.c:390-0 set wildbase 555555841088 size=2 *.mkv
bareos-dir (9): inc_conf.c:390-0 set wild 55555583fa28 size=1 /var/www/localhost/htdocs
[New Thread 0x7ffff41bc700 (LWP 43570)]
[New Thread 0x7fffebfff700 (LWP 43572)]
[New Thread 0x7fffeb7fe700 (LWP 43573)]
[New Thread 0x7fffeaffd700 (LWP 43574)]
[New Thread 0x7fffea7fc700 (LWP 44718)]
[New Thread 0x7fffe9ffb700 (LWP 44726)]
[Thread 0x7fffe9ffb700 (LWP 44726) exited]
[Thread 0x7fffea7fc700 (LWP 44718) exited]
Thread 5 "bareos-dir" received signal SIGUSR2, User defined signal 2.
[Switching to Thread 0x7fffeaffd700 (LWP 43574)]
0x00007ffff56f57dd in nanosleep () at ../sysdeps/unix/syscall-template.S:84
84 ../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb) bt
#0 0x00007ffff56f57dd in nanosleep () at ../sysdeps/unix/syscall-template.S:84
#1 0x00007ffff6ea51f4 in bmicrosleep(int, int) () from /usr/lib/bareos/libbareos-17.2.4.so
#2 0x00007ffff6ecf0c8 in register_watchdog(s_watchdog_t*) () from /usr/lib/bareos/libbareos-17.2.4.so
#3 0x00007ffff6ea8107 in start_thread_timer(JCR*, unsigned long, unsigned int) () from /usr/lib/bareos/libbareos-17.2.4.so
#4 0x00007ffff6ea37c0 in BSOCK_TCP::connect(JCR*, int, long, long, char const*, char*, char*, int, bool) () from /usr/lib/bareos/libbareos-17.2.4.so
#5 0x00005555555a74cd in ?? ()
#6 0x0000555555833108 in ?? ()
#7 0x0000000000000000 in ?? ()
(gdb) | ||||
Additional Information | Last lines from bareos.log on the director host:
Check file /var/lib/bareos/bareos-dir.9101.pid
08-Jun 13:00 bareos-dir JobId 35: shell command: run BeforeJob "/scripts/backup/bareos_zabbix_status.sh"
08-Jun 13:00 bareos-dir JobId 35: Start Admin JobId 35, Job=AdminJobZabbixStatus.2018-06-08_13.00.00_01
08-Jun 13:00 bareos-dir JobId 35: BAREOS 17.2.4 (21Sep17): 08-Jun-2018 13:00:02
JobId: 35
Job: AdminJobZabbixStatus.2018-06-08_13.00.00_01
Scheduled time: 08-Jun-2018 13:00:00
Start time: 08-Jun-2018 13:00:02
End time: 08-Jun-2018 13:00:02
Termination: Admin OK
08-Jun 13:00 bareos-dir JobId 35: shell command: run AfterJob "touch /tmp/bareos_admin_test_after_job"
08-Jun 14:00 bareos-dir JobId 36: shell command: run BeforeJob "/scripts/backup/bareos_zabbix_status.sh"
08-Jun 14:00 bareos-dir JobId 36: Start Admin JobId 36, Job=AdminJobZabbixStatus.2018-06-08_14.00.00_20
08-Jun 14:00 bareos-dir JobId 36: BAREOS 17.2.4 (21Sep17): 08-Jun-2018 14:00:02
JobId: 36
Job: AdminJobZabbixStatus.2018-06-08_14.00.00_20
Scheduled time: 08-Jun-2018 14:00:00
Start time: 08-Jun-2018 14:00:02
End time: 08-Jun-2018 14:00:02
Termination: Admin OK
08-Jun 14:00 bareos-dir JobId 36: shell command: run AfterJob "touch /tmp/bareos_admin_test_after_job"
08-Jun 15:00 bareos-dir JobId 37: shell command: run BeforeJob "/scripts/backup/bareos_zabbix_status.sh"
08-Jun 15:00 bareos-dir JobId 37: Start Admin JobId 37, Job=AdminJobZabbixStatus.2018-06-08_15.00.00_03
08-Jun 15:00 bareos-dir JobId 37: BAREOS 17.2.4 (21Sep17): 08-Jun-2018 15:00:02
JobId: 37
Job: AdminJobZabbixStatus.2018-06-08_15.00.00_03
Scheduled time: 08-Jun-2018 15:00:00
Start time: 08-Jun-2018 15:00:02
End time: 08-Jun-2018 15:00:02
Termination: Admin OK
08-Jun 15:00 bareos-dir JobId 37: shell command: run AfterJob "touch /tmp/bareos_admin_test_after_job" | ||||
Tags | No tags attached. | ||||
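The backtrace in the description was taken when gdb stopped on SIGUSR2, not on the signal 7 (BUS error) reported in the notes, so it may not show the crashing frame. A minimal sketch of one way to capture the fatal signal instead, assuming the same binary path and flags as in the description: tell gdb to pass SIGUSR2 through without stopping, then dump all thread backtraces once the process stops again. The function name `run_dir_under_gdb` is hypothetical; `handle`, `-ex` and `--args` are standard gdb features.

```shell
# Sketch only: run bareos-dir in the foreground under gdb.
# Assumes the binary path and flags used in the description.
run_dir_under_gdb() {
  # 1. Let SIGUSR2 reach the program without stopping or printing in gdb.
  # 2. Start the program; gdb regains control when it stops (e.g. on SIGBUS).
  # 3. Print a backtrace of every thread at that point.
  gdb -ex 'handle SIGUSR2 nostop noprint pass' \
      -ex 'run' \
      -ex 'thread apply all bt' \
      --args /usr/sbin/bareos-dir -f -v -d 10
}
```

Calling `run_dir_under_gdb` leaves gdb interactive after the backtrace is printed, so further commands such as `info registers` can still be issued. Installing debug symbols for bareos would be needed to resolve the `??` frames.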
In system logs: Jun 14 14:16:08 bareos bareos-dir[43341]: BAREOS interrupted by signal 7: BUS error |
|
Started bareos-dir in a console: /usr/sbin/bareos-dir -d 99 -f -v
Output after starting a backup job:
bareos-dir (50): postgresql.c:248-0 db_user=bareos db_name=bareos db_password=DyVek9IXYe1QbLzL
bareos-dir (20): ua_output.c:567-0 list: llist jobid=209
bareos-dir (50): cram-md5.c:68-0 send: auth cram-md5 <544362531.1528958430@bareos-dir> ssl=0
bareos-dir (50): cram-md5.c:94-0 Authenticate OK J/o8sY02C7xkhnAjlEzO6g
bareos-dir (99): cram-md5.c:143-0 sending resp to challenge: e9+Ovk+wKR+UY4Ea9Q+PbA
bareos-dir (10): ua_audit.c:143-0 : Console [admin] from [127.0.0.1] cmdline list joblog jobid=209 limit=1000 offset=0
bareos-dir (50): postgresql.c:246-0 pg_real_connect done
bareos-dir (50): postgresql.c:248-0 db_user=bareos db_name=bareos db_password=DyVek9IXYe1QbLzL
bareos-dir (20): ua_output.c:567-0 list: list joblog jobid=209 limit=1000 offset=0
bareos-dir (10): ua_audit.c:143-0 : Console [admin] from [127.0.0.1] cmdline list joblog jobid=209 limit=1000 offset=1000
bareos-dir (20): ua_output.c:567-0 list: list joblog jobid=209 limit=1000 offset=1000
bareos-dir (50): cram-md5.c:68-0 send: auth cram-md5 <1717595899.1528958430@bareos-dir> ssl=0
bareos-dir (50): cram-md5.c:94-0 Authenticate OK t/WObVoLokZ2DPiGYtk5Iw
bareos-dir (99): cram-md5.c:143-0 sending resp to challenge: ByU+x8/A7T/kC6tIQ7JKdC
bareos-dir (10): ua_audit.c:143-0 : Console [admin] from [127.0.0.1] cmdline llist jobmedia jobid=209
bareos-dir (50): postgresql.c:246-0 pg_real_connect done
bareos-dir (50): postgresql.c:248-0 db_user=bareos db_name=bareos db_password=DyVek9IXYe1QbLzL
bareos-dir (20): ua_output.c:567-0 list: llist jobmedia jobid=209
BAREOS interrupted by signal 7: BUS error
Kaboom! bareos-dir, bareos-dir got signal 7 - BUS error. Attempting traceback.
Kaboom! exepath=/usr/sbin/
Calling: /usr/sbin/btraceback /usr/sbin/bareos-dir 44527 /var/lib/bareos |
|
Last entries in the director's bareos.log:
14-Jun 16:39 cloud-fd JobId 209: shell command: run ClientAfterJob "/scripts/backup/bacula_clear_backup_dir /tmp/bacula_mysql_backup"
14-Jun 16:39 bareos-sd JobId 209: Elapsed time=00:00:01, Transfer rate=69.34 M Bytes/second
14-Jun 16:39 cloud-fd JobId 209: ClientAfterJob: deleting files in temporary directory /tmp/bacula_mysql_backup:
14-Jun 16:39 cloud-fd JobId 209: ClientAfterJob: removed '//tmp/bacula_mysql_backup/nextcloud.sql.gz'
14-Jun 16:39 cloud-fd JobId 209: ClientAfterJob: file deletion completed without errors. exiting script /scripts/backup/bacula_clear_
14-Jun 16:39 cloud-fd JobId 209: ClientAfterJob: backup_dir
14-Jun 16:39 bareos-dir JobId 209: sql_create.c:872 Insert of attributes batch table done
14-Jun 16:39 bareos-dir JobId 209: Bareos bareos-dir 17.2.4 (21Sep17):
Build OS: x86_64-pc-linux-gnu debian Debian GNU/Linux 9.3 (stretch)
JobId: 209
Job: backup-cloud.rs.int-MysqlYear.2018-06-14_16.39.11_07
Backup Level: Full
Client: "cloud.rs.int-fd" 17.2.4 (21Sep17) x86_64-pc-linux-gnu,debian,Debian GNU/Linux 8.0 (jessie),Debian_8.0,x86_64
FileSet: "MysqlFileSet" 2018-06-14 14:11:57
Pool: "cloud.rs.int-DbPoolYear" (From command line)
Catalog: "MyCatalog" (From Client resource)
Storage: "DbStorage" (From Job resource)
Scheduled time: 14-Jun-2018 16:39:11
Start time: 14-Jun-2018 16:39:50
End time: 14-Jun-2018 16:39:52
Elapsed time: 2 secs
Priority: 10
FD Files Written: 2
SD Files Written: 2
FD Bytes Written: 69,346,787 (69.34 MB)
SD Bytes Written: 69,347,007 (69.34 MB)
Rate: 34673.4 KB/s
Software Compression: None
VSS: no
Encryption: no
Accurate: no
Volume name(s): cloud.rs.int-DbPoolYear-2018.06.14-0
Volume Session Id: 67
Volume Session Time: 1528268023
Last Volume Bytes: 69,400,243 (69.40 MB)
Non-fatal FD errors: 0
SD Errors: 0
FD termination status: OK
SD termination status: OK
Termination: Backup OK |
|
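Both the system log and the console output in the notes above contain a line of the form `BAREOS interrupted by signal N: <name>`. When comparing several crashes, a small filter can pull the signal number and name out of the logs. This is a sketch that assumes only that message format; the function name `crash_signals` is hypothetical.

```shell
# Hypothetical helper: read log text on stdin and print "<number> <name>"
# for every "BAREOS interrupted by signal" line it finds.
crash_signals() {
  sed -n 's/.*BAREOS interrupted by signal \([0-9][0-9]*\): \(.*\)$/\1 \2/p'
}

# Example with the syslog line quoted in the note above.
printf '%s\n' \
  'Jun 14 14:16:08 bareos bareos-dir[43341]: BAREOS interrupted by signal 7: BUS error' \
  | crash_signals
# prints: 7 BUS error
```

Lines without the message are skipped, so whole log files can be piped through it unchanged.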
Is this still reproducible with current code (Bareos > 21)? | |
Can't be reproduced with recent Bareos 22 code. We have a director running around the clock. | |
Date Modified | Username | Field | Change |
---|---|---|---|
2018-06-09 06:07 | progserega | New Issue | |
2018-06-14 06:58 | progserega | Note Added: 0003040 | |
2018-06-14 08:43 | progserega | Note Added: 0003041 | |
2018-06-14 08:46 | progserega | Note Added: 0003042 | |
2023-03-23 16:43 | bruno-at-bareos | Assigned To | => bruno-at-bareos |
2023-03-23 16:43 | bruno-at-bareos | Status | new => feedback |
2023-03-23 16:43 | bruno-at-bareos | Note Added: 0004950 | |
2023-04-06 15:21 | bruno-at-bareos | Status | feedback => closed |
2023-04-06 15:21 | bruno-at-bareos | Resolution | open => unable to reproduce |
2023-04-06 15:21 | bruno-at-bareos | Note Added: 0004965 |