View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0000631 | bareos-core | director | public | 2016-03-04 10:43 | 2019-01-31 10:22 |
Reporter | otto | Assigned To | arogge_adm | ||
Priority | normal | Severity | minor | Reproducibility | always |
Status | closed | Resolution | fixed | ||
Platform | Linux | OS | Debian | OS Version | 8 |
Product Version | 15.2.2 | ||||
Summary | 0000631: No DB-reconnect after connection lost (again) | ||||
Description | After restarting PostgreSQL, the director loses its connection to the database and still does not reconnect. (Issue 426 was marked as resolved since 15.2.1, but it is still not working!) The log says: JobId 0: Fatal error: sql_get.c:621 sql_get.c:621 query SELECT PoolId,Name,NumVols,MaxVols,UseOnce,UseCatalog,AcceptAnyVolume,AutoPrune,Recycle,VolRetention,VolUseDuration,MaxVolJobs,MaxVolFiles,MaxVolBytes,PoolType,LabelType,LabelFormat,RecyclePoolId,ScratchPoolId,ActionOnPurge,MinBlocksize,MaxBlocksize FROM Pool WHERE Pool.Name='TWR-3W' failed: bareos-dir: no connection to the server bareos-dir: | ||||
Steps To Reproduce | 1. Restart the database: service postgresql restart 2. Test the director ... bconsole * status dir days=0 * run job ... 3. Look in the logs ... | ||||
Tags | No tags attached. | ||||
By default, reconnecting is disabled because it might lead to surprises. You can enable it, or you can tell the director to treat such errors as fatal and exit, in which case systemd will probably restart the director. I would not use it myself; I would rather not restart PostgreSQL at random, and if you do restart it, pick a proper moment and restart your director as well. You are looking for the Reconnect and ExitOnFatal keywords in the Catalog resource. A reconnect only happens when the connection is not in the middle of a transaction, and it is in no way a supported option. Given the complexity, it might do more harm than good, which is why the default is off. |
|
In Bareos 16.2.4 on CentOS 7, ExitOnFatal = yes does not work here. I restart PostgreSQL and then run status director in bconsole. The log prints the following over and over in a loop: 14-Jun 18:53 backup-dir: Fatal Error at postgresql.c:670 because: Fatal database error 14-Jun 18:53 backup-dir JobId 0: Fatal error: sql_get.c:661 sql_get.c:661 query SELECT PoolId,Name,NumVols,MaxVols,UseOnce,UseCatalog,AcceptAnyVolume,AutoPrune,Recycle,VolRetention,VolUseDuration,MaxVolJobs,MaxVolFiles,MaxVolBytes,PoolType,LabelType,LabelFormat,RecyclePoolId,ScratchPoolId,ActionOnPurge,MinBlocksize,MaxBlocksize FROM Pool WHERE Pool.Name='Default' failed: no connection to the server I expect the director to exit on the fatal error, as stated in the docs. Configuration from bareos-dir -xc: Catalog { Name = "MyCatalog" DbAddress = "127.0.0.1" DbPort = 0 DbPassword = "****" DbUser = "bareos" DbName = "bareos" DbDriver = "postgresql" MultipleConnections = no ExitOnFatal = yes } |
|
Until ExitOnFatal works, I am using "Reconnect = yes" as a workaround, which works when the database is restarted. | |
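A minimal sketch of the workaround from the note above, reusing the Catalog resource posted in the earlier note with Reconnect = yes in place of ExitOnFatal = yes. All values are copied from that note (the password stays masked); the comments are added here for illustration only. As the first note points out, Reconnect is not a supported option and only takes effect when the connection is not in the middle of a transaction.

```
Catalog {
  Name = "MyCatalog"
  DbAddress = "127.0.0.1"
  DbPort = 0
  DbPassword = "****"
  DbUser = "bareos"
  DbName = "bareos"
  DbDriver = "postgresql"
  MultipleConnections = no
  # Workaround from the note above: re-establish a lost catalog connection.
  # Off by default; described in this issue as not a supported option.
  Reconnect = yes
  # Alternative named in the first note: ExitOnFatal = yes, so that the
  # director exits and a service manager such as systemd can restart it.
}
```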
Fix committed to bareos dev branch with changesetid 10827. | |
bareos: dev 3078a939 2019-01-16 16:22 Ported: N/A |
catalog: make "Exit On Fatal" work as expected. Fixes 0000631: No DB-reconnect after connection lost. The catalog backends previously logged a fatal database error with M_FATAL. This patch changes the message type to M_ERROR_TERM, which will also quit the director. |
Affected Issues 0000631 |
|
mod - core/src/cats/mysql.cc |
mod - core/src/cats/postgresql.cc |
Date Modified | Username | Field | Change |
---|---|---|---|
2016-03-04 10:43 | otto | New Issue | |
2016-03-29 13:14 | joergs | Relationship added | related to 0000426 |
2016-03-30 17:07 | mvwieringen | Note Added: 0002225 | |
2016-03-30 17:08 | mvwieringen | Status | new => feedback |
2017-06-14 18:56 | robertoschwald | Note Added: 0002669 | |
2017-06-14 18:59 | robertoschwald | Note Added: 0002670 | |
2019-01-16 15:30 | arogge_adm | Changeset attached | => bareos dev 3078a939 |
2019-01-16 15:30 | arogge_adm | Note Added: 0003191 | |
2019-01-16 15:30 | arogge_adm | Status | feedback => resolved |
2019-01-16 15:30 | arogge_adm | Resolution | open => fixed |
2019-01-31 10:07 | arogge_adm | Relationship added | child of 0001040 |
2019-01-31 10:22 | arogge_adm | Assigned To | => arogge_adm |
2019-01-31 10:22 | arogge_adm | Status | resolved => closed |