View Issue Details

IDProjectCategoryView StatusLast Update
0001061bareos-core[All Projects] directorpublic2019-04-11 09:07
ReporterhostedpowerAssigned Toarogge 
PriorityhighSeveritymajorReproducibilityalways
Status resolvedResolutionno change required 
PlatformLinuxOSDebianOS Version9
Product Version18.2.5 
Target VersionFixed in Version 
Summary0001061: Tremendous MySQL load
DescriptionHi,


It seems the queries are very un-optimized again since upgrading to bareos 18.2.6 :/

Steps To ReproduceBareos generates tons of inefficient non indexed queries like:

 SELECT Path.Path, T1.Name, T1.FileIndex, T1.JobId, LStat, DeltaSeq , Fhinfo, Fhnode FROM ( SELECT FileId, Job.JobId AS JobId, FileIndex, File.PathId AS PathId, File.Name AS Name, LStat , DeltaSeq, Fhinfo, Fhnode, Job.JobTDate AS JobTDate FROM Job, File, (SELECT MAX(JobTDate) AS JobTDate, PathId, FileName FROM (SELECT JobTDate, PathId, File.Name AS FileName FROM File JOIN Job USING (JobId) WHERE File.JobId IN (54468,55303,51880,52055,52230,52405,52584,52759,52934,53111,53288,53475,53652,53829,54005,54183,54361,54530,54695,54858,55030,55206) UNION ALL SELECT JobTDate, PathId, File.Name AS FileName FROM BaseFiles JOIN File USING (FileId) JOIN Job ON (BaseJobId = Job.JobId) WHERE BaseFiles.JobId IN (54468,55303,51880,52055,52230,52405,52584,52759,52934,53111,53288,53475,53652,53829,54005,54183,54361,54530,54695,54858,55030,55206) ) AS tmp GROUP BY PathId, FileName) AS T1 WHERE (Job.JobId IN (SELECT DISTINCT BaseJobId FROM BaseFiles WHERE JobId IN (54468,55303,51880,52055,52230,52405,52584,52759,52934,53111,53288,53475,53652,53829,54005,54183,54361,54530,54695,54858,55030,55206)) OR Job.JobId IN (54468,55303,51880,52055,52230,52405,52584,52759,52934,53111,53288,53475,53652,53829,54005,54183,54361,54530,54695,54858,55030,55206)) AND T1.JobTDate = Job.JobTDate AND Job.JobId = File.JobId AND T1.PathId = File.PathId AND T1.FileName = File.Name ) AS T1 JOIN Path ON (Path.PathId = T1.PathId) WHERE FileIndex > 0 ORDER BY T1.JobTDate, FileIndex ASC ;

Additional InformationPlease provide proper indexes, SQL without proper indexes is very bad for performance and makes bareos almost crash :/
TagsNo tags attached.
bareos-master: impact
bareos-master: action
bareos-18.2: impact
bareos-18.2: action
bareos-17.2: impact
bareos-17.2: action
bareos-16.2: impact
bareos-16.2: action
bareos-15.2: impact
bareos-15.2: action
bareos-14.2: impact
bareos-14.2: action
bareos-13.2: impact
bareos-13.2: action
bareos-12.4: impact
bareos-12.4: action

Activities

der_andrew

der_andrew

2019-02-20 10:04

reporter   ~0003272

+1
hostedpower

hostedpower

2019-02-28 22:35

reporter   ~0003275

Hi,


The version is very cumbersome. Because the queries are so heavy and long lasting, also the other jobs are waiting for the lock and cannot finish until all locks are gone. We had many issues since upgrading :(

Example where you can see the lock:

| 19777 | bareos | localhost | bareos | Query | 318 | Waiting for table metadata lock | LOCK TABLES Path write, batch write, Path as p write | 0 | 0 |
| 19779 | bareos | localhost | bareos | Sleep | 954 | | NULL | 0 | 0 |
| 19780 | bareos | localhost | bareos | Query | 143 | Waiting for table metadata lock | LOCK TABLES Path write, batch write, Path as p write | 0 | 0 |
| 19783 | bareos | localhost | bareos | Sleep | 57 | | NULL | 0 | 0 |
| 19784 | bareos | localhost | bareos | Sleep | 169 | | NULL | 0 | 0 |
| 19785 | bareos | localhost | bareos | Sleep | 952 | | NULL | 0 | 0 |
| 20090 | bareos | localhost | bareos | Sleep | 2 | | NULL | 0 | 0 |
| 20091 | bareos | localhost | bareos | Query | 324 | Creating sort index | SELECT Path.Path, T1.Name, T1.FileIndex, T1.JobId, LStat, DeltaSeq , Fhinfo, Fhnode FROM ( SELECT FileId, Job.JobId AS JobId, FileIndex, File.PathId AS PathId, File.Name AS Name, LStat , DeltaSeq, Fhinfo, Fhnode, Job.JobTDate AS JobTDate FROM Job, File, (SELECT MAX(JobTDate) AS JobTDate, PathId, FileName FROM (SELECT JobTDate, PathId, File.Name AS FileName FROM File JOIN Job USING (JobId) WHERE File.JobId IN (56417,56969,53458,53635,53812,53988,54166,54344,54513,54678,54841,55013,55189,55545,55738,55923,56106,56288,56469,56651,56833) UNION ALL SELECT JobTDate, PathId, File.Name AS FileName FROM BaseFiles JOIN File USING (FileId) JOIN Job ON (BaseJobId = Job.JobId) WHERE BaseFiles.JobId IN (56417,56969,53458,53635,53812,53988,54166,54344,54513,54678,54841,55013,55189,55545,55738,55923,56106,56288,56469,56651,56833) ) AS tmp GROUP BY PathId, FileName) AS T1 WHERE (Job.JobId IN (SELECT DISTINCT BaseJobId FROM BaseFiles WHERE JobId IN (56417,56969,53458,53635,53812,53988,54166,54344,54513,54678,54841,55013,55189,55545,55738,55923,56106,56288,56469,56651,56833)) OR Job.JobId IN (56417,56969,53458,53635,53812,53988,54166,54344,54513,54678,54841,55013,55189,55545,55738,55923,56106,56288,56469,56651,56833)) AND T1.JobTDate = Job.JobTDate AND Job.JobId = File.JobId AND T1.PathId = File.PathId AND T1.FileName = File.Name ) AS T1 JOIN Path ON (Path.PathId = T1.PathId) WHERE FileIndex > 0 ORDER BY T1.JobTDate, FileIndex ASC | 0 | 12132482 |
| 20096 | bareos | localhost | bareos | Query | 317 | Waiting for table metadata lock | SELECT Path.Path, T1.Name, T1.FileIndex, T1.JobId, LStat, DeltaSeq , Fhinfo, Fhnode FROM ( SELECT FileId, Job.JobId AS JobId, FileIndex, File.PathId AS PathId, File.Name AS Name, LStat , DeltaSeq, Fhinfo, Fhnode, Job.JobTDate AS JobTDate FROM Job, File, (SELECT MAX(JobTDate) AS JobTDate, PathId, FileName FROM (SELECT JobTDate, PathId, File.Name AS FileName FROM File JOIN Job USING (JobId) WHERE File.JobId IN (56419,56971,53460,53637,53814,53990,54168,54346,54515,54680,54843,55015,55191,55547,55740,55925,56108,56290,56471,56653,56835) UNION ALL SELECT JobTDate, PathId, File.Name AS FileName FROM BaseFiles JOIN File USING (FileId) JOIN Job ON (BaseJobId = Job.JobId) WHERE BaseFiles.JobId IN (56419,56971,53460,53637,53814,53990,54168,54346,54515,54680,54843,55015,55191,55547,55740,55925,56108,56290,56471,56653,56835) ) AS tmp GROUP BY PathId, FileName) AS T1 WHERE (Job.JobId IN (SELECT DISTINCT BaseJobId FROM BaseFiles WHERE JobId IN (56419,56971,53460,53637,53814,53990,54168,54346,54515,54680,54843,55015,55191,55547,55740,55925,56108,56290,56471,56653,56835)) OR Job.JobId IN (56419,56971,53460,53637,53814,53990,54168,54346,54515,54680,54843,55015,55191,55547,55740,55925,56108,56290,56471,56653,56835)) AND T1.JobTDate = Job.JobTDate AND Job.JobId = File.JobId AND T1.PathId = File.PathId AND T1.FileName = File.Name ) AS T1 JOIN Path ON (Path.PathId = T1.PathId) WHERE FileIndex > 0 ORDER BY T1.JobTDate, FileIndex ASC | 0 | 0 |
| 20224 | root | localhost | NULL | Query | 0 | starting | show full processlist | 0 | 0 |
+-------+--------+-----------+--------+---------+------+---------------------------------+---

When we lock in the console, we see jobs which are finished for 30 minutes, yet they cannot finish permanently since they are waiting for the lock :/
hostedpower

hostedpower

2019-03-03 15:10

reporter   ~0003276

Some jobs hang forever because of it, we have many failed jobs since upgrading :(
tomer

tomer

2019-03-04 12:42

reporter   ~0003277

The query in the attached file can be used as an initial draft to potentially optimize the slow query in this issue report.
Unfortunately, I have little knowledge of bareos internals, so I'll appreciate hearing the thoughts from the community / contributors.

Some notes about the changes in the query:
1. Filtering conditions from the outer query (such as FileIndex > 0) can be added to the relevant subqueries as well, to reduce the amount of data returned from them, as early as possible in the query execution process.
2. Sometimes joins can be avoided and transformed to a subquery in the SELECT clause. This can be effective in some cases.
3. Selecting data from potentially empty tables is redundant (for example, the table BaseFiles can be empty in the case I saw), so the query can be built dynamically if the application had that knowledge before running the query (for example, by checking the size of the table before running the query).

If anyone can share some more information about the goals and internals of this query, in details, it will be very helpful and we can work together on further optimizing the query.

query_draft.sql (2,558 bytes)
arogge

arogge

2019-03-13 18:07

developer   ~0003279

From which version did you upgrade?
There have been no changes to that query for a really long time.

Can you provide more information especially concerning the "non indexed" part? Do you have an EXPLAIN for that statement?
hostedpower

hostedpower

2019-03-14 21:17

reporter   ~0003293

Hello,


mysql> explain SELECT Path.Path, T1.Name, T1.FileIndex, T1.JobId, LStat, DeltaSeq , Fhinfo, Fhnode FROM ( SELECT FileId, Job.JobId AS JobId, FileIndex, File.PathId AS PathId, File.Name AS Name, LStat , DeltaSeq, Fhinfo, Fhnode, Job.JobTDate AS JobTDate FROM Job, File, (SELECT MAX(JobTDate) AS JobTDate, PathId, FileName FROM (SELECT JobTDate, PathId, File.Name AS FileName FROM File JOIN Job USING (JobId) WHERE File.JobId IN (54468,55303,51880,52055,52230,52405,52584,52759,52934,53111,53288,53475,53652,53829,54005,54183,54361,54530,54695,54858,55030,55206) UNION ALL SELECT JobTDate, PathId, File.Name AS FileName FROM BaseFiles JOIN File USING (FileId) JOIN Job ON (BaseJobId = Job.JobId) WHERE BaseFiles.JobId IN (54468,55303,51880,52055,52230,52405,52584,52759,52934,53111,53288,53475,53652,53829,54005,54183,54361,54530,54695,54858,55030,55206) ) AS tmp GROUP BY PathId, FileName) AS T1 WHERE (Job.JobId IN (SELECT DISTINCT BaseJobId FROM BaseFiles WHERE JobId IN (54468,55303,51880,52055,52230,52405,52584,52759,52934,53111,53288,53475,53652,53829,54005,54183,54361,54530,54695,54858,55030,55206)) OR Job.JobId IN (54468,55303,51880,52055,52230,52405,52584,52759,52934,53111,53288,53475,53652,53829,54005,54183,54361,54530,54695,54858,55030,55206)) AND T1.JobTDate = Job.JobTDate AND Job.JobId = File.JobId AND T1.PathId = File.PathId AND T1.FileName = File.Name ) AS T1 JOIN Path ON (Path.PathId = T1.PathId) WHERE FileIndex > 0 ORDER BY T1.JobTDate, FileIndex ASC ;
+----+-------------+------------+------------+--------+------------------------------------------------------------------------+-------------------------------+---------+----------------------------------------+------+----------+----------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------------+--------+------------------------------------------------------------------------+-------------------------------+---------+----------------------------------------+------+----------+----------------------------------------------+
| 1 | PRIMARY | <derived3> | NULL | ALL | NULL | NULL | NULL | NULL | 24 | 100.00 | Using where; Using temporary; Using filesort |
| 1 | PRIMARY | Path | NULL | eq_ref | PRIMARY | PRIMARY | 4 | T1.PathId | 1 | 100.00 | NULL |
| 1 | PRIMARY | Job | NULL | ref | PRIMARY,jobtdate_idx,job_idx_jobtdate_jobid,job_idx_jobid_jobtdate | job_idx_jobtdate_jobid | 9 | T1.JobTDate | 1 | 100.00 | Using where; Using index |
| 1 | PRIMARY | File | NULL | ref | JobId_PathId_Name,PathId_JobId_FileIndex,idxPIchk,file_idx_pathid_name | JobId_PathId_Name | 265 | bareos.Job.JobId,T1.PathId,T1.FileName | 1 | 33.33 | Using where |
| 6 | SUBQUERY | BaseFiles | NULL | index | basefiles_jobid_idx,basefiles_idx_basejobid_jobid | basefiles_idx_basejobid_jobid | 8 | NULL | 1 | 100.00 | Using where; Using index |
| 3 | DERIVED | <derived4> | NULL | ALL | NULL | NULL | NULL | NULL | 24 | 100.00 | Using temporary; Using filesort |
| 4 | DERIVED | File | NULL | range | JobId_PathId_Name | JobId_PathId_Name | 4 | NULL | 22 | 100.00 | Using index condition |
| 4 | DERIVED | Job | NULL | eq_ref | PRIMARY,job_idx_jobid_jobtdate | PRIMARY | 4 | bareos.File.JobId | 1 | 100.00 | NULL |
| 5 | UNION | BaseFiles | NULL | ALL | basefiles_jobid_idx,basefiles_idx_basejobid_jobid | NULL | NULL | NULL | 1 | 100.00 | Using where |
| 5 | UNION | Job | NULL | eq_ref | PRIMARY,job_idx_jobid_jobtdate | PRIMARY | 4 | bareos.BaseFiles.BaseJobId | 1 | 100.00 | NULL |
| 5 | UNION | File | NULL | eq_ref | PRIMARY | PRIMARY | 8 | bareos.BaseFiles.FileId | 1 | 100.00 | NULL |
+----+-------------+------------+------------+--------+------------------------------------------------------------------------+-------------------------------+---------+----------------------------------------+------+----------+----------------------------------------------+
11 rows in set, 1 warning (0.00 sec)

mysql>

The query doesn't (really) work anymore since it's empty now.

Maybe the problem was there longer time and it might have gotten worse with more data.
arogge

arogge

2019-03-15 09:32

developer   ~0003294

The problem is that we cannot anticipate what your data will look like:
Some clients have a lot of small files, some clients have only a few big files.
Some clients have only a few files changing in Incremental backups, some clients have many files changing in a backup.
Some clients run daily backup and a full backup every week resulting in a short backup chain (1 x Full, 6 x Incr), some clients run 4 or more backups per day, one full backup a month and a diff once a week resulting in incredibly long backup chains like 1 x Full + 1 x Diff + 32 x Incr.

Now the execution plan EXPLAIN will throw out depends on the amount of data and the value distribution in the indexes. This is why I asked wether you could provide an execution plan: I can run EXPLAIN on my testing dataset, but the result will differ from your real-world setup. The result will even differ depending on the MySQL version and tuning parameters.

Having said that, we try to optimize database stuff as good as possible for the average case. We look into PostgreSQL a lot more than in MySQL/MariaDB. For the Bareos use-case PostgreSQL offers better performance and manageability in most cases, which is why we encourage you to use it instead of MySQL/MariaDB and why we put more effort into PostgreSQL than into MySQL/MariaDB.

Last but not least the issue that you're seeing is common. Your database server has a limited amount of memory and usually your Bareos catalog won't fit in it completely. This is not a problem as PostgreSQL and MySQL/MariaDB know what to put in memory and what not.
Concerning your specific query you can be quite sure that the File-table won't fit in memory completely. However as long as the index JobId_PathId_Name fits into memory and most of the other tables are cached in memory your queries will run fast. That index is (on your installation) 265 bytes per row in the file table. Which means around 265 MB per 1 Million files.
As soon as this doesn't fit into memory anymore your query performance will drop dramatically (read: probably 10 to 100 times slower). The only way to solve this is to make sure that the file-tables's indexes (at least those that are used often) fit into memory (and to make sure your database server is configured to use that amount of memory, of course).

So if you claim that a query runs slowly we cannot find out why this might be the case without the execution plan for that query on *your* database server with *your* dataset. Usually (we get quite a lot of support requests concerning this) it boils down to non-optimal tuning of the databases's memory parameters or simply not enough memory.
arogge

arogge

2019-03-25 08:50

developer   ~0003297

You can try to add an index on Job(JobTDate):

CREATE INDEX jobtdate_idx ON Job (JobTDate);
ANALYZE TABLE Job;

This may improve your experience.
arogge

arogge

2019-04-11 09:07

developer   ~0003328

I'm settings this to resolved. If anyone can provide more information, don't hesitate to reopen the issue.

Issue History

Date Modified Username Field Change
2019-02-19 00:23 hostedpower New Issue
2019-02-20 10:04 der_andrew Note Added: 0003272
2019-02-28 22:35 hostedpower Note Added: 0003275
2019-03-03 15:10 hostedpower Note Added: 0003276
2019-03-04 12:42 tomer File Added: query_draft.sql
2019-03-04 12:42 tomer Note Added: 0003277
2019-03-13 18:07 arogge Note Added: 0003279
2019-03-14 08:53 arogge Assigned To => arogge
2019-03-14 08:53 arogge Status new => feedback
2019-03-14 21:17 hostedpower Note Added: 0003293
2019-03-14 21:17 hostedpower Status feedback => assigned
2019-03-15 09:32 arogge Status assigned => feedback
2019-03-15 09:32 arogge Note Added: 0003294
2019-03-25 08:50 arogge Note Added: 0003297
2019-04-11 09:07 arogge Status feedback => resolved
2019-04-11 09:07 arogge Resolution open => no change required
2019-04-11 09:07 arogge Note Added: 0003328