View Issue Details
|ID||Project||Category||View Status||Date Submitted||Last Update|
|0001227||bareos-core||[All Projects] file daemon||public||2020-04-10 13:57||2020-12-16 09:03|
|Fixed in Version|
|Summary||0001227: Fix fadvise bug - sync bacula commit|
|Description||During backups we notice on our RHEL systems that the complete memory gets occupied by Page Cache.|
An explanation of the cause can already by found in this mail thread: http://bacula.10910.n7.nabble.com/Support-for-HAVE-POSIX-FADVISE-on-Bacula-Client-td82345.html
In bacula this has been fixed using following commit "Fix fadvise bug found by Robert Heinzmann":
Would be great if this fix can also applied in bareos.
|Tags||No tags attached.|
What is the use-case you need this for? The whole point of a page cache is to do exactly what it does there.
However, the current code is apparently broken, as the check for the filemode is simply implemented wrong.
Fixing this would introduce a call to fdatasync() for every file read by the filedaemon. I don't think this is a good idea at all. Either the page cache is clean (and can be released without fdatasync()) or the page cache is unclean and it is definitely not our duty to fdatasync() files we have just been reading.
From the bacula mail thread I can understand that they did make the same observations but nevertheless decided that implementing it was an overall better solution: see
What we notice is that RAM memory is completely in use by the page cache.
As the files are not evicted from the page cache, they consume RAM after the files were being read by the bareos-fd process.
Some processes (on servers with memory intensive processes) are getting swap space as the kernel might think this is more beneficial opposed to evicting entries from the page cache.
Mostly this is around the time of the backup process.
So I am not arguing against the page cache. Because the “fix” will still use the page cache, but with FADV_DONTNEED it will immediately be released from the page cache. And that is why it's important to us.
The problem I see is that the FADV_DONTNEED will also tell the kernel to release data from the page cache that may be in use by other processes. I hope modern kernels are smart enough not to release the cached data in these cases.
if you want the fix applied, please prepare a pull request on github with the change so I can discuss it with the other core developers. I think there is a good chance we will accept this.
|As per request, I've created https://github.com/bareos/bareos/pull/496|
|Fix committed to bareos master branch with changesetid 13301.|
bareos: master ea666394
Ported: N/ADetails Diff
|Do not call fdatasync() and fix m_flags->flags_
Fixes 0001227: Fix fadvise bug
Suggested changes implemented:
|mod - core/src/findlib/bfile.cc||Diff File|
|2020-04-10 13:57||gjelu||New Issue|
|2020-04-17 15:03||arogge||Assigned To||=> arogge|
|2020-04-17 15:03||arogge||Status||new => feedback|
|2020-04-17 15:03||arogge||Note Added: 0003942|
|2020-04-23 18:00||gjelu||Note Added: 0003963|
|2020-04-23 18:00||gjelu||Status||feedback => assigned|
|2020-04-24 10:15||arogge||Status||assigned => feedback|
|2020-04-24 10:15||arogge||Note Added: 0003964|
|2020-04-27 14:07||gjelu||Note Added: 0003965|
|2020-04-27 14:07||gjelu||Status||feedback => assigned|
|2020-05-11 17:24||gjelu||Changeset attached||=> bareos master ea666394|
|2020-05-11 17:24||gjelu||Note Added: 0003987|
|2020-06-09 15:15||arogge||Status||assigned => resolved|
|2020-06-09 15:15||arogge||Resolution||open => fixed|
|2020-11-30 15:47||arogge||Relationship added||related to 0001282|
|2020-12-16 09:03||arogge||Status||resolved => closed|