0000359: Make accurate multi-thread? - Bareos Bug Tracker

ID	Project	Category	View Status	Date Submitted	Last Update

0000359	bareos-core	file daemon	public	2014-11-05 09:46	2023-07-03 15:39

Reporter	tigerfoot	Assigned To	bruno-at-bareos
Priority	low	Severity	feature	Reproducibility	always
Status	closed	Resolution	won't fix
Platform	Linux	OS	openSUSE	OS Version	13x
Product Version	13.2.3

Summary	0000359: Make accurate multi-thread?
Description	On a big fileset with 4G files, an incremental backup with a high accurate level takes as much time as running a full backup.
Incremental job log:
05-Nov 05:21 orville-sd JobId 62: Despooling elapsed time = 00:00:29, Transfer rate = 495.2 M Bytes/second
05-Nov 05:21 orville-sd JobId 62: Elapsed time=04:50:20, Transfer rate=823.3 K Bytes/second
05-Nov 05:21 orville-sd JobId 62: Sending spooled attrs to the Director. Despooling 12,563,395 bytes
Steps To Reproduce	Have a very large number of files (in our case +4,000,000,000, for a total of 3TB of data).
Set Accurate = Yes in the Job.
Set accurate = mcspug1 in the fileset.
Run a full backup, then an incremental.
Additional Information	bareos-fd is only using one core (8 available). Couldn't the accurate processing be sped up by letting the process use all available cores, or a number of cores given in the configuration?
Tags	No tags attached.

mvwieringen 2014-11-07 11:24 developer ~0001048	Maybe you can explain to me how multi-threading will make things go faster. The accurate code fetches the values from the database and sends them over the socket to the filed. There they are stored in an in-memory hash table, or in an LMDB for bareos-14.2. So what will multi-threading bring, unless you open multiple sockets etc., e.g. because you have resource starvation on the socket? The only thing you could think about is, instead of sending the items one by one, sending them in bigger chunks or using some socket compression.
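To illustrate the "bigger chunks" idea from the note above, here is a minimal Python sketch. It is not Bareos code: `chunked`, `send_in_chunks`, the `send` callback and the default batch size of 1000 are all hypothetical names and values chosen for illustration. It only shows how grouping records reduces the number of send calls.

```python
from itertools import islice


def chunked(records, size):
    """Yield lists of at most `size` records from any iterable."""
    it = iter(records)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def send_in_chunks(records, send, size=1000):
    """Call `send` once per batch instead of once per record.

    `send` is a hypothetical callback standing in for a socket write.
    Returns the number of send calls that were made.
    """
    calls = 0
    for batch in chunked(records, size):
        send(batch)
        calls += 1
    return calls
```

With five records and a batch size of two, `send` is called three times instead of five. Whether this helps in practice depends on where the time is actually spent, which is the open question in this thread.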

tigerfoot 2014-11-07 12:15 developer ~0001050	Sorry if I was not clear enough, and especially for surely using the wrong words: this is non-dev language translated to dev. Don't shoot the fool :-) So from your point of view there is no way to check whether files have changed using multiple threads or processes? If a file has changed, put it on the to-be-done queue. I guess that checksumming 4 to 8 files at the same time would result in a shorter overall backup time than doing the checksums one by one. Or, to check where most of the time is spent, what would be the procedure to see how much time is needed to build the accurate information on the dir, the time spent sending it to the -fd, the time to build the hash/LMDB, and the time spent locally working out what has to be backed up?
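The idea of checksumming several files at once can be sketched as follows. This is a standalone Python illustration, not how bareos-fd works: `sha1_of`, `changed_files`, the `known_digests` mapping and the default of 4 workers are assumptions made for the example. SHA1 is used because the `1` in `mcspug1` selects SHA1 comparison.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def sha1_of(path, bufsize=1 << 20):
    """Return the hex SHA1 digest of a file, read in 1 MiB blocks."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(bufsize), b""):
            h.update(block)
    return h.hexdigest()


def changed_files(paths, known_digests, workers=4):
    """Checksum `paths` on a thread pool and return those whose digest
    differs from (or is missing in) `known_digests`, in input order."""
    paths = list(paths)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        digests = list(pool.map(sha1_of, paths))
    return [p for p, d in zip(paths, digests) if known_digests.get(p) != d]
```

The returned list plays the role of the "to-be-done queue". Any speedup depends on the storage being able to serve several reads in parallel; on a single slow disk the checksums may simply wait on I/O.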

mvwieringen 2014-11-07 14:12 developer ~0001052 Last edited: 2014-11-07 14:17	If you want to see where time is spent you need to use profiling. It might be worthwhile to split the scan process, but that means decoupling the scan process done in the findlib from the actual saving of the data. It may also mean we need more read-only transactions for the LMDB, as multiple threads will be accessing the accurate data. The same is true for the in-memory hash, as multiple threads may be updating the data. Given our current workload I don't think we will be spending much time on this, but if you want to try, fork the code and try it out; I'm not sure how hard it will be.
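The concern about several threads touching the accurate data can be pictured with a small Python sketch. It is only an analogy under stated assumptions: a plain `dict` stands in for the in-memory hash, a single `threading.Lock` stands in for whatever synchronization the real code would need, and `run_pipeline` and its parameters are hypothetical names, not findlib or LMDB APIs.

```python
import queue
import threading


def run_pipeline(candidates, accurate, workers=4):
    """Compare scanned (path, signature) pairs against `accurate`.

    `candidates` is an iterable produced by a scan step; `accurate` is a
    dict mapping path -> last known signature. Worker threads take items
    from a queue, so scanning is decoupled from checking. Returns the
    sorted list of paths that need to be backed up.
    """
    todo = queue.Queue()
    lock = threading.Lock()  # guards both `accurate` and `to_backup`
    to_backup = []

    def check():
        while True:
            item = todo.get()
            if item is None:  # sentinel: no more work for this thread
                break
            path, sig = item
            with lock:
                if accurate.get(path) != sig:
                    accurate[path] = sig
                    to_backup.append(path)

    threads = [threading.Thread(target=check) for _ in range(workers)]
    for t in threads:
        t.start()
    for item in candidates:
        todo.put(item)
    for _ in threads:
        todo.put(None)
    for t in threads:
        t.join()
    return sorted(to_backup)
```

Every lookup and update goes through the same lock, so the workers serialize on the shared table. That is the cost the note above points at: the shared accurate data must be made safe for concurrent access before extra threads can pay off.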

bruno-at-bareos 2023-07-03 15:39 manager ~0005113	New work is in progress for 23.

Date Modified	Username	Field	Change
2014-11-05 09:46	tigerfoot	New Issue
2014-11-07 11:24	mvwieringen	Note Added: 0001048
2014-11-07 12:15	tigerfoot	Note Added: 0001050
2014-11-07 14:12	mvwieringen	Note Added: 0001052
2014-11-07 14:17	mvwieringen	Note Edited: 0001052
2015-11-06 18:05	maik	Priority	normal => low
2015-11-06 18:05	maik	Status	new => acknowledged
2023-07-03 15:39	bruno-at-bareos	Assigned To	=> bruno-at-bareos
2023-07-03 15:39	bruno-at-bareos	Status	acknowledged => closed
2023-07-03 15:39	bruno-at-bareos	Resolution	open => won't fix
2023-07-03 15:39	bruno-at-bareos	Note Added: 0005113