View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0000226 | bareos-core | director | public | 2013-09-18 13:12 | 2015-03-25 19:19 |
Reporter | cvelasco | Assigned To | |||
Priority | normal | Severity | minor | Reproducibility | always |
Status | closed | Resolution | fixed | ||
Platform | Linux | OS | any | OS Version | 3 |
Product Version | 12.4.4 | ||||
Summary | 0000226: Bwlimit not working in Windows 64bits client | ||||
Description | Director is version 12.4.5, Linux 64-bit. Client is version 12.4.4, Windows 2008 R2 64-bit. When you set "Maximum Bandwidth" in a job, the job runs very slowly, far below the configured target. The client is Windows 2008 R2 virtualized in qemu64 (kvm) without HPET (I don't know if this could be an issue). With bwlimit set to 50000 in the job, the speed is extremely slow (under 3 kB/s): === client-fd Version: 12.4.4 (12 June 2013) VSS Linux Cross-compile Win64 Daemon started 18-Sep-13 12:08. Jobs: run=0 running=1. Microsoft Windows Server 2008 R2 Enterprise Edition Service Pack 1 (build 7601), 64-bit Heap: heap=0 smbytes=333,774 max_bytes=333,878 bufs=151 max_bufs=151 Sizeof: boffset_t=8 size_t=8 debug=0 trace=1 bwlimit=0kB/s Running Jobs: Director connected at: 18-Sep-13 12:09 JobId 7144 Job xxxxxxxxx.2013-09-18_12.10.30_13 is running. VSS Full Backup Job started: 18-Sep-13 12:10 Files=3 Bytes=340,144 Bytes/sec=2,834 Errors=0 Bwlimit=50,000 Files Examined=3 Processing file: N:xxxxxxxxxxxxxxx SDReadSeqNo=5 fd=792 === The job does NOT use compression: === Job: name=xxxxxxx JobType=66 protocol=0 level=Full Priority=30 Enabled=1 MaxJobs=1 Resched=0 Times=5 Interval=1,800 Spool=0 Accurate=0 MaximumBandwidth=50000 --> Client: name=client-fd protocol=0 authtype=0 address=xxxxxxx FDport=9102 MaxJobs=1 JobRetention=2 months FileRetention=2 months AutoPrune=1 SoftQuota=0 SoftQuotaGrace=0 secs HardQuota=0 StrictQuotas=0 --> Catalog: name=MyCatalog address=xxxxxx DBport=3306 db_name=bareos db_driver=mysql db_user=bareos MutliDBConn=0 --> FileSet: name=xxxxxxx set O S N I N:xxxxxx N --> Schedule: name=xxxxx --> Run Level=Full === For the SAME job without bwlimit, the speed rises above 70k, the limit of the line. === client-fd Version: 12.4.4 (12 June 2013) VSS Linux Cross-compile Win64 Daemon started 18-Sep-13 12:16. Jobs: run=0 running=1.
Microsoft Windows Server 2008 R2 Enterprise Edition Service Pack 1 (build 7601), 64-bit Heap: heap=0 smbytes=339,994 max_bytes=340,660 bufs=159 max_bufs=159 Sizeof: boffset_t=8 size_t=8 debug=0 trace=1 bwlimit=0kB/s Running Jobs: Director connected at: 18-Sep-13 12:17 JobId 7145 Job xxxxxxx.2013-09-18_12.18.56_03 is running. VSS Full Backup Job started: 18-Sep-13 12:18 Files=63 Bytes=9,207,565 Bytes/sec=72,500 Errors=0 Bwlimit=0 Files Examined=63 Processing file: N:xxxxxxxx SDReadSeqNo=5 fd=780 === | ||||
Steps To Reproduce | 1. Set up a job with "Maximum Bandwidth". 2. Run the job. 3. The tray monitor on the client, or the network devices, show a very low speed. | ||||
Tags | No tags attached. | ||||
Virtualization makes NO difference here. I reproduced the same issue with another client adjacent to the original one (same LAN, router, line). This client runs Windows 7 64-bit on REAL hardware, not virtualized. With the bandwidth limit, the speed is really low: === client2-fd Version: 12.4.4 (12 June 2013) VSS Linux Cross-compile Win64 Daemon started 18-Sep-13 14:33. Jobs: run=0 running=1. Microsoft Windows 7 Professional Service Pack 1 (build 7601), 64-bit Heap: heap=0 smbytes=335,288 max_bytes=335,296 bufs=136 max_bufs=136 Sizeof: boffset_t=8 size_t=8 debug=0 trace=1 bwlimit=0kB/s Running Jobs: Director connected at: 18-Sep-13 14:34 JobId 7148 Job xxxxx.2013-09-18_14.36.37_08 is running. VSS Full Backup Job started: 18-Sep-13 14:36 Files=3 Bytes=7,856 Bytes/sec=25 Errors=0 Bwlimit=30,000 Files Examined=3 Processing file: xxxxxxxx SDReadSeqNo=5 fd=944 === Without the bandwidth limit it speeds up to the line rate: === client2-fd Version: 12.4.4 (12 June 2013) VSS Linux Cross-compile Win64 Daemon started 18-Sep-13 14:45. Jobs: run=0 running=1. Microsoft Windows 7 Professional Service Pack 1 (build 7601), 64-bit Heap: heap=0 smbytes=335,292 max_bytes=335,300 bufs=136 max_bufs=136 Sizeof: boffset_t=8 size_t=8 debug=0 trace=1 bwlimit=0kB/s Running Jobs: Director connected at: 18-Sep-13 14:47 JobId 7149 Job xxxxx.2013-09-18_14.47.19_05 is running. VSS Full Backup Job started: 18-Sep-13 14:47 Files=3 Bytes=11,542,192 Bytes/sec=60,748 Errors=0 Bwlimit=0 Files Examined=3 Processing file: xxxxxxxx SDReadSeqNo=5 fd=912 === |
|
I think the timer resolution on Windows is just too coarse for this to work at such a low bandwidth. I have also seen on FreeBSD that the bandwidth tends to wander off from what you actually want. I think anything under 200 Kb/s will not work too well, even on systems like Linux, where 512 Kb/s and 1 Mb/s seem to work. You could try enabling allowbandwidthbursting to see whether allowing bursts keeps the rate somewhat closer to the configured bandwidth, but don't expect much: your bandwidth settings are just too low to be controlled in a decent manner with the current code we inherited from Bacula, which simply sleeps a certain number of milliseconds to get to the wanted bandwidth. |
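To make the "sleeps a certain number of milliseconds" approach concrete, here is a minimal sketch of the arithmetic such a limiter has to do. It is not the Bareos code: the helper name `throttle_delay_usec` and its parameters are hypothetical, and the limit is assumed to be expressed in bytes per second.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not the Bareos implementation: given the bytes sent
 * in the current timeslice, the time that slice took, and the configured
 * limit (assumed to be in bytes per second), return how many microseconds
 * the sender should sleep so the average rate does not exceed the limit.
 */
static int64_t throttle_delay_usec(int64_t bytes_sent, int64_t elapsed_usec,
                                   int64_t limit_bytes_per_sec)
{
   assert(limit_bytes_per_sec > 0);
   assert(bytes_sent >= 0 && elapsed_usec >= 0);

   /* Time the transfer should have taken at exactly the limit. */
   int64_t ideal_usec = bytes_sent * 1000000 / limit_bytes_per_sec;

   /* Sleep only for the part of that time that has not passed yet. */
   return ideal_usec > elapsed_usec ? ideal_usec - elapsed_usec : 0;
}
```

As an illustration (the buffer size is made up), a 65,536-byte send that took 1 ms under a 50,000 bytes/s limit calls for a sleep of about 1.31 seconds. Any overshoot in the sleep itself lowers the achieved rate further, so the accuracy of the sleep primitive matters.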
|
Sorry, how does allowbandwidthbursting work? I tested the bandwidth control against a Linux client, trying to shape it at 30 megabits per second, and saw a similar problem. Not as low as this, but at best it reached 20 Mbps or so. |
|
Just set Allow Bandwidth Bursting = true in your filed config under FileDaemon. This allows the filed to use the bytes from a previous timeslice that it didn't use. The original code always kept the bandwidth under the maximum, so it never reaches the configured bandwidth; i.e. the overall bandwidth will be much lower than the configured one. With bursting on, however, it can sometimes use more than the configured bandwidth in order to get to the configured speed. It could be that the code can be made smarter, but for now this is what it is. The limiting code is in src/lib/bsock.c, so you can look there to see whether it can be improved. |
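The carry-over idea described in this note can be sketched as follows. This is only an illustration of the principle, not the code in src/lib/bsock.c; the struct, its fields and the function name are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical limiter state; the field names are illustrative only. */
struct slice_budget {
   int64_t per_slice;   /* bytes allowed in one timeslice at the limit */
   int64_t carry;       /* unused bytes carried over from earlier slices */
   int allow_burst;     /* non-zero: unused bytes may be spent later */
};

/*
 * Close a timeslice in which `used` bytes were sent and return the number
 * of bytes that may be sent in the next one.
 */
static int64_t next_slice_allowance(struct slice_budget *b, int64_t used)
{
   assert(b->per_slice > 0 && used >= 0);

   int64_t available = b->per_slice + b->carry;
   int64_t unused = available > used ? available - used : 0;

   /* Without bursting the leftover is lost, so the average rate can only
    * end up below the limit. A real limiter would probably cap the carry. */
   b->carry = b->allow_burst ? unused : 0;
   return b->per_slice + b->carry;
}
```

With a budget of 1000 bytes per slice and 600 bytes used, the next slice gets 1000 bytes without bursting and 1400 bytes with it, which is why a bursting limiter may briefly exceed the configured rate.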
|
Tried allowbandwidthbursting, better, but it doesn't make a difference really. === client-fd Version: 12.4.4 (12 June 2013) VSS Linux Cross-compile Win64 Daemon started 18-Sep-13 17:20. Jobs: run=0 running=1. Microsoft Windows Server 2008 R2 Enterprise Edition Service Pack 1 (build 7601), 64-bit Heap: heap=0 smbytes=333,932 max_bytes=334,036 bufs=151 max_bufs=151 Sizeof: boffset_t=8 size_t=8 debug=0 trace=1 bwlimit=0kB/s Running Jobs: Director connected at: 18-Sep-13 17:21 JobId 7150 Job xxxxxxxx.2013-09-18_17.21.55_03 is running. VSS Full Backup Job started: 18-Sep-13 17:21 Files=3 Bytes=340,144 Bytes/sec=1,137 Errors=0 Bwlimit=50,000 Files Examined=3 Processing file: xxxxxxxx SDReadSeqNo=5 fd=792 === |
|
I have looked into the code and there is something I don't understand. In src/lib/bnet.c the function control_bwlimit is called on both read and write: bsock->control_bwlimit(nread); bsock->control_bwlimit(nwritten); But control_bwlimit uses only one variable, m_last_tick (defined in src/lib/bsock.h). I think reads and writes both overwrite this variable and mess everything up. |
|
As I already stated in my first reply, I don't think bandwidth limiting is ever going to work for any limit below 200 Kbps, and maybe even higher on some platforms, due to the limited resolution of the timers and the sleep method used (e.g. nanosleep is translated in mingw, the cross compiler used, into some native Windows calls, and I seriously wonder whether they give you a good enough resolution). Regarding your observation about the m_last_tick variable: that is nothing more than the system time at which the bandwidth limit was last checked. You can argue that send and receive should have separate counters, but the way it works now is that the total bandwidth is the aggregate of input and output bytes. Given that the backup speed is mainly dominated by write bandwidth anyway (the responses of the SD to the backup stream are minimal, if it sends responses at all while the file data is being blasted; I don't know the protocol by heart without investigating), I think it should be no problem. So overwriting the m_last_tick variable is no problem, as it only gets updated when a full check is run to see whether the data needs to be slowed down by inserting a sleep interval. We also see ranges between 950 and 1100 kbps in regression tests when we limit to 1024 kbps. So it seems to work well enough on some platforms; if you want something more accurate you probably need to look into TC for Linux or some other traffic shaper, which are much more accurate. |
|
Sorry, but I don't see this working at all. From Linux to Linux with bwlimit set to 2500kbps, it falls well short of the mark. === linux-fd Version: 12.4.5 (04 September 2013) x86_64-unknown-linux-gnu unknown unknown Daemon started 17-Sep-13 23:50. Jobs: run=7 running=0. Heap: heap=151,552 smbytes=755,720 max_bytes=851,095 bufs=275 max_bufs=471 Sizeof: boffset_t=8 size_t=8 debug=0 trace=0 bwlimit=0kB/s Running Jobs: JobId 7152 Job xxxxx.2013-09-19_00.44.36_03 is running. Full Backup Job started: 19-Sep-13 00:44 Files=13,794 Bytes=512,785,087 Bytes/sec=1,811,961 Errors=0 Bwlimit=2,500,000 Files Examined=13,794 Processing file: xxxxxxxxx SDReadSeqNo=5 fd=5 === About m_last_tick: the issue is not that read and written bytes are added together; that could be fine, although it does not seem right to me. It is rather that you get into a "race condition". For flow control we need the time elapsed and the amount of traffic sent. The time elapsed is calculated from now and m_last_tick. But if m_last_tick has been modified by received traffic, then the time elapsed is calculated incorrectly: the code is tricked into believing you have sent a lot of traffic in very little time. On the other hand, if we have a precision problem, then we need to call this function only when strictly necessary, to avoid losing precision in every call. I would test it without the read call to see how it goes, but cross-compiling Bareos for Windows is above my level :( Anyway, I have been looking into other code and googling to see how other projects solve this problem. I found that the consensus is to use the select function for this. From man select: === Some code calls select() with all three sets empty, nfds zero, and a non-NULL timeout as a fairly portable way to sleep with subsecond precision. === I looked especially at the ProFTPD project, where I use bwlimit. In src/throttle.c there is some interesting code: === /* Setup for the select. We use select() instead of usleep() because it * seems to be far more portable across platforms.
* * ideal and elapsed are in milleconds, but tv_usec will be microseconds, * so be sure to convert properly. */ tv.tv_usec = (ideal - elapsed) * 1000; tv.tv_sec = tv.tv_usec / 1000000L; tv.tv_usec = tv.tv_usec % 1000000L; pr_log_debug(DEBUG7, "transferring too fast, delaying %ld sec%s, %ld usecs", (long int) tv.tv_sec, tv.tv_sec == 1 ? "" : "s", (long int) tv.tv_usec); /* No interruptions, please... */ xfer_rate_sigmask(TRUE); if (select(0, NULL, NULL, NULL, &tv) < 0) { === It seems portable to Windows too, although with some more help. http://stackoverflow.com/questions/85122/sleep-less-than-one-millisecond === On Windows, however, the use of select forces you to include the Winsock library which has to be initialized like this in your application: WORD wVersionRequested = MAKEWORD(1,0); WSADATA wsaData; WSAStartup(wVersionRequested, &wsaData); And then the select won't allow you to be called without any socket so you have to do a little more to create a microsleep method: int usleep(long usec) { struct timeval tv; fd_set dummy; SOCKET s = socket(PF_INET, SOCK_STREAM, IPPROTO_TCP); FD_ZERO(&dummy); FD_SET(s, &dummy); tv.tv_sec = usec/1000000L; tv.tv_usec = usec%1000000L; return select(0, 0, 0, &dummy, &tv); } All these created usleep methods return zero when successful and non-zero for errors. === Right now, bareos/bacula uses bmicrosleep that uses nanosleep if present, if not it uses something really strange to me (the old way that I doubt it works fine at all): === #ifdef HAVE_NANOSLEEP status = nanosleep(&timeout, NULL); if (!(status < 0 && errno == ENOSYS)) { return status; } /* If we reach here it is because nanosleep is not supported by the OS */ #endif /* Do it the old way */ gettimeofday(&tv, &tz); timeout.tv_nsec += tv.tv_usec * 1000; timeout.tv_sec += tv.tv_sec; while (timeout.tv_nsec >= 1000000000) { timeout.tv_nsec -= 1000000000; timeout.tv_sec++; } === So, I think the select method should be the way to go. |
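For reference, the technique from the select(2) excerpt quoted in this note looks roughly like this on a POSIX system. This is a sketch only: the function name `select_sleep_usec` is made up, and it does not cover the Windows case, which per the quoted answer needs Winsock initialisation and a dummy socket.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>

/*
 * Sleep for `usec` microseconds by calling select() with no descriptor
 * sets and only a timeout. Returns 0 when the timeout expired and -1 on
 * error (for example when interrupted by a signal).
 */
static int select_sleep_usec(long usec)
{
   assert(usec >= 0);

   struct timeval tv;
   tv.tv_sec = usec / 1000000L;    /* whole seconds */
   tv.tv_usec = usec % 1000000L;   /* remaining microseconds */

   return select(0, NULL, NULL, NULL, &tv);
}
```

Calling `select_sleep_usec(2000)` asks for a 2 ms pause. How precisely that is honoured still depends on the kernel's timer resolution, so this changes the primitive but does not by itself guarantee better accuracy.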
|
First of all, as we compile for Windows with MinGW, we use src/win32/compat/include/mingwconfig.h as the config file that determines which options are used; if you look there you will find that HAVE_NANOSLEEP is enabled. As to using select, poll, or the pthreads alternative that is now used when nanosleep is not available, I think they are all comparable. They all use the timeout to avoid hanging in the select, the poll, or the wait on a pthread condition variable that never gets signalled. As to your Linux test, did you run it with Allow Bandwidth Bursting? If so, it is indeed quite far off, although the overall bandwidth calculation is not too accurate either. If not, then yes, that is what I have seen too, and that is why the new option was introduced. We might want to make it the default in a newer version. If you want deeper insight you could try running the fd with a high debug level (I think 450 or higher will trigger the debug messages in the limiting code). As this is seriously hard to debug (if not next to impossible), I see a very low probability that it will be fixed or improved any time soon. That doesn't mean you cannot work on enhancements yourself; if you can show that they behave much better, please send a patch and we will seriously consider changing the code. But currently we are all too busy to work on something that is going to take serious time with little gain, or no promise of any gain at all. |
|
Tested with burst on. Better, but still about 10% below the mark. Both machines are real, not virtual, and both have HPET. === linux-fd Version: 12.4.5 (04 September 2013) x86_64-unknown-linux-gnu unknown unknown Daemon started 19-Sep-13 13:08. Jobs: run=0 running=0. Heap: heap=270,336 smbytes=749,634 max_bytes=755,502 bufs=227 max_bufs=234 Sizeof: boffset_t=8 size_t=8 debug=0 trace=0 bwlimit=0kB/s Running Jobs: JobId 7160 Job xxxxx.2013-09-19_13.09.06_16 is running. Full Backup Job started: 19-Sep-13 13:09 Files=4,496 Bytes=182,409,435 Bytes/sec=2,280,117 Errors=0 Bwlimit=2,500,000 Files Examined=4,496 Processing file: xxxxxx SDReadSeqNo=5 fd=5 === AFAIK mingw does not actually have nanosleep. I think it is defined in mingwconfig.h because in src/win32/compat/compat.c there is a nanosleep that simply calls Sleep: Sleep((req->tv_sec * 1000) + (req->tv_nsec/1000000)); And this is not accurate at all. |
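The precision loss in that expression can be shown without any Windows API. The sketch below only reproduces the integer arithmetic of the quoted Sleep() argument; the helper name `compat_sleep_msec` is hypothetical and nothing here calls Sleep() itself.

```c
#include <assert.h>
#include <time.h>

/*
 * Millisecond value that the expression quoted above would pass to
 * Sleep(). Integer division drops everything below one millisecond, so
 * any request shorter than 1 ms becomes a zero-length sleep.
 */
static long compat_sleep_msec(const struct timespec *req)
{
   assert(req != NULL);
   return (long)((req->tv_sec * 1000) + (req->tv_nsec / 1000000));
}
```

A request of 500 microseconds yields 0 and a request of 1.999999 ms yields 1. On top of this truncation, Sleep() is also bound by the system timer resolution, so short waits can be considerably less precise than the requested value suggests.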
|
OK, nice that we have such a great replacement for nanosleep, while the fallback code in bsys.c probably works better with pthreads. Up until now I think nanosleep was not really used on Windows for anything critical, only for sleeping 0.10 seconds, and that is no real problem for this ancient implementation. As to the 10% offset, that is better than I expected. I think you would have to make the code seriously more complex to get closer, as you would have to keep track, over the full duration of the backup, of the unused bytes in previous timeslices and use those for bursting too. It would also mean people will start complaining that it sometimes uses less and sometimes more than what they want. So I think it will be a serious adventure to improve it much. But like I said before: write a patch, benchmark it, show the better approach, and I will import it. As to the Windows nanosleep problem, I don't have time to look into fixing that right now (just before a small holiday and the OSBConf next week). We have a bunch of other Windows fixes in the development pipeline; I will see if I can put this fix in too and either release it as part of the whole set of Windows patches or, if that takes too long, find another path for bringing it to testing. |
|
Test from linux to linux with low bwlimit is really bad. Burst is on. === linux-fd Version: 12.4.5 (04 September 2013) x86_64-unknown-linux-gnu unknown unknown Daemon started 19-Sep-13 13:26. Jobs: run=0 running=0. Heap: heap=270,336 smbytes=747,713 max_bytes=755,014 bufs=226 max_bufs=227 Sizeof: boffset_t=8 size_t=8 debug=0 trace=0 bwlimit=0kB/s Running Jobs: JobId 7161 Job xxxx.2013-09-19_19.25.39_05 is running. Full Backup Job started: 19-Sep-13 19:25 Files=1,992 Bytes=143,232,644 Bytes/sec=384,001 Errors=0 Bwlimit=25,000 Files Examined=1,992 Processing file: xxxxx SDReadSeqNo=5 fd=5 === |
|
Fix committed to bareos master branch with changesetid 1150. | |
Fix committed to bareos2015 bareos-14.2 branch with changesetid 5028. | |
Due to the reimport of the GitHub repository to bugs.bareos.org, the status of some tickets has been changed. These tickets will be closed again. Sorry for the noise. |
|
bareos: master e5195308 2013-09-27 14:22 Ported: N/A Details Diff |
Bwlimit not working in Windows 64bits client The nanosleep implementation in compat.c for Windows is of poor quality. Try using the fallback to pthread_cond_timedwait() in bsys.c by no longer claiming in mingwconfig.h that we have a working nanosleep(), and remove the poor implementation. There are a couple of ways of sleeping in a somewhat portable way, e.g. select(), poll() or pthread_cond_timedwait(). As we use pthreads everywhere anyway, we keep the pthread_cond_timedwait() method. Also check in the bandwidth limiting code whether bmicrosleep() returns early and, if it does, schedule a new bmicrosleep() for the remaining sleep time. Fixes 0000226: Bwlimit not working in Windows 64bits client |
Affected Issues 0000226 |
|
mod - src/lib/bsock.c | Diff File | ||
mod - src/lib/bsys.c | Diff File | ||
mod - src/win32/compat/compat.c | Diff File | ||
mod - src/win32/compat/include/mingwconfig.h | Diff File | ||
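The "sleep again for the remaining time" part of the changeset description above can be sketched like this. It is not the committed code: POSIX nanosleep() stands in for Bareos' bmicrosleep(), and the names `now_usec` and `sleep_full_usec` are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Current monotonic time in microseconds. */
static int64_t now_usec(void)
{
   struct timespec ts;
   clock_gettime(CLOCK_MONOTONIC, &ts);
   return (int64_t)ts.tv_sec * 1000000 + ts.tv_nsec / 1000;
}

/*
 * Sleep for at least `usec` microseconds. If the underlying sleep returns
 * early (signal, coarse timer), sleep again for the time that remains.
 * nanosleep() is a stand-in for bmicrosleep() here.
 * Returns the time actually slept, in microseconds.
 */
static int64_t sleep_full_usec(int64_t usec)
{
   assert(usec >= 0);

   const int64_t start = now_usec();
   int64_t remaining = usec;

   while (remaining > 0) {
      struct timespec req;
      req.tv_sec = (time_t)(remaining / 1000000);
      req.tv_nsec = (long)(remaining % 1000000) * 1000;
      (void)nanosleep(&req, NULL);               /* may return early */
      remaining = usec - (now_usec() - start);   /* re-check the clock */
   }
   return now_usec() - start;
}
```

The loop measures elapsed time against a monotonic clock instead of trusting the sleep call, so an early return only costs one more iteration. It cannot correct a sleep that overshoots, which is where bursting comes in.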
bareos2015: bareos-14.2 8dd095e3 2013-09-27 16:22 Ported: N/A Details Diff |
Bwlimit not working in Windows 64bits client The nanosleep implementation in compat.c for Windows is of poor quality. Try using the fallback to pthread_cond_timedwait() in bsys.c by no longer claiming in mingwconfig.h that we have a working nanosleep(), and remove the poor implementation. There are a couple of ways of sleeping in a somewhat portable way, e.g. select(), poll() or pthread_cond_timedwait(). As we use pthreads everywhere anyway, we keep the pthread_cond_timedwait() method. Also check in the bandwidth limiting code whether bmicrosleep() returns early and, if it does, schedule a new bmicrosleep() for the remaining sleep time. Fixes 0000226: Bwlimit not working in Windows 64bits client |
Affected Issues 0000226 |
|
mod - src/lib/bsock.c | Diff File | ||
mod - src/lib/bsys.c | Diff File | ||
mod - src/win32/compat/compat.c | Diff File | ||
mod - src/win32/compat/include/mingwconfig.h | Diff File |
Date Modified | Username | Field | Change |
---|---|---|---|
2013-09-18 13:12 | cvelasco | New Issue | |
2013-09-18 14:54 | cvelasco | Note Added: 0000663 | |
2013-09-18 16:00 | mvwieringen | Note Added: 0000664 | |
2013-09-18 16:01 | mvwieringen | Assigned To | => mvwieringen |
2013-09-18 16:01 | mvwieringen | Status | new => feedback |
2013-09-18 16:08 | cvelasco | Note Added: 0000665 | |
2013-09-18 16:08 | cvelasco | Status | feedback => assigned |
2013-09-18 16:19 | mvwieringen | Note Added: 0000666 | |
2013-09-18 16:20 | mvwieringen | Status | assigned => feedback |
2013-09-18 17:29 | cvelasco | Note Added: 0000667 | |
2013-09-18 17:29 | cvelasco | Status | feedback => assigned |
2013-09-18 19:49 | cvelasco | Note Added: 0000668 | |
2013-09-18 19:50 | cvelasco | Note Edited: 0000668 | |
2013-09-18 20:09 | mvwieringen | Note Added: 0000669 | |
2013-09-18 20:09 | mvwieringen | Status | assigned => feedback |
2013-09-19 01:56 | cvelasco | Note Added: 0000670 | |
2013-09-19 01:56 | cvelasco | Status | feedback => assigned |
2013-09-19 09:53 | mvwieringen | Note Added: 0000671 | |
2013-09-19 09:59 | mvwieringen | Assigned To | mvwieringen => |
2013-09-19 09:59 | mvwieringen | Status | assigned => feedback |
2013-09-19 13:23 | cvelasco | Note Added: 0000672 | |
2013-09-19 13:23 | cvelasco | Status | feedback => new |
2013-09-19 15:10 | mvwieringen | Note Added: 0000673 | |
2013-09-19 19:34 | cvelasco | Note Added: 0000674 | |
2013-09-27 17:17 | mvwieringen | Changeset attached | => bareos master e5195308 |
2013-09-27 17:17 | mvwieringen | Note Added: 0000678 | |
2013-09-27 17:17 | mvwieringen | Assigned To | => mvwieringen |
2013-09-27 17:17 | mvwieringen | Status | new => resolved |
2013-09-27 17:17 | mvwieringen | Resolution | open => fixed |
2013-11-16 15:32 | mvwieringen | Status | resolved => closed |
2013-11-16 15:32 | mvwieringen | Assigned To | mvwieringen => |
2015-03-25 16:51 | mvwieringen | Changeset attached | => bareos2015 bareos-14.2 8dd095e3 |
2015-03-25 16:51 | mvwieringen | Note Added: 0001488 | |
2015-03-25 16:51 | mvwieringen | Status | closed => resolved |
2015-03-25 19:19 | joergs | Note Added: 0001633 | |
2015-03-25 19:19 | joergs | Status | resolved => closed |