When I run this command on 10-STABLE on a uniprocessor system while
running the misc/dnetc port:

cd /usr/src
time make buildworld && time make buildkernel && time make installkernel

On revision 266422 with SCHED_ULE, I get (showing the time lines only):

7045.988u 897.681s 4:00:33.89 55.0% 29430+492k 27927+17003io 30943pf+519w
1155.683u 149.422s 52:49.60 41.1% 25418+410k 7452+20843io 12166pf+248w
7.101u 4.838s 8:03.57 2.4% 5905+221k 1179+9461io 1345pf+67w

On revision 267211 with SCHED_4BSD:

6950.087u 665.074s 2:40:36.19 79.0% 29929+502k 33651+17368io 31151pf+151w
1148.066u 134.312s 26:40.95 80.1% 26234+426k 9681+24613io 11917pf+106w
6.774u 4.369s 0:33.90 32.8% 3110+320k 1388+10979io 1514pf+3w

Since the majority of my systems are uniprocessors and I like to
run dnetc, SCHED_ULE has been a dealbreaker for me since day one.
Consequently I can't use freebsd_update.

The party line seems to be, "Well, everybody knows SCHED_ULE sucks
on uniprocessors." Hello? Not everybody has upgraded to multiple
core or hyperthreaded processors yet. Do we really want to write
off every uniprocessor piece of hardware out here?

The other assertion I hear is that SCHED_ULE really excels on some
unspecified workload or other. I'd love to see exactly how much
better it does than 4BSD on these mythological loads.

-- George
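For readers who want the wall-clock comparison as numbers, here is a minimal sketch that converts the elapsed-time fields quoted above to seconds and prints the ULE/4BSD ratio for each step. The helper names `to_seconds` and `slowdown` are made up for this illustration; only the time values come from the message.

```shell
#!/bin/sh
# Convert an elapsed-time field ([hh:]mm:ss.ss, as printed by csh's time)
# to seconds. Hypothetical helper, not part of any FreeBSD tool.
to_seconds() {
    echo "$1" | awk -F: '{ s = 0; for (i = 1; i <= NF; i++) s = s * 60 + $i; printf "%.2f\n", s }'
}

# Ratio of two elapsed times (first / second), rounded to two decimals.
slowdown() {
    awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" 'BEGIN { printf "%.2f\n", a / b }'
}

slowdown 4:00:33.89 2:40:36.19   # buildworld,    ULE vs 4BSD -> 1.50
slowdown 52:49.60 26:40.95       # buildkernel,   ULE vs 4BSD -> 1.98
slowdown 8:03.57 0:33.90         # installkernel, ULE vs 4BSD -> 14.26
```

So with dnetc running, the ULE run took roughly 1.5 times as long for buildworld, about twice as long for buildkernel, and about 14 times as long for installkernel.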
On Sun, Jun 8, 2014 at 11:15 AM, George Mitchell <george+freebsd at m5p.com> wrote:
> When I run this command on 10-STABLE on a uniprocessor system while
> running the misc/dnetc port:
>
> cd /usr/src
> time make buildworld && time make buildkernel && time make installkernel
>
> On revision 266422 with SCHED_ULE, I get (showing the time lines only):
>
> 7045.988u 897.681s 4:00:33.89 55.0% 29430+492k 27927+17003io 30943pf+519w
> 1155.683u 149.422s 52:49.60 41.1% 25418+410k 7452+20843io 12166pf+248w
> 7.101u 4.838s 8:03.57 2.4% 5905+221k 1179+9461io 1345pf+67w
>
> On revision 267211 with SCHED_4BSD:
>
> 6950.087u 665.074s 2:40:36.19 79.0% 29929+502k 33651+17368io 31151pf+151w
> 1148.066u 134.312s 26:40.95 80.1% 26234+426k 9681+24613io 11917pf+106w
> 6.774u 4.369s 0:33.90 32.8% 3110+320k 1388+10979io 1514pf+3w
>
> Since the majority of my systems are uniprocessors and I like to
> run dnetc, SCHED_ULE has been a dealbreaker for me since day one.
> Consequently I can't use freebsd_update.
>
> The party line seems to be, "Well, everybody knows SCHED_ULE sucks
> on uniprocessors." Hello? Not everybody has upgraded to multiple
> core or hyperthreaded processors yet. Do we really want to write
> off every uniprocessor piece of hardware out here?
>
> The other assertion I hear is that SCHED_ULE really excels on some
> unspecified workload or other. I'd love to see exactly how much
> better it does than 4BSD on these mythological loads.
>
> -- George

I am also at least ambivalent about the merits of ULE.

The choice to run 4BSD does not prevent the use of freebsd-update. It does require a kernel rebuild after the update, though. Just keep a GENERIC kernel (which can be downloaded for any release) in /boot. freebsd-update will use that for the upgrade. Once the upgrade is complete you can build the kernel as you wish, and reboot. Just leave /boot/GENERIC there to have it ready for the next time you need to upgrade or update.
There is no need to rebuild the modules when all your custom kernel does is to switch schedulers, so it is fairly quick. This may not be adequate for you, but it does remove the need for many of the steps in the full system re-build. Full instructions for all of this are in Chapter 23, Section 2 of the Handbook.

--
R. Kevin Oberman, Network Engineer, Retired
E-mail: rkoberman at gmail.com
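As an illustration of a custom kernel whose only change is the scheduler, a kernel configuration file along the following lines should be enough. This is a sketch, not something taken from the thread: the name MY4BSD and the file location are hypothetical, and it assumes the stock GENERIC config for your architecture.

```
# Hypothetical file: /usr/src/sys/<arch>/conf/MY4BSD
include     GENERIC
ident       MY4BSD

nooptions   SCHED_ULE     # drop the scheduler that GENERIC selects
options     SCHED_4BSD    # use the traditional 4BSD scheduler instead
```

It would then be built and installed from /usr/src in the usual way, for example with `make buildkernel KERNCONF=MY4BSD` followed by `make installkernel KERNCONF=MY4BSD`.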
On Sun, Jun 8, 2014 at 8:15 PM, George Mitchell <george+freebsd at m5p.com> wrote:
> When I run this command on 10-STABLE on a uniprocessor system while
> running the misc/dnetc port:
>
> cd /usr/src
> time make buildworld && time make buildkernel && time make installkernel
>
> On revision 266422 with SCHED_ULE, I get (showing the time lines only):
>
> 7045.988u 897.681s 4:00:33.89 55.0% 29430+492k 27927+17003io 30943pf+519w
> 1155.683u 149.422s 52:49.60 41.1% 25418+410k 7452+20843io 12166pf+248w
> 7.101u 4.838s 8:03.57 2.4% 5905+221k 1179+9461io 1345pf+67w
>
> On revision 267211 with SCHED_4BSD:
>
> 6950.087u 665.074s 2:40:36.19 79.0% 29929+502k 33651+17368io 31151pf+151w
> 1148.066u 134.312s 26:40.95 80.1% 26234+426k 9681+24613io 11917pf+106w
> 6.774u 4.369s 0:33.90 32.8% 3110+320k 1388+10979io 1514pf+3w
>
> Since the majority of my systems are uniprocessors and I like to
> run dnetc, SCHED_ULE has been a dealbreaker for me since day one.
> Consequently I can't use freebsd_update.
>
> The party line seems to be, "Well, everybody knows SCHED_ULE sucks
> on uniprocessors." Hello? Not everybody has upgraded to multiple
> core or hyperthreaded processors yet. Do we really want to write
> off every uniprocessor piece of hardware out here?

Yes? Can you even buy a system today that is uniprocessor? My phone is a dual-core thing, and it got written off because of its "meagre" hardware. Top-of-the-line phones have 8 cores. So, seriously, what non-ancient system have you acquired that is uniprocessor? Please include links for available hardware for laptops, desktops or servers.

/A

> The other assertion I hear is that SCHED_ULE really excels on some
> unspecified workload or other. I'd love to see exactly how much
> better it does than 4BSD on these mythological loads.
> -- George
George Mitchell wrote this message on Sun, Jun 08, 2014 at 14:15 -0400:
> When I run this command on 10-STABLE on a uniprocessor system while
> running the misc/dnetc port:
>
> cd /usr/src
> time make buildworld && time make buildkernel && time make installkernel
>
> On revision 266422 with SCHED_ULE, I get (showing the time lines only):
>
> 7045.988u 897.681s 4:00:33.89 55.0% 29430+492k 27927+17003io 30943pf+519w
> 1155.683u 149.422s 52:49.60 41.1% 25418+410k 7452+20843io 12166pf+248w
> 7.101u 4.838s 8:03.57 2.4% 5905+221k 1179+9461io 1345pf+67w
>
> On revision 267211 with SCHED_4BSD:
>
> 6950.087u 665.074s 2:40:36.19 79.0% 29929+502k 33651+17368io 31151pf+151w
> 1148.066u 134.312s 26:40.95 80.1% 26234+426k 9681+24613io 11917pf+106w
> 6.774u 4.369s 0:33.90 32.8% 3110+320k 1388+10979io 1514pf+3w
>
> Since the majority of my systems are uniprocessors and I like to
> run dnetc, SCHED_ULE has been a dealbreaker for me since day one.
> Consequently I can't use freebsd_update.
>
> The party line seems to be, "Well, everybody knows SCHED_ULE sucks
> on uniprocessors." Hello? Not everybody has upgraded to multiple
> core or hyperthreaded processors yet. Do we really want to write
> off every uniprocessor piece of hardware out here?
>
> The other assertion I hear is that SCHED_ULE really excels on some
> unspecified workload or other. I'd love to see exactly how much
> better it does than 4BSD on these mythological loads.
>
> -- George

Were you running dnetc at the same time as buildworld? If you were, did you also measure how much work dnetc did during the same period of time?

If you were running dnetc, your complaint is that one processor hog wasn't able to hog the processor as much as another processor hog?

If the numbers above are to be believed, _ULE is doing a better job than _4BSD since it more evenly shared the processor w/ the other processor hog, in that they both got ~50% of the cpu...
If this is the case, then you need to use nice w/ buildworld to give it higher priority...

Also, you did not say if you've applied the various sysctl changes that have been suggested on the mailing list...

--
John-Mark Gurney				Voice: +1 415 225 5579

"All that I will do, has been done, All that I have, has not."
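The "~50% of the cpu" reading can be checked against the quoted time lines: the percentage field is (user + system) / elapsed. The sketch below recomputes it for the two buildworld runs; `cpu_share` is a hypothetical helper written for this note, and only the input numbers come from the thread.

```shell
#!/bin/sh
# Percentage of wall-clock time the timed command spent on the CPU:
# 100 * (user + system) / elapsed.
# Arguments: user seconds, system seconds, elapsed as [hh:]mm:ss.ss
cpu_share() {
    awk -v u="$1" -v s="$2" -v e="$3" 'BEGIN {
        n = split(e, f, ":"); t = 0
        for (i = 1; i <= n; i++) t = t * 60 + f[i]   # elapsed time in seconds
        printf "%.1f\n", 100 * (u + s) / t
    }'
}

cpu_share 7045.988 897.681 4:00:33.89   # buildworld under SCHED_ULE  -> 55.0
cpu_share 6950.087 665.074 2:40:36.19   # buildworld under SCHED_4BSD -> 79.0
```

On a uniprocessor the remainder went either to other processes such as dnetc or to idle and I/O wait; the time output alone cannot say which, so measuring dnetc's own throughput during the build would be needed to settle it.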
On Sun, Jun 08, 2014 at 02:15:36PM -0400, George Mitchell wrote:
> Since the majority of my systems are uniprocessors and I like to
> run dnetc, SCHED_ULE has been a dealbreaker for me since day one.
> Consequently I can't use freebsd_update.
>
> The party line seems to be, "Well, everybody knows SCHED_ULE sucks
> on uniprocessors." Hello? Not everybody has upgraded to multiple
> core or hyperthreaded processors yet. Do we really want to write
> off every uniprocessor piece of hardware out here?
>
> The other assertion I hear is that SCHED_ULE really excels on some
> unspecified workload or other. I'd love to see exactly how much
> better it does than 4BSD on these mythological loads.
>
> -- George

It doesn't seem to be only on uniprocessor systems that 4BSD is the better choice.

Another time when the schedulers were discussed on these lists, I first checked ULE, which I was using, and then 4BSD, with a workload I knew rendered the two-core machine close to unusable. I simply disconnected power to get some unclean filesystems and then tried to use the machine while the background filesystem check was running. Usage was running text editors, X, ssh, browsers and the like.

4BSD performed better. The machine was almost usable with a little patience.

Since then I've usually changed to 4BSD on other machines as well, and on at least one 4-core machine I notice that, according to top, load is spread more evenly among the processors. While compiling base and some ports at the same time I've seen ULE keeping one processor busy while the others are close to 100 % idle, while 4BSD seems to keep all of them at least halfway busy.

I don't have any numbers other than that, though; changing to 4BSD comes more from how using the system feels with each of them.

Eivind N Evensen
On 0608T1415, George Mitchell wrote:
> When I run this command on 10-STABLE on a uniprocessor system while
> running the misc/dnetc port:
>
> cd /usr/src
> time make buildworld && time make buildkernel && time make installkernel
>
> On revision 266422 with SCHED_ULE, I get (showing the time lines only):
>
> 7045.988u 897.681s 4:00:33.89 55.0% 29430+492k 27927+17003io 30943pf+519w
> 1155.683u 149.422s 52:49.60 41.1% 25418+410k 7452+20843io 12166pf+248w
> 7.101u 4.838s 8:03.57 2.4% 5905+221k 1179+9461io 1345pf+67w
>
> On revision 267211 with SCHED_4BSD:
>
> 6950.087u 665.074s 2:40:36.19 79.0% 29929+502k 33651+17368io 31151pf+151w
> 1148.066u 134.312s 26:40.95 80.1% 26234+426k 9681+24613io 11917pf+106w
> 6.774u 4.369s 0:33.90 32.8% 3110+320k 1388+10979io 1514pf+3w
>
> Since the majority of my systems are uniprocessors and I like to
> run dnetc, SCHED_ULE has been a dealbreaker for me since day one.
> Consequently I can't use freebsd_update.

Are the results above from a kernel built with "options SMP"? If so, could you try without it? Otherwise, can you try with that option?
On Sun, Jun 8, 2014 at 8:15 PM, George Mitchell <george+freebsd at m5p.com> wrote:
> When I run this command on 10-STABLE on a uniprocessor system while
> running the misc/dnetc port:
>
> cd /usr/src
> time make buildworld && time make buildkernel && time make installkernel
>
> On revision 266422 with SCHED_ULE, I get (showing the time lines only):
>
> 7045.988u 897.681s 4:00:33.89 55.0% 29430+492k 27927+17003io 30943pf+519w
> 1155.683u 149.422s 52:49.60 41.1% 25418+410k 7452+20843io 12166pf+248w
> 7.101u 4.838s 8:03.57 2.4% 5905+221k 1179+9461io 1345pf+67w
>
> On revision 267211 with SCHED_4BSD:
>
> 6950.087u 665.074s 2:40:36.19 79.0% 29929+502k 33651+17368io 31151pf+151w
> 1148.066u 134.312s 26:40.95 80.1% 26234+426k 9681+24613io 11917pf+106w
> 6.774u 4.369s 0:33.90 32.8% 3110+320k 1388+10979io 1514pf+3w
>
> Since the majority of my systems are uniprocessors and I like to
> run dnetc, SCHED_ULE has been a dealbreaker for me since day one.
> Consequently I can't use freebsd_update.

So I think that the problem here is that dnetc essentially behaves in entirely different ways on the 2 systems, but you just don't care about how much work it is able to carry out during your buildworld workload.

As a high-level description, it is as if CPU runtime is partitioned in a balanced way in the ULE case, while for 4BSD there is a huge bias toward the buildworld CPU%.

Both threads (actually a set of threads for buildworld) fall in the time-share priority range, and they are all treated the same by the scheduler. However, unlike 4BSD, ULE has algorithms that dynamically adjust the priority of threads in order to calculate the interactivity scores properly, and that dynamically recalculate the thresholds for the RR quantum for timeshare-priority threads. The quantum decreases proportionally as the runqueue load grows.
This essentially means that the more stuff you push onto the runqueue (and buildworld spawns quite a few threads; in your schedgraph traces there were around 5/6 new ones), the higher the turnaround will be in order to properly partition CPU time among all these timeshare-priority threads. It also means there will be many more context switches than in the 4BSD case.

In the 4BSD case, instead, the RR quantum remains essentially fixed. I cannot say for sure because I don't know its code, but I expect that dnetc has some provision for yielding manually after a "fraction" of the expected RR quantum is used. So I expect this computational time to be smaller than 100ms (the default quantum time slice for 4BSD).

To get out of this situation and prove that what I'm saying is right, you can try 2 different things:

- Renice the buildworld to get it out of the timeshare-priority area and bring it into the kernel/real-time area. I suspect this will leave dnetc able to do very little work, but I expect you don't really care. However, it will also make the workload compete against kernel services.
- Enlarge your RR quantum for timeshare-priority threads. You can do that via the kern.sched.quantum sysctl. I think you can aim for 200ms. I think this should be the preferred option.

Of course, please keep disabling the SMP option for your uniprocessor kernel.

Attilio

--
Peace can only be achieved by understanding - A. Einstein
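To make the quantum argument concrete, here is a toy model of a time slice that shrinks as the runqueue load grows. It is only an illustration of the behaviour described above, not code from the FreeBSD scheduler: the divide-by-load rule, the floor of one sixth of the base slice, and the function name `quantum_ms` are all assumptions made for this sketch.

```shell
#!/bin/sh
# Toy model only: effective timeshare quantum for a given runqueue load.
# base_ms, the divide-by-load rule and min_divisor are illustrative
# assumptions, not values read from the kernel sources.
quantum_ms() {
    base_ms=$1; load=$2; min_divisor=${3:-6}
    if [ "$load" -le 1 ]; then
        echo "$base_ms"                      # a lone thread keeps the full slice
    elif [ "$load" -ge "$min_divisor" ]; then
        echo $((base_ms / min_divisor))      # never shrink below base/min_divisor
    else
        echo $((base_ms / load))             # slice shrinks as load grows
    fi
}

for load in 1 2 4 8; do
    echo "load=$load: base 100ms -> $(quantum_ms 100 "$load")ms, base 200ms -> $(quantum_ms 200 "$load")ms"
done

# On the real system (as root), the second suggestion would look something like:
#   sysctl -d kern.sched.quantum       # shows the description, including the unit
#   sysctl kern.sched.quantum=200000   # 200ms, ASSUMING the unit is microseconds
```

In this model, doubling the base quantum doubles the slice each thread gets at any given load, which is the effect the kern.sched.quantum suggestion is after. Check the unit reported by `sysctl -d` before setting a value.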