Jobst Schmalenbach
2019-Jul-04 06:43 UTC
[CentOS] Performance issues/difference of two servers running same task (one is quicker)
Hi,

I need some advice on what to do next, even if someone tells me to check out another mailing list or tuning site, or points me in a better direction for solving my annoying problem: one server is much faster for certain tasks although it runs on "shitty" hardware.

I have tried many things to solve my issue:
- changed buffer/pool/cache/etc. settings for mysqld
- changed server settings for apache/php
- changed various OS settings (sysctl), e.g. turned off IPv6
but haven't figured it out.

I have a development server (local) and live servers (data center), used mainly for many different websites and one online training site.

The development and live server in question run the same software setup:
- CentOS Linux release 7.6.1810
- bind 32:9.9.4-74.el7_6.1
- Apache/2.4.6 (CentOS)
- PHP 7.1.29
- mysqld Ver 5.7.26
- wordpress, woocommerce, wishlistmember, Sensei etc
- software is all at the same stage of updates
- even many of the Linux conf files are the same (/etc/host, bind, etc)
- the databases are copies/identical

Live server is a PowerEdge M710, 48GB, 2x Xeon L5630, LSI RAID1 SSD
Dev server is a DIY, GIGABYTE MX31-BS0, 32GB, 1x Xeon E3-1245, MDADM RAID0 1TB Seagate Spinners

Clearly the development server is hardware-wise way below the specs of the Dell, but software-wise they are identical (they get upgraded at the same time).

During normal operations (i.e. displaying websites, online training courses etc) the DELL displays the websites faster, although it sits 1000 km up north in a datacenter on a different network, while the local server is on the same network as my machine.

Yet the DEV server outshines the DELL when creating a few large custom tables, i.e. the local server takes 5s while the DELL takes 15s (small tables), more for bigger tables.

The task is based on:
- level, member, course, group are all IDs
- members can belong to a group and a level, and can access many courses
- the ID restricts what they can access and what they belong to
- a course for each member can have various stages of completion
- using an API (wishlist member) that performs LOCAL calls when accessed locally, I can get who belongs to what and collect the info I need, then use PHP to build the table
- DB calls ARE LOCAL!

Now when I try to create a table of members belonging to the same group and level, doing the same course with different stages of completion, the DELL takes on average 3 times longer to complete the table (normally about 20 to 30 rows).

I have put microtime() calls before and after certain calls, and it's visibly different:

DEV
Jul 04 04:57:26 UTC _members took 0.0005459785461425 ms
Jul 04 04:57:26 UTC _members took 0.0005321502685546 ms
LIVE
Jul 04 05:00:36 UTC _members took 0.0014369487762451 ms
Jul 04 05:00:36 UTC _members took 0.0013291835784912 ms

If I do this 300+ times, the outcome is very different.

So my questions:

- How can it be that the DELL takes so much longer although it is on the far better hardware?
- How can it be, although everything (software/os/plugins) is the same?
- This even happens if the DELL is on low load (i.e. middle of the night) and only serves a few requests.

Same software, same config, same database, same amount of data in the database, yet on better hardware it's slower?

Any ideas anyone?

-- 
Jobst Schmalenbach
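To put a number on the per-call gap, the four microtime() samples above can be averaged and compared. The following is only a minimal sketch: the helper name mean_of is made up for illustration, and two samples per host are far too few to be statistically meaningful.

```shell
#!/bin/sh
# Average the per-call timings quoted in the post and report how much
# slower the live box is. The four samples are copied from the post;
# the function name mean_of is hypothetical.
mean_of() {
    # Print the arithmetic mean of all arguments.
    printf '%s\n' "$@" | awk '{ s += $1 } END { if (NR) printf "%.10f\n", s / NR }'
}

dev_mean=$(mean_of 0.0005459785461425 0.0005321502685546)
live_mean=$(mean_of 0.0014369487762451 0.0013291835784912)

# Ratio of the means: how many times longer one _members call takes on LIVE.
ratio=$(awk -v l="$live_mean" -v d="$dev_mean" 'BEGIN { printf "%.2f", l / d }')
echo "LIVE/DEV per-call ratio: $ratio"
```

A ratio in this range is roughly in line with the "3 times longer" seen for the whole table, which hints that the gap is spread evenly over many small calls rather than caused by one slow step.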
Simon Matter
2019-Jul-04 07:07 UTC
[CentOS] Performance issues/difference of two servers running same task (one is quicker)
> [...]
> Same software, same config, same database, same amount of data in the
> database
> yet on better hardware it's slower?

Two ideas:

a) The DELL may be faster overall, but if I'm right its single-core speed is slower than that of the DEV machine.

b) How do the LSI/SSD perform compared to the MDADM/RAID0 on the DEV server? I'm not sure the DELL is a clear winner here.

Regards,
Simon
Roberto Ragusa
2019-Jul-04 07:39 UTC
[CentOS] Performance issues/difference of two servers running same task (one is quicker)
On 7/4/19 8:43 AM, Jobst Schmalenbach wrote:
> Clearly the development server is hardware wise way below the specs of the Dell but
> software wise they are identical (they get upgraded at the same time).

As a first step, you have to test subsystems one by one.

Try this to see how fast the CPU and kernel are (including meltdown/spectre slowdowns):

    time dd 2>/dev/null if=/dev/zero of=/dev/null bs=1 count=1000000

Then try this to see how fast your disks are for DB operations:

    cd /a/directory/on/the/filesystem/you/want/to/test
    time bash -c "for((i=0;i<1000;i++)); do dd 2>/dev/null if=/dev/zero of=test bs=1 count=1 conv=fsync;done"
    rm test

Regards.
-- 
Roberto Ragusa    mail at robertoragusa.it
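Since the dd test above is meant to include the meltdown/spectre slowdowns, it may help to see which mitigations each kernel reports before comparing the numbers. This is a sketch that assumes the kernel exposes /sys/devices/system/cpu/vulnerabilities (kernels carrying the mitigation patches generally do); the function name and its optional directory argument are illustrative only.

```shell
#!/bin/sh
# List the kernel's reported CPU vulnerability mitigations.
# show_mitigations and its optional directory argument are hypothetical
# helpers for this sketch, not part of any tool.
show_mitigations() {
    dir=${1:-/sys/devices/system/cpu/vulnerabilities}
    if [ ! -d "$dir" ]; then
        echo "no vulnerabilities directory at $dir"
        return 0
    fi
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        # One line per entry: file name, then the kernel's status text.
        printf '%-20s %s\n' "$(basename "$f")" "$(cat "$f")"
    done
}

show_mitigations
```

Running it on both hosts and diffing the output would show whether one machine pays for a mitigation that the other does not.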
Gordon Messmer
2019-Jul-04 17:46 UTC
[CentOS] Performance issues/difference of two servers running same task (one is quicker)
On 7/3/19 11:43 PM, Jobst Schmalenbach wrote:
> - How can it be that the DELL takes so much longer alltough on the far better hardware?

It looks like the DIY system has a CPU that's nearly twice as fast as the Dell's. The additional CPU in the Dell will run more tasks concurrently, but it won't make a single process faster.

You might also think that the SSD RAID would make the Dell faster, but that will only be true if the process that you're testing performs a significant amount of IO. If your DB operations are happening mostly in memory (that is, if the data is cached), then the faster CPU will be the primary determining factor.

The other thing that you left out of your description is the amount of data on each server. If your live server has a lot of data in its DB and the dev system has a small dataset suitable for testing, then generally you'd expect that the dev system's data is more likely to live in cache and avoid disk IO, and processing the smaller set will also take less CPU time.

https://www.cpubenchmark.net/cpu.php?cpu=Intel+Xeon+E3-1245+%40+3.30GHz&id=1202
https://www.cpubenchmark.net/cpu.php?cpu=Intel+Xeon+L5630+%40+2.13GHz&id=2086
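The point about single-process speed versus CPU count can be checked on each box by looking at what the kernel reports. The sketch below assumes an x86 /proc/cpuinfo layout with "model name" lines; cpu_summary is a hypothetical helper name.

```shell
#!/bin/sh
# Summarise logical CPU count and model from a cpuinfo-style file.
# cpu_summary is a made-up helper; the optional argument exists so the
# function can be pointed at a saved copy of /proc/cpuinfo.
cpu_summary() {
    file=${1:-/proc/cpuinfo}
    awk -F': *' '
        /^model name/ { model = $2; n++ }
        END { printf "%d logical CPUs, model: %s\n", n, model }
    ' "$file"
}

cpu_summary
```

The clock speeds in the two benchmark URLs above (3.30 GHz against 2.13 GHz) already give a ratio of about 1.55 before any per-clock improvement of the newer core is counted, so a single PHP or mysqld thread has good reason to run faster on the DIY box.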
Steven Tardy
2019-Jul-05 05:18 UTC
[CentOS] Performance issues/difference of two servers running same task (one is quicker)
On Thu, Jul 4, 2019 at 2:43 AM Jobst Schmalenbach <jobst at barrett.com.au> wrote:
> [...]
> Life server is a Poweredge M710,48GB,2xXeon L5630,LSI Raid1 SSD
> Dev server is a DIY, GIGABYTE MX31-BS0, 32GB, 1xXeon E3-1245,MDADM RAID0
> 1TB Seagate Spinners
> [...]
> - How can it be that the DELL takes so much longer alltough on the far
> better hardware?
> - How can it be allthough everything (software/os/plugins) is the same?
> - This even happens if the DELL is on low load (i.e. middle of the night)
> and
> only serves a few requests.

As others have said, the DEV server has a CPU that is a generation newer. For CPU details I often reference Intel's "ARK" pages:

https://ark.intel.com/content/www/us/en/ark/products/47927/intel-xeon-processor-l5630-12m-cache-2-13-ghz-5-86-gt-s-intel-qpi.html
12M Cache, 2.13 GHz, 5.86 GT/s Intel® QPI

https://ark.intel.com/content/www/us/en/ark/products/52274/intel-xeon-processor-e3-1245-8m-cache-3-30-ghz.html
8M Cache, 3.30 GHz

The "generations" I mentioned are:

Code Name: Products formerly Westmere EP
<https://ark.intel.com/content/www/us/en/ark/products/codename/54534/westmere-ep.html>
Code Name: Products formerly Sandy Bridge
<https://ark.intel.com/content/www/us/en/ark/products/codename/29900/sandy-bridge.html>

Westmere systems used DDR3 at 800/1066 MHz.
Sandy Bridge systems used DDR3 at 1066/1333 MHz.
Not a huge difference, but likely another contributing factor in performance.

I would also look at the power settings in the BIOS and the c-state settings in the BIOS and OS, as disabling c-states (often enabled by default to meet green/Energy Star compliance) can make a noticeable performance difference.

Hope that helps.
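Before changing any BIOS or c-state settings, the OS-side view can be read from sysfs on both machines. This is a sketch under assumptions: the three paths are the usual Linux locations for the cpufreq governor, the cpuidle driver and the intel_idle c-state limit, but any of them may be absent depending on hardware and loaded drivers; show_power_settings is a hypothetical helper.

```shell
#!/bin/sh
# Print the power-management settings the kernel exposes, if present.
# show_power_settings and its optional root argument are illustrative.
show_power_settings() {
    root=${1:-/sys}
    for f in \
        "$root/devices/system/cpu/cpu0/cpufreq/scaling_governor" \
        "$root/devices/system/cpu/cpuidle/current_driver" \
        "$root/module/intel_idle/parameters/max_cstate"
    do
        # Strip the root prefix so output from two hosts is easy to diff.
        name=${f#"$root"/}
        if [ -r "$f" ]; then
            printf '%s = %s\n' "$name" "$(cat "$f")"
        else
            printf '%s = (not available)\n' "$name"
        fi
    done
}

show_power_settings
```

If the two hosts print different governors or c-state limits, that difference is worth ruling out before digging further into MySQL or PHP.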
Gordon Messmer
2019-Jul-05 18:48 UTC
[CentOS] Have you run "tuned-adm profile throughput-performance" ?
On 7/4/19 10:18 PM, Steven Tardy wrote:
> I would also look at power settings in the BIOS and c-state settings in the
> BIOS and OS as disabling c-states (often enabled by default to meet
> green/energy star compliance) can make a noticeable performance difference.

I'd be surprised if it did, but now that you mention it, I think that we should probably mention more often that CentOS's default performance policy is power-saving, which will cut maximum performance in half. Every physical system running CentOS should have run "tuned-adm profile throughput-performance".

http://jperrin.org/centos/boosting-centos-server-performance/
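Checking which tuned profile is currently active on each server is a cheap first step before switching. The sketch below parses the "Current active profile: ..." line that tuned-adm active prints; active_profile is a made-up helper name, and the block does nothing if tuned-adm is not installed.

```shell
#!/bin/sh
# Report the active tuned profile, if tuned-adm is available.
# active_profile is a hypothetical helper for this sketch.
active_profile() {
    # Extract the profile name from a line such as
    # "Current active profile: balanced".
    printf '%s\n' "$1" | sed -n 's/^Current active profile: *//p'
}

if command -v tuned-adm >/dev/null 2>&1; then
    current=$(active_profile "$(tuned-adm active 2>/dev/null)")
    echo "active tuned profile: ${current:-unknown}"
    # To switch, as suggested above (needs root):
    #   tuned-adm profile throughput-performance
fi
```

The switch itself is left as a comment so that the sketch only reads state; rerunning the timing tests before and after the change would show whether the profile matters for this workload.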
Jobst Schmalenbach
2019-Jul-06 00:37 UTC
[CentOS] Performance issues/difference of two servers running same task (one is quicker)
On Thu, Jul 04, 2019 at 09:39:18AM +0200, Roberto Ragusa (mail at robertoragusa.it) wrote:
> On 7/4/19 8:43 AM, Jobst Schmalenbach wrote:
> >Clearly the development server is hardware wise way below the specs of the Dell but
> >software wise they are identical (they get upgraded at the same time).
> As a first step, you have to test subsystems one by one.

Thank you for the tips.
Here are the results (DELL is faster overall):

> time dd 2>/dev/null if=/dev/zero of=/dev/null bs=1 count=1000000

[DIY ~] #>time dd 2>/dev/null if=/dev/zero of=/dev/null bs=1 count=1000000
real 0m1.931s
user 0m1.022s
sys  0m0.896s

[DELL ~] #>time dd 2>/dev/null if=/dev/zero of=/dev/null bs=1 count=1000000
real 0m1.308s
user 0m0.389s
sys  0m0.919s

The Dell is faster overall.

> cd /a/directory/on/the/filesystem/you/want/to/test
> time bash -c "for((i=0;i<1000;i++)); do dd 2>/dev/null if=/dev/zero of=test bs=1 count=1 conv=fsync;done"
> rm test

[DIY /mnt] #>time bash -c "for((i=0;i<1000;i++)); do dd 2>/dev/null if=/dev/zero of=test bs=1 count=1 conv=fsync;done"
real 1m12.944s
user 0m1.604s
sys  0m2.595s

[DELL /mnt] #>time bash -c "for((i=0;i<1000;i++)); do dd 2>/dev/null if=/dev/zero of=test bs=1 count=1 conv=fsync;done"
real 0m2.270s
user 0m0.509s
sys  0m1.475s

I expected the DIY to be slower here; it's running MDADM RAID1 on Seagate Spinners compared to LSI RAID1 SSD.

The results show the DELL is faster overall, so it's back to the drawing board after I have followed all the other hints in this thread.

Jobst
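The fsync results above are easier to compare per operation. The sketch below converts the "real" values into seconds and derives the cost of a single synced write; to_seconds is a hypothetical helper, and the input strings are copied from the results above.

```shell
#!/bin/sh
# Convert bash "time" output such as 1m12.944s into plain seconds.
# to_seconds is a made-up helper name for this sketch.
to_seconds() {
    printf '%s\n' "$1" |
        awk -F'm' '{ sub(/s$/, "", $2); printf "%.3f\n", $1 * 60 + $2 }'
}

diy_fsync=$(to_seconds 1m12.944s)
dell_fsync=$(to_seconds 0m2.270s)

# 1000 synced writes per run, so total seconds equals milliseconds per write.
awk -v a="$diy_fsync" -v b="$dell_fsync" 'BEGIN {
    printf "DIY:  %.1f ms per synced write\n", a
    printf "DELL: %.1f ms per synced write\n", b
    printf "DIY/DELL ratio: %.1f\n", a / b
}'
```

The DELL is roughly 32 times faster at synced writes, yet still slower for the table build. One possible reading, offered only as a hypothesis, is that the slow path does not wait on synced disk writes at all, which would fit the in-memory explanation given elsewhere in the thread.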
Jobst Schmalenbach
2019-Jul-06 00:40 UTC
[CentOS] Performance issues/difference of two servers running same task (one is quicker)
On Thu, Jul 04, 2019 at 09:07:35AM +0200, Simon Matter via CentOS (centos at centos.org) wrote:
> > Hi
>
> Two ideas:
>
> a) the DELL maybe faster over all but if I'm right single core speed is
> slower than on DEV machine.

Yes, but since BOTH have "other" things to do at the same time, the sheer number of CPUs in the DELL should help.

> b) how do the LSI/SSD perform compared to the MDADM/RAID0 on the DEV
> server? I'm not sure the DELL is a clear winner here.

See my answer about the disk test in another email.
Jobst Schmalenbach
2019-Jul-06 00:52 UTC
[CentOS] Performance issues/difference of two servers running same task (one is quicker)
On Thu, Jul 04, 2019 at 10:46:19AM -0700, Gordon Messmer (gordon.messmer at gmail.com) wrote:
> On 7/3/19 11:43 PM, Jobst Schmalenbach wrote:
> > - How can it be that the DELL takes so much longer alltough on the far better hardware?
> It looks like the DIY system has a CPU that's nearly twice as fast
> as the Dell's. The additional CPU in the Dell will run more tasks
> concurrently, but it won't make a single process faster.
>
> You might also think that the SSD RAID would make the Dell faster,
> but that will only be true if the process that you're testing
> performs a significant amount of IO. If your DB operations are
> happening mostly in memory (that is, if the data is cached), then
> the faster CPU will be the primary determining factor.

I made the buffer pool size on the DELL double that of the DIY when I started trying to figure out the reason for the speed difference.

> The other thing that you left out of your description is the amount
> of data on each server. If your live server has a lot of data in
> its DB and the dev system has a small dataset suitable for testing,
> then generally you'd expect that the dev system's data is more
> likely to live in cache and avoid disk IO, and processing the
> smaller set will also take less CPU time.

Most of the DBs are small as they contain websites.
The biggest DB is the Online Training DB, which is the same on both machines, as I constantly copy the data from the live server to the DIY.

Very good analysis indeed. Makes total sense.

-- 
Jobst Schmalenbach
Road to hell is paved with NAND gates.
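Whether the working set really is served from memory can be estimated from InnoDB's own counters, which speaks to both the buffer pool change and the caching argument above. This is a sketch: hit_ratio is a made-up helper, and the sample counters are invented purely to show the arithmetic.

```shell
#!/bin/sh
# Compute the InnoDB buffer pool hit ratio (percent) from two counters.
# hit_ratio is a hypothetical helper; the counters would come from:
#   mysql -e "SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%'"
hit_ratio() {
    # $1 = Innodb_buffer_pool_read_requests (logical read requests)
    # $2 = Innodb_buffer_pool_reads (requests that had to read from disk)
    awk -v req="$1" -v disk="$2" 'BEGIN {
        if (req == 0) { print "n/a"; exit }
        printf "%.2f\n", 100 * (1 - disk / req)
    }'
}

# Invented sample counters, only to demonstrate the calculation.
hit_ratio 1000000 500
```

If both servers show a ratio close to 100 percent while the table is being built, disk speed is unlikely to explain the gap, and per-core CPU speed remains the more plausible factor.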