Hi Lang,

> That suggests an optimization quality issue, rather than compile-time overhead

Yes, that makes sense. The long-running applications (6+ hours) JIT the rules once (taking a few seconds) and then run the generated machine code for hours, with no additional JIT'ing.

> if we can configure the CodeGen pipeline properly we can get the performance back to the same level as the legacy JIT.

Sounds great. Happy to help with whatever is needed.

Speaking of which:

We generate low-overhead profiling code as part of the generated IR. We use it for identifying performance bottlenecks in the higher-level (before IR) optimizing stages.

So I think it would be possible for me to identify a function that runs much slower in 3.7.1 than in 3.5.2. And extract the IR.

Would that help?

Cheers
Morten

On 05/02/16 13:46, Lang Hames wrote:
> Hi Morten,
>
> > Here are the results (for a small but representative run):
>
> That suggests an optimization quality issue, rather than compile-time
> overhead. That's good news - I'd take it as a good sign that the MC
> and linking overhead aren't a big deal either, and if we can configure
> the CodeGen pipeline properly we can get the performance back to the
> same level as the legacy JIT.
>
> Cheers,
> Lang.
>
> On Thu, Feb 4, 2016 at 6:41 PM, Morten Brodersen via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>
> Hi Keno,
>
> Thanks for the fast ISel suggestion.
>
> Here are the results (for a small but representative run):
>
> LLVM 3.5.2 (old JIT): 4m44s
> LLVM 3.7.1 (MCJit) no fast ISel: 7m31s
> LLVM 3.7.1 (MCJit) fast ISel: 7m39s
>
> So not much of a difference unfortunately.
>
> On 05/02/16 11:05, Keno Fischer wrote:
>> Yes, unfortunately, this is very much known. Over in the julia
>> project, we've recently gone through this and taken the hit
>> (after doing some work to fix the very extreme corner cases that
>> we were hitting). We're not entirely sure why the slowdown is
>> this noticeable, but at least in our case, profiling didn't reveal
>> any remaining low-hanging fruit that is responsible. One thing
>> you can potentially try if you haven't yet is to enable fast ISel
>> and see if that brings you closer to the old runtimes.
>>
>> On Thu, Feb 4, 2016 at 7:00 PM, Morten Brodersen via llvm-dev
>> <llvm-dev at lists.llvm.org> wrote:
>>
>> Hi All,
>>
>> We recently upgraded a number of applications from LLVM 3.5.2
>> (old JIT) to LLVM 3.7.1 (MCJit).
>>
>> We made the minimum changes needed for the switch (no changes
>> to the IR generated or the IR optimizations applied).
>>
>> The resulting code passes all tests (8000+).
>>
>> However the runtime performance dropped significantly: 30% to
>> 40% for all applications.
>>
>> The applications I am talking about optimize airline rosters
>> and pairings. LLVM is used for compiling high level business
>> rules to efficient machine code.
>>
>> A typical optimization run takes 6 to 8 hours. So a 30% to
>> 40% reduction in speed has real impact (=> we can't upgrade
>> from 3.5.2).
>>
>> We have triple checked and reviewed the changes we made from
>> old JIT to MCJit. We also tried different ways to optimize
>> the IR.
>>
>> However all results indicate that the performance drop
>> happens in the (black box) IR to machine code stage.
>>
>> So my question is if the runtime performance reduction is
>> known/expected for MCJit vs. old JIT? Or if we might be doing
>> something wrong?
>>
>> If you need more information, in order to understand the
>> issue, please tell us so that we can provide you with more
>> details.
>>
>> Thanks
>> Morten
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
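As a quick sanity check on the timings quoted in this message, the sketch below converts them to seconds and expresses the slowdown two ways. Only the three timings come from the thread; the helper names are illustrative.

```python
# Wall-clock times reported in the thread, converted to seconds.
def to_seconds(minutes, seconds):
    return minutes * 60 + seconds

old_jit = to_seconds(4, 44)          # LLVM 3.5.2 (old JIT)
mcjit = to_seconds(7, 31)            # LLVM 3.7.1 (MCJit), no fast ISel
mcjit_fast_isel = to_seconds(7, 39)  # LLVM 3.7.1 (MCJit), fast ISel

def extra_runtime_pct(new, old):
    """How much longer the new run takes, relative to the old one."""
    return (new - old) / old * 100

def throughput_drop_pct(new, old):
    """How much less work per unit time the new run gets done."""
    return (1 - old / new) * 100

for label, t in [("no fast ISel", mcjit), ("fast ISel", mcjit_fast_isel)]:
    print(f"{label}: +{extra_runtime_pct(t, old_jit):.1f}% runtime, "
          f"-{throughput_drop_pct(t, old_jit):.1f}% throughput")
```

On these figures the MCJit runs take roughly 59% to 62% longer, which corresponds to a throughput drop of about 37% to 38%, in line with the 30% to 40% reported for the full applications.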
----- Original Message -----
> From: "Morten Brodersen via llvm-dev" <llvm-dev at lists.llvm.org>
> To: "llvm-dev" <llvm-dev at lists.llvm.org>
> Sent: Thursday, February 4, 2016 9:21:58 PM
> Subject: Re: [llvm-dev] MCJit Runtine Performance
>
> We generate low overhead profiling code as part of the generated IR.
> We use it for identifying performance bottlenecks in the higher
> level (before IR) optimizing stages.
>
> So I think it would be possible for me to identify a function that
> runs much slower in 3.7.1 than in 3.5.2. And extract the IR.
>
> Would that help?

It seems quite likely to help. Please do.

 -Hal

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
OK. I will ask the optimization guys to extract a good example from the production code.

On 05/02/16 14:26, Hal Finkel wrote:
>> So I think it would be possible for me to identify a function that
>> runs much slower in 3.7.1 than in 3.5.2. And extract the IR.
>>
>> Would that help?
>
> It seems quite likely to help. Please do.
>
>  -Hal
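One way the comparison proposed in this thread might be done, assuming the embedded profiling code can report a total time per generated function for each LLVM version. The function names and timings below are hypothetical placeholders, not data from the thread; the sketch only ranks functions by how much slower they run under 3.7.1.

```python
# Hypothetical per-function timings in seconds, one dict per LLVM version.
# Names and numbers are made up for illustration.
profile_352 = {"rule_a": 12.0, "rule_b": 30.0, "rule_c": 5.0}
profile_371 = {"rule_a": 12.5, "rule_b": 54.0, "rule_c": 5.5}

def rank_by_slowdown(old, new):
    """Return (name, ratio) pairs for functions present in both profiles,
    sorted with the largest new/old time ratio first."""
    ratios = [(name, new[name] / old[name])
              for name in old
              if name in new and old[name] > 0]
    return sorted(ratios, key=lambda item: item[1], reverse=True)

ranking = rank_by_slowdown(profile_352, profile_371)
for name, ratio in ranking:
    print(f"{name}: {ratio:.2f}x")
```

The function at the top of the ranking would be the natural candidate for IR extraction, since it shows the largest regression relative to its own baseline rather than simply the largest absolute runtime.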