On Fri, Jun 17, 2011 at 4:50 PM, Eli Friedman <eli.friedman at gmail.com> wrote:>>>> The plan is to form calls to these intrinsics in InstCombine. Legalizer can expand these intrinsics if they are not legal. The expansion should be fairly straight forward and produce code that is at least as good as what LLVM is currently generating for these code sequence. >>>> >>>> Comments? >>> >>> Is there some reason why pattern-matching this in an ARM-specific >>> DAGCombine doesn't work? >> >> It's not possible to look beyond a single BB at isel time. > > Anything that we can match to ssat should be of the form max(min(x, > SATMAX), SATMIN) (where max and min are icmp+select pairs). If the > min and max aren't in the same block, and we don't have an IR > transformation to put them in the same block, we should fix that > rather than introducing an instrinsic for this special case, I > think...Okay, thinking about it a bit more, I don't think this is practical. I'm still skeptical that adding platform-independent intrinsics for arbitrary ARM instructions is a good idea simply because we don't have the infrastructure to handle them otherwise. It wouldn't be especially hard to allow target-specific transforms on IR... -Eli
On Jun 17, 2011, at 5:49 PM, Eli Friedman wrote:> On Fri, Jun 17, 2011 at 4:50 PM, Eli Friedman <eli.friedman at gmail.com> wrote: >>>>> The plan is to form calls to these intrinsics in InstCombine. Legalizer can expand these intrinsics if they are not legal. The expansion should be fairly straight forward and produce code that is at least as good as what LLVM is currently generating for these code sequence. >>>>> >>>>> Comments? >>>> >>>> Is there some reason why pattern-matching this in an ARM-specific >>>> DAGCombine doesn't work? >>> >>> It's not possible to look beyond a single BB at isel time. >> >> Anything that we can match to ssat should be of the form max(min(x, >> SATMAX), SATMIN) (where max and min are icmp+select pairs). If the >> min and max aren't in the same block, and we don't have an IR >> transformation to put them in the same block, we should fix that >> rather than introducing an instrinsic for this special case, I >> think... > > Okay, thinking about it a bit more, I don't think this is practical.I think this will solve this particular problem but I'm not certain it's the right thing to do in general. In general, llvm already has the tendency to form selects too aggressively. What's your concern?> > I'm still skeptical that adding platform-independent intrinsics for > arbitrary ARM instructions is a good idea simply because we don't have > the infrastructure to handle them otherwise. It wouldn't be > especially hard to allow target-specific transforms on IR...It's easy to expand these intrinsics in legalizer time. Also llvm is already generating inferior code for these patterns for ARM and x86. Evan> > -Eli
On Sun, Jun 19, 2011 at 10:09 AM, Evan Cheng <evan.cheng at apple.com> wrote:> > On Jun 17, 2011, at 5:49 PM, Eli Friedman wrote: > >> On Fri, Jun 17, 2011 at 4:50 PM, Eli Friedman <eli.friedman at gmail.com> wrote: >>>>>> The plan is to form calls to these intrinsics in InstCombine. Legalizer can expand these intrinsics if they are not legal. The expansion should be fairly straight forward and produce code that is at least as good as what LLVM is currently generating for these code sequence. >>>>>> >>>>>> Comments? >>>>> >>>>> Is there some reason why pattern-matching this in an ARM-specific >>>>> DAGCombine doesn't work? >>>> >>>> It's not possible to look beyond a single BB at isel time. >>> >>> Anything that we can match to ssat should be of the form max(min(x, >>> SATMAX), SATMIN) (where max and min are icmp+select pairs). If the >>> min and max aren't in the same block, and we don't have an IR >>> transformation to put them in the same block, we should fix that >>> rather than introducing an instrinsic for this special case, I >>> think... >> >> Okay, thinking about it a bit more, I don't think this is practical. > > I think this will solve this particular problem but I'm not certain it's the right thing to do in general. In general, llvm already has the tendency to form selects too aggressively. What's your concern?I wasn't really considering the issues with converting branches into selects, actually... I didn't realize that's what you were running into. :) I was thinking more in terms of instruction sinking, which we really don't have an aggressive IR pass for at the moment. But you're right, converting chains of branches like that into selects without considering what we'll end up with is a bit suspicious.>> I'm still skeptical that adding platform-independent intrinsics for >> arbitrary ARM instructions is a good idea simply because we don't have >> the infrastructure to handle them otherwise. It wouldn't be >> especially hard to allow target-specific transforms on IR... > > It's easy to expand these intrinsics in legalizer time. Also llvm is already generating inferior code for these patterns for ARM and x86.We also have to consider whether forming sset will require substantial changes to other passes; thinking about it, it won't be too bad here. I still don't see the point of a usat intrinsic; we ought to already handle constructs that should form into usat effectively. -Eli