search for: smallsize

Displaying 5 results from an estimated 5 matches for "smallsize".

2015 Sep 14
3
RFC: speedups with instruction side-data (ADCE, perhaps others?)
...ta when used and sized right", and I've found this to mostly be true after looking harder. In the case you are looking at, I see: - SmallPtrSet<Instruction*, 128> Alive; This seems ... wrong. In fact, it seems optimally bad. This says the small ptr set size is 128. That is, the smallsize is 128. SmallPtrSet will linear search the array when its size <= smallsize, and otherwise fall back to building a non-small array and using better algorithms. Linear searching a 128-member array == slow. I bet if you change the 128 to 8, you will see significant speedups. On Mon, Sep...
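The behaviour described in this snippet (linear search while the set fits in its inline array, then a switch to a different structure) can be sketched roughly as follows. This is a toy illustration only: `TinyPtrSet` is a hypothetical name, the fallback to `std::unordered_set` is an assumption made for brevity, and none of this is the actual `llvm::SmallPtrSet` implementation.

```cpp
#include <array>
#include <cstddef>
#include <unordered_set>

// Hypothetical toy container for illustration; NOT llvm::SmallPtrSet.
template <typename T, std::size_t SmallSize>
class TinyPtrSet {
  std::array<T *, SmallSize> Inline{}; // inline storage, scanned linearly
  std::size_t NumInline = 0;           // number of inline slots in use
  std::unordered_set<T *> Big;         // assumed fallback once we overflow
  bool Small = true;

public:
  // Returns true if Ptr was newly inserted.
  bool insert(T *Ptr) {
    if (Small) {
      // Linear scan: cost grows with SmallSize, which is why a large
      // inline size such as 128 can be slow.
      for (std::size_t I = 0; I < NumInline; ++I)
        if (Inline[I] == Ptr)
          return false;
      if (NumInline < SmallSize) {
        Inline[NumInline++] = Ptr;
        return true;
      }
      // Inline array is full: move everything to the hashed set.
      Big.insert(Inline.begin(), Inline.begin() + NumInline);
      Small = false;
    }
    return Big.insert(Ptr).second;
  }

  bool contains(T *Ptr) const {
    if (Small) {
      for (std::size_t I = 0; I < NumInline; ++I)
        if (Inline[I] == Ptr)
          return true;
      return false;
    }
    return Big.count(Ptr) != 0;
  }

  std::size_t size() const { return Small ? NumInline : Big.size(); }
  bool isSmall() const { return Small; }
};
```

With a small inline size the scan touches only a few slots before the set switches over; with a large inline size every lookup may walk the whole array first.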
2007 Jun 22
0
[LLVMdev] df_ext_iterator in LiveIntervalAnalysis
Nice idea. Please also try using SmallPtrSet (with a sufficiently large size) instead of std::set for traversal after everything is working. Using std::set can really hurt compile time in case of large basic block numbers. Is there a way to dynamically adjust "SmallSize" based on number of basic blocks in the function? Evan On Jun 21, 2007, at 10:20 PM, Fernando Magno Quintao Pereira wrote: > > I would like to make a suggestion. In the LiveIntervalAnalysis class, > instead of numbering the instructions in the order in which basic > blocks ...
2007 Jun 22
4
[LLVMdev] df_ext_iterator in LiveIntervalAnalysis
I would like to make a suggestion. In the LiveIntervalAnalysis class, instead of numbering the instructions in the order in which basic blocks are stored in the machine function, use the df_ext_iterator. It will order the instructions according to the dominance tree (or it seems to be doing so). There are many advantages in doing this. One of them is that, once you traverse the dominance tree
2015 Sep 14
7
RFC: speedups with instruction side-data (ADCE, perhaps others?)
I’ve been playing around with optimizing the performance of various passes and noticed something about ADCE: it keeps an Alive set that requires a lot of upkeep time-wise, but if each instruction had a /single bit/ of side data (to represent liveness, solely within the pass), the set wouldn’t be needed. I tested this out and observed a ~1/3 reduction in time in ADCE: 1454ms to 982ms according to a
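The idea in this snippet, replacing a separate Alive set with one liveness bit stored on each instruction, might look something like the sketch below. `Inst`, `AlwaysLive`, and `markLive` are hypothetical stand-ins invented for illustration; they are not LLVM types or the actual ADCE code.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an instruction; not llvm::Instruction.
struct Inst {
  std::vector<Inst *> Operands; // instructions this one depends on
  bool AlwaysLive = false;      // assumed root condition, e.g. side effects
  bool Alive = false;           // the single bit of per-instruction side data
};

// Marks every instruction reachable from an always-live root through
// operand edges and returns how many instructions ended up live.
// No separate set is consulted: membership is just a read of I.Alive.
inline std::size_t markLive(std::vector<Inst> &Insts) {
  std::vector<Inst *> Worklist;
  for (Inst &I : Insts)
    I.Alive = false; // reset the side data before the pass runs
  for (Inst &I : Insts) {
    if (I.AlwaysLive) {
      I.Alive = true;
      Worklist.push_back(&I);
    }
  }
  std::size_t NumLive = Worklist.size();
  while (!Worklist.empty()) {
    Inst *Cur = Worklist.back();
    Worklist.pop_back();
    for (Inst *Op : Cur->Operands) {
      if (!Op->Alive) { // bit test instead of a set lookup
        Op->Alive = true;
        ++NumLive;
        Worklist.push_back(Op);
      }
    }
  }
  return NumLive;
}
```

The trade-off is that the bit lives on every instruction whether or not a pass uses it, which is presumably why the thread is framed as an RFC.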
2014 Dec 02
7
[LLVMdev] Memset/memcpy: user control of loop-idiom recognizer
Hi, In feedback from game studios a common issue is the replacement of loops with calls to memcpy/memset. These loops are often hand-optimised and highly efficient, and the developers strongly want a way to control the compiler (i.e. leave my loop alone). The culprit is of course the loop-idiom recognizer. This replaces any loop that looks like a memset/memcpy with calls. This affects loops
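For readers unfamiliar with the pattern, loops of roughly the following shape are the kind that "look like" a memset or memcpy. The function names are hypothetical, and whether a given compiler actually rewrites them into library calls depends on its version, optimization level, and flags.

```cpp
#include <cstddef>

// Hypothetical example: a byte-wise zeroing loop, the memset-like shape.
void zeroFill(unsigned char *Dst, std::size_t N) {
  for (std::size_t I = 0; I < N; ++I)
    Dst[I] = 0;
}

// Hypothetical example: a byte-wise copy loop, the memcpy-like shape.
void copyBytes(unsigned char *Dst, const unsigned char *Src, std::size_t N) {
  for (std::size_t I = 0; I < N; ++I)
    Dst[I] = Src[I];
}
```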