Displaying 3 results from an estimated 3 matches for "shardmask".
2014 Apr 18
2
[LLVMdev] multithreaded performance disaster with -fprofile-instr-generate (contention on profile counters)
...cost of
> this can be really high.
>
> It's possible to do it w/o a mutex, fast-path overhead is only a virtually
> zero overhead (if implemented properly in the compiler) atomic consume load:
>
> const int maxshard = 4096;
> uint64* shard[maxshard];
> atomic<int> shardmask;
>
> void inline inccounter(int idx)
> {
>     int shardidx = gettid() & atomic_load(&shardmask, memory_order_consume);
>     shard[shardidx][idx]++;
> }
>
> int pthread_create(...)
> {
> if (updateshardcount()) {
> shardlock();
> if (updateshardcount()) {
>...
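
The sharding scheme quoted in this result can be sketched as a small standalone C++ program. This is only a rough illustration under stated assumptions, not the code proposed in the thread: kNumCounters, set_shard_count and read_counter are hypothetical names, a hash of std::this_thread::get_id() stands in for gettid(), an acquire load replaces the consume load, and each increment is a relaxed atomic add so that two threads mapping to the same shard do not race.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <mutex>
#include <thread>

constexpr int kMaxShard = 4096;          // same bound as maxshard in the quoted code
constexpr std::size_t kNumCounters = 8;  // hypothetical number of profile counters

// Shard 0 exists from program start; further shards are allocated on demand.
static std::atomic<std::uint64_t> shard0[kNumCounters];
static std::atomic<std::uint64_t>* shard[kMaxShard] = {shard0};
static std::atomic<int> shardmask{0};    // (number of live shards) - 1
static std::mutex shard_mutex;           // slow path only: guards growth

// Fast path: one atomic load of the mask, then a relaxed add in this thread's shard.
// Assumption: a hash of the C++ thread id replaces gettid() for portability.
inline void inc_counter(std::size_t idx) {
    std::size_t tid = std::hash<std::thread::id>{}(std::this_thread::get_id());
    std::size_t s = tid & static_cast<std::size_t>(shardmask.load(std::memory_order_acquire));
    shard[s][idx].fetch_add(1, std::memory_order_relaxed);
}

// Slow path (hypothetical helper): grow to `count` shards, a power of two.
// New shards are fully allocated before the larger mask is published.
void set_shard_count(int count) {
    assert(count > 0 && count <= kMaxShard && (count & (count - 1)) == 0);
    std::lock_guard<std::mutex> lock(shard_mutex);
    int current = shardmask.load(std::memory_order_relaxed) + 1;
    if (count <= current) return;        // never shrink
    for (int s = current; s < count; ++s) {
        // Intentionally never freed: counters live until process exit.
        shard[s] = new std::atomic<std::uint64_t>[kNumCounters]();
    }
    shardmask.store(count - 1, std::memory_order_release);
}

// Reading a counter sums its slot across all live shards.
std::uint64_t read_counter(std::size_t idx) {
    int mask = shardmask.load(std::memory_order_acquire);
    std::uint64_t total = 0;
    for (int s = 0; s <= mask; ++s) {
        total += shard[s][idx].load(std::memory_order_relaxed);
    }
    return total;
}
```

Because the mask only grows and a shard is allocated before the mask that exposes it is stored, increments made before a growth step stay in their old shard and are still counted by read_counter.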
2014 Apr 18
2
[LLVMdev] multithreaded performance disaster with -fprofile-instr-generate (contention on profile counters)
...It's possible to do it w/o a mutex, fast-path overhead is only a
>>> virtually zero overhead (if implemented properly in the compiler) atomic
>>> consume load:
>>>
>>> const int maxshard = 4096;
>>> uint64* shard[maxshard];
>>> atomic<int> shardmask;
>>>
>>> void inline inccounter(int idx)
>>> {
>>>     int shardidx = gettid() & atomic_load(&shardmask, memory_order_consume);
>>>     shard[shardidx][idx]++;
>>> }
>>>
>>> int pthread_create(...)
>>> {
>>>...
2014 Apr 18
4
[LLVMdev] multithreaded performance disaster with -fprofile-instr-generate (contention on profile counters)
On Apr 17, 2014, at 2:04 PM, Chandler Carruth <chandlerc at google.com> wrote:
> On Thu, Apr 17, 2014 at 1:27 PM, Justin Bogner <mail at justinbogner.com> wrote:
> Chandler Carruth <chandlerc at google.com> writes:
> > if (thread-ID != main's thread-ID && shard_count < std::min(MAX, NUMBER_OF_CORES)) {
> > shard_count = std::min(MAX,