Displaying 2 results from an estimated 2 matches for "num_of_cores".

2014 Apr 17
2
[LLVMdev] multithreaded performance disaster with -fprofile-instr-generate (contention on profile counters)
Good thinking, but why do you think runtime selection of the shard count is better than compile-time selection? For single-threaded apps, the shard count is always 1, so why pay the penalty of checking the thread ID each time a function is entered? For multi-threaded apps, I would expect MAX to be smaller than NUM_OF_CORES to avoid excessive memory consumption, so you always end up with N == MAX. If MAX is larger than NUM_OF_CORES, then for large MT apps the number of threads tends to be larger than NUM_OF_CORES, so it also ends up with N == MAX. For rare cases, the shard count may switch between MAX and NUM_OF_CORES, bu...
2014 Apr 17
9
[LLVMdev] multithreaded performance disaster with -fprofile-instr-generate (contention on profile counters)
Hi, The current design of -fprofile-instr-generate has the same fundamental flaw as GCC's old gcov instrumentation: it has contention on counters. A trivial synthetic test case was described here: http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-October/066116.html For the problem to appear we need to have a hot function that is simultaneously executed by multiple threads -- then we will
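
For readers unfamiliar with the idea being debated in these two messages, here is a minimal sketch of a sharded counter with a compile-time shard count. It is not the code from the thread or from LLVM's profiling runtime. The names `kShards`, `PaddedCounter`, `ShardedCounter`, and `run_sharded` are hypothetical, the 64-byte cache-line size is an assumption, and atomics are used only to keep the example well-defined in C++.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical compile-time shard count (playing the role of "MAX" above).
constexpr std::size_t kShards = 8;

// One counter per shard, padded to its own cache line so that threads
// using different shards do not contend. 64 bytes is an assumed line size.
struct alignas(64) PaddedCounter {
    std::atomic<std::uint64_t> value{0};
};

struct ShardedCounter {
    PaddedCounter shards[kShards];

    // Threads whose ids differ modulo kShards update different cache lines.
    void increment(std::size_t thread_id) {
        shards[thread_id % kShards].value.fetch_add(
            1, std::memory_order_relaxed);
    }

    // Merge all shards when the profile is read out.
    std::uint64_t total() const {
        std::uint64_t sum = 0;
        for (const auto& s : shards) {
            sum += s.value.load(std::memory_order_relaxed);
        }
        return sum;
    }
};

// Runs `threads` workers that each bump the counter `iters` times and
// returns the merged total.
std::uint64_t run_sharded(std::size_t threads, std::uint64_t iters) {
    assert(threads > 0);
    ShardedCounter counter;
    std::vector<std::thread> pool;
    for (std::size_t t = 0; t < threads; ++t) {
        pool.emplace_back([&counter, t, iters] {
            for (std::uint64_t i = 0; i < iters; ++i) {
                counter.increment(t);
            }
        });
    }
    for (auto& th : pool) {
        th.join();
    }
    return counter.total();
}
```

Calling `run_sharded(4, 10000)` should return 40000, since every increment lands in exactly one shard and the shards are summed at the end. With `kShards` fixed at compile time, the shard index is a single modulo by a constant. A runtime-selected shard count would need an extra load on each function entry, which is the cost the first message questions.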