Displaying 3 results from an estimated 3 matches for "double4".
2013 Feb 09
0
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
...c.
In its current state the library has rough edges, e.g. the precision of
many math functions is not yet ideal, and exceptional cases (nan, inf) are
probably not yet all handled correctly. I would be happy if vecmathlib
could be used in LLVM.
For example, assuming that there is a data type "double4" containing a
vector of 4 double precision values, vecmathlib provides a function double4
pow(double4, double4) that implements pow(). In the general case, i.e. if
no system-specific machine instructions are available, this would use
Taylor expansions to calculate pow(x,y)=exp(y*log(x)).
I wo...
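The identity quoted in this snippet, pow(x,y)=exp(y*log(x)), can be sketched for a 4-wide vector type as follows. This is only an illustration, not vecmathlib's code: the double4 struct here is a hypothetical stand-in, and the sketch calls the scalar std::exp and std::log per lane instead of the Taylor expansions the message describes. It is valid only for x > 0 and ignores the exceptional cases (nan, inf) mentioned above.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the "double4" type named in the message:
// a plain struct holding 4 double precision values.
struct double4 {
    double v[4];
};

// Elementwise pow using the identity pow(x, y) = exp(y * log(x)).
// Assumption: every lane of x is strictly positive; zero, negative,
// nan and inf inputs are not handled specially in this sketch.
inline double4 pow(const double4& x, const double4& y) {
    double4 r;
    for (int i = 0; i < 4; ++i) {
        r.v[i] = std::exp(y.v[i] * std::log(x.v[i]));
    }
    return r;
}

// Small usage example: compare each lane against the scalar std::pow.
inline void check_pow_lanes() {
    const double4 x = {{2.0, 3.0, 4.0, 10.0}};
    const double4 y = {{3.0, 2.0, 0.5, 0.0}};
    const double4 r = pow(x, y);
    for (int i = 0; i < 4; ++i) {
        assert(std::fabs(r.v[i] - std::pow(x.v[i], y.v[i])) < 1e-9);
    }
}
```

A real vector library would replace the per-lane loop with SIMD instructions where available; the loop form is kept here because it is portable and easy to check.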
2013 Feb 07
5
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
Hi Justin, gentlemen,
I'm afraid I have to escalate this issue at this point. Since it was
first discussed last summer, it was sufficient for us for a while to
disable the lowering of math calls into intrinsics at the DragonEgg
level and to link them against CUDA math functions at the LLVM IR level.
Now I can say that this is no longer sufficient, and we need the NVPTX
backend to deal with
2019 Jan 05
1
unsorted - suggestion for performance improvement and ALTREP support for POSIXct
I believe the performance of isUnsorted() in sort.c could be improved by
calling REAL() once (outside of the for loop), rather than calling it twice
inside the loop. As an aside, it is implemented in the faster way in
doSort() (sort.c line 401). The example below shows the performance
improvement, for a vector of doubles, from moving REAL() outside the for loop.
# example as implemented in
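
The change suggested in this snippet, fetching the data pointer once before the loop instead of twice per iteration, can be sketched in standalone C++ as follows. This is not the code from R's sort.c: NumericVector and its data() method are hypothetical stand-ins for an R numeric vector and REAL(), and the meaning of the strictly flag is an assumption made for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an R numeric vector; data() plays the role
// that REAL() plays in the message.
struct NumericVector {
    std::vector<double> values;
    const double* data() const { return values.data(); }
};

// Returns true if the vector is not in increasing order.
// Assumption: with strictly == true, equal neighbours also count as unsorted.
inline bool is_unsorted(const NumericVector& x, bool strictly) {
    const std::size_t n = x.values.size();
    if (n < 2) return false;
    const double* p = x.data();  // accessor called once, outside the loop
    for (std::size_t i = 0; i + 1 < n; ++i) {
        if (strictly ? (p[i] >= p[i + 1]) : (p[i] > p[i + 1])) {
            return true;
        }
    }
    return false;
}

// Small usage example covering sorted, unsorted and tied input.
inline void check_is_unsorted() {
    const NumericVector sorted{{1.0, 2.0, 3.0}};
    const NumericVector unsorted{{1.0, 3.0, 2.0}};
    const NumericVector ties{{1.0, 1.0, 2.0}};
    assert(!is_unsorted(sorted, false));
    assert(is_unsorted(unsorted, false));
    assert(!is_unsorted(ties, false));
    assert(is_unsorted(ties, true));
}
```

The saving comes from the accessor no longer running on every iteration; how large it is depends on what the accessor does, which the standalone sketch cannot show.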