Hadrien G. via llvm-dev
2017-Mar-11 12:41 UTC
[llvm-dev] Is there a way to know the target's L1 data cache line size?
Thank you! Is this information available programmatically through some
LLVM API, so that next time some hardware manufacturer does some crazy
experiment, my code can be automatically compatible with it as soon as
LLVM is?

On 11/03/2017 13:38, Bruce Hoult wrote:
> PowerPC G5 (970) and all recent IBM Power have 128-byte cache lines. I
> believe Itanium is also 128.
>
> Intel has stuck with 64 recently on x86, at least at L1. I believe
> multiple adjacent lines may be combined into a "block" (with a single
> tag) at L2 or higher in some of them.
>
> ARM can be 32 or 64.
>
> On Sat, Mar 11, 2017 at 3:24 PM, Hadrien G. <knights_of_ni at gmx.com> wrote:
>
>> I guess that in this case, what I would like to know is a reasonable
>> upper bound on the cache line size for the target architecture:
>> something that I can align my data structures on at compile time so
>> as to minimize the odds of false sharing. Think
>> std::hardware_destructive_interference_size in C++17.
>>
>> On 11/03/2017 13:16, Bruce Hoult wrote:
>>> There's no way to know until you run on real hardware. It could be
>>> different every time the binary is run. You have to ask the OS or
>>> the hardware, and that is system dependent.
>>>
>>> The cache line size can even change in the middle of the program
>>> running, for example if your program is moved between a "big" and a
>>> "LITTLE" core on ARM. In this case the OS is supposed to lie to you
>>> and tell you the smallest of the cache line sizes (but that can
>>> only work if cache line operations are non-destructive! No "zero
>>> cache line" or "throw away local changes in cache line" like on
>>> PowerPC). It also means that you might not place things far enough
>>> apart to be on different cache lines on the bigger core, and so not
>>> achieve the optimal result you wanted. It's a mess!
>>>
>>> On Sat, Mar 11, 2017 at 2:13 PM, Hadrien G. via llvm-dev
>>> <llvm-dev at lists.llvm.org> wrote:
>>>
>>>> Hi everyone,
>>>>
>>>> I'm hailing from the Rust community, where there is a discussion
>>>> about adding facilities for aligning data on an L1 cache line
>>>> boundary. One example of a situation where this is useful is when
>>>> building thread synchronization primitives, where avoiding false
>>>> sharing can be a critical concern.
>>>>
>>>> Now, when it comes to implementation, I have a gut feeling that we
>>>> probably do not need to hardcode every target's cache line size in
>>>> rustc ourselves, because there is probably a way to get this
>>>> information directly from the LLVM toolchain that we are using. Is
>>>> my gut right on this one? And can you provide us with some details
>>>> on how it should be done?
>>>>
>>>> Thanks in advance,
>>>> Hadrien
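For concreteness, here is a minimal sketch of the compile-time alignment
approach discussed above, using the C++17 facility Hadrien mentions. It
assumes a standard library that actually ships
std::hardware_destructive_interference_size (many did not at the time);
the fallback constant of 64 is an assumption, not a guarantee, and a
POWER target would want 128:

#include <atomic>
#include <cstddef>
#include <new>

// Compile-time upper bound on the cache line size, intended exactly for
// padding data apart to avoid false sharing. Standard libraries are not
// required to provide it, hence the feature-test macro and the assumed
// fallback value.
#ifdef __cpp_lib_hardware_interference_size
constexpr std::size_t kLine = std::hardware_destructive_interference_size;
#else
constexpr std::size_t kLine = 64; // assumption; e.g. POWER uses 128
#endif

// Two counters updated by different threads: aligning each one to kLine
// puts them on (presumably) different cache lines, so one thread's
// writes do not keep invalidating the line the other thread is using.
struct Counters {
    alignas(kLine) std::atomic<unsigned long> produced{0};
    alignas(kLine) std::atomic<unsigned long> consumed{0};
};

static_assert(sizeof(Counters) >= 2 * kLine,
              "each counter should occupy its own presumed cache line");

At runtime one can instead ask the OS (for example
sysconf(_SC_LEVEL1_DCACHE_LINESIZE) on Linux with glibc), which is
exactly where Bruce's big.LITTLE caveat applies.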
Hal Finkel via llvm-dev
2017-Mar-11 13:53 UTC
[llvm-dev] Is there a way to know the target's L1 data cache line size?
On 03/11/2017 06:41 AM, Hadrien G. via llvm-dev wrote:
> Thank you! Is this information available programmatically through some
> LLVM API, so that next time some hardware manufacturer does some crazy
> experiment, my code can be automatically compatible with it as soon as
> LLVM is?

Yes, using TargetTransformInfo, you can call TTI->getCacheLineSize().
Not all targets provide this information, however, and as Bruce pointed
out, there are environments where this does not make sense (caveat
emptor).

 -Hal

--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
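To make that answer concrete, here is a minimal sketch of the query from
the C++ side. The scaffolding is an assumption: the way to obtain a
TargetTransformInfo has changed across LLVM versions (older releases go
through TargetIRAnalysis, while newer ones expose
TargetMachine::getTargetTransformInfo). The key call is
getCacheLineSize(), which returns 0 when the target supplies no value:

#include "llvm/Analysis/TargetTransformInfo.h"
#include "llvm/IR/Function.h"
#include "llvm/Target/TargetMachine.h"

// Ask the target for its cache line size via TargetTransformInfo.
// A result of 0 means the target did not describe its cache, so the
// caller needs its own fallback (e.g. a conservative per-arch default).
unsigned cacheLineSizeFor(llvm::TargetMachine &TM, const llvm::Function &F) {
  llvm::TargetTransformInfo TTI = TM.getTargetTransformInfo(F);
  return TTI.getCacheLineSize();
}

For rustc's purposes, that 0-means-unknown convention is the important
caveat: a frontend exposing something like
hardware_destructive_interference_size still needs per-target defaults
of its own.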
Hadrien G. via llvm-dev
2017-Mar-12 16:56 UTC
[llvm-dev] Is there a way to know the target's L1 data cache line size?
Well, thank you all for this information!

Hadrien

On 11/03/2017 14:53, Hal Finkel wrote:
> Yes, using TargetTransformInfo, you can call TTI->getCacheLineSize().
> Not all targets provide this information, however, and as Bruce
> pointed out, there are environments where this does not make sense
> (caveat emptor).
>
> -Hal