Hadrien G. via llvm-dev
2017-Mar-11 12:24 UTC
[llvm-dev] Is there a way to know the target's L1 data cache line size?
I guess that in this case, what I would like to know is a reasonable upper bound of the cache line size on the target architecture. Something that I can align my data structures on at compile time so as to minimize the odds of false sharing. Think std::hardware_destructive_interference_size in C++17.

On 11/03/2017 at 13:16, Bruce Hoult wrote:
> There's no way to know until you run on real hardware. It could be
> different every time the binary is run. You have to ask the OS or
> hardware, and that's system dependent.
>
> The cache line size can even change in the middle of the program
> running, for example if your program is moved between a "big" and
> "LITTLE" core on ARM. In this case the OS is supposed to lie to you
> and tell you the smallest of the cache line sizes (but that can only
> work if cache line operations are non-destructive! No "zero cache line"
> or "throw away local changes in cache line" like on PowerPC). It also
> means that you might not place things far enough apart to be on
> different cache lines on the bigger core, and so not achieve the
> optimal result you wanted. It's a mess!
>
> On Sat, Mar 11, 2017 at 2:13 PM, Hadrien G. via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>
> > Hi everyone,
> >
> > I'm hailing from the Rust community, where there is a discussion
> > about adding facilities for aligning data on an L1 cache line
> > boundary. One example of a situation where this is useful is when
> > building thread synchronization primitives, where avoiding false
> > sharing can be a critical concern.
> >
> > Now, when it comes to implementation, I have this gut feeling that
> > we probably do not need to hardcode every target's cache line size
> > in rustc ourselves, because there is probably a way to get this
> > information directly from the LLVM toolchain that we are using. Is
> > my gut right on this one? And can you provide us with some details
> > on how it should be done?
> >
> > Thanks in advance,
> > Hadrien
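To make the "align on an upper bound" idea concrete, here is a minimal Rust sketch. It is not something proposed in the thread: the `CachePadded` wrapper and the `Counters` struct are hypothetical names, and 128 bytes is an assumed upper bound rather than a value guaranteed for any target.

```rust
use std::mem::{align_of, size_of};
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical wrapper: aligns (and therefore pads) `T` to 128 bytes.
/// 128 is an ASSUMED upper bound on the cache line size, not a queried value.
#[repr(align(128))]
struct CachePadded<T>(T);

/// Two counters meant to be updated by different threads. Each one
/// occupies its own 128-byte slot, so they cannot share a cache line
/// of 128 bytes or less.
struct Counters {
    produced: CachePadded<AtomicU64>,
    consumed: CachePadded<AtomicU64>,
}

fn main() {
    let c = Counters {
        produced: CachePadded(AtomicU64::new(0)),
        consumed: CachePadded(AtomicU64::new(0)),
    };
    c.produced.0.fetch_add(1, Ordering::Relaxed);
    c.consumed.0.fetch_add(1, Ordering::Relaxed);

    // The wrapper's alignment and size are both the chosen bound.
    println!("align = {}", align_of::<CachePadded<AtomicU64>>());
    println!("size  = {}", size_of::<CachePadded<AtomicU64>>());

    // The two fields start at least 128 bytes apart.
    let a = &c.produced as *const _ as usize;
    let b = &c.consumed as *const _ as usize;
    assert!(a.abs_diff(b) >= 128);
}
```

The cost of guessing high is wasted memory per padded value; the cost of guessing low is that false sharing can still occur on hardware with longer lines.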
Bruce Hoult via llvm-dev
2017-Mar-11 12:38 UTC
[llvm-dev] Is there a way to know the target's L1 data cache line size?
PowerPC G5 (970) and all recent IBM Power have 128-byte cache lines. I believe Itanium is also 128.

Intel has stuck with 64 recently with x86, at least at L1. I believe multiple adjacent lines may be combined into a "block" (with a single tag) at L2 or higher in some of them.

ARM can be 32 or 64.
Hadrien G. via llvm-dev
2017-Mar-11 12:41 UTC
[llvm-dev] Is there a way to know the target's L1 data cache line size?
Thank you! Is this information available programmatically through some LLVM API, so that next time some hardware manufacturer does some crazy experiment, my code can be automatically compatible with it as soon as LLVM is?
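This thread does not name an LLVM API for this. The other programmatic route mentioned earlier in the thread is to ask the OS at run time. As a rough sketch of what that could look like on Linux only, the code below reads the sysfs cache description for cpu0. The function names are hypothetical, the sysfs files (`level`, `type`, `coherency_line_size`) are assumed to be present, and the result only describes the core that was inspected, which is exactly the big.LITTLE caveat raised above.

```rust
use std::fs;
use std::path::Path;

/// Parse the text of a sysfs `coherency_line_size` file.
/// Zero and non-power-of-two values are rejected as implausible.
fn parse_line_size(text: &str) -> Option<usize> {
    let n: usize = text.trim().parse().ok()?;
    if n.is_power_of_two() { Some(n) } else { None }
}

/// Scan a CPU's sysfs cache directory for a level-1 data (or unified)
/// cache and return its line size in bytes, if one is described.
fn l1d_line_size_from(cache_dir: &Path) -> Option<usize> {
    for entry in fs::read_dir(cache_dir).ok()?.flatten() {
        let dir = entry.path();
        // Reading fails for entries that are not cache index directories;
        // those are simply skipped.
        let read = |name: &str| fs::read_to_string(dir.join(name)).ok();
        let is_l1 = read("level").map_or(false, |s| s.trim() == "1");
        let is_data = read("type").map_or(false, |s| {
            let t = s.trim();
            t == "Data" || t == "Unified"
        });
        if is_l1 && is_data {
            if let Some(n) = read("coherency_line_size").and_then(|s| parse_line_size(&s)) {
                return Some(n);
            }
        }
    }
    None
}

/// Linux-only convenience wrapper that inspects cpu0.
fn l1d_line_size() -> Option<usize> {
    l1d_line_size_from(Path::new("/sys/devices/system/cpu/cpu0/cache"))
}

fn main() {
    match l1d_line_size() {
        Some(n) => println!("cpu0 L1 data cache line: {n} bytes"),
        None => println!("cache line size not available from sysfs"),
    }
}
```

A run-time answer like this cannot feed a compile-time alignment attribute, so it is better suited to sizing or spacing buffers allocated at run time.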
David Abdurachmanov via llvm-dev
2017-Mar-11 12:43 UTC
[llvm-dev] Is there a way to know the target's L1 data cache line size?
Cavium ThunderX has 128-byte cache lines.

david

> On 11 Mar 2017, at 13:38, Bruce Hoult via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> ARM can be 32 or 64.
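Pulling together the figures quoted in this thread, a hardcoded per-architecture table might look like the Rust sketch below. The function name is hypothetical. The mapping of "ARM can be 32 or 64" to the 32-bit `arm` target, the use of the ThunderX figure for all of `aarch64`, and the 128-byte default for unlisted architectures are all assumptions made for illustration.

```rust
/// Conservative guess at the largest L1 data cache line size, in bytes,
/// for an architecture name spelled as in `std::env::consts::ARCH`.
/// The figures come from this thread; the default is an assumption.
fn line_size_upper_bound(arch: &str) -> usize {
    match arch {
        // "Intel has stuck with 64 recently with x86, at least at L1."
        "x86" | "x86_64" => 64,
        // PowerPC G5 (970) and recent IBM Power: 128.
        "powerpc64" => 128,
        // Cavium ThunderX: 128, taken as the bound for the whole target.
        "aarch64" => 128,
        // "ARM can be 32 or 64": take the larger value (assumed to
        // describe 32-bit ARM).
        "arm" => 64,
        // Unlisted architectures: assume the largest figure in the thread.
        _ => 128,
    }
}

fn main() {
    let arch = std::env::consts::ARCH;
    println!("{arch}: pad to {} bytes", line_size_upper_bound(arch));
}
```

A table like this has to be revised whenever new hardware breaks one of its entries, which is the maintenance burden the original question hoped to avoid.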