On Jan 19, 2013, at 6:00 PM, Krzysztof Parzyszek <kparzysz at codeaurora.org> wrote:> On 1/19/2013 7:55 PM, Sean Silva wrote: >> >> Although SmallString is actually pretty inefficient, since it keeps >> the string data separate from the "vector" header. I believe libc++'s >> std::string actually reuses the pointers in the "vector header" as the >> storage for the "small" size, and so in that case std::string is >> effectively a very memory efficient SmallString with storage of >> roughly 3 pointers' worth of chars. > > Why do we actually have all the local versions of containers that already exist as a part of the C++ standard (such as SmallVector, DenseMap, et al.)?See: http://llvm.org/docs/ProgrammersManual.html#picking-the-right-data-structure-for-a-task -Chris
On 1/19/2013 8:36 PM, Chris Lattner wrote:> > See: > http://llvm.org/docs/ProgrammersManual.html#picking-the-right-data-structure-for-a-taskWere the "small n" characteristics the main motivation? Memory-wise, STL classes allow user-defined allocators, so use of things like memory pools should be relatively straightforward. Just wondering... :) -Krzysztof -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
On Jan 19, 2013, at 7:04 PM, Krzysztof Parzyszek <kparzysz at codeaurora.org> wrote:> On 1/19/2013 8:36 PM, Chris Lattner wrote: >> >> See: >> http://llvm.org/docs/ProgrammersManual.html#picking-the-right-data-structure-for-a-task > > Were the "small n" characteristics the main motivation?It is one of the motivations.> Memory-wise, STL classes allow user-defined allocators, so use of things like memory pools should be relatively straightforward. Just wondering... :)Yes, you get some of the benefits that way, but not all of it. -Chris
On 01/19/2013 06:36 PM, Chris Lattner wrote:> On Jan 19, 2013, at 6:00 PM, Krzysztof Parzyszek <kparzysz at codeaurora.org> wrote: >> On 1/19/2013 7:55 PM, Sean Silva wrote: >>> >>> Although SmallString is actually pretty inefficient, since it keeps >>> the string data separate from the "vector" header. I believe libc++'s >>> std::string actually reuses the pointers in the "vector header" as the >>> storage for the "small" size, and so in that case std::string is >>> effectively a very memory efficient SmallString with storage of >>> roughly 3 pointers' worth of chars. >> >> Why do we actually have all the local versions of containers that already exist as a part of the C++ standard (such as SmallVector, DenseMap, et al.)? > > See: > http://llvm.org/docs/ProgrammersManual.html#picking-the-right-data-structure-for-a-task > > -Chris >Embarrassed to admit I have not read that manual yet have done a full port. :) Just did things from seeing how other people did them. Never too late to learn though. Time to peruse through the main docs page again and of course to read the Programmers Manual! Reed
On Jan 19, 2013, at 10:04 PM, Krzysztof Parzyszek <kparzysz at codeaurora.org> wrote:> Were the "small n" characteristics the main motivation? Memory-wise, STL classes allow user-defined allocators, so use of things like memory pools should be relatively straightforward. Just wondering... :)Just fyi: http://home.roadrunner.com/~hinnant/stack_alloc.html This may or may not be appropriate for llvm. But it is an example of a fully C++11 conforming way to make a SmallContainer using a custom allocator and any C++11 std::container, or any container that mimics the std container requirements. One thing to keep in mind when designing SmallContainer (either from scratch or using an allocator) is that there is a tension between internal buffer size and C++11 move semantics. If the buffer is internal to the container (as is typical for std::string these days), the bigger the buffer the more expensive the container move members are. If the container never needs to be moved, this isn't a concern of course. But otherwise watch out for it. Typically the cost of a container move operation is proportional to sizeof(container). It was this very issue that drove the design of libc++'s std::string: make the internal buffer as large as possible without increasing sizeof(string). On Jan 20, 2013, at 11:39 AM, Benjamin Kramer <benny.kra at gmail.com> wrote:> For example DenseMap is a hash table that embeds both keys and values into a big table of memory. This gives nice cache behavior for simple pointer-to-pointer mappings which are common throughout LLVM while std::map wastes memory and has worse cache behavior. For big keys DenseMap wastes a ton of memory and becomes slow, std::map is better for those cases.Agreed. Also comparing to std::unordered_map: llvm::DenseMap (http://en.wikipedia.org/wiki/Hash_map#Open_addressing) is a completely different data structure than std::unordered_map (http://en.wikipedia.org/wiki/Hash_map#Separate_chaining). Neither is best in all use cases. llvm::DenseMap tends to run at very low load factors compared to std::unordered_map (i.e. carries around more empty buckets). And std::unordered_map has a higher per-entry overhead which is significant when the entry is one or two words. Howard