Louis Dionne via llvm-dev
2019-Feb-06 17:35 UTC
[llvm-dev] [libcxx-dev] Removing deprecated <ext/hash_set>, <ext/hash_map> and <ext/__hash>
> On Feb 5, 2019, at 23:18, Chandler Carruth <chandlerc at gmail.com> wrote: > > FWIW, I'm pretty sure we still have plenty of code using them. We've not done any analysis to see what timeframe that code could be be updated on -- our focus has been on *adopting* libc++ (and easing that path), not removing the uses of weird things that it also supports. I'll point out that I'd rather focus our energy on adopting libc++ than even doing this analysis. ;] > > I somewhat agree with Joerg that it would be good to understand the motivation. Much like a bunch of our other compatibility things (GCC flags, language extensions, etc.), having these headers helps encourage / ease adoption which seems a generally good thing for libc++. I can imagine that there is some large cost to keeping these around that would motivate removing them, but I don't see anything about that in the above?To be clear: there is not a large cost in keeping those headers around. I don't think that's the question, since the same could be said of almost any removal of deprecated API. The cost of keeping code around is usually not that large if you decide not to maintain it anymore. But that's called code rot, and it's generally not a good idea to accumulate too much of it. hash_map probably has bugs that we haven't and won't fix, etc. Libc++ implements a Standard. __gnu_cxx::hash_map is not part of (any version of) that Standard, and so it does not belong in libc++. When I remove or rename some internal function inside libc++ that uses reserved identifiers, I don't bother asking on this list. It doesn't mean that I purposefully try to break users (quite the opposite), but I know where the line is drawn when/if users break because they use implementation details. This is also very similar to how we deprecate AND remove TSes one year after they are merged into the Standard. The amount of breakage caused by removing `<experimental/optional>` was non-trivial in our case, but we dealt with it and now our code is better. What I'm trying to do here is understand whether and why __gnu_cxx::hash_map is special in that respect. Why was it put there in the first place? Why is it so hard to get rid of? If somebody can explain why it should be kept around, I'm happy to talk about it (this is why we have this thread). However, the argument of "we're too busy/lazy to update our code" doesn't sound like a very strong one to me, especially coming from someone that has excellent tools to deal with this kind of problem. So there must be something else underneath, and that's what I would love to understand. Louis> > -Chandler > > On Mon, Feb 4, 2019 at 11:32 AM Louis Dionne via libcxx-dev <libcxx-dev at lists.llvm.org <mailto:libcxx-dev at lists.llvm.org>> wrote: > Hi, > > Libc++ has been shipping the <ext/hash_set>, <ext/hash_map> and <ext/__hash> headers for a while and they are deprecated. Those headers contain data structures like __gnu_cxx::hash_map that have replacements like std::unordered_map. I would like to remove those headers. I've put up a patch for review but I won't commit it until we have a sort of plan because I know some people have expressed feelings about removing the headers in the past: https://reviews.llvm.org/D57688 <https://reviews.llvm.org/D57688>. > > FWIW, I've compiled a large code base with that patch and I didn't get any trouble. I suspect the amount of breakage this will cause is manageable especially if we give advance notice, but others might disagree. > > Is anybody opposed to removing those headers? If you're opposed to the removal, please explain why and what removal timeline/plan would work for you. > > Thanks! > Louis > > _______________________________________________ > libcxx-dev mailing list > libcxx-dev at lists.llvm.org <mailto:libcxx-dev at lists.llvm.org> > https://lists.llvm.org/cgi-bin/mailman/listinfo/libcxx-dev <https://lists.llvm.org/cgi-bin/mailman/listinfo/libcxx-dev>-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20190206/961626ca/attachment-0001.html>
Marshall Clow via llvm-dev
2019-Feb-06 18:23 UTC
[llvm-dev] [libcxx-dev] Removing deprecated <ext/hash_set>, <ext/hash_map> and <ext/__hash>
On Wed, Feb 6, 2019 at 9:36 AM Louis Dionne via libcxx-dev < libcxx-dev at lists.llvm.org> wrote:> On Feb 5, 2019, at 23:18, Chandler Carruth <chandlerc at gmail.com> wrote: > > FWIW, I'm pretty sure we still have plenty of code using them. We've not > done any analysis to see what timeframe that code could be be updated on -- > our focus has been on *adopting* libc++ (and easing that path), not > removing the uses of weird things that it also supports. I'll point out > that I'd rather focus our energy on adopting libc++ than even doing this > analysis. ;] > > I somewhat agree with Joerg that it would be good to understand the > motivation. Much like a bunch of our other compatibility things (GCC flags, > language extensions, etc.), having these headers helps encourage / ease > adoption which seems a generally good thing for libc++. I can imagine that > there is some large cost to keeping these around that would motivate > removing them, but I don't see anything about that in the above? > > > To be clear: there is not a large cost in keeping those headers around. I > don't think that's the question, since the same could be said of almost any > removal of deprecated API. The cost of keeping code around is usually not > that large if you decide not to maintain it anymore. But that's called code > rot, and it's generally not a good idea to accumulate too much of it. > hash_map probably has bugs that we haven't and won't fix, etc. >FWIW -- My response (for several years) to any reported problems in `__gnu_cxx::hash_map/set` has been "use the ones in std::, then" -- Marshall P.S. The implementation in __gnu_cxx is NOT a faithful implementation of what libstdc++ has; it is a wrapper over std::unordered_map/set. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20190206/43fa353e/attachment.html>
Chandler Carruth via llvm-dev
2019-Feb-06 18:47 UTC
[llvm-dev] [libcxx-dev] Removing deprecated <ext/hash_set>, <ext/hash_map> and <ext/__hash>
On Wed, Feb 6, 2019 at 9:36 AM Louis Dionne via llvm-dev < llvm-dev at lists.llvm.org> wrote:> > On Feb 5, 2019, at 23:18, Chandler Carruth <chandlerc at gmail.com> wrote: > > FWIW, I'm pretty sure we still have plenty of code using them. We've not > done any analysis to see what timeframe that code could be be updated on -- > our focus has been on *adopting* libc++ (and easing that path), not > removing the uses of weird things that it also supports. I'll point out > that I'd rather focus our energy on adopting libc++ than even doing this > analysis. ;] > > I somewhat agree with Joerg that it would be good to understand the > motivation. Much like a bunch of our other compatibility things (GCC flags, > language extensions, etc.), having these headers helps encourage / ease > adoption which seems a generally good thing for libc++. I can imagine that > there is some large cost to keeping these around that would motivate > removing them, but I don't see anything about that in the above? > > > To be clear: there is not a large cost in keeping those headers around. I > don't think that's the question, since the same could be said of almost any > removal of deprecated API. The cost of keeping code around is usually not > that large if you decide not to maintain it anymore. But that's called code > rot, and it's generally not a good idea to accumulate too much of it. > hash_map probably has bugs that we haven't and won't fix, etc. > > Libc++ implements a Standard. __gnu_cxx::hash_map is not part of (any > version of) that Standard, and so it does not belong in libc++. When I > remove or rename some internal function inside libc++ that uses reserved > identifiers, I don't bother asking on this list. It doesn't mean that I > purposefully try to break users (quite the opposite), but I know where the > line is drawn when/if users break because they use implementation details. > This is also very similar to how we deprecate AND remove TSes one year > after they are merged into the Standard. The amount of breakage caused by > removing `<experimental/optional>` was non-trivial in our case, but we > dealt with it and now our code is better. >FWIW, I think extensions for compatibility are interestingly different from TSes. Clang also implements standards but supports a wide range of extensions for compatibility with both GCC and MSVC. Unlike TSes and other things with a clear path forward, many of these extensions don't have that clear path forward. I'm not trying to say __gnu_cxx::hash_map is *necessarily* the same, just that I think there is likely to be different needs for compatibility extensions than for "early versions" of standards-tracked things. And of course, any such way of handling needs to have a reasonable benefit given the cost, and have a reasonable plan in place for maintaining it and supporting it without holding back the larger project.> > What I'm trying to do here is understand whether and why > __gnu_cxx::hash_map is special in that respect. Why was it put there in > the first place? Why is it so hard to get rid of? >Sure, I'm happy to provide some of the historical context (I think I was one of the people arguing it should be added). When we first tried even minor adoption of libc++ we couldn't make it far in even small scale experiments because so much of our code relied on __gnu_cxx::hash_map. This was a lot of years ago, and we had no realistic way to migrate things, much less to do so when we hadn't even conducted the experiment to see what moving to libc++ would look like. At the time, I argued that this would likely be reasonably common for code building against libstdc++ for many years, much like reliance on GNU extensions to C and C++ is widespread. To make adoption effective, some of these need to be supported even if we would prefer people converge on the standard facilities. Things like `__thread` come to mind. Some of this code isn't even being updated to C++11 actively. While it builds with C++11, this is largely because C++11 didn't break anything it relied on. I think it is important that Clang and libc++ both support such code to some extent in order to maximize their adoption in the larger Linux ecosystem (and likely BSD, but I know less there). That support doesn't have to be perfect of course, and the engineering costs should be considered for anything like this. But even the imperfect __gnu_cxx::hash_* stuff libc++ provides has empirically had value for at least a few code bases trying to adopt libc++. So I don't think the tradeoff is trivial here. If somebody can explain why it should be kept around, I'm happy to talk> about it (this is why we have this thread). >I think the argument when it was added still holds: there is likely to be a great deal of code written against this because for many years it was the only viable option, and it has been stable and reliably available in libstdc++. I think making it easy for existing codebases to move to libc++ is good for the project, the same way that I think making it easy to move to Clang is good for that project. I can make a somewhat stronger argument that Google specifically would need this to remain to continue working on adoption. We will likely be able to eventually move off of this, but our priority at the moment is getting *on* to libc++, and there are already enough barriers there that I would very much like to avoid adding another component to that. However, the argument of "we're too busy/lazy to update our code" doesn't> sound like a very strong one to me, especially coming from someone that has > excellent tools to deal with this kind of problem. >Let's avoid terms like "lazy". =] I don't think what I said was that we were busy? Sorry if it came across that way. I've tried to expand on it a bit above. But my second point was not about specific needs of existing users. Instead, it was about users we *don't yet have*. I think there may be significant value to them of supporting some extensions like this, and hopefully above I've given more useful details about this value. Let me know if this is the kind of information you're looking for. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20190206/6a69fb7f/attachment-0001.html>
Chandler Carruth via llvm-dev
2019-Feb-06 18:49 UTC
[llvm-dev] [libcxx-dev] Removing deprecated <ext/hash_set>, <ext/hash_map> and <ext/__hash>
On Wed, Feb 6, 2019 at 10:23 AM Marshall Clow via llvm-dev < llvm-dev at lists.llvm.org> wrote:> On Wed, Feb 6, 2019 at 9:36 AM Louis Dionne via libcxx-dev < > libcxx-dev at lists.llvm.org> wrote: > >> On Feb 5, 2019, at 23:18, Chandler Carruth <chandlerc at gmail.com> wrote: >> >> FWIW, I'm pretty sure we still have plenty of code using them. We've not >> done any analysis to see what timeframe that code could be be updated on -- >> our focus has been on *adopting* libc++ (and easing that path), not >> removing the uses of weird things that it also supports. I'll point out >> that I'd rather focus our energy on adopting libc++ than even doing this >> analysis. ;] >> >> I somewhat agree with Joerg that it would be good to understand the >> motivation. Much like a bunch of our other compatibility things (GCC flags, >> language extensions, etc.), having these headers helps encourage / ease >> adoption which seems a generally good thing for libc++. I can imagine that >> there is some large cost to keeping these around that would motivate >> removing them, but I don't see anything about that in the above? >> >> >> To be clear: there is not a large cost in keeping those headers around. I >> don't think that's the question, since the same could be said of almost any >> removal of deprecated API. The cost of keeping code around is usually not >> that large if you decide not to maintain it anymore. But that's called code >> rot, and it's generally not a good idea to accumulate too much of it. >> hash_map probably has bugs that we haven't and won't fix, etc. >> > > FWIW -- My response (for several years) to any reported problems in > `__gnu_cxx::hash_map/set` has been "use the ones in std::, then" > > -- Marshall > > P.S. The implementation in __gnu_cxx is NOT a faithful implementation of > what libstdc++ has; it is a wrapper over std::unordered_map/set. >See my longer response to Louis for details, but FWIW, even this level of "support" has been materially useful for our codebase. ::shrug:: -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20190206/3db17806/attachment.html>
Joerg Sonnenberger via llvm-dev
2019-Feb-06 20:56 UTC
[llvm-dev] [libcxx-dev] Removing deprecated <ext/hash_set>, <ext/hash_map> and <ext/__hash>
On Wed, Feb 06, 2019 at 12:35:56PM -0500, Louis Dionne via llvm-dev wrote:> What I'm trying to do here is understand whether and why > __gnu_cxx::hash_map is special in that respect. Why was it put there > in the first place? Why is it so hard to get rid of?It was put in because it used to be very popular. Many programs later moved to TR1 and then to C++11, but for a time there was no alternative in STL. It is hard to get rid of it, because quite a bit code is still around without active maintainer. It still serves a purpose, it more or less works as is, so artifically breaking it just creates work for little reason. Keep in mind that code like that is not *my* code. Please don't pull the OpenSSL approach here where the API is randomly changed because it can be changed and the patching of hundreds of existing projects is left for others to figure out. Joerg
Louis Dionne via llvm-dev
2019-Feb-11 19:17 UTC
[llvm-dev] [libcxx-dev] Removing deprecated <ext/hash_set>, <ext/hash_map> and <ext/__hash>
> On Feb 6, 2019, at 13:47, Chandler Carruth <chandlerc at gmail.com> wrote: > > [...]Thanks for the reply, Chandler. The idea that `hash_map` can ease adoption of libc++ for new users makes sense, and I do agree it provides value to libc++. I also think we must be careful not to try to mimic too many bad things that libstdc++ does just for the purpose of easing adoption, since that will be bad for libc++ in a different way.> > What I'm trying to do here is understand whether and why __gnu_cxx::hash_map is special in that respect. Why was it put there in the first place? Why is it so hard to get rid of? > > Sure, I'm happy to provide some of the historical context (I think I was one of the people arguing it should be added). > [...]Thanks a lot for this, I wasn't aware of the context. I'd be very interested to see how much breakage this causes in something like FreeBSD just so we have a more concrete idea of what software this breaks. Perhaps the breakage will be smaller than the last time this was attempted? Otherwise, we're operating on assumptions. Dmitry, do you think you could do the exp-run you were talking about? I'd be very interested to see the results. Louis -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20190211/32b535f8/attachment.html>