Chandler Carruth via llvm-dev
2017-Jan-04  14:11 UTC
[llvm-dev] RFC: Reconsidering adding gmock to LLVM's unittest utilities
A long time ago I suggested that we might want to add gmock to complement the facilities provided by gtest in LLVM's unittests. It didn't go over well:

1) There was concern over the benefit vs. the cost
2) Also concern about what the facilities would look like in practice and whether they would actually help
3) At the time, I didn't have good, large examples of what these things might look like or why they might be attractive
4) I didn't provide any real explanation of what gmock *did* and so it was vague and unclear.

Since then, a lot has changed. We have heavier use of unit testing in the project, with more developers finding benefit from it. And I think I have compelling examples.

## Matchers

To start off, it is important to understand that there are two components to what gmock offers. The first has very little to do with "mocks". It is actually a matcher language and system for writing test predicates:

    EXPECT_EQ(expected, actual);
    EXPECT_NE(something, something);

Become instead:

    EXPECT_THAT(actual, Eq(expected));
    EXPECT_THAT(actual, Ne(not-expected));

This pattern moves the *matcher* out of the *macro*, giving it a proper C++ API. With that, we get two huge benefits: extensibility and composability. You can easily write a matcher that concisely summarizes the expectation for custom data types. And you can compose these matchers in powerful ways. I'll give one example here:

    EXPECT_THAT(MyDenseMap, UnorderedElementsAre(Eq(key1, value1), Eq(key2, value2), Eq(key3, value3)));

Here I'm composing equality matchers inside a matcher that can handle *unordered* container element-wise comparison for generic, arbitrary containers. With a small patch, I've even extended it to support arbitrary iterator ranges! Combine this with custom matchers for the elements, and it becomes a very expressive and declarative way to write expectations in tests.
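As a rough illustration of what moving the matcher out of the macro buys, here is a gmock-free toy in standard C++. Every name in it (`Matcher`, `Eq`, `Gt`, `Pair`, `UnorderedElementsAre`, `ExpectThat`, `IntPair`) is a hypothetical stand-in written for this sketch and not gmock's implementation, and the brute-force permutation search is only sensible for tiny containers. In stock gmock the key/value element matcher is spelled `Pair(key, value)`, so the `Eq(key1, value1)` above is probably best read as shorthand.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>
#include <string>
#include <utility>
#include <vector>

// Toy matcher: a predicate bundled with a description of what it accepts.
// Hypothetical stand-in for illustration; gmock's real matchers are richer.
template <typename T> struct Matcher {
  std::function<bool(const T &)> Matches;
  std::string Description;
};

// Leaf matcher: accepts values equal to Expected.
inline Matcher<int> Eq(int Expected) {
  return {[Expected](const int &V) { return V == Expected; },
          "is equal to " + std::to_string(Expected)};
}

// Leaf matcher: accepts values strictly greater than Bound.
inline Matcher<int> Gt(int Bound) {
  return {[Bound](const int &V) { return V > Bound; },
          "is greater than " + std::to_string(Bound)};
}

using IntPair = std::pair<int, int>;

// Composite: a pair whose first and second each satisfy a sub-matcher.
inline Matcher<IntPair> Pair(Matcher<int> First, Matcher<int> Second) {
  std::string Desc = "(" + First.Description + ", " + Second.Description + ")";
  return {[First, Second](const IntPair &P) {
            return First.Matches(P.first) && Second.Matches(P.second);
          },
          Desc};
}

// Composite: each element satisfies a distinct sub-matcher, in any order.
// Tries every assignment of elements to matchers (fine for a sketch only).
template <typename T>
Matcher<std::vector<T>> UnorderedElementsAre(std::vector<Matcher<T>> Ms) {
  std::string Desc = "has, in any order, elements that match:";
  for (const Matcher<T> &M : Ms)
    Desc += " [" + M.Description + "]";
  auto Fn = [Ms](const std::vector<T> &C) -> bool {
    if (C.size() != Ms.size())
      return false;
    std::vector<std::size_t> Idx(C.size());
    std::iota(Idx.begin(), Idx.end(), 0);
    do {
      bool All = true;
      for (std::size_t I = 0; I != Ms.size() && All; ++I)
        All = Ms[I].Matches(C[Idx[I]]);
      if (All)
        return true;
    } while (std::next_permutation(Idx.begin(), Idx.end()));
    return false;
  };
  return {Fn, Desc};
}

// Analogue of EXPECT_THAT: reports whether Value matches and, if it does
// not, fills Why with the matcher's own description of what it wanted.
template <typename T>
bool ExpectThat(const T &Value, const Matcher<T> &M, std::string &Why) {
  if (M.Matches(Value))
    return true;
  Why = "expected a value that " + M.Description;
  return false;
}
```

Because each matcher is an ordinary value, `Pair(Eq(2), Gt(5))` can be nested inside `UnorderedElementsAre<IntPair>({...})` without any new macro, and the failure text is assembled from the pieces that were composed.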

I wanted to give a realistic and compelling example so I rewrote an entire test: https://reviews.llvm.org/D28290

Note that I moved *every* EXPECT to the new syntax so this is essentially worst-case. It also involves a non-trivial custom matcher. Despite this, the code is shorter, easier to read and easier to maintain. It has fewer unnecessary orderings enforced. And it is much easier to extend. Also, the error messages when it fails are substantially improved because these composed matchers have logic to carefully explain *why* they failed to match.

I hope folks find this compelling. I think this alone is worth carrying the gmock code in tree -- it is just used by tests and not substantially larger than gtest. Even if we decide we want nothing to do with mocks, I would very much like to have the matchers.

## Mocks

So, now let's consider mocks. First off, what are mocks? I'll give a fairly casual definition here: they are test objects which implement some API and allow the test to explicitly set expectations on how that API is used and how it in turn should behave. For a more detailed vocabulary see [1] and for a more lengthy discussion see [2].

As came up in the original discussion, LLVM relatively infrequently has a need to test API interactions in this way. Usually we're in the business of translating things from format A to B (instructions, metadata, whatever) and can write down one format and write checks against the other format for tests. This is a wonderful world to live in with tests. I never want LLVM to *decrease* how much we leverage this.

But we *do* have API interactions that we need to test. We have plugin APIs, and hookable interfaces, ranging from Clang frontend actions to JIT listeners. We also have *generic* code in ADT that is all about API interactions. Most generic code in fact is -- we want it to work for *any* T that behaves in a certain way, so we need to give it interesting Ts to test it.

My immediate example is the pass manager.
We plug in a bunch of passes to it, and expect it to run them in a precise way over specific bits of IR. When testing this, it is extremely cumbersome to write a test pass which does this in interesting and yet controllable and comprehensible ways. Let's look at a concrete example:

https://github.com/llvm-project/llvm-project/blob/master/llvm/unittests/IR/PassManagerTest.cpp#L481-L509

Here we have over 20 lines of code spent testing that the correct set of things happened the correct number of times. I had to write a long comment just to explain what these numbers mean. And I still never understand whether a change in the numbers really means a good or bad thing.

Now, we *have* detailed logging-based tests that use FileCheck, which is the primary way to avoid this in LLVM. But it isn't enough. In these tests we want to carefully *permute* the behavior of very specific runs of individual passes. A simple example of this can be seen here, where we have somewhat magical state in a pass to flip-flop its behavior:

https://github.com/llvm-project/llvm-project/blob/master/llvm/unittests/IR/PassManagerTest.cpp#L138-L139

And it gets more complicated if you want statefulness like triggering on the *3rd* run of the pass.

But these are exactly the kinds of scenarios that I needed to write tests for in order to get the code to be correct. I have consistently found and been able to fix bugs throughout the pass manager by writing careful unittests.

Mocks with GoogleMock are, IMO, a *tool to create interesting and debuggable test objects*. These objects can then be fed into an API to exercise it in ways that are hard or impossible to control from a command line in sufficient granularity and precision. While doing this is never fun and should be avoided where possible, when we need to do this I think it provides a powerful tool for the job.

Here is how it works at the highest level:

1) Create a class with a MOCK_METHOD*(...) API. This API is then hookable by gmock.
2) Use some APIs to register default behaviors for the APIs.
3) Set up the *minimal* amount of expected API interactions for a given test. I.e., for this test to pass, X has to happen and in response to that my code needs to do Y.
4) Feed this class, or a wrapper around it if you need a copyable object, into the system you are testing and run it.

If the expected interactions don't occur, you get a trace of which ones failed and why. These traces are somewhat verbose and hard to read, but they actually have the information needed to debug the system, which saves you from building infrastructure to extract that over and over again.

But a concrete example will likely work better. I've used gmock to build the unit tests for a major revision of the LoopPassManager in the new pass manager. This is a substantial redesign that now handles inserting new loops, deleting loops, and invalidating analyses. The tests for it are, IMO, dramatically more readable than the test linked above. And they are substantially more thorough and precise:

https://reviews.llvm.org/D28292

I hope this is compelling for folks. Just writing and debugging this one test was extremely compelling for me. I ended up with much better coverage and precision than I would have using any other technique without a tremendous amount of plumbing, essentially re-inventing a framework to build test pass objects that work exactly the way these mock pass handles do.

That said, all is not perfect. For instance, gmock suffers from being designed in a C++98 world. It has relatively poor support for move and value semantics, which resulted in my using a wrapper around the mock interfaces in the above patch to let a pimpl idiom provide the value semantics I wanted. However, that idiom works well, and this didn't substantially impede my use of the infrastructure.

Also, I remain very sympathetic to the idea that this kind of testing apparatus should be relatively rarely needed.
We shouldn't be writing new complex unit tests for APIs every week. But even a few use cases such as to test ADTs and generic tools like the pass manager seem to justify the cost to me, and I'm happy to help draw up fairly restrictive guidance around mocks for the coding standards.

Thanks, and sorry for the long email, but I wanted to try and lay out the issues in a way folks could understand, and the examples, while hopefully useful, are quite large and complex.

Please don't hesitate to ask questions if stuff isn't clear.
-Chandler

[1]: https://en.wikipedia.org/wiki/Test_double
[2]: http://martinfowler.com/articles/mocksArentStubs.html
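
The four numbered steps in the Mocks section above can be sketched without gmock, as a hand-rolled stand-in in standard C++. Everything here is hypothetical and written only for illustration: `Function`, `Preserved`, `MockPassHandle`, `MockPass` and `runPipeline` are not LLVM or gmock types, and real gmock generates this kind of bookkeeping through its mock-method, `ON_CALL` and `EXPECT_CALL` macros with far better diagnostics.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for IR and pass results; not LLVM's real types.
struct Function {
  std::string Name;
};
enum class Preserved { All, None };

// Step 1: a test object with a hookable run() method.
class MockPassHandle {
public:
  using RunFn = std::function<Preserved(const Function &)>;

  // Step 2: default behavior for calls no expectation claims.
  void onRunByDefault(RunFn Fn) { Default = std::move(Fn); }

  // Step 3: expect a later call on the function called Name and answer it
  // with Fn. Expectations are consumed in the order they were registered.
  void expectRun(std::string Name, RunFn Fn) {
    Expected.push_back({std::move(Name), std::move(Fn)});
  }

  Preserved run(const Function &F) {
    ++Calls;
    if (Next < Expected.size() && Expected[Next].Name == F.Name)
      return Expected[Next++].Action(F);
    return Default(F);
  }

  bool allExpectationsMet() const { return Next == Expected.size(); }
  std::size_t callCount() const { return Calls; }

private:
  struct Expectation {
    std::string Name;
    RunFn Action;
  };
  RunFn Default = [](const Function &) { return Preserved::All; };
  std::vector<Expectation> Expected;
  std::size_t Next = 0;
  std::size_t Calls = 0;
};

// Step 4: a cheap copyable wrapper, so the system under test can take the
// pass by value while the test keeps hold of the handle.
struct MockPass {
  MockPassHandle *Handle;
  Preserved run(const Function &F) { return Handle->run(F); }
};

// Trivial "system under test": runs every pass over every function and
// counts the runs that reported preserving nothing.
inline int runPipeline(std::vector<MockPass> Passes,
                       const std::vector<Function> &Fns) {
  int Invalidations = 0;
  for (const Function &F : Fns)
    for (MockPass &P : Passes)
      if (P.run(F) == Preserved::None)
        ++Invalidations;
  return Invalidations;
}
```

A test would register one expectation (say, "when run on `g`, report `Preserved::None`"), feed `MockPass{&Handle}` into `runPipeline`, and then check both the pipeline's result and `allExpectationsMet()`. The behavior of one specific run is permuted without adding flip-flop state to the pass itself.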
Zachary Turner via llvm-dev
2017-Jan-04  17:49 UTC
[llvm-dev] RFC: Reconsidering adding gmock to LLVM's unittest utilities
TL;DR - I want this.

On Wed, Jan 4, 2017 at 6:11 AM Chandler Carruth via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> ## Matchers
>
> To start off, it is important to understand that there are two components to what gmock offers. The first has very little to do with "mocks". It is actually a matcher language and system for writing test predicates:
>
>     EXPECT_EQ(expected, actual);
>     EXPECT_NE(something, something);
>
> Become instead:
>
>     EXPECT_THAT(actual, Eq(expected));
>     EXPECT_THAT(actual, Ne(not-expected));
>
> This pattern moves the *matcher* out of the *macro*, giving it a proper C++ API. With that, we get two huge benefits: extensibility and composability. You can easily write a matcher that summarizes concisely the expectation for custom data types. And you can compose these matchers in powerful ways. I'll give one example here:
>
>     EXPECT_THAT(MyDenseMap, UnorderedElementsAre(Eq(key1, value1), Eq(key2, value2), Eq(key3, value3)));
>
> Here I'm composing equality matchers inside a matcher that can handle *unordered* container element-wise comparison for generic, arbitrary containers. With a small patch, I've even extended it to support arbitrary iterator ranges! Combine this with custom matchers for the elements, and it becomes a very expressive and declarative way to write expectations in tests.
>
> I wanted to give a realistic and compelling example so I rewrote an entire test: https://reviews.llvm.org/D28290 Note that I moved *every* EXPECT to the new syntax so this is essentially worst-case. It also involves a non-trivial custom matcher. Despite this, the code is shorter, easier to read and easier to maintain. It has fewer unnecessary orderings enforced. And it is much easier to extend. Also, the error messages when it fails are substantially improved because these composed matchers have logic to carefully explain *why* they failed to match.
>
> I hope folks find this compelling.
> I think this alone is worth carrying the gmock code in tree -- it is just used by tests and not substantially larger than gtest. Even if we decide we want nothing to do with mocks, I would very much like to have the matchers.

+1, these look amazing. Oftentimes I find myself writing many EXPECT statements to test a single logical condition. When you want to do this for many different inputs / outputs of an API, it turns into a long list of EXPECT statements that the person reading the test can't easily grok and see how they're related. Here's an example from the formatv tests that I wrote:

    Replacements = formatv_object_base::parseFormatString("{0,-3}");
    ASSERT_EQ(1u, Replacements.size());
    EXPECT_EQ(ReplacementType::Format, Replacements[0].Type);
    EXPECT_EQ(0u, Replacements[0].Index);
    EXPECT_EQ(3u, Replacements[0].Align);
    EXPECT_EQ(AlignStyle::Left, Replacements[0].Where);
    EXPECT_EQ("", Replacements[0].Options);

It would be nice if I could write:

    EXPECT_THAT(Replacements, ReplacementsAre(Rep(Format, 0, 3, Left, "")));

This isn't a huge win here, but if you have a longer format string where there are multiple replacements, you end up with 5 lines per replacement, which starts to become very unwieldy and hard to follow. Now multiply that by the number of different edge cases you want to test, and you end up losing test coverage because you have to balance maintainability of the test's code with test coverage, and adding 100 lines to test one API hurts readability more than it helps test coverage.

Another thing. Oftentimes I find myself writing a function to test a complex condition, like this:

    EXPECT_TRUE(ComplexTest(Value));

But then you lose the ability to see from the error message why the complex test failed. You say this is handled by the matcher infrastructure although I don't see an example, but I'll take your word for it.
If so, these matchers seem like an across-the-board win and I hope to be able to use them in-tree soon.

> ## Mocks
>
> So, now let's consider mocks. First off, what are mocks? I'll give a fairly casual definition here: they are test objects which implement some API and allow the test to explicitly set expectations on how that API is used and how it in turn should behave. For a more detailed vocabulary see [1] and for a more lengthy discussion see [2].
>
> As came up in the original discussion, LLVM relatively infrequently has a need to test API interactions in this way. Usually we're in the business of translating things from format A to B (instructions, metadata, whatever) and can write down one format and write checks against the other format for tests. This is a wonderful world to live in with tests. I never want LLVM to *decrease* how much we leverage this.

You're forgetting about that troublesome LLVM subproject that nobody wants to think about which does things completely differently: LLDB. ;-) LLDB *very frequently* has a need to test API interactions in this way, and is *very infrequently* in the business of translating things from format A to format B.

> Also, I remain very sympathetic to the idea that this kind of testing apparatus should be relatively rarely needed. We shouldn't be writing new complex unit tests for APIs every week. But even a few use cases such as to test ADTs and generic tools like the pass manager seem to justify the cost to me, and I'm happy to help draw up fairly restrictive guidance around mocks for the coding standards.

In LLDB, I think this will end up being the most useful kind of unit test. There is so little test coverage right now precisely because certain things in an interactive application are hard/impossible to test with a garbage-in garbage-out model.

Consider me on board.
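
To make the "why did the complex test fail" point in this message concrete, here is a gmock-free sketch of a check that returns an explanation instead of a bare bool. The `Replacement` struct and both enums only mirror the field names used in the formatv example above and are assumptions, not LLVM's actual definitions; `checkField` and `explainReplacements` are hypothetical helpers. With gmock itself, the likely route would be a custom matcher (for example via `MATCHER_P`, or by combining `Field` and `AllOf`) that writes to its result listener.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical mirror of the fields checked in the email's example.
enum class ReplacementType { Literal, Format };
enum class AlignStyle { Left, Center, Right };
struct Replacement {
  ReplacementType Type;
  unsigned Index;
  unsigned Align;
  AlignStyle Where;
  std::string Options;
};

// Compares one field and, on a mismatch, records which field differed.
template <typename T>
void checkField(std::ostringstream &Why, const char *Name, const T &Actual,
                const T &Expected) {
  if (!(Actual == Expected))
    Why << "  field " << Name << " differs\n";
}

// Returns an empty string on a match, otherwise a description of every
// mismatch: the detail that EXPECT_TRUE(ComplexTest(Value)) throws away.
inline std::string explainReplacements(const std::vector<Replacement> &Actual,
                                       const std::vector<Replacement> &Expected) {
  std::ostringstream Why;
  if (Actual.size() != Expected.size()) {
    Why << "expected " << Expected.size() << " replacements, got "
        << Actual.size() << "\n";
    return Why.str();
  }
  for (std::size_t I = 0; I != Actual.size(); ++I) {
    std::ostringstream Fields;
    checkField(Fields, "Type", Actual[I].Type, Expected[I].Type);
    checkField(Fields, "Index", Actual[I].Index, Expected[I].Index);
    checkField(Fields, "Align", Actual[I].Align, Expected[I].Align);
    checkField(Fields, "Where", Actual[I].Where, Expected[I].Where);
    checkField(Fields, "Options", Actual[I].Options, Expected[I].Options);
    if (!Fields.str().empty())
      Why << "replacement #" << I << " does not match:\n" << Fields.str();
  }
  return Why.str();
}
```

One expectation per format string then replaces the five per-field EXPECT_EQ lines, and a failure names the element and the field that differed rather than only reporting `false`.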
Pete Cooper via llvm-dev
2017-Jan-04  19:55 UTC
[llvm-dev] RFC: Reconsidering adding gmock to LLVM's unittest utilities
> On Jan 4, 2017, at 9:49 AM, Zachary Turner via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> TL;DR - I want this.

For the most part, +1 from me too. A few comments though.

> On Wed, Jan 4, 2017 at 6:11 AM Chandler Carruth via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> ## Matchers
>
> To start off, it is important to understand that there are two components to what gmock offers. The first has very little to do with "mocks". It is actually a matcher language and system for writing test predicates:
>
>     EXPECT_EQ(expected, actual);
>     EXPECT_NE(something, something);
>
> Become instead:
>
>     EXPECT_THAT(actual, Eq(expected));
>     EXPECT_THAT(actual, Ne(not-expected));

For the cases where you have containers and other non-trivial objects, I completely agree that this is compelling. However, for simple cases like string equality I don't like the change from EXPECT_EQ(a, b) to EXPECT_THAT(a, Eq(b)).

Which brings me to what I guess is my main question. Are we going to be able to keep using EXPECT_EQ (and others) via gtest? Or are we going to slowly migrate from gtest to gmock? I don't think you are suggesting phasing out gtest, but at the same time I'm not really sure why we should have both. It may be easier to move completely to gmock if it's more powerful, even if the checks are sometimes more verbose for simple cases.

Anyway, for at least the subset of cases which need the more powerful forms of testing, this seems like a reasonable thing to add.

Cheers,
Pete
Philip Reames via llvm-dev
2017-Jan-05  06:16 UTC
[llvm-dev] RFC: Reconsidering adding gmock to LLVM's unittest utilities
No strong opinion, but certainly not opposed. If it makes testing pass manager changes easier, SGTM.

Philip
Matthias Braun via llvm-dev
2017-Jan-05  23:19 UTC
[llvm-dev] RFC: Reconsidering adding gmock to LLVM's unittest utilities
- Providing some universal helpers for various situations that you want to EXPECT() on sounds great.
- I can see how the "Mocks" stuff can help in the pass manager case. There is some cost learning yet another library just to test a feature, so we should keep pushing for simple/well known solutions (mostly thinking of "helper-command | FileCheck") by default and at least require people to write long justifications like this when they want to use gmock :)

- Matthias
Chandler Carruth via llvm-dev
2017-Jan-06  00:01 UTC
[llvm-dev] RFC: Reconsidering adding gmock to LLVM's unittest utilities
So far I've not heard any objections to the core of:

1) add the utility code
2) use it in the *clear* places where it makes a substantial improvement, both matchers and mocks

I'd really like to hear if there are serious concerns here, but so far this looks like pretty strong consensus. If possible I'd like to make progress on landing the actual code Friday, so if you haven't given a shout yet, please do.

Of course, if new concerns come up, we can always revisit this. It's just internal testing utilities, so it seems especially low-risk.

We still need to sort out several details of course:

a) I will put together some good LLVM-focused primitives (mostly around matchers) in a common location. At the least this will give us a good pattern to follow as new bits of common stuff come up. I'll get the initial skeleton of this quickly and then everyone should be able to chip in with the bits that they need. I'll send this out as a relatively small follow-up patch that we can discuss in code review to get the location / pattern right.

b) We will definitely want some guidelines around *how* and *when* to use this stuff. I'll try and distill something more brief than my email and incorporating some of the comments on this thread, and put it up for review as an addition to the coding standards. This will take me a bit more time but I'm happy to make sure this happens. This code review can then serve as a place to discuss the somewhat mechanical bits that are still important such as should we write `EXPECT_EQ(b, a)`, `EXPECT_THAT(a, Eq(b))`, or (with some custom magic) `EXPECT(a, Eq(b))`.

c) It might be helpful to have an LLVM-focused explanatory guide to how gtest+gmock work and how to use them effectively. I'm not the best at writing this documentation, so if anyone else wants to take a stab at it, honestly I'd appreciate that. Happy to review of course. But if no one else feels like they can help with this, I can try to pull this together as well.
It will definitely take a bit though.

I don't think any of these really need to be blocking, as it seems like the example usages I posted weren't terribly controversial, and it'll be easy to update based on any changes in suggested practice from (a) or (b).

Does that sound right? Anything I'm missing? Any concerns with this path forward?

Also, thanks everyone! I know my writeup was a bit long, and I appreciate everyone taking the time.

-Chandler
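
On the spelling question raised in (b), a minimal sketch may help show why the argument order flips between `EXPECT_EQ(b, a)` and `EXPECT_THAT(a, Eq(b))`: once the comparison is an ordinary object instead of part of a macro name, the value under test comes first and the matcher second, and matchers can wrap one another. Nothing below is gmock's code. `sketch::Eq`, `sketch::Not`, and `sketch::explainMismatch` are hypothetical stand-ins written only for illustration, using nothing but the standard library; the real matchers live in gmock's `testing` namespace and are far more capable.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <utility>

namespace sketch {

// Hypothetical stand-in for an equality matcher: an ordinary object that can
// test a value and describe itself. This is NOT gmock's implementation.
template <typename T> class EqMatcher {
  T Expected;

public:
  explicit EqMatcher(T E) : Expected(std::move(E)) {}

  template <typename U> bool matches(const U &Actual) const {
    return Actual == Expected;
  }

  std::string describe() const {
    std::ostringstream OS;
    OS << "is equal to " << Expected;
    return OS.str();
  }
};

template <typename T> EqMatcher<T> Eq(T Expected) {
  return EqMatcher<T>(std::move(Expected));
}

// Hypothetical combinator: because a matcher is a value rather than part of
// a macro name, it can be wrapped by another matcher.
template <typename M> class NotMatcher {
  M Inner;

public:
  explicit NotMatcher(M I) : Inner(std::move(I)) {}

  template <typename U> bool matches(const U &Actual) const {
    return !Inner.matches(Actual);
  }

  std::string describe() const { return "not (" + Inner.describe() + ")"; }
};

template <typename M> NotMatcher<M> Not(M Inner) {
  return NotMatcher<M>(std::move(Inner));
}

// Actual value first, matcher second, mirroring EXPECT_THAT(a, Eq(b)).
// Returns an empty string on a match, otherwise a description of the failure.
template <typename T, typename M>
std::string explainMismatch(const T &Actual, const M &Matcher) {
  if (Matcher.matches(Actual))
    return std::string();
  std::ostringstream OS;
  OS << "Expected: " << Matcher.describe() << "; actual: " << Actual;
  return OS.str();
}

// Small usage example; call it from a test or from main().
inline void usageExample() {
  assert(explainMismatch(3, Eq(3)).empty());
  assert(!explainMismatch(4, Eq(3)).empty());
  assert(explainMismatch(4, Not(Eq(3))).empty());
}

} // namespace sketch
```

The point of the sketch is only the shape of the API: the failure message is assembled from the matcher's own description, which is roughly why composed matchers can explain a mismatch without any per-test plumbing. Whether LLVM should prefer the two-argument macro or a matcher-based spelling is exactly the guideline question left open in (b).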