Hi,

Further to previous discussions titled "your opinion about testing" I'd like to propose a metadata format for the test script files, and would obviously welcome people's input.

Effectively each test in the scripts is represented by a function and a call to run_test, so we have

test_function() {
  ...code
}

run_test function "Description of the function"

I'd like to propose that above every function a here document is placed that contains yaml v1.2 encoded data (yaml.org) with 2 characters for the indent. The block will start with << TEST_METADATA and be terminated with TEST_METADATA. We might want to place it in a comment block, but this is not really required. The block will also be wrapped at 80 characters for readability.

The compulsory elements of the data will be:

Name: Name of the function; this ensures pairing between function and comments is not just file relative.

Summary: Will often be the description after the run_test, but not always, as the tense will change.

Description: A full description of the function; the more information here the better.

Components: This is the component described in the commit message (http://wiki.whamcloud.com/display/PUB/Commit+Comments). To make this useful we will need to come up with a defined set of components that will need to be enforced in the commit message. The format of this entry will be a yaml array.

Prerequisites: Prerequisite tests that must be run before this test can be run. This is again an array, which presumes a test may have multiple prerequisites, but the data should not contain a chain of prerequisites, i.e. if A requires B and B requires C, the prerequisites of A are B, not B & C.

TicketIDs: This is an array of ticket numbers that this test explicitly tests. In theory we should aim for the state where every ticket has a test associated with it, and in future we should be able to carry out a gap analysis.

As time goes on we may well expand this compulsory list, but this is, I believe, a sensible starting place.

Being part of the source, this data will be subject to the same review process as any other change, and so we cannot store dynamic data here, such as pass rates etc.

Do people think that additional data fields should be permitted on an ad hoc basis, or should a controlled list of permitted data elements be kept? I'm tempted to say that ad hoc additional fields should be allowed, although this could lead to name clashes if people are not careful.

Below is a simple example.

======================================================================
<<TEST_METADATA
Name:
  before_upgrade_create_data
Summary:
  Copies lustre source into a node specific directory and then creates
  a tarball using that directory
Description:
  This should be called prior to upgrading Lustre and creates a set of
  data on the Lustre partition which can be accessed and checked after
  the upgrade has taken place. Several methods are used, including
  tar'ing directories so they can later be untar'ed and compared, along
  with creating sha1's of stored data.
Components:
  - lnet
  - recovery
Prerequisites:
  - before_upgrade_clear_filesystem
TicketIDs:
  - LU-123
  - LU-432
TEST_METADATA

test_before_upgrade_create_data() {
  ...
}

run_test before_upgrade_create_data "Copying lustre source into a
directory $IOP_DIR1, creating and then using source to create a tarball"
======================================================================

As I said, comments, inputs and thoughts much appreciated.

Thanks

Chris
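The proposal does not spell out how the here document is anchored so that the shell ignores it; below is a minimal sketch of one possible embedding, feeding the block to the ':' no-op builtin. The test name and body are placeholders invented for the sketch, not real Lustre tests.

# One possible embedding of the proposed block: the ':' builtin reads the
# here document and discards it, so the YAML costs nothing at run time but
# is easy to locate from outside the script. Quoting the delimiter
# (<<'TEST_METADATA') would additionally stop bash expanding $variables
# inside the block.
: <<TEST_METADATA
Name:
  metadata_example
Summary:
  Placeholder test used only to show where the metadata block sits
Description:
  Placeholder description.
Components:
  - lnet
Prerequisites: []
TicketIDs:
  - LU-123
TEST_METADATA

test_metadata_example() {
    true    # real test body goes here
}
run_test metadata_example "Placeholder test showing the metadata block"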
Roman Grigoryev
2012-Apr-30 18:15 UTC
[Lustre-discuss] Metadata storage in test script files
Hi Chris,

I'm glad to read further emails in this direction. Please don't consider this as criticism, I would just like to get more clarity: what is the target of adding this metadata? Do you have plans to use the metadata in other scripts? How? Does this metadata go into the results?

Also please see more of my comments inline:

On 04/30/2012 08:50 PM, Chris wrote:
> Hi,
>
> Further to previous discussions titled "your opinion about testing" I'd like to propose a metadata format for the test script files, and would obviously welcome people's input.
>
> Effectively each test in the scripts is represented by a function and a call to run_test, so we have
>
> test_function() {
>   ...code
> }
>
> run_test function "Description of the function"
>
> I'd like to propose that above every function a here document is placed that contains yaml v1.2 encoded data (yaml.org) with 2 characters for the indent. The block will start with << TEST_METADATA and be terminated with TEST_METADATA. We might want to place it in a comment block, but this is not really required. The block will also be wrapped at 80 characters for readability.
>
> The compulsory elements of the data will be:
>
> Name: Name of the function; this ensures pairing between function and comments is not just file relative.
> Summary: Will often be the description after the run_test, but not always, as the tense will change.
> Description: A full description of the function; the more information here the better.
> Components: This is the component described in the commit message (http://wiki.whamcloud.com/display/PUB/Commit+Comments). To make this useful we will need to come up with a defined set of components that will need to be enforced in the commit message. The format of this entry will be a yaml array.
> Prerequisites: Prerequisite tests that must be run before this test can be run. This is again an array, which presumes a test may have multiple prerequisites, but the data should not contain a chain of prerequisites, i.e. if A requires B and B requires C, the prerequisites of A are B, not B & C.

At which step do you want to check chains? And what is the logical basis for these prerequisites, other than the case where current tests have hidden dependencies? I don't see any difference between one test whose body is built from tests a, b, c and this prerequisites definition. Could you please explain more why we need this field?

> TicketIDs: This is an array of ticket numbers that this test explicitly tests. In theory we should aim for the state where every ticket has a test associated with it, and in future we should be able to carry out a gap analysis.

I suggest adding keywords (Components could be treated as keywords too) and a test type (stress, benchmark, load, functional, negative, etc.) for quick filtering. For example, SLOW could be transformed into a keyword.

Also, I would like to mention that we have 3 different logical types of data:
1) just human-readable descriptions
2) filtering and targeting fields (Components, and keywords if you agree with my suggestion)
3) framework directives (Prerequisites)

> As time goes on we may well expand this compulsory list, but this is, I believe, a sensible starting place.
>
> Being part of the source, this data will be subject to the same review process as any other change, and so we cannot store dynamic data here, such as pass rates etc.

What do you think, maybe it is a good idea to keep the metadata separately? This could be useful for simplifying mass modification of the data via script, as well as for adding tickets, pass rates and execution times on 'gold' configurations.

Thanks,
Roman

> Do people think that additional data fields should be permitted on an ad hoc basis, or should a controlled list of permitted data elements be kept? I'm tempted to say that ad hoc additional fields should be allowed, although this could lead to name clashes if people are not careful.
>
> Below is a simple example.
>
> ======================================================================
> <<TEST_METADATA
> Name:
>   before_upgrade_create_data
> Summary:
>   Copies lustre source into a node specific directory and then creates
>   a tarball using that directory
> Description:
>   This should be called prior to upgrading Lustre and creates a set of
>   data on the Lustre partition which can be accessed and checked after
>   the upgrade has taken place. Several methods are used, including
>   tar'ing directories so they can later be untar'ed and compared, along
>   with creating sha1's of stored data.
> Components:
>   - lnet
>   - recovery
> Prerequisites:
>   - before_upgrade_clear_filesystem
> TicketIDs:
>   - LU-123
>   - LU-432
> TEST_METADATA
>
> test_before_upgrade_create_data() {
>   ...
> }
>
> run_test before_upgrade_create_data "Copying lustre source into a
> directory $IOP_DIR1, creating and then using source to create a tarball"
> ======================================================================
>
> As I said, comments, inputs and thoughts much appreciated
>
> Thanks
>
> Chris
On 30/04/2012 19:15, Roman Grigoryev wrote:
> Hi Chris,
> I'm glad to read further emails in this direction. Please don't consider this as criticism, I would just like to get more clarity: what is the target of adding this metadata? Do you have plans to use the metadata in other scripts? How? Does this metadata go into the results?
>
> Also please see more of my comments inline:

The metadata can be used in a multitude of ways; for example, we can create dynamic test sets based on the changes made or the target area of testing. What we are doing here is creating an understanding of the tests that we have, so that we can improve our processes and testing capabilities in the future.

The metadata does not go into the results. The metadata is a database in its own right, and should metadata about a test be required, it would be accessed from the source (database) itself.

> On 04/30/2012 08:50 PM, Chris wrote:
... snip ...
>> Prerequisites: Prerequisite tests that must be run before this test can be run. This is again an array, which presumes a test may have multiple prerequisites, but the data should not contain a chain of prerequisites, i.e. if A requires B and B requires C, the prerequisites of A are B, not B & C.
> At which step do you want to check chains? And what is the logical basis for these prerequisites, other than the case where current tests have hidden dependencies? I don't see any difference between one test whose body is built from tests a, b, c and this prerequisites definition. Could you please explain more why we need this field?

As I said, we can mine this data at any time and in any way that we want, and the purpose of this discussion is the data, not how we use it. But as an example, something that dynamically built test sets would need to know prerequisites.

The suffix of a, b, c could be used to generate prerequisite information, but it is firstly inflexible - for example, I bet 'b', 'c' and 'd' are often dependent on 'a' but not on each other - and secondly, and more importantly, we want a standard form for storing metadata because we want to introduce order and knowledge into the test scripts that we have today.

>> TicketIDs: This is an array of ticket numbers that this test explicitly tests. In theory we should aim for the state where every ticket has a test associated with it, and in future we should be able to carry out a gap analysis.
> I suggest adding keywords (Components could be treated as keywords too) and a test type (stress, benchmark, load, functional, negative, etc.) for quick filtering. For example, SLOW could be transformed into a keyword.

This seems like a reasonable idea, although we need a name that describes what it is, and we will need to define that set of possible words as we need to with the Components elements.

What should this field be called - we should not reduce the value of this data by genericizing it into 'keywords'.

> Also, I would like to mention that we have 3 different logical types of data:
> 1) just human-readable descriptions
> 2) filtering and targeting fields (Components, and keywords if you agree with my suggestion)
> 3) framework directives (Prerequisites)
>
>> As time goes on we may well expand this compulsory list, but this is, I believe, a sensible starting place.
>>
>> Being part of the source, this data will be subject to the same review process as any other change, and so we cannot store dynamic data here, such as pass rates etc.
> What do you think, maybe it is a good idea to keep the metadata separately? This could be useful for simplifying mass modification of the data via script, as well as for adding tickets, pass rates and execution times on 'gold' configurations.

It would be easier to store the data separately, and we could use Maloo, but it's very important that this data becomes part of the Lustre 'source' so that everybody can benefit from it. Adding tickets is not a problem, as part of resolving an issue is to ensure that at least one test exercises the problem and proves it has been fixed; the fact that this assurance process requires active interaction by an engineer with the scripts is a positive.

As for pass rate, execution time and gold configurations, this information is just not one-dimensional enough to store in the source.

Chris
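As a concrete illustration of the "dynamically built test sets" point: once the Prerequisites arrays are pulled out of the scripts they reduce to "prerequisite test" pairs, which coreutils tsort(1) turns into a valid execution order (or an error if the declared prerequisites form a cycle). The pairs below are written out by hand purely for the example; only the first two test names come from the proposal, the third is invented.

# tsort prints an ordering in which every prerequisite precedes the test
# that names it; a dependency cycle produces an error message instead.
tsort <<EOF
before_upgrade_clear_filesystem before_upgrade_create_data
before_upgrade_create_data after_upgrade_check_data
EOF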
Roman Grigoryev
2012-May-02 03:23 UTC
[Lustre-discuss] Metadata storage in test script files
Hi Chris,

On 05/01/2012 08:17 PM, Chris wrote:
> On 30/04/2012 19:15, Roman Grigoryev wrote:
>> Hi Chris,
>> I'm glad to read further emails in this direction. Please don't consider this as criticism, I would just like to get more clarity: what is the target of adding this metadata? Do you have plans to use the metadata in other scripts? How? Does this metadata go into the results?
>>
>> Also please see more of my comments inline:
> The metadata can be used in a multitude of ways; for example, we can create dynamic test sets based on the changes made or the target area of testing. What we are doing here is creating an understanding of the tests that we have, so that we can improve our processes and testing capabilities in the future.

I think that when we are defining a tool we should state its purpose. E.g. a good description and summary are not needed for creating dynamic test sets. I think it is very important to say how we will use it. For the continuation of this idea please read below.

> The metadata does not go into the results. The metadata is a database in its own right, and should metadata about a test be required, it would be accessed from the source (database) itself.

I think fields like title, summary and possibly description should be present in the results too. It can be very helpful for quickly understanding test results.

>> On 04/30/2012 08:50 PM, Chris wrote:
> ... snip ...
>>> Prerequisites: Prerequisite tests that must be run before this test can be run. This is again an array, which presumes a test may have multiple prerequisites, but the data should not contain a chain of prerequisites, i.e. if A requires B and B requires C, the prerequisites of A are B, not B & C.
>> At which step do you want to check chains? And what is the logical basis for these prerequisites, other than the case where current tests have hidden dependencies? I don't see any difference between one test whose body is built from tests a, b, c and this prerequisites definition. Could you please explain more why we need this field?
> As I said, we can mine this data at any time and in any way that we want, and the purpose of this discussion is the data, not how we use it. But as an example, something that dynamically built test sets would need to know prerequisites.
>
> The suffix of a, b, c could be used to generate prerequisite information, but it is firstly inflexible - for example, I bet 'b', 'c' and 'd' are often dependent on 'a' but not on each other - and secondly, and more importantly, we want a standard form for storing metadata because we want to introduce order and knowledge into the test scripts that we have today.

Why I asked about the way of usage: if we want to use this information in scripts and in other automated ways, we must strictly specify the logic of the items and provide a tool to check it.

E.g. we will use it when building the test execution queue. We have a chain like this: test C has prerequisite B, test B has prerequisite A. Test A doesn't have a prerequisite. One day test A becomes excluded. Is it possible to execute test C? But if we will not use it in scripting, there is no big logical problem.

(My opinion: I don't like this situation and think that test dependencies should be used only in very specific and rare cases.)

>>> TicketIDs: This is an array of ticket numbers that this test explicitly tests. In theory we should aim for the state where every ticket has a test associated with it, and in future we should be able to carry out a gap analysis.
>> I suggest adding keywords (Components could be treated as keywords too) and a test type (stress, benchmark, load, functional, negative, etc.) for quick filtering. For example, SLOW could be transformed into a keyword.
> This seems like a reasonable idea, although we need a name that describes what it is, and we will need to define that set of possible words as we need to with the Components elements.

I mean that 'keywords' should be separate from Components, but Components could be logically included in them. I think 'Components' is a special type of keyword.

> What should this field be called - we should not reduce the value of this data by genericizing it into 'keywords'.
>
>> Also, I would like to mention that we have 3 different logical types of data:
>> 1) just human-readable descriptions
>> 2) filtering and targeting fields (Components, and keywords if you agree with my suggestion)
>> 3) framework directives (Prerequisites)
>>
>>> As time goes on we may well expand this compulsory list, but this is, I believe, a sensible starting place.
>>>
>>> Being part of the source, this data will be subject to the same review process as any other change, and so we cannot store dynamic data here, such as pass rates etc.
>> What do you think, maybe it is a good idea to keep the metadata separately? This could be useful for simplifying mass modification of the data via script, as well as for adding tickets, pass rates and execution times on 'gold' configurations.
> It would be easier to store the data separately, and we could use Maloo, but it's very important that this data becomes part of the Lustre 'source' so that everybody can benefit from it. Adding tickets is not a problem, as part of resolving an issue is to ensure that at least one test exercises the problem and proves it has been fixed; the fact that this assurance process requires active interaction by an engineer with the scripts is a positive.
>
> As for pass rate, execution time and gold configurations, this information is just not one-dimensional enough to store in the source.

It was not by accident that in my previous letter I talked about groups of fields. All metadata may be separated into rarely and often changed fields. E.g. Summary will not change very often. But the test timeout on a golden configuration (I mean that this timeout would be set as a default based on the 'gold' configuration and could be overridden for a specific configuration) could be more variable (and possibly more important for testing).

Using separate files provides more flexibility, and nobody stops us from committing them to the lustre repo, at which point they become Lustre 'source'. In separate files we can use whatever format we want, and all the information will be available without parsing the shell script or running it. Moreover, in the great future, it gives us a very simple migration path from shell to another language.

A few words on how we solved this task in our wrapper test framework (see the attached sample yaml): the file contains a set of tags. The main entity is a test; in this sample each <id> element in the <Tests> array defines the logical entity 'test'. Every test inherits values from the common description (fields described outside the <Tests> array). A test can override any field or add new fields.

<groupname>, <executor>, <description>, <reference>, <roles>, <tags> are common fields. All others are executor-specific and used in the executors.

--
Thanks,
Roman

-------------- next part --------------
A non-text attachment was scrubbed...
Name: conf-sanity_tests.yaml
Type: application/x-yaml
Size: 1546 bytes
Desc: not available
Url : http://lists.lustre.org/pipermail/lustre-discuss/attachments/20120502/ad6182fe/attachment.bin
Andreas Dilger
2012-May-02 04:14 UTC
[Lustre-discuss] Metadata storage in test script files
On 2012-05-01, at 9:23 PM, Roman Grigoryev wrote:
> On 05/01/2012 08:17 PM, Chris wrote:
>> The metadata can be used in a multitude of ways; for example, we can create dynamic test sets based on the changes made or the target area of testing. What we are doing here is creating an understanding of the tests that we have, so that we can improve our processes and testing capabilities in the future.
>
> I think that when we are defining a tool we should state its purpose. E.g. a good description and summary are not needed for creating dynamic test sets. I think it is very important to say how we will use it. For the continuation of this idea please read below.
>
>> The metadata does not go into the results. The metadata is a database in its own right, and should metadata about a test be required, it would be accessed from the source (database) itself.
>
> I think fields like title, summary and possibly description should be present in the results too. It can be very helpful for quickly understanding test results.

I think what Chris was suggesting is the opposite of what you state here. He was writing that the "test metadata" under discussion here is the static description of the test, to be stored with the test itself. Chris is specifically excluding any runtime data from being stored with the test, not (as you suggest) excluding the display of this description in the test results.

>>> On 04/30/2012 08:50 PM, Chris wrote:
>>>> Prerequisites: Prerequisite tests that must be run before this test can be run. This is again an array, which presumes a test may have multiple prerequisites, but the data should not contain a chain of prerequisites, i.e. if A requires B and B requires C, the prerequisites of A are B, not B & C.
>>> At which step do you want to check chains? And what is the logical basis for these prerequisites, other than the case where current tests have hidden dependencies? I don't see any difference between one test whose body is built from tests a, b, c and this prerequisites definition. Could you please explain more why we need this field?
>> As I said, we can mine this data at any time and in any way that we want, and the purpose of this discussion is the data, not how we use it. But as an example, something that dynamically built test sets would need to know prerequisites.
>>
>> The suffix of a, b, c could be used to generate prerequisite information, but it is firstly inflexible - for example, I bet 'b', 'c' and 'd' are often dependent on 'a' but not on each other - and secondly, and more importantly, we want a standard form for storing metadata because we want to introduce order and knowledge into the test scripts that we have today.
>
> Why I asked about the way of usage: if we want to use this information in scripts and in other automated ways, we must strictly specify the logic of the items and provide a tool to check it.

I think it is sufficient to have a well-structured repository of test metadata, and then multiple uses can be found for this data. Even for human use, a good description of what the test is supposed to check, and why this test exists, would be a good start.

The test metadata format is extensible, so should we need more fields in the future it will be possible to add them. I think the hardest work will be to get good text descriptions of the tests, not mechanical issues like dependencies and such.

> E.g. we will use it when building the test execution queue. We have a chain like this: test C has prerequisite B, test B has prerequisite A. Test A doesn't have a prerequisite. One day test A becomes excluded. Is it possible to execute test C? But if we will not use it in scripting, there is no big logical problem.
>
> (My opinion: I don't like this situation and think that test dependencies should be used only in very specific and rare cases.)
>
>>>> TicketIDs: This is an array of ticket numbers that this test explicitly tests. In theory we should aim for the state where every ticket has a test associated with it, and in future we should be able to carry out a gap analysis.
>>> I suggest adding keywords (Components could be treated as keywords too) and a test type (stress, benchmark, load, functional, negative, etc.) for quick filtering. For example, SLOW could be transformed into a keyword.
>> This seems like a reasonable idea, although we need a name that describes what it is, and we will need to define that set of possible words as we need to with the Components elements.
>
> I mean that 'keywords' should be separate from Components, but Components could be logically included in them. I think 'Components' is a special type of keyword.
>
>> What should this field be called - we should not reduce the value of this data by genericizing it into 'keywords'.
>>
>>> Also, I would like to mention that we have 3 different logical types of data:
>>> 1) just human-readable descriptions
>>> 2) filtering and targeting fields (Components, and keywords if you agree with my suggestion)
>>> 3) framework directives (Prerequisites)
>>>
>>>> As time goes on we may well expand this compulsory list, but this is, I believe, a sensible starting place.
>>>>
>>>> Being part of the source, this data will be subject to the same review process as any other change, and so we cannot store dynamic data here, such as pass rates etc.
>>> What do you think, maybe it is a good idea to keep the metadata separately? This could be useful for simplifying mass modification of the data via script, as well as for adding tickets, pass rates and execution times on 'gold' configurations.
>> It would be easier to store the data separately, and we could use Maloo, but it's very important that this data becomes part of the Lustre 'source' so that everybody can benefit from it. Adding tickets is not a problem, as part of resolving an issue is to ensure that at least one test exercises the problem and proves it has been fixed; the fact that this assurance process requires active interaction by an engineer with the scripts is a positive.
>>
>> As for pass rate, execution time and gold configurations, this information is just not one-dimensional enough to store in the source.
>
> It was not by accident that in my previous letter I talked about groups of fields. All metadata may be separated into rarely and often changed fields. E.g. Summary will not change very often. But the test timeout on a golden configuration (I mean that this timeout would be set as a default based on the 'gold' configuration and could be overridden for a specific configuration) could be more variable (and possibly more important for testing).

I think this is something that needs to live outside the test metadata being described here. A "golden configuration" is hard to define, and depends heavily on factors that change from one environment to the next.

Ideally, tests will be written so that they can run under a wide range of configurations (number of clients, servers, virtual and real nodes). A further goal might be to allow many non-destructive functional subtests to be run in parallel, which would further skew the time taken, but would allow much more efficient use of test resources.

> Using separate files provides more flexibility, and nobody stops us from committing them to the lustre repo, at which point they become Lustre 'source'. In separate files we can use whatever format we want, and all the information will be available without parsing the shell script or running it. Moreover, in the great future, it gives us a very simple migration path from shell to another language.

I think the metadata format should be chosen so that it is trivial to extract the test metadata without having to execute or parse the shell (or other) test language itself. Simple filtering and regexp should be enough.

> A few words on how we solved this task in our wrapper test framework (see the attached sample yaml): the file contains a set of tags. The main entity is a test; in this sample each <id> element in the <Tests> array defines the logical entity 'test'. Every test inherits values from the common description (fields described outside the <Tests> array). A test can override any field or add new fields.
>
> <groupname>, <executor>, <description>, <reference>, <roles>, <tags> are common fields. All others are executor-specific and used in the executors.
>
> --
> Thanks,
> Roman
> <conf-sanity_tests.yaml>

Cheers, Andreas
--
Andreas Dilger
Whamcloud, Inc.
Principal Lustre Engineer
http://www.whamcloud.com/
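To make the "simple filtering and regexp" point concrete: assuming the delimiters are exactly as in Chris's example, a one-liner along these lines already pulls every metadata block out of a script (sanity.sh is just a stand-in file name for the sketch).

# Print the body of every TEST_METADATA here document, and nothing else
# (GNU sed; the delimiter lines themselves are dropped).
sed -n '/<< *TEST_METADATA/,/^TEST_METADATA$/{/TEST_METADATA/d;p;}' sanity.sh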
On 02/05/2012 04:23, Roman Grigoryev wrote:
> Hi Chris,
>
> On 05/01/2012 08:17 PM, Chris wrote:
>> The metadata can be used in a multitude of ways; for example, we can create dynamic test sets based on the changes made or the target area of testing. What we are doing here is creating an understanding of the tests that we have, so that we can improve our processes and testing capabilities in the future.
> I think that when we are defining a tool we should state its purpose. E.g. a good description and summary are not needed for creating dynamic test sets. I think it is very important to say how we will use it. For the continuation of this idea please read below.

The purpose is to enable us to develop and store knowledge/information about the tests; the information should be in a canonical form, objective and correct. If we do this then the whole community can make use of it as they see fit. I want to ensure that the initial set of stored variables describes the tests as completely as reasonably possible. The canonical description of each test is not affected by the usage to which the data is put.

>> The metadata does not go into the results. The metadata is a database in its own right, and should metadata about a test be required, it would be accessed from the source (database) itself.
> I think fields like title, summary and possibly description should be present in the results too. It can be very helpful for quickly understanding test results.

They can be presented as part of the results, but I would not store them with the results; if for example Maloo presents the description, it will fetch it from the correct version of the source. We should not be making copies of data.

I cannot say whether you should store this information with your results because I have no insight into your private testing practices.

>>> On 04/30/2012 08:50 PM, Chris wrote:
>> ... snip ...
>>
>> As I said, we can mine this data at any time and in any way that we want, and the purpose of this discussion is the data, not how we use it. But as an example, something that dynamically built test sets would need to know prerequisites.
>>
>> The suffix of a, b, c could be used to generate prerequisite information, but it is firstly inflexible - for example, I bet 'b', 'c' and 'd' are often dependent on 'a' but not on each other - and secondly, and more importantly, we want a standard form for storing metadata because we want to introduce order and knowledge into the test scripts that we have today.
> Why I asked about the way of usage: if we want to use this information in scripts and in other automated ways, we must strictly specify the logic of the items and provide a tool to check it.
>
> E.g. we will use it when building the test execution queue. We have a chain like this: test C has prerequisite B, test B has prerequisite A. Test A doesn't have a prerequisite. One day test A becomes excluded. Is it possible to execute test C? But if we will not use it in scripting, there is no big logical problem.
>
> (My opinion: I don't like this situation and think that test dependencies should be used only in very specific and rare cases.)

I don't think people should introduce dependencies either, but they have, and we have to deal with that fact. In your example, if C is dependent on A and A is removed then C cannot be run.

>>> I suggest adding keywords (Components could be treated as keywords too) and a test type (stress, benchmark, load, functional, negative, etc.) for quick filtering. For example, SLOW could be transformed into a keyword.
>> This seems like a reasonable idea, although we need a name that describes what it is, and we will need to define that set of possible words as we need to with the Components elements.
> I mean that 'keywords' should be separate from Components, but Components could be logically included in them. I think 'Components' is a special type of keyword.

I don't think of Components as a keyword; I think of it as a factual piece of data, and if we want to add the test purpose then we should call it that. The use of keywords in data is generally a typeless catch-all. All of this metadata should be clear and well defined, which does not in my opinion allow scope for a keywords element.

I would suggest that we add a variable called Purposes which is an array containing a set of predefined elements like stress, benchmark, load, functional etc. For example

Purposes:
  - stress
  - load

>> It would be easier to store the data separately, and we could use Maloo, but it's very important that this data becomes part of the Lustre 'source' so that everybody can benefit from it. Adding tickets is not a problem, as part of resolving an issue is to ensure that at least one test exercises the problem and proves it has been fixed; the fact that this assurance process requires active interaction by an engineer with the scripts is a positive.
>>
>> As for pass rate, execution time and gold configurations, this information is just not one-dimensional enough to store in the source.
> It was not by accident that in my previous letter I talked about groups of fields. All metadata may be separated into rarely and often changed fields. E.g. Summary will not change very often. But the test timeout on a golden configuration (I mean that this timeout would be set as a default based on the 'gold' configuration and could be overridden for a specific configuration) could be more variable (and possibly more important for testing).

What exactly is a gold configuration? Lustre has such a breadth of possibilities that gold configurations would be a matrix of distro/architecture/distro version/interconnect/cpu speed/memory/storage/oss count/client count/... . To try and summarise this into some useful single value does not make any sense to me.

> Using separate files provides more flexibility, and nobody stops us from committing them to the lustre repo, at which point they become Lustre 'source'. In separate files we can use whatever format we want, and all the information will be available without parsing the shell script or running it. Moreover, in the great future, it gives us a very simple migration path from shell to another language.

This data is valuable and needs to be treated with the same respect and discipline as we treat the source; to imagine we can have a 'free for all' where people just update it at will does not work. The controls on what goes into the Lustre tree are there for very good reason and we are not going to circumvent those controls. We have to invest in this as we do with all the test infrastructure; it cannot be done on the cheap.

Parsing the scripts for the data is easy because computers are really good at it. I would expect someone will write a library to access and modify the data as required, and I'd also expect them to publish that library.

If tests were re-written then this data will probably change, and the cost of migrating unchanged data will be insignificant compared to the cost of re-writing the test itself.

Chris
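A first sketch of the sort of access library Chris anticipates could be as small as the shell function below; lustre_test_meta, its argument convention and the crude name match are invented for the sketch, not an existing interface.

# Usage: lustre_test_meta <script> <test name>
# Prints each TEST_METADATA block in <script> that mentions <test name>.
lustre_test_meta() {
    local script=$1 want=$2
    awk -v want="$want" '
        /<< *TEST_METADATA/ { grab = 1; block = ""; next }
        grab && /^TEST_METADATA$/ {
            grab = 0
            # crude match: keep any block that mentions the test name at all
            if (index(block, want)) printf "%s", block
            next
        }
        grab { block = block $0 "\n" }
    ' "$script"
}

# Example call against a hypothetical script containing such blocks.
lustre_test_meta sanity.sh before_upgrade_create_data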
Roman Grigoryev
2012-May-02 15:53 UTC
[Lustre-discuss] Metadata storage in test script files
Hi Andreas,

On 05/02/2012 08:14 AM, Andreas Dilger wrote:
> On 2012-05-01, at 9:23 PM, Roman Grigoryev wrote:
>>>> On 04/30/2012 08:50 PM, Chris wrote:
>>>>> Prerequisites: Prerequisite tests that must be run before this test can be run. This is again an array, which presumes a test may have multiple prerequisites, but the data should not contain a chain of prerequisites, i.e. if A requires B and B requires C, the prerequisites of A are B, not B & C.
>>>> At which step do you want to check chains? And what is the logical basis for these prerequisites, other than the case where current tests have hidden dependencies? I don't see any difference between one test whose body is built from tests a, b, c and this prerequisites definition. Could you please explain more why we need this field?
>>> As I said, we can mine this data at any time and in any way that we want, and the purpose of this discussion is the data, not how we use it. But as an example, something that dynamically built test sets would need to know prerequisites.
>>>
>>> The suffix of a, b, c could be used to generate prerequisite information, but it is firstly inflexible - for example, I bet 'b', 'c' and 'd' are often dependent on 'a' but not on each other - and secondly, and more importantly, we want a standard form for storing metadata because we want to introduce order and knowledge into the test scripts that we have today.
>>
>> Why I asked about the way of usage: if we want to use this information in scripts and in other automated ways, we must strictly specify the logic of the items and provide a tool to check it.
>
> I think it is sufficient to have a well-structured repository of test metadata, and then multiple uses can be found for this data. Even for human use, a good description of what the test is supposed to check, and why this test exists, would be a good start.

I absolutely agree that a good description, summary and the other fields are very important.

> The test metadata format is extensible, so should we need more fields in the future it will be possible to add them. I think the hardest work will be to get good text descriptions of the tests, not mechanical issues like dependencies and such.

I think this work will take quite a long time, and I suggest requiring it only for new and changed tests. In this case, the possibility of having some kind of description inheritance is a good solution.

>> E.g. we will use it when building the test execution queue. We have a chain like this: test C has prerequisite B, test B has prerequisite A. Test A doesn't have a prerequisite. One day test A becomes excluded. Is it possible to execute test C? But if we will not use it in scripting, there is no big logical problem.
>>
>> (My opinion: I don't like this situation and think that test dependencies should be used only in very specific and rare cases.)
>>
>>>>> TicketIDs: This is an array of ticket numbers that this test explicitly tests. In theory we should aim for the state where every ticket has a test associated with it, and in future we should be able to carry out a gap analysis.
>>>> I suggest adding keywords (Components could be treated as keywords too) and a test type (stress, benchmark, load, functional, negative, etc.) for quick filtering. For example, SLOW could be transformed into a keyword.
>>> This seems like a reasonable idea, although we need a name that describes what it is, and we will need to define that set of possible words as we need to with the Components elements.
>>
>> I mean that 'keywords' should be separate from Components, but Components could be logically included in them. I think 'Components' is a special type of keyword.
>>
>>> What should this field be called - we should not reduce the value of this data by genericizing it into 'keywords'.
>>>
>>>> Also, I would like to mention that we have 3 different logical types of data:
>>>> 1) just human-readable descriptions
>>>> 2) filtering and targeting fields (Components, and keywords if you agree with my suggestion)
>>>> 3) framework directives (Prerequisites)
>>>>
>>>>> As time goes on we may well expand this compulsory list, but this is, I believe, a sensible starting place.
>>>>>
>>>>> Being part of the source, this data will be subject to the same review process as any other change, and so we cannot store dynamic data here, such as pass rates etc.
>>>> What do you think, maybe it is a good idea to keep the metadata separately? This could be useful for simplifying mass modification of the data via script, as well as for adding tickets, pass rates and execution times on 'gold' configurations.
>>> It would be easier to store the data separately, and we could use Maloo, but it's very important that this data becomes part of the Lustre 'source' so that everybody can benefit from it. Adding tickets is not a problem, as part of resolving an issue is to ensure that at least one test exercises the problem and proves it has been fixed; the fact that this assurance process requires active interaction by an engineer with the scripts is a positive.
>>>
>>> As for pass rate, execution time and gold configurations, this information is just not one-dimensional enough to store in the source.
>>
>> It was not by accident that in my previous letter I talked about groups of fields. All metadata may be separated into rarely and often changed fields. E.g. Summary will not change very often. But the test timeout on a golden configuration (I mean that this timeout would be set as a default based on the 'gold' configuration and could be overridden for a specific configuration) could be more variable (and possibly more important for testing).
>
> I think this is something that needs to live outside the test metadata being described here. A "golden configuration" is hard to define, and depends heavily on factors that change from one environment to the next.

We could separate dynamic and static metadata. But it would be good if both sets of data used one engine and storage type, just with different sources.

> Ideally, tests will be written so that they can run under a wide range of configurations (number of clients, servers, virtual and real nodes). A further goal might be to allow many non-destructive functional subtests to be run in parallel, which would further skew the time taken, but would allow much more efficient use of test resources.

It would be very good if we had a big enough set of fully independent tests.

>> Using separate files provides more flexibility, and nobody stops us from committing them to the lustre repo, at which point they become Lustre 'source'. In separate files we can use whatever format we want, and all the information will be available without parsing the shell script or running it. Moreover, in the great future, it gives us a very simple migration path from shell to another language.
>
> I think the metadata format should be chosen so that it is trivial to extract the test metadata without having to execute or parse the shell (or other) test language itself. Simple filtering and regexp should be enough.

Why do you want to do 'filtering and regexp', with some probability of error when selecting the data, and also inject special code into the shell scripts, when we can avoid it? It is a good chance to start the work of moving away from shell here.

If the question is one of developer comfort, I would prefer to provide tools for checking metadata completeness rather than have code and metadata in one file.

Also, I don't see a good way to use the 'metadata inheritance' approach in shell without adding pretty unclear shell code, so the switch to metadata usage should either happen in one step, or the test framework will just ignore it and the metadata becomes just static text for external scripts.

--
Thanks,
Roman
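The "tools for checking metadata completeness" Roman mentions could start out very small; the sketch below simply warns when a block lacks one of the compulsory keys from Chris's list (sanity.sh is again only an example target).

# Flag any TEST_METADATA block that is missing a compulsory key.
awk '
    /<< *TEST_METADATA/ { nblock++; block = ""; grab = 1; next }
    grab && /^TEST_METADATA$/ {
        grab = 0
        split("Name Summary Description Components Prerequisites TicketIDs", keys, " ")
        for (i = 1; i in keys; i++)
            if (block !~ keys[i] ":")
                printf "block %d: missing %s\n", nblock, keys[i]
        next
    }
    grab { block = block $0 "\n" }
' sanity.sh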
On 02/05/2012 16:44, Roman wrote:
>> I think this is something that needs to live outside the test metadata being described here. A "golden configuration" is hard to define, and depends heavily on factors that change from one environment to the next.
> We could separate dynamic and static metadata. But it would be good if both sets of data used one engine and storage type, just with different sources.

I think we all understand the static metadata, and I believe that the data in my original examples is static data. This data relates to a version of the test scripts and so can live as part of the test script, managed using the same git mechanisms.

Could you explain what you mean by dynamic data, so that we can all understand exactly what you are suggesting we store?

> Also, I don't see a good way to use the 'metadata inheritance' approach in shell without adding pretty unclear shell code, so the switch to metadata usage should either happen in one step, or the test framework will just ignore it and the metadata becomes just static text for external scripts.

I'm not sure if there is a place for inheritance in this particular situation, but if there is then we need to be clear about one thing: there can be no implicit inheritance for these scripts, i.e. we can't have a single attribute at the top of a file that applies to all tests.

The reason for this is that one major purpose of having metadata is that it makes us collect the data properly; each test needs to have the data explicitly captured. If a test does not have the data captured then we do not have any data - and no data is a fact (data) in itself. If a test inherits data from another test then that must be explicitly set. We cannot allow sweeping inheritance that allows us to imagine we have learnt something when actually we've just taken a short cut to give the impression of knowledge.

Chris
Roman Grigoryev
2012-May-02 16:35 UTC
[Lustre-discuss] Metadata storage in test script files
Hi,

On 05/02/2012 01:25 PM, Chris wrote:
> On 02/05/2012 04:23, Roman Grigoryev wrote:
>> Hi Chris,
>>
>> On 05/01/2012 08:17 PM, Chris wrote:
>>> The metadata can be used in a multitude of ways, for example we can
>>> create dynamic test sets based on the changes made or the target area
>>> of testing. What we are doing here is creating an understanding of the
>>> tests that we have so that we can improve our processes and testing
>>> capabilities in the future.
>>
>> I think that when we are defining a tool we should state its purpose.
>> For example, a good description and summary are not needed for creating
>> dynamic test sets. I think it is very important to say how we will use
>> it. This idea continues below.
>
> The purpose is to enable us to develop and store knowledge/information
> about the tests; the information should be in a canonical form,
> objective and correct. If we do this then the whole community can make
> use of it as they see fit. I want to ensure that the initial set of
> stored variables describes the tests as completely as reasonably
> possible. The canonical description of each test is not affected by the
> usage to which the data is put.
>
>>> The metadata does not go to the results. The metadata is a database in
>>> its own right, and should metadata about a test be required it would
>>> be accessed from the source (database) itself.
>>
>> I think fields like title, summary and possibly description should be
>> present in the results too. That can be very helpful for quickly
>> understanding test results.
>
> They can be presented as part of the results, but I would not store them
> with the results; if for example Maloo presents the description it will
> fetch it from the correct version of the source. We should not be making
> copies of data.

ok, good.

> I cannot say whether you should store this information with your
> results, because I have no insight into your private testing practices.

I just want to have the info not only in Maloo or other big systems but in
the default test harness. Developers can run tests by hand, and a tester
should also be able to execute them in a specific environment. If we can
provide some helpful info, I think that is good; a few kilobytes is not as
much as the logs, but it can help in some cases.

>>>> On 04/30/2012 08:50 PM, Chris wrote:
>>> ... snip ...
>>>
>>> As I said, we can mine this data any time and any way that we want,
>>> and the purpose of this discussion is the data, not how we use it. But
>>> as an example, something that dynamically built test sets would need
>>> to know the prerequisites.
>>>
>>> The suffix of a, b, c could be used to generate prerequisite
>>> information, but it is firstly inflexible (for example, I bet 'b', 'c'
>>> and 'd' are often dependent on 'a' but not on each other), and
>>> secondly, and more importantly, we want a standard form for storing
>>> metadata because we want to introduce order and knowledge into the
>>> test scripts that we have today.
>>
>> This is why I asked about the way it will be used: if we want to use
>> this information in scripts and in other automated ways, we must
>> strictly specify the logic of the items and provide tools to check it.
>>
>> For example, we will use it when building a test execution queue. We
>> have a chain like this: test C has prerequisite B, test B has
>> prerequisite A, and test A has no prerequisite. One fine day test A
>> becomes excluded. Is it still possible to execute test C? If we do not
>> use it in scripting, though, there is no big logical problem.
>>
>> (My opinion: I don't like this situation and think that test
>> dependencies should be used only in very specific and rare cases.)
>
> I don't think people should introduce dependencies either, but they
> have, and we have to deal with that fact. In your example, if C is
> dependent on A and A is removed, then C cannot be run.

Maybe I'm wrong, but fighting the dependencies looks more important than
adding descriptions.

>>>> I suggest adding keywords (Components could be treated as keywords
>>>> too) and a test type (stress, benchmark, load, functional, negative,
>>>> etc.) for quick filtering. For example, SLOW could become a keyword.
>>>
>>> This seems like a reasonable idea, although we need a name that
>>> describes what it is, and we will need to define that set of possible
>>> words as we do with the Components elements.
>>
>> I mean that 'keywords' should be separate from Components, although
>> Components could logically be included; I think 'Components' is a
>> special type of keyword.
>
> I don't think of Components as a keyword, I think of it as a factual
> piece of data, and if we want to add the test purpose then we should
> call it that. The use of keywords in data is generally a typeless
> catch-all. All of this metadata should be clear and well defined, which
> in my opinion does not allow scope for a keywords element.

Agreed, Components are not keywords.

> I would suggest that we add a variable called Purposes which is an array
> containing a set of predefined elements like stress, benchmark, load,
> functional etc.
>
> For example:
>
> Purposes:
>   - stress
>   - load

What about SLOW (which should really be called smoke or sanity) and
negative? Those are not purposes; they are mostly about the test type.

>>> It would be easier to store the data separately and we could use
>>> Maloo, but it is very important that this data becomes part of the
>>> Lustre 'source' so that everybody can benefit from it. Adding tickets
>>> is not a problem, as part of resolving an issue is to ensure that at
>>> least one test exercises the problem and proves it has been fixed; the
>>> fact that this assurance process requires active interaction by an
>>> engineer with the scripts is a positive.
>>>
>>> As for pass rate, execution time and gold configurations, this
>>> information is just not one-dimensional enough to store in the source.
>>
>> It was not by accident that I talked about groups of fields in my
>> previous letter. All metadata can be split into rarely and frequently
>> changing fields. For example, Summary will not change very often, but a
>> test timeout for the golden configuration (I mean a timeout set as a
>> default based on the 'gold' configuration and overridable in a specific
>> configuration) could be more variable, and possibly more important for
>> testing.
>
> What exactly is a gold configuration? Lustre has such breadth of
> possibilities that gold configurations would be a matrix of
> distro/architecture/distro version/interconnect/cpu
> speed/memory/storage/oss count/client count/... . To try and summarise
> this into some useful single value does not make any sense to me.

I used the phrase 'gold configuration' incorrectly; 'development
configuration' or maybe 'default configuration' would be more accurate. I
absolutely agree that some test characteristics are relative to the
configuration. But for many tests it is possible (and for many people it
could be very helpful) to have, for example, a suggested timeout which
indicates the assumed upper limit of the execution time on a commonly used
configuration.

In this case the 'default configuration' should, I think, be 4 VM nodes or
a 4-node real cluster in one subnet. To cover other configurations we can
use a 'timeout multiplier' option in the test framework, and I think that
option lets us cover 100% of configurations. Currently I use 300 seconds
per test in my scripts; it is overkill for most tests, and only for a few
tests do I set a longer time. (I also think this holds for most
configurations, excluding, for example, configurations with complex lnet
routing or systems under high load.)

>> Using separate files provides more flexibility, and nobody stops us
>> from committing them to the lustre repo, at which point they become
>> "Lustre 'source'". In separate files we can use whatever format we
>> want, and all the information is available without parsing the shell
>> scripts or running them. Moreover, in the longer term, it gives us a
>> very simple migration from shell to another language.
>
> This data is valuable and needs to be treated with the same respect and
> discipline as we treat the source; to imagine we can have a 'free for
> all' where people just update it at will does not work. The controls on
> what goes into the Lustre tree are there for very good reason and we are
> not going to circumvent those controls. We have to invest in this as we
> do with all the test infrastructure; it cannot be done on the cheap.

I don't really understand why separate YAML sources cannot be covered by
the same discipline as the shell code. Moreover, most of the YAML checking
can be done by automated tools. I am not saying this data should be a
'free for all'. But I think it is a good idea (mostly for test developers)
to provide a way for a user to override the main metadata with his own via
a simple switch.

> Parsing the scripts for the data is easy because computers are really
> good at it. I would expect someone will write a library to access and
> modify the data as required, and I'd also expect them to publish that
> library.

Having one more library just to cut the YAML data out of the shell scripts
before it can be handed to a YAML reader does not look so simple. The
question is not CPU time, but the overall complexity of bash, external
utilities and libraries, and the tests for them.

> If tests were re-written then this data will probably change, and the
> cost of migrating unchanged data will be insignificant compared to the
> cost of re-writing the test itself.

I'm not sure what you mean there, could you please explain?

Thanks,
        Roman
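For what it is worth, the extraction step itself needs very little
machinery. Below is a minimal sketch, assuming only the
<<TEST_METADATA ... TEST_METADATA delimiters from the proposal; the script
is illustrative, not an existing tool.

======================================================================
#!/bin/bash
# Sketch: print every TEST_METADATA block in a test script as plain YAML
# on stdout, using the delimiters proposed in this thread.
script=${1:?usage: $0 <test-script.sh>}

# Print only the lines strictly between the delimiters; a real tool would
# also record which test function each block belongs to.
sed -n '/^<< *TEST_METADATA/,/^TEST_METADATA$/{
  /TEST_METADATA/d
  p
}' "$script"
======================================================================

The output of such a script could then be fed straight into any YAML
reader or schema checker.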
Roman Grigoryev
2012-May-02 17:05 UTC
[Lustre-discuss] Metadata storage in test script files
Hi Chris,

On 05/02/2012 08:06 PM, Chris wrote:
> On 02/05/2012 16:44, Roman wrote:
>>> I think this is something that needs to live outside the test metadata
>>> being described here. The definition of "golden configuration" is hard
>>> to pin down, and depends heavily on factors that change from one
>>> environment to the next.
>>
>> We could separate dynamic and static metadata. But it would be good if
>> both sets of data used one engine and storage type, just with different
>> sources.
>
> I think we all understand the static metadata, and I believe that the
> data in my original examples is static data. This data relates to a
> version of the test scripts and so can live as part of the test script,
> managed using the same git mechanisms.
>
> Could you explain what you mean by dynamic data so that we can all
> understand exactly what you are suggesting we store.

The only truly dynamic data I can think of right now is tickets, and I am
not sure how important it is to keep them in the test sources; I think an
umbrella over the old bugzilla, the WC jira and maybe other bug sources is
more important. But I can imagine situations where we want to update the
metadata in many tests at once, e.g. somebody has done a test-coverage
analysis and wants to add it to the metadata.

>> Also, I don't see a good way to use 'metadata inheritance' in shell
>> without adding fairly unclear shell code, so the switch to using
>> metadata should happen in one step, or the test framework will simply
>> ignore it and the metadata becomes just static text for external
>> scripts.
>
> I'm not sure if there is a place for inheritance in this particular
> situation, but if there is then we need to be clear on one thing: there
> can be no implicit inheritance for these scripts. I.e. we can't have a
> single attribute at the top of a file that applies to all tests. The
> reason is that one major purpose of having metadata is that we cause the
> data to be collected properly; each test needs to have the data
> explicitly captured. If a test does not have the data captured then we
> do not have any data, and no data is a fact (data) in itself. If a test
> inherits data from another test then that must have been explicitly set.
>
> We cannot allow sweeping inheritance that allows us to imagine we have
> learnt something when actually we've just taken a short cut to give the
> impression of knowledge.

Yes, I mean inheritance from a "single attribute at the top of a file",
with overriding where it is defined at the detailed level. Why can't we
have a single attribute at the top which provides default values? Going
over all tests manually is a very big task. Going back to your original
definition, all tests from lustre-rsync, for example, should probably be in
one component (as I understand it), so there is no big reason to duplicate
the Components entry.

--
Thanks,
        Roman
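To make the file-level default concrete, it could look something like the
sketch below. SUITE_METADATA is an invented delimiter and the test shown is
made up; only TEST_METADATA and the field names come from the proposal, and
whether anything would honour such a block is exactly the open question.

======================================================================
# Hypothetical sketch of a suite-wide default block which individual
# tests may override; nothing here is an agreed format.
: << SUITE_METADATA
Components:               # default for every test in this file
  - lustre-rsync
SUITE_METADATA

: << TEST_METADATA
Name:
  lrsync_basic_copy
Summary:
  Copies a directory tree with lustre_rsync and compares it to the source
TicketIDs:                # Components is inherited from the suite block
  - LU-123
TEST_METADATA
test_lrsync_basic_copy() {
  true # test body elided in this sketch
}
run_test lrsync_basic_copy "basic lustre_rsync copy and compare"
======================================================================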
Andreas Dilger
2012-May-02 19:01 UTC
[Lustre-discuss] Metadata storage in test script files
I'm chopping out most of the discussion to try and focus on the core issues
here.

On 2012-05-02, at 10:35 AM, Roman Grigoryev wrote:
> On 05/02/2012 01:25 PM, Chris wrote:
>> I cannot say whether you should store this information with your
>> results, because I have no insight into your private testing practices.
>
> I just want to have the info not only in Maloo or other big systems but
> in the default test harness. Developers can run tests by hand, and a
> tester should also be able to execute them in a specific environment. If
> we can provide some helpful info, I think that is good; a few kilobytes
> is not as much as the logs, but it can help in some cases.

I don't think you two are in disagreement here. We want the test
descriptions and other metadata with the tests, open for any usage (human,
test scripts, different test harnesses, etc).

>> I don't think people should introduce dependencies either, but they
>> have, and we have to deal with that fact. In your example, if C is
>> dependent on A and A is removed, then C cannot be run.
>
> Maybe I'm wrong, but fighting the dependencies looks more important than
> adding descriptions.

For the short term. However, finding dependencies is easily done through
simple mechanical steps (e.g. try to run each subtest independently). Since
the policy in the past was to make all tests independent, I expect that not
very many tests will actually have dependencies.

However, the main reason for having good descriptions of the tests is to
gain an understanding of what part of the code the tests are trying to
exercise, what problem they were written to verify, and what value they
provide. We cannot reasonably rewrite or modify tests safely if we don't
have a good understanding of what they are doing today. Also, this helps
people running and debugging the tests and their failures for the long
term.

Cheers, Andreas
--
Andreas Dilger                       Whamcloud, Inc.
Principal Lustre Engineer            http://www.whamcloud.com/
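The "simple mechanical steps" could be as small as the loop sketched below.
It assumes the ONLY= switch of the current test-framework behaves as it
does today, and a failure in isolation is only a hint of a dependency, not
proof of one.

======================================================================
#!/bin/bash
# Sketch: run every listed subtest of a suite on its own and flag the
# ones that do not pass in isolation as dependency candidates.
# Usage (hypothetical): ./find_deps.sh sanity.sh 1 2 3a 3b ...
suite=$1; shift

for t in "$@"; do
    if ONLY=$t bash "$suite" > "only-$t.log" 2>&1; then
        echo "PASS alone: $t"
    else
        # The test may depend on an earlier subtest, or it may simply be
        # broken in isolation; the log has to be read either way.
        echo "FAIL alone: $t (dependency candidate, see only-$t.log)"
    fi
done
======================================================================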
Roman Grigoryev
2012-May-03 09:17 UTC
[Lustre-discuss] Metadata storage in test script files
Hi,

On 05/02/2012 11:01 PM, Andreas Dilger wrote:
> I'm chopping out most of the discussion to try and focus on the core
> issues here.
>
> On 2012-05-02, at 10:35 AM, Roman Grigoryev wrote:
>> ... snip ...
>
> I don't think you two are in disagreement here. We want the test
> descriptions and other metadata with the tests, open for any usage
> (human, test scripts, different test harnesses, etc).

I absolutely agree. My point is just about form: machine usage needs a
formal description of the fields and tools to easily check it.

>> Maybe I'm wrong, but fighting the dependencies looks more important
>> than adding descriptions.
>
> For the short term. However, finding dependencies is easily done through
> simple mechanical steps (e.g. try to run each subtest independently).
> Since the policy in the past was to make all tests independent, I expect
> that not very many tests will actually have dependencies.

I am working on exactly this task right now.

> However, the main reason for having good descriptions of the tests is to
> gain an understanding of what part of the code the tests are trying to
> exercise, what problem they were written to verify, and what value they
> provide. We cannot reasonably rewrite or modify tests safely if we don't
> have a good understanding of what they are doing today. Also, this helps
> people running and debugging the tests and their failures for the long
> term.

I absolutely agree with the common goal and with text descriptions for
humans. I just don't really see why test refactoring and test understanding
(writing the summaries and descriptions) cannot be combined into one task.
(I also have a feeling that developers will find many errors when they go
through a test to write its description. I have some experience with
similar tasks and can say that a fresh look at old tests often finds
problems.)

--
Thanks,
        Roman
Chris Gearing
2012-May-04 14:46 UTC
[Lustre-discuss] Metadata storage in test script files
Hi Roman,

I think we may have rat-holed here and perhaps it's worth just re-stating
what I'm trying to achieve.

We have a need to be able to test in a more directed and targeted manner,
to be able to focus on a unit of code like lnet or an attribute of
capability like performance. However, since starting work on the Lustre
test infrastructure it has become clear to me that knowledge about the
capability, functionality and purpose of individual tests is very general
and held in the heads of Lustre engineers. Because we are talking about
targeting tests, we require knowledge about the capability, functionality
and purpose of the tests, not the outcome of running the tests; or, to put
it another way, what the tests can do, not what they have done.

One key fact about cataloguing the capabilities of the tests is that for
almost every imaginable case the capability of a test only changes if the
test itself changes, so the rate of change of the data in the catalogue is
at most the same as, and in practice much less than, the rate of change of
the test code itself. The only exception could be that a test suddenly
discovers a new bug which has to have a new ticket attached to it, although
this should be very rare if we manage our development process properly.

This requirement leads to the conclusion that we need to catalogue all of
the tests within the current test-framework, and a catalogue equates to a
database; hence we need a database of the capability, functionality and
purpose of the individual tests. With this requirement in mind it would be
easy to create a database using something like mysql that could be used by
applications like the Lustre test system, but that approach would make the
database very difficult to share and even harder to attach the knowledge
to the Lustre tree, which is where it belongs.

So the question I want to solve is how to catalogue the capabilities of the
individual tests in a database, store that data as part of the Lustre
source and, as a bonus, make the data readable and even carefully editable
by people as well as machines. To focus on the last point, I do not think
we should constrain ourselves to something that can be read by machine
using just bash; we do have access to structured languages and should make
use of that fact.

The solution to all of this seemed to be to store the catalogue about the
tests as part of the tests themselves. This provides for human and machine
accessibility, implicit version control and certainty that whatever happens
to the Lustre source the data goes with it. It is also the case that by
keeping the catalogue with the subject, the maintenance of the catalogue is
more likely to occur than if the two are separate.

My original use of the term test metadata is intended as a more modern term
for catalogue or the [test] library.

So to refresh everybody's mind, I'd like to suggest that we place test
metadata in the source code itself using the following format, where the
here document is inserted into the code above the test function itself.

======================================================================
<<TEST_METADATA
Name:
  before_upgrade_create_data
Summary:
  Copies lustre source into a node specific directory and then creates
  a tarball using that directory
Description:
  This should be called prior to upgrading Lustre and creates a set of
  data on the Lustre partition which can be accessed and checked after
  the upgrade has taken place. Several methods are used, including
  tar'ing directories so they can later be untar'ed and compared, along
  with creating sha1s of the stored data.
Components:
  - lnet
  - recovery
Prerequisites:
  - before_upgrade_clear_filesystem
TicketIDs:
  - LU-123
  - LU-432
Purposes:
  - upgrade
TEST_METADATA

test_before_upgrade_create_data() {
  ...
}

run_test before_upgrade_create_data "Copying lustre source into a directory $IOP_DIR1, creating and then using source to create a tarball"
======================================================================

Again, thoughts and input very much appreciated.

Chris
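If this format is adopted, checking that every block carries the compulsory
fields needs only a small script at review time. The sketch below is
illustrative only: it assumes the delimiters shown above and merely checks
that the top-level keys are present, rather than validating the YAML
itself.

======================================================================
#!/bin/bash
# Sketch: report TEST_METADATA blocks that are missing a compulsory key.
# A real check would also run each block through a YAML parser and a
# schema (e.g. an Rx schema, as mentioned elsewhere in this thread).
required="Name Summary Description Components Prerequisites TicketIDs"
status=0

for script in "$@"; do
    block="" in_block=0
    while IFS= read -r line; do
        case $line in
        "<<TEST_METADATA"*|"<< TEST_METADATA"*)
            in_block=1; block="" ;;
        "TEST_METADATA")
            in_block=0
            for key in $required; do
                printf '%s' "$block" | grep -q "^$key:" ||
                    { echo "$script: block missing $key:"; status=1; }
            done ;;
        *)
            [ "$in_block" -eq 1 ] && block+="$line"$'\n' ;;
        esac
    done < "$script"
done
exit $status
======================================================================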
Nathan Rutman
2012-May-07 18:33 UTC
[Lustre-discuss] Metadata storage in test script files
On May 4, 2012, at 7:46 AM, Chris Gearing wrote:
> Hi Roman,
>
> I think we may have rat-holed here and perhaps it's worth just
> re-stating what I'm trying to achieve.
>
> ... snip ...
>
> So the question I want to solve is how to catalogue the capabilities of
> the individual tests in a database, store that data as part of the
> Lustre source and, as a bonus, make the data readable and even carefully
> editable by people as well as machines. To focus on the last point, I do
> not think we should constrain ourselves to something that can be read by
> machine using just bash; we do have access to structured languages and
> should make use of that fact.

I think we all agree 100% on the above...

> The solution to all of this seemed to be to store the catalogue about
> the tests as part of the tests themselves

... but not necessarily that conclusion.

> , this provides for human and machine accessibility, implicit version
> control and certainty that whatever happens to the Lustre source the
> data goes with it. It is also the case that by keeping the catalogue
> with the subject, the maintenance of the catalogue is more likely to
> occur than if the two are separate.

I agree with all those. But there are some difficulties with this as well:

1. bash isn't a great language to encapsulate this metadata

2. this further locks us in to the current test implementation - there's
   not much possibility to start writing tests in another language if
   we're parsing through looking for bash-formatted metadata. Sure,
   multiple parsers could be written...

3. difficulty changing the md of groups of tests en masse - e.g. adding a
   "slow" keyword to a set of tests

4. no inheritance of characteristics - each test must explicitly list
   every piece of md. This not only blows up the amount of md, it is also
   a source for typos, etc. to cause problems.

5. no automatic modification of characteristics. In particular, one piece
   of md I would like to see is "maximum allowed test time" for each test.
   Ideally, this could be measured and adjusted automatically based on
   historical and ongoing run data. But it would be dangerous to allow
   automatic modification of the script itself.

To address those problems, I think a database-type approach is exactly
right, or perhaps a YAML file with hierarchical inheritance. To some
degree, this is an "evolution vs revolution" question, and I prefer to come
down on the revolution-enabling design, despite the problems you list.
Basically, I believe the separated MD model allows for the replacement of
test-framework, and this, to my mind, is the majority driver for adding the
MD at all.

> ... snip ...
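For comparison, the separated-file model with hierarchical inheritance
might look something like the sketch below; the file name, the
defaults/tests layout and the MaxTestTime field are all invented for
illustration, not an agreed format.

======================================================================
# sanity_catalogue.yaml - hypothetical separate catalogue with suite-level
# defaults, per-test overrides and the automatically maintained time limit
# mentioned above.
defaults:
  Components:
    - vfs
  MaxTestTime: 300          # seconds; could be refreshed from run history
tests:
  before_upgrade_create_data:
    Summary: Copies lustre source into a node specific directory and
      creates a tarball from it
    TicketIDs: [LU-123, LU-432]
    MaxTestTime: 900        # overrides the suite default
======================================================================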
Chris Gearing
2012-May-07 21:34 UTC
[Lustre-discuss] Metadata storage in test script files
On Mon, May 7, 2012 at 7:33 PM, Nathan Rutman <nrutman at gmail.com> wrote:
> ... snip ...
>
> I agree with all those. But there are some difficulties with this as
> well:
>
> 1. bash isn't a great language to encapsulate this metadata

The thing to focus on, I think, is the data captured, not the format. The
parser for yaml encapsulated in the source or anywhere else is a small
amount of effort compared to capturing the data in the first place. If we
capture the data and it is machine readable, then changing the format is
easy.

There are many advantages today to keeping the source and the metadata in
the same place, one being that when reviewing new or updated tests the
reviewers can, and will be encouraged by the locality to, ensure the
metadata matches the new or revised test. If the two are not together they
have very little chance of being kept in sync.

> 2. this further locks us in to the current test implementation - there's
> not much possibility to start writing tests in another language if we're
> parsing through looking for bash-formatted metadata. Sure, multiple
> parsers could be written...

I don't think it is a lock-in at all: the data is machine readable and
moving to a new format when and should we need it will be easy. Let's focus
on capturing the data so we increase our knowledge; once we have the data
we can manipulate it however we want. Keeping the data and the tests
together in my opinion increases the chance of capturing and updating the
data given today's methods and tools.

> 3. difficulty changing the md of groups of tests en masse - e.g. adding
> a "slow" keyword to a set of tests

The data can be read and written by machine and the libraries/applications
to do this would be written. Referring back to the description of the
metadata, we would not be making sweeping changes to test metadata because
the metadata should only change when the test changes [exceptions will
always apply but we should not optimize for exceptions].

Also I don't think 'slow' would be part of the metadata because it is not
an attribute of the test, it is an attribute of how the test is used. We
need to be strict and clear here. The metadata describes the functionality
of the test code, and slow is not a test code function; if we want to be
able to select 'slow' then we need to understand what code functionality of
a test causes it to be a 'slow' test and ensure those attributes are
captured.

> 4. no inheritance of characteristics - each test must explicitly list
> every piece of md. This not only blows up the amount of md, it is also a
> source for typos, etc. to cause problems.

I'm not against inheritance, but the inheritance must be explicit, not
implicit. We want to draw out knowledge about the tests; if we just allow
people to say 'all 200 tests in this file are X, Y, Z' then that is what
will happen, no one will check each test to make sure it is true, and our
data will be corrupted before we start.

So explicit inheritance might make sense, and please do propose an
inheritance model for the data; we can discuss the storage format later,
but today let's just understand how inheritance relates to our bash tests.

> 5. no automatic modification of characteristics. In particular, one
> piece of md I would like to see is "maximum allowed test time" for each
> test. Ideally, this could be measured and adjusted automatically based
> on historical and ongoing run data. But it would be dangerous to allow
> automatic modification of the script itself.

I really do not think maximum test time as a measurement is a piece of test
metadata. Metadata describes the functionality of the test that is
encapsulated within the test code itself; if the code said 'run for 60
minutes and no more' then maximum time would be an attribute.

Maybe there are a set of useful attributes like amount of storage used, or
minimum clients, or minimum osts etc. etc; again, these can only be
metadata if they are implicit in the test code, and for most tests they
would not be definable, and the variability might be impossible to
systematically capture, although I do think it's worth having a go.

> To address those problems, I think a database-type approach is exactly
> right, or perhaps a YAML file with hierarchical inheritance. To some
> degree, this is an "evolution vs revolution" question, and I prefer to
> come down on the revolution-enabling design, despite the problems you
> list. Basically, I believe the separated MD model allows for the
> replacement of test-framework, and this, to my mind, is the majority
> driver for adding the MD at all.

A database is good, and I believe metadata in the source fulfils that
objective whilst being something that we can manage manually with what we
have today, whilst easily creating tools for some automation. When we do
begin work on a new test framework approach we will have all the data at
hand to be manipulated in any way that we want, including, if we want,
separating it and storing it somewhere else.

I don't think creating the metadata is linked with a new test-framework,
however; creating the metadata is required because today we do not know
what we have, and we need to know what we have today whatever strategy we
use for the future.

Chris
Roman Grigoryev
2012-May-08 13:51 UTC
[Lustre-discuss] Metadata storage in test script files
Hi,

On 05/08/2012 01:34 AM, Chris Gearing wrote:
> On Mon, May 7, 2012 at 7:33 PM, Nathan Rutman <nrutman at gmail.com
> <mailto:nrutman at gmail.com>> wrote:
>> ... snip ...
>>
>> 1. bash isn't a great language to encapsulate this metadata
>
> The thing to focus on, I think, is the data captured, not the format.
> The parser for yaml encapsulated in the source or anywhere else is a
> small amount of effort compared to capturing the data in the first
> place. If we capture the data and it is machine readable, then changing
> the format is easy.
>
> There are many advantages today to keeping the source and the metadata
> in the same place, one being that when reviewing new or updated tests
> the reviewers can, and will be encouraged by the locality to, ensure the
> metadata matches the new or revised test. If the two are not together
> they have very little chance of being kept in sync.

I also have more than one concern here. You are suggesting embedding in
bash a structure which has its own formal description. Who will check that
an embedded structure is correct, and when? A formal structure must be
checked by tools, not by eye. For example, I use the Rx tools with a schema
definition for the YAML; having to extract the YAML data and check it
separately makes such tools less comfortable to use.

To be honest, I don't see a big difference between using two files and one
file from the developer's point of view. This is more a question of
discipline than of comfort: exactly the same developer can ignore a
description that is placed nearby. (From my experience of the test life
cycle, descriptions become good after a few cycles of adding, changing and
reviewing them, often as a result of developer-user interaction. As a
result, the most problematic tests have the best descriptions.)

>> 2. this further locks us in to the current test implementation -
>> there's not much possibility to start writing tests in another
>> language if we're parsing through looking for bash-formatted metadata.
>> Sure, multiple parsers could be written...
>
> I don't think it is a lock-in at all: the data is machine readable and
> moving to a new format when and should we need it will be easy. Let's
> focus on capturing the data so we increase our knowledge; once we have
> the data we can manipulate it however we want. Keeping the data and the
> tests together in my opinion increases the chance of capturing and
> updating the data given today's methods and tools.
>
>> 3. difficulty changing the md of groups of tests en masse - e.g.
>> adding a "slow" keyword to a set of tests
>
> The data can be read and written by machine and the
> libraries/applications to do this would be written. Referring back to
> the description of the metadata, we would not be making sweeping
> changes to test metadata because the metadata should only change when
> the test changes [exceptions will always apply but we should not
> optimize for exceptions].
>
> Also I don't think 'slow' would be part of the metadata because it is
> not an attribute of the test, it is an attribute of how the test is
> used. We need to be strict and clear here. The metadata describes the
> functionality of the test code, and slow is not a test code function;
> if we want to be able to select 'slow' then we need to understand what
> code functionality of a test causes it to be a 'slow' test and ensure
> those attributes are captured.

Do you suggest having separate metadata about how a test is used? There is
some logical vagueness here: test metadata can turn into test usage
metadata and back. Where is the border? For example, Components from your
suggestion can also be seen as test usage metadata. "SLOW", in general, is
the set of tests with large coverage and short run time; if we put
information about its coverage into a test, does that become test metadata?

>> 4. no inheritance of characteristics - each test must explicitly list
>> every piece of md. This not only blows up the amount of md, it is also
>> a source for typos, etc. to cause problems.
>
> I'm not against inheritance, but the inheritance must be explicit, not
> implicit. We want to draw out knowledge about the tests; if we just
> allow people to say 'all 200 tests in this file are X, Y, Z' then that
> is what will happen, no one will check each test to make sure it is
> true, and our data will be corrupted before we start.

Exactly the same behaviour is possible with copy-paste when adding info to
every test. And I don't see a problem with implicit inheritance of, for
example, the Components field: in some test suites it is entirely possible
that all tests share one Components set. Moreover, I think it is possible
to derive some test Components automatically from test coverage. Maybe we
can solve this by enabling implicit inheritance for a limited list of
fields?

> So explicit inheritance might make sense, and please do propose an
> inheritance model for the data; we can discuss the storage format
> later, but today let's just understand how inheritance relates to our
> bash tests.

What would 'explicit inheritance' be in the case of your suggestion, and
why is it needed?

>> 5. no automatic modification of characteristics. In particular, one
>> piece of md I would like to see is "maximum allowed test time" for
>> each test. Ideally, this could be measured and adjusted automatically
>> based on historical and ongoing run data. But it would be dangerous to
>> allow automatic modification of the script itself.
>
> I really do not think maximum test time as a measurement is a piece of
> test metadata.

If we want to provide some help and advice to a new user, where should we
store this data? What is the difference between TicketIDs, Components,
Purposes and an 'assumed execution time'? All the fields are not only
precise descriptions, they are also advice.

> Metadata describes the functionality of the test that is encapsulated
> within the test code itself; if the code said 'run for 60 minutes and
> no more' then maximum time would be an attribute.

That would be a different field: 60 min is the 'max time', 45 min is the
'assumed execution time'. There is no conflict there.

> Maybe there are a set of useful attributes like amount of storage used,
> or minimum clients, or minimum osts etc. etc; again, these can only be
> metadata if they are implicit in the test code, and for most tests they
> would not be definable, and the variability might be impossible to
> systematically capture, although I do think it's worth having a go.

But this data is 1) helpful and 2) something I already use. Where could we
store it?

Thanks,
        Roman

> ... snip ...