i''m stuck on what seems like a really easy problem. i''ve got a table that lists software product names along with when the software was built, the version number, when it was deployed, etc. As such, in the "name" column there are a lot of rows with the same name, as we tend to do multiple builds of the same product frequently. What I want to do is to extract from the table the LATEST production date for EACH software title. So instead of seeing a thousand-line table of all the builds ever done, just the row containing the latest production date for each software is shown. Can this be done in activerecord, or will I have to resort to SQL to construct the query. I''ve been told this is trivial in SQL-land, with a couple of max(date) functions in the SELECT clause and a GROUP BY, but I just can''t seem to get it to work. Been searching through the activerecord docs for that "aha!" magic piece of code to do the trick, but so far no luck. Any help as always is most appreciated.
Dave, Do you have a parent table of products with the builds as a child table? If so, AWDR has several similar examples which recommend storing the latest (in your scenario) in a field in the parent table (latest_build), with the Product controller having a method to manage the field. Alternatively why not use Build.Find(1, :order => build_date)? Bill dave davidson wrote:> i''m stuck on what seems like a really easy problem. i''ve got a table that > lists software product names along with when the software was built, the > version number, when it was deployed, etc. As such, in the "name" column there > are a lot of rows with the same name, as we tend to do multiple builds of the > same product frequently. > > What I want to do is to extract from the table the LATEST production date for > EACH software title. So instead of seeing a thousand-line table of all the > builds ever done, just the row containing the latest production date for each > software is shown. > > Can this be done in activerecord, or will I have to resort to SQL to construct > the query. I''ve been told this is trivial in SQL-land, with a couple of > max(date) functions in the SELECT clause and a GROUP BY, but I just can''t seem > to get it to work. Been searching through the activerecord docs for > that "aha!" magic piece of code to do the trick, but so far no luck. > > Any help as always is most appreciated.
I think the easiest solution is to use a one-line sql statemnt: MyModel.find_by_sql("SELECT name, MAX(created_at) FROM your_table GROUP BY 1 ORDER BY 1") If I''m not mistaken, I believe this is standard SQL2, so it should be easily executed by all the majors. -lv Bill Lazar wrote:> Dave, > > Do you have a parent table of products with the builds as a child table? > If so, AWDR has several similar examples which recommend storing the > latest (in your scenario) in a field in the parent table (latest_build), > with the Product controller having a method to manage the field. > > Alternatively why not use Build.Find(1, :order => build_date)? > > Bill > > dave davidson wrote: > >> i''m stuck on what seems like a really easy problem. i''ve got a table >> that lists software product names along with when the software was >> built, the version number, when it was deployed, etc. As such, in the >> "name" column there are a lot of rows with the same name, as we tend >> to do multiple builds of the same product frequently. >> What I want to do is to extract from the table the LATEST production >> date for EACH software title. So instead of seeing a thousand-line >> table of all the builds ever done, just the row containing the latest >> production date for each software is shown. >> Can this be done in activerecord, or will I have to resort to SQL to >> construct the query. I''ve been told this is trivial in SQL-land, with >> a couple of max(date) functions in the SELECT clause and a GROUP BY, >> but I just can''t seem to get it to work. Been searching through the >> activerecord docs for that "aha!" magic piece of code to do the trick, >> but so far no luck. >> >> Any help as always is most appreciated.
Is that equivalent to? SELECT name, MAX(created_at) FROM your_table GROUP BY name ORDER BY name On 11/5/05, Lou Vanek <vanek-9H8CmIPm+GA@public.gmane.org> wrote:> I think the easiest solution is to use a one-line sql statemnt: > > MyModel.find_by_sql("SELECT name, MAX(created_at) FROM your_table GROUP BY 1 ORDER BY 1") >
yes, I just used a little (sql) shorthand. -lv Cuong Tran wrote:> Is that equivalent to? > > SELECT name, MAX(created_at) FROM your_table GROUP BY name ORDER BY name > > > On 11/5/05, Lou Vanek <vanek-9H8CmIPm+GA@public.gmane.org> wrote: > >>I think the easiest solution is to use a one-line sql statemnt: >> >>MyModel.find_by_sql("SELECT name, MAX(created_at) FROM your_table GROUP BY 1 ORDER BY 1")
Lou Vanek <vanek@...> writes:> > yes, I just used a little (sql) shorthand. > > -lv > > Cuong Tran wrote: > > Is that equivalent to? > > > > SELECT name, MAX(created_at) FROM your_table GROUP BY name ORDER BY name > > > > > > On 11/5/05, Lou Vanek <vanek@...> wrote: > > > >>I think the easiest solution is to use a one-line sql statemnt: > >> > >>MyModel.find_by_sql("SELECT name, MAX(created_at) FROM your_table GROUP BY 1ORDER BY 1")>Hi guys, Thanks much for the SQL hints, worked like a charm! Now, to normalize the schema a bit, I''d like to pull out the "product name" from the builds table (as I said previously, we do a lot of builds of the same product so there is redundancy there) and refer to the product from there. Something like: products: CREATE TABLE `products` ( `id` int(10) unsigned NOT NULL auto_increment, `name` varchar(65) NOT NULL default '''', `vendor` varchar(10) NOT NULL default '''', PRIMARY KEY (`id`) ); builds: CREATE TABLE `builds` ( `id` int(10) unsigned NOT NULL auto_increment, `received_on` datetime NOT NULL default ''0000-00-00 00:00:00'', `submitted_by` varchar(55) default NULL, `product_id` int(11) NOT NULL default ''0'', `version` varchar(50) default NULL, `production_date` date NOT NULL default ''0000-00-00'', `install_date` datetime default NULL, PRIMARY KEY (`id`) ); the products model would have a "has_many :builds" and build would have "belongs_to :products." So far so good, right? However, even before I get to the rails part, I want to test the query in SQL (using MySQL) and find, as before, the latest install date grouped by product: SELECT products.name , builds.version, max(builds.install_date) FROM products, builds INNER JOIN builds on builds.product_id = products.id GROUP BY products.name However, I''m getting an error when I run the query: #1066 - Not unique table/alias: ''builds'' What am I doing wrong in the query? Do I need to add something to the DDL to associate the tables to do an inner join properly? Dave
try, SELECT products.name , builds.version, max(builds.install_date) FROM builds INNER JOIN products on builds.product_id = products.id GROUP BY 1,2 dave davidson wrote:> Lou Vanek <vanek@...> writes: > > >>yes, I just used a little (sql) shorthand. >> >>-lv >> >>Cuong Tran wrote: >> >>> Is that equivalent to? >>> >>> SELECT name, MAX(created_at) FROM your_table GROUP BY name ORDER BY name >>> >>> >>>On 11/5/05, Lou Vanek <vanek@...> wrote: >>> >>> >>>>I think the easiest solution is to use a one-line sql statemnt: >>>> >>>>MyModel.find_by_sql("SELECT name, MAX(created_at) FROM your_table GROUP BY 1 > > ORDER BY 1") > > > > Hi guys, > > Thanks much for the SQL hints, worked like a charm! Now, to normalize the > schema a bit, I''d like to pull out the "product name" from the builds table (as > I said previously, we do a lot of builds of the same product so there is > redundancy there) and refer to the product from there. Something like: > > products: > CREATE TABLE `products` ( > `id` int(10) unsigned NOT NULL auto_increment, > `name` varchar(65) NOT NULL default '''', > `vendor` varchar(10) NOT NULL default '''', > PRIMARY KEY (`id`) > ); > > builds: > CREATE TABLE `builds` ( > `id` int(10) unsigned NOT NULL auto_increment, > `received_on` datetime NOT NULL default ''0000-00-00 00:00:00'', > `submitted_by` varchar(55) default NULL, > `product_id` int(11) NOT NULL default ''0'', > `version` varchar(50) default NULL, > `production_date` date NOT NULL default ''0000-00-00'', > `install_date` datetime default NULL, > PRIMARY KEY (`id`) > ); > > the products model would have a "has_many :builds" and build would have > "belongs_to :products." So far so good, right? However, even before I get to > the rails part, I want to test the query in SQL (using MySQL) and find, as > before, the latest install date grouped by product: > > SELECT products.name , builds.version, max(builds.install_date) > FROM products, builds > INNER JOIN builds on builds.product_id = products.id > GROUP BY products.name > > However, I''m getting an error when I run the query: > > #1066 - Not unique table/alias: ''builds'' > > What am I doing wrong in the query? Do I need to add something to the DDL to > associate the tables to do an inner join properly? > > Dave
Lou Vanek <vanek@...> writes:> > try, > > SELECT products.name , builds.version, max(builds.install_date) > FROM builds > INNER JOIN products on builds.product_id = products.id > GROUP BY 1,2 >Hi Lou, The query you suggested didn''t generate an error, but I''m getting some weird results. The output of the query correctly groups the list by product name (so that the query produces a list that is as long as the number of products), and the latest install date for each product is there. However, the the values in the version column are not matching up to the corresponding date entries. In other words, the output of the version column in the query is not the same one that is the latest installed. Something like a table consisting of... products.name,builds.version,builds.install_date ------------------------------------------------- Product1,ENG_1.09.12.23,2004-09-02 11:03 Product1,ANA.1.01.11.00,2005-04-12 15:44 Product1,DEF_2.02.12.54,2005-04-15 09:15 Product2,PPQ_5.09.12.00,2005-05-30 14:02 Product1,ENG_2.09.66.00,2004-03-17 12:44 Product3,ENG.4.00.12.03,2005-04-19 16:30 Product1,NGE_1.03.12.11,2004-12-08 08:00 Product1,ENG_2.19.12.00,2005-04-16 13:04 Product2,PPQ_2.09.12.00,2004-05-01 14:02 Product3,ENG.3.09.12.03,2004-09-01 16:30 ...produces a result set of: products.name, builds.version, max( builds.install_date ) --------------------------------------------------------- Product1,ANA.1.01.11.00,2005-04-16 13:04 Product2,PPQ_5.09.12.00,2005-05-30 14:02 Product3,ENG.3.09.12.03,2005-04-19 16:30 The latest install dates are right, but the corresponding versions are not. This behavior is not consistent accross the data; some of the results have the correct version corresponding to the latest date. I tried tinkering with the GROUP BY clause but am not having any luck thus far.
yeah, better yet: SELECT products.id, products.name , max(builds.install_date) FROM builds INNER JOIN products on builds.product_id = products.id GROUP BY 1,2 If you keep the ''version'' field in the above select query then you will get a unique row for each version, which I don''t think is what you want. You will have to go back and do a separate SQL query to get the version. SELECT version FROM builds WHERE builds.install_date = ? AND product.id = ? # fill in values from above query -or- Build.find(:first, :select => ''verson'', conditions => [''install_date = ? AND product.id = ?'', @install_date, @product_id]) dave davidson wrote:> Lou Vanek <vanek@...> writes: > > >>try, >> >>SELECT products.name , builds.version, max(builds.install_date) >>FROM builds >>INNER JOIN products on builds.product_id = products.id >>GROUP BY 1,2 >> > > > Hi Lou, > > The query you suggested didn''t generate an error, but I''m getting some weird > results. The output of the query correctly groups the list by product name (so > that the query produces a list that is as long as the number of products), and > the latest install date for each product is there. However, the the values in > the version column are not matching up to the corresponding date entries. In > other words, the output of the version column in the query is not the same one > that is the latest installed. Something like a table consisting of... > > products.name,builds.version,builds.install_date > ------------------------------------------------- > Product1,ENG_1.09.12.23,2004-09-02 11:03 > Product1,ANA.1.01.11.00,2005-04-12 15:44 > Product1,DEF_2.02.12.54,2005-04-15 09:15 > Product2,PPQ_5.09.12.00,2005-05-30 14:02 > Product1,ENG_2.09.66.00,2004-03-17 12:44 > Product3,ENG.4.00.12.03,2005-04-19 16:30 > Product1,NGE_1.03.12.11,2004-12-08 08:00 > Product1,ENG_2.19.12.00,2005-04-16 13:04 > Product2,PPQ_2.09.12.00,2004-05-01 14:02 > Product3,ENG.3.09.12.03,2004-09-01 16:30 > > ...produces a result set of: > > products.name, builds.version, max( builds.install_date ) > --------------------------------------------------------- > Product1,ANA.1.01.11.00,2005-04-16 13:04 > Product2,PPQ_5.09.12.00,2005-05-30 14:02 > Product3,ENG.3.09.12.03,2005-04-19 16:30 > > The latest install dates are right, but the corresponding versions are not. > This behavior is not consistent accross the data; some of the results have the > correct version corresponding to the latest date. I tried tinkering with the > GROUP BY clause but am not having any luck thus far.
Lou Vanek <vanek@...> writes:> > yeah, better yet: > > SELECT products.id, products.name , max(builds.install_date) > FROM builds > INNER JOIN products on builds.product_id = products.id > GROUP BY 1,2 > > If you keep the ''version'' field in the above select query then you will get > a unique row for each version, which I don''t think is what you want. > You will have to go back and do a separate SQL query to get the version. > > SELECT version > FROM builds > WHERE builds.install_date = ? AND product.id = ? # fill in values from abovequery> -or- > > Build.find(:first, :select => ''verson'', > conditions => [''install_date = ? AND product.id = ?'', <at> install_date,<at> product_id]) Interesting. Is there a way of combining the two queries into one so that the one query will take care of finding the product/date result set and then plugging it into the second query where the version column is selected? <tangent> My goal is to have the system automated, so that when a build of a certain product is done, and the information is added to the builds table, a person navigating to the "Latest Builds" page will see it updated dynamically to reflect the just-completed build. I realize that we could make a static page listing the current latest builds and manually update it, and it seems so inelegant (and prone to lack of updating). Am I approaching this problem incorrectly by starting with the DB and translating it into Ruby? Should I be doing this all in Ruby and sticking the code in a controller? My initial instinct for this project was to start with the database. My thought was that normalizing the tables as much as possible would allow the rest to fall into place when designing the Rails project. I spent many hours and countless revisions of the schema trying to do this, and I ended up right where I started-- wondering when I will get to actually start coding the Rails project. Now, I''m left wondering if I should have started the design with Rails. Doing mockups of what I want to accomplish and then fitting the DB schema to that. Anyway, sorry for the rambling. I needed a break from SQL queries :) </tangent>
this combines the two into one sql query (assuming the db you are using can handle subselects): SELECT products.name, builds.version, builds.install_date FROM builds INNER JOIN products ON builds.product_id = products.id, (SELECT builds.product_id, MAX(builds.install_date) AS dt FROM builds GROUP BY 1) AS x WHERE builds.product_id = x.product_id AND builds.install_date = x.dt [This was tested on MySQL 4.1.12] dave davidson wrote:> Lou Vanek <vanek@...> writes: > > >>yeah, better yet: >> >>SELECT products.id, products.name , max(builds.install_date) >> FROM builds >> INNER JOIN products on builds.product_id = products.id >> GROUP BY 1,2 >> >>If you keep the ''version'' field in the above select query then you will get >>a unique row for each version, which I don''t think is what you want. >>You will have to go back and do a separate SQL query to get the version. >> >>SELECT version >> FROM builds >> WHERE builds.install_date = ? AND product.id = ? # fill in values from above > > query > > > >>-or- >> >>Build.find(:first, :select => ''verson'', >> conditions => [''install_date = ? AND product.id = ?'', <at> install_date, > > <at> product_id]) > > Interesting. Is there a way of combining the two queries into one so that the > one query will take care of finding the product/date result set and then > plugging it into the second query where the version column is selected? > > <tangent> > My goal is to have the system automated, so that when a build of a certain > product is done, and the information is added to the builds table, a person > navigating to the "Latest Builds" page will see it updated dynamically to > reflect the just-completed build. > > I realize that we could make a static page listing the current latest builds and > manually update it, and it seems so inelegant (and prone to lack of updating). > Am I approaching this problem incorrectly by starting with the DB and > translating it into Ruby? Should I be doing this all in Ruby and sticking the > code in a controller? > > My initial instinct for this project was to start with the database. My thought > was that normalizing the tables as much as possible would allow the rest to fall > into place when designing the Rails project. I spent many hours and countless > revisions of the schema trying to do this, and I ended up right where I > started-- wondering when I will get to actually start coding the Rails project. > > Now, I''m left wondering if I should have started the design with Rails. Doing > mockups of what I want to accomplish and then fitting the DB schema to that. > Anyway, sorry for the rambling. I needed a break from SQL queries :) > </tangent>