thr3ads.net - Rails - What is the most efficient way to split a table into 2 groups? [Sep 2009]

If this information is useful, please help other people find it:
Share via:

Patrick Doyle

2009-Sep-28 21:32 UTC

What is the most efficient way to split a table into 2 groups?

I have the following:

@lot = Lot.find(params[:id])

part_nums = Part.all(:conditions => ["id <> ?",
@lot.part.id])

I guess I should mention that

Lot :belongs_to => :part

I was looking at the log following the execution of these two
statements and I saw something like this:

Lot Load (0.4ms) SELECT * FROM "lots" WHERE
("lots"."id" = 13)
Part Load (0.3ms) SELECT * FROM "parts" WHERE
("parts"."id" = 2)
Part Load (0.9ms) SELECT * FROM "parts" WHERE (id <> 2)

It looked a bit silly to me -- first I grab a record from the "parts"
table with an ID of 2, then I grab all the records from the parts
table whose ID is not 2.

I played around a little with :include clauses, thinking that I
should, at least, be able to fetch the record from the "lots" table
and the "parts" table simultaneously (with something like a joins
clause and a "lots.part_id = parts.id" WHERE clause), but
didn''t get
anywhere with that.

I will end up leaving this the way it is (most likely), especially
since this isn''t the right phase of development to be worrying about
optimization, but I am curious how one might do this most efficiently.

Is it most efficient to grab 1 record where record.id = blah and then
all the (rest of the) records where record.id <> blah?

Is it more efficient to grab all the records at once and write some
ruby code to select the one record from the rest?  If so, what would
that code look like?  I don''t like this:

everything = Part.all
the_one = everything.select {|x| x.id == 2}
the_rest = everything.reject {|x| x.id != 2}

That''s going to iterate over all of the records.  twice!  in
interpreted code!

Any thoughts, ideas, or snide remarks?

--wpd

Frederick Cheung

2009-Sep-28 21:42 UTC

head link

Re: What is the most efficient way to split a table into 2 groups?

On Sep 28, 10:32 pm, Patrick Doyle
<wpds...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
wrote:> I will end up leaving this the way it is (most likely), especially
> since this isn''t the right phase of development to be worrying
about
> optimization, but I am curious how one might do this most efficiently.
>
> Is it most efficient to grab 1 record where record.id = blah and then
> all the (rest of the) records where record.id <> blah?
>
> Is it more efficient to grab all the records at once and write some
> ruby code to select the one record from the rest?  If so, what would
> that code look like?  I don''t like this:
Try both, benchmark them - don''t take some random person on the
internet''s word for it!>
> everything = Part.all
> the_one = everything.select {|x| x.id == 2}
> the_rest = everything.reject {|x| x.id != 2}
>
> That''s going to iterate over all of the records.  twice!  in
interpreted code!
partition will do it in one

Fred>
> Any thoughts, ideas, or snide remarks?
>
> --wpd

Marnen Laibow-Koser

2009-Sep-28 22:07 UTC

head link

Re: What is the most efficient way to split a table into 2 g

Patrick Doyle wrote:> I have the following:
> 
> @lot = Lot.find(params[:id])
> 
> part_nums = Part.all(:conditions => ["id <> ?",
@lot.part.id])
> 
> I guess I should mention that
> 
> Lot :belongs_to => :part
> 
> I was looking at the log following the execution of these two
> statements and I saw something like this:
> 
> Lot Load (0.4ms) SELECT * FROM "lots" WHERE
("lots"."id" = 13)
> Part Load (0.3ms) SELECT * FROM "parts" WHERE
("parts"."id" = 2)
> Part Load (0.9ms) SELECT * FROM "parts" WHERE (id <> 2)
> 
> It looked a bit silly to me -- first I grab a record from the
"parts"
> table with an ID of 2, then I grab all the records from the parts
> table whose ID is not 2.
I think what''s happening here is this: you''re calling
@lot.part.id, so
Rails needs to load @lot.part, which accounts for the extra query.  To 
fix, use the :joins option on Lot.find, or simply call @lot.part_id.
> 
> I played around a little with :include clauses, thinking that I
> should, at least, be able to fetch the record from the "lots"
table
> and the "parts" table simultaneously (with something like a joins
> clause and a "lots.part_id = parts.id" WHERE clause), but
didn''t get
> anywhere with that.
:includes is useless here.  :joins will do the trick.

> 
> I will end up leaving this the way it is (most likely), especially
> since this isn''t the right phase of development to be worrying
about
> optimization, but I am curious how one might do this most efficiently.
> 
> Is it most efficient to grab 1 record where record.id = blah and then
> all the (rest of the) records where record.id <> blah?
> 
> Is it more efficient to grab all the records at once and write some
> ruby code to select the one record from the rest?  
That''s what I think I''d do.
> If so, what would
> that code look like?  I don''t like this:
> 
> everything = Part.all
> the_one = everything.select {|x| x.id == 2}
> the_rest = everything.reject {|x| x.id != 2}
> 
> That''s going to iterate over all of the records.  twice!  in
interpreted
> code!
So just iterate once, and test for the special value as you go.  If you 
can''t do that, then make 2 DB queries.
> 
> Any thoughts, ideas, or snide remarks?
> 
> --wpd
Best,
--
Marnen Laibow-Koser
http://www.marnen.org
marnen-sbuyVjPbboAdnm+yROfE0A@public.gmane.org
-- 
Posted via http://www.ruby-forum.com/.

Marnen Laibow-Koser

2009-Sep-28 22:18 UTC

head link

Re: What is the most efficient way to split a table into 2 g

Frederick Cheung wrote:
[...]>> That''s going to iterate over all of the records. �twice! �in
interpreted code!
> 
> partition will do it in one
Hey, that''s good to know!

If you need to do the partitioning in the DB, you could use ORDER BY 
abs(id - :magic_id).  That way the first record will be the magic
one.> 
> Fred
Best,
--
Marnen Laibow-Koser
http://www.marnen.org
marnen-sbuyVjPbboAdnm+yROfE0A@public.gmane.org
-- 
Posted via http://www.ruby-forum.com/.

Patrick Doyle

2009-Sep-29 12:27 UTC

head link

Re: What is the most efficient way to split a table into 2 groups?

On Mon, Sep 28, 2009 at 5:42 PM, Frederick Cheung
<frederick.cheung-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
wrote:>> On Sep 28, 10:32 pm, Patrick Doyle
<wpds...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>
> partition will do it in one
>Thanks Fred, that''s exactly what I was looking for.
> Try both, benchmark them - don''t take some random person on the
> internet''s word for it!
Ahh, but you are not "some random person on the internet".  You are an
active member of this community who regularly answers questions for
newbies and unfortunates who can be too lazy to look things up
themselves.  As such, I have come to appreciate and respect your
answers, especially the ones that say, "the method for which you are
looking is #partition, but it''s up to you to determine if it makes
things go faster or not" :-)

Thanks again for all the help.

--wpd

Patrick Doyle

2009-Sep-29 12:30 UTC

head link

Re: What is the most efficient way to split a table into 2 g

On Mon, Sep 28, 2009 at 6:18 PM, Marnen Laibow-Koser
<rails-mailing-list-ARtvInVfO7ksV2N9l4h3zg@public.gmane.org>
wrote:> Frederick Cheung wrote:
> [...]
>>> That''s going to iterate over all of the records. �twice!
�in interpreted code!
>>
>> partition will do it in one
>
> Hey, that''s good to know!
>
> If you need to do the partitioning in the DB, you could use ORDER BY
> abs(id - :magic_id).  That way the first record will be the magic one.
Oh that''s clever!  I like that!

Thanks.

--wpd

Patrick Doyle

2009-Sep-29 16:29 UTC

head link

Re: What is the most efficient way to split a table into 2 g

On Mon, Sep 28, 2009 at 6:07 PM, Marnen Laibow-Koser
<rails-mailing-list-ARtvInVfO7ksV2N9l4h3zg@public.gmane.org>
wrote:> Patrick Doyle wrote:
>> I have the following:
>>
>> @lot = Lot.find(params[:id])
>>
>> part_nums = Part.all(:conditions => ["id <> ?",
@lot.part.id])
>>
>> I guess I should mention that
>>
>> Lot :belongs_to => :part
>>
>> I was looking at the log following the execution of these two
>> statements and I saw something like this:
>>
>> Lot Load (0.4ms) SELECT * FROM "lots" WHERE
("lots"."id" = 13)
>> Part Load (0.3ms) SELECT * FROM "parts" WHERE
("parts"."id" = 2)
>> Part Load (0.9ms) SELECT * FROM "parts" WHERE (id <> 2)
>>
>> It looked a bit silly to me -- first I grab a record from the
"parts"
>> table with an ID of 2, then I grab all the records from the parts
>> table whose ID is not 2.
>
> I think what''s happening here is this: you''re calling
@lot.part.id, so
> Rails needs to load @lot.part, which accounts for the extra query.  To
> fix, use the :joins option on Lot.find, or simply call @lot.part_id.
>I should let this be, but now I''m curious... what does :joins actually
do for me?

(in script/console)>> l=Lot.find(13, :joins => :part)
produces (in the logfile)

SELECT "lots".* FROM "lots" INNER JOIN "parts" ON
"parts".id "lots".part_id WHERE
("lots"."id" = 13)

and that looks like what I wanted.  But when I follow that up with
>> l.part
I see in the logfile:

SELECT * from "parts" WHERE ("parts"."id" = 2)

so it seems that the inner join was wasted.  Is this a development vs.
production thing?  (I know, I''m not supposed to worry about
optimization now, but I figure if I learn how to write generally more
optimal code by default, it will only help).

If I try:
>> l=Lot.find(13, :include => :part)
I see in the logfile:

SELECT * FROM "lots" WHERE ("lots"."id" = 13)
SELECT * FROM "parts" WHERE ("parts".id" = 2)

so neither the :joins nor the :include options seem to help, although
the :include option prevents the subsequent database access when I
finally access l.part

Again, this is not stopping me from anything, and I really should
leave the optimization phase for later, but I am curious about the
intended behavior of :include and :joins, and puzzled by the apparent
behavior.

--wpd

Possibly Parallel Threads

Search for more apparently analagous threads

Rails - Sep 2009 - What is the most efficient way to split a table into 2 groups?

What is the most efficient way to split a table into 2 groups?

Re: What is the most efficient way to split a table into 2 groups?

Re: What is the most efficient way to split a table into 2 g

Re: What is the most efficient way to split a table into 2 g

Re: What is the most efficient way to split a table into 2 groups?

Re: What is the most efficient way to split a table into 2 g

Re: What is the most efficient way to split a table into 2 g

Possibly Parallel Threads