gregclarke-ALpN63DfoWpx3z9c7Zyw2w@public.gmane.org
2005-Oct-14 00:09 UTC
oci_adapter: limit and offset queries slow; possible solution
Hi
(First post to this list, so here goes....)
I''ve written an RoR search engine against an Oracle 9i database
maintained
via a legacy system. I plan to release the search engine on our intranet
asap and hope it is the beginning of lots more Rails apps where i work.
My RoR search engine models reference views rather than tables and some of
these views access fairly large tables eg. 3.6 million records.
I have found that the way limit and offset are implemented in the
oci_adapter under activerecord-1.11.1 works fine for small views, but is
unworkably slow for large views.
I''ve now modified my copy of the oci_adapter to handle limit and offset
in
a different way and, so far, i''ve found that it performs well for both
large and small views. Instead of queries blowing out to minutes, i''m
back to reliable responses of a few seconds or less - i''d prefer
sub-second, of course.
Up till now I''ve been a little frustrated by some of the default
behaviour
of Rails when it comes to producing sql for large views and i''m hoping
that my mod to the oci_adapter.rb may solve some of those problems.
I''m interested to know if anyone else has had similar problems on large
views in oracle.
Here are some more details if you are interested:
I know the response time delay is coming from the sql that is rewritten by
the oci_adapter.rb because i can see the slow sql in the development log.
I''ve also taken that sql and run it in sqlplus. Here''s one
now:
select * from (select raw_sql_.*, rownum raw_rnum_ from
(SELECT * FROM docket_files WHERE officer_pt_id = ''XYZ1838'' )
raw_sql_
where rownum <= 10) where raw_rnum_ > 0
This took: 00:04:01.07 to return 6 records.
If i run the core sql, which is
SELECT * FROM docket_files WHERE officer_pt_id = ''XYZ1838''
it takes 00:00:02.00 to return 6 records.
If i run the first query again it takes about the same time ie. no caching
effect. If i run the core sql first, it still runs fast ie. no caching
effect necessary to explain its speed.
The explain plans for the nested queries show costly full table scans, but
the core sql typically has a good explain plan.
What i did to the oci_adapter.rb:
I removed the code that rewrites the core sql into nested sql to take
advantage of rownum. I instead apply limit and offset in the fetch
processing. Basically, you fetch all the records up to the limit, but you
only grab row/col vals between the offset and the limit.
I''ve written various search engines in pl/sql in the past and have seen
various ways of implementing offset and limit, but the approach i''ve
just
described is often the best way, because if you have a well-written query,
re-executing it is very quick. (Use of bind variables is a topic for
another post.)
Anyway, here''s my development code, for what it''s worth. This
fragment
contains a call to bind_sql, which is sth else i''ve written to send sql
with bind variables to the oracle database, pending a better way of
getting rails to send bind variables. Just replace *bind_sql(sql) with
sql. My code also assumes you only have an offset if you have a limit.
So far it has worked in all my queries, but be aware this code shows a
working concept rather than a fully tested solution.
# Required for my altered version of select_all
def get_row(cols,row)
hash = Hash.new
cols.each_with_index { |col, i|
hash[col] = case row[i]
when OCI8::LOB
name == ''Writable Large Object'' ? row[i]:
row[i].read
when OraDate
(row[i].hour == 0 and row[i].minute == 0 and row[i].second
== 0) ?
row[i].to_date : row[i].to_time
else row[i]
end unless col == ''raw_rnum_''
}
hash
end
# select_all altered by Greg Clarke 13-10-2005
#
# i found the doubly nested sql technique does not scale well. it is
# unworkably slow for some of my views.
# i''ve implemented limit and offset using procedural logic in
this
version
# of select_all.
#
def select_all(sql, name = nil)
log("select_all sql: #{sql}",name)
offset = sql =~ /OFFSET (\d+)$/ ? $1.to_i : 0
sql, limit = $1, $2.to_i if sql =~ /(.*)(?: LIMIT[](\d+))(\s*OFFSET
\d+)?$/
cursor = log(sql, name) { @connection.exec *bind_sql(sql) }
cols = cursor.get_col_names.map { |x| x.downcase }
rows = []
rcount = 0;
finished = false;
while (row = cursor.fetch) && (finished == false)
rcount = rcount.next
if limit then
log("############# we have limit: #{limit} and rcount:
#{rcount}; and offset: #{offset}",name)
if rcount > offset then
rows << get_row(cols,row)
end
if rcount >= (limit + offset) then
finished = true
end
else
rows << get_row(cols,row)
end
end
rows
ensure
cursor.close if cursor
end
Greg Clarke
Canberra, Australia