gregclarke-ALpN63DfoWpx3z9c7Zyw2w@public.gmane.org
2005-Oct-14 00:09 UTC
oci_adapter: limit and offset queries slow; possible solution
Hi (First post to this list, so here goes....) I''ve written an RoR search engine against an Oracle 9i database maintained via a legacy system. I plan to release the search engine on our intranet asap and hope it is the beginning of lots more Rails apps where i work. My RoR search engine models reference views rather than tables and some of these views access fairly large tables eg. 3.6 million records. I have found that the way limit and offset are implemented in the oci_adapter under activerecord-1.11.1 works fine for small views, but is unworkably slow for large views. I''ve now modified my copy of the oci_adapter to handle limit and offset in a different way and, so far, i''ve found that it performs well for both large and small views. Instead of queries blowing out to minutes, i''m back to reliable responses of a few seconds or less - i''d prefer sub-second, of course. Up till now I''ve been a little frustrated by some of the default behaviour of Rails when it comes to producing sql for large views and i''m hoping that my mod to the oci_adapter.rb may solve some of those problems. I''m interested to know if anyone else has had similar problems on large views in oracle. Here are some more details if you are interested: I know the response time delay is coming from the sql that is rewritten by the oci_adapter.rb because i can see the slow sql in the development log. I''ve also taken that sql and run it in sqlplus. Here''s one now: select * from (select raw_sql_.*, rownum raw_rnum_ from (SELECT * FROM docket_files WHERE officer_pt_id = ''XYZ1838'' ) raw_sql_ where rownum <= 10) where raw_rnum_ > 0 This took: 00:04:01.07 to return 6 records. If i run the core sql, which is SELECT * FROM docket_files WHERE officer_pt_id = ''XYZ1838'' it takes 00:00:02.00 to return 6 records. If i run the first query again it takes about the same time ie. no caching effect. If i run the core sql first, it still runs fast ie. no caching effect necessary to explain its speed. The explain plans for the nested queries show costly full table scans, but the core sql typically has a good explain plan. What i did to the oci_adapter.rb: I removed the code that rewrites the core sql into nested sql to take advantage of rownum. I instead apply limit and offset in the fetch processing. Basically, you fetch all the records up to the limit, but you only grab row/col vals between the offset and the limit. I''ve written various search engines in pl/sql in the past and have seen various ways of implementing offset and limit, but the approach i''ve just described is often the best way, because if you have a well-written query, re-executing it is very quick. (Use of bind variables is a topic for another post.) Anyway, here''s my development code, for what it''s worth. This fragment contains a call to bind_sql, which is sth else i''ve written to send sql with bind variables to the oracle database, pending a better way of getting rails to send bind variables. Just replace *bind_sql(sql) with sql. My code also assumes you only have an offset if you have a limit. So far it has worked in all my queries, but be aware this code shows a working concept rather than a fully tested solution. # Required for my altered version of select_all def get_row(cols,row) hash = Hash.new cols.each_with_index { |col, i| hash[col] = case row[i] when OCI8::LOB name == ''Writable Large Object'' ? row[i]: row[i].read when OraDate (row[i].hour == 0 and row[i].minute == 0 and row[i].second == 0) ? row[i].to_date : row[i].to_time else row[i] end unless col == ''raw_rnum_'' } hash end # select_all altered by Greg Clarke 13-10-2005 # # i found the doubly nested sql technique does not scale well. it is # unworkably slow for some of my views. # i''ve implemented limit and offset using procedural logic in this version # of select_all. # def select_all(sql, name = nil) log("select_all sql: #{sql}",name) offset = sql =~ /OFFSET (\d+)$/ ? $1.to_i : 0 sql, limit = $1, $2.to_i if sql =~ /(.*)(?: LIMIT[](\d+))(\s*OFFSET \d+)?$/ cursor = log(sql, name) { @connection.exec *bind_sql(sql) } cols = cursor.get_col_names.map { |x| x.downcase } rows = [] rcount = 0; finished = false; while (row = cursor.fetch) && (finished == false) rcount = rcount.next if limit then log("############# we have limit: #{limit} and rcount: #{rcount}; and offset: #{offset}",name) if rcount > offset then rows << get_row(cols,row) end if rcount >= (limit + offset) then finished = true end else rows << get_row(cols,row) end end rows ensure cursor.close if cursor end Greg Clarke Canberra, Australia