Clare
2006-Aug-29 00:06 UTC
[Ferret-talk] adding new items to index breaks searches with *
Hi, after upgrading to ferret 0.10.1 and bleeding-edge aaf I'm getting some strange behavior. Stability is generally much better with the new version of ferret, but once I add new items I can no longer search with a *. Or rather, I can, but it returns no results and no errors. Other searches return results normally, and after I rebuild the index I can search with * again until I add a new item. Has anyone else experienced this? I use * in my browse items page. I think I have a fairly standard aaf setup. Any ideas what might be going on here, or what else to investigate or try?

Regards,
Clare

--
Posted via http://www.ruby-forum.com/.
Jens Kraemer
2006-Aug-29 06:50 UTC
[Ferret-talk] adding new items to index breaks searches with *
Do you mean a query consisting only of '*', or wildcard queries like 'test*'? The former isn't an allowed query, afaik. I don't know why it works before the index is modified. Here's the snippet I used to reproduce this behavior:

    require 'rubygems'
    require 'ferret'
    include Ferret
    i = I.new
    i << 'just some testing'
    i.search('*').total_hits # => 1
    i << 'another testing session'
    i.search('*').total_hits # => 0

Why don't you just use find(:all) on your browse page?

Jens

--
webit! Gesellschaft für neue Medien mbH www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de
Schnorrstraße 76 Tel +49 351 46766 0
D-01069 Dresden Fax +49 351 46766 66
Clare
2006-Aug-29 13:49 UTC
[Ferret-talk] adding new items to index breaks searches with *
Hi, I could use a find(:all, :conditions => blah), but my browse page is divided into types and categories, so I was using a wildcard search with find_by_contents plus one or two filters, depending on whether the user selects a type or a type and a category. I just thought that ferret would be faster than a find(:all) with conditions (also, I'm already using it on my search page, and the browse page has similar functionality). Is this not so? The conditions would be an exact match on the full contents of a db cell. Would ferret still be faster for this?

So what I basically want to do is a simple search on one or two fields. How is this done with acts_as_ferret? How do you specify which fields of the index to search on?

Thanks for any advice.

Regards,
Clare
Jens Kraemer
2006-Aug-29 15:11 UTC
[Ferret-talk] adding new items to index breaks searches with *
Hi!

On Tue, Aug 29, 2006 at 03:49:10PM +0200, Clare wrote:
> The conditions would be an exact match on the full contents of a db cell.
> Would ferret still be faster with this?

As long as there's no user-entered search term to find, just categories and types, you might be better off using the db directly. With a find statement like

    find(:all, :conditions => ["category=?", params[:category]])

speed won't differ much, as long as your db has an index on the category column. In general, Ferret tends to be faster when it comes to searching longer texts.

Also keep in mind that acts_as_ferret always fetches records by id from the db anyway, to retrieve the records whose ids it has found through Ferret. So aaf can only be faster if Ferret needs less time for searching than the time difference between these two statements:

    select * from ... where id in (...)

(what aaf does with the ids it found) and

    select * from ... where category=''

(what you would do when using find(:all, ...)).

> So what I basically want to do is a simple search on one or two fields.
> How is this done with acts as ferret? How do you specify what fields out
> of the index to search on?

In your query, prefix the term with the field name, e.g. "title:test" will only retrieve records where the term test occurs in the title field.

Cheers,
Jens
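For a browse page with optional type and category filters, the field-prefix advice above could be applied roughly as follows. This is only a sketch in plain Ruby: the helper name browse_query and the field names type and category are hypothetical, it assumes single-word filter values, and it assumes the query parser accepts AND between clauses. It only builds the query string; handing that string to find_by_contents is left out.

```ruby
# Hypothetical helper (not part of Ferret or acts_as_ferret): builds a
# field-prefixed query string such as "type:book AND category:fiction"
# from the filters the user selected. Assumes single-word values.
def browse_query(filters)
  # Skip filters the user left empty.
  selected = filters.reject { |_field, value| value.nil? || value.to_s.strip.empty? }
  clauses = selected.map { |field, value| "#{field}:#{value}" }
  # With no filter selected, return nil so the caller can fall back to a
  # plain database find instead of sending a bare "*" query.
  clauses.empty? ? nil : clauses.join(' AND ')
end

puts browse_query(type: 'book')                       # type:book
puts browse_query(type: 'book', category: 'fiction')  # type:book AND category:fiction
```

Returning nil for the unfiltered case keeps the decision between the index and the database in the calling code, which matches the suggestion to use the db directly when there is no search term.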
Jan Prill
2006-Aug-29 15:32 UTC
[Ferret-talk] adding new items to index breaks searches with *
As a side note, I'd like to mention that ferret also tends to be faster than MySQL on columns/fields with short texts if there are many, many records. I, for example, had this experience with a MySQL database with millions of rows of short varchars in its columns. MySQL, even when optimized by an experienced DBA, with all necessary indices set and queries that EXPLAIN showed to be optimized, had quite some problems handling lots of queries, while ferret did a damn good job of querying the same data. Ferret/Lucene gave me a real sense of achievement here.

That said, it is quite uncommon to have millions of categories, so Jens's suggestion seems very reasonable on this point.

Cheers,
Jan
Clare
2006-Aug-29 15:37 UTC
[Ferret-talk] adding new items to index breaks searches with *
Thanks very much for clarifying that Jens. Much appreciated! regards Clare
David Balmain
2006-Sep-02 01:54 UTC
[Ferret-talk] adding new items to index breaks searches with *
Thanks for the snippet, Jens. This was a bug (quite a serious one), which I have now fixed. As Jens said, "*" queries were not a good idea and would fail on most indexes because of the number of terms (they got expanded into MultiTermQueries containing every single term in the index). However, I've now modified the QueryParser to translate "*" to a MatchAllQuery, so there should be no problem, performance or otherwise, with using "*" in your queries.

I should note here that "title:*" will match all documents, including documents that don't have a :title field. If you only want documents with a :title field, you should use "title:?*". Having said that, if you are using these types of queries there is probably a better way to do what you are doing.

Cheers,
Dave
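The difference between "title:*" and "title:?*" described above can be pictured with a toy model in plain Ruby. This is not Ferret code and says nothing about how Ferret implements these queries: the documents are ordinary hashes, and the names DOCS, match_all and with_field are made up for illustration.

```ruby
# Toy model only: documents are plain hashes, not Ferret documents.
DOCS = [
  { id: 1, title: 'Ferret basics', body: 'indexing' },
  { id: 2, body: 'no title here' }
].freeze

# Models "title:*" as described above: every document matches,
# including those without a :title field.
def match_all(docs)
  docs
end

# Models "title:?*": only documents whose field holds at least one character.
def with_field(docs, field)
  docs.select { |doc| doc[field].to_s.length >= 1 }
end

p match_all(DOCS).map { |d| d[:id] }           # [1, 2]
p with_field(DOCS, :title).map { |d| d[:id] }  # [1]
```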