similar to: Possible Bug? indexWriter#doc_count counts deleted docs after #commit

2006 Sep 14
2
Possible Bug? indexWriter#doc_count counts deleted docs after #commit
Hi David, > Deleted documents don't get deleted until commit is called Ok, but FYI, my experiments show that #commit doesn't affect #doc_count, even across ruby sessions. On a different note, I'd like to request a variation of #add_document which returns the doc_id of the document added, as opposed to self. I'm trying to track down an issue with a large
2006 Sep 15
0
Possible Bug? indexWriter#doc_count counts deleted docs after #commit
> I should also mention the reason I wouldn't want > to return the document ID from any IndexWriter method > is that the document ID could become invalid when the > next document is added (if a segment merge is triggered > and deletes exist). At least when using an IndexReader, > the document ID is valid for the life of the reader. Thanks for your detail Dave!
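The quoted explanation can be illustrated with a toy model in plain Ruby. This is only a sketch and not Ferret's implementation: `ToySegment`, `doc_id_of` and `merge` are hypothetical names invented here. It assumes that a document ID is simply a position across all segments and that a merge physically drops deleted documents, so every document after a deleted one is renumbered.

```ruby
# Toy model only: NOT Ferret's internals. Doc IDs are positions in the
# concatenation of all segments; a merge drops documents marked deleted.
ToySegment = Struct.new(:docs, :deleted)

# Return the doc ID (global position) of a document, or nil if absent.
def doc_id_of(segments, doc)
  segments.flat_map(&:docs).index(doc)
end

# Merge all segments into one, physically removing deleted documents.
def merge(segments)
  live = segments.flat_map { |s| s.docs.reject { |d| s.deleted.include?(d) } }
  [ToySegment.new(live, [])]
end

segments  = [ToySegment.new(%w[a b c], ["a"]), ToySegment.new(%w[d], [])]
id_before = doc_id_of(segments, "d")   # position still counts the deleted "a"
segments  = merge(segments)            # e.g. triggered by the next add
id_after  = doc_id_of(segments, "d")   # "a" is gone, so "d" shifts down

puts "doc 'd': id #{id_before} before merge, #{id_after} after"
```

Under these assumptions an ID handed back at add time would point at a different document after the merge, which is the concern raised in the quoted reply.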
2006 Aug 28
12
Help with Multiple Readers, 1 Writer scenario
Hi, I'm building a web server application using Ferret [thanks so much Dave], Mongrel and Camping which works fine servicing one request at a time, but serialises searches if more than one request arrives, so I'd like some advice please about the best way to use multiple readers and one writer. Some background ... query requests which in my case are always read only, arrive via
2006 Sep 22
3
Error with :create => true and existing index
I implemented a "reindex" command which simply creates an IndexWriter with :create => true for a preexisting index. The "reindexing" seems to start out ok, with several thousand docs added, then Ferret throws an exception: IO Error occured: couldn't rename file "index\_0.tmp" to "index\_0.cfs": <File exists> I guess that _0.cfs is held
2006 Nov 22
1
Help with Multiple Readers, 1 Writer scenario
Some time back in September, [sorry to be so slow], Dave wrote: > When you open an IndexReader on the index it is opened up on > that particular version (or state) of the index. So any > operations on the IndexReader (like searches) will only show > what was in the index at the time you opened it. Any modifications > to the index (usually through an IndexWriter) that occur
2007 Apr 28
6
Determine how many documents a term occurs in
Is there a fast way to determine how many documents a term occurs in, besides iterating through every document with TermDocEnum? -- Best regards, Stian Grytøyr
2006 Jun 14
3
In memory IndexReader bug?
Hi All, Hope all is going well. I'm having trouble with the following code creating an in memory index reader - it seems to be attempting to read from a file regardless. Here's the simple code: require 'rubygems' require 'ferret' a = Ferret::Index::Index.new r = Ferret::Index::IndexReader.new(nil) Running the code on my OS X machine
2007 Apr 12
2
Ferret 0.11.4.win32 indexing speed vs Ferret 0.10.9.win32
Firstly, thanks Dave for all your hard work. Ferret rocks! I am just testing 0.11.4.win32 and it seems to work just fine; however, the index creation phase of my app is perhaps 3x slower under 0.11.4 vs 0.10.9. Details follow: System: Windows XP SP2, index on local hard disk, Ruby 1.8.6 Run #1, Ferret 0.10.9 - Reboot - Build index, 35,000 rows added in 297 seconds - Run #2, Ferret 0.11.4 -
2006 Sep 15
2
Trouble with "updating" a document
Hi, I seem to be having trouble updating a doc, i.e., deleting then re-adding to the index. The following script demonstrates my issue - I'm sure I'm missing something obvious, but I can't seem to find the problem. Can someone point out where I am going wrong please? Regards Neville === require 'rubygems' require 'ferret' p
2006 May 08
3
Index::Index.new vs. Readers and Writers
Hey gang, A post on the Rails forum a while back had it sound like you pretty much had to use the Index Readers & Writers if you were going to be potentially accessing an index from more than one process. (i.e. multiple dispatch.fcgi's, etc) Is this still the case, or does the main Index class do that black magic behind the scenes? =) I was having trouble implementing the
2006 Sep 28
3
A few questions about numbers and dates
Hi, I just noticed that Ferret seems to convert every field to a string [ruby code appended for those interested], which has thwarted my attempt to format Dates (to "dd/mm/yyyy") and Floats (to "n.nn") for consumption further down the line based on the class of the field stored. I considered pre-formatting Dates and Floats prior to indexing, which would store the field
2008 Jan 09
5
Parallel indexing doesn't work?
Hi, I'm trying to get parallelized ferret indexing working for my AAF indices, based on the example in the O'Reilly Ferret shortcut. However, the resulting indices after merging seem to have no actual documents. I went and made minimal changes to the example in the Ferret shortcut pdf, and indeed can't get that to work either. I'd appreciate any help
2006 Jul 05
1
search speed eclipsed by retrieval speed
Hi all, I've recently started working with Ferret and I'm getting what seems to be slow searches. I have about 10000 documents in the index, with several fields per document, with some fields having an array of several values that are indexed. I am using a RAMDirectory to store the index for searching. When doing testing, I find that searches are reasonable at around .2 to
2006 Sep 04
7
0.10.2 release with win32 gem
Hey all, I've just released Ferret version 0.10.2. It is mostly just a bug fix release. The only change is that a highlight method has been added to Ferret::Index::Index. Please try it out and let me know what you think. The big news for this release is that there is also a binary win32 gem included. This is the first time I've built a gem like this so please let me know if
2006 Jun 04
20
Proposal of some radical changes to API
Hey guys, Now that the Lucy[1] project has Apache approval and is about to begin, the onus is no longer on Ferret to strive for Lucene compatibility. (We'll be doing that in Lucy). So I'm starting to think about ways to improve Ferret's API. The first part that needs to be improved is the Document API. It's annoying having to type all the attributes to
2005 Nov 17
6
lock problems from concurrent processes.
Hi! First, thanks a LOT for ferret. The API and documentation is great. I'm trying to integrate ferret into a RoR app (DamageControl) and have run into a problem with locks. DamageControl consists of two processes that start up and run in parallel. The first one is the webapp (which is just a plain RoR app). The second is a daemon process that runs in the background. The daemon process
2006 Aug 03
2
Index.optimize
In the documentation, it says that optimize "should only be called when the index will no longer be updated very often, but will be read a lot". Does this mean it actually has a detrimental impact on updates and inserts? In my project there will be many more reads than updates, but there will still be a lot of updates. So should I be calling Optimize once a day or something like that,
2005 Mar 03
5
What's 'favicon.ico'
I'm seeing the following in the WEBrick console output after every GET 192.168.0.108 - - [03/Mar/2005:15:35:19 AUS Eastern Daylight Time] "GET /favicon.ico HTTP/1.1" 200 60 - -> /favicon.ico What does /favicon.ico (which doesn't seem to exist in my source) do for Rails?
2006 Apr 20
1
Creating my own analyzer
I created this analyzer: class DescriptionAnalyzer < Ferret::Analysis::Analyzer def token_stream(field, string) if field == "code" return CodeTokenStream.new(string) else return Ferret::Analysis::Analyzer.new.token_stream(field,string) end end end and created an IndexWriter with it: Ferret::Index::IndexWriter.new(get_index_path,
2006 Sep 05
4
Ferret 0.10.2 - Index#search_each() and :num_docs
Hi, I seem to be having trouble getting more than 10 hits from Index#search_each since upgrading to 0.10.2 (i.e., this was working in 0.9.4). Maybe a bug, as the #search_each doesn't seem to use the options parameter any more? Thanks, Neville =========================================== require 'rubygems' require 'ferret' p Ferret::VERSION idx =