search for: tokenstream

Displaying 19 results from an estimated 19 matches for "tokenstream".

2007 Apr 08
3
How to make custom TokenFilter?
In the O'Reilly Ferret short cuts, I found a very useful example. It explains how to make a custom Tokenizer, but that book doesn't explain how to make a custom Filter (especially how to implement the #text=() method). I'm a newbie and I don't understand how to create my own custom Filter. Are there any good source code examples? -- Posted via
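The filter pattern being asked about can be sketched in plain Ruby without Ferret installed. This is a hedged illustration, not Ferret's actual API: `Token`, `WhitespaceTokenizer`, and `DowncaseFilter` are hypothetical stand-ins showing the protocol a Ferret-style stream follows, namely that it responds to #next (returning a token or nil) and #text= (resetting the stream on new input), and that a filter wraps another stream and delegates #text= to it:

```ruby
# Hypothetical stand-in: Ferret's real Token also carries offsets.
Token = Struct.new(:text)

class WhitespaceTokenizer
  def initialize(text)
    self.text = text
  end

  # #text= resets the stream so the same object can be reused
  def text=(text)
    @words = text.split
  end

  def next
    word = @words.shift
    word && Token.new(word)
  end
end

# A filter wraps another stream and delegates #text= to it.
class DowncaseFilter
  def initialize(input)
    @input = input
  end

  def text=(text)
    @input.text = text
  end

  def next
    token = @input.next
    return nil if token.nil?
    token.text = token.text.downcase
    token
  end
end

stream = DowncaseFilter.new(WhitespaceTokenizer.new("Hello Ferret World"))
tokens = []
while (t = stream.next)
  tokens << t.text
end
# tokens => ["hello", "ferret", "world"]
```

The key design point is that #text= is delegated down the chain, so the outermost filter can be handed new input and every wrapped stream resets along with it.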
2005 Aug 10
1
Issues with Canoo WebTest
...hread.run(Thread.java:552) [canoo] Enclosed exception: [canoo] SyntaxError: illegal character (Wrapper definition for Window.setTimeout(); line 1) [canoo] at org.mozilla.javascript.NativeGlobal.constructError (NativeGlobal.java:597) [canoo] at org.mozilla.javascript.TokenStream.reportSyntaxError(TokenStream.java: 1324) [canoo] at org.mozilla.javascript.TokenStream.getToken (TokenStream.java:1302) [canoo] at org.mozilla.javascript.Parser.memberExprTail (Parser.java:1213) [canoo] at org.mozilla.javascript.Parser.memberExpr (Parser.java:1204)...
2007 May 18
1
roll my own TokenFilter subclass
Hi all, I'd like to write my own TokenStream Filter (in lucene this would be a subclass of a TokenFilter, which ferret seems to lack) but I'm not sure how to go about it. Specifically, it's not clear how I'd create a non-trivial TokenStream to pass out to any filters that wrapped mine. Can anyone point me towards...
2015 Mar 05
3
Dovecot Full Text Search results in SolrException: undefined field text [SERIOUS]
...ype(IndexSchema.java:1269) at org.apache.solr.schema.IndexSchema$SolrQueryAnalyzer.getWrappedAnalyzer(IndexSchema.java:434) at org.apache.lucene.analysis.DelegatingAnalyzerWrapper$DelegatingReuseStrategy.getReusableComponents(DelegatingAnalyzerWrapper.java:74) at org.apache.lucene.analysis.Analyzer.tokenStream(Analyzer.java:175) at org.apache.lucene.util.QueryBuilder.createFieldQuery(QueryBuilder.java:207) at org.apache.solr.parser.SolrQueryParserBase.newFieldQuery(SolrQueryParserBase.java:374) at org.apache.solr.parser.SolrQueryParserBase.getFieldQuery(SolrQueryParserBase.java:742) at org.apache.solr.pa...
2006 Sep 06
9
Which analyzer to use
Lucene's standard analyzer splits words separated by underscores. Ferret doesn't do this. For example, if I create an index with only the document 'test_case' and search for 'case', it doesn't find anything. Lucene, on the other hand, finds it. The same goes for words separated by colons. Which analyzer should I use to emulate
2007 Apr 13
5
[Ferret] Serious memory leak on Joyent / TextDrive / Solaris
There is a serious memory leak bug in ferret. I'm having this error on a TextDrive Container (aka Joyent Accelerators) running OpenSolaris with Ferret 0.11.4. It happens while searching for terms with accented or special characters. This makes ferret allocate lots of memory (usually reaching 3+ GB) and fail if another such query is executed. Any ideas on this, could it be locale
2006 Jun 01
8
Windows progress
Hi there, What's the current status of the Windows port? I may be in a position to lend a hand over the next couple of weeks - where should I start looking? And what's the best way to get SVN HEAD? This happens: $ svn checkout svn://www.davebalmain.com/ferret/trunk ferret svn: Can't connect to host 'www.davebalmain.com': Connection refused --
2006 Jun 13
5
Grep style output?
Hi All, Hope all is going well. Was just wondering if anyone has implemented a grep style output page of hits using Ferret as the index/query engine? Any thoughts about how best to implement it? The previous thread discusses highlighting - would that be the best approach to follow or is there a better way? Cheers, Marcus -- Posted via http://www.ruby-forum.com/.
2007 Aug 03
0
StandardTokenizer Doesn't Support token_stream method
...http://ferret.davebalmain.com/api/classes/Ferret/Analysis/StandardTokenizer.html I ought to be able to construct a StandardTokenizer like this: t = StandardTokenizer.new(true) # true to downcase tokens and then later: stream = token_stream(ignored_field_name, some_string) To create a new TokenStream from some_string. This approach would be valuable for my application since I am analyzing many short strings -- so I'm thinking that building my 5-deep analyzer chain for each small string will be a nice savings. Unfortunately, StandardTokenizer#initialize does not work as advertised. It...
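The savings the poster is after, building the analyzer chain once and resetting it per string, can be illustrated in plain Ruby. This is a sketch of the reuse pattern only; `SimpleStream` is a hypothetical stand-in, not Ferret's StandardTokenizer, and it assumes the stream exposes a #text= reset as the Ferret docs describe:

```ruby
# Hypothetical stream: built once, reset per input via #text=.
class SimpleStream
  def initialize(text = "")
    @words = text.split
  end

  # Resetting avoids reconstructing the (possibly deep) chain per string.
  def text=(text)
    @words = text.split
  end

  def next
    @words.shift
  end
end

stream = SimpleStream.new          # chain built once
results = ["first string", "second one"].map do |s|
  stream.text = s                  # reset instead of reallocating
  out = []
  while (w = stream.next)
    out << w
  end
  out
end
# results => [["first", "string"], ["second", "one"]]
```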
2006 Oct 19
2
How to deal with accentuated chars in 0.10.8?
I'm starting to use Ferret and acts_as_ferret. I need to use something like EuropeanAnalyzer (http://olivier.liquid-concept.com/fr/pages/2006_acts_as_ferret_accentuated_chars). For example, if the user searches for "gonzalez", they can find documents that contain the term "gonzález". The EuropeanAnalyzer is based on Ferret::Analysis::TokenFilter,
2004 Aug 19
1
Festival Issues
Hey All, I now have Festival compiled, installed and running using the instructions on the Wiki page. When I try to change the voice that is being used however, I am running into a problem. I get the following in the festival server log: Cannot open file /tmp/est_10877_00000/utt.wav as tokenstream Wave load: can't open file "/tmp/est_10877_00000/utt.wav" Cannot load wavefile: /tmp/est_10877_00000/utt.wav When I look in the /tmp/est_10877_00000 folder, while the sound file is still playing according to Asterisk, the following seems to be created: total 56 drwxr-xr-x 2 darr...
2015 Mar 05
0
Dovecot Full Text Search results in SolrException: undefined field text [SERIOUS]
...) > at > org.apache.solr.schema.IndexSchema$SolrQueryAnalyzer.getWrappedAnalyzer(IndexSchema.java:434) > at > org.apache.lucene.analysis.DelegatingAnalyzerWrapper$DelegatingReuseStrategy.getReusableComponents(DelegatingAnalyzerWrapper.java:74) > at org.apache.lucene.analysis.Analyzer.tokenStream(Analyzer.java:175) > at > org.apache.lucene.util.QueryBuilder.createFieldQuery(QueryBuilder.java:207) > at > org.apache.solr.parser.SolrQueryParserBase.newFieldQuery(SolrQueryParserBase.java:374) > at > org.apache.solr.parser.SolrQueryParserBase.getFieldQuery(SolrQueryParserBase.j...
2006 Oct 20
2
Bug in search matching ?
Hi :) Here's a little code reproducing something that I consider a bug; if it's not, please explain :] http://pastie.caboo.se/18693 Thanks in advance, Cheers, Jérémie 'ahFeel' BORDIER -- Posted via http://www.ruby-forum.com/.
2013 Apr 05
2
Problem with fts lucene, on solaris 10
Hi all, I'm planning to migrate my courier-imap imap server to dovecot, but I'm experiencing a strange issue with fts-lucene plugin. Basically, every time I start a search, the log starts to write: Apr 05 19:30:53 indexer: Error: Indexer worker disconnected, discarding 1 requests for XXXXXX Apr 05 19:30:53 indexer-worker(XXXXX): Fatal: master: service(indexer-worker): child 809 killed
2006 Apr 19
2
How to do case-sensitive searches
Forgive me if this topic has already been discussed on the list. I googled but couldn't find much. I'd like to search through text for US state abbreviations that are written in capitals. What is the best way to do this? I read somewhere that tokenized fields are stored in the index in lowercase, so I am concerned that I will lose precision. What is the best way to store a
2015 Mar 05
2
Dovecot Full Text Search: HTTP 500 : Unknown fieldType 'text_general' specified on field text. [SERIOUS]
...org.apache.lucene.analysis.DelegatingAnalyzerWrapper$DelegatingReuseStrategy.getReusableComponents(DelegatingAnalyzerWrapper.java:74) at org.apache.lucene.analysis.Analyzer.tokenStream(Analyzer.java:175) at org.apache.lucene.util.QueryBuilder.createFieldQuery(QueryBuilder.java:207) at...
2007 Jul 07
2
Extending/Modifying QueryParser
...top_words = stop_words end def token_stream(field, str) ts = StandardTokenizer.new(str) ts = LowerCaseFilter.new(ts) if @lower ts = StopFilter.new(ts, @stop_words) ts = SynonymTokenFilter.new(ts, @synonym_engine) end end class SynonymTokenFilter < Ferret::Analysis::TokenStream include Ferret::Analysis def initialize(token_stream, synonym_engine) @token_stream = token_stream @synonym_stack = [] @synonym_engine = synonym_engine end def text=(text) @token_stream.text = text end def next return @synonym_stack.pop if @synonym_stac...
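The stack-based synonym filter sketched in the excerpt above can be written out as a complete, runnable illustration in plain Ruby. `Token`, `ArrayStream`, and the synonym hash here are hypothetical stand-ins for Ferret's classes; the point is the pattern of popping queued synonyms before pulling the next real token:

```ruby
# Hypothetical stand-ins for Ferret's Token and a wrapped stream.
Token = Struct.new(:text)

class ArrayStream
  def initialize(words)
    @tokens = words.map { |w| Token.new(w) }
  end

  def next
    @tokens.shift
  end
end

class SynonymFilter
  # synonyms: Hash mapping a term to an array of alternatives
  def initialize(token_stream, synonyms)
    @input = token_stream
    @synonyms = synonyms
    @stack = []
  end

  # Pop queued synonyms first; otherwise pull the next real token
  # and queue its synonyms for subsequent calls.
  def next
    return @stack.pop unless @stack.empty?
    token = @input.next
    return nil if token.nil?
    (@synonyms[token.text] || []).each { |s| @stack.push(Token.new(s)) }
    token
  end
end

stream = SynonymFilter.new(ArrayStream.new(%w[fast car]),
                           { "fast" => ["quick", "speedy"] })
out = []
while (t = stream.next)
  out << t.text
end
# out => ["fast", "speedy", "quick", "car"]
```

Note the synonyms are emitted in reverse insertion order because they come off a stack; a queue would preserve order if that matters for highlighting.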
2007 Mar 23
5
Any chance to get 0.11.3 on windows soon ?
...', ['?','?','?'] => 'y', ['?','?','?'] => 'z' } class TokenFilter < TokenStream # Construct a token stream filtering the given input. def initialize(input) @input = input end end # replace accentuated chars with ASCII one class ToASCIIFilter < TokenFilter def next() token = @input.next() unless token.nil? token.text = token....
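An alternative to the hand-built character table shown in the excerpt above is to let Unicode normalization do the folding: decompose each character, then strip the combining marks. This is a plain-Ruby stdlib sketch (the `to_ascii` helper name is made up here); it handles Latin scripts but is not a general transliteration solution:

```ruby
# Fold accented Latin characters to ASCII: NFD-decompose, then
# delete combining marks (Unicode category Mn).
def to_ascii(text)
  text.unicode_normalize(:nfd).gsub(/\p{Mn}/, "")
end

to_ascii("gonzález")  # => "gonzalez"
```

Applied inside a filter's #next, this would replace the big accent-mapping hash with a single line per token.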
2006 Nov 25
5
Metaphone analysis
...algorithm over a token stream. It's a fairly simple class, but does require the 'Text' gem be installed. require 'ferret' require 'text' module Curtis module Analysis # TODO write tests! class MetaphoneFilter < Ferret::Analysis::TokenStream def initialize(token_stream, version = :double) @input = token_stream @version = version end def next t = @input.next return nil if t.nil? t.text = @version.eql?(:double) ? Text::Metaphone.double_metaphone(t.text) : Text::Metaphone.metaphone(t.text) t end en...