Displaying 19 results from an estimated 19 matches for "tokenstream".
2007 Apr 08
3
How to make custom TokenFilter?
In the O'Reilly Ferret Short Cut, I found a very useful example.
It explains how to make a custom Tokenizer.
But that book doesn't explain how to make a custom Filter
(especially, how to implement the #text=() method).
I'm a newbie and I don't understand how to create my own custom
Filter.
Are there some good source code examples?
--
Posted via
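A Ferret filter is just another token stream that wraps one: it needs a #next method that pulls a token from its input, transforms it, and returns it (nil when exhausted), and a #text= method that delegates resets to the wrapped stream. The following is a minimal pure-Ruby sketch of that protocol; Token, SimpleTokenizer, and UpcaseFilter are simplified stand-ins, not the gem's real classes (real Ferret tokens also carry offsets and position increments).

```ruby
# Stand-in for Ferret's Token: just the text.
Token = Struct.new(:text)

# A toy tokenizer that yields one Token per whitespace-separated word.
class SimpleTokenizer
  def initialize(text)
    self.text = text
  end

  def text=(text)          # resetting the text restarts the stream
    @words = text.split
  end

  def next                 # return the next Token, or nil when exhausted
    word = @words.shift
    word && Token.new(word)
  end
end

# A filter wraps another stream and transforms each token it emits.
class UpcaseFilter
  def initialize(input)
    @input = input
  end

  def text=(text)          # delegate resets to the wrapped stream
    @input.text = text
  end

  def next
    token = @input.next
    token.text = token.text.upcase if token
    token
  end
end

stream = UpcaseFilter.new(SimpleTokenizer.new("hello ferret"))
tokens = []
while (t = stream.next)
  tokens << t.text
end
# tokens == ["HELLO", "FERRET"]
```

The key detail the Short Cut leaves implicit is #text=: a filter does not hold text itself, it just forwards the reset down the chain so the whole stream can be reused.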
2005 Aug 10
1
Issues with Canoo WebTest
...hread.run(Thread.java:552)
[canoo] Enclosed exception:
[canoo] SyntaxError: illegal character (Wrapper definition for
Window.setTimeout(); line 1)
[canoo] at org.mozilla.javascript.NativeGlobal.constructError
(NativeGlobal.java:597)
[canoo] at
org.mozilla.javascript.TokenStream.reportSyntaxError(TokenStream.java:
1324)
[canoo] at org.mozilla.javascript.TokenStream.getToken
(TokenStream.java:1302)
[canoo] at org.mozilla.javascript.Parser.memberExprTail
(Parser.java:1213)
[canoo] at org.mozilla.javascript.Parser.memberExpr
(Parser.java:1204)...
2007 May 18
1
roll my own TokenFilter subclass
Hi all,
I'd like to write my own TokenStream filter (in Lucene this would be a
subclass of TokenFilter, which Ferret seems to lack) but I'm not
sure how to go about it. Specifically, it's not clear how I'd create
a non-trivial TokenStream to pass out to any filters that wrapped
mine.
Can anyone point me towards...
2015 Mar 05
3
Dovecot Full Text Search results in SolrException: undefined field text [SERIOUS]
...ype(IndexSchema.java:1269)
at org.apache.solr.schema.IndexSchema$SolrQueryAnalyzer.getWrappedAnalyzer(IndexSchema.java:434)
at org.apache.lucene.analysis.DelegatingAnalyzerWrapper$DelegatingReuseStrategy.getReusableComponents(DelegatingAnalyzerWrapper.java:74)
at org.apache.lucene.analysis.Analyzer.tokenStream(Analyzer.java:175)
at org.apache.lucene.util.QueryBuilder.createFieldQuery(QueryBuilder.java:207)
at org.apache.solr.parser.SolrQueryParserBase.newFieldQuery(SolrQueryParserBase.java:374)
at org.apache.solr.parser.SolrQueryParserBase.getFieldQuery(SolrQueryParserBase.java:742)
at org.apache.solr.pa...
2006 Sep 06
9
Which analyzer to use
Lucene's standard analyzer splits words separated by underscores.
Ferret doesn't do this. For example, if I create an index with only
the document 'test_case' and search for 'case', it doesn't find anything.
Lucene, on the other hand, finds it. The same goes for words
separated by colons.
Which analyzer should I use to emulate
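One way to emulate the Lucene behaviour is to tokenize with a pattern that treats underscores and colons as separators. The sketch below shows the core idea in plain Ruby; Ferret also ships a RegExpAnalyzer that can be given such a pattern (check your version's API docs before relying on it).

```ruby
# A token pattern that breaks on anything non-alphanumeric, so
# "test_case" yields "test" and "case" (underscore is not alnum).
TOKEN_RE = /[[:alnum:]]+/

def tokenize(text)
  text.downcase.scan(TOKEN_RE)
end

tokenize("test_case foo:bar")  # => ["test", "case", "foo", "bar"]
```

With tokens split this way at index time, a search for 'case' matches a document containing 'test_case', matching what Lucene's StandardAnalyzer does.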
2007 Apr 13
5
[Ferret] Serious memory leak on Joyent / TextDrive / Solaris
There is a serious memory leak bug in Ferret. I'm having this error on a
TextDrive Container (aka Joyent Accelerators) OpenSolaris with Ferret
0.11.4.
It happens while searching for some terms with accented or special
characters.
This makes Ferret allocate lots of memory (usually reaching 3+ GB)
and fail if another query like this is executed.
Any ideas on that, could this be locale
2006 Jun 01
8
Windows progress
Hi there,
What's the current status of the Windows port? I may be in a position
to lend a hand over the next couple of weeks - where should I start
looking? And what's the best way to get SVN HEAD? This happens:
$ svn checkout svn://www.davebalmain.com/ferret/trunk ferret
svn: Can't connect to host 'www.davebalmain.com': Connection refused
--
2006 Jun 13
5
Grep style output?
Hi All,
Hope all is going well. Was just wondering if anyone has implemented a
grep-style output page of hits using Ferret as the index/query engine?
Any thoughts about how best to implement it? The previous thread
discusses highlighting - would that be the best approach to follow, or
is there a better way?
Cheers,
Marcus
--
Posted via http://www.ruby-forum.com/.
2007 Aug 03
0
StandardTokenizer Doesn't Support token_stream method
...http://ferret.davebalmain.com/api/classes/Ferret/Analysis/StandardTokenizer.html
I ought to be able to construct a StandardTokenizer like this:
t = StandardTokenizer.new(true) # true to downcase tokens
and then later:
stream = token_stream(ignored_field_name, some_string)
to create a new TokenStream from some_string. This approach would be
valuable for my application since I am analyzing many short strings --
so I'm thinking that not having to build my 5-deep analyzer chain for
each small string will be a nice savings.
Unfortunately, StandardTokenizer#initialize does not work as advertised.
It...
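The reuse pattern the poster is after: build the filter chain once, then push each new string through it via #text= instead of constructing five wrapper objects per string. A runnable sketch with stand-in classes (not Ferret's real ones); the point is that #text= propagates the reset down the chain.

```ruby
# Innermost stream: holds the words of the current text.
class WordTokenizer
  def text=(text)
    @words = text.split
  end

  def next
    @words.shift
  end
end

# A wrapping filter: transforms tokens, forwards resets downward.
class DowncaseFilter
  def initialize(input)
    @input = input
  end

  def text=(text)          # reset propagates to the wrapped stream
    @input.text = text
  end

  def next
    t = @input.next
    t && t.downcase
  end
end

chain = DowncaseFilter.new(WordTokenizer.new)   # built once

results = ["Foo Bar", "BAZ"].map do |s|
  chain.text = s           # reuse the same chain for each string
  tokens = []
  while (t = chain.next)
    tokens << t
  end
  tokens
end
# results == [["foo", "bar"], ["baz"]]
```

For many short strings this avoids repeated object allocation; whether it helps in Ferret specifically depends on the version's support for #text= on the tokenizer, which is exactly what this thread reports as broken.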
2006 Oct 19
2
How to deal with accentuated chars in 0.10.8?
I'm starting to use Ferret and acts_as_ferret.
I need to use something like EuropeanAnalyzer
(http://olivier.liquid-concept.com/fr/pages/2006_acts_as_ferret_accentuated_chars).
For example, if the user searches for "gonzalez", they should find
documents that contain the term "gonzález".
The EuropeanAnalyzer is based on Ferret::Analysis::TokenFilter,
2004 Aug 19
1
Festival Issues
Hey All,
I now have Festival compiled, installed and running using the instructions on the Wiki page.
When I try to change the voice that is being used, however, I run into a problem. I get
the following in the Festival server log:
Cannot open file /tmp/est_10877_00000/utt.wav as tokenstream
Wave load: can't open file "/tmp/est_10877_00000/utt.wav"
Cannot load wavefile: /tmp/est_10877_00000/utt.wav
When I look in the /tmp/est_10877_00000 folder, while the sound file is still playing according
to Asterisk, the following seems to be created:
total 56
drwxr-xr-x 2 darr...
2015 Mar 05
0
Dovecot Full Text Search results in SolrException: undefined field text [SERIOUS]
...)
> at
> org.apache.solr.schema.IndexSchema$SolrQueryAnalyzer.getWrappedAnalyzer(IndexSchema.java:434)
> at
> org.apache.lucene.analysis.DelegatingAnalyzerWrapper$DelegatingReuseStrategy.getReusableComponents(DelegatingAnalyzerWrapper.java:74)
> at org.apache.lucene.analysis.Analyzer.tokenStream(Analyzer.java:175)
> at
> org.apache.lucene.util.QueryBuilder.createFieldQuery(QueryBuilder.java:207)
> at
> org.apache.solr.parser.SolrQueryParserBase.newFieldQuery(SolrQueryParserBase.java:374)
> at
> org.apache.solr.parser.SolrQueryParserBase.getFieldQuery(SolrQueryParserBase.j...
2006 Oct 20
2
Bug in search matching ?
Hi :)
Here's a little code reproducing something that I consider a bug; if
it's not, please explain :]
http://pastie.caboo.se/18693
Thanks in advance,
Cheers,
Jérémie 'ahFeel' BORDIER
--
Posted via http://www.ruby-forum.com/.
2013 Apr 05
2
Problem with fts lucene, on solaris 10
Hi all,
I'm planning to migrate my courier-imap IMAP server to Dovecot, but I'm experiencing a strange issue
with the fts-lucene plugin.
Basically, every time I start a search, the log starts filling with:
Apr 05 19:30:53 indexer: Error: Indexer worker disconnected, discarding 1 requests for XXXXXX
Apr 05 19:30:53 indexer-worker(XXXXX): Fatal: master: service(indexer-worker): child 809 killed
2006 Apr 19
2
How to do case-sensitive searches
Forgive me if this topic has already been discussed on the list. I
googled but couldn't find much. I'd like to search through text for
US state abbreviations that are written in capitals. What is the best
way to do this? I read somewhere that tokenized fields are stored in
the index in lowercase, so I am concerned that I will lose precision.
What is the best way to store a
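A common answer to this question is to index the same text into two fields: one lowercased for general search, and one case-preserving so "CA" stays distinct from "ca". The tokenizer sketch below (plain Ruby, a hypothetical helper, not Ferret's API) shows the only difference between the two fields' analysis:

```ruby
# Tokenize with or without lowercasing: the case-preserving variant
# keeps the state abbreviation "CA" searchable as written.
def tokens(text, downcase: false)
  toks = text.scan(/[[:alnum:]]+/)
  downcase ? toks.map(&:downcase) : toks
end

tokens("Moving to CA next fall")
# => ["Moving", "to", "CA", "next", "fall"]
tokens("Moving to CA next fall", downcase: true)
# => ["moving", "to", "ca", "next", "fall"]
```

Queries that need exact case go against the case-preserving field; everything else uses the lowercased one, so general recall is not sacrificed.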
2015 Mar 05
2
Dovecot Full Text Search: HTTP 500 : Unknown fieldType 'text_general' specified on field text. [SERIOUS]
...>> >>
>> >> >>
>> >> >> org.apache.lucene.analysis.DelegatingAnalyzerWrapper$DelegatingReuseStrategy.getReusableComponents(DelegatingAnalyzerWrapper.java:74)
>> >> >> at
>> >> >> org.apache.lucene.analysis.Analyzer.tokenStream(Analyzer.java:175)
>> >> >> at
>> >> >>
>> >> >>
>> >> >> org.apache.lucene.util.QueryBuilder.createFieldQuery(QueryBuilder.java:207)
>> >> >> at
>> >> >>
>> >> >>
>>...
2007 Jul 07
2
Extending/Modifying QueryParser
...top_words = stop_words
  end

  def token_stream(field, str)
    ts = StandardTokenizer.new(str)
    ts = LowerCaseFilter.new(ts) if @lower
    ts = StopFilter.new(ts, @stop_words)
    ts = SynonymTokenFilter.new(ts, @synonym_engine)
  end
end

class SynonymTokenFilter < Ferret::Analysis::TokenStream
  include Ferret::Analysis

  def initialize(token_stream, synonym_engine)
    @token_stream = token_stream
    @synonym_stack = []
    @synonym_engine = synonym_engine
  end

  def text=(text)
    @token_stream.text = text
  end

  def next
    return @synonym_stack.pop if @synonym_stac...
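The truncated #next above follows a common synonym-stack pattern: emit any queued synonyms first; otherwise pull the next token from the input and queue its synonyms. Here is a runnable sketch of that pattern, with a plain Hash standing in for the post's @synonym_engine (whose API is not shown) and strings standing in for Ferret tokens:

```ruby
class SynonymFilter
  def initialize(input, synonyms)
    @input = input           # anything with a #next returning strings
    @synonyms = synonyms     # e.g. { "fast" => ["quick", "rapid"] }
    @stack = []
  end

  def next
    # Drain queued synonyms before advancing the wrapped stream.
    return @stack.pop unless @stack.empty?
    token = @input.next
    return nil if token.nil?
    @stack.concat(@synonyms.fetch(token, []))
    token
  end
end

# Minimal input stream backed by an array.
ArrayStream = Struct.new(:words) do
  def next
    words.shift
  end
end

filter = SynonymFilter.new(ArrayStream.new(["fast", "car"]),
                           { "fast" => ["quick", "rapid"] })
out = []
while (t = filter.next)
  out << t
end
# out == ["fast", "rapid", "quick", "car"]
```

In a real Ferret filter the queued synonyms should also be given a position increment of zero so they occupy the same position as the original token; this sketch omits position handling.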
2007 Mar 23
5
Any chance to get 0.11.3 on windows soon ?
...',
  ['?','?','?'] => 'y',
  ['?','?','?'] => 'z'
}

class TokenFilter < TokenStream
  # Construct a token stream filtering the given input.
  def initialize(input)
    @input = input
  end
end

# replace accentuated chars with ASCII ones
class ToASCIIFilter < TokenFilter
  def next()
    token = @input.next()
    unless token.nil?
      token.text = token....
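An alternative to a hand-written character map like the one above is Unicode normalization: decompose to NFD and strip the combining marks. A runnable sketch of just the filter's core transform (plain strings instead of Ferret Token objects):

```ruby
# Decompose accented characters (á -> a + combining acute), then
# remove the combining marks, leaving plain ASCII letters.
def fold_accents(text)
  text.unicode_normalize(:nfd).gsub(/\p{Mn}/, "")
end

fold_accents("gonzález")  # => "gonzalez"
```

This covers most Latin accents without maintaining a map, though characters with no decomposition (ø, ß, æ) still need explicit handling.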
2006 Nov 25
5
Metaphone analysis
...algorithm over a token stream. It's a fairly simple class, but does
require the 'Text' gem be installed.

require 'ferret'
require 'text'

module Curtis
  module Analysis
    # TODO write tests!
    class MetaphoneFilter < Ferret::Analysis::TokenStream
      def initialize(token_stream, version = :double)
        @input = token_stream
        @version = version
      end

      def next
        t = @input.next
        return nil if t.nil?
        t.text = @version.eql?(:double) ?
          Text::Metaphone.double_metaphone(t.text) :
          Text::Metaphone.metaphone(t.text)
      end
    en...