Displaying 20 results from an estimated 1000 matches similar to: "RDig - ferret-based website crawler/indexer"
2006 Jul 25
1
RDig document processing error
Hi all,
Am having problems using RDig:
With this rdig config...
cfg.crawler.start_urls = ['http://www.defensetech.org']
cfg.crawler.include_hosts = ['www.defensetech.org']
cfg.index.path = '/my/path/to/index'
cfg.verbose = true
...I get this output:
$ rdig -c config/rdig_config.rb
/usr/local/lib/site_ruby/1.8/ferret/index/term.rb:45:
2006 Jul 14
2
RDig config file problem
Hi All,
Hope it is ok to post RDig queries on this forum.
Just trying to get RDig working (Ubuntu 6.06, RDig 0.3.0, ferret 0.9.4,
rubyful_soup 1.0.4)
Here is my output:
sh:~/rdigtry$ rdig -c config/rdig_config.rb
discovered content extractor class:
RDig::ContentExtractors::PdfContentExtractor
discovered content extractor class:
RDig::ContentExtractors::WordContentExtractor
discovered
2007 Jan 05
1
adding one url to rdig index?
Hey there,
I'm building a rails site using RDig as a site-wide search. I would like to be able to add just one URL (or possibly a list) to an existing index, so that when certain pages change I can update the index without reindexing the entire site. I looked through the documentation and didn't see an example on how to do this so I am looking for some guidance here :). Is
2007 Jan 23
3
Someone getting RDig work for Linux?
I got this
root@linux:~# rdig -c configfile
RDig version 0.3.4
using Ferret 0.10.14
added url file:///home/myaccount/documents/
waiting for threads to finish...
root@linux:~# rdig -c configfile -q "Ruby"
RDig version 0.3.4
using Ferret 0.10.14
executing query >Ruby<
Query:
total results: 0
root@linux:~#
my configfile
I changed from config to cfg, because of maybe
2007 Sep 18
4
basic rdig setup
I'm developing locally on Windows and I have a remote dev box that runs
Linux. I'm trying to use RDig just to index using urls, no files.
Both use acts_as_ferret for an administrative search that works fine.
On the Windows machine, I get no errors, but get no results.
On the Linux machine, I get:
File Not Found Error occured at <except.c>:93 in xraise
Error occured in
2007 Jul 29
7
RDig and AAF playing together
I have a site with two indexes. Index A is created offline by RDig
and queried from the web via RDig (specifically,
RDig.searcher.search). Index B is managed by AAF with :remote =>
true. Simple enough. However, I need to query both indexes from RDig.
Usually this is ok, as I modified RDig to accept an array of
search_paths with an element for index A and index B.
However, when Index
2007 Sep 27
2
Problem getting "extract" from RDig
Hi All,
I have to have a site wide search for my current application. By search
I mean I have to search the static and the dynamic contents from the
database. I have been searching on this for a while on the net and RDig
seems to be an apt solution. While using it I have encountered a few
problems. I know these might be very basic issues but I have not been
able to figure out what is wrong with
2007 Jan 21
4
could not install in WinXP
Directory of C:\search_app
01/21/2007 19:37 <DIR> .
01/21/2007 19:37 <DIR> ..
01/21/2007 19:36 427 008 ferret-0.10.13.gem
01/21/2007 19:07 148 992 rdig-0.3.4.gem
2 File(s) 576 000 bytes
2 Dir(s) 45 135 982 592 bytes free
C:\search_app>gem install ferret
Building native extensions. This could
2007 Feb 10
5
Adding extra fields to an index (using RDig?)
Hello everyone,
I am writing an application which collects a set of web sites and caches
them locally for offline viewing. I want to do searches on this
collection and associate extra data with each result (e.g. date
collected, reason for collection, perhaps a sequence number).
Now all this data exists when the harvesting is done and could be stored
in a database. I want to use RDig to index my
2007 Feb 15
3
Proximity searching in rdig ferret
Lucene has a syntax "foo bar"~10 for finding foo within 10 words of bar.
Does ferret support this feature? (the ~ is used for fuzzy queries) Does
rdig?
This could be a deal breaker for me 'cos I really need proximity
searches
--
Posted via http://www.ruby-forum.com/.
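
To make the question above concrete, here is a minimal sketch of what "foo within 10 words of bar" means. It is plain Ruby and does not call Ferret or RDig; `within_distance?` is a hypothetical helper introduced only for illustration, and it uses a simple word-position difference rather than Lucene's exact slop calculation.

```ruby
# Pure-Ruby illustration of a proximity match; it does not use Ferret or RDig.
# within_distance? is a hypothetical helper, not part of either library.
def within_distance?(text, first, second, max_distance)
  # Split the text into lowercase word tokens.
  words = text.downcase.scan(/\w+/)
  # Collect every position at which each term occurs.
  first_positions  = words.each_index.select { |i| words[i] == first.downcase }
  second_positions = words.each_index.select { |i| words[i] == second.downcase }
  # Match if any pair of occurrences is at most max_distance positions apart.
  first_positions.any? do |i|
    second_positions.any? { |j| (i - j).abs <= max_distance }
  end
end

sample = "foo is some distance away from bar"
puts within_distance?(sample, "foo", "bar", 10)
puts within_distance?(sample, "foo", "bar", 3)
```

Whether Ferret's query parser accepts the same `"foo bar"~10` syntax is exactly what the post asks, so the sketch only pins down the behaviour being requested.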
2007 Jun 23
2
End of File Error on index optimize
I was optimizing a 650MB index using ferret (0.11.3) and I received the
following error. I've seen some people have similar issues but I
haven't seen any resolutions. The contents of the index directory
follow the error. Has anyone seen anything like this and found a
resolution? Many thanks.
/mnt/apps/search/releases/20070622175637/script/../config/../vendor/
2006 Aug 24
2
acts_as_ferret for Ferret 0.10
Hi all,
the current acts_as_ferret trunk is now ported to Ferret 0.10.
Get it while it's hot at
svn://projects.jkraemer.net/acts_as_ferret/trunk/plugin
Nearly everything works, besides this:
- all queries are ORed (no way to tell the QueryParser to build AND
queries by default)
- more_like_this is broken
I'm working with Dave to fix these things soon. The last Ferret 0.9.x
2006 Sep 12
1
options hash ignored when searching multiple readers
Hi,
I'm working on an aaf bug report that led me to what I think is a bug
in Ferret itself. The snippet at
http://pastie.caboo.se/12950
shows the problem, the last two lines should imho only return one
result, because of :offset => 1 or :limit => 1, but both return all
(that is, 2) results (Ferret 0.10.4).
Cheers,
Jens
--
webit! Gesellschaft für neue Medien mbH
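
The behaviour the post above expects from `:offset` and `:limit` can be sketched in plain Ruby. This does not use Ferret: `page` is a hypothetical helper, a plain array stands in for the hits merged from two readers, and the default limit of 10 is an arbitrary assumption.

```ruby
# Plain-Ruby sketch of the expected paging semantics; no Ferret calls here.
# page is a hypothetical helper, and the array stands in for merged hits.
def page(hits, offset: 0, limit: 10)
  # Array#[] with (start, length) returns nil when start is past the end,
  # so fall back to an empty array in that case.
  hits[offset, limit] || []
end

hits = ["doc from reader A", "doc from reader B"]
p page(hits, offset: 1)  # skips the first hit
p page(hits, limit: 1)   # keeps only the first hit
```

With two hits in total, either option alone should leave exactly one result, which is why getting both back looks like a bug to the poster.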
2006 Jul 10
2
acts_as_ferret 0.2.2
Hi all,
I just tagged acts_as_ferret 0.2.2 as the current stable version, so get
it while it's hot ;-)
new features:
- added support for the multiple models/single index approach.
- find out the total number of search results by calling total_hits on
the array returned by find_by_contents.
fixes:
- trac tickets #20 (find_by_contents breaks ferret sorting) and #24
2006 Sep 19
1
Mongrel 0.13.3.4 Debian packages
Hi folks,
just wanted to let you know that I've updated the Mongrel Debian/Sarge
packages. please see that blog post [1] for install instructions.
If you already got them installed before, updating should be as easy as
apt-get update
apt-get upgrade
btw: Zed, great work :-)
cheers,
Jens
[1]
http://www.jkraemer.net/2006/7/7/mongrel-apache-and-rails-on-debian-sarge
--
webit!
2005 Oct 26
1
index compatibility
Hi,
first of all: great work!
I'd like to know which Lucene Version Ferret is based on, in other
words: will I be able to read/write indexes created with current lucene
trunk ?
Thanks in advance,
Jens
--
webit! Gesellschaft für neue Medien mbH www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de
Schnorrstraße 76 Telefon +49 351
2006 Aug 04
4
Ruby/Gtk Luke port
Hi all,
some days ago I wrote that I once had started porting Luke to Ferret
with Ruby/Gtk. I just dug out those sources and put them under version
control.
It's far from finished and my first Gtk program, but might be a good
start anyway.
the code is available at
svn://projects.jkraemer.net/inspector/trunk/
If anybody wants to contribute, I'll be glad to grant commit rights.
2006 May 15
16
Ferret not able to read a Lucene Index?
Hi all,
Having problems trying to get Ferret to read an index generated by
Lucene.
Am I right in thinking Ferret should be able to read a Lucene generated
index no problem?
Using the code snippets detailed in
http://www.ruby-forum.com/topic/64099#new
Any advice gratefully received.
Many Thanks,
Steven
--
Posted via http://www.ruby-forum.com/.
2007 Feb 15
0
rdig wildcard searches
Lucene has simple wildcard syntax supporting ? and * thus ruby could be
matched by rub? r*by etc.
This doesn't work using rdig on the command line
e.g. rdig -c config.rb -q 'data:"ru?y"' gives
RDig version 0.3.4
using Ferret 0.10.14
executing query >data:"ru?y"<
Query: data:"ru y"~1
which is something entirely different. The
2006 Sep 09
2
acts_as_ferret 0.3.0
Hi,
just wanted to officially announce the release of acts_as_ferret 0.3.0.
As you see, I'm trying to catch up with Ferret's version numbers ;-)
svn://projects.jkraemer.net/acts_as_ferret/tags/0.3.0/
or
svn://projects.jkraemer.net/acts_as_ferret/tags/stable/
This release is now tagged stable, so in case anybody has used the old
stable release via an svn external, please