Displaying 20 results from an estimated 400 matches similar to: "omindex killed"
2013 Apr 05
1
problems with indexing xlsx files
Hello,
I have a number of Excel .xlsx files that aren't indexed properly. To illustrate, I have a file called "this is a test.xlsx". It consists of four cells:
| this |
| is |
| a |
| test |
It gets indexed but I am unable to search for it.
I was able to determine the index number and use delve to see the term list:
#delve users -r 16496
Term List for record #16496:
2012 Dec 20
1
omega with samba share
Hello,
I have installed and configured omega to index samba shares on a debian server. I would like to know if it's possible to change the HREF links on the search results page to begin with "file://" instead of "http://".
I had a look at the template files and there was no obvious solution that I could see.
Thanks.
--
Chris Purves
Visit my blog: http://chris.northfolk.ca
2012 Dec 30
1
combining databases for omega
From the documentation I've read, omega can read from multiple
databases, but I'm not sure how to go about this.
I have three databases created using omindex, currently located at
/var/lib/xapian-omega/data/share, /var/lib/xapian-omega/data/users, and
/var/lib/xapian-omega/data/management
The quickstart guide says that in omega.conf database_dir should point
to the directory
2013 Apr 16
0
confusion about term prefixes
I am confused about using term prefixes for omega searches. There are a number of term prefixes that are reserved and used by omindex. In order to use those for searching with omega, do I need to use the $setmap{} function in the omega template or are the reserved ones built in?
--
Chris Purves
Visit my blog: http://chris.northfolk.ca
"I can't have a lobotomy just because I've
2009 Jul 15
2
XAPIAN_FLUSH_THRESHOLD
I'm playing around with a machine that has 2 GB of memory.
I'm indexing about 5 GB of data, averaging 2 MB per document.
The documents are plain text.
I notice that omindex's memory footprint gets bigger and bigger, then the
machine starts to swap and it all slows down to a crawl.
In regards to export XAPIAN_FLUSH_THRESHOLD I know the default is 10000
Am I right in saying that for my setup
2007 Jul 17
1
BUG IN XAPIAN_FLUSH_THRESHOLD
There is a bug when setting XAPIAN_FLUSH_THRESHOLD=20000000.
When trying to force Xapian to flush only after 20 million
documents, Xapian ignores the setting and flushes after only 10,000
documents.
Data captured from delve at 60-second intervals, with the threshold set as follows:
XAPIAN_FLUSH_THRESHOLD=20000000
perl -e ' while(1) { system("delve ."); sleep(60); } '
2012 Dec 13
1
omindex one file at a time?
Hi, all -- I want to do Plain Old Omindex'ing *but* the mapping
between my documents' filenames and the URLs where I hope search
users to find them is, uh..., strange. The simplest thing (to
me) would be to run omindex for each document, e.g.
omindex --no-delete -U /cool-url-1 /funky/doc/file-blah.pdf
omindex --no-delete -U /cool-url-7 /doc/funky/ohmy/blah-file.txt
... and so on...
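The one-omindex-run-per-document idea above could be scripted. What follows is a minimal sketch in Python, not a tested recipe: `URL_MAP`, `build_commands` and `run_all` are hypothetical names introduced here for illustration, the commands use only the `--no-delete` and `-U` options quoted in the post, and whether omindex accepts a single file as its target is an assumption carried over from the post.

```python
import subprocess

# Hypothetical mapping from document path to the URL it should be indexed under.
URL_MAP = {
    "/funky/doc/file-blah.pdf": "/cool-url-1",
    "/doc/funky/ohmy/blah-file.txt": "/cool-url-7",
}

def build_commands(url_map):
    """Return one omindex argument list per document, mirroring the post's commands."""
    return [["omindex", "--no-delete", "-U", url, path]
            for path, url in url_map.items()]

def run_all(url_map, dry_run=True):
    """Print each command; only execute it when dry_run is False."""
    for cmd in build_commands(url_map):
        print(" ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError if omindex exits non-zero.
            subprocess.run(cmd, check=True)

if __name__ == "__main__":
    run_all(URL_MAP)
```

The default `dry_run=True` only prints the commands, so the mapping can be reviewed before anything touches the database.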
2005 Mar 31
1
omindex and scriptindex question
Hi,
I was researching indexing of text in omindex and scriptindex.
When indexing text with omindex.cc, the positions of terms are saved with a gap.
This does not happen with scriptindex.cc.
Why is this happening?
Another question is why in omindex.cc the term position starts at 0 while
in scriptindex it starts at 1?
Code snippet from omindex.cc
// Add postings for terms to the document
2011 Oct 18
2
patch proposal: omindex library or daemon
Olly (looking at commit logs, I think this is your dept :-)
For apps which re/index files frequently and need format conversion, I'd
like to propose a patch for one of...
Omindex library (thread safe):
Omindex::init(options) // struct Omindex::options { ... }
initialize mime_map, store default options
session = new Omindex::Session(db_pathname)
user threads use different sessions
2010 Dec 15
2
excluding child folders in omindex search
hi there,
is there an option to exclude child folders when running omindex?
For example:
omindex -p --db /var/blah/default --url /something /var/www --exclude /var/www/ignore
Thanks,
Jeff
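The `--exclude` flag in the command above is the poster's wished-for option, not something confirmed here. One possible workaround, sketched in Python under that assumption, is to walk the tree yourself and skip the unwanted folders; `files_outside` is a hypothetical helper name, and what is then done with the resulting file list is left open.

```python
import os

def files_outside(root, excluded):
    """Yield files under root, skipping the excluded directories and their children."""
    excluded = {os.path.abspath(p) for p in excluded}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune excluded directories in place so os.walk never descends into them.
        dirnames[:] = [d for d in dirnames
                       if os.path.abspath(os.path.join(dirpath, d)) not in excluded]
        for name in sorted(filenames):
            yield os.path.join(dirpath, name)

if __name__ == "__main__":
    # Paths taken from the example in the post.
    for path in files_outside("/var/www", ["/var/www/ignore"]):
        print(path)
```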
2009 Feb 02
2
Ticket #282: omindex-assorted-enhancements.patch woes
I would really like to try out the features in the patch above. But I
can't ever seem to get the resulting omindex.cc to "make".
I tried updating to rev 10801 from the SVN then run /bootstrap but then
I seem to get errors compiling everything when I try and do "make" (I'm
using ubuntu 8.10).
So I thought I'd try to apply the patch to the latest stable version
2013 May 15
1
How to omindex some sub-directories?
Given a directory tree like ...
/foo
 |
 +-- A
 |
 +-- B
 |
 +-- C
... what is the best way to index A and C into a single Xapian database?
AFAIK the alternatives are:
omindex --db /my_db --no-delete /foo /foo/A
omindex --db /my_db --no-delete /foo /foo/B
or
omindex --db /my_A_db /foo /foo/A
omindex --db /my_B_db /foo /foo/B
xapian-compact /my_A_db /my_B_db /my_db
The first alternative does not
2004 Oct 08
1
indexing performance
I have some trouble with my indexer, which builds on simpleindex.cc. The problem
is that the indexing process becomes very slow after we have indexed 2000k docs (though
the indexer works quite well with the first 2000k docs). It took almost three
weeks to index 8 million docs. However, we need to index about 20 million
docs. I had to stop the indexer due to its performance.
I think my question is
2012 Nov 21
1
about index speed of xapian
hi,
I use Xapian to index a text file whose size is 268M. I take each line as a document, and each line has two fields, like 13445511 | 111115151. The record count is 10000000, and XAPIAN_FLUSH_THRESHOLD is set to 1000000. It takes 1026544 ms to index the file (roughly 9,700 records per second), which is much slower than Lucene. Lucene's speed is about 40000 records per second.
code:
try
{
Xapian::WritableDatabase
2009 Jun 20
3
omindex hangs while scanning
Hello,
I was looking for a search engine for a small internal documentation
site and found Xapian and Omega. I downloaded and compiled it using MSYS
and MinGW on a German Windows XP system. Finally, I installed Apache on
the same box.
Following the Omega example, I copied the book to .../apache/htdocs and
started omindex, which hung on the first document found. Even on very short doc with
2006 Oct 02
1
Omindex.cc BSD bug
Hi guys:
I was trying to index a large set of PDF documents using omindex
and the system started to run out of forks (sh: fork temporarily
unavailable) making the system unusable and probably skipping documents.
I'm using Mac OS X Server 10.4.3 (Darwin/BSD) and GCC 4.0.
The problem: in the function stdout_to_string, popen is called but is not
closed properly (according to the popen
2006 Aug 20
1
omindex patch
Attached is my rather largish omindex.cc patch with ChangeLog.
It needs autoreconf to update configure and the Makefiles.
Note that unrar is not patent infected, only rar, the compressor.
I've put some AC_PATH_PROG checks into configure for all helpers.
The patch is not yet complete.
2006-08-18 15:13:32 Reini Urban <reinhard.urban at avl.com>
omega-0.9.6b:
* omindex.cc: last_mod as
2009 Apr 06
2
omindex => Unknown extension
Hi all,
I'm having a recurrent problem with Omega's indexing.
When I run omindex, it sometimes fails to recognize the extension of
some files (.doc, .pdf) and skips them. In the same run, omindex is
otherwise perfectly able to index other files with same extensions. The
reason is not clear but it should occur before it selects a content
converter since for example, if I manually run
2017 Apr 20
2
Question about the ticket #743 omindex: delay libmagic checks
Hi,
I'm working on the ticket #743 omindex: delay libmagic checks
<https://trac.xapian.org/ticket/743>. As the ticket's
description mentions, a call to libmagic is more expensive than a call to stat,
so we can check the size by calling stat before calling
libmagic to get a MIME type.
But what about the timestamp check? Since the timestamp check needs to iterate
the DB to check if
2004 Dec 10
0
Omindex and symlinks
I've just been playing with using omindex to build an index of the
documentation in /usr/share/doc on my Debian workstation. It's all
working pretty well - indexing took only a few minutes, reindexing
appears to work acceptably, and the search results are okay. The main
problem I'm seeing is a lot of duplicate results due to directory
symlinks. Omindex's current behaviour is to