Displaying 20 results from an estimated 200 matches similar to: "Dealing with image PDF's"
2009 Feb 03
1
PowerPoint 2007 filter
Hi,
I'm trying to write the PowerPoint2007 filter in the same manner that I
did for *.docx and *.xlsx, but I'm getting the following error when I run
an index.
The document is called:
Indexing "/Frisk in Power Point.pptx" as
application/vnd.openxmlformats-officedocument.presentationml.presentation ...
caution: filename not matched: ppt/notesSlides/notesSlide*.xml
caution:
2009 Feb 02
2
Ticket #282: omindex-assorted-enhancements.patch woes
I would really like to try out the features in the patch above, but I
can never seem to get the resulting omindex.cc to "make".
I tried updating to rev 10801 from SVN and then running ./bootstrap, but then
I seem to get errors compiling everything when I try to do "make" (I'm
using Ubuntu 8.10).
So I thought I'd try to apply the patch to the latest stable version
2006 Oct 02
1
Omindex.cc BSD bug
Hi guys:
I was trying to index a large set of PDF documents using omindex
and the system started to run out of forks (sh: fork temporarily
unavailable), making the system unusable and probably skipping documents.
I'm using Mac OS X Server 10.4.3 (Darwin/BSD) and GCC 4.0.
The problem: in the function stdout_to_string, popen is called, but the
stream is not closed properly (according to the popen
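
For illustration, here is a minimal sketch of the pattern the post describes: a stream opened with popen should be closed with pclose, which waits for the child process so that it does not linger and use up process slots. This is not the actual omindex.cc source. The function body, buffer size, and error handling are assumptions; only the name stdout_to_string comes from the post.

```cpp
#include <stdio.h>    // popen, pclose (POSIX)
#include <stdexcept>
#include <string>

// Hypothetical stand-in for omindex's stdout_to_string (not the real code):
// run a shell command and return everything it writes to stdout.
std::string stdout_to_string(const std::string& cmd) {
    FILE* pipe = popen(cmd.c_str(), "r");
    if (!pipe) throw std::runtime_error("popen failed");

    std::string out;
    char buf[4096];
    size_t n;
    while ((n = fread(buf, 1, sizeof buf, pipe)) > 0) {
        out.append(buf, n);
    }

    // pclose (rather than fclose) waits for the child to exit, so the
    // process is reaped instead of being left behind as a zombie.
    if (pclose(pipe) == -1) throw std::runtime_error("pclose failed");
    return out;
}
```

Calling stdout_to_string("echo hello") should return the command's output as a string, and repeated calls should not accumulate child processes.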
2009 Jul 15
2
XAPIAN_FLUSH_THRESHOLD
I'm playing around with a machine that has 2 GB of memory.
I'm indexing about 5 GB of data, averaging 2 MB per document.
The documents are plain text.
I notice that omindex's memory footprint gets bigger and bigger, then the
machine starts to swap and it all slows down to a crawl.
With regard to export XAPIAN_FLUSH_THRESHOLD, I know the default is 10000.
Am I right in saying that for my setup
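
As a rough back-of-the-envelope check of the figures in this post, the sketch below assumes the threshold counts documents (as the default of 10000 suggests) and takes 5 GB as 5 x 1024 MB. The function names are hypothetical, and the smaller threshold in the usage note is only an illustrative value, not a recommendation.

```cpp
#include <cstdint>

// Approximate number of documents in a corpus of total_mb megabytes
// when each document averages avg_doc_mb megabytes.
constexpr std::uint64_t doc_count(std::uint64_t total_mb, std::uint64_t avg_doc_mb) {
    return total_mb / avg_doc_mb;
}

// How many times the threshold would be reached while adding n_docs
// documents, assuming one automatic flush per `threshold` documents.
constexpr std::uint64_t auto_flushes(std::uint64_t n_docs, std::uint64_t threshold) {
    return n_docs / threshold;
}
```

Under these assumptions, doc_count(5 * 1024, 2) gives 2560 documents, and auto_flushes(2560, 10000) is 0, so the default threshold would never be reached during the run. An illustrative threshold of 500 would be reached 5 times.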
2009 Apr 29
1
"DatabaseCorruptError: Cannot open tables at consistent revisions"
Occasionally when I'm searching using Omega I get:
"DatabaseCorruptError: Cannot open tables at consistent revisions"
If I click reload it's all OK. Is this the database being updated? Is
there a way to avoid the message?
Frank
2009 Feb 04
2
wildcard support (left truncation)
Does Xapian support wildcards (left truncation)?
E.g. *ildcard.doc or *.doc or Wild*.doc
I read a post from Olly in 2005 that said it wasn't supported yet. I was
wondering if there had been any progress or an easy workaround since.
I mainly need this when users want to search by filename extension.
Thanks,
Frank
2009 May 19
4
omindex options
Hi.
I am writing a Python equivalent of omindex (we are using scriptindex
currently, but I wanted to use omindex instead and extend it to work with
our internal file format, BUT did not want to compile code if possible...
so anyway).
I have tried to keep the code as close as possible to the omindex native
code, but am facing a bit of confusion: what exactly is the reason for
omindex to take
2013 Nov 08
2
Problems with dovecot 2.1.7, spamassassin 3.3.2 and antispam plugin
This might be a fairly long message, but I wanted to be sure to include
as much information as possible. I'm having an issue with the
dovecot-antispam plugin in that it seems to be unable to successfully
run anything from the pipe backend. To qualify that, they run, but they
fail ...
Running /usr/bin/sa-learn directly always returns with an error code of
1, and the bayes DB isn't
2009 Apr 29
1
antiword
Hi guys,
I've been noticing more and more that antiword has trouble with many
Word documents.
It may look like it has converted a document, but it leaves out headings and
bits of text.
I've been looking into getting OpenOffice to do it in headless mode but
still have a way to go before it's stable.
I was wondering if anyone else had had any luck on this front?
One quick fix I have found
2009 Jul 08
1
php error parse_query
I'm having trouble getting a search via PHP working. I get the following
error:
Fatal error: No matching function for overloaded
'QueryParser_parse_query' in /usr/local/share/php5/xapian.php on line
1409
The error occurs at this code:
$query = $qp->parse_query( $query_string ,
XapianQueryParser::FLAG_PHRASE
|
2003 Aug 09
2
First steps towards a simple text stream format.
Hello everyone!
This list may not be entirely appropriate for this discussion, but in the absence of
ogg@xiph.org or ogg-dev@xiph.org this will have to do.
I've been thinking for a few weeks that Ogg needs a simple text stream
(read: subtitle) format to go along with Theora. This is important,
because otherwise I can't transcode The Fellowship of the Ring while
keeping the Elvish-speak, unless I render
2011 Jun 14
4
How to convert Image To Text in RoR
Hello All
I have a large number of scanned notes that I need to convert into text, which the user
can then download as a PDF. How can this be done in RoR?
Please help
Thanks in advance :)
2002 Feb 07
1
[Fwd: Re: meaning of "IO Error: skipping the delete...."]]
Nitin Agarwal <nitin.agarwal@timesgroup.com> wrote:
> Dear Mr. Rusty,
> Thanks for the reply. The problem was sorted out by changing the uid option in
> rsyncd.conf file to root.
> We are facing two more problems now....
> 1) while transferring the files, sometimes the transfer breaks in between and gives
> us the error message: "readerror: connection reset by
2002 Feb 20
3
importing images
I would like to import "tif" images into R and I cannot find any
function that can do that. In Matlab there is the function "imread"
that can read most common image formats. Does a similar function
exist for R?
Thanks in advance
--
Herve CARDOT
____________________________________________________________
Unite Biometrie et Intelligence Artificielle, INRA Toulouse
BP
2008 Jul 29
1
Flax support for docx
Anyone used Flax to search for docx?
I'm testing on XP.
I installed msxml6.msi and FilterPackx86.exe but Flax doesn't seem to
register *.docx as a document type.
Is there a way for me to add it in?
Thanks,
Frank
2009 Apr 23
1
PHP Total document
I was also wondering if someone could tell me how to extract the total
number of documents contained in a database via PHP.
Thanks,
Frank
2009 Feb 09
1
last_mod performance
I was just wondering if anyone knew what the performance increase was
from using last_mod to scan only altered files.
I understand it all depends on how busy the site is and how many
files change, but typically would you say it's a feature that
you would use because of a dramatic increase in performance?
Thanks,
Frank
2009 Apr 23
1
Expanding the search in PHP
I tried using the simpleexpand.php from
http://xapian.org/docs/bindings/php/examples/simpleexpand.php5
I get different results between PHP and the Omega expand (see below),
I'd like to have the same functionality in PHP.
Could anyone suggest how to do it? Is there an example I could use?
Thanks,
Frank
And got the following results from PHP:
Zdefin: weight = 46.963883268652
Zconfigur:
2008 Feb 06
3
poppler-utils missing pdftoppm
I found a thread here about this problem, but no answer or resolution as
to whether it's a bug, or even something that can be fixed.
<http://www.centos.org/modules/newbb/viewtopic.php?topic_id=12114&forum=38>
I'm trying to get PDF support under DocMGR to work correctly and it's
complaining about pdftoppm being missing.
A bit
2007 Sep 12
3
Document Scanning and Storage
I'd like to start scanning our boxed-up documents. I'd say about 30,000
files total.
Mostly to eliminate the boxes of paper we have.
I'd like to scan them, store them, have some sort of index, and be able to
retrieve them on multiple machines. I think PDF would be the desired format.
I'd like to be able to set some permissions as well (not a deal breaker...).
I've searched