Displaying 20 results from an estimated 1000 matches similar to: "Gsoc-2013"
2012 Mar 31
1
Project: Posting list encoding improvements
Hi Xapianers:
My name is Weixian Zhou, a Computer Science student at the University at
Buffalo, State University of New York. I am interested in the posting
list encoding improvements and weighting schemes projects. I have some
questions about them.
1) After reading the comments in brass_postlist.cc, I am still not very clear
about the detailed structure of the postings list. If you can provide some
simple
2013 Jun 19
2
Compact databases and removing stale records at the same time
On Wed, Jun 19, 2013, at 03:49 PM, Olly Betts wrote:
> On Wed, Jun 19, 2013 at 01:29:16PM +1000, Bron Gondwana wrote:
> > The advantage of compact - it runs approximately 8 times as fast (we
> > are CPU limited in each case - writing to tmpfs first, then rsyncing
> > to the destination) and it takes approximately 75% of the space of a
> > fresh database with maximum
2009 Jan 12
2
error message
Hello,
I am having problems getting one X-Lite client to communicate through
Asterisk. I am getting an error message:
chan_sip.c:15593 handle_request_register: Registration from '"chinmay
chakraborty"<sip:1234@10.44.32.193>' failed for '10.44.32.193' - No
matching peer found
sip show peers
Name/username Host
2023 May 03
1
manual flushing thresholds for deletes?
Olly Betts <olly at survex.com> wrote:
> On Mon, Mar 27, 2023 at 11:22:09AM +0000, Eric Wong wrote:
> > Olly Betts <olly at survex.com> wrote:
> > > 10 seems too long. You want the mean word length weighted by frequency
> > > of occurrence. For English that's typically around 5 characters, which
> > > is 5 bytes. If we go for +1 that's:
2023 Mar 27
1
manual flushing thresholds for deletes?
On Mon, Mar 27, 2023 at 11:22:09AM +0000, Eric Wong wrote:
> Olly Betts <olly at survex.com> wrote:
> > 10 seems too long. You want the mean word length weighted by frequency
> > of occurrence. For English that's typically around 5 characters, which
> > is 5 bytes. If we go for +1 that's:
>
> Actually, 10 may be too short in my case since there's a
2023 May 03
1
manual flushing thresholds for deletes?
On Wed, May 03, 2023 at 12:38:15PM +0000, Eric Wong wrote:
> Olly Betts <olly at survex.com> wrote:
> > This will also effectively ignore boolean terms, assuming you're giving
> > them wdf of 0 (because $3 here is the collection frequency, which is
> > sum(wdf(term)) over all documents).
>
> Should boolean terms be ignored when estimating flushing
>
2006 Apr 08
5
Sending email on Windows
Hi!
I'm developing a different kind of Rails app. It's an email client
*built on RoR* which users can download and use. It'll look and feel
like a standard webapp, except that the server will be run locally
(maybe WEBrick). Basically, the idea is that:
* Web based apps are easier to upgrade, use etc.
* But people need to have their email and other stuff on
2006 Mar 06
1
Model for Tracking changes
Hi!
I'm new to Rails (and Ruby in general). I was planning to implement a
whiteboard-like system, in which changes made by users to the board are
automatically tracked.
Currently, I've implemented it as a model Post which :has_many Boards. A
new Board is added to the Post every time a change is submitted (with a
created_at timestamp). This allows saving multiple versions of
2010 Jan 18
3
postlist: Tag containing meta information is corrupt.
Greetings,
Using latest svn.
I've noticed the following error when performing index merging:
postlist:
baseB blocksize=8K items=33962 lastblock=534 revision=1 levels=2 root=459
B-tree checked okay
Tag containing meta information is corrupt.
postlist table errors found: 1
I can still search on this index (I've only checked very small indexes),
but merging is now a problem since I check
2013 Mar 09
0
gsoc-13
Hi,
I am Chinmay Naik, an undergraduate in Computer Science at Bangalore
Institute of Technology, Bangalore.
2007 Feb 09
1
Fetching document content by Q term in Python
Hello,
I'd like to be able to retrieve the index's stored copy of the document
text and tried the following:
terms = self.db.allterms()
terms.skip_to('Q' + uri.encode('utf-8'))
term = terms.next()
doc = self.db.get_document(term[1])
print doc.get_data()
I just wildly guessed that [1] was the docid, but of course it isn't. So the
question is, how do I
2017 May 22
2
Xapian 1.4.3 "Db block overwritten - are there multiple writers?"
Olly Betts writes:
> On Wed, May 17, 2017 at 09:08:32PM +0200, Jean-Francois Dockes wrote:
> > I have a user reporting the following error during recoll indexing:
> >
> > flush() failed: Db block overwritten - are there multiple writers?
> >
> > "flush() failed" is from recoll, the rest is, I think the text of the Xapian
> > exception.
2017 May 17
2
Xapian 1.4.3 "Db block overwritten - are there multiple writers?"
Hi,
I have a user reporting the following error during recoll indexing:
flush() failed: Db block overwritten - are there multiple writers?
"flush() failed" is from recoll, the rest is, I think the text of the Xapian
exception.
This is with Xapian 1.4.3 on Linux (I asked for more details, should be
coming).
I don't think that I've ever seen this error, and I also
2018 Jan 03
2
Storing the documents text: data record or value ?
Hi,
Following the Recoll snippets generation performance problem caused by the
new positions list storage scheme in Xapian 1.4, I am experimenting with
generating snippets from the complete document text stored in the index.
This increases the index size much less than I would have expected (around
10-15% apparently with my home directory data), which is good news
obviously.
I have tried
2020 Aug 25
2
MultiDatabase shard count limitations
Olly Betts <olly at survex.com> wrote:
> On Mon, Aug 24, 2020 at 05:58:02AM +0000, Eric Wong wrote:
> > Olly Betts <olly at survex.com> wrote:
> > > Can prof report time for a function including things it calls?
> >
> > callgraph? Attached is a profile the output of "perf report -g"
> > with callgraph info. I'm no perf expert,
2007 Jul 13
3
THANK YOU: Updating R version
Based on the feedback received, I did the following:
a) moved my lib sub-directory from the existing installed R version to
c:\myRLib
b) installed the updated R version
c) created .Renviron file in the home directory (C:\R-2.5.1) with the line
R_LIBS=c:/myRLib
d) used .libPaths() command to confirm that the new R installation was
recognizing the myRLib sub-directory
e) deleted my old R
2010 Feb 18
2
xapian.DocNotFoundError: regression?
Hello,
I've installed xapian-core 1.1.3 and xapian-bindings 1.1.4 from the
tarballs announced by Olly the other day. With these versions,
Enquire.get_mset() seems to consistently be raising
xapian.DocNotFoundError.
I've attached a small test case which reproduces this. The same test
case works fine with 1.0.16 (not the latest 1.0.x, but it's what I had
installed).
Program output
2013 Feb 19
2
Implementing tf-idf weighting scheme in Xapian
Hello guys. I just read up about tf-idf schemes and want to implement them in
Xapian (with some frequently used normalizations), as it will also give me a
good feel for implementing a weighting scheme before I start working on
implementing DFR schemes.
I read the following as references and I think I've understood them well and
can write the hack:
1.)
2006 Jun 08
4
weblog.rubyonrails.com - needs upgrade ?
Last 2 articles have 500+ comments.
Also, whenever I comment, it never appears immediately.
It terribly needs an upgrade.
-Pratik
--
rm -rf / 2>/dev/null - http://null.in
2004 Dec 17
2
Custom weight factors - pushing the relevancy ranking how we want it
Hi guys (and gals?),
We're using Xapian/Omega for indexing and searching forums.
As forums are, the content that is relevant to a search is not just
determined by the frequency or location of the terms; the date the topic
has been last modified is important as well.
Another issue we find is that the number of results is so overwhelming,
the user is unable to find the correct topic for his