Benny Chan
2007-Jul-20 23:01 UTC
[Xapian-discuss] Making Title Fields Have More Weight In Xapian Search
Hello,

I currently have about 5000 documents that I am indexing with scriptindex, using the following input file:

document_id: field=ref unique=Q boolean=Q
document_title: field=title weight=3 unhtml index
url: field=url
document_text: field=document_text unhtml index
abstract: field=abstract
category: field=category boolean=XC

This is working fine; however, the results do not rank the documents with the most relevant titles highest. For example, if I search for the term "trash can," I might get results like this:

1. How trash is stored
2. Talking trash can make you trash
3. Stories about cats

where the first result is obviously most relevant based on the text of the document. However, I want to base my results more on the title of the document, giving that more weight. When that happens, result #2 should really become result #1 because it contains both words. So I tried changing the line:

document_title: field=title weight=3 unhtml index

to

document_title: field=title weight=100 unhtml index

After re-indexing all the documents, I ran the same search again and the results were nearly identical. I'm not sure whether I'm making the right adjustment or whether I'm missing some other adjustment needed to get the results I want. If anyone has any ideas on how to increase the weight of the title, I'd really appreciate the help.

Thanks all.

Benny
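One possible explanation, which nobody in this thread has confirmed: as far as the scriptindex documentation describes it, weight= multiplies the within-document frequency (wdf) of terms produced by the index actions that follow it, and a BM25-style weighting scheme saturates as wdf grows. The sketch below is plain Python, not the Xapian API. The function name bm25_tf_component, the parameter values, and the fixed normalised length are all assumptions made for illustration.

```python
# Sketch only: the standard BM25 term-frequency component, used here to show
# why a larger wdf multiplier gives diminishing returns. The parameter values
# are assumptions for illustration, not read from any real database.

def bm25_tf_component(wdf, k1=1.0, b=0.5, norm_len=1.0):
    """Return the wdf-dependent factor of a BM25 term weight.

    wdf      -- within-document frequency of the term (after any multiplier)
    k1, b    -- BM25 tuning parameters (assumed values)
    norm_len -- document length divided by average document length
    """
    length_factor = k1 * ((1.0 - b) + b * norm_len)
    return wdf * (k1 + 1.0) / (wdf + length_factor)


if __name__ == "__main__":
    # One title occurrence of a term, indexed with different weight= factors.
    for weight in (1, 3, 100):
        print(weight, round(bm25_tf_component(wdf=weight), 3))
```

Under these assumptions the factor is 1.5 at weight=3 and about 1.98 at weight=100, with an upper bound of 2.0, so going from 3 to 100 changes the score much less than the numbers suggest. The sketch also holds the document length fixed, whereas a larger multiplier would lengthen the document and push the factor down a little further.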
Ryan Mahoney
2007-Jul-25 17:07 UTC
[Xapian-discuss] Making Title Fields Have More Weight In Xapian Search
I'm also having difficulties applying weights to fields. Any ideas why changing the weight value seems to have little effect on the query results?

-r

Benny Chan wrote:
> Hello,
>
> I currently have about 5000 documents that I am indexing with scriptindex
> with the following input file:
>
> document_id: field=ref unique=Q boolean=Q
> document_title: field=title weight=3 unhtml index
> url: field=url
> document_text: field=document_text unhtml index
> abstract: field=abstract
> category: field=category boolean=XC
>
> This is working fine however the document results are not showing up with
> the most relevant titles ranked highest. For example, if I search for the
> term "trash can," I might get results like this:
>
> 1. How trash is stored
> 2. Talking trash can make you trash
> 3. Stories about cats
>
> where the first result is obviously most relevant based on the text of the
> document. However, I want to base my results more on the title of the
> document, giving that more weight. When that happens, result #2 should
> really become result #1 because it contains both words. So I tried changing
> the line:
>
> document_title: field=title weight=3 unhtml index
>
> to
>
> document_title: field=title weight=100 unhtml index
>
> After re-indexing all the documents I tried such a search again and the
> results were nearly identical. I'm not sure if I'm making the right
> adjustments or if I'm missing some other adjustments to get the results I
> want. If anyone has any ideas on how to increase the weight on the title,
> I'd really appreciate the help. Thanks all.
>
> Benny
Benny Chan
2007-Jul-25 19:22 UTC
[Xapian-discuss] Making Title Fields Have More Weight In Xapian Search
Yeah, I tried delve --help, but I'm pretty new to Xapian and that help isn't very comprehensive. For instance, what are "term lists", "posting lists", and "position lists"?

Benny

On 7/25/07, Rafael SDM Sierra <rafaeljsg14@gmail.com> wrote:
> On 7/25/07, Benny Chan <misterchan@gmail.com> wrote:
> > I feel retarded asking this, but how do I use delve to get that
> > information? When I use delve normally, I just get the number of
> > documents and the avg length of documents returned.
>
> delve -t database (?)
>
> [?] - I don't remember it exactly now, but delve --help will help you
>
> --
> Rafael "SDM" Sierra
> http://stiod.com.br/
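For readers with the same question about the three kinds of list, here is a toy illustration in plain Python. It does not use Xapian or delve. The names build_index and DOCS are made up for this sketch, and the documents are just the three example titles from the first message. Roughly, a term list holds the terms that index one document, a posting list holds the documents that contain one term, and a position list holds the word positions of one term inside one document.

```python
from collections import defaultdict


def build_index(docs):
    """Build three toy lookup tables from a {doc_id: text} mapping."""
    term_lists = {}                     # doc_id -> sorted terms in that document
    posting_lists = defaultdict(list)   # term -> doc_ids that contain the term
    position_lists = defaultdict(list)  # (doc_id, term) -> 1-based word positions

    for doc_id in sorted(docs):
        words = docs[doc_id].lower().split()
        for pos, term in enumerate(words, start=1):
            position_lists[(doc_id, term)].append(pos)
        term_lists[doc_id] = sorted(set(words))
        for term in term_lists[doc_id]:
            posting_lists[term].append(doc_id)
    return term_lists, dict(posting_lists), dict(position_lists)


# Hypothetical data: the example titles quoted earlier in the thread.
DOCS = {
    1: "How trash is stored",
    2: "Talking trash can make you trash",
    3: "Stories about cats",
}

if __name__ == "__main__":
    term_lists, posting_lists, position_lists = build_index(DOCS)
    print(term_lists[2])                 # term list of document 2
    print(posting_lists["trash"])        # posting list of the term "trash"
    print(position_lists[(2, "trash")])  # position list of "trash" in document 2
```

A real Xapian database also stores a wdf alongside each posting, and any prefixed or stemmed terms the indexer generated, which this sketch leaves out.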