Jianping Wang
2012-Oct-04 15:48 UTC
[Xapian-devel] some improvements about the latent semantic search
Hi all,

Recently I invented a new ranking algorithm inspired by the theory of spreading activation and the probabilistic model. It can find the latent semantic relationships between documents and terms, and it runs in almost linear time. I took one afternoon to code and implement it. My testing shows that the algorithm is much faster than the well-known Latent Semantic Analysis (LSA) algorithm, and its effectiveness is almost as good as that of LSA. I would like to share my idea with all of you and add this algorithm to the Xapian project.
Olly Betts
2012-Oct-09 01:41 UTC
[Xapian-devel] some improvements about the latent semantic search
On Thu, Oct 04, 2012 at 11:48:13PM +0800, Jianping Wang wrote:
> Recently I invented a new ranking algorithm inspired by the theory of
> spreading activation and the probabilistic model. It can find the
> latent semantic relationships between documents and terms, and it runs
> in almost linear time. I took one afternoon to code and implement it.
> My testing shows that the algorithm is much faster than the well-known
> Latent Semantic Analysis (LSA) algorithm, and its effectiveness is
> almost as good as that of LSA. I would like to share my idea with all
> of you and add this algorithm to the Xapian project.

Can you express your algorithm as a sum of a positive weight from each matching term, optionally plus a per-document component?

That's a requirement for it to be implementable within the Xapian matcher framework. If it doesn't fit into this form, you'll need to do a lot more work to fit it into Xapian.

If the algorithm is a product of a contribution per term, then taking the log may allow you to express it as such a sum.

To implement a new weighting scheme, you need to subclass Xapian::Weight and implement several methods:

http://trac.xapian.org/browser/trunk/xapian-core/include/xapian/weight.h

Cheers,
    Olly
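To illustrate the point about taking the log, here is a minimal Python sketch. It does not use Xapian and is not the algorithm proposed in this thread. The document names, terms, and per-term factors are made up, and every factor is assumed to be greater than 1 so that its log is a positive weight. A factor of 1 or less would give a zero or negative log and would need further handling that this sketch does not attempt.

```python
import math

# Hypothetical per-term factors for three documents. Only terms that match
# the query are listed. Every factor is > 1, so each log is positive.
DOCS = {
    "doc_a": {"latent": 3.0, "semantic": 2.0},
    "doc_b": {"latent": 1.5, "semantic": 2.5},
    "doc_c": {"latent": 4.0},
}


def product_score(factors):
    """Score a document as the product of its per-term factors."""
    return math.prod(factors.values())


def term_weights(factors):
    """Turn each per-term factor into an additive weight by taking its log."""
    return {term: math.log(f) for term, f in factors.items()}


def additive_score(factors, doc_component=0.0):
    """Sum the per-term weights, plus an optional per-document component."""
    return sum(term_weights(factors).values()) + doc_component


def ranking(score_fn):
    """Return document names ordered from highest to lowest score."""
    return sorted(DOCS, key=lambda name: score_fn(DOCS[name]), reverse=True)


if __name__ == "__main__":
    for name in ranking(additive_score):
        print(name, product_score(DOCS[name]), additive_score(DOCS[name]))
```

Because the log is monotonically increasing, ranking by the summed weights gives the same order as ranking by the original product, which is why the product form can be recast as the per-term sum described above.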