Displaying 20 results from an estimated 9000 matches similar to: "[GSoC 2014] clustering of search results"
2014 Mar 10
2
[GSoC 2014] clustering of search results
On Mon, Mar 10, 2014 at 3:59 PM, Olly Betts <olly at survex.com> wrote:
> Exactly what approach the project takes isn't nailed down - it just
> seemed something which would be interesting for a student to work on,
> and would be useful to Xapian users.
>
> My understanding of the current clustering branch (which may not be
> completely accurate) is that it clusters
2014 Apr 13
2
Adding an external library to Xapian
My code is not on GitHub. I am using the tarball as of now. The following
is the error that occurred:
http://pastebin.com/cVJrjUZX
On Sun, Apr 13, 2014 at 8:16 PM, James Aylett <james-xapian at tartarus.org> wrote:
> On 13 Apr 2014, at 15:37, Pallavi Gudipati <pallavigudipati at gmail.com>
> wrote:
>
> > A linker error is encountered even after following the above
2010 Jul 26
2
related documents
Hi All,
I would like to take a doc in the xapian DB and find all related
documents by relevance e.g. so when you view one document it says
"Related entries X Y Z".
I'm aware of the "Morelikethis" Lucene plugin that is supposed to do
something like this, by generating a query from a document based on term
frequency.
Has anyone developed a tool to generate a query from a
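
The idea described in this post (build a query from a document's most frequent terms) can be sketched in a few lines. This is only an illustration in plain Python, not Xapian's API and not the Lucene plugin's algorithm: the function names, the tiny stopword list, and the "OR"-joined query string are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list for the example.
STOPWORDS = frozenset({"the", "a", "an", "of", "and", "to", "in", "is"})


def top_terms(text, k=5):
    """Return the k most frequent non-stopword terms in text."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS)
    # Sort by descending frequency, then alphabetically so ties are stable.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


def build_or_query(terms):
    """Join terms into a simple OR query string (assumed query syntax)."""
    return " OR ".join(terms)


if __name__ == "__main__":
    doc = "xapian search xapian index search xapian"
    print(build_or_query(top_terms(doc, k=2)))
```

Raw term frequency favours words that are common everywhere, so a real tool would likely weight terms by how rare they are in the collection as well.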
2012 Mar 22
1
GSOC : Language Modelling for information retrieval with Diversified Search results
Hello,
I am an undergraduate student at DA-IICT, India, pursuing a BTech in
Information and Communication Technology. My major field of research is
Information Retrieval and Natural Language Processing. Xapian, being a
powerful information retrieval library, has attracted me towards
implementing what I learned in class for this project. I have worked on
entity search on RDF data, SMS based FAQ
2009 Jan 27
1
Segmentation fault in MSetIterator get_weight
Hi,
I'm using Xapian with C# and Mono and I'm getting a segfault in get_weight.
When I print the index variable, the value is clearly too high.
I think something is writing over it. Do you have any idea how I could
trace the origin of the segmentation fault?
Thanks,
--
Yann
2009 Nov 14
1
[Xapian discuss] Is there difference between single database and multiple database?
Hi, all:
I am not quite clear about how Xapian processes queries internally. Is
there any difference between querying a single database and querying
multiple databases, if both hold the same amount of data?
For example, the single database has n documents, which equals the
total document count across the databases in the multi-database setup.
What are the differences
2009 Apr 23
1
Expanding the search in PHP
I tried using the simpleexpand.php from
http://xapian.org/docs/bindings/php/examples/simpleexpand.php5
I get different results between PHP and the Omega expand (see below),
I'd like to have the same functionality in PHP.
Could anyone suggest how to do it? Is there an example I could use?
Thanks,
Frank
And got the following results from PHP:
Zdefin: weight = 46.963883268652
Zconfigur:
2016 Mar 06
3
GSOC-2016 Project : Clustering of search results
On Sun, Mar 6, 2016 at 7:17 AM, James Aylett <james-xapian at tartarus.org>
wrote:
> On Sat, Mar 05, 2016 at 10:58:43PM +0530, Richhiey Thomas wrote:
>
> K-Means or something related certainly seems like a viable approach,
> so what you'll need to do is to come up with a proposal of how you'd
> implement this in Xapian (either with reference to the previous work,
>
2012 Apr 27
4
GSoC xapian node binding
Posting recent offline discussion...
On Fri, Apr 27, 2012 at 10:55 AM, Marius Tibeica <mtibeica at gmail.com> wrote:
> Hi Liam,
>
> I've added the Enquire class and designed a query spec structured as a JS
> object. Hope you like it :)
> I'll probably be off a few days (there is a national holiday Tuesday which
> means I have a long weekend :D) but maybe I'll
2016 Mar 05
2
GSOC-2016 Project : Clustering of search results
Hello devs,
I am Richhiey Thomas, pursuing my third year of undergraduate studies in
Computer Science from Mumbai University. I had gone through the project
list for this year and the project idea based on clustering caught my
attention. I spoke to Assem Chelli on IRC who guided me to the code and got
me started.
I started going through the code and have successfully built Xapian on my
machine.
2014 Mar 10
2
[GSoC 2014] About "Clustering of Search Results"
Hello. I am Liu Chi from Peking University, China. I am planning to
join GSoC. I am interested in Xapian and looking forward to finding something
interesting in the GSoC 2014 Project Ideas List.
The topic of "Clustering of Search Results" looks interesting and I think
it suits me. I have been involved in a project that aims to cluster
tweets based on the text similarity and user
2016 Mar 07
2
GSOC-2016 Project : Clustering of search results
On Mon, Mar 07, 2016 at 01:36:43AM +0530, Richhiey Thomas wrote:
> My questions are:
> 1) Can you direct me on how to convert this raw idea into a proposal in
> context to Xapian with more detail? What areas do I focus on?
Our GSoC guide has an application template
<https://trac.xapian.org/wiki/GSoCApplicationTemplate> which you
should use to structure your proposal. It has some
2014 Apr 13
2
Adding an external library to Xapian
We are using the --enable-maintainer-mode and will move to git soon.
The diff file is attached.
*Siddhant Mutha*
Undergraduate Student
Department of Computer Science and Engineering
IIT Madras
Chennai
http://www.siddhantmutha.com/
On Sun, Apr 13, 2014 at 8:26 PM, James Aylett <james-xapian at tartarus.org> wrote:
> On 13 Apr 2014, at 15:48, Pallavi
2015 Feb 19
1
CentOS Participation in GSOC-2015
Hi Johnny,
This is to enquire as to whether CentOS will be participating in GSOC
this year?
The Mentoring Organization applications are now being accepted for Google
Summer of Code 2015.
http://google-opensource.blogspot.in/2015/02/mentoring-organization-applications-now.html
Regards,
Saket Sinha
2013 Oct 23
2
performance on document.get_data()
I have a performance issue with document.get_data() and
enquire.get_mset(). It takes 35 seconds for matches =
enquire.get_mset(0,200), and 3 seconds to iterate over all docs in matches and
call get_data. Is that normal? My index contains 30 million documents. I use the
Python bindings to operate Xapian. Below is my index structure
# value: 0:date, 1:site
# data: json message which contains: author,
2016 May 01
2
GSoC 2016 - Introduction
Before going ahead with the tests as you mentioned above, I would just like
to clarify a few higher level things that I am still in doubt about.
1) As discussed during the IRC interview, it was suggested that I first
implement a standard K-means clustering and then add the PSO module as
an option that can be used to improve the quality of clustering, with
speed as the trade-off.
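
For readers unfamiliar with the baseline mentioned here, a plain K-means loop can be sketched as below. This is a minimal illustration in pure Python, not the code from Xapian's clustering branch: the function name, the choice of the first k points as initial centroids, and the fixed iteration count are assumptions for the example, and the PSO refinement is not shown.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=10):
    """Cluster points into k groups; returns (assignments, centroids).

    Assumption for the sketch: the first k points seed the centroids,
    which keeps the result deterministic.
    """
    centroids = [list(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # leave an empty cluster's centroid where it is
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignments, centroids


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
    print(kmeans(pts, k=2))
```

In a search setting the points would be document vectors (for example term weights), and the result depends heavily on the initial centroids, which is the weakness that refinements such as PSO aim to address.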
2017 Mar 09
2
GSoC 2017 Project Proposal
Hello devs.
I would like to propose how I plan to go about improving and getting a
system that can be integrated into Xapian in this GSoC for the clustering
branch.
I have identified three areas of work which were not touched last time.
1) Automated Performance Analysis
I had roughly implemented 2 evaluation techniques previously (distance between
document and centroids within clusters and
2018 Mar 30
2
sorting large msets
Hello, is there a way to optimize sorting by certain values
for queries which return a huge amount of results?
For example, I just want a simple query that gives me the 200
most recent emails out of millions. The elapsed time for
get_mset increases as the number of documents ($n * 2000)
increases.
I suppose I could store a pre-sorted set using SQLite or
similar. Thanks in advance for any
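
As a rough illustration of why picking the newest 200 items need not require sorting every match, here is a heap-based top-n selection in plain Python. It says nothing about how Xapian orders an MSet internally; the function name and the (docid, timestamp) pair layout are assumptions for the sketch.

```python
import heapq


def most_recent(docs, n=200):
    """Return the n entries with the largest timestamp, newest first.

    docs is any iterable of (docid, timestamp) pairs (hypothetical layout).
    heapq.nlargest keeps only n candidates at a time, so the cost grows
    roughly as N log n rather than N log N for a full sort.
    """
    return heapq.nlargest(n, docs, key=lambda d: d[1])


if __name__ == "__main__":
    sample = [(1, 100), (2, 300), (3, 200), (4, 50)]
    print(most_recent(sample, n=2))
```

Every matching document still has to be visited once, so this reduces the sorting work but not the cost of enumerating millions of matches.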
2005 Jun 29
2
Sort by docid
Hello,
I wonder if there is a way to cause Xapian to order a result set purely by
docid. In other words, once the result set has been determined, I'd like the
results to be returned to me ordered by their docid, as opposed to by their
match relevance.
The problem at hand is that I'm building a search engine for a mailing list
and I would like to return matches sorted by date; ordering by
2023 Aug 17
1
does Xapian::Enquire hold an MVCC revision?
In other words, is it possible to avoid duplicates if new
documents are inserted into the DB by another process in-between
->get_mset calls when reusing Xapian::Enquire objects?
I do some expensive processing on each mset window, so I always
limit the results to limit heap usage even if I'm planning on
going through a big chunk of the DB:
$mset = $enq->get_mset(0, 1000);