search for: richhiey

Displaying 20 results from an estimated 30 matches for "richhiey".

2015 Jan 03
3
Xapian-discuss Digest, Vol 127, Issue 1
Hey Richhiey, Most probably Xapian is used with Cygwin on Windows, and the Windows-specific code in Xapian is based on Cygwin. However, we would be able to help you out with this issue if you could pastebin the whole 'gnu-make' generated report. Regards, Abhishek On Sat, Jan 3, 2015 at 5:30 PM, <xapian-di...
2016 Jun 09
2
2nd week progress
Hello devs, I have filled out the repo link on TRAC as suggested. I'll also keep the journal updated on TRAC from now on. I am almost done with defining all the base classes required for the clusterer and have started coding the Euclidean distance metric. This should be completed by tomorrow, after which I'll spend one day testing to make sure everything functions as expected, so
2015 Feb 15
3
Bitsize project: Krovetz Stemmer
Hello Xapian devs, I had shown interest in writing a Krovetz stemmer for Xapian and spoke to James Aylett about it. Since it was hard to code the stemmer in Snowball, I came up with a C++ implementation of the stemmer. But since it is a dictionary-based stemmer, I'm having problems deciding how to create the dictionary. I did check out some of the implementations of the Krovetz stemmer online
2015 Feb 10
3
Bitsize project - Krovetz stemmer
Hello Xapian devs, -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.xapian.org/pipermail/xapian-devel/attachments/20150210/c848e9b7/attachment-0002.html>
2015 Mar 28
2
Weighting schemes for Xapian
Hello Xapian devs, Sorry for not getting back sooner. I was tied up with coursework. I would like to work on LDA-based document modelling and Hiemstra's language modelling, and would like to form a concrete plan on how to proceed. It would be really helpful if I could have a mentor to assist me with this. Looking forward to your reply. Thanks. :)
2016 Mar 05
2
GSOC-2016 Project : Clustering of search results
Hello devs, I am Richhiey Thomas, in the third year of my undergraduate studies in Computer Science at Mumbai University. I went through the project list for this year, and the project idea based on clustering caught my attention. I spoke to Assem Chelli on IRC, who guided me to the code and got me started. I started...
2017 Jun 14
2
KMeans Clusterer - Going forward
...uality) to be working well. As Olly has mentioned in one of his comments on the PR, it wouldn't be ideal to use hard coded criteria for feature selection. Thus using something like an ExpandDecider would certainly be great. I will look into it and make my approach clear as I go ahead. Thanks, Richhiey
2016 Oct 01
2
New to Xapian project
Hi, I am currently pursuing my bachelor's degree in computing science at the University of Alberta, Canada. My specialities lie in information retrieval, machine learning and data mining. In order to get hands-on experience with real-world information retrieval systems, I would like to contribute to the Xapian project. I have been going through some of the project ideas in
2016 Apr 08
2
Bite-size project
On Fri, Apr 08, 2016 at 09:57:16AM -0400, Richhiey Thomas wrote: > Sorry to take so much time on this. Was down with coursework because the > semester end is nearing. Not a problem -- that sort of thing is affecting a lot of people at the moment! > I used the latest development version which is 1.3.5 for this patch. > I have implemen...
2016 Aug 19
2
KMeans - Evaluation Results
On 18 Aug 2016, at 23:59, Richhiey Thomas <richhiey.thomas at gmail.com> wrote: > I've currently added a few classes which don't really belong to the public API (currently) into private headers and used PIMPL with the Cluster class. I'm having difficulty reading your changes, because you aren't keeping to...
2016 Apr 25
2
GSoC 2016 - Introduction
Hello devs, My name is Richhiey Thomas, and I've been selected for GSoC 2016 for the project Clustering of Search Results. I would like to thank the Xapian GSoC admins for giving me this opportunity, and James and Olly for helping me with my first merge request. In the next two to three days, I'll critically examine all t...
2017 Mar 09
2
GSoC 2017 Project Proposal
Hello devs. I would like to propose how I plan to improve the clustering branch in this GSoC and get it to a state where it can be integrated into Xapian. I have identified three areas of work which were not touched last time. 1) Automated Performance Analysis I had roughly implemented two evaluation techniques previously (Distance between document and centroids within clusters and
2016 Jul 27
2
K MEANS clustering
Hey Parth, Thanks for the reply. I am considering implementing a cosine distance metric too, along with Euclidean distance, because of the dimensionality issue that comes in with K-Means and the Euclidean distance metric. That does help when we deal with sparse vectors for documents. The particular problem I'm having is representing centroids in an efficient way. For example, when we find the mean
2016 Mar 29
2
Bite-size project
On Mar 29, 2016 4:49 PM, "Olly Betts" <olly at survex.com> wrote: > > On Tue, Mar 29, 2016 at 11:41:02AM +0100, James Aylett wrote: > > It's probably helpful to create a ticket and claim it (and update the > > project ideas list to link to it), so other people don't try to work > > on it as well. (I have a feeling that it might have been among the
2016 Mar 06
3
GSOC-2016 Project : Clustering of search results
On Sun, Mar 6, 2016 at 7:17 AM, James Aylett <james-xapian at tartarus.org> wrote: > On Sat, Mar 05, 2016 at 10:58:43PM +0530, Richhiey Thomas wrote: > > K-Means or something related certainly seems like a viable approach, > so what you'll need to do is to come up with a proposal of how you'd > implement this in Xapian (either with reference to the previous work, > or separately), and also how you'd go ab...
2016 Jul 26
3
K MEANS clustering
Hello, I've been working on the KMeans clustering algorithm recently, and for the past week I have been stuck on a problem which I'm not able to find a solution to. Since we are representing documents as TF-IDF vectors, they are really sparse vectors (a usual corpus can have around 5000 terms). So it gets really difficult to represent these sparse vectors in a way that would be
2016 Aug 17
2
KMeans - Evaluation Results
On Wed, Aug 17, 2016 at 7:23 PM, James Aylett <james-xapian at tartarus.org> wrote: > >> How long does 200-300 documents take to cluster? How does it grow as > more documents are included in the MSet? We'd expect an MSet of 1000 > documents to take longer to cluster than one with 100, but the important > thing is _how_ the time increases as the number of documents
2016 Aug 15
2
KMeans - Evaluation Results
...the final week for GSoC is starting now, should I list the API that was merged into a different branch before as merged or as yet to merge? And in case I am not able to complete PSO by 23rd August since I am behind the timeline, would it be possible for me to continue it later on? Thanks. Regards, Richhiey
2016 Aug 17
2
KMeans - Evaluation Results
...g some time to sink in :) Say I start with the Clusterer class, I create a ClustererImpl class which is the internal class that Clusterer points to. But since Clusterer is abstract, and KMeans inherits from Clusterer, how do I maintain the inheritance and do the same for KMeans? Thanks. Regards, Richhiey
2016 Aug 18
3
KMeans - Evaluation Results
> > > > Actually, you're doing something slightly unusual there: making the > internal member public. Protected would be better, and private is I think > most usual; library clients aren't going to have access to the Internal > class declaration, so they can't call things on it. This means it's > actually difficult right now to subclass Feature. > > I