thr3ads.net - similar to: "GSOC 2016 project on Ranking"

Displaying 20 results from an estimated 8000 matches similar to: "GSOC 2016 project on Ranking"

2004 Feb 25

Computing very large distance matrix

Hello All, I have a 131072x132 matrix for which I need to compute a regular Euclidean distance matrix, which I then need to transform before running agnes() on the transformed matrix. I am having trouble computing the distance matrix as it is fairly large and I am sure I have gone over the max. The specific error I am getting is: Error in vector("double", length) : negative length vectors
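A rough size check makes the problem concrete. The sketch below only does the arithmetic for a pairwise distance matrix over 131072 rows. It assumes the "negative length" message comes from a vector length that no longer fits in a signed 32-bit integer, which is a guess about the cause, not something the post confirms.

```python
# Size check for a pairwise distance matrix over n rows.
n_rows = 131072  # rows in the matrix from the post

# A symmetric distance matrix needs one value per unordered pair of rows.
n_pairs = n_rows * (n_rows - 1) // 2

INT32_MAX = 2**31 - 1    # largest signed 32-bit integer (assumed length limit)
BYTES_PER_DOUBLE = 8

exceeds_int32 = n_pairs > INT32_MAX
size_gib = n_pairs * BYTES_PER_DOUBLE / 2**30  # storage as doubles, in GiB

print(n_pairs, exceeds_int32, round(size_gib, 1))
```

Under these assumptions the pair count is about four times the 32-bit limit and the values alone would need roughly 64 GiB, so the full matrix is out of reach regardless of the exact error.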

2012 Mar 27

About the projects of "Ranking" for GSoC 2012

Hello, I am Mohiuddin Abdul Qader, a final year student from the dept of CSE at Bangladesh University of Engineering & Technology (BUET). My major was artificial intelligence and I finished my course on Machine Learning and Pattern Recognition this year. I am very keen to contribute to the open source community. I have just completed my thesis on 'Location Based Structured Web Search'. For the

2002 Nov 17

SVD for reducing dimensions

Hi all, this is probably simple and I'm just doing something stupid, sorry about that :-) I'm trying to convert words (strings of letters) into a fairly small dimensional space (say 10, but anything between about 5 and 50 would be ok), which I will call a feature vector. The distance between two words represents the similarity of the

2006 May 10

help with writing output from two different arrays to two columns in an output file

Hi, I am very new to R and I have written the following block of code to generate a gamma distribution for variable x (which is an array) and a function "y" whose array values depend on the individual array values of "x". The code is as follows:

    n=1000
    x=rgamma(n,1.5,2)
    y=vector("numeric",n)
    for (i in 1:n){
      y[i]=(2937/50000*exp(-1/1000*x[i])/x[i])
    }

now I want to
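For comparison, here is a minimal Python sketch of the same computation followed by the two-column output that the thread title asks about. It is not the poster's solution. The function names, the tab delimiter, the header row and the output file name are all assumptions made for illustration. Note that R's rgamma(n, 1.5, 2) takes shape and rate, while Python's gammavariate takes shape and scale, so the rate 2 becomes scale 1/2.

```python
import csv
import math
import os
import random
import tempfile

def simulate(n, seed=0):
    """Draw n gamma values x and compute y from each x (hypothetical helper)."""
    rng = random.Random(seed)
    # shape = 1.5, scale = 1 / rate = 1 / 2
    xs = [rng.gammavariate(1.5, 1 / 2) for _ in range(n)]
    # Same expression as the R loop body, applied element by element.
    ys = [2937 / 50000 * math.exp(-x / 1000) / x for x in xs]
    return xs, ys

def write_columns(path, xs, ys):
    """Write x and y side by side as two tab-separated columns."""
    with open(path, "w", newline="") as fh:
        writer = csv.writer(fh, delimiter="\t", lineterminator="\n")
        writer.writerow(["x", "y"])    # header row (an assumption)
        writer.writerows(zip(xs, ys))  # one (x, y) pair per line

xs, ys = simulate(1000)
out_path = os.path.join(tempfile.gettempdir(), "xy.txt")  # illustrative location
write_columns(out_path, xs, ys)
```

Pairing the two arrays with zip keeps each x on the same line as the y computed from it, which is the point of writing them as columns.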

2011 Apr 01

New Idea on Ranking in IR

Hello, I want to discuss my idea on ranking in IR systems, which I think could be a good extension to Xapian. If I am not too late to discuss it, then please consider it. First, a brief background about me: I am a Masters student working on my thesis in Information Retrieval. Only today I got a mail from one of the professors in Europe, whom I am going to join for my Ph.D., about GSoC and more

2004 Feb 04

Clustering with 'agnes'

Hello, I had a question regarding clustering using the agnes() function from the 'cluster' package. I was wondering if anyone knew how I can identify cluster points after running the agnes function. For example, I created a dataset with points randomly scattered around (0,0), (0,1) and (1,0). After clustering, the dendrogram shows all the clustered points and I get the ordering and

2017 Mar 05

GSoc 2017 Introduction(Weighting Schemes)

Hello Everyone, I am a second year graduate student at IIIT-Bangalore and my interest is in the field of Information Retrieval. I have successfully compiled Xapian from source and have implemented some examples. While going through the project list, the Weighting Schemes project is the one I was looking to contribute to. So I went through xapian-core/weight, where most of the schemes are already

2014 Nov 23

GSoc Project Idea Weighting Schemes (Ranking)

Hi, I am Abhishek. Currently Xapian::Weight follows the BM25 scheme; many models, such as the Divergence from Randomness (DfR) family of models, the Unigram Language Model and the Bi-gram Language Model, were implemented two years ago in GSoC 2012 yet not merged to master. The new weighting schemes or improvement in implementing the previous models to change the default scheme of BM25 from SMART with

2010 Jan 18

Index indexed words

Hello, We would like to create Google- or Firefox-like "search hints". If someone types "abc", the search system should suggest some possible hints. I think Firefox does it by indexing 3 characters of the domain name. If you enter parts, you get some hints. Thank you very much, Marcus
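The idea in this post can be sketched with a small prefix index. This is an illustration only, not how Firefox or Xapian actually do it: the function names, the lowercasing, the 3-character key length and the result limit are all assumptions.

```python
from collections import defaultdict

def build_prefix_index(terms, prefix_len=3):
    """Map each term's first prefix_len characters to the terms sharing them."""
    index = defaultdict(set)
    for term in terms:
        key = term.lower()[:prefix_len]
        if len(key) == prefix_len:  # skip terms shorter than the key length
            index[key].add(term)
    return index

def hints(index, typed, prefix_len=3, limit=5):
    """Return up to limit indexed terms that start with what was typed."""
    typed = typed.lower()
    if len(typed) < prefix_len:
        return []  # too short to look up in this index
    candidates = index.get(typed[:prefix_len], ())
    return sorted(t for t in candidates if t.lower().startswith(typed))[:limit]

index = build_prefix_index(["abcde", "abcxyz", "abd", "xapian"])
print(hints(index, "abc"))
```

The 3-character key narrows the candidates with one dictionary lookup, and the startswith filter handles longer input such as "abcx".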

2012 Apr 03

GSOC 2012

Hi, I am a final year student at the Indian Institute of Technology, Kharagpur in the computer science department. I am interested in applying for the following projects for GSoC 2012: 1. Weighting Schemes 2. Learning to Rank 3. Gmane Search Improvements. I have a strong background in Information Retrieval and Machine Learning. I have worked previously with Yahoo Research Labs in the area of Information

2012 Apr 01

[GSoC2012] Learning to Rank: few thoughts/issues

Hello, I would like to work with Orange as part of GSoC 2012 (and continue henceforth). Apologies for joining in a bit late; I was waiting to get a proper grasp of things before discussing it here. Currently I am a Masters student in Mathematics with my bachelors in Computer Science [integrated dual degree]. Over the last year and a half, I have worked on a few ML projects and have a couple of

2007 Jun 05

Chinese, Japanese, Korean Tokenizer.

Hi, I am looking for a Chinese, Japanese and Korean tokenizer that can be used to tokenize terms for CJK languages. I am not very familiar with these languages; however, I think that these languages contain one or more words in one symbol, which makes it more difficult to tokenize into searchable terms. Lucene has CJK Tokenizer ... and I am looking around if there is some open source that we

2005 Oct 08

*wildcard* support?

Hello, First I wanted to say thanks for a great piece of software, thanks Olly and others who've contributed! I know that Xapian supports right-truncating, if that's the proper name for wildcard support, as in a search for "xapia*". I don't believe Xapian supports wildcards on both sides of a term, correct? Is this something that is technically infeasible, unpalatable
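To illustrate why right truncation is the easy case, here is a minimal sketch that expands a prefix such as "xapia" against a sorted term list. It is not Xapian's implementation: the function name and the sample terms are made up, and a plain sorted Python list stands in for whatever term structure a real index uses.

```python
import bisect

def expand_prefix(sorted_terms, prefix):
    """Return all terms starting with prefix, given a lexicographically sorted list."""
    # Binary search finds the first term that is >= prefix.
    start = bisect.bisect_left(sorted_terms, prefix)
    matches = []
    for i in range(start, len(sorted_terms)):
        if not sorted_terms[i].startswith(prefix):
            break  # matching terms are contiguous, so the run has ended
        matches.append(sorted_terms[i])
    return matches

terms = sorted(["lucene", "xap", "xapian", "xapiand", "xapians", "xml"])
print(expand_prefix(terms, "xapia"))
```

All terms sharing a prefix sit next to each other in sorted order, so one binary search plus a short scan is enough. A wildcard at the start of a term gets no such help from the ordering, so in this simple model it would have to scan every term.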

2011 Mar 29

Participation in GSOC

Hi, I'm Michael. I would like to participate in this year's Google Summer of Code, and I picked Xapian as the project to code for. Before writing a full proposal, I want to get in contact with the community, introduce myself and discuss my ideas for the contribution to Xapian. First of all I'd like to talk about my motivation. I'm currently working on a webapp

2012 Apr 01

Learning to Rank : GSoC 2012

Hello all, This is in reference to the "Learning to Rank" Project Idea. [I know I made the entry a bit late, but I hope you are still interested in helping out.] I am looking for suggestions to help me narrow down the choice of algorithms. I have been surveying the referred algorithms for the purpose of choosing the right one. I am mentioning here some of my doubts to discuss and

2016 Apr 11

Xapian 1.3.5 snapshot performance and index size

Olly Betts writes:
> On Sun, Apr 10, 2016 at 04:47:01PM +0200, Jean-Francois Dockes wrote:
> > Some might notice the 50% index size increase. Excessive index size is
> > already one relatively rare, but recurring complaint. Except if I did
> > something wrong: I'm actually quite surprised by it.
>
> Did you try compacting the resulting databases?

2012 Mar 23

GSoC Term Weighting project

Hi everyone, I'm a graduate student in Linguistics and Computer Science in the US, and I'm planning to propose a project to Xapian for GSoC that would implement and evaluate a variety of weighting schemes and ranking methods, allowing users to select different combinations. I have pretty thorough knowledge of IR weighting and ranking, and I'm good in Java and Perl, and functional in

2004 Oct 28

Lucene ranking

Kevin Burton has posted about poor ranking in Lucene preferring shorter documents over longer ones[1]. A similar search in Xapian returns documents in the expected order:

Performing query `Xapian::Query(foo)'
3 results found
ID 3 99% [foo foo foo]
ID 2 94% [foo foo]
ID 1 80% [foo]

Anyone know what Lucene is doing here? Their FAQ doesn't mention what weighting scheme they use, and I
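As a rough illustration of why a document with more occurrences of "foo" can still rank first despite being longer, the sketch below scores the three documents with the term-frequency part of a BM25-style formula. This is an assumption for illustration, not a reconstruction of what Xapian computed in that thread: the parameter values k1=1.2 and b=0.75 are common textbook defaults, the IDF factor is left out because it is identical for all three documents, and the percentages shown in the post are not reproduced.

```python
def bm25_tf(tf, doc_len, avg_len, k1=1.2, b=0.75):
    """Term-frequency component of BM25 with document length normalisation."""
    norm = k1 * (1 - b + b * doc_len / avg_len)
    return tf * (k1 + 1) / (tf + norm)

# The three documents from the post, keyed by their IDs.
docs = {
    1: ["foo"],
    2: ["foo", "foo"],
    3: ["foo", "foo", "foo"],
}

avg_len = sum(len(words) for words in docs.values()) / len(docs)

scores = {
    doc_id: bm25_tf(words.count("foo"), len(words), avg_len)
    for doc_id, words in docs.items()
}

# Highest score first.
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

With these settings the extra occurrences outweigh the length penalty, so the order comes out as 3, 2, 1, matching the order in the post. A scheme that normalises more aggressively by length could reverse it.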

2011 Sep 04

Ranking and term proximity

Hi, I was reading an article recently about how Google ranks results (among many other things of course) based on the proximity of the search terms in the source documents. In addition, the position of the search terms in the search query string itself is also taken into consideration when determining how important each term is. Does Xapian do something similar - at least for the first part?
