Displaying 20 results from an estimated 8000 matches similar to: "GSOC 2016 project on Ranking"
2004 Feb 25
4
Computing very large distance matrix
Hello All,
I have a 131072x132 matrix for which I need to compute a regular Euclidean distance matrix; I then need to transform that distance matrix and run agnes() on the result. I am having trouble computing the distance matrix because it is fairly large, and I am sure I have gone over the maximum allowed size.
The specific error I am getting is:
Error in vector("double", length) : negative length vectors
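A rough size check, not part of the original post, suggests why a pairwise distance structure for 131072 rows is out of reach. The sketch below counts the n(n-1)/2 pairwise distances, compares that count with the largest signed 32-bit integer, and estimates the memory needed at 8 bytes per double. The helper names `dist_entries` and `as_int32` are hypothetical, and the idea that the "negative length" message comes from exactly this 32-bit wraparound is an assumption, not something the post confirms.

```python
def dist_entries(n: int) -> int:
    """Number of pairwise distances among n rows: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def as_int32(value: int) -> int:
    """Reinterpret an integer as a wrapped signed 32-bit value (illustration only)."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value >= (1 << 31) else value


n = 131072                       # number of rows quoted in the post
entries = dist_entries(n)

print(entries)                   # 8589869056 pairwise distances
print(entries > 2**31 - 1)       # True: exceeds the signed 32-bit maximum
print(round(entries * 8 / 2**30, 1), "GiB at 8 bytes per double")
print(as_int32(entries))         # -65536 if the count wrapped in 32 bits
```

Under these assumptions the lower triangle alone needs roughly 64 GiB, so any approach that holds the full distance matrix in memory would fail regardless of how the length is represented.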
2012 Mar 27
1
About the projects of "Ranking" for GSoC 2012
Hello,
I am Mohiuddin Abdul Qader, a final year student in the Department of CSE at
Bangladesh University of Engineering & Technology (BUET).
My major was artificial intelligence, and I finished my course on Machine
Learning and Pattern Recognition this year. I am very keen to contribute to
the open source community. I have just completed my thesis on 'Location Based
Structured Web Search'. For the
2002 Nov 17
1
SVD for reducing dimensions
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hi all, this is probably simple and I'm just doing something stupid, sorry
about that :-)
I'm trying to convert words (strings of letters) into a fairly small
dimensional space (say 10, but anything between about 5 and 50 would be ok),
which I will call a feature vector. The distance between two words
represents the similarity of the
2006 May 10
1
help with writing output from two different arrays to two columns in an output file
Hi,
I am very new to R and I have written the following block of code to
generate a gamma distribution for variable x (which is an array) and a
function "y" whose array values depend on the individual array values of
"x".
The code is as follows:
n=1000
x=rgamma(n,1.5,2)
y=vector("numeric",n)
for (i in 1:n){
y[i]=(2937/50000*exp(-1/1000*x[i])/x[i])
}
now I want to
2011 Apr 01
2
New Idea on Ranking in IR
Hello,
I want to discuss my idea on ranking in IR systems, which I think could be a good
extension to Xapian. If I am not too late to discuss it, then please consider
it. First, a brief background on me: I am a Masters student working
on my thesis in Information Retrieval. Only today I got a mail from one
of the professors in Europe, whom I am going to join for my Ph.D., about GSoC and
more
2004 Feb 04
1
Clustering with 'agnes'
Hello,
I had a question regarding clustering using the agnes() function from the 'cluster' package.
I was wondering if anyone knew how I can identify cluster points after running the agnes function.
For example, I created a dataset with points randomly scattered around (0,0), (0,1) and (1,0). After clustering, the dendrogram shows all the clustered points and I get the ordering and
2017 Mar 05
3
GSoc 2017 Introduction(Weighting Schemes)
Hello Everyone,
I am a second year graduate student at IIIT-Bangalore and my interest is in
the field of Information Retrieval. I have successfully compiled Xapian
from source and have implemented some examples. While going through the
project list, I found that the Weighting Schemes project is the one I would
like to contribute to. So I went through xapian-core/weight, where most of the
schemes are already
2014 Nov 23
2
GSoc Project Idea Weighting Schemes (Ranking)
Hi,
I am Abhishek.
Currently Xapian::Weight follows the BM25 scheme; many models, such as the
Divergence from Randomness (DfR) family of models, the Unigram Language Model
and the Bi-gram Language Model, were implemented two years ago in GSoC 2012 yet
have not been merged to master.
The new weighting schemes or improvement in implementing the previous models
to change the default scheme of BM25 from SMART with
2010 Jan 18
4
Index indexed words
Hello,
We would like to create Google- or Firefox-like "search hints".
If someone types "abc", the search system should suggest
some possible hints.
I think Firefox does it by indexing 3 characters of the domain
name. If you enter part of a name, you get some hints.
Thank you very much
Marcus
2012 Apr 03
1
GSOC 2012
Hi,
I am a final year student at the Indian Institute of Technology, Kharagpur, in the
computer science department.
I am interested in applying for the following projects for GSoC 2012:
1. Weighting Schemes
2. Learning to Rank
3. Gmane Search Improvements
I have a strong background in Information Retrieval and Machine Learning. I
have worked previously with Yahoo Research Labs in the area of Information
2012 Apr 01
1
[GSoC2012] Learning to Rank: few thoughts/issues
Hello,
I would like to work with Orange as part of GSoC 2012 (and continue
henceforth). Apologies for joining in a bit late; I was waiting to get a
proper grasp of things before discussing it here. Currently I am a Master's
student in Mathematics with my bachelor's in Computer Science [integrated
dual degree]. Over the last year and a half, I have worked on a few ML
projects and have a couple of
2007 Jun 05
7
Chinese, Japanese, Korean Tokenizer.
Hi,
I am looking for a Chinese, Japanese and Korean tokenizer that can
be used to tokenize terms for CJK languages. I am not very familiar
with these languages; however, I think that they can contain one
or more words in a single symbol, which makes it more difficult to tokenize
text into searchable terms.
Lucene has CJK Tokenizer ... and I am looking around to see if there is some
open source that we
2005 Oct 08
1
*wildcard* support?
Hello,
First I wanted to say thanks for a great piece of software, thanks
Olly and others who've contributed!
I know that Xapian supports right-truncating, if that's the proper
name for wildcard support, as in a search for "xapia*".
I don't believe Xapian supports wildcards on both sides of a term, correct?
Is this something that is technically infeasible, unpalatable
2011 Mar 29
2
Participation in GSOC
Hi,
I'm Michael, I would like to participate in this year's Google Summer of
Code, and I picked Xapian as the project to code for.
Before writing a full proposal, I want to get in contact with the
community, as well as introducing myself and discuss my ideas for the
contribution to Xapian.
First of all I'd like to talk about my motivation.
I'm currently working on a webapp
2012 Apr 01
2
Learning to Rank : GSoC 2012
Hello all,
This is in reference to the "Learning to Rank" Project Idea. [I know I made
the entry a bit late, but I hope you are still interested in helping out.]
I am looking for suggestions to help me narrow down the choice of
algorithms. I have been surveying the referenced algorithms for the
purpose of choosing the right one. I am mentioning here some of my doubts
to discuss and
2016 Apr 11
2
Xapian 1.3.5 snapshot performance and index size
Olly Betts writes:
> On Sun, Apr 10, 2016 at 04:47:01PM +0200, Jean-Francois Dockes wrote:
> > Some might notice the 50% index size increase. Excessive index size is
> > already one relatively rare, but recurring complaint. Except if I did
> > something wrong: I'm actually quite surprised by it.
>
> Did you try compacting the resulting databases?
>
>
2012 Mar 23
1
GSoC Term Weighting project
Hi everyone,
I'm a graduate student in Linguistics and Computer Science in the US, and
I'm planning to propose a project to Xapian for GSoC that would implement
and evaluate a variety of weighting schemes and ranking methods, allowing
users to select different combinations. I have pretty thorough knowledge of IR
weighting and ranking, and I'm good in Java and Perl, and functional in
2004 Oct 28
1
Lucene ranking
Kevin Burton has posted about poor ranking in Lucene preferring
shorter documents over longer ones[1]. A similar search in Xapian
returns documents in the expected order:
Performing query `Xapian::Query(foo)'
3 results found
ID 3 99% [foo foo foo]
ID 2 94% [foo foo]
ID 1 80% [foo]
Anyone know what Lucene is doing here? Their FAQ doesn't mention what
weighting scheme they use, and I
2011 Sep 04
5
Ranking and term proximity
Hi,
I was reading an article recently about how Google ranks results
(among many other things of course) based on the proximity of the
search terms in the source documents. In addition, the position of
the search terms in the search query string itself is also taken into
consideration when determining how important each term is.
Does Xapian do something similar - at least for the first part?