Displaying 20 results from an estimated 1000 matches similar to: "GSOC 2011 : Weighting Schemes"
2016 Aug 07
2
Weighting Schemes: Evaluation results
Hi,
Evaluation of pivoted normalization ("PPP") of the tf-idf weighting scheme is
also complete now. I have also evaluated the default tf-idf normalization
("ntn") and other normalization combinations involving pivoted
normalization in the wdfn, idfn and wtn components, as "Pxx", "xPx" and "xxP"
normalization strings respectively, to have a clear idea about
2013 Feb 19
2
Implementing tf-idf weighting scheme in Xapian
Hello guys. I just read up about tf-idf schemes and want to implement them in
Xapian (with some frequently used normalizations), as it will also give me a
good hang of implementing a weighting scheme before I start working on
implementing DFR schemes.
I read the following as references and I think I've understood it well and
can write the hack :-
1.)
2017 Mar 05
3
GSoC 2017 Introduction (Weighting Schemes)
Hello Everyone,
I am a second-year graduate student at IIIT-Bangalore and my interest is in
the field of Information Retrieval. I have successfully compiled Xapian
from source and have implemented some examples. While going through the
project list, the Weighting Schemes project is the one I was looking to
contribute to. So I went through xapian-core/weight, where most of the
schemes are already
2020 Apr 29
2
[Possible SPAM] Re: Stopwords: Topic modelling with LDA
Hi,
I have just computed tf-idf and a question has come up. Is there a value of
idf or tf-idf that would be considered a threshold for deciding whether a
word is very common or not? The idf values in my data range from 0 to 3.78
and the tf-idf values from 0 to 0.07.
Regards
On Tue, 28 April 2020, 12:53, Carlos Ortega wrote:
> Hi,
> To start with, I would remove them to see which other topics appear.
2016 Jul 28
2
Weighting Schemes: Evaluation results
Ah. If FIRE doesn't have something that can show this suitably, then
> maybe Parth can advise on access to TREC, as I know he's used some of
> them in the past.
>
I can say FIRE is also a reliable source, but INEX/TREC are better. INEX
can give you free access, whereas TREC is not freely available. I used INEX
for Xapian in the past, and some details are here:
2008 Nov 12
1
Two problems with Samba in AD realm
Hello list.
I recently moved to an AD environment. I'm still keeping a Samba server
to make my cups-managed printers available to Windows users, rather than
duplicating configuration with a Windows print service. But I'm facing
two problems, probably due to the way we manage AD.
First, all my hosts belong to a Unix-managed DNS domain
(msr-inria.inria.fr), not to the Windows-managed
2011 Jul 17
1
How to speed up interpolation
df is a very large data frame with arrival estimates for many flights
(DF$flightfact) at random times (df$PredTime). The error of the estimate
is df$dt.
My problem is that I want to know the prediction error at each minute
before landing. This code works, but is very slow, and dominates
everything. I tried using split(), but that rapidly ate up my 12 GB of
memory. So, is there a better R way of
2006 Sep 20
8
Understanding boost ?
Hi,
I'm confused about managing field boosting ...
I have set the :boost for the :name field in my docs to 10, via :boost
=> 10
Then I performed a search for 'keith' over all fields with
*:(keith*), expecting a doc with Keith in the :name field to come out on
top. But another doc with Keith mentioned in other fields (:comments,
:address) scored higher.
I
2013 Mar 03
0
Added code and tests for the tf-idf weighting scheme.
Hello guys. I have sent a pull request for the code and tests of the Tf-Idf
weighting scheme.
Please do let me know if any changes are required. Meanwhile, I'll begin
working on implementing normalizations which require additional statistics
and on the DFR schemes.
https://github.com/xapian/xapian/pull/6
On Tue, Feb 26, 2013 at 5:30 PM, <xapian-devel-request at lists.xapian.org> wrote:
>
2013 Oct 20
3
Error: requires numeric/complex matrix/vector arguments
Dear R users, I'm a new user of R. I'm trying to do an LM test and there is this type of error: Error in t(mX) %*% mX : requires numeric/complex matrix/vector arguments.
To be clear, I write down the code, in which mY (126,1), mX (126,1) and mZ (126,1) are matrices.
LMTEST <- function(mY, mX, mZ)
# mY, mX, mZ must be matrices!
# returns the LM test statistic and the degree of freedom
{
iT =
2008 Mar 05
3
ipf function in R
Hi
I have a 3 x 2 contingency table:
10 20
30 40
50 60
I want to update the frequencies to new marginal totals:
100 130
40 80 110
I want to use the ipf (iterative proportional fitting) function which
is apparently in the cat package.
Can somebody please advise me how to input this data and invoke ipf in R
to obtain an updated contingency table?
Thanks.
By the way I am quite new to R.
--
Dr
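For illustration only: a minimal sketch of iterative proportional fitting applied to the table in the post above. It is written in plain Python rather than R and does not use the ipf function from the cat package. It assumes that 100 and 130 are the target column totals and that 40, 80 and 110 are the target row totals; the function name and its parameters are hypothetical.

```python
def ipf_2d(table, row_targets, col_targets, tol=1e-9, max_iter=1000):
    """Rescale a 2-D table until its margins match the target totals."""
    fitted = [[float(v) for v in row] for row in table]
    for _ in range(max_iter):
        # Step 1: scale each row so that it sums to its target total.
        for i, row in enumerate(fitted):
            total = sum(row)
            if total > 0:
                factor = row_targets[i] / total
                fitted[i] = [v * factor for v in row]
        # Step 2: scale each column so that it sums to its target total.
        max_change = 0.0
        for j, target in enumerate(col_targets):
            total = sum(row[j] for row in fitted)
            if total > 0:
                factor = target / total
                for row in fitted:
                    row[j] *= factor
                max_change = max(max_change, abs(factor - 1.0))
        # Stop once the column step barely changes anything, which means
        # the row totals from step 1 are still (almost) intact.
        if max_change < tol:
            break
    return fitted


observed = [[10, 20], [30, 40], [50, 60]]
row_targets = [40, 80, 110]   # assumed: one target per row
col_targets = [100, 130]      # assumed: one target per column

fitted = ipf_2d(observed, row_targets, col_targets)
for row in fitted:
    print([round(v, 2) for v in row])
```

Both sets of targets sum to 230, which is needed for the procedure to converge to a table that matches both margins.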
2017 Mar 16
2
GSoC-2017 Introduction and Project Discussion
Hello,
I'm Shivang Bansal, a 3rd year Computer Science Engineering undergraduate
at Institute of Engineering & Technology in Lucknow, India. This mail is an
expression of my interest in this year's Google Summer of Code program. I
want to apologize for getting in so late. I would have contacted you
earlier, but the sudden demise of my grandfather prevented me from doing so.
I am
2012 Apr 02
0
GSoC, Xapian Project Weighting Schemes
Hello all,
I am very sorry I did not include xapian-devel mailing list in my previous mail.
Thanks for responding to my mail.
Mohd Azeem
NIT UK
________________________________
From: Olly Betts <olly at survex.com>
To: Mohd Azeem <azeem201001 at yahoo.in>
Cc: Parth Gupta <parthg.88 at gmail.com>
Sent: Saturday, 31 March 2012 11:40 AM
Subject: Re: GSoC, Xapian Project Weighting
2016 Mar 10
2
Introduction and Doubts
Tf-idf is the most widely used weighting scheme; it is easy to understand and
has been used in other frameworks like Lucene and many other places.
Okapi BM25 (implemented in Xapian) is theoretically a better/improved measure
than tf-idf, and
I am looking into various other weighting schemes which are there in Xapian
or can be implemented, like TF-ICF (term frequency inverse corpus
frequency), TF-RF (term
2004 Sep 10
3
Improving on Rice coding
Hello,
I am the author of the Bonk audio compression program... I've just been
looking at your comparison table, and I noticed Bonk gets marginally
better compression than FLAC on some files (actually I was
rather surprised to see Bonk on the list at all, it's not exactly high
profile :-) ).
Bonk in lossless mode is a pretty naive implementation of a predictive
coder, so the main
2010 May 15
1
conditional calculations per row (loop versus apply)
Hi all,
I'm hoping someone might help with a query about conditionally applying formulas to a dataframe.
In essence I have 3 lookup tables (Table A, B & C) and a dataframe with a variable Type.Code, which identifies the lookup table to which each record belongs. The lookup tables reference different sensor types, for which I need to apply a different formula to values in Column3 in each row
2016 May 05
2
GSoC 2016 - Introduction
Hello,
Thanks James for the reply. That cleared a few things up. Apologies for
replying late because of exams going on.
I was going through the previous clustering API to understand how it worked
and it seems like the approach for construction of the termlists which
are used for distance metrics uses TF-IDF weighting with cosine similarity,
which is very similar to the approach I would need
2003 Mar 03
1
transition matrix problem
I'm having trouble using some fairly simple code to change the entries
in a vector (x - the numbers 0-5) according to a simple transition
matrix that I've called p.dry.
the error message I get is
"no finite arguments to min; returning Inf"
Any suggestions as to where I'm going wrong greatly appreciated. Sorry
the message is lengthy!
cheers
Nick
The code is
2012 Apr 20
1
Implementing the tf-idf weighting scheme
Hi, all:
This is the basic implementation of the tf-idf scheme (the basic scheme used in
SMART) that can be used in Xapian. It might still need some further
revision, but I believe it works anyway. :)
I modified weight.h to define a subclass Tf_idfWeight and added a new
file tf_idf.cc in ../weight in the repo, to implement Tf_idfWeight.
Here is the git diff patch:
https://gist.github.com/2422049
2013 Feb 25
0
Sent a pull request for the Tf-Idf Weighting scheme
Hello guys :) I have sent a pull request for the Tf-Idf Weighting scheme
incorporating as many normalizations as I could with the help of statistics
currently available from Xapian::Weight. Please let me know what you
think about it.
I used the weighting scheme in a simple searcher and it did a fine job with
it. I have no experience with writing tests for features like this. Please
give me