Dear Sidhant,
We do welcome students' ideas, but it would be more useful if you
introduced yours in more detail. Based on the information you provided
and a glance through the attached paper, I have the following questions:
1. The confidence measure described in the paper assigns multiple weights
to the terms, which are basically features for the categorization task. How
do you see it applying to the query-document setting of IR here in Xapian?
Please elaborate.
2. "The major problem with text categorization is that the system doesn't
take into account the context of the query." - Yes, it is certainly a
challenge, but there are several ways to get this context, from user
profiling (personalised search) to diversified IR, where you return
diverse results for the same query. Where do you place your proposal, and
how do you want to achieve it?
Cheers,
Parth.
On Mon, Mar 3, 2014 at 10:35 AM, Sidhant Panda <sidhantpanda at gmail.com> wrote:
> Hi,
>
> I would like to contribute to the "Weighting Schemes" project. I have
> previously worked with weighting schemes like tf-idf.
>
> My past experience was in a project which was able to successfully
> classify a text question into its subject (like Physics) and also its
> subtopic (like reflection, refraction, etc.) based on an ontology built
> from crawling Wikipedia articles.
>
> The major problem with text categorization is that the system doesn't take
> into account the context of the query.
>
> I would like to propose an alternate measure based on a "confidence
> measure". I am currently trying to implement the same in another project. I
> have attached the paper which talks about this "confidence" measure.
>
> Regards
> Sidhant Panda