Displaying 20 results from an estimated 10000 matches similar to: "Gsoc- Text Extraction Libraries"
2016 Mar 07
2
GSoC 2016: Text-Extraction Libraries in Omega
Hi, everyone. I'm a third-year student in Computer Science. I have a few
projects (school-related) on Bitbucket
<https://bitbucket.org/philipchung/philipchungtech>.
I've been looking at the project-ideas list and I'm interested in making
Omega use libraries instead of external programs.
Right now I'm trying to get Olly's patch that was linked there to apply
to the
2011 Mar 28
0
Draft Application for GSoC 11 - Text extraction libraries - please review
Proposal for Google Summer of Code 2011 (draft)
Applying organisation: Xapian
Name : Nijil.Y
E-mail address: nijil.y at gmail.com
IRC nickname : laserbled
Biography
I am a 4th-year Computer Science and Engineering undergraduate student at CUSAT
University in India. I am interested in open source and search engines;
cluster computing, HPC and AI are also areas of interest.
* Analytical,
2019 Mar 21
2
[GSoC] Questions about project Text-Extraction Libraries
Hello!
I have a few questions related to the project Text-Extraction Libraries.
Firstly, I think that isolating library bugs in subprocesses could
work, but I am not sure how to handle deadlocks or infinite
loops. I feel that using a timer is the only way to deal with them, but I
would like to know what you think about it.
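
A minimal sketch of the timer idea raised above, assuming each extractor runs as a separate child process. The function name run_isolated and the three stand-in commands are hypothetical and are not part of Omega or omindex; the sketch only shows how a parent can survive a child that crashes or never returns.

```python
import subprocess
import sys


def run_isolated(argv, timeout_s):
    """Run a (hypothetical) extractor command in a child process.

    Returns (status, text), where status is "ok", "failed" or "timeout".
    """
    try:
        proc = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() kills and reaps the child before raising, so a
        # deadlocked or looping extractor cannot hang the parent.
        return "timeout", ""
    if proc.returncode != 0:
        # A crash (signal) or error exit stays confined to the child.
        return "failed", proc.stderr
    return "ok", proc.stdout


# Stand-ins for real extractors (assumptions for illustration only).
well_behaved = [sys.executable, "-c", "print('extracted text')"]
stuck = [sys.executable, "-c", "while True: pass"]
crashing = [sys.executable, "-c", "import os; os.abort()"]

if __name__ == "__main__":
    for cmd in (well_behaved, stuck, crashing):
        print(run_isolated(cmd, timeout_s=1)[0])
```

The timeout value is a policy choice: too short and slow but valid conversions are discarded, too long and a stuck file stalls the whole indexing run.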
Secondly, I have been reading the source code of
2019 Mar 23
2
[GSoC] Questions about project Text-Extraction Libraries
Thanks!
That was really useful!
I wanted to share my approach to this project with the hope that you can
give me some feedback.
I think that applying a design that foresees the incorporation of new
file formats is the most suitable way to solve the problem.
In the attached sketch we can see:
* Bug_Box: It is responsible for encapsulating and handling errors.
* File_extrator: It presents an
2019 Jun 14
2
Text-Extraction Libraries for Omindex
This is a list with some libraries that I have been looking at.
The idea is to discuss the advantages and disadvantages of adding some of
these libraries to Xapian.
If anyone knows another library that could be added to the list it would be
great!
Libfreexl:
* For Excel (.xls)
* Last release: 2018-02
* Info: gaia-gis.it/fossil/freexl/index
* License: MPL tri-license
2012 Mar 24
1
Regarding OMEGA project, GsoC
Hi,
This is Rahul Singhal, a student of Computer Science And Engineering
Department of Indian Institute Of Technology, Bombay.
My interest lies in coding & algorithm development. I love being wired all
the night. I have a lot of experience with the C/C++ languages, and
took a course at IIT Bombay itself which aims at deep knowledge of C++.
As a part of this course I made an SQL compiler in C++
in my
2017 Mar 23
2
GSoC 2017: Letor Click Data Mining
> You could do that by identifying the search session instead of the user,
> which makes it closer to what we need than to something that might trip you
> into privacy concerns.
Okay, that would be much better. :)
> Third records some information about what sort of query it is — add,
> morelike or a plain query. Last provides the estimated match size and then
> the HTTP
2011 Mar 21
1
GSOC 2011 - QueryParser Reimplementation
hello everyone,
I am Maheshwar, a prefinal-year Computer Science undergraduate student at
BITS-Pilani, India. When I was going through the GSoC ideas, I became
interested in the Query parser project. So far I have implemented a couple of
LL(1) parsers as part of my assignments in a Compiler Construction course,
so I would love to join and contribute to this project. So can anyone tell
me how to go
2017 Mar 22
2
GSoC 2017: Letor Click Data Mining
Hi James,
> Isn't this from the query template, ie from the main web page of search
> results? (It might make sense from opensearch as well, though.)
Yes, you are right; it is the query template. The reason I said opensearch
template is that I haven't quite read all sections of the Omega docs and I'm
still in the process. Thanks for pointing that out.
I'm aiming to cover
2017 Mar 21
2
GSoC 2017: Letor Click Data Mining
Hi Olly. Thanks for your reply to the previous email.
To have an appropriate subject I've started this new thread for further
discussions.
> There's a $log{} command available in Omega templates. We can't log from
> the result page template, as the clicks happen after that is used, but we
> could make result links redirect via a second Omega template which does
> the
2017 Mar 13
2
GSOC 2017 Project: Learning to Rank Click Data Mining
I am interested in the project 'Learning to Rank Click Data Mining', and
here is my current understanding about this project:
1. Where we can get click data: we can extend Omega to support
logging the user's searches and clicked documents.
2. The specific click data information and format. Based on some papers and
public query dataset formats (AOL search query logs[1] and Sogou
2014 Mar 11
2
[GSOC 2013] Question about indexing INEX dataset
Hi,
I'm trying to use Omega to index INEX dataset for Letor. But omindex told me these xml files are unknown. Olly told me I could tell omindex to handle them as HTML. (Thanks Olly :) ) Is it appropriate? Parth, could you give me some suggestions?
Thank you!
Jiarong Wei
2019 Apr 10
2
GSoC 2019 Update
Student applications closed about 13 hours ago. Thanks to all the
students who applied to us.
We've made a start on reviewing proposals - please keep an eye open for
emails or IRC (if you're on IRC) as we may have further questions for
you.
One general note - like many orgs in GSoC, we do expect students to
contribute a patch as part of their application. This is very helpful
as it
2014 Mar 11
2
[GSOC 2014] Indexing INEX dataset
Hi Parth,
I've implemented the SVMRanker class and also sorted out most of the current Letor APIs.
Now I'm trying to use the INEX dataset to verify my implementation, but I'm stuck on the indexing part. You said in the documentation that we have to add a prefix when indexing. I also noticed that you set some metadata in omindex.cc of your version. But omindex.cc has changed since 2011. I think that's why my result
2014 Mar 11
2
[GSOC 2014] Indexing INEX dataset
On Tue, Mar 11, 2014 at 12:02:15PM +0100, Parth Gupta wrote:
> During indexing with omindex, the only thing you need to make sure of is indexing
> with prefix 'S' for the title, as explained here in the Letor documentation:
> xapian-letor/docs/letor.rst
>
> Previously when I edited omindex.cc it was modified as can be seen
>
2014 Feb 25
2
GSOC 2014
Hi,
I am Jiarong Wei (irc: VcamX). I'm a 3rd-year computer science student at Zhejiang University, China. I'm very willing to contribute to Xapian as part of GSoC 2014. Now I'm at Simon Fraser University, Canada, as an exchange student. I'll go back to China at the end of April. I don't think it matters that I'll change time zones :)
From the list of project ideas, Learning to Rank interests me
2014 Feb 14
2
GSoC 2014
Hi,
I am Nikhar Agrawal, currently studying in my third year at IIIT-H,
pursuing Computer Science and Engineering. I am fairly proficient in C++. I
have been a GSoC 2013 participant for Boost C++ libraries and managed to
successfully merge my project into Boost trunk.
As a part of my course on Information Retrieval and Extraction, I did a
project on searching for queries on the latest 40 gb
2012 Mar 19
1
I want to volunteer Xapian in GSoC 2012.
Hi,
I am an undergraduate student of Computer Science and Engineering. I want
to volunteer for Xapian in Google Summer of Code 2012, but I have no experience
of working on open source projects. I am really interested in the "Text
Extraction Library" project. I meet some of its requirements: I'm
quite familiar with C++ (and am now learning more advanced C++
programming) and have keen
2018 Feb 12
0
Fwd: GSoC 2018: Xapian Search Engine Library has been accepted as a mentor organization!
Good news everybody:
---------- Forwarded message ----------
From: Google Summer of Code <summerofcode-noreply at google.com>
Date: 13 February 2018 at 06:11
Subject: GSoC 2018: Xapian Search Engine Library has been accepted as a mentor organization!
[image: Google Summer of Code]
Congratulations! Xapian Search Engine Library has been selected as a Google
Summer of Code 2018 mentor
2014 Mar 06
2
Regarding GSOC 2014
Sir,
I am a 4th-year undergraduate student pursuing my BTech in CSE at IIIT
Hyderabad, India.
I am interested in applying to Xapian for GSoC 2014. I have gone through
this year's ideas page and am interested in applying for the 'posting list encoding
improvements' project.
I am good at C/C++ and Python, which is one of the requirements. I have gone
through the Information Retrieval and