Vladimir Zaytsev
2011-Mar-28 19:41 UTC
[Xapian-discuss] GSoC Project, Support Erlang Language
"Support Erlang Language"
By Vladimir Zaytsev, Xapian, 2011

*About me*

Name: Vladimir Zaytsev
E-mail address: vladimir at zvm.me
WWW: zvm.me/, facebook.com/vladimir.zaytsev<http://www.facebook.com/vladimir.zaytsev>
Emergency contact phone number: +79028195844

Short biography: I was born on 5th February 1991 in Donetsk, USSR, and now live in Khanty-Mansiysk, Russia. In 2008 I finished the Ugra Lyceum of Physics and Mathematics. I now study at the Ugra State University, Institute of Applied Mathematics, Computer Science and Management (GPA: 5.0/5.0). I took part in the 9th Estonian Summer School on Computer and Systems Science (August 2010) and the XLIX International Scientific Students Conference in Novosibirsk (April 2010). My career and research interests include software engineering, functional programming, information retrieval, machine learning and data mining.

*Eligibility*

I fulfil the eligibility requirements.

*Background Information*

- Have you taken part in GSoC and/or GHOP <http://code.google.com/opensource/ghop/2007-8/> and/or GCI <http://code.google.com/gci> before?
  I have not taken part in GSoC, GHOP or GCI before.

- Please tell us about any previous experience you have with Xapian, or other systems for indexed text search.
  I don't have any practical experience with Xapian or other indexed search systems, but I have some theoretical knowledge: I've read Manning's "Introduction to Information Retrieval", Segaran's "Collective Intelligence" and similar books, so this project would give me a chance to put that theory into practice.

- Do you have previous experience with Free Software and Open Source other than Xapian?
  I have previous experience with open source software such as Python, Erlang/OTP, Linux (Debian), GCC, PostgreSQL and so on.

- Do you have any other relevant prior experience?
  Yes, I have related experience: from November 2009 to June 2010 I worked on a project to develop a fact extraction system (in Erlang and Python) at the Ugra Research Institute of Information Technologies.
- What development platforms, tools and methods do you prefer to use?
  I prefer to use:
  - OS: Mac OS X and Linux;
  - Languages: Erlang, C++, Python;
  - Environment: Emacs, Textmate, Eclipse, git, make, gdb, valgrind;
  - a functional programming style where it is appropriate.

- Have you previously been responsible (as an employee/volunteer/student/etc) for a project of a similar size?
  No, I have not.

- What timezone will you be in during the coding period?
  GMT/UTC + 6:00

- Will your Summer of Code project be the main focus of your time during the program?
  Yes, it will.

- How many hours a week will you realistically be able to devote to your project?
  I plan to invest 10-15 hours per week until 25th April, and 40 hours per week during GSoC.

- Are you applying for other projects in GSoC 2011? If so, with which organisations?
  Yes, I'm also applying for Shogun Toolbox.

*Project*

Title: "Support Erlang Language"

Summary: Add bindings to Xapian that allow it to be used from the Erlang language.

There are three reasons why I have chosen this project. Firstly, Erlang is gaining popularity as a language for developing distributed, scalable web (and other) applications, which often need fast search, so it would be good to have convenient Xapian support. Secondly, I'm interested in information retrieval and related areas, so this project would be a good starting point for practice. Finally, I am familiar with Erlang and enthusiastic about using my knowledge and skills to help the open source community and gain new experience. In addition, I have my own Erlang-driven project where I plan to use Xapian.

*Benefits*

Nowadays there are many small and large companies (Amazon, Facebook, Mochimedia, JS-Kit, etc) which use Erlang and need search engines in their projects, so I think some of them would be interested in using a Xapian-Erlang interface.
*Project Details*

Main concepts:

- I plan to use the Erlang NIF <http://www.erlang.org/doc/tutorial/nif.html> mechanism, which allows C++ code to run inside the Erlang VM and so minimizes latency.
- Of course I plan to use OTP primitives (primarily gen_server and supervisors), which provide the most useful behaviour patterns, to avoid reinventing the wheel.
- I'm familiar with various Erlang interfaces to other applications such as DBMSs, format converters, web servers, etc., so I think it would be better to implement the Xapian interface in a similar Erlang-style way, to be more compatible with some of them.
- On the other hand, I plan to take all features of Xapian into account, to provide the most complete access to the library.

I think it is important to implement the basic parts of the interface first and to keep them simple. If not everything works out exactly as planned, at least these parts will be working.

*Approximate Project Timeline*

- before 30 April: Read documentation and source code to familiarize myself with the functionality, architecture and C++ API of Xapian.
- 2 - 16 May: Study the bindings for other languages and SWIG.
- 16 - 21 May: Prepare the coding environment.
- 23 May - mid-June: Define and implement all the required Erlang modules.
- mid-June - 26 June: Improve speed and functionality, clean up the code.
- 26 June - mid-July: Integrate the code into Xapian. Write tests, fix bugs.
- after mid-July: Write documentation, tests and examples.