thr3ads.net - Xapian discuss - [Xapian-discuss] encoding? [Mar 2006]

Gupteshwar Joshi

2006-Mar-31 13:58 UTC

[Xapian-discuss] encoding?

Hello

Does Omega support different encodings for searching the indexed data?
I have indexed a set of documents in English and Devanagari.
Indexing runs without reporting any errors, so I assume my local data has
been indexed too, but searches for Devanagari keywords return no results.
I have added a meta tag specifying the encoding to the head of the query
template, but it still doesn't find those keywords.
Is there a way to apply a different encoding in the query template?
Or can I have a hint for doing that?

Thank You

Olly Betts

2006-Apr-01 00:46 UTC

[Xapian-discuss] encoding?

Please don't send essentially the same message to the list multiple
times (less than 90 minutes apart too!)  And don't cc: individual
developers - we all read the list so you'll just annoy people and be
less likely to get a useful answer.  Overall, remember this mailing list
is a free resource, and nobody is under any obligation to help you.  So
if you want help, play nicely and respect the other list members.

On Fri, Mar 31, 2006 at 06:28:37PM +0530, Gupteshwar Joshi wrote:
> Does Omega support different encodings for searching the indexed data?
Currently Omega doesn't perform any character encoding conversions.
So if you're trying to handle a non-Latin language, you'll probably
be disappointed.
> I have indexed a set of documents in English and Devanagari.
Sorry, I don't know what encoding Devanagari requires.
> Indexing runs without reporting any errors, so I assume my local data has
> been indexed too, but searches for Devanagari keywords return no results.
Assuming Devanagari uses a non-Latin character set, then the word
tokeniser won't tokenise Devanagari words correctly (or at all in fact).

The plan for Xapian 1.0 is to fix Omega to convert everything to UTF-8
and use Unicode definitions of what is a word character, etc.  Then this
should all work.
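
To illustrate why an ASCII-only tokeniser misses Devanagari words, here is a
minimal Python sketch. It is not Omega's actual tokeniser: the function names
are hypothetical, and the rule "letters, combining marks and decimal digits
are word characters" is an assumption made for illustration, so Xapian's own
definition may differ.

```python
import re
import unicodedata


def ascii_tokens(text):
    """Tokenise on ASCII letters and digits only (illustrative)."""
    return re.findall(r"[A-Za-z0-9]+", text)


def is_word_char(ch):
    """Assumed rule: letters (L*), combining marks (M*) and decimal digits (Nd)."""
    cat = unicodedata.category(ch)
    return cat[0] in "LM" or cat == "Nd"


def unicode_tokens(text):
    """Split text into runs of Unicode word characters."""
    tokens, current = [], []
    for ch in text:
        if is_word_char(ch):
            current.append(ch)
        elif current:
            tokens.append("".join(current))
            current = []
    if current:
        tokens.append("".join(current))
    return tokens


text = "search खोज engine इंजन"
print(ascii_tokens(text))    # Devanagari words are dropped entirely
print(unicode_tokens(text))  # Devanagari words survive as whole tokens
```

Combining marks are included because Devanagari vowel signs and the virama
are marks rather than letters; a tokeniser that accepts letters only would
split such words in the middle.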

Meanwhile, if you're prepared to write your own indexer (or at least
your own word tokeniser), then there's a patch to make the QueryParser
UTF-8 aware (which is what the gmane search uses).
> I have added a meta tag specifying the encoding to the head of the query
> template, but it still doesn't find those keywords.
Well, that only tells the browser what character set the output is in so
it's not going to affect the searching.

Incidentally, a slightly better approach than a meta tag is to set the
charset in the Content-Type: header of the response by adding something
like this to the top of the query template:

$httpheader{Content-Type,text/html; charset=utf-8}
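
As a rough illustration of what such a header amounts to on the wire, the
Python sketch below builds a minimal CGI-style response. It does not show how
Omega implements $httpheader; `build_response` is a hypothetical helper. The
point is only that the charset named in the header should match the encoding
actually used for the body.

```python
def build_response(body: str, charset: str = "utf-8") -> bytes:
    """Build a minimal CGI-style response whose header declares the charset."""
    header = f"Content-Type: text/html; charset={charset}\r\n\r\n"
    # The header itself is plain ASCII; the body is encoded in the
    # same charset that the header promises to the browser.
    return header.encode("ascii") + body.encode(charset)


page = build_response("<p>खोज</p>")
head, _, payload = page.partition(b"\r\n\r\n")
print(head.decode("ascii"))
print(payload.decode("utf-8"))
```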

Cheers,
    Olly

गुप्तेश्वर जोशी

2006-May-02 14:15 UTC

[Xapian-discuss] encoding?

Hello,
Is it possible to show, in the search results, the line number in the
document where the search word was found?

On 5/2/06, गुप्तेश्वर जोशी <gupteshwar.joshi@gmail.com> wrote:
>
> ---------- Forwarded message ----------
> From: Olly Betts <olly@survex.com>
> Date: May 1, 2006 8:31 PM
> Subject: Re: [Xapian-discuss] encoding?
> To: गुप्तेश्वर जोशी <gupteshwar.joshi@gmail.com>
> Cc: xapian-discuss@lists.xapian.org
>
> On Mon, May 01, 2006 at 05:10:47PM +0530, गुप्तेश्वर जोशी wrote:
> > The problem is that all Marathi characters are getting converted to %EA0
> > something values and the search returns no results.
>
> The %-encoding is just how non-ASCII (and some "unsafe" ASCII)
> characters are encoded in URLs - so it's how CGI GET requests are
> supposed to work.
>
> > I have tried the 'no transliteration' patch too, but I didn't understand
> > whether to add that line to the existing code or to replace the existing
> > lines with that single line.
>
> You need to *replace* lines to make the code of those two methods look
> like the code in this message:
>
>  http://article.gmane.org/gmane.comp.search.xapian.general/1927
>
> Cheers,
>     Olly
>
>
> --
>
> ----- गुप्तेश्वर जोशी -----
>


--
----- गुप्तेश्वर जोशी -----
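
The %-encoding described in the quoted reply can be reproduced with a short
Python sketch using the standard library's urllib.parse. It assumes the
browser submits the query as UTF-8; the sample word is only an illustration
and is not taken from the thread. Each Devanagari character becomes three
UTF-8 bytes, and each byte is written as a %XX escape.

```python
from urllib.parse import quote, unquote

query = "मराठी"  # sample search term, chosen for illustration

# quote() percent-encodes the UTF-8 bytes of each non-ASCII character.
encoded = quote(query)
print(encoded)

# unquote() reverses the process, decoding the bytes as UTF-8 by default.
decoded = unquote(encoded)
print(decoded == query)

# Decoding with the wrong charset yields mojibake rather than the
# original word, which is one way a search can silently match nothing.
wrong = unquote(encoded, encoding="latin-1")
print(wrong == query)
```

So the %-escapes themselves are harmless; what matters is that the software
receiving the request interprets the decoded bytes in the same encoding that
was used when the documents were indexed.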
