io <io at ooeeeoo.com> writes:

> what xapian 'indexing system' did was to index the entire sentence
> 'xxx_yyy' and you will not be able to find any sentence which contain
> the word 'yyy'?

I'm curious that you refer to xxx_yyy as a sentence. In the contexts I
am familiar with, the point of _ is to join things together into one
word (or one identifier/token). Other than that your understanding
seems correct.

> xapian should have this simple wildcard feature which 'grep'(search)
> offer. ($grep '*word*' file). It is strange that xapian restrict the
> search to 'trailing wildcard' only.

I guess the restriction is based on what is easy to do efficiently with
the Xapian database (find prefixes). If I remember correctly there was
some work in progress to support leading wildcards in Xapian. I can't
find relevant discussion now, but I CC'ed the Xapian list in case
someone remembers that.

> Novice user who get introduce to notmuch just want to run the search
> and get the result straight away.

Generally the focus of Xapian (and thus notmuch) is on words and
phrases like "bob ate my pizza". I agree this is disappointing for
someone who wants "all the flexibility of grep, but faster".
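
To make the "find prefixes" point concrete, here is a rough sketch in Python. It is not how Xapian stores or searches terms; it only assumes that terms are kept in sorted order, which is what makes a trailing wildcard cheap. The function names and the sample term list are made up for illustration.

```python
from bisect import bisect_left


def prefix_matches(sorted_terms, prefix):
    """Trailing wildcard ("prefix*"): binary-search to the first candidate,
    then read forward until a term no longer starts with the prefix."""
    start = bisect_left(sorted_terms, prefix)
    matches = []
    for term in sorted_terms[start:]:
        if not term.startswith(prefix):
            break  # sorted order guarantees no later term can match
        matches.append(term)
    return matches


def suffix_matches(sorted_terms, suffix):
    """Leading wildcard ("*suffix"): sorted order does not help here,
    so every term has to be inspected."""
    return [term for term in sorted_terms if term.endswith(suffix)]


# Hypothetical term list; "xxx_yyy" is stored as a single term.
terms = sorted(["ate", "bob", "my", "pizza", "xxx_yyy", "xylophone"])

print("yyy" in terms)                 # exact term lookup for "yyy"
print(prefix_matches(terms, "xxx"))   # trailing wildcard: xxx*
print(suffix_matches(terms, "yyy"))   # leading wildcard: *yyy (full scan)
```

Because xxx_yyy is a single term in this toy list, an exact lookup for yyy finds nothing, xxx* locates it with a short range read, and *yyy only finds it by walking the whole list.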
On Mon, Dec 04, 2023 at 06:39:43AM -0500, David Bremner wrote:

> I guess the restriction is based on what is easy to do efficiently with
> the Xapian database (find prefixes). If I remember correctly there was
> some work in progress to support leading wildcards in Xapian. I can't
> find relevant discussion now, but I CC'ed the Xapian list in case
> someone remembers that.

The development version of Xapian supports both `*` and `?` glob-style
wildcards in any position. You can enable them for the QueryParser
using FLAG_WILDCARD_MULTI, FLAG_WILDCARD_SINGLE or FLAG_WILDCARD_GLOB
(the last one is just the first two combined).

Cheers,
    Olly
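
For readers unfamiliar with glob-style patterns, the sketch below shows the matching rules in plain Python. It does not call Xapian at all. It assumes the conventional meaning, where `*` stands for any run of characters (including none) and `?` for exactly one character; the helper names and the term list are hypothetical.

```python
import re


def glob_to_regex(pattern):
    """Translate a pattern containing only the `*` and `?` wildcards
    into a compiled regular expression; all other characters are literal."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")   # assumed: any run of characters, possibly empty
        elif ch == "?":
            parts.append(".")    # assumed: exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def expand_wildcard(terms, pattern):
    """Return the terms that the whole pattern matches."""
    rx = glob_to_regex(pattern)
    return [term for term in terms if rx.fullmatch(term)]


# Hypothetical indexed terms.
terms = ["pizza", "pizzas", "word", "keyword", "wordy", "xxx_yyy"]

print(expand_wildcard(terms, "*word*"))  # wildcard on both sides
print(expand_wildcard(terms, "pizza?"))  # exactly one extra character
print(expand_wildcard(terms, "*yyy"))    # leading wildcard
```

This scans every term, so it only illustrates what the patterns match, not how a real index would expand them efficiently.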